Spat of "inconclusive" and "invalid" Results for Solo_Collatz and Collatz 3.10

Message boards : Macintosh : Spat of "inconclusive" and "invalid" Results for Solo_Collatz and Collatz 3.10

Jon Fox
Joined: 6 Sep 09
Posts: 36
Credit: 352,068,539
RAC: 266,723
Message 16805 - Posted: 13 Jun 2013, 8:20:38 UTC

I'm seeing a significant rise in the number of invalid WUs; in fact, I've not seen a successful WU in the past eight (8) processing days.

Here's a sample of the stderr.txt for an "invalid" WU:

<core_client_version>7.1.3</core_client_version>
<![CDATA[
<stderr_txt>
dyld: DYLD_ environment variables being ignored because main executable (/Library/Application Support/BOINC Data/slots/0/../../switcher/switcher) is setuid or setgid
Collatz Conjecture v3.10 OS X for OpenCL
Based on the AMD Brook+ kernels by Gipsel
Device 0
Start 2379854427062803343720
Checking 103079215104 numbers
Device Vendor AMD
Name ATI Radeon HD 6770M
Compute Units 6
Driver version 1.0
Version OpenCL 1.1
Optimizations cl-fast-relaxed-math,cl-mad-enable
Max Workgroup Size 1024
Reduce Group Size 256
Numbers/Kernel 262144
Kernels/Reduction 64
Numbers/Reduction 16777216
Reductions/WU 6144
Highest Steps 1049270 for 2379854427099089840958
Total Steps 54754514935583
GPU time 1146.05 seconds
CPU time 15.8943 seconds
Total time 1146.41 seconds
08:02:28 (16226): called boinc_finish

</stderr_txt>
]]>


Here's a sample of the stderr.txt for an "inconclusive" WU:
<core_client_version>7.1.3</core_client_version>
<![CDATA[
<stderr_txt>
dyld: DYLD_ environment variables being ignored because main executable (/Library/Application Support/BOINC Data/slots/1/../../switcher/switcher) is setuid or setgid
Collatz Conjecture v3.10 OS X for OpenCL
Based on the AMD Brook+ kernels by Gipsel
Device 0
Start 2379766375663390140776
Checking 824633720832 numbers
Device Vendor AMD
Name ATI Radeon HD 6770M
Compute Units 6
Driver version 1.0
Version OpenCL 1.1
Optimizations cl-fast-relaxed-math,cl-mad-enable
Max Workgroup Size 1024
Reduce Group Size 256
Numbers/Kernel 262144
Kernels/Reduction 64
Numbers/Reduction 16777216
Reductions/WU 49152
Highest Steps 1049128 for 2379766376388714407396
Total Steps 430534481886141
GPU time 9344.37 seconds
CPU time 122.868 seconds
Total time 9344.9 seconds
21:44:54 (46325): called boinc_finish

</stderr_txt>
]]>


--
jon

Jon Fox
Joined: 6 Sep 09
Posts: 36
Credit: 352,068,539
RAC: 266,723
Message 16869 - Posted: 19 Jun 2013, 12:52:13 UTC

The "inconclusive" and "invalid" tasks have become approximately 90% of the total work units completed.

As these work units contribute little or nothing to the project objectives, I've reset the project to clear out the current task queue and set the project to NNT (No New Tasks) for the time being. I'll run another project for a while and come back to Collatz at a later date to retry.

Of course, I'll continue to monitor this thread (and others) in case someone uncovers a similar issue.

--
jon

Larry Pageler
Joined: 11 Nov 10
Posts: 3
Credit: 1,436,646
RAC: 0
Message 16895 - Posted: 21 Jun 2013, 1:26:32 UTC

I have been experiencing a similar issue with my iMac running the OpenCL applications. Almost all of my returned WUs are being marked inconclusive and then invalid. Does anyone have a solution or an explanation for this problem?

Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 16902 - Posted: 21 Jun 2013, 14:42:12 UTC

They are marked as inconclusive and then later as errors because the steps calculated are not correct. For example, 2379766376388714407396 should have 552 steps, not 1,049,128 as reported in the output shown above.
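
For anyone who wants to check a reported "Highest Steps" value on the CPU, here is a minimal Python sketch. It assumes the textbook step definition (one step per halving and one step per 3n+1), which may not be exactly how the Collatz app counts, so the number it prints is only a ballpark check and is not guaranteed to match the project's figure. The function name `collatz_steps` is made up for this example.

```python
def collatz_steps(n: int) -> int:
    """Count iterations of n -> n/2 (even) or n -> 3n+1 (odd) until n reaches 1.

    Assumption: every halving and every 3n+1 counts as one step.
    The project's app may use a different counting convention.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Python integers are arbitrary precision, so 22-digit starts are fine.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
```

Calling `collatz_steps(2379766376388714407396)` returns a count for the number from the "inconclusive" log. Whatever the exact convention, a result in the hundreds rather than the millions would point at the GPU output being wrong.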

Since the app hasn't changed, it would appear that the problem is either due to it getting screwed up when resuming from a checkpoint or due to a change in the driver. If checkpointing worked OK with version 3.10 in the past, then it is driver-related. I don't recall any checkpointing problems with the 3.xx apps, and given the slew of issues with the PC drivers as of late, I have a feeling that recent OS X drivers are just as flaky.

Another possibility is excessive heat making the GPU flaky. Summer heat or excessive overclocking can also produce junk calculations. Once screwed up, it may continue that way until a cold reboot.
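
A rough way to spot junk output like this before the validator does is to compare "Highest Steps" against the average steps per number, which can be derived from the "Total Steps" and "Checking ... numbers" lines in stderr.txt. The sketch below is only an illustration: the helper names and the `max_ratio` threshold of 20 are arbitrary assumptions, not something the project uses, and passing the check does not mean the result is valid.

```python
import re


def parse_summary(stderr_text: str) -> dict:
    """Pull the three counters needed for the check out of a stderr.txt dump."""
    patterns = {
        "checked": r"Checking (\d+) numbers",
        "highest_steps": r"Highest Steps (\d+) for \d+",
        "total_steps": r"Total Steps (\d+)",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, stderr_text)
        if match is None:
            raise ValueError(f"missing field: {key}")
        summary[key] = int(match.group(1))
    return summary


def looks_suspicious(summary: dict, max_ratio: float = 20.0) -> bool:
    """Flag a result whose highest step count dwarfs the average.

    max_ratio is a hypothetical cut-off chosen for this example only.
    """
    average = summary["total_steps"] / summary["checked"]
    return summary["highest_steps"] > max_ratio * average


# Lines copied from the "invalid" sample earlier in the thread.
SAMPLE = """\
Checking 103079215104 numbers
Highest Steps 1049270 for 2379854427099089840958
Total Steps 54754514935583
"""

if __name__ == "__main__":
    print(looks_suspicious(parse_summary(SAMPLE)))
```

For the "invalid" sample the average works out to roughly 531 steps per number, so a highest value above one million stands out immediately.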

Jon Fox
Joined: 6 Sep 09
Posts: 36
Credit: 352,068,539
RAC: 266,723
Message 16914 - Posted: 21 Jun 2013, 18:02:21 UTC - in response to Message 16902.

Thanks for the update.

I'm running the stock OS X-supplied drivers and have seen the errors in both the 10.8.3 and 10.8.4 versions of OS X. I do suspect a checkpoint restart issue, as the one thing that has changed is that I began receiving GPU WUs from PrimeGrid once its recent applications added OS X/ATI support. Previously, Collatz was supplying the only GPU WUs for my OS X machine, so there was no checkpointing required or invoked.

It also appears to be isolated to OS X, as I have the same pair of GPU WU types (Collatz/CUDA50 plan and PrimeGrid) running on a Windows 7/X64 machine with no issues.

If there's something more I can supply to assist in troubleshooting, please let me know.

Thanks again.
--
jon



Copyright © 2018 Jon Sonntag; All rights reserved.