Computation Errors with GPU work units

Pokey
Joined: 18 Sep 17
Posts: 5
Credit: 69,919,767
RAC: 33
Message 24283 - Posted: 19 Sep 2017, 22:20:10 UTC
Last modified: 19 Sep 2017, 22:21:56 UTC

I have a rig that is throwing "computation errors" and defying my efforts to correct it.
The rig is Sandy, ID 821495 (https://boinc.thesonntags.com/collatz/hosts_user.php).
It has an i7-2600K with two Nvidia GTX 970 cards, Win 10 and BOINC ver 7.6.33.
I have Visual C++ 2012 installed, both x64 and x86.
I have tried:
different Nvidia drivers.
removing Collatz and reattaching to the project.
rebooting, resetting, updating, etc.
Also, this computer works fine with other projects, like SETI, GPUGrid, and so on.

The failure occurs within seconds.

Latest Stderr output:
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
Error performing inpage operation.
(0x3e7) - exit code 999 (0x3e7)
</message>
<stderr_txt>
Collatz Conjecture Sieve 1.21 Windows x86_64 for OpenCL
Written by Slicker (Jon Sonntag) of team SETI.USA
Based on the AMD Brook+ kernels by Gipsel of team Planet 3DNow!
Sieve code and OpenCL optimization provided by Sosiris of team BOINC@Taiwan
Error: (999)Unknown at 643 of GetContext
clCreateContext() failed (999) pYxü
Error 999. Processing Aborted.
14:26:34 (4284): called boinc_finish

</stderr_txt>
]]>

Anybody see anything I'm missing??
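
For anyone reading the stderr output above: 0x3e7 is 999 in hex, so the exit code and the clCreateContext() error are the same number. Standard OpenCL status codes are zero or negative, so 999 looks like an app- or driver-specific value, which fits the "Unknown" label in the log. Below is a minimal Python sketch for pulling the failing call and code out of stderr text like this. The helper name and the regex are hypothetical illustrations, not part of BOINC or the Collatz app, and the trailing garbage bytes are dropped from the sample.

```python
import re

# Hypothetical helper (an assumption for illustration, not a BOINC or Collatz API).
def parse_opencl_failure(stderr_text):
    """Return (call_name, error_code) for the first 'X() failed (N)' line, or None."""
    match = re.search(r"(\w+)\(\) failed \((-?\d+)\)", stderr_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

# Lines copied from the stderr output in this post.
sample = """Error: (999)Unknown at 643 of GetContext
clCreateContext() failed (999)
Error 999. Processing Aborted."""

call, code = parse_opencl_failure(sample)
print(call, code, hex(code))  # clCreateContext 999 0x3e7
```

A failure in clCreateContext() within seconds means the app never got as far as running a kernel, which points at the driver or OpenCL runtime rather than at the work unit itself.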

step2000
Joined: 1 Aug 13
Posts: 96
Credit: 1,482,147,407
RAC: 1,916,589
Message 24284 - Posted: 19 Sep 2017, 23:29:05 UTC

Is your card or the CPU overclocked in any way?

If so, set the clocks back to stock and see what happens.

If it is not overclocked, try uninstalling first the Visual C++ runtime, then the project, then the BOINC software.

Reinstall them in reverse order and see if that helps.
____________
Retired Business Owner/Developer
Working toward a real solution to why programming takes so many versions to get the end product that just keeps getting better with each version. Do Loop of Products!

Pokey
Joined: 18 Sep 17
Posts: 5
Credit: 69,919,767
RAC: 33
Message 24285 - Posted: 20 Sep 2017, 2:47:17 UTC - in response to Message 24284.

No OC here, I run everything stock.
I will give your suggestion a try tomorrow, and get back with the results.

Thanks.

Pokey
Joined: 18 Sep 17
Posts: 5
Credit: 69,919,767
RAC: 33
Message 24287 - Posted: 20 Sep 2017, 18:16:21 UTC - in response to Message 24285.

I followed the procedure outlined and no joy. I have now done on the order of 400-500 work units and all have failed. And now they are all showing as abandoned.
Not sure how I managed to do that, but I did.

I tried to put it back on SETI and the GPU "failed to initialize".

This rabbit hole has gotten a lot deeper than I ever thought possible.

I am going to proceed now on the assumption that it is hardware related and try to check all the components before trying again.

If I do get the problem identified, I will post back here.
Thanks for the help.

step2000
Joined: 1 Aug 13
Posts: 96
Credit: 1,482,147,407
RAC: 1,916,589
Message 24288 - Posted: 20 Sep 2017, 18:53:19 UTC

I have seen this before, and it was related to power. I would look at your PSU and check whether the voltage is proper for your CPU and GPU.

Pokey
Joined: 18 Sep 17
Posts: 5
Credit: 69,919,767
RAC: 33
Message 24289 - Posted: 21 Sep 2017, 20:48:02 UTC - in response to Message 24288.

I have tested Sandy's components and they appear fine, including the PSU, RAM and graphics cards. It could be the motherboard, I guess, but I am undecided on that.

I will put Sandy to work elsewhere rather than continue butting my head against this wall. The other rigs will continue on Collatz for a while longer, at least with my GPUs. At several days per CPU work unit, I have decided to point the CPUs toward other projects.

Crunch on.

step2000
Joined: 1 Aug 13
Posts: 96
Credit: 1,482,147,407
RAC: 1,916,589
Message 24291 - Posted: 22 Sep 2017, 0:32:44 UTC

Okay, I do have one thing to try if possible. Try changing the screen resolution before starting the application. If this works, then you could have a bad memory chip, as the screen buffering could be hung. Also, I assume you cleared out the Visual C++ runtimes, rebooted and reinstalled them? Okay, well, like you say, crunch away!

Pokey
Joined: 18 Sep 17
Posts: 5
Credit: 69,919,767
RAC: 33
Message 24307 - Posted: 28 Sep 2017, 16:12:42 UTC - in response to Message 24291.
Last modified: 28 Sep 2017, 16:14:34 UTC

The issue has been solved, I think. Sandy would not work on other projects either, but an error message got me headed in the right direction. I rebooted in safe mode, used DDU to remove the old Nvidia drivers, and then reinstalled the latest driver. So now my card is doing work again. I can only surmise that the installed driver got corrupted and that simply reinstalling wasn't the answer. I know DDU also cleans the registry, so it could have been a registry entry that was the culprit. In any event, the issue has been solved.



Copyright © 2018 Jon Sonntag; All rights reserved.