Posts by Stick
1) Message boards : Number crunching : GPU Errors (Message 24446)
Posted 73 days ago by Stick
I recently had a problem with an Intel GPU driver, which I posted about in the Intel GPU thread. Bottom line: my problem was fixed by doing a "clean install" of the GPU driver. I used DDU (a free utility) in Safe Mode to completely remove the installed driver. Then I restarted, and the GPU came up as the default Microsoft Display Adapter. After that, I reinstalled the same driver and my Collatz problem was gone. (You may also need to run Windows Update to reestablish your NVIDIA card in Device Manager.)

I don't know if this will work in your case but it may be worth a shot.

Note: DDU is available for download from a variety of sites which also try to sell you other software. Finding the correct link for the free download is a little tricky. If you pick the wrong one, you may download something you don't want. Just be careful!
2) Message boards : Number crunching : Intel GPU (Message 24437)
Posted 75 days ago by Stick
I would then try an UNINSTALL of the BOINC software and install the latest version, as others have also had issues with the newer Intel generation of processors.

Already running latest BOINC software - version 7.8.3 (x64).
3) Message boards : Number crunching : Intel GPU (Message 24434)
Posted 76 days ago by Stick
Problem solved! I used DDU to uninstall my Intel driver (in Safe Mode), then reinstalled the same driver. I now have a Collatz WU that is running normally and should probably finish later today.

Note: Before originally posting about this problem, I had deleted and reinstalled the Intel driver - without using DDU. Obviously, there was some junk in the driver space that was causing the problem. The first reinstall didn't clear it but DDU did the job.

Looks like Collatz doesn't have a problem after all.
4) Message boards : Number crunching : Intel GPU (Message 24433)
Posted 76 days ago by Stick
OpenCL is not loading properly with your current driver. I would look for a newer one or maybe go back a version.
Not a viable option for this new computer. The current driver is the latest and there is nothing in "rollback".

Besides, since SETI is working fine, I am more inclined to believe it is a Collatz problem with OpenCL 2.01/Intel 6.XX drivers. And, apparently, you thought that, too, at one time (in Message 24114).
If not, Collatz may just not be ready for Intel 6.XX drivers.
5) Message boards : Number crunching : Intel GPU (Message 24429)
Posted 77 days ago by Stick
As you can see below, my new Lenovo laptop has essentially the same problem as the others who have posted here. SETI works fine but Collatz tasks crash immediately. My driver version is 21.20.16.4590 and it is using OpenCL: 2.01. I have an older laptop with an Intel GPU using OpenCL: 1.02, and it does Collatz tasks with no problems. Could OpenCL: 2.01 be the problem here?

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -11 (0xfffffff5)</message>
<stderr_txt>
Collatz Conjecture Sieve 1.21 Windows x86_64 for OpenCL
Written by Slicker (Jon Sonntag) of team SETI.USA
Based on the AMD Brook+ kernels by Gipsel of team Planet 3DNow!
Sieve code and OpenCL optimization provided by Sosiris of team BOINC@Taiwan
BUILD LOG
<built-in>:1:9: error: '__FINITE_MATH_ONLY__' macro redefined
<built-in>:265:9: note: previous definition is here

clBuildProgram() failed with error (-11)
Error: (-11)Program build failure at 1163 of SetupOpenCL

Error -11. Processing Aborted.
21:52:57 (11628): called boinc_finish

</stderr_txt>
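For anyone reading a log like the one above: OpenCL defines status code -11 as CL_BUILD_PROGRAM_FAILURE, meaning the kernel source failed to compile on this driver, and the lines under BUILD LOG are the compiler's diagnostics. Here is a minimal Python sketch for pulling both out of a stderr dump. The helper name `summarize_opencl_failure` and the regular expressions are assumptions modeled on this one log; they are not part of BOINC or the Collatz app.

```python
import re

# A few status codes from the OpenCL headers (cl.h); not an exhaustive list.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
}


def summarize_opencl_failure(stderr_text):
    """Hypothetical helper: return (code, name, diagnostics) from a stderr dump."""
    # Message format assumed from the log above.
    match = re.search(r"clBuildProgram\(\) failed with error \((-?\d+)\)", stderr_text)
    code = int(match.group(1)) if match else None
    name = OPENCL_ERRORS.get(code, "unknown")
    # Compiler diagnostics look like "<built-in>:1:9: error: message".
    diagnostics = re.findall(
        r"^\S+:\d+:\d+: (error|note): (.+)$", stderr_text, flags=re.MULTILINE
    )
    return code, name, diagnostics


SAMPLE = """BUILD LOG
<built-in>:1:9: error: '__FINITE_MATH_ONLY__' macro redefined
<built-in>:265:9: note: previous definition is here

clBuildProgram() failed with error (-11)
"""

code, name, diagnostics = summarize_opencl_failure(SAMPLE)
print(code, name)
for severity, text in diagnostics:
    print(severity, "-", text)
```

The diagnostics only show where the compile failed. Whether the fault lies in the driver or in the app cannot be read from the log alone.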
6) Message boards : Number crunching : Intel GPU (Message 23993)
Posted 256 days ago by Stick
I have a similar Intel GPU setup and only had problems with it before I updated all the drivers, etc. as you have. You might look through the Optimizing Collatz Sieve thread and try experimenting with cc_config.xml file settings. Otherwise, just wait for Slicker to respond.

EDIT: Your stderr file contains the following error messages:
<built-in>:1:9: error: '__FINITE_MATH_ONLY__' macro redefined
<built-in>:265:9: note: previous definition is here

clBuildProgram() failed with error (-11)
Error: (-11)Program build failure at 1163 of SetupOpenCL
7) Message boards : Number crunching : collatz_sieve_3324384911943423492096_52776558133248 (Message 23990)
Posted 260 days ago by Stick
Workunit 120284398 is probably going to disappear from the DB soon and, if so, that will render this issue unanswerable. If that happens, it's OK. I am not reporting this because I missed out on some credit. I am reporting it because I think there may be a logic flaw in the validator.

The WU had 3 tasks issued and reported: 1 - Completed and validated (issued first); 1 - Error while computing (issued second and possibly after the first's deadline); and, 1 - Completed, marked as invalid (issued after the second one errored out). The invalid one (mine) was the last to report. The valid one was a Collatz Sieve v1.20 and the invalid one was a Collatz Sieve v1.21 (opencl_amd_gpu).

My question is, how did the validator determine that mine was invalid and the other one was valid? If the two didn't match, shouldn't a tiebreaker have been issued? Maybe mine was marked invalid because it happened to be the last to report and therefore was just unneeded.
8) Message boards : Number crunching : Errors while computing (Message 22575)
Posted 584 days ago by Stick
Just had these 2 tasks error out: Task 75326929 and Task 75238701.
Different hosts and different error messages but they have this in common - they both failed to recover after a power failure. Thought you might want to review the Stderr outputs.

Task 75326929: -102 (0xffffffffffffff9a) ERR_READ
Task 75238701: -529697949 (0xffffffffe06d7363) Unknown error number
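The decimal and hex forms in these messages are the same number: the hex is the two's-complement bit pattern of the signed exit code, shown here at 64 bits. A small Python sketch of the conversion follows; the function name `exit_code_to_hex` is made up for illustration.

```python
def exit_code_to_hex(code, bits=64):
    """Render a signed exit code as its two's-complement hex form."""
    mask = (1 << bits) - 1          # keep only the low `bits` bits
    return f"0x{code & mask:0{bits // 4}x}"


print(exit_code_to_hex(-102))                 # 0xffffffffffffff9a
print(exit_code_to_hex(-529697949))           # 0xffffffffe06d7363
print(exit_code_to_hex(-529697949, bits=32))  # 0xe06d7363
```

The low 32 bits of the second code, 0xe06d7363, are the exception code the Microsoft Visual C++ runtime uses when a C++ exception is thrown, so that task most likely died on an unhandled C++ exception rather than on a BOINC-defined error such as ERR_READ.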
9) Message boards : Number crunching : (unknown error) - exit code -1073741515 (0xc0000135) (Message 21497)
Posted 833 days ago by Stick
I've got basically the same issue on computer 68709. The error messages are slightly different: -1073741819 (0xffffffffc0000005) Unknown error number - but the symptoms are the same. The stderr txt says: Unhandled Exception Detected... Reason: Access Violation (0xc0000005) at address 0x000000007779AC04 write attempt to address 0x00000024.

And my Microsoft Visual C++ runtime files are up to date. I even did repair re-installs of them just in case they were damaged.

Note that my GPU is an AMD and the errors are with Collatz Sieve v1.21 (opencl_amd_gpu). And it never had any problems with the now deprecated Large and Solo apps.

Never mind! I just updated my GPU driver to the latest version of Catalyst (v15.7.1) and downloaded some new WU's. It's still early, but the first task didn't crash immediately and seems to be progressing OK. I would also note that my old driver wasn't that old - probably less than a year.
10) Message boards : Number crunching : (unknown error) - exit code -1073741515 (0xc0000135) (Message 21484)
Posted 834 days ago by Stick
I've got basically the same issue on computer 68709. The error messages are slightly different: -1073741819 (0xffffffffc0000005) Unknown error number - but the symptoms are the same. The stderr txt says: Unhandled Exception Detected... Reason: Access Violation (0xc0000005) at address 0x000000007779AC04 write attempt to address 0x00000024.

And my Microsoft Visual C++ runtime files are up to date. I even did repair re-installs of them just in case they were damaged.

Note that my GPU is an AMD and the errors are with Collatz Sieve v1.21 (opencl_amd_gpu). And it never had any problems with the now deprecated Large and Solo apps.
11) Message boards : News : Collatz Sieve 1.08 Released for Windows (Message 21078)
Posted 887 days ago by Stick
Just had a large download batch under V1.09 in which each unit failed miserably within the first second of execution.

Glad to see we are back to V1.08.

For whatever it's worth, here is a sample failed unit: http://boinc.thesonntags.com/collatz/result.php?resultid=20983694


Me, too. All of mine failed with: -1073741795 (0xffffffffc000001d) Unknown error number.
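BOINC prints "Unknown error number" here because the value is not one of its own error codes. On Windows, the low 32 bits of such an exit code are normally an NTSTATUS value, and 0xc000001d is STATUS_ILLEGAL_INSTRUCTION. The Python sketch below looks up the three NTSTATUS values that come up in these threads; the table is deliberately tiny and the function name `describe_exit_code` is hypothetical.

```python
# NTSTATUS values seen in these threads; the full list is far longer.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exit_code(code):
    """Map a signed Windows exit code to its 32-bit NTSTATUS name, if known."""
    status = code & 0xFFFFFFFF  # drop the sign extension
    return f"0x{status:08x} {NTSTATUS_NAMES.get(status, 'unrecognized')}"


print(describe_exit_code(-1073741795))  # 0xc000001d STATUS_ILLEGAL_INSTRUCTION
print(describe_exit_code(-1073741819))  # 0xc0000005 STATUS_ACCESS_VIOLATION
print(describe_exit_code(-1073741515))  # 0xc0000135 STATUS_DLL_NOT_FOUND
```

An illegal-instruction status on every task would be consistent with a build that uses CPU instructions the host does not support, though that is only a guess from the code itself.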
12) Message boards : News : New Windows CUDA and OpenCL Versions Released (Message 20758)
Posted 929 days ago by Stick
Have you edited the config file? What settings are you using in it?

Haven't touched it.

However, I just figured out why it's so slow. That is, it's a full-size unit and I have been getting a steady diet of minis. In fact, it's been so long since I have had a full-size, I don't remember the size difference. But minis usually take 5 to 6 hours on this computer.
13) Message boards : News : New Windows CUDA and OpenCL Versions Released (Message 20752)
Posted 929 days ago by Stick
WU 16437712 is my first with Solo Collatz Conjecture v6.08 (opencl_intel_gpu). I am guessing there is a problem with the WU (or maybe with the new apps). That is, one wingman has errored out and another aborted. And my unit is progressing VERY slowly - only about 4% complete after about 5 hours. I will suspend it for now but keep it in cache in case there are suggestions for fixing it and/or requests for more info.
14) Message boards : Number crunching : -1073741819 (0xffffffffc0000005) Unknown error number (Message 17366)
Posted 1621 days ago by Stick
Congrats on finding a fix. Catalyst 13.4 fixes the problems for many, but not all. I'm hoping the OpenCL issue which AMD admitted they could duplicate will be fixed in the 13.8 release although there is no mention of any OpenCL fixes in the release notes that I've seen.


As it turns out, Catalyst 13.4 may have fixed my v4.07 problem, but it is not without problems of its own. It doesn't like to wake up from sleep mode. All I get is a blank screen with a cursor that doesn't work, and that forces me to do a reboot. When I updated, I used the AMD tool that scans your computer and recommends the best upgrade option. I know there are newer versions of Catalyst out there but I hesitate to go beyond their recommended version. So I think I will try the remove, clean and reinstall procedure just in case the install I did yesterday had some unreported glitches. If that doesn't work, I'll probably have to revert to 13.1.
15) Message boards : Number crunching : -1073741819 (0xffffffffc0000005) Unknown error number (Message 17353)
Posted 1622 days ago by Stick
Zydor,

Thank you for the response! But I don't think any of your ideas apply to my case. I don't overclock. I am not a gamer. I do disk maintenance and file purges on a regular basis, as well as Windows updates, etc. I am running BOINC 7.0.64. And I am not having this problem with any other BOINC ATI OpenCL programs.

However, after I posted, I rechecked my AMD driver version and found that it was slightly out of date (i.e. Catalyst 13.1 vs. 13.4). I have since updated and will report back as soon as I get another v4.07 task.

Again, thank you for your response.

Stick

EDIT: Apparently the problem was with Catalyst 13.1. I just got a new v4.07 task and it seems to be running OK with Cat 13.4.
16) Message boards : Number crunching : -1073741819 (0xffffffffc0000005) Unknown error number (Message 17347)
Posted 1622 days ago by Stick
My laptop's GPU is crashing every collatz v4.07 (opencl_ati_100) task it gets with the above error. But, it has no problems with collatz v2.09 (ati13ati) and mini collatz v2.09 (ati13ati) WU's. I am pretty sure my drivers are up to date. Any ideas?
17) Message boards : News : v4.06 Application Released for Windows x64 for OpenCL (Message 16476)
Posted 1713 days ago by Stick
I have a WU that's running way too long with solo_collatz v4.06 (opencl_ati_100). That is, it's only 7.5% complete after 35 hours. I am guessing it's the app, since my usual completion time with collatz v2.09 (ati13ati) is about 17 hours and my only other WU's with solo crashed immediately with v4.05. I will probably abort it but, before I do, is anyone interested in any data I could provide to help with debugging?


Try suspending Collatz and then after a 5 count resume it again and see if the slow unit doesn't start back up again.

I had tried suspending/resuming several times before posting. That's not the problem. It's not stuck, it's just progressing very, very slowly.
18) Message boards : News : v4.06 Application Released for Windows x64 for OpenCL (Message 16474)
Posted 1714 days ago by Stick
I have a WU that's running way too long with solo_collatz v4.06 (opencl_ati_100). That is, it's only 7.5% complete after 35 hours. I am guessing it's the app, since my usual completion time with collatz v2.09 (ati13ati) is about 17 hours and my only other WU's with solo crashed immediately with v4.05. I will probably abort it but, before I do, is anyone interested in any data I could provide to help with debugging?
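To put "way too long" in numbers: if progress were linear (an assumption, since BOINC's percent-complete is not guaranteed to be), 7.5% after 35 hours projects to roughly 467 hours in total, about 27 times the usual 17 hours. A short Python sketch of that extrapolation, with a made-up function name:

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of total run time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


total = projected_total_hours(35, 0.075)
print(round(total))          # 467
print(round(total / 17, 1))  # 27.5
```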
19) Message boards : Windows : Lot of ATI errors (Message 16421)
Posted 1724 days ago by Stick
Got an error on this WU running solo_collatz v4.05 (opencl_ati_100). (I am pretty sure this was my first unit using that app.) Have been running collatz v2.09 (ati13ati) without any problem.


Just noticed another error running solo: -1073741819 (0xffffffffc0000005) Unknown error number. I am pretty sure it was the same error as in the first unit I reported (above) - but it is no longer in the DB.

EDIT: This latest WU has been reissued twice - once to a host also using solo_collatz v4.05 (opencl_ati_100) and it also errored out with the -1073741819 (0xffffffffc0000005) Unknown error number code. The other went to a host running solo_collatz v4.04 (cuda50) and it is still in progress.
20) Message boards : Windows : Lot of ATI errors (Message 16414)
Posted 1725 days ago by Stick
Got an error on this WU running solo_collatz v4.05 (opencl_ati_100). (I am pretty sure this was my first unit using that app.) Have been running collatz v2.09 (ati13ati) without any problem.






Copyright © 2018 Jon Sonntag; All rights reserved.