Computation Errors
Author | Message |
---|---|
Gero-T Joined: 9 Oct 16 Posts: 9 Credit: 57,231,987,399 RAC: 0 |
My *.config file looks like this: verbose=1 kernels_per_reduction=48 threads=8 lut_size=17 cache_sieve=1 sieve_size=30. Increasing threads above 8 results in an error on my PC, for both the 1070 and 1080 cards. |
Joined: 1 Aug 13 Posts: 17 Credit: 4,543,611,197 RAC: 4,483,269 |
I would increase it, as 8 division could be the issue with the RAM/video combo. Try 9 or 11 and see if that hangs. If so, then going back to 7 would be my direction. You also said it fails, but is it a hard fail? Does the system lock up, etc.? If you are also running other processes on these systems, you may need to back things down to share the resources properly. Many times other processes just eat the GPU, on things like G-SYNC and even Adobe apps. Just me being a Math Geek! |
Gero-T Joined: 9 Oct 16 Posts: 9 Credit: 57,231,987,399 RAC: 0 |
No, no. The main problem is: Error: GPU steps do not match CPU steps. Workunit processing aborted. It also helped to remove the sleep setting (sleep=1 or 0) and to set threads=8, with the warning that with threads=9 or more the WU fails. I think the problem with the new Nvidia drivers (4xx.xx) on Windows 10 is solved. |
BobMALCS Joined: 7 Feb 14 Posts: 3 Credit: 214,578,960 RAC: 3 |
Installed NVIDIA 417.01 and Collatz immediately crashed out with Error: GPU steps do not match CPU steps. Workunit processing aborted. Other projects are running ok. I'll wait until this problem is reported fixed before running Collatz again. |
Padanian Joined: 28 May 10 Posts: 13 Credit: 1,633,107,877 RAC: 3,135,784 |
Same here with 416.34. A bunch of wus failed on me with threads=10 |
Joined: 11 Aug 09 Posts: 537 Credit: 14,303,541,965 RAC: 60,665,201 |
"Installed NVIDIA 417.01 and Collatz immediately crashed out with Error: GPU steps do not match CPU steps. Workunit processing aborted." Nope, the 400-series drivers don't work here with anything older than the brand-new 2000-series cards; I've been through several. Just uninstall the new driver. Unless you uninstalled the older drivers, they are still there and will take over after a reboot, and you will be crunching again. |
BobMALCS Joined: 7 Feb 14 Posts: 3 Credit: 214,578,960 RAC: 3 |
If all, or even some, of the other projects I run on the NVIDIA GPU failed with the 400 series drivers I would reinstall the latest 300 driver. However, Collatz is the only one to fail. I'll stay with the 400 series. |
Joined: 11 Aug 09 Posts: 537 Credit: 14,303,541,965 RAC: 60,665,201 |
"If all, or even some, of the other projects I run on the NVIDIA GPU failed with the 400 series drivers I would reinstall the latest 300 driver. However, Collatz is the only one to fail." Sounds good, happy crunching. |
Shadak Joined: 6 Nov 09 Posts: 1 Credit: 4,846,985 RAC: 0 |
"If all, or even some, of the other projects I run on the NVIDIA GPU failed with the 400 series drivers I would reinstall the latest 300 driver. However, Collatz is the only one to fail." I have no problems with the 416 driver (GeForce 1070). |
nedmanjo Joined: 7 Feb 16 Posts: 35 Credit: 3,280,940,745 RAC: 1,826,106 |
Random error while computing, usually 5 - 6 seconds into processing. The error repeats every several hours. - Outcome: Computation error - Client state: Compute error - Exit status: -102 (0xFFFFFF9A) ERR_READ. System config: Supermicro SYS-7047R-TRF 4U server, X9DA7 MB, two Xeon E5-2697 V2, two Nvidia GTX 1080 Ti. I have used DDU to clear the drivers and reinstalled an older driver, v388.13. No difference from the newer driver. The GPU is running at stock factory settings. Configuration: verbose=1 kernels_per_reduction=48 threads=9 lut_size=18 reduce_CPU=0 sieve_size=30 cache_sieve=1 sleep=0. I have also tried this: verbose=1 kernels_per_reduction=48 threads=8 lut_size=17 reduce_CPU=0 sieve_size=30 cache_sieve=1 sleep=0. These are two cards, not new, currently running one card at a time due to thermals (summer weather). Both cards behave similarly, so I'm pretty sure it's not the cards. Any ideas? |
Joined: 11 Aug 09 Posts: 537 Credit: 14,303,541,965 RAC: 60,665,201 |
"Random Error while computing. Usually 5 - 6 seconds into processing. Error repeats every several hours." Stop using the config files for a day and see if the errors go away. The config files push the card to higher than normal levels, and yours could just be getting old and unable to handle it. |
nedmanjo Joined: 7 Feb 16 Posts: 35 Credit: 3,280,940,745 RAC: 1,826,106 |
That's a possibility. I'll give it a try. Any knowledge about the meaning of Exit status -102 (0xFFFFFF9A) ERR_READ? |
Joined: 1 Aug 13 Posts: 17 Credit: 4,543,611,197 RAC: 4,483,269 |
The Hex Memory address is where the application failed to read the memory point. The read error in this range is strange. You could have other issues happening that are masked. I would try no config file, and I would also move your RAM on the motherboard, as I think it is related to slot one. Just me being a Math Geek! |
Joined: 11 Aug 09 Posts: 537 Credit: 14,303,541,965 RAC: 60,665,201 |
"That's a possibility. I'll give it a try. Any knowledge about the meaning of Exit status -102 (0xFFFFFF9A) ERR_READ?" ERR_READ -102: BOINC has a problem reading from the drive. Maybe you do not have rights to read from the BOINC directory. Solution: Make sure you have rights in your operating system to read from the drive. Check your drive for consistency, in Windows using chkdsk. https://boinc.mundayweb.com/wiki/index.php?title=Error_code_-100_to_-110_explained |
San-Fernando-Valley Joined: 13 Apr 17 Posts: 5 Credit: 1,596,586,869 RAC: 273 |
"Random Error while computing. Usually 5 - 6 seconds into processing. Error repeats every several hours." My two cents' worth of opinion starts here: This error -102 first appeared ON or after May 1st. BEFORE this date everything was fine! My rigs work OK on ALL other projects. It is NOT an SSD or HDD problem. NOR is it an access rights problem with any files. NOR is it an OS-specific (Win7 or Win10) problem. NOR is it a GPU or driver version issue. I would bet that something must have changed on the project side! As others have said: just ignore it ... which I don't really want to support or accept. These types of errors usually tend to increase in frequency and become more complex. End of my two cents ... HAPPY crunching to all. |
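
A note on the exit status discussed in this thread: the hexadecimal value 0xFFFFFF9A shown next to -102 is the same number written as an unsigned 32-bit value (two's complement), so it reads as the exit code itself rather than as a memory address. A minimal Python sketch of that conversion follows; the helper names `to_signed32` and `to_unsigned32` are hypothetical and used only for illustration, they are not part of BOINC or Collatz.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF              # keep only the low 32 bits
    if value & 0x80000000:           # sign bit set means a negative number
        return value - 0x100000000
    return value


def to_unsigned32(value: int) -> int:
    """Return the unsigned 32-bit representation of a (possibly negative) integer."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    # The pair reported in the thread: Exit status -102 (0xFFFFFF9A)
    print(to_signed32(0xFFFFFF9A))    # -102
    print(hex(to_unsigned32(-102)))   # 0xffffff9a
```

The same conversion can be used on any other negative exit status from a task's result page to check that the hex value and the decimal code match.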
©2019 Jon Sonntag; All rights reserved