Computations errors
Author | Message |
---|---|
AT Hiker Joined: 26 Sep 19 Posts: 5 Credit: 105,524,215 RAC: 0 |
I am experiencing computation errors. Occurs about 2 seconds into the wu. The last time this happened there was some sort of system problem. Anyone else having computational errors? |
Anthony Ayiomamitis Joined: 21 Jan 15 Posts: 14 Credit: 8,935,616,266 RAC: 925 |
Same here but with me it started following a Win 10 update. Coincidence? |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"Same here but with me it started following a Win 10 update. Coincidence?" Yes, because it's happening on Linux machines too. It's being caused by the Project sending out files that are empty, i.e. with nothing to process inside them. However, some units that work just fine are getting thru, so you can either switch to another Project for a bit or keep going thru them. |
David Riese Joined: 23 Sep 12 Posts: 152 Credit: 46,709,407,003 RAC: 120,653,306 |
It appears to be a coincidence, as this problem has been affecting a variety of my Macs, running MacOS 10.11 - 10.14. The good news is that the problem is dissipating, as my RAC is beginning to rebound. My individual Macs are recovering at different rates. This is probably due to the fact that I have not standardized my Collatz task cache across my machines; some have as few as 2 days of work, whereas others have as many as 10 days of work. So, the cache of affected tasks is being cleared at different rates. |
David Riese Joined: 23 Sep 12 Posts: 152 Credit: 46,709,407,003 RAC: 120,653,306 |
Hey, Mikey, I see that you have increased your hardware commitment to Collatz. Your RAC is more than 40M/day - very impressive! I may have to perform some more upgrades to stay in front of you - I have an RX580 that has yet to find a home! |
Jack-Hiker Joined: 3 Apr 16 Posts: 5 Credit: 207,787,927,826 RAC: 60,091,400 |
Same here |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"Hey, Mikey, I see that you have increased your hardware commitment to Collatz. Your RAC is more than 40M/day - very impressive! I may have to perform some more upgrades to stay in front of you - I have an RX580 that has yet to find a home!" I am coming for you!!! LOL!!! I got 2 new-to-me GPUs a couple of weeks ago and decided to see what I can do if I mostly focus on a GPU project. I'm also earning GridCoins right now, so it makes sense to focus a bit on a project. |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"Same here" Mine are coming back too! |
crweaver Joined: 22 Jan 11 Posts: 1 Credit: 4,356,162,762 RAC: 3,932,747 |
On one computer, I'm getting errors, but not on the other. The first has an nvidia gamer driver, the other - the working one - a studio driver. Both have gtx 1060 cards and are win10 machines. Both have the latest versions of their drivers. Hope that helps. |
kjohnson Joined: 4 Jul 18 Posts: 8 Credit: 1,758,319,451 RAC: 0 |
Whew, this makes me feel much better. My 1080 was crunching just fine until a few days ago and now I have over 1200 failed tasks. I was unable to determine what was wrong, so hope it clears up with newly generated work. Thanks all! |
Miklos M. Joined: 2 Oct 13 Posts: 15 Credit: 3,745,422,368 RAC: 40,397,642 |
Likewise, many errors after 1-2 seconds. |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"Likewise, many errors after 1-2 seconds." I think it's starting to end. The problem is files with no data in them, and most of my GPUs are now processing files with data in them. I'm still getting some blank ones, though; I don't know if they are resends or new ones. |
Gordon Lack Joined: 14 Apr 12 Posts: 6 Credit: 471,890,117 RAC: 730,553 |
Good point. "Likewise, many errors after 1-2 seconds." Each failing file will get tried 6 times before it is flagged as an error. My failures are becoming less frequent, and those that do fail are now on their fifth or sixth attempt. |
Padanian Joined: 28 May 10 Posts: 13 Credit: 2,392,304,259 RAC: 0 |
Same here |
seanr22a Joined: 3 Oct 19 Posts: 14 Credit: 22,883,712,410 RAC: 34,827,346 |
Still very bad: the batch with deadline 12/7/19 16:28 was all errors, several hundred of them. The batch that I received before that, with deadline 12/7/19 15:16, was better, with 'just' every 7th or 8th job in error. Now that rig has a project backoff of 23 hours, so it's more or less banned because of all the errors. |
Padanian Joined: 28 May 10 Posts: 13 Credit: 2,392,304,259 RAC: 0 |
"Still very bad: the batch with deadline 12/7/19 16:28 was all errors, several hundred of them. The batch that I received before that, with deadline 12/7/19 15:16, was better, with 'just' every 7th or 8th job in error. Now that rig has a project backoff of 23 hours, so it's more or less banned because of all the errors." I've been backed off for 24 hours too, with some hundreds of WUs errored out. Looks like the affected batch was generated on November 17th. |
Gordon Lack Joined: 14 Apr 12 Posts: 6 Credit: 471,890,117 RAC: 730,553 |
"On one computer, I'm getting errors, but not on the other. The first has an nvidia gamer driver, the other - the working one - a studio driver. Both have gtx 1060 cards and are win10 machines. Both have the latest versions of their drivers. Hope that helps." The reported error is "error reading input file", so the problem has nothing to do with what set-up you have - it's in the job data being sent. |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"On one computer, I'm getting errors, but not on the other. The first has an nvidia gamer driver, the other - the working one - a studio driver. Both have gtx 1060 cards and are win10 machines. Both have the latest versions of their drivers. Hope that helps." YES, the problem is that the units were being sent out with blank files; once they have all been gone thru, everything should be back to normal again. I have 835 workunits in progress right now, so we are going thru them!! I also have 3844 workunits that have had errors; before this started I had less than 10!! The max number of errors for each workunit is 6, and today I saw a bunch of _4 and a few _2 at the end of the tasks that had problems. Since it starts at _0, we ARE getting there. |
David Riese Joined: 23 Sep 12 Posts: 152 Credit: 46,709,407,003 RAC: 120,653,306 |
Ignore the comment below. The Mac in question performed an auto upgrade and its NVIDIA web driver was rendered incompatible. I updated the NVIDIA driver, and now it is able to crunch Collatz GPU tasks to completion. Arrgggh, I hate when I overlook something basic like that .... ----- I am not sure we are making progress. Several of the tasks that halted prematurely due to the characteristic computational error were created earlier today. Here are some examples: https://boinc.thesonntags.com/collatz/result.php?resultid=53680760 https://boinc.thesonntags.com/collatz/result.php?resultid=53680761 https://boinc.thesonntags.com/collatz/result.php?resultid=53680762 https://boinc.thesonntags.com/collatz/result.php?resultid=53680763 https://boinc.thesonntags.com/collatz/result.php?resultid=53680764 https://boinc.thesonntags.com/collatz/result.php?resultid=53680765 https://boinc.thesonntags.com/collatz/result.php?resultid=53680766 https://boinc.thesonntags.com/collatz/result.php?resultid=53680767 https://boinc.thesonntags.com/collatz/result.php?resultid=53680768 More than 900 tasks have halted prematurely on this single computer (850946 - a MacPro 5,1 with a GTX1070). |
Joined: 11 Aug 09 Posts: 805 Credit: 24,496,510,371 RAC: 2,626,508 |
"Ignore the comment below. The Mac in question performed an auto upgrade and its NVIDIA web driver was rendered incompatible. I updated the NVIDIA driver, and now it is able to crunch Collatz GPU tasks to completion. Arrgggh, I hate when I overlook something basic like that ...." It happens A LOT after Windows does its updates too. |
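
For anyone who wants to gauge how close a failing workunit is to being retired, the `_N` suffix rule described in the thread (numbering starts at `_0`, with a limit of 6) can be sketched roughly as below. This is only an illustration: the task names are made up, the limit of 6 is taken from the posts above rather than from the project's configuration, and it assumes each new suffix corresponds to one more attempt, which may not hold if several copies of a workunit are issued at once.

```python
import re

# Assumption taken from the thread: a workunit is tried at most 6 times.
MAX_ATTEMPTS = 6


def attempt_index(task_name: str) -> int:
    """Return the zero-based attempt index from a task name ending in _N."""
    match = re.search(r"_(\d+)$", task_name)
    if match is None:
        raise ValueError(f"no attempt suffix in {task_name!r}")
    return int(match.group(1))


def attempts_remaining(task_name: str, max_attempts: int = MAX_ATTEMPTS) -> int:
    """Estimate how many more tries are left after this task, never below zero."""
    used = attempt_index(task_name) + 1  # suffix _0 is the first attempt
    return max(max_attempts - used, 0)


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    for name in ("example_workunit_0", "example_workunit_2", "example_workunit_4"):
        print(name, "-> attempts left:", attempts_remaining(name))
```

Under those assumptions, a task ending in `_4` would have one more try left before the workunit is flagged as an error, which fits the "we ARE getting there" observation above.
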
©2021 Jon Sonntag; All rights reserved