"Computational error"

kotenok2000
Joined: 16 Jul 16
Posts: 5
Credit: 41,125,052
RAC: 833
Message 3280 - Posted: 3 Apr 2021, 12:34:05 UTC
Last modified: 3 Apr 2021, 12:36:46 UTC

I have 13 collatz_sieve files in E:\programdata\BOINC\projects\boinc.thesonntags.com_collatz.
7 of them are 0 KB in size, and there are 7 failed WUs in my queue.
Maybe the server is, for some reason, giving out empty input files?

mikey
Joined: 11 Aug 09
Posts: 927
Credit: 24,523,632,110
RAC: 0
Message 3281 - Posted: 4 Apr 2021, 10:44:47 UTC - in response to Message 3280.  

Are you trying to run CPU tasks here at Collatz? If so, I strongly suggest you use your GPU here and your CPUs at a project that can use them better. The CPU tasks here are very long and don't earn a lot of credit compared to the GPU tasks, and a project like Rosetta or World Community Grid would appreciate your CPUs. You may even help find the cure for cancer or something else as prestigious.

Slicker
Project administrator
Joined: 11 Jun 09
Posts: 78
Credit: 943,644,517
RAC: 0
Message 3282 - Posted: 4 Apr 2021, 16:42:49 UTC

There are a number of possibilities:
1. Bad WU due to the server running out of space on 3/20/21. The problem was rectified the same day, but with 734K workunit records and as many files again, scattered across 1K folders, finding out whether that is the problem is a bit of a needle in a haystack. It requires custom code to investigate, and with new WUs being created and others being deleted when finished, it's a moving target unless I shut down the project for extended periods of time.
2. Apple's OpenCL driver. They may have been the first to implement OpenCL, but they sure weren't the ones to actually adhere to the standards.
3. nVidia drivers. On more than one occasion, they have released drivers that improved performance in games or supported new GPUs at the cost of causing OpenCL errors with existing apps. They usually come out with a fix pretty quickly when that is the case.
4. BOINC client. The error description doesn't sound like that would be the case, but there have been OS X issues in the last year that got by the testers and required new releases just for the Mac version.

You can check #1 more easily than I can: just look in your BOINC data folder. If there are any WUs that are zero bytes in length, you'll know #1 is the issue, since I would assume that BOINC would complain if it was assigned a WU but couldn't download the file. But assuming anything about BOINC usually gets me into trouble (like telling it to only send 64-bit apps to 64-bit machines, and the BOINC server still sending 32-bit apps because it overrides the project admin's wishes just in case the 32-bit app runs faster than the 64-bit app).
If anyone finds that #1 is the issue, let me know which WUs are causing the problem so I can regenerate the associated file to fix it.
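
The zero-byte check described above can be scripted rather than done by eye. What follows is only a minimal sketch, assuming Python 3 is installed; the function name `find_empty_inputs` and the `collatz_sieve` prefix filter are illustrative choices, not part of BOINC or the Collatz app, and the folder passed in has to be your own project folder inside the BOINC data directory.

```python
import os
import sys

def find_empty_inputs(project_dir, prefix="collatz_sieve"):
    """Return sorted paths of zero-byte files under project_dir whose names start with prefix."""
    empty = []
    # os.walk descends into subfolders too, so the same sketch works on a tree of folders.
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            if not name.startswith(prefix):
                continue
            path = os.path.join(root, name)
            try:
                if os.path.getsize(path) == 0:
                    empty.append(path)
            except OSError:
                # The file vanished or is unreadable (BOINC may delete it mid-scan); skip it.
                continue
    return sorted(empty)

if __name__ == "__main__":
    # Hypothetical usage: python find_empty.py <path to the project folder>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_empty_inputs(target):
        print(os.path.basename(path))
```

The printed names can be pasted straight into a post so the matching files can be regenerated.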

Tigers_Dave
Joined: 23 Sep 12
Posts: 195
Credit: 79,654,436,975
RAC: 86,625,138
Message 3283 - Posted: 5 Apr 2021, 0:39:26 UTC - in response to Message 3276.  

I agree with your frustration. I just brought a "new" Mac Mini to the project on March 30 < https://boinc.thesonntags.com/collatz/show_host_detail.php?hostid=879920 >. At last check, that computer has attempted to crunch 2191 tasks. 213 of those tasks prematurely terminated with the "Error while computing". With all due respect to Mikey, who has been a tremendous help to me and other members of the Collatz community over the years, I don't think what he has proposed applies to these errors. Rather, I think there is a problem with the project.

Yet, I can't criticize Slicker. As others have noted quite eloquently, this is a volunteer effort ("labor of love") for him and life events can impact his ability to solve the project's problems. Moreover, having participated in DC projects since June 2001, I think that Collatz has been remarkably stable. For example, even today I can only run one class of Einstein AMD GPU tasks on my Macs without having a validation error rate in excess of 25%.

I have allocated approximately 20% of my GPU resources to Einstein and I allocate 100% of my available CPU resources to Rosetta. But, otherwise, I am going to stick with Collatz. Nonetheless, I would understand if you and others decide to leave Collatz.


At last check of the aforementioned computer, 229 tasks have prematurely terminated (with the "Error while computing" message) out of 2609 total tasks. So, it would appear that I am still receiving problematic tasks, although the percentage of problematic tasks appears to be dropping.


At last check of the aforementioned computer, 415 tasks have prematurely terminated out of 4029 total tasks. So, it looks like the percentage of problematic tasks has risen again. Indeed, because this computer prematurely terminated so many tasks, the server refused to send any more tasks to it without a manual "update".
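
To put numbers on the three checks above, here is a small worked calculation. It is a sketch that assumes the three pairs of counts are cumulative snapshots of the same host, taken in the order given; the variable names are illustrative.

```python
# (failed, total) task counts as reported at three successive checks
checks = [(213, 2191), (229, 2609), (415, 4029)]

def rate(failed, total):
    """Failure rate as a percentage."""
    return 100.0 * failed / total

# Failure rate over everything the host has attempted so far
cumulative = [rate(f, t) for f, t in checks]

# Failure rate among only the tasks added between consecutive checks
incremental = [
    rate(f2 - f1, t2 - t1)
    for (f1, t1), (f2, t2) in zip(checks, checks[1:])
]

for (f, t), pct in zip(checks, cumulative):
    print(f"{f}/{t} failed: {pct:.2f}% cumulative")
for pct in incremental:
    print(f"new tasks since previous check: {pct:.2f}% failed")
```

Under that assumption the cumulative rate goes from about 9.7% to 8.8% and then up to 10.3%, while the rate among newly received tasks swings from roughly 3.8% to 13.1%, which is why the overall percentage first fell and then rose.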

Tigers_Dave
Joined: 23 Sep 12
Posts: 195
Credit: 79,654,436,975
RAC: 86,625,138
Message 3284 - Posted: 5 Apr 2021, 1:11:09 UTC - in response to Message 3282.  
Last modified: 5 Apr 2021, 1:25:26 UTC

Slicker, thank you so much for looking into the problem. It doesn't appear to be possibility #2, as I am having problems with all of my Macs, including those running Darwin 15.6.0, 18.7.0, 19.2.0, and 19.7.0. It doesn't appear to be possibility #3, as the affected Macs all use AMD rather than nVidia GPUs: RX 5700 XT, RX Vega 64, RX Vega 56, and RX 580. It doesn't appear to be possibility #4, as I am having problems with Macs using the following BOINC clients: 7.16.14, 7.16.12, 7.14.2, and 7.10.3. And, as you can see from my other post, possibility #1 does appear to account for at least some of the problem. Please let me know if I can provide any additional information.

Tigers_Dave
Joined: 23 Sep 12
Posts: 195
Credit: 79,654,436,975
RAC: 86,625,138
Message 3285 - Posted: 5 Apr 2021, 1:17:26 UTC - in response to Message 3282.  

Slicker, thank you so much for looking into the issue. I found a number of zero length WU files. Below are 72 of them from one of my Macs (ID: 825825). Note that these were downloaded to this Mac as early as March 29 and as recently as April 4. Let me know if there is anything else I can do to help. Regards, Dave.

collatz_sieve_0a3a8ce9-ab3e-4860-bce1-a4fc2e9027bd
collatz_sieve_0c079d5b-8ef9-417f-aa6a-5390a6b102be
collatz_sieve_0d40c34f-f3cd-4ef8-98ee-5110cd4df38a
collatz_sieve_1ec7d2fb-64fc-4fe8-bf83-bb97b81d9a02
collatz_sieve_2b822406-c958-4268-b683-ef850ff609f4
collatz_sieve_2c78ef89-5637-4e26-b10f-2cffc5b2a73d
collatz_sieve_2e46808e-54d0-4ca2-b2ad-38b55a76eded
collatz_sieve_3d1ef589-24b5-453e-88b7-b74658511c34
collatz_sieve_3f8a71dd-2f65-4d35-8a01-2601a872fc10
collatz_sieve_4f04c731-0241-41cf-9870-0ecf42efcd53
collatz_sieve_5fde8366-51db-40d7-a4af-8476503fb214
collatz_sieve_7a532591-8ed5-4fce-a00c-820bf6a4ef67
collatz_sieve_7dd5a8b2-e814-492f-a231-d0a31521f723
collatz_sieve_7eec0e83-23f5-4b2a-94d1-16816a8d8413
collatz_sieve_08eb19b9-c87f-4385-9a8b-e7bb29c5343d
collatz_sieve_9a70f05c-2925-442d-b838-a001ec053f14
collatz_sieve_9bc26afc-6bc5-4408-9f7e-e38ca4caa483
collatz_sieve_09dd24bf-71a5-48d0-9912-819a516a1440
collatz_sieve_37bd85f5-cd22-4b6e-9274-321ad1bcee15
collatz_sieve_61a6f8a6-9892-43e7-ae5d-296a6226e633
collatz_sieve_63e0e592-4753-495f-9fce-e50469b6a99d
collatz_sieve_74d1d8ac-6a50-4e35-9896-dd034d130376
collatz_sieve_78c29cdc-6a5d-40a3-8795-1ba08c12f410
collatz_sieve_80d1e05c-a594-400e-b206-229bb59b8568
collatz_sieve_091ea25c-d344-46e6-88bd-b8b70535ab75
collatz_sieve_153fb750-d97c-4d8b-8750-38a6cb82438f
collatz_sieve_248f05f6-e658-4556-8d77-c354ddaac476
collatz_sieve_299e36e5-82ba-4f45-b0bb-37db43d5f336
collatz_sieve_0365a6f1-957c-4dc6-bce0-15af628ea878
collatz_sieve_429d03e6-5e5c-4972-a224-a9ba3ab71448
collatz_sieve_695cca1f-be0a-4ea1-ba7d-578510899ed9
collatz_sieve_1599fe35-eb62-48db-b57d-48db0bac0fea
collatz_sieve_3739e4d4-3fb9-4bdb-823a-07dc5ae44bd2
collatz_sieve_6302de80-d9ea-46ba-96dc-6a941b9b7a1c
collatz_sieve_11779ab5-9000-4b95-b047-4c81fbb14cd5
collatz_sieve_15351a5f-1ea3-46d6-b66c-877038fd2016
collatz_sieve_95956f03-10bd-472a-92f7-d20ae5ffd15d
collatz_sieve_149323bb-5251-4bd1-91f6-bbdf028ed1d8
collatz_sieve_301425d2-f1aa-4c0e-aadc-48b6149bed1d
collatz_sieve_3362527c-0abd-4aca-b8f3-fe86e9a29b02
collatz_sieve_16705638-d50f-4b3a-9c58-bdc64ef88017
collatz_sieve_a33304d7-3b71-4627-a045-3432826b5de5
collatz_sieve_aabf7315-f7a4-45cc-bf6d-b11f91c42563
collatz_sieve_b01fa586-c923-4c08-a40d-d3e88da0499f
collatz_sieve_b534559f-4f1e-45a9-9191-d19a1cf96c14
collatz_sieve_c4a4b1cb-9b22-48b3-a340-77369fc79e68
collatz_sieve_c4a8b174-fa38-4577-bc41-7b65be12f98d
collatz_sieve_c4c296a6-3485-49cf-9846-31b21efedbba
collatz_sieve_c6af03ea-0c66-4397-a02e-216860b98bba
collatz_sieve_c1388b1c-f640-4f8e-acb3-d8543eb88eb3
collatz_sieve_c6961ac1-d1a9-4f48-99c8-65204dc337dc
collatz_sieve_c82374a7-d06e-4fdf-b88e-e6b25016752b
collatz_sieve_c798115a-305f-4b6c-a912-251f6d630f49
collatz_sieve_cb062102-e224-4fbf-92e4-68432e9d078e
collatz_sieve_cc0e2791-5c38-41f3-874d-7e97747999de
collatz_sieve_d1bca852-24ec-4cd7-874b-88b5a86637aa
collatz_sieve_d6c5fedd-675d-4957-b65f-229987bfa746
collatz_sieve_d87e2954-4803-4b43-9779-a4e0fd97e309
collatz_sieve_d99aa3a8-a247-474f-b9c3-41ccac6467e7
collatz_sieve_d9129145-133d-4d77-a88b-00dac6f2f622
collatz_sieve_dc91bdcc-1779-44df-8637-c98b3cacb0be
collatz_sieve_e0b695de-1d6c-46fa-8628-332a9e6a1bd9
collatz_sieve_e2e7e576-ccba-4f4c-be5c-c1ffbd060ddc
collatz_sieve_e63b2e8a-a798-4f2c-883a-7a544ab340d5
collatz_sieve_e585edf2-fca2-4267-9b0f-d3953803ffa4
collatz_sieve_e609ea08-b1c6-4717-b36d-da4f4db22266
collatz_sieve_e796a1ee-2a86-47b5-9e6f-d6792e60519d
collatz_sieve_ea444825-b9b8-4e9a-98d3-bb561660058a
collatz_sieve_efa06477-0a26-42ce-8d39-5d1c6e8a3f45
collatz_sieve_f90e8fe2-86ad-48a3-a0ba-9ba6dd069cd8
collatz_sieve_f99cfdc0-896a-4092-a2ae-94d81907e859
collatz_sieve_fbe5ce0e-ba48-4f7a-89d6-9d43f6334ea3

KAMasud
Joined: 20 Oct 11
Posts: 48
Credit: 4,131,098,060
RAC: 8,527,116
Message 3286 - Posted: 6 Apr 2021, 15:01:32 UTC - in response to Message 3283.  
Last modified: 6 Apr 2021, 15:19:29 UTC

My error rate has fallen drastically since installing the "Visual Studio" redistributable, but whatever is still erroring out gives the following, which I will copy and paste. Run time is about 2.04 to 2.06. Nothing about 0 KB files.
Peculiar, though: it is happening on only one computer, ID 874656. The BOINC on this machine is behaving like a virus. I have tried to delete it in order to reinstall, but I have failed. In the morning I will try to go into Safe Mode through Config.sys and then delete it. Windows 10 has hidden Safe Mode somewhere.
Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 4294967194 (0xffffff9a)</message>
<stderr_txt>
Collatz Conjecture Sieve 1.30 Windows x86_64 for OpenCL
Written by Slicker (Jon Sonntag) of team SETI.USA
Based on the AMD Brook+ kernels by Gipsel of team Planet 3DNow!
Sieve code and OpenCL optimization provided by Sosiris of team BOINC@Taiwan
Processor Type NVIDIA
Device Vendor NVIDIA Corporation
Name GeForce RTX 2060
Driver Version 457.51
OpenCL Version OpenCL 1.2 CUDA
worker: error reading input file.
Error -102. Processing Aborted.
11:45:07 (4960): called boinc_finish(-102)

</stderr_txt>
]]>
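
The "(unknown error)" exit code and the `boinc_finish(-102)` line in the output above are the same value seen two ways: the client shows the exit code as an unsigned 32-bit number, while the app passed a negative one. A minimal sketch of the conversion follows; the helper name `to_signed32` is illustrative.

```python
def to_signed32(code):
    """Reinterpret an unsigned 32-bit exit code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return code - 0x100000000 if code & 0x80000000 else code

exit_code = 4294967194
print(hex(exit_code))          # 0xffffff9a
print(to_signed32(exit_code))  # -102
```

So the failure is error -102, which matches the "error reading input file" line rather than pointing at a separate problem.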

KAMasud
Joined: 20 Oct 11
Posts: 48
Credit: 4,131,098,060
RAC: 8,527,116
Message 3287 - Posted: 7 Apr 2021, 6:29:27 UTC
Last modified: 7 Apr 2021, 6:39:59 UTC

Mikey, sorry, I was not able to delete BOINC even in Safe Mode. What is happening is that the front-end GUI, BOINC Manager, gets deleted but the background process does not. I used Revo Uninstaller and got rid of all the floating junk. Afterburner, Task Manager, and HWiNFO64 are all telling me it is still there. I might have to take a route that I hate: manual removal. I have played around, though, and the error rate has gone down further.
BOINC Admin has made some changes which give individual projects control to access BOINC. How do I know this? CPDN WUs report back. I can see that in my internet usage. GPU Grid can also access an individual BOINC. My power rates double between 5 PM and 11 PM, so I power off. My WU reported back to mamma and I am not getting further WUs. Again, Task Manager and internet usage told me about this incident. They are working on COVID and need their WUs back A.S.A.P.
I am not active on the BOINC forums, so I do not know exactly what BOINC Admin has done under the hood. As Slicker said, it is up to BOINC to hand out 32-bit WUs to 64-bit machines. I am getting too old, it seems.

mikey
Joined: 11 Aug 09
Posts: 927
Credit: 24,523,632,110
RAC: 0
Message 3288 - Posted: 7 Apr 2021, 10:45:07 UTC - in response to Message 3287.  

The first thing I would do is run an antivirus check with a couple of different tools; being unable to act as Admin of your own PC is one key sign of a virus. I would use the one you have on the PC now and something else like Malwarebytes or Comodo, or even the one built into Win10. Depending on the PC, I pay for ESET but also use the free Win10 one and the free version of Avast on my various PCs. I do not run multiple ones on the same PC every day, though.

After that I would try to uninstall BOINC again and then reinstall the latest version. As for BOINC choosing which version, 32 or 64 bit, your PC gets, that's not exactly what he said. What Slicker said was that the BOINC software, meaning the server side he is running, may choose to send you 32-bit tasks if there are no 64-bit tasks available. So it's not BOINC choosing; it's his setting in the server to send us tasks when we ask for them rather than say 'sorry, no tasks are available, try again later'.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3289 - Posted: 7 Apr 2021, 14:35:55 UTC - in response to Message 3277.  
Last modified: 7 Apr 2021, 14:36:42 UTC

The only thing I can say is, copied and pasted from Home Page.
"Windows applications require the Microsoft Visual C++ Redistributable for Visual Studio 2017. It is recommended to install both the x86 and x64 versions since BOINC may decide to run the 32-bit app if no 64-bit work units are available when requesting work".
After so many years of sleeping peacefully and letting us sleep peacefully, someone has decided to send out 32-bit app's. Ask Mikey or Slicker. No crashed WU's to report so far. I wish I knew the reason why in simple language.
I just checked my Windows 10 machines. They have many of those, but only the 2005, 2008, 2010, 2012, 2013, and "2015-2019" versions. I assume 2015-2019 includes the 2017 you speak of.

I have 3 machines with GPUs. On one machine I have 6 errors and 108 valid. On another I have 12 errors and 41 valid. On the other I have 22 errors and 104 valid. Every time I check, the failed tasks have been sent to many other people too, so something is up. At least they only waste several seconds at my end. I guess eventually somebody will spot these on the server (unless they get done eventually by somebody who has the secret to making them work?)

Maybe.... they're crashing because they've found a flaw in the conjecture and we've done it?

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3290 - Posted: 7 Apr 2021, 14:43:41 UTC

Whoops! It doesn't say "visual studio" on the end of mine. I guess I have something different, they just say something like "Microsoft Visual C++ 2015-2019 Redistributable (x64)".

I'll try to find the visual studio one and see if that helps. Strange it's only some of the tasks that crash though.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3291 - Posted: 7 Apr 2021, 14:57:40 UTC

Not sure what's going on, I installed this (x86 and x64):
https://support.microsoft.com/en-us/topic/the-latest-supported-visual-c-downloads-2647da03-1eea-4433-9aff-95f26a218cc0
Which should put on 2015-2019 but it doesn't show up in programs and features under Microsoft or under Visual, or by searching "visual". I'll just have to assume it's there.

KAMasud
Joined: 20 Oct 11
Posts: 48
Credit: 4,131,098,060
RAC: 8,527,116
Message 3292 - Posted: 8 Apr 2021, 4:11:53 UTC - in response to Message 3288.  

Thank you Mikey, I know what Slicker said about 32-bit tasks. No problem.
"Windows applications require the Microsoft Visual C++ Redistributable for Visual Studio 2017. It is recommended to install both the x86 and x64 versions since BOINC may decide to run the 32-bit app if no 64-bit work units are available when requesting work."
About the "x86 and x64 versions" bit: I cannot find the x86 version anywhere on Microsoft's site.
The second point is that one computer, with the Visual Studio redistributable installed, has gone to a zero error rate. You can check.
It is the other one, with the RTX 2060 in it, that is bothering me. The BOINC on it seems to be infected and, like all infected things, it is protecting itself. Windows Uninstall cannot get rid of it. Revo Uninstaller cannot get rid of it. Plus, if I suspend it in order to free resources, after a little while, when it knows my mind is somewhere else, it promptly unsuspends itself.
The error rate has gone down further, though, with whatever I am doing. Only one task errored out yesterday.
I suppose some old-fashioned butchery is in order: manual uninstall.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3293 - Posted: 8 Apr 2021, 10:32:10 UTC - in response to Message 3292.  

This page has both 32 and 64 bit.
I installed it on my 3 GPU machines and I still get errors. I don't think there's anything wrong with your machines; it's the project.

mikey
Joined: 11 Aug 09
Posts: 927
Credit: 24,523,632,110
RAC: 0
Message 3294 - Posted: 8 Apr 2021, 10:40:20 UTC - in response to Message 3293.  

I think you posted earlier in this thread that you are getting errors in units that others have also gotten errors in. If you look, each task has to error out 8 times before it is considered 'bad' and the admin takes a look at it. With the recent server problems, there could be a bunch of bad units to go through yet.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3295 - Posted: 8 Apr 2021, 10:55:27 UTC - in response to Message 3294.  
Last modified: 8 Apr 2021, 10:56:07 UTC

Yes every time I check I'm about the 6th or 7th one to get it (number at the end of the task name).
Is KAMasud getting a different error?
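
The "number at the end of the task name" can be read off programmatically. This is a sketch under an assumption: that a task name is the workunit name followed by an underscore and a copy index, with numbering starting at 0. The example name below is made up for illustration, and `split_task_name` is a hypothetical helper.

```python
def split_task_name(task_name):
    """Split 'workunit_N' into (workunit_name, N) using the last underscore."""
    workunit, sep, index = task_name.rpartition("_")
    if not sep or not index.isdigit():
        raise ValueError(f"no numeric suffix in {task_name!r}")
    return workunit, int(index)

# Made-up task name, for illustration only
name = "collatz_sieve_00000000-0000-0000-0000-000000000000_6"
workunit, index = split_task_name(name)
print(workunit)   # collatz_sieve_00000000-0000-0000-0000-000000000000
print(index + 1)  # 7, i.e. the 7th copy sent out if numbering starts at 0
```

A suffix close to the error limit mentioned earlier in the thread suggests the workunit is about to be given up on, rather than that the local machine is at fault.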

KAMasud
Joined: 20 Oct 11
Posts: 48
Credit: 4,131,098,060
RAC: 8,527,116
Message 3297 - Posted: 8 Apr 2021, 15:03:23 UTC - in response to Message 3295.  

Come to think of it, here is your answer: "errors Too many errors (may have bug) Too many total results". I would love to put an "a" between "have" and "bug", but then it wouldn't be copy and paste.
Some are still on some machines, waiting for their last run.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3298 - Posted: 9 Apr 2021, 10:05:09 UTC - in response to Message 3297.  

I wonder if we've got to such large numbers that there's an overflow? Time to go double precision? I've got some nice old 280X cards, the ones before they took the fast double precision away :-(

mikey
Joined: 11 Aug 09
Posts: 927
Credit: 24,523,632,110
RAC: 0
Message 3299 - Posted: 9 Apr 2021, 10:43:49 UTC - in response to Message 3298.  

Yes, double precision cards do much better here.

Peter Hucker
Joined: 5 Jul 11
Posts: 39
Credit: 506,043,727
RAC: 12,056
Message 3300 - Posted: 9 Apr 2021, 12:36:01 UTC - in response to Message 3299.  

AFAIK Collatz is single precision. Primegrid seems to be a mix of the two. MW is double. Einstein is single.

These are judged by how fast they run on my 280X (4096 Gflops single, 1024 double) compared to my RX560 (2611 Gflops single, 163 double). If a project is single precision, it should run about 1.6 times as fast on the 280X, and if it's double, about 6 times as fast.
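
Dividing the figures quoted above gives the expected speed-ups directly. This is a minimal sketch; the dictionary layout and names are illustrative, and peak Gflops is only a rough stand-in for how fast a real task runs.

```python
# Peak throughput figures quoted in the post, in Gflops
cards = {
    "280X":  {"single": 4096, "double": 1024},
    "RX560": {"single": 2611, "double": 163},
}

# Expected speed-up of the 280X over the RX560 for each precision
ratios = {
    precision: cards["280X"][precision] / cards["RX560"][precision]
    for precision in ("single", "double")
}

for precision, ratio in ratios.items():
    print(f"{precision}: 280X is {ratio:.2f}x the RX560")
```

That works out to roughly 1.57x for single precision and 6.28x for double, so a project's observed speed-up between the two cards hints at which precision it leans on.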


©2022 Jon Sonntag; All rights reserved