Posts by BetelgeuseFive
1) Message boards : Number crunching : Optimizing Collatz Sieve (Message 22445)
Posted 487 days ago by BetelgeuseFive
Good catch! In the 1st post it's written that 6 is actually the minimum for threads. And it makes sense that too small values should not be allowed, since we've got vector ALUs and not scalar ones. I wrote that the difference between those thread values was quite small, 3% at most, so maybe I simply didn't average over enough WUs.

Regarding lut_size: yes, values 1 - 2 higher than what I wrote perform better. But you should see increased DRAM power consumption with that (doesn't matter much) and, more importantly, increased memory bandwidth consumption which probably slows down your other tasks. Hence I would not generally recommend this, especially if your CPU is also feeding a fast discrete GPU.

MrS


Thanks for the extra info. I am now back to lut_size=19.

Tom
2) Message boards : Number crunching : Optimizing Collatz Sieve (Message 22442)
Posted 487 days ago by BetelgeuseFive
Thanks for the info.
On my i5-6500 lut_size=20 seems to work better than lut_size=19.
What exactly does threads=1 do ? I see no difference in the output file:

threads 2^5 (32)
actual threads 32

My config file:

verbose=1
kernels_per_reduction=64
lut_size=20
sieve_size=26
threads=1

With threads=1 I would expect to see:

threads 2^1 (2)
actual threads 2

Am I missing something here ?
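
One possible explanation, sketched here as an assumption and not as the app's actual code: the threads value is read as an exponent and raised to a floor before it is used. The function name effective_threads and the min_exp parameter are hypothetical, and the floor of 5 is only what the output above would imply on this host.

```python
def effective_threads(threads_exp: int, min_exp: int) -> int:
    """Return the thread count 2**n after raising n to an assumed floor."""
    # Hypothetical clamp: exponents below min_exp are raised to min_exp.
    exp = max(threads_exp, min_exp)
    return 2 ** exp


print(effective_threads(1, min_exp=5))  # 32
print(effective_threads(8, min_exp=5))  # 256
```

If something like this is happening, threads=1 and threads=5 would produce identical output files.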

Thanks,

Tom
3) Message boards : Number crunching : Optimizing Collatz Sieve (Message 22354)
Posted 505 days ago by BetelgeuseFive
Then go for this config :)
lut_size=20

... and maybe higher in a few days, if you want to test it. In the worst case CC and other BOINC projects would become slower.

MrS


Thank you for sharing this !
For me (i5-6500 with 2 * 8 GB dual channel DDR4-2133) setting lut_size to 20 meant that run times went from approx. 2800 seconds to less than 1800 seconds.
Did you make any other changes except for the lut_size ? (I can't check your results as your computers are hidden)

Thanks,

Tom
4) Message boards : Number crunching : Intel HD 530 - Functional Driver (Message 22283)
Posted 517 days ago by BetelgeuseFive
I cannot get my Intel HD 530 (with i3 6300 CPU) to work under Win 10 Pro 64-bit: the work units return a computation error immediately upon starting.

Can I get some feedback as to the driver version with which people have been successful in processing their units?

Thanks.


I have no problems with my HD 530 (i5-6500, Win 10 Home 64-bit).

I am using driver version 15.40.18.4380 downloaded from here:
https://downloadcenter.intel.com/download/25818/Intel-Graphics-Driver-for-Windows-7-8-1-10-15-40-

Tom
5) Message boards : Number crunching : Computation error: out of resources (Message 21964)
Posted 606 days ago by BetelgeuseFive
Hello,

One of my tasks gave the following error message:

Error: (-5)Out of resources at 643 of GetContext
clCreateContext() failed (-5) Out of resources
Error -5. Processing Aborted.

http://boinc.thesonntags.com/collatz/result.php?resultid=43953103

Other tasks (both here and on other projects) did not result in any errors.
The NVidia card is used for number crunching only (display is connected to the integrated Intel GPU). All settings are default and I only run one task at the same time.
When monitoring other tasks using GPU-Z there is plenty of memory available.

Any thoughts on what may be the cause of this error message ?

Thanks,

Tom
6) Message boards : Number crunching : Mini Collatz errors (Message 21305)
Posted 735 days ago by BetelgeuseFive
Hi folks,

I have a couple of tasks that resulted in validation errors.

http://boinc.thesonntags.com/collatz/workunit.php?wuid=19784824
http://boinc.thesonntags.com/collatz/workunit.php?wuid=19784665

The first workunit was assigned to another CUDA host and resulted in the same error.

Any clues ?

Thanks,

Tom
7) Message boards : Number crunching : Invalid task (Message 20872)
Posted 793 days ago by BetelgeuseFive
Hi folks,

I have an invalid task, something I have not seen before:

http://boinc.thesonntags.com/collatz/result.php?resultid=19752441

The stderr output looks OK to me. Any thoughts on what may have caused this ?

Thanks,

Tom


I had 2 more:

http://boinc.thesonntags.com/collatz/result.php?resultid=19792563
http://boinc.thesonntags.com/collatz/result.php?resultid=19802167

Any clues ?

Thanks,

Tom
8) Message boards : Number crunching : Invalid task (Message 20869)
Posted 794 days ago by BetelgeuseFive
Hi folks,

I have an invalid task, something I have not seen before:

http://boinc.thesonntags.com/collatz/result.php?resultid=19752441

The stderr output looks OK to me. Any thoughts on what may have caused this ?

Thanks,

Tom
9) Message boards : News : New Windows CUDA and OpenCL Versions Released (Message 20761)
Posted 808 days ago by BetelgeuseFive
Over 60 units (mostly CUDA, only a couple of OpenCL) completed and validated, not a single error. Problems fixed I should say.

Thanks again,

Tom
10) Message boards : News : New Windows CUDA and OpenCL Versions Released (Message 20749)
Posted 809 days ago by BetelgeuseFive
Found it in my account settings. Disabled OpenCL there.

Thank you !

The first two (CUDA) units completed and validated.
Now running two OpenCL units. Don't like it very much that these are using two CPU cores as well. Is there a way to select CUDA only ?

Tom
11) Message boards : News : New Windows CUDA and OpenCL Versions Released (Message 20748)
Posted 809 days ago by BetelgeuseFive
Thank you !

The first two (CUDA) units completed and validated.
Now running two OpenCL units. Don't like it very much that these are using two CPU cores as well. Is there a way to select CUDA only ?

Tom
12) Message boards : Number crunching : Errors on CUDA workunit (Message 20709)
Posted 813 days ago by BetelgeuseFive
I'm seeing exactly the same problem on my GTX-750:

http://boinc.thesonntags.com/collatz/result.php?resultid=18915235

Collatz Conjecture v6.05 Windows x86_64 for CUDA 5.5
Based on the AMD Brook+ kernels by Gipsel
Using optimizations provided by Sosirus
Config: verbose=1 items_per_kernel=131072 kernels_per_reduction=64 threads=512 sleep=1
Name GeForce GTX 750
Compute 5.0
Parameters --device 0
Start 2397882917555217104896
Checking 107374182400 numbers
Numbers/Kernel 131072
Kernels/Reduction 64
Numbers/Reduction 8388608
Reductions/WU 12800
Threads 512
Using: verbose=1 items_per_kernel=131072 kernels_per_reduction=64 threads=512 sleep=1
cudaSafeCall() failed at CollatzCudaKernel11.cu:362 : device not ready
17:29:06 (3768): called boinc_finish

Other projects work fine so I don't think there is something wrong with my computer.
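
As a sanity check, the derived values in the log above are consistent with the configured ones, so the setup itself looks sane. The variable names below are mine, and the relationship between the log lines is inferred from the numbers, not taken from the app's source.

```python
numbers_per_kernel = 131072      # items_per_kernel from the config line
kernels_per_reduction = 64       # kernels_per_reduction from the config line
total_numbers = 107374182400     # "Checking ... numbers"

# Numbers/Reduction should be the product of the two configured values.
numbers_per_reduction = numbers_per_kernel * kernels_per_reduction
# Reductions/WU should be the total divided by the numbers per reduction.
reductions_per_wu = total_numbers // numbers_per_reduction

print(numbers_per_reduction)  # 8388608
print(reductions_per_wu)      # 12800
```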

Tom
13) Message boards : Number crunching : Errors on CUDA workunit (Message 20644)
Posted 825 days ago by BetelgeuseFive
Another indication this problem has nothing to do with 32-bit apps on 64-bit platforms.
Have a look at this host:

http://boinc.thesonntags.com/collatz/show_host_detail.php?hostid=127401

It is a 32-bit Windows system and it has exactly the same problems.

Tom
14) Message boards : Number crunching : Computation Errors (Message 20635)
Posted 827 days ago by BetelgeuseFive
Recently, within the last 10 days, I have been seeing computation errors popping up: as the last WU finishes, the next one to start runs for just 2 secs and then errors out. This usually occurs in pairs, then stops, and the next WU runs flawlessly.
And I see it too, with several R3 Kalindi on Debian Jessie.
http://boinc.thesonntags.com/collatz/result.php?resultid=18364275

I have not changed anything.


This one is interesting:

http://boinc.thesonntags.com/collatz/workunit.php?wuid=15913445

Exactly the same error message for both an AMD and an NVIDIA GPU:

At offset 16777216 got 627 from the GPU when expecting 203

My guess: this is the same problem as discussed in this thread:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1272

Tom
15) Message boards : Number crunching : Errors on CUDA workunit (Message 20634)
Posted 827 days ago by BetelgeuseFive
x86_64 = 64-bit
intelx86 = 32-bit


All my mini_collatz tasks (both the failed and the successful ones) report:

Collatz Conjecture v6.04 Windows x86_64 for CUDA 5.5

Also there is no 32-bit mini_collatz app in my collatz directory.
This combined with the fact that the problem also occurs when I specified no_alt_platform in my cc_config.xml leads me to believe that the problem has nothing to do with 32-bit apps running on a 64-bit system.
As the problem always seems to happen in the first seconds of execution I think it is more likely that there is some kind of initialization issue (uninitialized variables ?). Especially when they are on the stack uninitialized variables can be hard to find because they may consistently receive the same value left from the previous function being called.

Tom
16) Message boards : Number crunching : Errors on CUDA workunit (Message 20615)
Posted 828 days ago by BetelgeuseFive
Managed to get the sieve app up and running (not only needed to select the sieve app, but also had to enable 'test'). After 6 minutes it is at 0.5%. System is very sluggish (not nice to work with anymore). GPU at 99%, memory usage above 900 MB. Will let it run for now ...

Tom
17) Message boards : Number crunching : Errors on CUDA workunit (Message 20614)
Posted 829 days ago by BetelgeuseFive
Hello Slicker, thank you for trying.

I know it is hard to find problems if you can't reproduce them (I am a software engineer with over 25 years of experience).

I will try the sieve app when I am able to get some work (so far I had 9 failed requests):

17-6-2015 18:03:58 | Collatz Conjecture | Sending scheduler request: Requested by user.
17-6-2015 18:03:58 | Collatz Conjecture | Requesting new tasks for NVIDIA GPU
17-6-2015 18:04:00 | Collatz Conjecture | Scheduler request completed: got 0 new tasks

Is there a way I can check (in the task output) if a 32-bit or a 64-bit app is used ? All I see is:

Collatz Conjecture v6.04 Windows x86_64 for CUDA 5.5

My BOINC thesonntags.com_collatz directory only contains a single executable named:

mini_collatz_6.04_windows_x86_64__cuda55.exe

Based on the name I would assume this is a 64 bit executable.

How do I find/recognize the 32 bit app ?
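
One way to check this independently of the file name is to read the Machine field in the executable's PE header. The sketch below assumes a standard Windows PE file; the function names pe_bitness and file_bitness are mine, and only the two machine codes relevant here are mapped.

```python
import struct

# Machine field values from the PE/COFF specification.
MACHINES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x86-64)"}


def pe_bitness(data: bytes) -> str:
    """Classify a Windows executable from the leading bytes of the file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 2-byte Machine field directly follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINES.get(machine, f"unknown (0x{machine:04x})")


def file_bitness(path: str) -> str:
    """Read the start of a file and classify it (headers normally fit in 4 KB)."""
    with open(path, "rb") as f:
        return pe_bitness(f.read(4096))
```

Calling file_bitness on each .exe in the project directory would then show which platform every downloaded app was built for.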

What also surprises me is that I still got errors after adding:

<no_alt_platform>1</no_alt_platform>

to my cc_config.xml file. The errors occurred after restarting BOINC (causing errors on some other projects) and requesting new tasks.
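
For reference, the flag belongs inside the options element of cc_config.xml. The following is a minimal sketch of setting it programmatically; the helper name set_no_alt_platform is hypothetical and the input is a bare-bones config, not my actual file.

```python
import xml.etree.ElementTree as ET


def set_no_alt_platform(xml_text: str, enabled: bool = True) -> str:
    """Return the config text with <no_alt_platform> set under <options>."""
    root = ET.fromstring(xml_text)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    # Create <options> if the config does not have one yet.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    # Create or update the flag itself.
    flag = options.find("no_alt_platform")
    if flag is None:
        flag = ET.SubElement(options, "no_alt_platform")
    flag.text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")


print(set_no_alt_platform("<cc_config><options/></cc_config>"))
# <cc_config><options><no_alt_platform>1</no_alt_platform></options></cc_config>
```

If the flag ends up outside options, the client would presumably not pick it up, which is worth ruling out before blaming the app.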

Is there another way to force using the 64-bit app ? I know that for
you can specify the executables used in a separate file (I used anonymous platform over on SETI@home in the past).

If there is anything I can do to try to figure out what is causing this problem, please let me know. I think it is highly unlikely that this problem is caused by hardware/heat issues. The results should be more random in that case and not all hosts should fail in exactly the same way. Problems with drivers may be an issue, but I checked my wingmen on the failed results and there were a lot of different driver versions being used (but no, that does not prove anything ...).


Tom
18) Message boards : Number crunching : fail (Message 20591)
Posted 831 days ago by BetelgeuseFive
Anybody seen the likes of this: 14.06.2015 16:31:50 | Collatz Conjecture | [error] Checksum or signature error for mini_collatz_6.04_windows_intelx86__opencl_nvidia_gpu.exe ?


That looks like an internet error thru some of the hops from the Collatz Server to you.

This error though 'Error: GPU steps do not match CPU steps.' makes me wonder if you are leaving a cpu core free just for the gpu to use? This error is seen on the machine with the Nvidia 960 in it.


The 'GPU steps do not match CPU steps' error is a problem in the Mini Collatz CUDA app. See this thread:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1272

Personally I do not believe it is related to 32-bit application on 64-bit platforms (and I have sent Slicker a PM about this). I have not yet seen a single case where such an error occurred while the same workunit was later successfully completed by another CUDA host.

Tom
19) Message boards : Number crunching : Errors on CUDA workunit (Message 20554)
Posted 834 days ago by BetelgeuseFive
Even with the no_alt_platform option I'm still seeing some errors.

http://boinc.thesonntags.com/collatz/result.php?resultid=18191525

From the earlier errors I found it strange that not a single one of the workunits that failed for me was later correctly handled by another CUDA host. All other CUDA tasks sent for the same workunit failed with exactly the same message.
So my question: are you absolutely sure this is a 32/64 bit platform issue ? I get the impression the problem is related to the CUDA app and not to 32/64 bit issues. Is there a way I can see if a task was handled by a 32-bit or a 64-bit app ?

Thanks,

Tom
20) Message boards : Number crunching : Errors on CUDA workunit (Message 20541)
Posted 835 days ago by BetelgeuseFive
Hello,

I've recently returned to this project and noticed that my system had a number of tasks with errors. As my system (GTX-750) does not produce any errors on other projects (Einstein, SETI, POEM) I took another look. It turns out that other wingmen that were also running CUDA returned exactly the same error message.
Here are a couple of examples:

http://boinc.thesonntags.com/collatz/workunit.php?wuid=15778372
http://boinc.thesonntags.com/collatz/workunit.php?wuid=15779377

Is this a known issue (I tried to search the forum but could not find anything about it) ? If so, is a fix expected in the near future ?

Thanks,

Tom


I typed in a lengthy response talking about heat, etc. which is usually the case but after re-reading your post, I double checked the other errors. Congratulations. You appear to have found a compatibility issue.

All of you are running a 64-bit version of Windows. BOINC, even though the flag is set on the server to only send 64-bit apps to 64-bit operating systems, decided to send you 32-bit applications just in case they ran faster. So.. there are really two bugs. The first is that the CUDA55 doesn't return the proper results when run on Win32. The second is with BOINC's handling of the "use preferred platform" flag.

Why does BOINC do that at all? Some project admins are so technology challenged (that's a nice way of saying they probably shouldn't even be in IT) that they have 64-bit apps that run slower than their 32-bit versions and don't remove them. They instead expect BOINC to figure out which of their versions runs faster on which operating systems. That's why BOINC randomly sends 32-bit apps to 64-bit operating systems.

The only workaround I know of is to set the no_alt_platform in your BOINC cc_config.xml. The problem with doing that is that if you crunch some other project that only has 32-bit applications, it won't get any work from them.

The fix would be to change the BOINC server code so that it never sends 32-bit apps to 64-bit operating systems, or at least on Collatz (yet another server specific code change that I'll have to merge back in the next time I upgrade the server software.)

Hopefully the new sieve apps won't behave the same way. In my testing so far, the 32-bit nVidia apps run just fine on Win 7 x64 and on Win 8.1 x64. I don't have a CUDA specific version compiled yet (just OpenCL).


Thanks for the explanation. I will try the no_alt_platform option (would not have thought of that without you mentioning it). Got a lot of error messages after restarting BOINC, but I don't think there are any serious issues.
If it does cause problems I will let you know.

Tom




Copyright © 2017 Jon Sonntag; All rights reserved.