Optimizing Collatz v6.xx OpenCL and CUDA Applications

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2451
Credit: 675,942,722
RAC: 4,809
Message 16503 - Posted: 16 May 2013, 20:45:19 UTC
Last modified: 8 Jul 2015, 16:26:41 UTC

Each Collatz 6.xx application is distributed with an empty config file. The config file has the same name as the executable but with the extension ".config".

There are a number of parameters that can be altered to improve speed or video response or to aid in solving issues. They are:

verbose=[0|1]
A value of 1 causes more information about the GPU, OpenCL version, etc. to be written to the log file. If enabled, this should be the first line of the config file so that it will report the other settings in the log file.

items_per_kernel=[10..22]
The value N is an exponent: 2^N 256-bit numbers are calculated per kernel call. Setting this number higher places a larger load on the GPU. Setting the number too high WILL cause the driver to crash and the application to hang. The default is 14, or 2^14, or 16384 items.

kernels_per_reduction=[2..9]
The number (2^N once again) of kernels to run before doing a reduction. The default is 8 or 2^8 = 256. A lower number can improve video response. A larger number may result in a higher GPU load. Too high a number will result in CPU as well as GPU utilization.

threads=[5..10]
This contains the number of work groups to run in parallel. Higher is not necessarily faster. This number is device dependent. If set too high, the application will automatically reduce it to a value compatible with the device.
Most AMD GPUs allow up to 256 (a setting of 8). NVidia GPUs may allow 512 or even 1024 (a setting of 9 or 10). OpenCL requires a minimum of 32 (a setting of 5) according to the Khronos specifications.
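As a rough illustration of the threads setting, the sketch below converts the exponent into a thread count and caps it at a device limit. The function name and the simple min() clamp are assumptions made for the example; the application's actual reduction logic is not shown in this thread.

```python
def effective_threads(threads_setting: int, device_max: int) -> int:
    """Return the thread count implied by a `threads` setting.

    The setting is an exponent in the range 5..10, so the requested
    count is 2**N. Capping at device_max is an assumed stand-in for
    "reduce it to a value compatible with the device".
    """
    if not 5 <= threads_setting <= 10:
        raise ValueError("threads must be between 5 and 10")
    requested = 2 ** threads_setting
    return min(requested, device_max)


# A setting of 8 asks for 256; a setting of 10 on a 256-limit device is capped.
print(effective_threads(8, 256))
print(effective_threads(10, 256))
```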

build_options=[string containing any optional OpenCL build options]
This was added strictly for debugging in order to be able to use "-cl-opt-disable -Werror". If the OpenCL application crashes within 1-2 seconds of starting, you may want to use "build_options=-cl-opt-disable -Werror" and see if that fixes the problem.

sleep=[1..1000]
This controls the number of milliseconds that the application goes into a sleep state while waiting for the asynchronous kernel calls to complete. The default is 1. Setting this higher (e.g. 2-5) will result in better video response but will slow down the application considerably.

lut_size=[5..20]
This controls the size of the lookup table and is new as of version 6.05. The goal is to make the table as large as possible while still fitting into the L2 cache of the GPU. The default is 12. The table size in bytes is 8 x (2^N), so the default uses 8 x (2^12) = 32,768 bytes, or 32K. If your GPU has an L2 cache size greater than 64K, you will likely want to increase this. A lut_size that uses 50-75% of your L2 cache size is likely optimal. Some may find that exceeding the cache size has no negative effect on performance. For those, using the previous default setting of 20 may yield the best results.
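The lut_size arithmetic can be sketched as follows. The helper names and the 512K cache figure in the usage lines are hypothetical; look up the real L2 size for your own GPU before picking a value.

```python
def lut_bytes(lut_size: int) -> int:
    """Table size in bytes for a given lut_size: 8 x 2^N."""
    return 8 * 2 ** lut_size


def suggest_lut_size(l2_cache_bytes: int, max_fraction: float = 0.75) -> int:
    """Largest lut_size in 5..20 whose table fits within max_fraction of L2.

    Falls back to the minimum of 5 if even the smallest table is too big.
    """
    budget = l2_cache_bytes * max_fraction
    best = 5
    for n in range(5, 21):
        if lut_bytes(n) <= budget:
            best = n
    return best


# Default setting: 8 x 2^12 bytes.
print(lut_bytes(12))
# Hypothetical 512K L2 cache, using at most 75% of it.
print(suggest_lut_size(512 * 1024))
```

This only applies the 50-75% rule of thumb. Since some cards reportedly run fine with a table larger than the cache, timing a few workunits at each candidate value is still the deciding test.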

The config file will be renamed to collatz.config when it is copied to the BOINC slot folder when an application starts running. Exiting BOINC and editing the version in the project folder will not change the settings of the applications in progress as their config is taken from the slot folder.

A sample collatz.config file looks like:

verbose=1
items_per_kernel=20
kernels_per_reduction=9
threads=8
sleep=1
lut_size=12
build_options=-Werror
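Before dropping a file like the sample above into the project folder, it can help to check each value against the ranges listed earlier in this post. The sketch below is a stand-alone checker, not part of the Collatz application; the function names are made up for the example and the ranges are copied from the parameter descriptions above.

```python
# Allowed ranges as listed in the parameter descriptions.
RANGES = {
    "verbose": (0, 1),
    "items_per_kernel": (10, 22),
    "kernels_per_reduction": (2, 9),
    "threads": (5, 10),
    "sleep": (1, 1000),
    "lut_size": (5, 20),
}

SAMPLE = """verbose=1
items_per_kernel=20
kernels_per_reduction=9
threads=8
sleep=1
lut_size=12
build_options=-Werror
"""


def parse_config(text):
    """Split key=value lines into a dict; lines without '=' are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_ranges(settings):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for key, (low, high) in RANGES.items():
        if key not in settings:
            continue
        try:
            value = int(settings[key])
        except ValueError:
            problems.append(f"{key}: not an integer")
            continue
        if not low <= value <= high:
            problems.append(f"{key}={value} outside {low}..{high}")
    return problems


print(check_ranges(parse_config(SAMPLE)))
```

build_options is left as a free-form string because it has no numeric range.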


Since the workunits vary somewhat in the number of total steps they produce, I would suggest that you run several and take the average runtime to determine whether one set of values in the config works better than another set.
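One way to do that comparison is sketched below. The runtimes in the example are placeholders, not measurements; substitute the run times reported for your own completed workunits.

```python
from statistics import mean


def compare_runs(baseline_secs, candidate_secs):
    """Return (baseline mean, candidate mean, percent reduction in runtime)."""
    base = mean(baseline_secs)
    cand = mean(candidate_secs)
    return base, cand, (base - cand) / base * 100.0


# Placeholder runtimes in seconds for two config variants.
old_config = [600, 620, 610]
new_config = [500, 510, 490]

base, cand, saved = compare_runs(old_config, new_config)
print(f"baseline {base:.0f}s, candidate {cand:.0f}s, {saved:.1f}% faster")
```

A handful of workunits per variant is the minimum. The more runs you average, the less a single unusually long or short workunit skews the result.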

Note: The values in the sample above work quite well on my HD 6970 and HD 7970 without making either too sluggish.

zombie67 [MM]
Volunteer tester
Joined: 3 Jul 09
Posts: 156
Credit: 612,310,084
RAC: 1,097
Message 16562 - Posted: 24 May 2013, 14:00:21 UTC
Last modified: 24 May 2013, 14:24:00 UTC

Has anyone played around with the 7970, and figured out the best setting for a dedicated cruncher? In other words, the best performance that is stable, regardless of video response?

Edit: I forgot to mention, I am asking about solo_collatz
____________
Dublin, California
Team: SETI.USA

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17007 - Posted: 30 Jun 2013, 0:11:12 UTC
Last modified: 30 Jun 2013, 0:52:39 UTC

Solo_Collatz Usage

Thanks Slicker ..... the effect of those settings has been substantial for me. I am now running that on my main machine, and it produced a 40% improvement for me, from circa 870 secs to circa 530 secs for each crunched file.

My hardware remains unchanged apart from upping the fan from 40% to 55%, and utilization has increased to 98%. Apart from the fan change, the settings below were in place both before and after the addition of the "XML Configuration" file.

For those thinking of using an "XML configuration" format file as given by Slicker, below are my various settings in The Beast as a starting point for your own thinking.

MSI AMD 7970 Lightning GPU Cards, running on a 3960x@3.6Ghz 2133Mhz sitting over a Corsair SSD
GPU Drivers: 13-4_win7_win8_64_dd_ccc_whql, BOINC 7.0.64

Afterburner Setting:
Core Voltage: 1213
Power Limit: +0
Core Clock: 1225
Memory Clock: 1675
Fan Speed: 55% (running temperature of the GPU is 74 degrees, utilisation is 98%)

The XML Configuration File I used is the same as the one Slicker posted ie:
<configuration>
verbose=1
items_per_kernel=20
kernels_per_reduction=9
threads=8
sleep=1
build_options=-Werror
</configuration>

My app_config is:
<app_config>
<app>
<name>solo_collatz</name>
<max_concurrent>1</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
</app_config>
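For readers wondering how an app_config like this relates to the number of tasks sharing a GPU: in BOINC, gpu_usage is the fraction of a GPU each task claims, so 1.0 means one task per GPU and 0.5 would mean two. The sketch below reads that value back out of the XML; the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

APP_CONFIG = """<app_config>
<app>
<name>solo_collatz</name>
<max_concurrent>1</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
</app_config>"""


def tasks_per_gpu(xml_text, app_name):
    """Return how many tasks of app_name would share one GPU."""
    root = ET.fromstring(xml_text)
    for app in root.findall("app"):
        if app.findtext("name") == app_name:
            usage = float(app.findtext("gpu_versions/gpu_usage"))
            return int(1.0 / usage)
    raise KeyError(app_name)


print(tasks_per_gpu(APP_CONFIG, "solo_collatz"))
```

Note that max_concurrent also limits how many tasks of the app run at once, independent of gpu_usage.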

My cc_config is:
<cc_config>
<options>
<ignore_ati_dev>0</ignore_ati_dev>
<save_stats_days>180</save_stats_days>
<max_file_xfers>48</max_file_xfers>
<max_file_xfers_per_project>36</max_file_xfers_per_project>
</options>
</cc_config>

I am running Collatz on the second (device 1) 7970, leaving me free to mess around with non-boinc stuff on the first (device zero) 7970

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17041 - Posted: 3 Jul 2013, 6:30:02 UTC
Last modified: 3 Jul 2013, 6:39:22 UTC

Been using this for a few days now; currently I am getting 500-530 secs per GPU WU on average on solo-collatz with this.
From what I can find out, most crunchers who are not using this are running the same WUs in 1,500 to 2,000 secs or more (with either a 7970 or a GPU close to a 7970) ...... astonishingly, not many crunchers have implemented this.

On mid-range and low-end cards, the actual time saving will be immense.

Mileage will vary from machine to machine - as always, especially with varying detailed settings for varying machines - but I have found the improvement on my 3960/7970x2 machine is nothing short of gargantuan - far in excess of even my initial post above, now that I have got into it and tweaked it.

Run don't walk to implement it :)

EG
Joined: 9 Jun 13
Posts: 27
Credit: 12,004,881,759
RAC: 41,700,230
Message 17081 - Posted: 9 Jul 2013, 6:21:30 UTC
Last modified: 9 Jul 2013, 6:23:26 UTC

Hi,
I've been trying to make this work on my MSI 7970 OC BE for a week now. I can't seem to get it to do anything better than normal crunching, an average of 20 minutes per Solo_Collatz WU.

No matter what configuration files I write..... and forget two WU on the GPU....

How are you guys getting 500 sec on a WU? How are you getting two WU's on a GPU? (I read that in the other thread)

I need some educatin here...

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17082 - Posted: 9 Jul 2013, 7:01:02 UTC - in response to Message 17081.
Last modified: 9 Jul 2013, 7:45:56 UTC

I need some educatin here...


First off - standard disclaimer ..... this pushes the card beyond standard use, and into territory not supported by either NVidia or AMD - you do so at your own risk - don't yell at me if you burn out the card :)

(it's unlikely ..... but ..... it's non-standard use, so it's at your risk, not mine)

The key is optimising the output. To do that you must paste this ......


verbose=1
items_per_kernel=18
kernels_per_reduction=7
threads=8
sleep=1
build_options=-Werror


.... into the existing file "solo_collatz_4.07_windows_x86_64__opencl_ati_100" marked as a CONFIG file. Probably the directory on yours is

C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz

To do so, open that file with notepad, don't mess with the file extension, just paste the lines shown above into it (do not use the "<configuration>" elements in my post above, just paste the lines of text).

When pasted, save the file - don't mess with extensions etc - just save as is, "default". You will need to restart BOINC, and then the app for it to take effect. I have set cautious values of 18 & 7 in the lines above - in reality I am running them as 22 and 9 - however take it slow and steady at first, get it running, then increase each value separately in stages. If you reach the max of 22 & 9 - well, fine - but you may crash along the way; if you do, back off one step with those values.
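The "increase each value separately in stages" approach can be written out as a list of settings to try in order. This is only a sketch: raising items_per_kernel first and kernels_per_reduction second is an assumed ordering, and the function name is made up for the example.

```python
def tuning_steps(start=(18, 7), limit=(22, 9)):
    """List (items_per_kernel, kernels_per_reduction) pairs to try in order.

    Only one value changes between consecutive steps, so a crash can be
    pinned on the most recent change and undone by going back one step.
    """
    items, kernels = start
    steps = [(items, kernels)]
    while items < limit[0]:
        items += 1
        steps.append((items, kernels))
    while kernels < limit[1]:
        kernels += 1
        steps.append((items, kernels))
    return steps


for items, kernels in tuning_steps():
    print(f"items_per_kernel={items} kernels_per_reduction={kernels}")
```

Run a few workunits at each step before moving on, and keep the last step that ran cleanly.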

When that file is loaded with those values, set your other config files to suit your machine - the ones in the other post above are for my 3960x.

How are you guys getting 500 sec on a WU? How are you getting two WU's on a GPU? (I read that in the other thread)

Don't try to put two of these on the same GPU, the GPU will be heavily loaded as is, doesn't need any more.

What you will then need to do to get faster than around 670 secs or so, is optimise your GPU. You do that by a slow and steady tuning of the GPU via the Valley Benchmark Tool, you can download that from here:

http://unigine.com/products/valley/

A note of caution re the tool: it's pretty benign, but you are stretching the card, so take it slow and easy. If you get to this stage, I suggest you post in the standard Crunching forum for help on using the tool - not here - as that would help others and keep this thread to essential elements only.

Re the speed comment
I am running 7970s on a 3960x CPU, your GPUs are NVIDIA GeForce GTX 660 - you can expect substantially slower output than mine, and - probably - circa 20 mins is not far off what you can expect ..... but try it and see, just don't expect 530 secs with a NVIDIA GeForce GTX 660

However it looks like you have a 7970 on the machine as well - statement of the obvious .... but .... use that card for this, not the 660

EG
Joined: 9 Jun 13
Posts: 27
Credit: 12,004,881,759
RAC: 41,700,230
Message 17083 - Posted: 9 Jul 2013, 8:11:54 UTC - in response to Message 17082.

Thank you for the fast reply!

And the timely advice!

Rest assured, I don't intend to overclock this to astronomical levels. I can't afford any more of these cards right now.....

I just want to push it a little up around 90% or so.....

The GTX660 already does that by itself but the 7970 has been floating around 60-70%......

Not at all what I was told about these cards capabilities.......

I will take the advice you offer and go slow......

Thank you, when I do get to the overclocking of the card I will take it to the crunchin forum, all advice is welcome....

Again Thank YOU!

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17084 - Posted: 9 Jul 2013, 8:20:38 UTC - in response to Message 17083.
Last modified: 9 Jul 2013, 8:31:07 UTC

It's all benign stuff up to the point where you start to use the Valley tool - you'd have to go some to burn a card prior to that :)

Just BE CAREFUL using the Valley tool if you have not used a GPU tool before.

I'll keep an eye open on the other forums as well - yell if you need help, or even if it's just a double check before doing something.

Up to the point where you get the Valley tool out, hang loose :) BUT ..... once that tool gets wound up, focus ...... post if you are even in the slightest doubt on how to use the Valley Tool, don't mess, be 100% certain before you do things with the Valley tool.

Lots of video crashes are the norm when using a GPU tool - it's only when the card crashes that you know you have gone a step too far, just back off a step - job done. Post on the number crunching forum on how to use the Valley Tool if you have not used one before, it will save huge grief :)

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17085 - Posted: 9 Jul 2013, 9:41:27 UTC
Last modified: 9 Jul 2013, 10:39:20 UTC

Tools List for the last stage ....

Unigine Valley Benchmark Tool:
http://unigine.com/products/valley/

Clocking the GPU, use MSI Afterburner:
http://www.guru3d.com/news_story/download_msi_afterburner_3_beta_10.html
There are trips and traps to using Afterburner, not least the fact that the author is (rightly) paranoid about aspects of Afterburner that *can* be dangerous if abused or used lightly. Post if you don't know the items and where to unlock various bits.

Monitoring what's going on in the PC:
http://www.hwinfo.com/index.html
(32 and 64 bit tools, select the one for your machine)

Monitoring what's happening on a GPU:
http://www.techpowerup.com/gpuz/

Step by Step guide to Overclocking a GPU:
http://forums.overclockers.co.uk/showthread.php?s=e8a04b35efbc0eacee9768ccbdc07eab&t=18431335
LtMatt on that thread (the Thread Owner) is excellent, trust him totally, ask him anything however small. The first post is how to overclock a GPU; read through the following ones to get a feel for issues and solutions to overclocking a GPU. He keeps an eye on the thread - no matter if it has been a few days since the last post - post there, he'll reply quickly - he's good :)

Above all ....... treat all this with respect. Clocking a card is not hard, but it's not for the blasé, flippant mind-set; treat it seriously, and all is well.

Post detailed questions on these in the Number Crunching Forum

Claggy
Joined: 27 Sep 09
Posts: 288
Credit: 14,320,498
RAC: 0
Message 17088 - Posted: 9 Jul 2013, 12:05:34 UTC - in response to Message 17082.

You will need to restart BOINC, and then the app for it to take effect.

No, you don't need to restart Boinc. The whole point of having a config file is so Stock users can supply cmd parameters to the apps, and so that changes can be done on the fly:
either suspending GPU usage momentarily, or suspending and resuming the running GPU task, will cause the app to use the new parameters.
(If you were putting cmd parameters into an app_info, then yes, a Boinc restart is required, as the app_info is only read on startup.)

Claggy

Profile chip
Joined: 8 May 11
Posts: 30
Credit: 41,295,305
RAC: 0
Message 17090 - Posted: 9 Jul 2013, 13:06:54 UTC

Collatz Conjecture v4.07 x86_64 for CUDA 5.0
Based on the AMD Brook+ kernels by Gipsel

verbose=1
items_per_kernel=22
kernels_per_reduction=9
threads=7

sleep=1
solo=0

Name GeForce GTX 580 [810MHz core]
Checking 824633720832 numbers
Numbers/Kernel 4194304
Kernels/Reduction 512
Numbers/Reduction 2147483648
Reductions/WU 384
Threads 128
Reduction CPU

Highest Steps 1842
Total Steps 420409572575007 [Credit 8,267.14]
Avg Steps 509
CPU time 0.171601 seconds
Total time 1170.46 seconds
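The figures in this log follow directly from the config values, which makes it a handy check that the settings were actually picked up. The sketch below redoes the arithmetic using only numbers copied from the log above; the variable names are made up for the example.

```python
# Values copied from the config and log in this post.
items_per_kernel = 22
kernels_per_reduction = 9
threads = 7
reductions_per_wu = 384
total_steps = 420409572575007
total_seconds = 1170.46

numbers_per_kernel = 2 ** items_per_kernel        # log: Numbers/Kernel
kernels = 2 ** kernels_per_reduction              # log: Kernels/Reduction
numbers_per_reduction = numbers_per_kernel * kernels
numbers_checked = numbers_per_reduction * reductions_per_wu
thread_count = 2 ** threads                       # log: Threads

# Whole-number average, matching how the log appears to report it.
avg_steps = total_steps // numbers_checked
numbers_per_second = numbers_checked / total_seconds

print(numbers_per_kernel, kernels, numbers_per_reduction)
print(numbers_checked, thread_count, avg_steps)
print(f"{numbers_per_second:.3e} numbers/second")
```

If the Numbers/Kernel or Kernels/Reduction lines in your own log do not match 2^N for the values in your config file, the file is probably not being read.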

Profile chip
Joined: 8 May 11
Posts: 30
Credit: 41,295,305
RAC: 0
Message 17240 - Posted: 2 Aug 2013, 10:36:40 UTC

What does the parameter solo do in the collatz.config file?

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2451
Credit: 675,942,722
RAC: 4,809
Message 17242 - Posted: 2 Aug 2013, 17:57:45 UTC - in response to Message 17240.

What does the parameter solo do in the collatz.config file?


In the early 4.x releases, it caused the solo app to generate output in the same format as the collatz and mini_collatz format apps. That way, the output was compared to the other apps to make sure it was reporting the exact same results. The current version ignores the solo parameter.

schizo1988
Joined: 23 Aug 09
Posts: 13
Credit: 255,721,631
RAC: 2
Message 17315 - Posted: 11 Aug 2013, 5:30:16 UTC

http://www.guru3d.com/files_tags/download_afterburner_beta.html

Beta 12 is out
____________

Profile Mad Matt
Joined: 25 Sep 09
Posts: 19
Credit: 160,395,508
RAC: 191
Message 17511 - Posted: 19 Sep 2013, 21:22:59 UTC - in response to Message 17007.

Solo_Collatz Useage

Afterburner Setting:
Core Voltage: 1213
Power Limit: +0
Core Clock: 1225
Memory Clock: 1675
Fan Speed: 55% (running temperature of the GPU is 74 degrees, utilisation is 98%)

The XML Configuration File I used is the same as the one Slicker posted ie:
<configuration>
verbose=1
items_per_kernel=20
kernels_per_reduction=9
threads=8
sleep=1
build_options=-Werror
</configuration>

My app_config is:
<app_config>
<app>
<name>solo_collatz</name>
<max_concurrent>1</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
</app_config>




First off - standard disclaimer ..... this pushes the card beyond standard use, and into territory not supported by either NVidia or AMD - you do so at your own risk - don't yell at me if you burn out the card :)

(its unlikely ..... but ..... its non standard use, so its at your risk, not mine)

The key is optimising the output. To do that you must paste this ......


verbose=1
items_per_kernel=18
kernels_per_reduction=7
threads=8
sleep=1
build_options=-Werror


.... into the existing file "solo_collatz_4.07_windows_x86_64__opencl_ati_100" marked as a CONFIG file. Probably the directory on yours is

C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz

To do so, open that file with notepad, don't mess with the file extension, just paste the lines shown above into it (do not use the "<configuration>" elements in my post above, just paste the lines of text.

When pasted, save the file - don't mess with extensions etc - just save as is, "default". You will need to restart BOINC, and then the app for it to take effect. I have set cautious values of 18 & 7 in the lines above - in reality I am running them as 22 and 9 - however take it slow and steady at first, get it running, then increase each value separately in stages. If you reach the max of 22 & 9 - well, fine - but you may crash, if the latter back off one step with those values.



Hammer, nail. Thank you Zydor. Went to your values 22 and 9 directly as it seems I am using the same GPU. Right away down from >1500 to <600 seconds. Now curious to see how it unfolds. Still at 1150/1500 clocks so there seems room for improvement.

Cheers,
Matt
____________

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17519 - Posted: 20 Sep 2013, 23:59:41 UTC - in response to Message 17511.
Last modified: 21 Sep 2013, 0:20:05 UTC

You're welcome ....

Just had a peek at your results so far. You are probably getting close to max re GPU speed looking at the timings which are pretty well the same as mine as such. Mine is with a AMD Phenom(tm) II X6 1090T (my main machine is still down, should be back on next week hopefully), so you should be able to go a little further with your better CPU, probably down to an average of around 525-540 secs.

Don't push to these last steps overnight, do it during the day when you can spot the "one step too far" crash - as this would be the last "downward" step its likely you'll hit the Wall at some point, so best done during the day when you can quickly spot an issue.

I've backed off a little from my original values, and settled at 1180/1625, and +5 on the power slider - its rock solid at those values. I would have thought something like 1195/1650 would be in reach for you with your better CPU. If you go over 1200, watch it like a hawk for a day or so and expect a crash.

Another thing extremely worthwhile is to use "Diskeeper" made by Condusiv - it's one of those tools that really is worth shelling out the dosh for a full licence. Especially if you use SSDs - the tool will cut disc access to virtually zero. If you do use it - and I highly recommend it - just let it do its own thing, it will do it far more efficiently than you intervening with a few "manual" runs. Annual maintenance is £48 after initial buy, and I consider it the most valuable tool I have. If running SSDs (as I am on my main machine) it rates as essential for speed and minimising SSD accesses - pretty well zero disc access on the SSDs. There is a 30 day trial available, so it's a try before buy job - I have not known anyone to not buy after the trial - it's excellent.

http://www.condusiv.com/evaluation-software/default.aspx?p=home

(Home version is fine - you don't need the Professional version - dont need the Professional Version price either rofl )

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2451
Credit: 675,942,722
RAC: 4,809
Message 17520 - Posted: 21 Sep 2013, 4:56:27 UTC - in response to Message 17519.

Another thing extremely worth while is use "Diskeeper" made by Condusiv - its one of those tools that really is worth shelling out the dosh for full licence. Especially if you use SSDs - the tool will cut disc access to virtually zero. If you do use it - and I highly recommend it - just let it do its own thing, it will do it far more efficiently that you intervening with a few "manual" runs. Annual maintenance is £48 after initial buy, and I consider it the most valuable tool I have. If running SSDs (as I am on my main machine) it rates as essential for speed and minimising SSD accesses - pretty well zero disc access on the SSDs. There is a 30 day trial available, so its a try before buy job - I have not known anyone to not buy after the trial - its excellent.

http://www.condusiv.com/evaluation-software/default.aspx?p=home

(Home version is fine - you don't need the Professional version - dont need the Professional Version price either rofl )



What many people do not know is that the defrag tool included with Windows XP was actually the super-lite version of Diskeeper Lite which they used to offer for free. That was part of the deal so that Microsoft would put certain "hooks" into the operating system so it could defrag a volume without unmounting it. The free version had to be run manually and didn't defrag the MFT or system files which made the paid home version worth paying for. That opinion hasn't changed since Vista or Win7/8/8.1 even though the defragger in them has been improved. It still doesn't compare to Diskeeper.

John Clark
Joined: 21 Sep 09
Posts: 548
Credit: 56,516,565
RAC: 0
Message 17525 - Posted: 21 Sep 2013, 23:06:00 UTC

Thanks for the guidance Zydor

I completed pasting the verbose=1, etc., text into the solo_collatz_4.07_ etc. file in the project folder on 2 HD7970 Win7 x64 rigs, but added no other cc_config or app_config files. The speed up, without hardware change, is fantastic.

The older quad (I support the GPUs with 2 cores) was improved from 1540 seconds to 650 seconds. My slightly newer PC dropped from 1050 seconds to 585 seconds.

GREAT.

GPU-Z showed the load increased from about 65% to 97%. The downside is the graphics are a bit sluggish.

I have not significantly overclocked the GPUs above their factory settings.

I will now let the rigs crunch quietly for a few days as it clears the Collatz WUs away.

QUESTION: Is there a similar speed up for the cc_config file for Collatz WUs. Or should they be restricted using an app_config file or only allowing the solo_collatz WUs in the preferences file?
____________
Go away, I was asleep

Said a Russell, 3 Shih-Tzus & a Bischeon Frize

Profile Mad Matt
Joined: 25 Sep 09
Posts: 19
Credit: 160,395,508
RAC: 191
Message 17526 - Posted: 22 Sep 2013, 1:15:05 UTC - in response to Message 17519.
Last modified: 22 Sep 2013, 1:16:37 UTC

Your Welcome ....

Just had a peek at your results so far. You are probably getting close to max re GPU speed looking at the timings which are pretty well the same as mine as such. Mine is with a AMD Phenom(tm) II X6 1090T (my main machine is still down, should be back on next week hopefully), so you should be able to go a little further with your better CPU, probably down to an average of around 525-540 secs.

Don't push to these last steps overnight, do it during the day when you can spot the "one step too far" crash - as this would be the last "downward" step its likely you'll hit the Wall at some point, so best done during the day when you can quickly spot an issue.


Adding to that:

verbose=1
items_per_kernel=18 start -->22 best possible
kernels_per_reduction=7 -->9 best possible
threads=8
sleep=1
build_options=-Werror

These best settings do not seem to pose problems to any 7970 or an MSI GTX 680 Lightning. Unless of course lag is a problem. ;) Regarding clocks, at least without adapting power many 7970 GPUs will not go beyond 1125 without failing, e.g. my XFX 7970.

I kept RAM to 1500 throughout all 7970s as I found it more rewarding to pay attention to the CPU clocks which seemed to cause more of a difference. So if you don't toast the CPU on its CPU-only tasks, investing some of the thermal headroom may be well spent on clocking CPU and CPU-RAM first.
____________

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 17527 - Posted: 22 Sep 2013, 2:55:00 UTC
Last modified: 22 Sep 2013, 3:39:48 UTC

QUESTION: Is there a similar speed up for the cc_config file for Collatz WUs. Or should they be restricted using an app_config file or only allowing the solo_collatz WUs in the preferences file?


If you enable all classes of WU in your preferences all at once you will find three lots of "CONFIG, Application and PDB" files added to the Collatz directory. They work in the same way: benign until you add values to the CONFIG file in each group of three. The maximum values for any WU of any type (pointless putting larger values there - they will be ignored) in each of the COLLATZ CONFIG files are:

verbose=1
items_per_kernel=22
kernels_per_reduction=9
threads=8
sleep=1
build_options=-Werror

Name groupings are "collatz", "mini-collatz" and "solo-collatz". That will make sense once you enable all three in your account preferences (you'll see the groups of files added once you enable all three; have a look at the file list of the Collatz directory, you'll see them there).

I have not played around with the CONFIG settings of the three files for collatz and mini-collatz (only solo-collatz - but I wouldn't imagine they would be much different overall). Have a play with a low setting at first in the other two groups, and then crank it up from there. Could try 14 and 6 see what happens, then step it up .... see what happens.

Could always be adventurous and whack in 22 & 9 as in the solo-collatz group, close your eyes and "Press button A" ....... :)

Once you have sorted out the values for each of the three CONFIG groups, go back to "Preferences" in your account and just make sure only those you wish are ticked (don't have to have all three ticked)

As always ...... watch temperature of CPU/GPU like a Hawk adjusting Fan speed as needed until you have settled on the values you want - the temperatures WILL rise markedly. Don't roast your CPU & cards :)


Copyright © 2016 Jon Sonntag; All rights reserved.