Long Run-Times? Optimize.

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 18463 - Posted: 11 Feb 2014, 16:14:35 UTC
Last modified: 11 Feb 2014, 18:25:04 UTC

A number of people have been complaining about the increased run time of the recent Collatz applications. For each platform, there is only one OpenCL application and that one application has to work with really slow hardware and also work with little or no video lag. Unfortunately, faster hardware then runs at sub-optimal speeds. It isn't that the work units have increased in size, but rather that the GPU is only working half as hard as it used to.

The key to getting good performance on high end GPUs is optimization. But, it has been brought to my attention that the QWERTY challenged that live among us are incapable of typing in the five lines as explained in the Optimizing Collatz 4.07 Applications message board thread. I try not to discriminate against QWERTY challenged volunteers any more than I discriminate against those who cannot master using a mouse that has more than one button. So, I have attempted to make optimization easier. If you have a high end GPU simply replace the .config file(s) in the Collatz project folder with one from the opt_config folder. You will have to use your one-to-one correspondence skills to match exactly the config file name on your computer with the one from the opt_config folder, but I know you can do it!

For Windows Vista/7/8/8.1, the Collatz config files are located at: C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz

For OS X users, the Collatz config files are located at: /Library/Application Support/BOINC Data/projects/boinc.thesonntags.com_collatz

For Linux users, well... you are Linux users! You don't need me to tell you where you installed BOINC! ;-)

The only caveat is that when a new Collatz version is released, you will need to repeat this process.
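
The file swap described above can be scripted. What follows is a minimal sketch, not part of the project: the function name is hypothetical, and you pass in wherever your Collatz project folder and your unpacked opt_config folder actually live. It replaces only files whose names match exactly, and it keeps a .bak copy of each original.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def install_optimized_configs(project_dir: Path, opt_dir: Path) -> list[str]:
    """Overwrite each project .config file with the same-named file from opt_dir.

    Only names that exist in BOTH folders are touched. The first time a file is
    replaced, the original is saved next to it with a .bak suffix.
    Returns the names of the files that were replaced.
    """
    replaced = []
    # sorted() materializes the listing before any .bak files are created
    for current in sorted(project_dir.glob("*.config")):
        optimized = opt_dir / current.name  # names must match exactly
        if not optimized.is_file():
            continue
        backup = current.with_name(current.name + ".bak")
        if not backup.exists():
            shutil.copy2(current, backup)
        shutil.copy2(optimized, current)
        replaced.append(current.name)
    return replaced
```

On Windows, for example, project_dir would be Path(r"C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz"). Since a new Collatz version means repeating the process, a script like this would need to be rerun after each release.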

For those with really fast GPUs where the above still doesn't increase the load to 99%, add the app_config.xml file also found in the opt_config folder to the Collatz project folder. It is configured to run two Collatz GPU applications at once.
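
For anyone curious what such an app_config.xml looks like, here is a hedged sketch that builds one using BOINC's standard app_config.xml elements. The app name "solo_collatz" is an assumption inferred from the config file names, the cpu_usage value of 1.0 is also an assumption, and the file shipped in opt_config may differ.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int = 2) -> str:
    """Return app_config.xml text that lets tasks_per_gpu tasks share one GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # 0.5 of a GPU per task means two tasks run on each GPU
    ET.SubElement(gpu, "gpu_usage").text = str(1 / tasks_per_gpu)
    # Assumption: budget a full CPU core for each GPU task
    ET.SubElement(gpu, "cpu_usage").text = "1.0"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "solo_collatz" is a hypothetical app name used for illustration
print(build_app_config("solo_collatz"))
```

BOINC reads app_config.xml from the project folder; the client has to re-read its config files (or be restarted) before the change takes effect.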

The opt_config folder also has a basic cc_config.xml with the sched_op_debug and use_all_gpus flags set. If you do not already have one, place it in the BOINC folder, located at C:\ProgramData\BOINC on Windows or /Library/Application Support/BOINC Data on OS X.
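
As a rough illustration of those two flags, the sketch below generates a minimal cc_config.xml using BOINC's standard layout, where sched_op_debug sits under log_flags and use_all_gpus under options. The exact contents of the file in opt_config are not shown in this thread, so treat this as an assumption about its shape.

```python
import xml.etree.ElementTree as ET


def build_cc_config() -> str:
    """Return a minimal cc_config.xml enabling sched_op_debug and use_all_gpus."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    # Log scheduler requests and replies in the BOINC event log
    ET.SubElement(log_flags, "sched_op_debug").text = "1"
    options = ET.SubElement(root, "options")
    # Use every GPU in the machine, not only the most capable one
    ET.SubElement(options, "use_all_gpus").text = "1"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_cc_config())
```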

Profile Peciak
Joined: 4 Aug 09
Posts: 13
Credit: 669,955,177
RAC: 931
Message 18466 - Posted: 11 Feb 2014, 16:50:24 UTC

Optimization is a great idea and there is no need to cry.

Profile [AF>Amis des Lapins] Phil1966
Joined: 25 Dec 11
Posts: 17
Credit: 262,005,816
RAC: 2,315,456
Message 18467 - Posted: 11 Feb 2014, 19:15:01 UTC

Thank You Slicker.

Everyone will appreciate your message in its own way.

GoodBye collatz.

Profile Overtonesinger
Joined: 16 Jul 10
Posts: 21
Credit: 142,646,892
RAC: 2,366
Message 18470 - Posted: 11 Feb 2014, 23:38:19 UTC - in response to Message 18463.

Thanx, Slicker.

I have put in the config for powerful GPUs. Still, only around 56 percent load ... even with idle CPUs (I tried stopping all other projects). :)

OK, I suppose that I shall try the extreme version with 2 Collatz tasks on one GPU. :) ... But this is only a notebook class 1 card (from 2011!) ... it has *only* 800 stream processors :O at 700 MHz, with GDDR5 at 1000 MHz physically (it's probably 4000 effectively)...

OK, I will.

Filip
____________
Melwen - child of the Fangorn Forest

Alez
Joined: 28 Nov 12
Posts: 29
Credit: 1,128,253,902
RAC: 808,476
Message 18473 - Posted: 12 Feb 2014, 0:51:58 UTC

Just got back from the latest jaunt abroad and pleased to report that all my nVidia cards running under Ubuntu are now quite happily munching away at Collatz. Thanks, Slicker, for the work. Only two more boxes to optimize and it's happy days.
Cheers :)

Dirk Broer
Joined: 20 Aug 10
Posts: 33
Credit: 176,258,227
RAC: 561,228
Message 18476 - Posted: 12 Feb 2014, 8:42:55 UTC - in response to Message 18463.
Last modified: 12 Feb 2014, 8:51:40 UTC

I don't have a problem with the running time; I have a problem with the fact that all other BOINC work seems to freeze while running Collatz... And I use APUs and HD 6670s, nothing high-end. But running three Collatz WUs will easily double or triple the run times of other projects.

Profile sosiris
Joined: 11 Dec 13
Posts: 123
Credit: 55,800,869
RAC: 0
Message 18479 - Posted: 12 Feb 2014, 14:08:00 UTC - in response to Message 18476.

I don't have a problem with the running time; I have a problem with the fact that all other BOINC work seems to freeze while running Collatz... And I use APUs and HD 6670s, nothing high-end. But running three Collatz WUs will easily double or triple the run times of other projects.


One Collatz GPU WU takes about 0.9 of a CPU core to do validation (so a wingman is not needed and we get the credit instantly). Perhaps that's the reason.

bolu$trolu$
Joined: 10 Dec 13
Posts: 1
Credit: 3,483,011
RAC: 0
Message 18480 - Posted: 12 Feb 2014, 14:41:16 UTC

Luckily, I have no problems with Collatz. Optimizing is described very well and is easy to understand.

I write this only because I feel obliged to show my support for Slicker. All these people are complaining and threatening to quit this project for no real reason, just because they now have to spend 10 or 15 minutes on optimizing. It's strange, and it's not cool at all.

We should all be grateful to the people who run this project for the opportunity to participate in Collatz. Not the opposite!

I'm not a computer geek and I managed to optimize my Collatz like the instructions said. It is not hard, really! 5 lines of text - it can't be too much for you. (It's even easier now. Thanks to Slicker, we have ready-made config files to download.)

What is more important, these 5 lines of text and 10 minutes spent allow other people with worse hardware to participate. You, with good, new GPUs, at least know that something is wrong: WUs take too long to calculate. So you can look for the solution and, after reading the proper instructions, solve it by optimizing. If the default config of the project were like it used to be (fast on new GPUs), new participants with old GPUs would not know that they can optimize; they would only see that the application is not working and quit before even starting.

Maybe someday there will be an automated optimizer, but for now I think this little optimization is not too much to do. In the name of science :)

Thanks to Slicker and all others responsible (if any) for this project!
Sorry for my poor English, and sorry if I am overreacting ;)

PS. Just out of curiosity, what was the previous default config?

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 18483 - Posted: 12 Feb 2014, 16:13:04 UTC - in response to Message 18479.

I don't have a problem with the running time; I have a problem with the fact that all other BOINC work seems to freeze while running Collatz... And I use APUs and HD 6670s, nothing high-end. But running three Collatz WUs will easily double or triple the run times of other projects.


One Collatz GPU WU takes about 0.9 of a CPU core to do validation (so a wingman is not needed and we get the credit instantly). Perhaps that's the reason.


I'm working on a CUDA app with the exact same additional validation and it uses 0.1% CPUs, so the issue is definitely with the OpenCL drivers. It has been an issue since OpenCL 1.0. Unfortunately, all the vendors have decided that since the OpenCL spec doesn't require kernels to run asynchronously, they don't have to. Quite frankly, nVidia doesn't really care since they "own" the GPU crunching market with CUDA, and AMD seems more interested in gaming support than crunching support. Intel is a new player, so maybe they will take the lead and produce decent drivers.

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 18485 - Posted: 12 Feb 2014, 16:33:44 UTC - in response to Message 18480.

PS. Just out of curiosity, what was the previous default config?


threads=6
items_per_kernel=20
kernels_per_reduction=6

2^20 (about one million items) was too high for low end GPUs and caused the app to crash, whereas 2^16 (65,536 items) works OK. 2^6 (64) is a little high for kernels_per_reduction, as the video gets really sluggish; 2^5 (32) seems to work better. It is still a little sluggish, but going any lower really slows down high end GPUs.
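
Both settings are base-2 exponents, so small changes in the config are large changes in workload. A quick sketch of the arithmetic follows; the helper name is made up for illustration, and the second row uses the values this post says work better, not necessarily the exact shipped defaults.

```python
from __future__ import annotations


def work_sizes(items_per_kernel: int, kernels_per_reduction: int) -> tuple[int, int]:
    """Expand both config values from base-2 exponents to actual counts."""
    return 2 ** items_per_kernel, 2 ** kernels_per_reduction


for label, ipk, kpr in [
    ("previous default", 20, 6),
    ("gentler values from this post", 16, 5),
]:
    items, kernels = work_sizes(ipk, kpr)
    print(f"{label}: {items:,} items per kernel, {kernels} kernels per reduction")
```

Dropping items_per_kernel from 20 to 16 cuts the items per kernel by a factor of 16, which is why the config matters so much on fast cards.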

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 18486 - Posted: 12 Feb 2014, 16:48:01 UTC - in response to Message 18470.

Thanx, Slicker.

I have put in the config for powerful GPUs. Still, only around 56 percent load ... even with idle CPUs (I tried stopping all other projects). :)

OK, I suppose that I shall try the extreme version with 2 Collatz tasks on one GPU. :) ... But this is only a notebook class 1 card (from 2011!) ... it has *only* 800 stream processors :O at 700 MHz, with GDDR5 at 1000 MHz physically (it's probably 4000 effectively)...

OK, I will.

Filip


There's a bug in the config files. In the config file, change "items_per_iteration" to "items_per_reduction" or download the new fixed version.
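
If you would rather patch a file you already copied than download it again, the rename can be done with a few lines of script. This is only a sketch: the function is hypothetical, it assumes the config is plain key=value lines as shown elsewhere in this thread, and the sample text is made up.

```python
def rename_setting(config_text: str, old: str, new: str) -> str:
    """Rename one key in key=value config text, leaving every value untouched."""
    fixed = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == old:
            line = f"{new}={value}"
        fixed.append(line)
    # Keep the trailing newline if the original text had one
    return "\n".join(fixed) + ("\n" if config_text.endswith("\n") else "")


# Made-up sample content, for illustration only
sample = "verbose=1\nitems_per_iteration=20\nthreads=8\n"
print(rename_setting(sample, "items_per_iteration", "items_per_reduction"))
```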

[AF>Quebec]ut1
Joined: 16 Jul 09
Posts: 5
Credit: 565,688,843
RAC: 0
Message 18490 - Posted: 12 Feb 2014, 18:16:56 UTC
Last modified: 12 Feb 2014, 18:19:33 UTC

It is because we grumbled that the files have been made available.

For my part, I was wrong to say that it was not our job to make the app config changes!

And if Slicker now thinks we're rednecks, well, he has said so!

@+

chf1949
Joined: 18 Mar 11
Posts: 16
Credit: 1,363,723,776
RAC: 0
Message 18493 - Posted: 13 Feb 2014, 2:14:07 UTC

Any suggestions for running three 7990s? Right now 3 of the six available GPUs run at 99%, 3 run at 15-20%. I'd like to run them all at full capability.

Thanks for the great project!

chf1949
Joined: 18 Mar 11
Posts: 16
Credit: 1,363,723,776
RAC: 0
Message 18494 - Posted: 13 Feb 2014, 2:19:04 UTC - in response to Message 18493.

I forgot to mention I'm using the latest AMD config file. Thanks!

EG
Joined: 9 Jun 13
Posts: 74
Credit: 28,770,512,393
RAC: 27,545,632
Message 18498 - Posted: 13 Feb 2014, 8:19:43 UTC - in response to Message 18493.
Last modified: 13 Feb 2014, 8:58:37 UTC

Any suggestions for running three 7990s? Right now 3 of the six available GPUs run at 99%, 3 run at 15-20%. I'd like to run them all at full capability.

Thanks for the great project!


What motherboard/power supply are you running?

A single processor MB doesn't have the PCIE lanes to run three 7990's.

Only certain MB's can even run two....

Most can only run one...

A properly tuned 7990 WILL draw 500 watts of power; running two, you need at least a 1250 watt power supply so the cards are not starved for power.

My suggestion? Pull the middle card, and increase the PS, and load the case with as many fans as you can fit. Two 7990's running full bore WILL heat your room for you...

Then build yourself a second box with the extra card.

Profile [AF>Amis des Lapins] Phil1966
Joined: 25 Dec 11
Posts: 17
Credit: 262,005,816
RAC: 2,315,456
Message 18500 - Posted: 13 Feb 2014, 10:09:14 UTC - in response to Message 18483.


I'm working on a CUDA app with the exact same additional validation and it uses 0.1% CPUs, so the issue is definitely with the OpenCL drivers. It has been an issue since OpenCL 1.0. Unfortunately, all the vendors have decided that since the OpenCL spec doesn't require kernels to run asynchronously, they don't have to. Quite frankly, nVidia doesn't really care since they "own" the GPU crunching market with CUDA, and AMD seems more interested in gaming support than crunching support. Intel is a new player, so maybe they will take the lead and produce decent drivers.


Dear Slicker,

I would like to thank you for this good news and for all the improvements you have developed and implemented.

Kind Regards,

Philippe

EG
Joined: 9 Jun 13
Posts: 74
Credit: 28,770,512,393
RAC: 27,545,632
Message 18501 - Posted: 13 Feb 2014, 11:19:49 UTC - in response to Message 18480.
Last modified: 13 Feb 2014, 11:20:37 UTC

Luckily, I have no problems with Collatz. Optimizing is described very well and is easy to understand.

I write this only because I feel obliged to show my support for Slicker. All these people are complaining and threatening to quit this project for no real reason, just because they now have to spend 10 or 15 minutes on optimizing. It's strange, and it's not cool at all.
.......

Thanks to Slicker and all others responsible (if any) for this project!
Sorry for my poor English, and sorry if I am overreacting ;)


Let me explain something. The project is great and Slicker (Jon) is the best and is obviously doing something that is worthwhile to him. It is appreciated by all the crunchers out there who run this in the hope of being the one person who disproves the conjecture.

Yes, we are searching for a negative. And there is only one negative. So far it hasn't been found, and personally I doubt it ever will be. What we are searching for is the one number that will not reduce to one under the 3x+1 rule.
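
For readers who have not seen the rule written out: halve the number if it is even, otherwise multiply by three and add one, and repeat. Below is a small CPU-side sketch of that rule. It is an illustration only, not the project's GPU code, and the step cap is an arbitrary choice so the loop always ends.

```python
from typing import Optional


def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Count the steps n takes to reach 1, or return None if max_steps is hit.

    A None result only means this cap was too small to decide; it is not
    proof of a counterexample.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1, so 8 steps
print(collatz_steps(27))  # 111 steps
```

A counterexample would be a starting number that never reaches 1, either by growing without bound or by falling into a cycle that does not contain 1.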

Basically this is a stats driven project. What attracts people is the ability to watch their stats grow and their hardware work at its most efficient, similar to the overclocker's drive to build the fastest, most efficient computer that remains stable.

The issue people are expressing is that they want to crunch, not spend hours on configuration, or they want to do no configuring at all. But most of us want the speed and to pile up numbers.

That being said, the current stats are built upon a set amount of credits per WU that are produced at a fairly set rate. Better hardware works faster and the leaders are those with the largest quantity of the fastest crunchers. Everyone has the same ability to get such hardware and crunch faster than anyone else. I'm a good example of that, only 8 months on the project and I'm the #2 producer. I built my hardware to accomplish exactly this....

But at this point, if you reconfigure the software so it does more work for less credit, your stats leaderboard becomes static; no one is ever going to catch the leaders, because the credits they produced to get to the top were cheaper and took less work to produce.

On a project that is stats driven, slower WUs for less credit will cost you crunchers who no longer see their efforts reflected in the leaderboard....

What you see as a lack of appreciation for both the project creator and the "Science" is actually the view of the people making such statements that the efforts they are making are being wasted. If they have no chance to climb the leaderboard and the results of crunching the new configuration actually make that harder, they will get frustrated and leave.

The fact that they are willing to say something to the creator, is representative of the work and effort of the people who have dedicated themselves over, in some cases, many years to advance the project.

It should not be ignored......

And it is an issue that needs to be addressed as soon as the new configuration is stable on the platforms it needs to run on.....

I do believe that the creator has expressed his intention to address the credit issue once he gets the configurations figured out...

At least I think he has...

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 18502 - Posted: 13 Feb 2014, 11:27:41 UTC - in response to Message 18493.
Last modified: 13 Feb 2014, 12:15:44 UTC

Any suggestions for running three 7990s? Right now 3 of the six available GPUs run at 99%, 3 run at 15-20%. I'd like to run them all at full capability.


Please post the motherboard type, from that a few things flow - you have some chunky cards, and the motherboard type has a big effect.

It'll depend on the motherboard you have as such. It is likely to be 2x16 and 2x8 PCI-E, unless you have a high end motherboard. 8 lanes will be enough to run a 7990 without too many issues, especially as you have an i7-3930K, and any wait state will likely be CPU imposed rather than caused by the PCI-E lane count.

You will be running some cards on 8 PCI Lanes, and some on 4 PCI lanes, the mix will depend on the motherboard type - hence my question. From the answer, forward expectations can be set.

It's highly likely that the cards will go into a wait state because of the processor more than because of GPU/PCI-E lane availability. So whilst busy, the PC should cope fine; how fine will depend on the motherboard type, which is important to know in order to set expectations correctly.

Power may be an issue. With three of those beasts, you will need a 1000 W PSU minimum; I would be more comfortable with a 1200 W PSU. Chances are you have a chunky one, else it would not even power up, but it would be good to know the type of PSU, as it could be an issue. Until we have verified power, stay at defaults on the cards; don't go above that, in order to keep power usage down.

Meanwhile .... I am virtually certain from the timings you gave that you have not got all three configuration files in use, and 99% certain that the file called:

solo_collatz_6.04_windows_x86_64__opencl_amd_gpu (an XML Config file)

is likely a blank file. It's very important that all three configuration files are put into use:

in the BOINC Directory: cc_config.xml
in the Collatz Project Directory: app_config.xml
in the Collatz Project Directory: the XML Config file shown a few lines up in this post

Step 1
Put all three config files into use and closely check the contents. Use the post at:
http://boinc.thesonntags.com/collatz/forum_thread.php?id=1009&postid=17007#17007
to get the contents correct. In particular, read the first post from Slicker; it's very important to get that aspect correct (it's why your timings are running into thousands of secs per WU). Bear in mind the machine used in my post only had one GPU from two 7970 cards made live to Collatz; adjust your app_config.xml accordingly.

Step 2
Run the machine and cards at "defaults"; don't mess around, set the baseline. It's likely to be around +/- 750 secs, I would have thought, but wait and see. Your CPU time shown per WU should go to around +/- 250 secs. The latter is fine; don't get twisted up over getting a mega low secs value there.

Step 3
Let the machine settle down at those default values for a day - patience is a virtue :). Once you get consistent results from the cards, start the process of going above default speeds. Do NOT rush this, take your time; the aim is to get a proper baseline set over a few days, and once you are stable and running, then you can go back to "normal", but at present it's important to get the core settings correct.

Post the toolsets you use; you need good ones to manage the box properly above default CPU/GPU speeds. If you don't have one, I'll take you through setting one up.

Once we have the right toolset (MSI Afterburner) I'll take you through a step by step process to setup the cards without turning them into molten metal :)

There will be detailed questions no doubt; feel free to PM me if you'd rather do that.

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 18505 - Posted: 13 Feb 2014, 15:13:39 UTC - in response to Message 18502.
Last modified: 13 Feb 2014, 15:15:14 UTC

Additional note re the power for the 3x7990s....

If it's falling over and not starting properly (after you have completed the config files as shown, and turned down power all round to defaults), it's likely in your case that the PSU is too small.

If that becomes a possibility, just unplug the power lead from one card (the one furthest away from the CPU) and leave the rest attached; the PC will ignore any powered down card. That way you can test with two cards re power without dismantling everything.

You might - just - get away with 1000 W for 3x7990s. It's tight, but it might happen if you don't (for now) overclock et al.

Give it a go as shown in my previous post (all the way); if it crashes totally in the end, it's power, so do as shown above.

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 18529 - Posted: 14 Feb 2014, 14:34:13 UTC
Last modified: 14 Feb 2014, 15:18:57 UTC

To ....... anyone :)

I am going to stick my head above the parapet to get shot at ....... :)

I am working with a few people to get over this resistance to a third configuration file. The only difference (as such) between configuring Collatz and configuring other BOINC projects is the use of this third file.

It's evolved this way for good reason, namely the attempt to get as many people with lower end cards participating as possible, but in BOINC terms it is unusual and viewed by some as "another file I have to mess with". Yes you do, and for good reason, and it is only a few lines of text that, once done, can be left alone for (almost) the whole time you crunch at Collatz.

The dreaded lines of text in the file that I use are:

verbose=1
items_per_kernel=21
kernels_per_reduction=9
threads=8
sleep=1

..... placed inside the third configuration file which is:

solo_collatz_6.04_windows_x86_64__opencl_amd_gpu.xml config

The value of 21 is for my 7970x; the value will range from 16 to 22 depending on the cards you use. Don't blindly use the 21 I show above; scale it down for your card to somewhere between 16 and 22.

Note the unusual extension ".xml config". The full file name will change slightly to suit the card you use, but the file is created automatically for you and sits in the main Collatz directory. All that has to be done is paste those 5 lines of text inside it via Notepad (leave the defaults as opened; don't mess with the name or extension), and you're done, assuming you have created the standard cc_config and app_config files as well. Just make sure it's not been saved as a .txt file.
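
Before restarting BOINC, it can help to sanity-check what you pasted. The sketch below parses the five key=value lines and checks that items_per_kernel falls in the 16 to 22 range mentioned above. Both helper names are hypothetical, and the script never touches your real config file; you would feed it the text yourself.

```python
from __future__ import annotations


def parse_config(text: str) -> dict[str, int]:
    """Parse key=value lines into a dict, skipping lines without '='."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = int(value.strip())
    return settings


def items_per_kernel_in_range(settings: dict[str, int], low: int = 16, high: int = 22) -> bool:
    """True if items_per_kernel is present and within the suggested range."""
    value = settings.get("items_per_kernel")
    return value is not None and low <= value <= high


# The five lines quoted in this post
example = """verbose=1
items_per_kernel=21
kernels_per_reduction=9
threads=8
sleep=1
"""

settings = parse_config(example)
print(settings)
print("items_per_kernel in range:", items_per_kernel_in_range(settings))
```

A missing or empty file parses to an empty dict, which fails the range check, so this also catches the "blank file" case.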

Those with mid range to high end cards who have a crunch time of (circa) 1200 to 2000+ seconds have not set this file correctly. Pays your money, takes your choice as they say :)

See and read the whole thread (its short) from Slicker's Post 1 in the thread below.......

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1009#18522



Copyright © 2018 Jon Sonntag; All rights reserved.