AMD GPU reduce clock under load

Message boards : Number crunching : AMD GPU reduce clock under load

Hans-Ulrich Hugi
Joined: 14 Sep 09
Posts: 6
Credit: 1,433,843,661
RAC: 76,409
Message 19279 - Posted: 29 Apr 2014, 16:17:37 UTC

I've noticed a strange behaviour with the new apps (Large Collatz Conjecture v6.04 (opencl_amd_gpu)):

When the app starts, the GPU runs at the correct (high) clock speed. During the calculation the clock speed drops and stays at the lower level, which results in a massively longer execution time.
It happens with a 7950, a 280X and a 290 GPU, with both the 14.3 Beta driver and the 14.4 final driver.

This is not a thermal problem under load: the GPU doesn't overheat. On the 290 the GPU clock speed drops from 1 GHz to 570 MHz, and the memory clock drops as well. The GPU load stays somewhere between 40 and 100% (i.e. it doesn't change). I have to disable Collatz and start another project (e.g. Milkyway) to get the standard clock back. Sometimes it helps to stop and restart all running projects.
Does anybody else have that problem?
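
For a rough sense of scale, the reported drop from 1 GHz to 570 MHz can be turned into an estimated slowdown. This is only a sketch: it assumes run time scales inversely with the core clock, which holds for purely compute-bound work and ignores the memory clock that also dropped. The function name is made up for illustration.

```python
# Rough estimate only. Assumption: run time is inversely proportional to the
# GPU core clock (compute-bound work); the memory clock drop is not modelled.
def slowdown_factor(rated_mhz: float, observed_mhz: float) -> float:
    """Return how many times longer a compute-bound task takes at the lower clock."""
    if rated_mhz <= 0 or observed_mhz <= 0:
        raise ValueError("clock speeds must be positive")
    return rated_mhz / observed_mhz


# Clock figures reported above for the 290: 1 GHz rated, 570 MHz observed.
factor = slowdown_factor(1000, 570)
print(f"Estimated slowdown: {factor:.2f}x")
```

If the app is limited by memory speed rather than core clock, the real slowdown may differ from this estimate.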

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 19286 - Posted: 29 Apr 2014, 21:14:02 UTC
Last modified: 29 Apr 2014, 21:14:23 UTC

If you are using Afterburner, make sure "Force Constant Voltage" is ticked. The setting usually lives in the BIOS, but programs like Afterburner can access it.

It's not a cure-all, and may not even be your issue, but it's definitely worth checking. With today's PC makers focused on user perception and sales, CPU and/or GPU power-save and dynamic-voltage functions are often enabled automatically, and many people don't even know they are there. Crunchers and high-end gamers do notice, however.

If you see voltages constantly changing in CPU and/or GPU monitoring programs, it's often (not always) a good bet that dynamic (or "variable") voltage has been enabled in the PC BIOS for the CPU/GPU. This applies particularly to CPUs, where typically 2xCPU are run at a constant high voltage and the rest at "dynamic". The latter gives up-front speed for those not into configuration, or for hardware-unaware gamers who don't use all of the CPU/GPU - good for sales :)

Forcing constant voltage has its own costs, of course - no such thing as a free lunch - and power consumption and core temperatures will likely rise a little. From a cruncher's viewpoint, dynamic voltage can be a good thing, but it can often be a pain too, depending on what the PC is used for.

No guarantees... but it's definitely worth checking on the CPU(s) and GPU(s). Such manipulation as a default setting is becoming commonplace these days. It's great for casual gamers, but not so great for many crunchers, depending on how much they get into hardware performance.

Profile Zydor
Joined: 19 Aug 09
Posts: 364
Credit: 840,811,292
RAC: 0
Message 19287 - Posted: 29 Apr 2014, 23:08:48 UTC

Oops... forgot one thing: be even more suspicious if you have dual cards with a power strap between them.

In those cases, if power save is enabled, there can often (not always, but mostly) be issues where the BIOS powers down the second GPU card to "save power" because it's not detecting BOINC activity.

It's not as improbable as you might think, since these days the CPU is hardly accessed at all compared with the constantly busy GPU when a GPU WU is running. It's not uncommon for power save to kick in over the dual strap for the second GPU while BOINC is running. Many cards (not all) restart automatically in time for the next GPU instruction, but many don't, causing a delay.

The best way to avoid that is not to use a strap across GPU cards. Unless you are a mega high-end gamer with a screen wall, where microseconds count in zapping the bad guys in shared online gaming, today's card speeds often mean you don't need that strap.

BOINC is fine with that and will access both GPU cards over the BIOS; it does not need the dedicated strap.

Profile R.Stanneveld
Joined: 8 Dec 09
Posts: 26
Credit: 231,803,553
RAC: 0
Message 19682 - Posted: 29 Jun 2014, 22:05:26 UTC

It will reduce clocks according to how much power is being used.
It has some kind of power limit; if it reaches that, it reduces clock speeds.
____________

Profile sosiris
Joined: 11 Dec 13
Posts: 123
Credit: 55,800,869
RAC: 0
Message 19683 - Posted: 29 Jun 2014, 23:28:39 UTC - in response to Message 19279.

Please try setting the power limit higher in MSI Afterburner. I haven't met this problem because I only have a mid-range card (HD7850).
Does game playing slow down over time as well?
____________
Sosiris, team BOINC@Taiwan

Profile R.Stanneveld
Joined: 8 Dec 09
Posts: 26
Credit: 231,803,553
RAC: 0
Message 19684 - Posted: 1 Jul 2014, 7:23:33 UTC
Last modified: 1 Jul 2014, 7:33:13 UTC

I will not touch Afterburner.
I've got an ATI card, so I can do that in Catalyst Control Center (it's not an NVIDIA card :))
Mine is at a power limit of 17% or so now and runs at about 850 MHz. Fine by me; the card reaches 60 °C.
Going to crank that up later today, as I'm getting an EK waterblock :P

____________

Profile R.Stanneveld
Joined: 8 Dec 09
Posts: 26
Credit: 231,803,553
RAC: 0
Message 19685 - Posted: 1 Jul 2014, 9:46:09 UTC
Last modified: 1 Jul 2014, 9:50:19 UTC

Some more fiddling.
At 1000 MHz now, temps 66 °C.
13 minutes 22 seconds on a Solo Collatz.

And no, Sosiris, games don't suffer from the power limit.
They don't use nearly enough power to reach it.
In games you will be fine :P
I play Battlefield 4 with no power limit problems.
____________

Profile FalconFly
Joined: 25 Oct 09
Posts: 12
Credit: 207,961,802
RAC: 0
Message 19820 - Posted: 3 Sep 2014, 13:52:57 UTC - in response to Message 19685.
Last modified: 3 Sep 2014, 13:53:57 UTC

I've just done a few tests by running a couple of workunits, and I can confirm that ATI cards don't clock up appropriately when running the Collatz app.

It is worst when they share their work with a fully loaded CPU.
Raising the thread priority improves things but doesn't fix the issue.

Even running the workunit alone (all other CPU tasks suspended) still results in frequent downclocking of the GPU; effectively, it never even reaches its rated speed.

(All done on stock settings with Solo and Large tasks running.)

All that combined results in extremely poor runtimes. The stock settings IMHO need to be fixed: a single task must be able to run at sufficient priority and fully load a GPU to whatever maximum it can reach per task, without needing any additional tweaks.
____________

Profile FalconFly
Joined: 25 Oct 09
Posts: 12
Credit: 207,961,802
RAC: 0
Message 19821 - Posted: 3 Sep 2014, 22:17:06 UTC - in response to Message 19820.

I've placed the app_config.xml in the project directory and edited the config file for the large Collatz I'm running on a HD7790 right now.

Basically it didn't make an inch of difference, unless I let the GPU use half of all the CPU cores (50% of other CPU tasks disabled).

As it is now, the workunit will take well over 150 hrs to complete; another large task running on a small HD7750 (all default) will take in excess of 200 hrs.

Overall, the ATI GPU apps are really messed up on stock settings. They appear to run at less than 5% of the GPU's potential, and any increased CPU load, despite free cores, grinds performance nearly to a halt (?)
____________

RFGuy_KCCO
Joined: 8 Oct 13
Posts: 21
Credit: 1,096,742,075
RAC: 271
Message 19830 - Posted: 7 Sep 2014, 1:40:40 UTC
Last modified: 7 Sep 2014, 1:45:02 UTC

The problem you are seeing with long run times is likely due to the fact you need to set up optimized settings in your application config files. It is a simple thing to do and it will give you much better WU run times. See this thread for details and how to:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1117#18463

This post is probably the best for you to start with:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1117&postid=18529#18529
____________

Profile SDC JAURA
Joined: 2 Sep 14
Posts: 7
Credit: 1,433,525
RAC: 0
Message 19831 - Posted: 8 Sep 2014, 8:11:40 UTC - in response to Message 19279.

Getting the same problems.

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 19835 - Posted: 8 Sep 2014, 13:10:10 UTC

It used to be that using a large lookup table in texture memory on the GPU worked best. It still does with NVIDIA hardware. With OpenCL, that no longer seems to be the case on AMD hardware. Reducing the lookup table to the point where it fits within AMD's cache keeps the stream processors busy. At present, the bottleneck for AMD GPUs is memory speed rather than processing speed. Sosiris has been doing quite a bit of R&D and has suggested a number of changes that improve performance on newer AMD GPUs. I'm working on merging those changes into the Collatz apps.

Profile FalconFly
Joined: 25 Oct 09
Posts: 12
Credit: 207,961,802
RAC: 0
Message 19838 - Posted: 8 Sep 2014, 22:55:53 UTC - in response to Message 19830.
Last modified: 8 Sep 2014, 23:04:03 UTC

The problem you are seeing with long run times is likely due to the fact you need to set up optimized settings in your application config files. It is a simple thing to do and it will give you much better WU run times. See this thread for details and how to:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1117#18463

This post is probably the best for you to start with:

http://boinc.thesonntags.com/collatz/forum_thread.php?id=1117&postid=18529#18529


I did try multiple optimized settings based on the recommendations and adjusted them to the hardware's capabilities (I tried low-end and higher-end values as well).

None of that made the GPU clock up to spec, not even with multiple tasks running (max of 2, as the hardware I used isn't too potent). But they do run at about 65-80% clock and chug along, and at least the GPU load is fairly constant.
But chances are my low-end GPUs don't benefit much from optimized settings, as opposed to high-end GPUs that have much more headroom before they are fully loaded.

The overall runtimes seem okay-ish (minus what is lost due to not clocking to full speed). As I was just looking for a run-along project next to CPU tasks, getting absolute optimum performance wasn't an issue this time around.

I can live with the way it is.

My initial fears of running at extremely low efficiency luckily turned out to be unfounded. The server apparently doesn't check hardware capabilities (i.e. GFLOPS) before sending out tasks, and I hadn't yet had a chance to set more suitable preferences. Hence even my slow HD7750 initially got a large task, which I guess just takes its time on such a slow GPU. Looking at the massive credits granted for a large task, it all fits together and makes sense.

Running Mini or Solo tasks gives much more reasonable runtimes on my hardware, so I'm fine with that.
Plus, Collatz behaves very nicely concerning CPU usage with otherwise loaded CPU tasks and at least gives work :)
(I don't get work for my ATI cards on Milkyway, which seems to have run into some GPU identification issues; Einstein, on the other hand, grinds to a near-complete halt when it doesn't get a full CPU core reserved.)

So all in all, we're good enough for what I wanted to do. With optimized settings, I'm sure I could squeeze some 20-30% more out of the slow GPUs but that would also require reserving CPU cores, which I can't afford this time. So for a compromise (no reserved CPU cores), that's okay.
____________

RFGuy_KCCO
Joined: 8 Oct 13
Posts: 21
Credit: 1,096,742,075
RAC: 271
Message 19853 - Posted: 12 Sep 2014, 4:57:04 UTC - in response to Message 19838.
Last modified: 12 Sep 2014, 5:00:08 UTC

Ah, I understand now - if you don't reserve a CPU core for the AMD OpenCL apps, they will run very slowly because the GPU is starved for work from the CPU.

I use an app_config.xml to allocate one full CPU core to each AMD OpenCL GPU task. It isn't necessary for the CUDA apps, as they run just fine with <25% of a core allocated, usually using only <2% of a core on both my GTX770s and my GTX780Ti. However, to get the best completion times out of the AMD OpenCL apps, a full core is absolutely essential. I find these apps will use ~95% of a core when they run. This applies to all of my AMD GPUs: HD7870, HD7950, HD7970, R9 280 (same as an HD7950), and R9 280X (same as an HD7970). All of them are CPU hogs. Since you want to focus on CPU projects, reserving a core isn't an option, which is understandable.
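
As an illustration of the kind of app_config.xml described above, here is a minimal sketch that generates one reserving a full CPU core per GPU task. The element names (app_config, app, name, gpu_versions, gpu_usage, cpu_usage) follow BOINC's documented app_config.xml layout. The app name "collatz_large_example" is a placeholder and not the real Collatz app name; the real name has to be looked up in client_state.xml. The helper function is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Layout follows BOINC's app_config.xml format: one <app> entry whose
# <gpu_versions> block says how much GPU and CPU each task should claim.
APP_CONFIG_TEMPLATE = """<app_config>
  <app>
    <name>{name}</name>
    <gpu_versions>
      <gpu_usage>{gpu_usage}</gpu_usage>
      <cpu_usage>{cpu_usage}</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def build_app_config(app_name: str, gpu_usage: float = 1.0, cpu_usage: float = 1.0) -> str:
    """Return app_config.xml text reserving `cpu_usage` cores per GPU task."""
    text = APP_CONFIG_TEMPLATE.format(
        name=escape(app_name), gpu_usage=gpu_usage, cpu_usage=cpu_usage
    )
    ET.fromstring(text)  # raises ParseError if the result is not well-formed XML
    return text


# "collatz_large_example" is a placeholder app name (assumption, see above).
xml_text = build_app_config("collatz_large_example", gpu_usage=1.0, cpu_usage=1.0)
print(xml_text)
```

The printed text would be saved as app_config.xml in the project directory; the BOINC client picks it up after a restart or after re-reading its config files.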

When choosing which size of WU to run, remember this: 30 minutes of processing time is key. You want to pick a size that takes >30 minutes to complete, so that you get bonus points for each WU. Also, keep in mind that each size up from the Micro WUs is a 16X increase in WU size (and processing time, of course). So, Mini = 16X Micro, Solo = 16X Mini, and Large = 16X Solo.
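
The 16X rule and the 30-minute threshold can be combined into a quick estimate. The sketch below takes one measured runtime, scales it to the other sizes, and picks the smallest size that exceeds 30 minutes. The Solo figure of 13 minutes 22 seconds is the one reported earlier in this thread for a different card, so it is used purely as sample input; the function names are made up for illustration.

```python
# WU sizes in increasing order; each step up is a 16X increase in work.
SIZES = ["Micro", "Mini", "Solo", "Large"]
SCALE = 16
BONUS_THRESHOLD_MIN = 30.0  # a WU should take longer than this to earn the bonus


def estimate_runtimes(known_size: str, known_minutes: float) -> dict:
    """Scale one measured runtime (in minutes) to every WU size."""
    idx = SIZES.index(known_size)
    return {size: known_minutes * SCALE ** (i - idx) for i, size in enumerate(SIZES)}


def smallest_bonus_size(runtimes: dict):
    """Return the smallest size whose estimated runtime exceeds the threshold."""
    for size in SIZES:
        if runtimes[size] > BONUS_THRESHOLD_MIN:
            return size
    return None  # even Large finishes in under 30 minutes


# Sample input: a Solo task measured at 13 min 22 s.
runtimes = estimate_runtimes("Solo", 13 + 22 / 60)
for size in SIZES:
    print(f"{size:>5}: {runtimes[size]:8.2f} min")
print("Smallest size over 30 minutes:", smallest_bonus_size(runtimes))
```

With that sample input a Solo falls short of 30 minutes, so Large (about 214 minutes) is the first size that qualifies.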

Happy crunching!
____________

Profile FalconFly
Joined: 25 Oct 09
Posts: 12
Credit: 207,961,802
RAC: 0
Message 19964 - Posted: 20 Oct 2014, 20:09:57 UTC
Last modified: 20 Oct 2014, 20:14:19 UTC

I'm now running a number of AMD GPUs on the Project using maximum optimization.

One oddball thing, however, that I can't understand:
Right now I'm achieving good performance without reserving a CPU core for the GPU tasks. All CPUs used are 100% busy running SIMAP workunits.

To test what difference it would make, I freed one or more CPU cores for the Collatz GPU tasks.
I found that the GPU runtimes remained basically unchanged, despite the reserved CPU core being fully loaded by Collatz (?)
Not sure what the CPU was doing, but judging from the overall result, it appeared to be "very busy doing nothing useful", so to speak ;)

So for me, it doesn't seem to make any difference whether a CPU core (or even multiple cores) is free or not. Reserving one even appears wasteful.
The result is what look like good processing times without any significant CPU load...
I do remember, though, that it helped GPU utilization a lot when running the tasks completely unoptimized.

Good thing is that I really need all CPU cores for the other project anyway, so to me that's good news.
____________




Copyright © 2018 Jon Sonntag; All rights reserved.