Posts by Hans-Ulrich Hugi
1) Message boards : Number crunching : AMD GPU reduce clock under load (Message 19279)
Posted 1365 days ago by Hans-Ulrich Hugi
I've noticed strange behaviour with the new apps (Large Collatz Conjecture v6.04 (opencl_amd_gpu)):

When the app starts, the GPU runs at the correct (high) clock speed. During the calculation the clock speed drops and stays at the lower level. This results in a massively longer execution time.
It happens with a 7950, a 280X and a 290 GPU. It happens with the 14.3 Beta driver and with the 14.4 final driver.

This is not a thermal problem under load! The GPU doesn't overheat. On the 290 the GPU clock speed goes down from 1 GHz to 570 MHz. The memory clock drops as well. The GPU load stays somewhere between 40 and 100% (i.e. it doesn't change). I have to disable Collatz and start another project (e.g. Milkyway) to get the standard clock back. Sometimes it helps to stop and restart all running projects.
Does anybody else have that problem?
2) Message boards : News : v4.07 Released for Windows (Message 16517)
Posted 1712 days ago by Hans-Ulrich Hugi
Thank you Slicker!

I found the article without your link and modified the config files. Because I run Collatz "out of the box" and unchanged, I didn't notice that I had to modify anything for a 69x0. And without reading it completely, I thought the change was only needed for the 79x0 boards to run Collatz. My fault.

Anyway. Performance is not just slower, I would say performance is very (!) poor:
my "Collatz 2.09 (ati13ati)" tasks finish in less than 30 minutes
but a "Collatz 4.07 (opencl_ati_100)" task is at 1.3% after 30 minutes

I'm aware that the 69x0 cards don't have the OpenCL power of the 79x0 boards. But compared to the old app it's a shame!
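
To put the two figures above side by side, here is a rough sketch that extrapolates the total runtime from the reported progress. It assumes progress is linear over the whole task, which may not hold in practice, and the function name is only illustrative. Under that assumption, 1.3% after 30 minutes works out to roughly 38 hours per task.

```python
def estimate_total_minutes(elapsed_minutes: float, percent_done: float) -> float:
    """Linearly extrapolate the total runtime from the progress so far.

    Assumption: progress advances at a constant rate for the whole task.
    """
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return elapsed_minutes * 100.0 / percent_done


# Figures from the post: 1.3% done after 30 minutes with the 4.07 OpenCL app.
new_total = estimate_total_minutes(30.0, 1.3)

# The 2.09 tasks finish in "less than 30 minutes", so 30 is an upper bound.
old_total_upper_bound = 30.0

# Because old_total_upper_bound is an upper bound, this is a lower bound.
slowdown = new_total / old_total_upper_bound

print(f"estimated total: {new_total:.0f} min ({new_total / 60:.1f} h)")
print(f"slowdown vs. old app: at least {slowdown:.0f}x")
```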
3) Message boards : News : v4.07 Released for Windows (Message 16508)
Posted 1712 days ago by Hans-Ulrich Hugi
All OpenCL tasks fail immediately on an AMD 69x0 with Catalyst 13.1 / Win7-X64.
The error message in "solo_collatz v4.07 (opencl_ati_100)" and "collatz v4.07 (opencl_ati_100)" is always exactly the same:

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00000000776CE4B4 write attempt to address 0x00000024

Examples:
Task 140478053 (solo_collatz_2379515330667410598248_824633720832_0) or
Task 140471613 (collatz_2379514024138359155048_824633720832_1)
4) Message boards : Number crunching : 6850 performance? (Message 10689)
Posted 2589 days ago by Hans-Ulrich Hugi
My 6950 running @ stock isn't detected correctly:

CAL Runtime: 1.4.900
Found 1 CAL device

Device 0: unknown ATI card 2048 MB local RAM (remote 1855 MB cached + 1855 MB uncached)
GPU core clock: 800 MHz, memory clock: 1250 MHz
1760 shader units organized in 22 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Run times are between

CPU time: 2.54282 seconds, GPU time: 414.335 seconds, wall clock time: 414.846 seconds

and

CPU time: 2.46482 seconds, GPU time: 467.909 seconds, wall clock time: 468.658 seconds
5) Message boards : Number crunching : Radeon HD 5870 Reviews now online. (Message 2466)
Posted 3033 days ago by Hans-Ulrich Hugi
Today I did some testing with the 5870 at different clock speeds. All results are under Win7 Ultimate 64bit (Build 7600) with a Q9650 @ 3.75 GHz, BOINC client 6.10.11 and the 2.04 Collatz software (no command line parameters changed).

Clocks 850 / 1200 (@stock)
===========================

GPU load max. 50% (GPU-Z 0.3.5)

GPU core clock: 850 MHz, memory clock: 300 MHz (Wrong: 1200 MHz!)

predicted runtime per iteration is 34 ms (33.3333 ms are allowed), dividing each iteration in 2 parts
borders of the domains at 0 2048 4096
needed 1674 steps for 2361224431037583010991
72615528055281 total executed steps for 137438953472 numbers

CPU time: 0.358802 seconds, GPU time: 512.367 seconds, wall clock time: 512.773 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.358802 seconds, GPU time: 512.757 seconds, wall clock time: 513.149 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.343202 seconds, GPU time: 512.399 seconds, wall clock time: 512.79 seconds, CPU frequency: 3.7371 GHz

Clocks 935 / 1320 (both + 10%)
===============================

GPU load max. 99% (GPU-Z 0.3.5)

GPU core clock: 935 MHz, memory clock: 300 MHz (Wrong: 1320 MHz!)

predicted runtime per iteration is 31 ms (33.3333 ms are allowed)
borders of the domains at 0 4096
needed 1679 steps for 2361218053894194294923
76789984427940 total executed steps for 137438953472 numbers

CPU time: 0.561604 seconds, GPU time: 256.979 seconds, wall clock time: 257.348 seconds, CPU frequency: 3.7371 GHz
CPU time: 2.32441 seconds, GPU time: 263.585 seconds, wall clock time: 264.059 seconds, CPU frequency: 3.7371 GHz
CPU time: 1.06081 seconds, GPU time: 257.785 seconds, wall clock time: 258.158 seconds, CPU frequency: 3.7371 GHz
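
Comparing the two logs above: at 850 MHz the predicted runtime per iteration (34 ms) is just over the allowed 33.3333 ms, so each iteration is divided into 2 parts, while at 935 MHz (31 ms) it is not. The sketch below shows one rule that would reproduce these log lines. The rule itself and the function names are assumptions made for illustration; the actual logic in the Collatz application is not shown in the logs.

```python
import math

ALLOWED_MS = 33.3333  # the limit printed in the logs


def parts_per_iteration(predicted_ms: float, allowed_ms: float = ALLOWED_MS) -> int:
    """Hypothetical rule: split an iteration until each part fits the limit."""
    return max(1, math.ceil(predicted_ms / allowed_ms))


def domain_borders(total: int, parts: int) -> list[int]:
    """Evenly spaced domain borders, e.g. 0 2048 4096 for two parts."""
    return [total * i // parts for i in range(parts + 1)]


for predicted in (34, 31):
    parts = parts_per_iteration(predicted)
    print(predicted, "ms ->", parts, "part(s), borders", domain_borders(4096, parts))
```

If a rule like this applies, a small clock increase that pushes the prediction below the limit would avoid the split altogether, which could explain why the runtime changes by far more than 10%.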

The higher clock speed nearly doubles the load and halves the calculation time. So I wanted to know whether it is the GPU clock or the memory clock that gives this advantage:

Clocks 850 / 1320 (GPU @stock / Memory + 10%)
=============================================

GPU load max. 54% (GPU-Z 0.3.5)

GPU core clock: 850 MHz, memory clock: 300 MHz (Wrong: 1320 MHz!)

predicted runtime per iteration is 34 ms (33.3333 ms are allowed), dividing each iteration in 2 parts
borders of the domains at 0 2048 4096
needed 1674 steps for 2361211536082181078012
70073165112471 total executed steps for 137438953472 numbers

CPU time: 0.358802 seconds, GPU time: 512.445 seconds, wall clock time: 512.776 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.327602 seconds, GPU time: 512.352 seconds, wall clock time: 512.704 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.358802 seconds, GPU time: 512.398 seconds, wall clock time: 512.81 seconds, CPU frequency: 3.7371 GHz

Clocks 935 / 1200 (GPU + 10% / Memory @stock)
==============================================

GPU load max. 98% (GPU-Z 0.3.5)

GPU core clock: 935 MHz, memory clock: 300 MHz (Wrong: 1200 MHz!)

predicted runtime per iteration is 31 ms (33.3333 ms are allowed)
borders of the domains at 0 4096
needed 1723 steps for 2361229808201720130217
64635609237816 total executed steps for 137438953472 numbers

CPU time: 0.842405 seconds, GPU time: 310.285 seconds, wall clock time: 311.448 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.748805 seconds, GPU time: 306.697 seconds, wall clock time: 310.218 seconds, CPU frequency: 3.7371 GHz
CPU time: 0.702005 seconds, GPU time: 305.527 seconds, wall clock time: 307.559 seconds, CPU frequency: 3.7371 GHz

My 5870 runs stable at 935/1320 MHz with BOINC and all tested games. If I run only BOINC, even higher clocks are stable, but they don't give a much better result. Btw: with Milkyway the higher clocks give only a minimal advantage.
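
For a quick summary of the four test runs, the following sketch averages the GPU times from the logs above and expresses each configuration as a speedup over stock clocks. The numbers are copied from the logs. Note that each run processed a different work unit (the total executed steps differ), so the comparison is only approximate.

```python
from statistics import mean

# GPU times in seconds, keyed by "core clock / memory clock" in MHz.
gpu_times = {
    "850/1200": [512.367, 512.757, 512.399],  # both at stock
    "935/1320": [256.979, 263.585, 257.785],  # both +10%
    "850/1320": [512.445, 512.352, 512.398],  # memory +10% only
    "935/1200": [310.285, 306.697, 305.527],  # core +10% only
}

baseline = mean(gpu_times["850/1200"])

# Speedup > 1 means faster than stock clocks.
speedup = {clocks: baseline / mean(times) for clocks, times in gpu_times.items()}

for clocks, times in gpu_times.items():
    print(f"{clocks}: mean {mean(times):.1f} s, speedup {speedup[clocks]:.2f}x")
```

On these figures, raising only the memory clock changes almost nothing, while raising only the core clock accounts for most of the gain.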
6) Message boards : Number crunching : Radeon HD 5870 Reviews now online. (Message 2316)
Posted 3036 days ago by Hans-Ulrich Hugi
I didn't notice the 300 MHz memory clock message but my 5870 says the same:

Running Collatz Conjecture (3x+1) ATI GPU application version 1.22 by Gipsel (Win64, CAL 1.4)
instructed by BOINC client to use device 0
Reading input file ... done.
Checking 137438953472 numbers starting with 2361220262607845042536

CPU: Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz (4 cores/threads) 3.73699 GHz (3ms)

CAL Runtime: 1.4.427
Found 1 CAL device

Device 0: ATI Radeon HD5800 series (Cypress) 1024 MB local RAM (remote 2047 MB cached + 2047 MB uncached)
GPU core clock: 900 MHz, memory clock: 300 MHz
1600 shader units organized in 20 SIMDs with 16 VLIW units (5-issue), wavefront size 64 threads
supporting double precision

Initializing lookup table (16384 kB) ... done
Starting WU on GPU 0
Copy lookup table to GPU memory (16384 kB)
Initialize step array on GPU (256 MB)
predicted runtime per iteration is 33 ms (33.3333 ms are allowed)
borders of the domains at 0 4096
No checkpoint data found.
needed 2028 steps for 2361220262660541204137
71333559129137 total executed steps for 137438953472 numbers
Generating result output.

WU completed.
CPU time: 0.374402 seconds, GPU time: 384.65 seconds, wall clock time: 384.981 seconds, CPU frequency: 3.73711 GHz

I will check it this evening with GPU-Z; then we will know whether the memory is running at 2D speed. Thanks, Gipsel, for the updated GPU detection! Only the BOINC client is unable to detect the correct GPU:

Coprocessors ATI ATI unknown (1024MB) driver: 1.4.427
(WinXP 32bit and Win7 64bit)






Copyright © 2018 Jon Sonntag; All rights reserved.