Posts by nedmanjo

1) Message boards : Number crunching : Optimizing the apps (Message 806)
Posted 1 day ago by nedmanjo
Post:
I take it "collatz_sieve_1.40_windows_x86_64" is the CPU config file? Are there any optimizations for it, like those for the "collatz_sieve_1.30_windows_x86_64__opencl_nvidia_gpu" config file?
2) Message boards : Number crunching : Optimizing the apps (Message 642)
Posted 14 days ago by nedmanjo
Post:
Hi Martin, I appreciate your advice. I may get to that in time. Prior to flashing the BIOS I found an error: SMBIOS 0X01 P1-DIMMB1 Single Bit ECC Memory Error. The error could have originated from the multiple system crashes, or I could have a bad memory module. It happens. I'm creating a Memtest86 bootable USB drive to check it out.

Regarding the GPU: it simply won't run any GPU tasks (Collatz, GPUGrid, Amicable...) with any of the three AMD drivers written specifically for this card. Not at default settings, not tweaked. It runs for just a few minutes, then the screen corrupts, goes black, and there's no getting it back. I had no such issues for the first week. It ran stable 24/7 at default settings but throttled constantly. Downclocking eliminated the throttling, and it was stable at ~ 1,400 MHz, 20 degrees below its thermal ceiling. It ran great for several days. I had my best day ever at 6,610,612 credits!

Something changed... bad hardware, software, settings... don't know.

So, reseated the hardware, no issues found. Flashed the BIOS so it's factory configured with minor but necessary adjustments. I'll run a full test of the memory modules and we'll see where we go from there.

Thanks, I appreciate you reaching out.
3) Message boards : Number crunching : Optimizing the apps (Message 635)
Posted 14 days ago by nedmanjo
Post:
Update.... I have no clue... no stability with the GPU under load, at stock settings or otherwise. So, start from scratch.

Flash BIOS
Restore Defaults
Reseat hardware
Reinstall OS & Drivers
Test
Upon Fail... stick cattle prod in box, then activate.

Many apologies for all the lost WUs.

:(
4) Message boards : Number crunching : Optimizing the apps (Message 599)
Posted 15 days ago by nedmanjo
Post:
Problem sorted but not fully understood. It was tied to running LHC, but not at 75% as I assumed; I was running 23 WUs at 100%. So much for going off memory. MB thermals were fine, but I guess this PC couldn't take it. I reduced the CPU use to 75% and I'm stable again. It's not the first time I've run CPU projects at 100%. Some projects don't drive high thermals, and when they do I reduce the % use. Lesson learned.
5) Message boards : Number crunching : Optimizing the apps (Message 598)
Posted 15 days ago by nedmanjo
Post:
The Vega FE gaming driver was rock solid until yesterday. Thermals were good, steady clock, no apparent points of concern. Then, suddenly, my PC locked up. I have been working to get it settled since. Evidently a finicky card and/or drivers. I had to do a clean uninstall, reboot, then install the Radeon Pro Enterprise driver 18.Q2.1, reboot, then install the Radeon Adrenalin driver 18.4.1, reboot. The same issue returned. I gave the Crimson Blockchain driver a go and it ran well for a while, without tweaking, but then the PC locked up again.

So, at the moment, I'm a bit perplexed. The only thing I changed leading up to this was my CPU-based project. I thought I'd run LHC. It's long, CPU-intensive work, but it shouldn't be related. I'm running two E5-2667 v2s and only using 75% of the cores at 75%.

I'll see how it does just running Collatz and WUProp. See if the stability returns.
6) Message boards : Number crunching : Optimizing the apps (Message 571)
Posted 21 days ago by nedmanjo
Post:
I think I'd need to be at < 350 seconds per single WU to match two WUs at 700 seconds each. Running with current settings. Looks steady, thermals are good. We'll see how it does. I've currently got the core steady at ~ 1,400 MHz with good thermals. Specs indicate it should be able to run steady at > 1,400 MHz, but I'm thinking it will definitely take some time and effort to settle the thermals and steady the core.

Update: > 900 seconds each. I pushed the core clock and voltage, and for my efforts got a black screen and a frozen GUI. The best part was to come when I rebooted.... more black screen... not a process for the weak of heart.
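
The break-even arithmetic in this post can be sketched in a few lines of Python. This is only an illustration: the function name `effective_seconds_per_wu` is made up for the example, and it assumes the concurrent tasks start and finish together, which real tasks only approximate.

```python
def effective_seconds_per_wu(wall_seconds: float, concurrent_tasks: int) -> float:
    """Average wall-clock seconds per finished WU when tasks run side by side."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return wall_seconds / concurrent_tasks


# Two WUs at 700 seconds each finish one WU every 350 seconds on average,
# so a single task has to run in under 350 seconds to come out ahead.
two_at_700 = effective_seconds_per_wu(700, 2)

# The update reports more than 900 seconds each with two tasks, so the
# effective time per WU was worse than this value.
two_at_900 = effective_seconds_per_wu(900, 2)

print(two_at_700, two_at_900)
```

By the same measure, a single task at about 600 seconds is slower than two at 900 seconds each, which is why the per-WU figure is more useful than the raw task time when comparing one task against two.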

7) Message boards : Number crunching : Optimizing the apps (Message 568)
Posted 22 days ago by nedmanjo
Post:
Yes, much improved, and much better performance than the two Titan Blacks I retired. Loved those cards, but they were making my electric meter whirl, and I'm no longer in danger of passing out from heat exhaustion. Both of my Titans ran near their thermal limit... it was so hot. The Vega FE is running 20 degrees cooler and it's one thermal plant vs two. Did some more tweaking this morning, and it ran below 400 seconds per WU all day!
8) Message boards : Number crunching : Optimizing the apps (Message 564)
Posted 23 days ago by nedmanjo
Post:
Switched to the Vega Gaming Driver so I could use OverdriveNTool to undervolt the card. I read that this could address the throttling problem. So, I undervolted the card and I'm now under 450 seconds per WU. Less power, and the GPU temp dropped by 5 degrees. Nice! Probably nowhere near optimized, but improved. The GPU clock is stable just under 1,200 MHz.
9) Message boards : Number crunching : Optimizing the apps (Message 563)
Posted 23 days ago by nedmanjo
Post:
I've increased and decreased various parameters but can't seem to reduce the time further. My GPU does seem to throttle.

Specs say GPU clock speed: 1382MHz “typical,” 1600MHz peak

- GPU Clock cycling between 1100MHz - 1200MHz
- Memory clock cycling between 500 - 945MHz
- Temperature cycling within a few degrees of limit
10) Message boards : Number crunching : Optimizing the apps (Message 562)
Posted 23 days ago by nedmanjo
Post:
Currently running one task, at ~ 600 +/- 5 seconds per WU. Current settings:

verbose=1
kernels_per_reduction=48
threads=8
lut_size=17
sieve_size=30
cache_sieve=1
sleep=1
reduce_cpu=0
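
When several variations of these settings are being tried, it helps to see exactly which values changed between runs. The following is a minimal Python sketch that compares the settings above with the ones from the earlier post (Message 552). The helper name `parse_settings` is hypothetical, and the sketch assumes every value in the file is an integer, as it is in both posts.

```python
def parse_settings(text: str) -> dict[str, int]:
    """Parse key=value lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = int(value)
    return settings


# Settings from the earlier post (Message 552).
earlier = parse_settings("""
verbose=1
kernels_per_reduction=48
threads=7
lut_size=16
sieve_size=30
cache_sieve=1
sleep=1
reduce_cpu=0
""")

# Settings from this post (Message 562).
current = parse_settings("""
verbose=1
kernels_per_reduction=48
threads=8
lut_size=17
sieve_size=30
cache_sieve=1
sleep=1
reduce_cpu=0
""")

# Map each changed key to its (earlier, current) pair of values.
changed = {k: (earlier.get(k), v) for k, v in current.items() if earlier.get(k) != v}
print(changed)  # {'threads': (7, 8), 'lut_size': (16, 17)}
```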
11) Message boards : Number crunching : Optimizing the apps (Message 552)
Posted 23 days ago by nedmanjo
Post:
Trying to optimize an AMD Vega Frontier.

verbose=1
kernels_per_reduction=48
threads=7
lut_size=16
sieve_size=30
cache_sieve=1
sleep=1
reduce_cpu=0

Running two tasks simultaneously, finishing between 1,000 - 1,100 seconds each. GPU utilization is at 99 - 100%, some signs of throttling.

Tasks at 6 or 8 threads seem slower, at 9 they abort. lut_size at 17 seems to cause a pause every few cycles.

Any benefit to adding kernels_per_reduction?

Any help / guidance would be appreciated.
12) Message boards : News : Use at your own risk (Message 332)
Posted 13 May 2018 by nedmanjo
Post:
I'm having the same issue others have reported: no WUs are dropping. I've been using the same NVidia driver, v391.24, not v397.31, the version recently reported to have an issue. For GP's I performed a reset, no change, then detached and reattached the project, again no change. GPUGrid has been running since without issue.
13) Message boards : Number crunching : Dual GPU question (Message 93)
Posted 21 Apr 2018 by nedmanjo
Post:
The GPU portion is the same as what's in my cc_config file. I'm running two Titan Blacks, 2 tasks per GPU.

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <report_results_immediately>1</report_results_immediately>
  </options>
</cc_config>

Did you check your app_config file?

<app_config>
  <app>
    <name>collatz_sieve</name>
    <max_concurrent>4</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.50</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
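
How the numbers in the app_config above fit together can be sketched in Python. This is a rough illustration, not the BOINC client's actual scheduling code: the function name `tasks_per_gpu` is made up, and it assumes the client keeps starting tasks on a GPU until their `gpu_usage` fractions add up to 1.

```python
import math


def tasks_per_gpu(gpu_usage: float) -> int:
    """Approximate number of tasks that fit on one GPU for a <gpu_usage> fraction."""
    if gpu_usage <= 0:
        raise ValueError("gpu_usage must be positive")
    # Small epsilon guards against floating-point error in the division.
    return max(1, math.floor(1 / gpu_usage + 1e-9))


gpu_count = 2      # two Titan Blacks, per the post
gpu_usage = 0.50   # from <gpu_usage>
cpu_usage = 0.50   # from <cpu_usage>

per_gpu = tasks_per_gpu(gpu_usage)        # 2 tasks per GPU
total_tasks = per_gpu * gpu_count         # 4, matching <max_concurrent>
cpu_cores_budgeted = total_tasks * cpu_usage  # CPU budgeted across all GPU tasks

print(per_gpu, total_tasks, cpu_cores_budgeted)
```

Under that assumption, a `gpu_usage` of 0.25 would correspond to the 4 tasks per GPU mentioned elsewhere in this thread, and `max_concurrent` would need to rise to 8 to allow them all on two cards.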
14) Message boards : Number crunching : Optimizing the apps (Message 84)
Posted 21 Apr 2018 by nedmanjo
Post:
One other question. What are your thoughts on running 1 task per GPU versus more? I've run as many as 4 on each Titan Black GPU.
15) Message boards : Number crunching : Optimizing the apps (Message 83)
Posted 21 Apr 2018 by nedmanjo
Post:
Any chance of anyone having an optimization for a Titan Black? Also, what's the best way to determine if the changes are improving computational speed? Do you just run a task and then check to see if the task time is decreasing / increasing?
16) Message boards : News : Use at your own risk (Message 46)
Posted 19 Apr 2018 by nedmanjo
Post:
Backed off to 2 per GPU. Checked stats this morning. So many errors... all CPU - Collatz Sieve v1.30 windows_x86_64. Toggling off CPU tasks for now.

GPU tasks are running OK, but my perception is that they are taking longer to process.
17) Message boards : News : Use at your own risk (Message 45)
Posted 19 Apr 2018 by nedmanjo
Post:
Looked closer after reading through the threads. GPU tasks are OK; CPU tasks, not so much: all failed. Toggling off CPU tasks for now.
18) Message boards : News : Use at your own risk (Message 44)
Posted 19 Apr 2018 by nedmanjo
Post:
Just checked my stats this morning, yikes! So many errors. -1073741795 (0xC000001D) STATUS_ILLEGAL_INSTRUCTION

Valid (16) · Invalid (0) · Error (125)
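
The negative exit code and the hex value in this post are the same 32-bit number read as signed and unsigned. A short Python sketch shows the conversion along with the error rate implied by the counts above. The function name `to_ntstatus_hex` is made up for the example.

```python
def to_ntstatus_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# The exit code reported for the failed tasks.
code = to_ntstatus_hex(-1073741795)

# Task counts from the results summary in the post.
valid, invalid, error = 16, 0, 125
error_rate = error / (valid + invalid + error)

print(code, f"{error_rate:.1%}")  # 0xC000001D 88.7%
```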
19) Message boards : News : Use at your own risk (Message 24)
Posted 19 Apr 2018 by nedmanjo
Post:
Getting WUs... over 60 now, running 6 at a time over 2 GPUs. Seems to be running normally. Credit or no, I'll let them run. I appreciate the effort to get this project back up and running.




©2018 Jon Sonntag; All rights reserved