Collatz v3.17 CUDA Testers for Windows Needed
Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15359 - Posted: 30 Oct 2012, 20:00:10 UTC

Version 3.17 of the Collatz CUDA application for Windows is now available in both 32-bit and 64-bit versions. This fixes a bug in versions 3.15 and 3.16 which can result in invalid results being returned. If you are testing either v3.15 or v3.16, you are strongly encouraged to update to v3.17.

Windows 32-bit: http://boinc.thesonntags.com/collatz/download/test/collatz_3.17_windows_intelx86__cuda42.zip

Windows x64: http://boinc.thesonntags.com/collatz/download/test/collatz_3.17_windows_x86_64__cuda42.zip

It requires CUDA v4.2 drivers and must be installed as an anonymous platform application (i.e. using an app_info.xml file). Also, the optimization parameters are located in a config file, so check the README.

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15364 - Posted: 31 Oct 2012, 2:50:33 UTC - in response to Message 15359.

Still nothing.


10/30/2012 7:48:08 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
10/30/2012 7:48:08 PM | | Data directory: C:\ProgramData\BOINC
10/30/2012 7:48:08 PM | | Running under account David
10/30/2012 7:48:08 PM | | Processor: 4 AuthenticAMD AMD FX(tm)-4100 Quad-Core Processor [Family 21 Model 1 Stepping 2]
10/30/2012 7:48:08 PM | | Processor: 2.00 MB cache
10/30/2012 7:48:08 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm svm sse4a osvw ibs xop skinit wdt lwp fma4 page1gb rdtscp
10/30/2012 7:48:08 PM | | OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
10/30/2012 7:48:08 PM | | Memory: 11.98 GB physical, 23.97 GB virtual
10/30/2012 7:48:08 PM | | Disk: 279.45 GB total, 155.53 GB free
10/30/2012 7:48:08 PM | | Local time is UTC -7 hours
10/30/2012 7:48:08 PM | | Couldn't get Device IDs for platform #0: error -1
10/30/2012 7:48:08 PM | | NVIDIA GPU 0: GeForce GTX 670 (driver version 310.33, CUDA version 5.0, compute capability 3.0, 2048MB, 1951MB available, 2634 GFLOPS peak)
10/30/2012 7:48:08 PM | | NVIDIA GPU 1: GeForce GTX 650 Ti (driver version 310.33, CUDA version 5.0, compute capability 3.0, 1024MB, 943MB available, 1646 GFLOPS peak)
10/30/2012 7:48:08 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 670 (driver version 310.33, device version OpenCL 1.1 CUDA, 2048MB, 1951MB available)
10/30/2012 7:48:08 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 650 Ti (driver version 310.33, device version OpenCL 1.1 CUDA, 1024MB, 943MB available)
10/30/2012 7:48:08 PM | Collatz Conjecture | Found app_info.xml; using anonymous platform
10/30/2012 7:48:08 PM | SETI@home | Found app_info.xml; using anonymous platform
10/30/2012 7:48:08 PM | | Config: use all coprocessors
10/30/2012 7:48:08 PM | | Config: don't compute while saintsrowthethird_dx11.exe is running
10/30/2012 7:48:08 PM | | Config: don't compute while SBW.exe is running
10/30/2012 7:48:08 PM | | Config: don't compute while TESV.exe is running
10/30/2012 7:48:08 PM | | Config: GUI RPC allowed from:
10/30/2012 7:48:08 PM | | Config: Bruno
10/30/2012 7:48:08 PM | | Config: Johann
10/30/2012 7:48:08 PM | | Config: 192.168.1.2
10/30/2012 7:48:08 PM | | Config: 192.168.1.3
10/30/2012 7:48:08 PM | | Config: 192.168.1.4
10/30/2012 7:48:08 PM | | Config: 192.168.1.5
10/30/2012 7:48:08 PM | | Config: 192.168.1.6
10/30/2012 7:48:08 PM | Albert@Home | URL http://albert.phys.uwm.edu/; Computer ID 2728; resource share 0
10/30/2012 7:48:08 PM | Collatz Conjecture | URL http://boinc.thesonntags.com/collatz/; Computer ID 81107; resource share 0
10/30/2012 7:48:08 PM | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 344435; resource share 0
10/30/2012 7:48:08 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 5407589; resource share 100
10/30/2012 7:48:08 PM | SETI@home Beta Test | URL http://setiweb.ssl.berkeley.edu/beta/; Computer ID 55184; resource share 50
10/30/2012 7:48:08 PM | PrimeGrid | URL http://www.primegrid.com/; Computer ID 231264; resource share 0
10/30/2012 7:48:08 PM | SETI@home | General prefs: from SETI@home (last modified 06-Oct-2012 22:17:07)
10/30/2012 7:48:08 PM | SETI@home | Computer location: home
10/30/2012 7:48:08 PM | | General prefs: using separate prefs for home
10/30/2012 7:48:08 PM | | Preferences:
10/30/2012 7:48:08 PM | | max memory usage when active: 6135.59MB
10/30/2012 7:48:08 PM | | max memory usage when idle: 11044.06MB
10/30/2012 7:48:08 PM | | max disk usage: 10.00GB
10/30/2012 7:48:08 PM | | max CPUs used: 3
10/30/2012 7:48:08 PM | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
10/30/2012 7:48:08 PM | | Using proxy info from GUI
10/30/2012 7:48:08 PM | | Not using a proxy
10/30/2012 7:48:10 PM | | Suspending computation - user request
10/30/2012 7:48:15 PM | SETI@home | project suspended by user
10/30/2012 7:48:21 PM | Collatz Conjecture | update requested by user
10/30/2012 7:48:26 PM | Collatz Conjecture | [sched_op] Starting scheduler request
10/30/2012 7:48:26 PM | Collatz Conjecture | Sending scheduler request: Requested by user.
10/30/2012 7:48:26 PM | Collatz Conjecture | Requesting new tasks for NVIDIA
10/30/2012 7:48:26 PM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
10/30/2012 7:48:26 PM | Collatz Conjecture | [sched_op] NVIDIA work request: 1.00 seconds; 2.00 devices
10/30/2012 7:48:28 PM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
10/30/2012 7:48:28 PM | Collatz Conjecture | [sched_op] Server version 611
10/30/2012 7:48:28 PM | Collatz Conjecture | No work sent
10/30/2012 7:48:28 PM | Collatz Conjecture | Message from server: No work available for the applications you have selected. Please check your project preferences on the web site.
10/30/2012 7:48:28 PM | Collatz Conjecture | Project requested delay of 182 seconds
10/30/2012 7:48:28 PM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
10/30/2012 7:48:28 PM | Collatz Conjecture | [sched_op] Reason: requested by project

____________

zzuupp
Joined: 14 Mar 10
Posts: 128
Credit: 347,767,607
RAC: 30,373
Message 15371 - Posted: 1 Nov 2012, 3:07:45 UTC - in response to Message 15364.

3.17 is working well & no more double printing.

-----
Stderr output

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
Collatz Conjecture v3.17 i686 for CUDA 4.2
Based on the AMD Brook+ kernels by Gipsel
verbose=1
items_per_kernel=19
kernels_per_reduction=8
threads=9
sleep=1
Parameters --device 0

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15374 - Posted: 1 Nov 2012, 13:56:55 UTC

Collatz Conjecture | URL http://boinc.thesonntags.com/collatz/; Computer ID 81107; resource share 0


I thought I had posted the following yesterday, but evidently not....

Have you tried increasing the resource share? I'm not sure how setting it as a backup project works, as the server code predates backup-project support in the BOINC client. So, if the resource shares are 0 and 100 for two projects, and the server looks at the total time requested and applies the fraction of the resource share, then 0 * N = 0. The scheduler.log displays each plan class as it checks to see which is the best fit for the host. When I look at the log for your computer, it doesn't even make it that far, which makes me think that it isn't the custom scheduler code but something else.
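
The 0 * N = 0 concern can be sketched in a few lines. This only illustrates the guess above and is not taken from the BOINC server code; the function name `share_scaled_request` and the assumption that the server scales a request by the project's fraction of the total resource share are both hypothetical.

```python
def share_scaled_request(requested_seconds, project_share, all_shares):
    """Hypothetical: scale a work request by the project's share fraction."""
    total = sum(all_shares)
    if total == 0:
        return 0.0
    return requested_seconds * (project_share / total)

# Two projects with resource shares 0 (backup) and 100.
shares = [0, 100]
backup = share_scaled_request(1000.0, 0, shares)     # 0/100 of the request
primary = share_scaled_request(1000.0, 100, shares)  # 100/100 of the request
print(backup, primary)
```

If a server really did work this way, a share of 0 would always yield a zero-second request no matter how much work the client asked for, which is why raising the share is a cheap thing to try first.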

Has anyone else gotten work from Collatz with it set to a backup project?

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15376 - Posted: 1 Nov 2012, 15:32:15 UTC - in response to Message 15374.

Upping the resource share to 1 gives the same message.


11/1/2012 8:30:12 AM | Collatz Conjecture | [sched_op] Starting scheduler request
11/1/2012 8:30:12 AM | Collatz Conjecture | Sending scheduler request: To fetch work.
11/1/2012 8:30:12 AM | Collatz Conjecture | Requesting new tasks for NVIDIA
11/1/2012 8:30:12 AM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/1/2012 8:30:12 AM | Collatz Conjecture | [sched_op] NVIDIA work request: 604800.00 seconds; 2.00 devices
11/1/2012 8:30:15 AM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
11/1/2012 8:30:15 AM | Collatz Conjecture | [sched_op] Server version 611
11/1/2012 8:30:15 AM | Collatz Conjecture | No work sent
11/1/2012 8:30:15 AM | Collatz Conjecture | Message from server: No work available for the applications you have selected. Please check your project preferences on the web site.
11/1/2012 8:30:15 AM | Collatz Conjecture | Project requested delay of 182 seconds
11/1/2012 8:30:15 AM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
11/1/2012 8:30:15 AM | Collatz Conjecture | [sched_op] Reason: requested by project

____________

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15379 - Posted: 1 Nov 2012, 20:02:15 UTC - in response to Message 15376.

I turned on max debugging on the scheduler so if you can do one more update, that would be appreciated.

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15381 - Posted: 1 Nov 2012, 23:24:28 UTC

Here are several updates, with both regular and mini selected.

11/1/2012 2:29:32 PM | Collatz Conjecture | [sched_op] Starting scheduler request
11/1/2012 2:29:32 PM | Collatz Conjecture | Sending scheduler request: To fetch work.
11/1/2012 2:29:32 PM | Collatz Conjecture | Requesting new tasks for NVIDIA
11/1/2012 2:29:32 PM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/1/2012 2:29:32 PM | Collatz Conjecture | [sched_op] NVIDIA work request: 157811.69 seconds; 0.00 devices
11/1/2012 2:29:34 PM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
11/1/2012 2:29:34 PM | Collatz Conjecture | [sched_op] Server version 611
11/1/2012 2:29:34 PM | Collatz Conjecture | No work sent
11/1/2012 2:29:34 PM | Collatz Conjecture | Project requested delay of 182 seconds
11/1/2012 2:29:34 PM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
11/1/2012 2:29:34 PM | Collatz Conjecture | [sched_op] Reason: requested by project
11/1/2012 2:52:28 PM | Collatz Conjecture | [sched_op] Starting scheduler request
11/1/2012 2:52:28 PM | Collatz Conjecture | Sending scheduler request: To fetch work.
11/1/2012 2:52:28 PM | Collatz Conjecture | Requesting new tasks for NVIDIA
11/1/2012 2:52:28 PM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/1/2012 2:52:28 PM | Collatz Conjecture | [sched_op] NVIDIA work request: 160774.54 seconds; 0.00 devices
11/1/2012 2:52:31 PM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
11/1/2012 2:52:31 PM | Collatz Conjecture | [sched_op] Server version 611
11/1/2012 2:52:31 PM | Collatz Conjecture | No work sent
11/1/2012 2:52:31 PM | Collatz Conjecture | Project requested delay of 182 seconds
11/1/2012 2:52:31 PM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
11/1/2012 2:52:31 PM | Collatz Conjecture | [sched_op] Reason: requested by project
11/1/2012 3:44:08 PM | Collatz Conjecture | [sched_op] Starting scheduler request
11/1/2012 3:44:08 PM | Collatz Conjecture | Sending scheduler request: To fetch work.
11/1/2012 3:44:08 PM | Collatz Conjecture | Requesting new tasks for NVIDIA
11/1/2012 3:44:08 PM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/1/2012 3:44:08 PM | Collatz Conjecture | [sched_op] NVIDIA work request: 167001.83 seconds; 0.00 devices
11/1/2012 3:44:09 PM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
11/1/2012 3:44:09 PM | Collatz Conjecture | [sched_op] Server version 611
11/1/2012 3:44:09 PM | Collatz Conjecture | No work sent
11/1/2012 3:44:09 PM | Collatz Conjecture | Project requested delay of 182 seconds
11/1/2012 3:44:09 PM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
11/1/2012 3:44:09 PM | Collatz Conjecture | [sched_op] Reason: requested by project

____________

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15386 - Posted: 2 Nov 2012, 15:59:47 UTC - in response to Message 15381.

Here's what I'm seeing on the server:

2012-11-01 17:44:14.3471 [PID=5880 ] Request: [USER#1652] [HOST#81107] [IP 24.156.119.117] client 7.0.36
2012-11-01 17:44:14.3476 [PID=5880 ] [send] Not using matchmaker scheduling; Not using EDF sim
2012-11-01 17:44:14.3477 [PID=5880 ] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2012-11-01 17:44:14.3477 [PID=5880 ] [send] CUDA: req 167001.83 sec, 0.00 instances; est delay 0.00
2012-11-01 17:44:14.3477 [PID=5880 ] [send] work_req_seconds: 167001.83 secs
2012-11-01 17:44:14.3477 [PID=5880 ] [send] available disk 8.49 GB, work_buf_min 259200
2012-11-01 17:44:14.3477 [PID=5880 ] [send] active_frac 0.890552 on_frac 0.991667
2012-11-01 17:44:14.3477 [PID=5880 ] Anonymous platform app versions:
2012-11-01 17:44:14.3477 [PID=5880 ] app: collatz version 311 cpus 0.01 cudas 0.00 atis 0.00 flops 100.000000G
2012-11-01 17:44:14.3477 [PID=5880 ] app: mini_collatz version 311 cpus 0.01 cudas 0.00 atis 0.00 flops 100.000000G
2012-11-01 17:44:14.3480 [PID=5880 ] [send] [AV#125] not reliable; cons valid 0 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2012-11-01 17:44:14.3481 [PID=5880 ] [send] [AV#145] not reliable; cons valid 0 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2012-11-01 17:44:14.3481 [PID=5880 ] [send] [AV#173] not reliable; cons valid 0 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2012-11-01 17:44:14.3481 [PID=5880 ] [send] [AV#1000003] not reliable; cons valid 0 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2012-11-01 17:44:14.3481 [PID=5880 ] [send] [AV#1000004] not reliable; cons valid 1 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2012-11-01 17:44:14.3481 [PID=5880 ] [send] [AV#2000004] not reliable; cons valid 1 < 10
2012-11-01 17:44:14.3481 [PID=5880 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2012-11-01 17:44:14.3574 [PID=5880 ] [send] [HOST#81107] is looking for work from a non-preferred application
2012-11-01 17:44:14.3637 [PID=5880 ] Sending reply to [HOST#81107]: 0 results, delay req 181.80
2012-11-01 17:44:14.3638 [PID=5880 ] Scheduler ran 0.021 seconds


Looks to me like it is requesting CUDA work but giving a CUDA count of 0.

Can you check that the count for the coproc is > 0 in the app_info? If it is, could it be a possible 7.0.36 bug where it doesn't display the count correctly? Or doesn't do it correctly only for non-preferred platforms?

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15387 - Posted: 2 Nov 2012, 16:09:08 UTC

And here's one that gets work OK with an anonymous app:

2012-11-01 13:03:38.5372 [PID=18022] Request: [USER#8369] [HOST#94620] [IP 70.201.3.178] client 6.12.34
2012-11-01 13:03:38.5377 [PID=18022] [send] Not using matchmaker scheduling; Not using EDF sim
2012-11-01 13:03:38.5377 [PID=18022] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2012-11-01 13:03:38.5377 [PID=18022] [send] CUDA: req 197.97 sec, 0.00 instances; est delay 0.00
2012-11-01 13:03:38.5377 [PID=18022] [send] work_req_seconds: 197.97 secs
2012-11-01 13:03:38.5377 [PID=18022] [send] available disk 8.81 GB, work_buf_min 864
2012-11-01 13:03:38.5377 [PID=18022] [send] active_frac 0.999759 on_frac 0.997722
2012-11-01 13:03:38.5377 [PID=18022] Anonymous platform app versions:
2012-11-01 13:03:38.5377 [PID=18022] app: collatz version 311 cpus 0.01 cudas 1.00 atis 0.00 flops 365.197242G
2012-11-01 13:03:38.5378 [PID=18022] app: mini_collatz version 311 cpus 0.01 cudas 1.00 atis 0.00 flops 100.000000G
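
For anyone comparing their own scheduler log entries, here is a rough sketch of pulling the `cudas` value out of the "app:" lines. The two sample lines are copied from the server logs in this thread; the regular expression and the helper name `cuda_count` are my own assumptions about the line layout, not part of the BOINC server.

```python
import re

# "app:" lines copied from the two server logs in this thread, keyed by host.
APP_LINES = {
    "HOST#81107": "app: collatz version 311 cpus 0.01 cudas 0.00 atis 0.00 flops 100.000000G",
    "HOST#94620": "app: collatz version 311 cpus 0.01 cudas 1.00 atis 0.00 flops 365.197242G",
}

# Assumed layout: app: <name> version <n> cpus <x> cudas <y> atis <z> ...
APP_RE = re.compile(r"app: (\S+) version (\d+) cpus ([\d.]+) cudas ([\d.]+) atis ([\d.]+)")

def cuda_count(line):
    """Return the 'cudas' value from an anonymous-platform app version line."""
    match = APP_RE.search(line)
    if match is None:
        raise ValueError("not an app version line: %r" % line)
    return float(match.group(4))

for host, line in APP_LINES.items():
    count = cuda_count(line)
    status = "ok" if count > 0 else "no CUDA devices reported for the app"
    print(host, count, status)
```

The host that gets work reports `cudas 1.00`; the one that does not reports `cudas 0.00`, even though both request CUDA seconds.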

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15388 - Posted: 2 Nov 2012, 23:58:36 UTC - in response to Message 15386.

It is the standard app_info that you provided with the app.

<app_info>
<app>
<name>collatz</name>
</app>
<app>
<name>mini_collatz</name>
</app>
<file_info>
<name>collatz_3.17_windows_x86_64__cuda42.exe</name>
<executable/>
</file_info>
<file_info>
<name>cudart64_42_9.dll</name>
<executable/>
</file_info>
<file_info>
<name>collatz.config</name>
</file_info>
<app_version>
<app_name>collatz</app_name>
<version_num>311</version_num>
<platform>windows_x86_64</platform>
<plan_class>cuda42</plan_class>
<avg_ncpus>0.011</avg_ncpus>
<max_ncpus>1</max_ncpus>
<flops>1.0e11</flops>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>collatz_3.17_windows_x86_64__cuda42.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart64_42_9.dll</file_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>collatz.config</file_name>
<copy_file/>
</file_ref>
</app_version>
<app_version>
<app_name>mini_collatz</app_name>
<version_num>311</version_num>
<platform>windows_x86_64</platform>
<plan_class>cuda42</plan_class>
<avg_ncpus>0.011</avg_ncpus>
<max_ncpus>1</max_ncpus>
<flops>1.0e11</flops>
<coproc>
<type>CUDA</type>
<count>1</count>
</coproc>
<file_ref>
<file_name>collatz_3.17_windows_x86_64__cuda42.exe</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>cudart64_42_9.dll</file_name>
<copy_file/>
</file_ref>
<file_ref>
<file_name>collatz.config</file_name>
<copy_file/>
</file_ref>
</app_version>
</app_info>
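
A quick way to double-check the coproc count without reading the XML by eye is sketched below. It uses a trimmed copy of the app_info above (only the elements needed for the check); the helper name `coproc_counts` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the app_info.xml above: only the fields this check reads.
APP_INFO = """<app_info>
  <app_version>
    <app_name>collatz</app_name>
    <plan_class>cuda42</plan_class>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
  </app_version>
  <app_version>
    <app_name>mini_collatz</app_name>
    <plan_class>cuda42</plan_class>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
  </app_version>
</app_info>"""

def coproc_counts(xml_text):
    """Return {(app_name, coproc_type): count} for every app_version."""
    root = ET.fromstring(xml_text)
    counts = {}
    for app_version in root.findall("app_version"):
        name = app_version.findtext("app_name")
        for coproc in app_version.findall("coproc"):
            key = (name, coproc.findtext("type"))
            counts[key] = float(coproc.findtext("count"))
    return counts

for (name, kind), count in coproc_counts(APP_INFO).items():
    print(name, kind, count, "ok" if count > 0 else "count is zero")
```

Both app versions show a CUDA count of 1 in the file, so a zero on the server side would have to come from somewhere other than this app_info.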

____________

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15389 - Posted: 3 Nov 2012, 2:43:32 UTC

Just tried changing to the 32-bit version and still nothing.

I am successful in getting OpenCL work from Milkyway and CUDA work from SETI.
____________

zzuupp
Joined: 14 Mar 10
Posts: 128
Credit: 347,767,607
RAC: 30,373
Message 15390 - Posted: 3 Nov 2012, 4:06:54 UTC - in response to Message 15387.

And here's one that get's work OK with an anonymous app:

2012-11-01 13:03:38.5372 [PID=18022] Request: [USER#8369] [HOST#94620] ...


That looks familiar.

And now for a question:

How much more CPU time should the 32 use compared to the 64?

The 32 is taking about 45 seconds. (Which is darn good!) However, the 64 is taking 1 or 2 seconds.

Or is it something on my end???

Profile mimeq
Send message
Joined: 14 Jul 10
Posts: 4
Credit: 81,224,132
RAC: 872
Message 15407 - Posted: 5 Nov 2012, 0:14:46 UTC
Last modified: 5 Nov 2012, 0:19:30 UTC

http://boinc.thesonntags.com/collatz/result.php?resultid=127647358


<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Collatz Conjecture v3.17 x86_64 for CUDA 4.2
Based on the AMD Brook+ kernels by Gipsel
verbose=1
items_per_kernel=17
kernels_per_reduction=7
threads=8
sleep=1
Parameters --device 0
Start 2377253573435906042216
Checking 824633720832 numbers
Numbers/Kernel 131072
Kernels/Reduction 128
Numbers/Reduction 16777216
Reductions/WU 49152
Threads 256
Name GeForce GTX 460
Memory 767 MB
Compute 2.1
Processors 7
Memory Clock 1800000 kHz
Warp Size 32
Shader Clock 1350000 kHz
Max Grid 65535 x 65535 x 65535
Max Threads 1024 x 1024 x 64
Texture Align 512

Highest Steps 1935 for 2377253573705936587591
Total Steps 397641121809668
Avg Steps 482
GPU time 2590.44 seconds
CPU time 169.791 seconds
Total time 2860.61 seconds
16:14:45 (9120): called boinc_finish

</stderr_txt>
]]>
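
As a cross-check on the numbers in that output: the three config values look like power-of-two exponents for the sizes printed below them. That relationship is inferred from this one log only, so treat it as an assumption and check the README for the real meaning of each parameter. The variable names here are mine.

```python
# Values copied from the stderr output above.
items_per_kernel = 17
kernels_per_reduction = 7
threads = 8
reductions_per_wu = 49152
total_steps = 397641121809668

# Assumption: each config value is an exponent of two.
numbers_per_kernel = 2 ** items_per_kernel            # log prints 131072
kernels_per_red = 2 ** kernels_per_reduction          # log prints 128
threads_per_block = 2 ** threads                      # log prints 256

numbers_per_reduction = numbers_per_kernel * kernels_per_red   # log prints 16777216
total_numbers = numbers_per_reduction * reductions_per_wu      # log prints 824633720832

# Integer average of steps per number checked; log prints 482.
avg_steps = total_steps // total_numbers

print(numbers_per_kernel, kernels_per_red, threads_per_block)
print(numbers_per_reduction, total_numbers, avg_steps)
```

Every derived value matches the corresponding line in the log, including "Checking 824633720832 numbers" and "Avg Steps 482".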


The app_info in the *.zip file has

<version_num>311</version_num>


I changed it to

<version_num>317</version_num>

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15413 - Posted: 5 Nov 2012, 15:34:03 UTC - in response to Message 15390.

Collatz uses unsigned 64-bit integers. On a 32-bit platform, it has to emulate 64-bit integers which takes twice as long or more.
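
To give a feel for where the extra time goes, here is a small sketch of a single 64-bit addition done with 32-bit halves. It is written in Python purely for illustration and is not the application's code; the function names are hypothetical, and what a real 32-bit compiler emits will differ in detail.

```python
MASK32 = 0xFFFFFFFF

def split64(value):
    """Split a 64-bit unsigned value into (high, low) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32

def add64_with_32bit_limbs(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values held as 32-bit halves.

    Returns (hi, lo, carry_out). One logical add becomes two adds
    plus carry handling, which is the kind of overhead described above.
    """
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32            # 1 if the low halves overflowed
    hi_sum = a_hi + b_hi + carry
    hi = hi_sum & MASK32
    return hi, lo, hi_sum >> 32

a = 0x00000001FFFFFFFF
b = 0x0000000000000001
hi, lo, carry = add64_with_32bit_limbs(*split64(a), *split64(b))
result = (hi << 32) | lo
print(hex(result), carry)
```

Multiplication (the 3n part of 3n + 1) and shifts need similar multi-step handling, so the cost adds up across every operation on a 64-bit value.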

It looks like you are comparing a Q9450 @ 2.66 GHz running the 32-bit app vs. an i7-2600K @ 3.40 GHz running the 64-bit app, so even if the 32-bit and 64-bit apps did run at the same speed, I'd expect to see a 30% difference just due to clock speed and transistor size differences between the two machines.

There is also some CPU used while waiting for the GPU to finish each kernel. Not a lot, but some. The longer the GPU takes, the more often the CPU has to check if the GPU is done. Each check uses a few CPU cycles. It then sleeps until it is time to check again. Sleep too long and the GPU sits idle while waiting for the next kernel. Sleep too short and the CPU utilization increases while it checks to see if the GPU is done.
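
The trade-off can be shown with a toy simulation. Nothing here talks to a real GPU: the kernel duration, the sleep values, the millisecond unit, and the function name `poll_until_done` are all made up for the example.

```python
def poll_until_done(kernel_ms, sleep_ms):
    """Simulate polling for a kernel that runs for kernel_ms.

    The CPU checks, and if the kernel is not finished it sleeps sleep_ms
    and checks again. Returns (checks, idle_ms): how many checks the CPU
    made, and how long the GPU sat finished before the CPU noticed.
    """
    now = 0
    checks = 0
    while True:
        checks += 1
        if now >= kernel_ms:
            return checks, now - kernel_ms
        now += sleep_ms

# Same 45 ms kernel, three different sleep settings.
for sleep_ms in (1, 10, 30):
    checks, idle_ms = poll_until_done(kernel_ms=45, sleep_ms=sleep_ms)
    print("sleep", sleep_ms, "checks", checks, "gpu idle ms", idle_ms)
```

With a 1 ms sleep the simulated CPU checks 46 times and the GPU never waits; with a 30 ms sleep it checks only 3 times but the GPU idles for 15 ms before the next kernel could be launched.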

zzuupp
Joined: 14 Mar 10
Posts: 128
Credit: 347,767,607
RAC: 30,373
Message 15419 - Posted: 6 Nov 2012, 1:47:05 UTC - in response to Message 15413.

Thank you for the detailed response.

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15428 - Posted: 6 Nov 2012, 21:00:54 UTC

I've added CUDA 5 versions of Collatz v3.17 for Win32 and x64 to the optimized applications page. It doesn't look like there is any speed difference, but the kernel is compiled for everything from compute 1.0 to compute 3.5 GPUs. The CUDA 4.2 version, by comparison, contains kernels for GPUs with compute 1.0 to 3.0.

zzuupp
Joined: 14 Mar 10
Posts: 128
Credit: 347,767,607
RAC: 30,373
Message 15430 - Posted: 7 Nov 2012, 3:09:37 UTC - in response to Message 15428.

Tasks are completing with the update. One has been validated already.

Profile arkayn
Volunteer tester
Joined: 30 Aug 09
Posts: 219
Credit: 676,877,192
RAC: 17,625
Message 15431 - Posted: 7 Nov 2012, 3:30:26 UTC
Last modified: 7 Nov 2012, 3:33:02 UTC

Still unable to get work on my machine.

I installed the CUDA 5.0 app; with both regular and mini selected, it once again gives me no work.

11/6/2012 8:25:47 PM | Collatz Conjecture | [sched_op] Starting scheduler request
11/6/2012 8:25:47 PM | Collatz Conjecture | Sending scheduler request: Requested by user.
11/6/2012 8:25:47 PM | Collatz Conjecture | Requesting new tasks for NVIDIA
11/6/2012 8:25:47 PM | Collatz Conjecture | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
11/6/2012 8:25:47 PM | Collatz Conjecture | [sched_op] NVIDIA work request: 602891.77 seconds; 0.00 devices
11/6/2012 8:25:49 PM | Collatz Conjecture | Scheduler request completed: got 0 new tasks
11/6/2012 8:25:49 PM | Collatz Conjecture | [sched_op] Server version 611
11/6/2012 8:25:49 PM | Collatz Conjecture | No work sent
11/6/2012 8:25:49 PM | Collatz Conjecture | Project requested delay of 182 seconds
11/6/2012 8:25:49 PM | Collatz Conjecture | [sched_op] Deferring communication for 3 min 1 sec
11/6/2012 8:25:49 PM | Collatz Conjecture | [sched_op] Reason: requested by project


I am waiting for 7.0.39 as .38 has a couple of bugs in the installer.
____________

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15435 - Posted: 7 Nov 2012, 14:54:58 UTC - in response to Message 15431.

I am waiting for 7.0.39 as .38 has a couple of bugs in the installer.


Since I have enough issues dealing with beta versions of my own apps, I really try not to test BOINC client versions at all. The most recent BOINC version on any of my machines is 7.0.28, and that's only where I needed to test OpenCL apps. All the rest are running 6.10.x or 6.12.x. I finally upgraded the last box I had on 5.10.45 a few weeks ago.

So, I kind of hope it is an issue with multi-GPU scheduling requests from the client and not a server issue since, now that they've switched to git, I think I'd have to merge in all the Collatz-specific changes manually. That is, once I remember what all I changed and why, and figure out if it still applies to the current server version. When DA tells you NOT to use the server_stable version because it isn't and instead use the bleeding-edge version, I get worried. The "I checked in a fix so just upgrade your server again" line may work for those who do nothing but manage BOINC environments, but for those of us with a life, that's 1-3 weeks of coding and testing every time. But hunting season is almost over, so I'll have more time to do upgrades if needed.

My next project is to upgrade a Mac from 10.6.6 to 10.8 so I can compile a CUDA 5 app for OS X, since the CUDA apps are failing if CUDA 5 is installed.
...and Linux versions also.

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 15437 - Posted: 7 Nov 2012, 15:01:53 UTC

Posts like this to the boinc_dev email list really make me hesitant to think a server upgrade would solve more problems than it creates with scheduler issues....

I get the impression that the present scheduler is broken in some
respect, on both my i7-2600K/GTX460/HD7770 and my E8500/GTX9800+ hosts
they will request work for one device, then get sent work for another
device that wasn't requesting work



Copyright © 2018 Jon Sonntag; All rights reserved.