Collatz Conjecture OpenCL kernel causes crash with minimum settings.

D337z
Joined: 25 Feb 12
Posts: 7
Credit: 333,514
RAC: 0
Message 13547 - Posted: 25 Feb 2012, 16:24:48 UTC

Hello all! I just wanted to let you know that I've tested the OpenCL kernel for Collatz Conjecture and it caused a driver crash. You might want to go back and look over the code or have the OpenCL file separate so that the ATI SDK can run it without having to pull it out of the exe. Granted, this exposes the code to modification or viewing, but it will be far easier to update and correct any problems. Just have the OpenCL code output to the exe and it should be fine.
Theoretically, you don't even need to use CPU if you load the values into the program initially and allow the GPU to do all of the work. It should be able to handle it on its own. But I'm just working off of theory since I can't see the code.
From what I've seen, utilizing 4 and 5 vectors via VLIW-based hardware (depending on what GPU you use) should be the default, as nothing exists that can run OpenCL and can't handle those vectors.
Remember, when utilizing more than 4 vectors, that the vectors are addressed utilizing .s0, .s1, .s2...
It'll be tricky to code, but not impossible. I think the 79xx series can handle 16 vectors, but don't quote me on that.
I'll leave the rest up to you. But I wouldn't mind having something to play around with in my spare time. ^_^
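For readers unfamiliar with the component syntax mentioned above: OpenCL C vector types come in widths of 2, 3, 4, 8 and 16, and their lanes can be addressed numerically as .s0, .s1, and so on. The following is only a plain-C stand-in, not OpenCL and not the project's kernel. The type `ulong8_sketch` and both function names are made up for illustration; the union simply mimics how an 8-wide vector exposes lanes .s0 through .s7 while one Collatz step is applied to every lane.

```c
#include <stdint.h>

/* Hypothetical plain-C stand-in for an 8-wide OpenCL C vector such as ulong8.
 * The anonymous struct (C11) gives named lanes s0..s7; the array view lets us
 * loop over the same storage. */
typedef union {
    struct { uint64_t s0, s1, s2, s3, s4, s5, s6, s7; };
    uint64_t lane[8];
} ulong8_sketch;

/* One Collatz step on a single value: n/2 if even, 3n+1 if odd.
 * Overflow of 3n+1 past 64 bits is not handled in this sketch. */
uint64_t collatz_step(uint64_t n)
{
    return (n & 1u) ? 3u * n + 1u : n >> 1;
}

/* Apply one Collatz step to every lane. On a GPU the compiler would map
 * this per-lane work onto the hardware; here it is an ordinary loop. */
ulong8_sketch collatz_step8(ulong8_sketch v)
{
    for (int i = 0; i < 8; ++i) {
        v.lane[i] = collatz_step(v.lane[i]);
    }
    return v;
}
```

Loading the lanes with 1 through 8 and calling `collatz_step8` once leaves 4 in `.s0` (from 1) and 10 in `.s2` (from 3).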

Profile Slicker
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 11 Jun 09
Posts: 2525
Credit: 740,580,099
RAC: 1
Message 13548 - Posted: 25 Feb 2012, 18:06:37 UTC - in response to Message 13547.

Did you follow the steps in the readme to see how well it runs in standalone mode first?

There are often issues running the latest ATI driver versions. They work for some or even most people, but never work for all. Downgrade to Catalyst 11.9 or 11.6 and see if that makes a difference. My guess is that your problems will be solved. I really can't say this often enough: Once you find a driver version that works, don't upgrade until it stops working. New driver = new bugs. Every five or six versions, AMD gets [most of] the bugs out. Versions in between just cause headaches.

The only reason to run the OpenCL app on an ATI GPU is if it is a 7xxx series GPU, which no longer supports CAL/Brook+. The OpenCL app performs much worse than the Brook+ app even though it is running the exact same kernel logic. Until AMD drops CAL support from their drivers, we'll be using the ATI app here.

The main reason the OpenCL version is available is that people with 7xxx cards can't use the ATI app. While they probably aren't thrilled with the OpenCL app's performance, they can at least use Collatz as a backup project if they want. That beats not being able to run it at all.

D337z
Joined: 25 Feb 12
Posts: 7
Credit: 333,514
RAC: 0
Message 13555 - Posted: 26 Feb 2012, 21:12:53 UTC - in response to Message 13548.

Yes, I ran the batch file.
Also, I program in OpenCL, so I can tell you firsthand that if your OpenCL kernel runs slower than your Brook+ app or doesn't run in the first place, you're either using deprecated logic or have a few optimizations that need to be made.
The problem with programming for VLIW and GCN architectures is they are fundamental opposites. While VLIW has 5 ALUs that can handle multiple different instructions, GCN has 16 that handle fewer instructions (I think [I haven't programmed for the 7xxx series yet]). But yeah, what OpenCL version does your OpenCL kernel specify that it uses?

D337z
Joined: 25 Feb 12
Posts: 7
Credit: 333,514
RAC: 0
Message 13560 - Posted: 28 Feb 2012, 0:17:55 UTC

If you have the OpenCL script, I can toss it into the kernel analyzer, figure out where the hiccup is, and see if I can iron out some of the kinks to make it work faster than your Brook+ app.
But that's up to you.

Profile Gipsel
Volunteer moderator
Project developer
Project tester
Joined: 2 Jul 09
Posts: 279
Credit: 77,354,864
RAC: 77,630
Message 13999 - Posted: 14 May 2012, 13:50:59 UTC - in response to Message 13555.

Yes, I ran the batch file.
Also, I program in OpenCL, so I can tell you firsthand that if your OpenCL kernel runs slower than your Brook+ app or doesn't run in the first place, you're either using deprecated logic or have a few optimizations that need to be made.

It isn't actually slower than a Brook+ version; it is slower than some hand-tuned kernels written in IL. ;)
You simply can't squeeze that speed out of a high-level C-like language until some (more) intrinsics (like carries for integer adds) are exposed in OpenCL by some extension. And the compiler optimizations are not that good so far. For example, the code generated for evaluating conditionals is often not as efficient as it could be (compared with what you can get by writing in IL).
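To illustrate the point about carries: without access to the hardware carry flag, a C-like language has to reconstruct the carry from comparisons on each multi-word add. The sketch below is plain C rather than OpenCL or IL, and it is not the Collatz kernel. The `u128_sketch` type and both function names are assumptions made up for this example; it computes 3n+1 on a 128-bit value held in two 64-bit limbs.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned value stored as two 64-bit limbs. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_sketch;

/* Add two limbs plus an incoming carry (0 or 1) and report the outgoing
 * carry. The carry is recovered by checking whether each unsigned sum
 * wrapped around, which costs extra compare instructions that a native
 * add-with-carry would avoid. */
uint64_t add_with_carry(uint64_t a, uint64_t b, unsigned carry_in,
                        unsigned *carry_out)
{
    uint64_t sum = a + b;
    unsigned carry = sum < a;            /* wrapped on a + b */
    uint64_t total = sum + carry_in;
    carry |= total < sum;                /* wrapped on + carry_in */
    *carry_out = carry;
    return total;
}

/* Compute 3n + 1 as n + 2n + 1; the "+ 1" rides in as the initial carry.
 * Any overflow past 128 bits is silently discarded in this sketch. */
u128_sketch three_n_plus_one(u128_sketch n)
{
    u128_sketch twice = { n.lo << 1, (n.hi << 1) | (n.lo >> 63) };
    u128_sketch result;
    unsigned carry;

    result.lo = add_with_carry(n.lo, twice.lo, 1u, &carry);
    result.hi = add_with_carry(n.hi, twice.hi, carry, &carry);
    return result;
}
```

Each limb add here needs two additions and two comparisons, which is the kind of overhead a hand-written low-level kernel can avoid when it uses the carry directly.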

And the speed of that IL code on the HD7000 series is terrific (as Collatz achieved only a medium occupancy of the VLIW slots [below 3 of 5] because of lots of dependencies in the code), but it doesn't return correct results. Or rather, it returns correct results only sometimes. The problem is obviously some synchronization/coherency issue when doing read-write accesses to textures/images in the old version. GCN changed the whole memory structure and the caches for that, so the old rules are not valid anymore.

Btw., I would like to try writing ISA/assembler directly for GCN. It is a really clean architecture with quite a few goodies. That would probably be even faster than IL (as not all possibilities of GCN have been exposed to IL so far). But the documentation for it is spotty at best.



Copyright © 2018 Jon Sonntag; All rights reserved.