PCI-e Bandwidth Usage

Hiigaran
Joined: 7 Mar 14
Posts: 4
Credit: 24,522
RAC: 0
Message 22584 - Posted: 18 Jun 2016, 14:23:43 UTC

Copying this from the BOINC forums:

I've been having some discussions on several sites regarding GPUs and bandwidth usage for distributed-computing projects, and I wanted to broaden things by hopefully getting some BOINC experts in on the matter.

Now most of us are probably familiar with GPU mining and how hardware is generally deployed in these farms, but for anyone who isn't, a mining farm typically consists of a cheap motherboard with as many PCI-e slots as possible (of any size), some basic RAM, a cheap CPU, and of course the GPUs. Due to space limitations, the GPUs are normally connected to the motherboard via flexible risers: an x1 adapter at the motherboard end, an x16 adapter under the GPU, and a USB 3.0 cable connecting the two. Essentially, these are PCI-e x1 extension cables. They don't actually use the USB interface; a USB 3.0 cable is used simply because it has the right number of wires inside to carry an x1 link.

Now, since these risers are limited to x1 bandwidth, high-bandwidth applications such as gaming would see significant performance reductions. Cryptocurrency mining, however, does not require much bandwidth, so no performance is lost: an x1 link on PCI-e 2.0 or 3.0 is never maxed out.
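
For reference, the theoretical per-lane numbers work out as follows (from the published PCI-e signalling rates and line encodings):

    PCI-e 2.0 x1: 5 GT/s x (8b/10b encoding)    = 4.0 Gbit/s, about 500 MB/s
    PCI-e 3.0 x1: 8 GT/s x (128b/130b encoding) = 7.9 Gbit/s, about 985 MB/s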

I had assumed that since mining does not require much bandwidth, distributed-computing projects might be the same. In the past few weeks I've been discussing this on the Folding@Home forums, and to my disappointment, I learned that anything less than PCI-e 3.0 x4 or PCI-e 2.0 x8 saturates the bus there, so the GPUs never reach full load and performance suffers. This was rather disappointing, as I had wanted to build a system specced like a mining rig for distributed computing.

After a bit of thinking, I started to wonder if every project would require the same levels of bandwidth as F@H, so here I am. With the lengthy backstory out of the way, my question to you guys is simply this: Are there any GPU projects on the BOINC platform that do not saturate the PCI-e x1 interface?

I would love to get some data from anyone working on GPU projects. MSI Afterburner shows bus usage, so if a few people are willing to spend two or three minutes taking a few measurements, I would really appreciate it. Please also let me know the size and version of the PCI-e slot your GPU is in.


If anyone can post some data from their rigs, it would be a big help.
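
For anyone on Linux, where Afterburner isn't an option, here's a rough sketch of the kind of measurement I'm after, using NVIDIA's NVML library. I haven't been able to test it myself, so treat it as a starting point; it should build with something like nvcc probe.cu -o probe -lnvidia-ml:

    #include <stdio.h>
    #include <unistd.h>
    #include <nvml.h>

    int main(void) {
        nvmlDevice_t dev;
        unsigned int gen, width, tx, rx;

        nvmlInit();
        nvmlDeviceGetHandleByIndex(0, &dev);            /* first GPU in the system */
        nvmlDeviceGetCurrPcieLinkGeneration(dev, &gen); /* PCI-e version in use */
        nvmlDeviceGetCurrPcieLinkWidth(dev, &width);    /* lane count in use */
        printf("Link: PCI-e %u.0 x%u\n", gen, width);

        for (int i = 0; i < 30; i++) {                  /* sample for ~30 seconds */
            nvmlDeviceGetPcieThroughput(dev, NVML_PCIE_UTIL_TX_BYTES, &tx);
            nvmlDeviceGetPcieThroughput(dev, NVML_PCIE_UTIL_RX_BYTES, &rx);
            printf("TX %6u KB/s  RX %6u KB/s\n", tx, rx);
            sleep(1);
        }
        nvmlShutdown();
        return 0;
    }

Error checking is omitted for brevity, and I believe the throughput counters only exist on newer NVIDIA cards, so your mileage may vary.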

mikey
Joined: 11 Aug 09
Posts: 3242
Credit: 1,691,202,015
RAC: 5,722,054
Message 22588 - Posted: 19 Jun 2016, 15:42:34 UTC - in response to Message 22584.

[quote of Message 22584 snipped]


I'm running an AMD 6870 GPU and using MSI Afterburner, and I do NOT see the bus usage listed. I'm using the latest version of Afterburner, I think.

Hiigaran
Joined: 7 Mar 14
Posts: 4
Credit: 24,522
RAC: 0
Message 22591 - Posted: 19 Jun 2016, 17:49:07 UTC - in response to Message 22588.

I could have the software wrong, though someone else originally reported using it. I'm stuck on a 10-year-old Linux laptop, so I can't confirm. I do know that GPU-Z shows PCI-e bandwidth information, but I don't know whether it reports it in real time.

mikey
Joined: 11 Aug 09
Posts: 3242
Credit: 1,691,202,015
RAC: 5,722,054
Message 22593 - Posted: 20 Jun 2016, 12:14:29 UTC - in response to Message 22591.

I could have the software wrong, though someone else originally reported using it. I'm stuck on a 10-year-old Linux laptop, so I can't confirm. I do know that GPU-Z shows PCI-e bandwidth information, but I don't know whether it reports it in real time.


GPU-Z shows me what the PC is capable of, not what's actually being used. I also checked a second machine and still don't see it listed in Afterburner.

sosiris
Joined: 11 Dec 13
Posts: 123
Credit: 55,800,869
RAC: 0
Message 22599 - Posted: 21 Jun 2016, 9:13:56 UTC

Theoretically, the Collatz sieve kernel uses very little bandwidth once the lookup tables are loaded into VRAM: just about one number per kernel launch.
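
To illustrate the traffic pattern, here is a toy CUDA sketch (not the project's real kernel; the table contents and the per-thread work are placeholders). The lookup table crosses the bus once; after that, each launch only sends its arguments, a few dozen bytes:

    #include <cuda_runtime.h>

    // Placeholder for the sieve kernel: 'lut' already lives in VRAM, so the
    // only per-launch host-to-device traffic is the kernel arguments.
    __global__ void sieve(const unsigned long long *lut,
                          unsigned long long start, unsigned int *flags) {
        unsigned long long i = blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
        flags[i] = (unsigned int)((start + i) ^ lut[i & 1023]); // dummy work
    }

    int main(void) {
        const int N = 1 << 20;
        unsigned long long host_lut[1024] = {0};  // stand-in table
        unsigned long long *lut;
        unsigned int *flags;
        cudaMalloc(&lut, sizeof(host_lut));
        cudaMalloc(&flags, N * sizeof(unsigned int));
        // One-time transfer; nothing this size crosses the bus again.
        cudaMemcpy(lut, host_lut, sizeof(host_lut), cudaMemcpyHostToDevice);

        for (unsigned long long start = 0; start < 100ULL * N; start += N)
            sieve<<<N / 256, 256>>>(lut, start, flags);  // per launch: args only

        cudaDeviceSynchronize();
        return 0;
    }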
____________
Sosiris, team BOINC@Taiwan

Gator 1-3
Joined: 15 Sep 15
Posts: 4
Credit: 1,025,245,439
RAC: 279,522
Message 22607 - Posted: 23 Jun 2016, 3:57:18 UTC - in response to Message 22584.

I, too, have looked into making one of those mining rigs for GPU crunching. I've accumulated most of the parts and am about to buy the pieces that I don't have, and will hopefully get it up and running by the first week of July, depending on shipping speeds. You'll probably only be interested in #1 below, but for others thinking about making a similar rig, I'll share some info I've found out along the way...

1) With the exception of one of my computers, on every machine of mine with more than one GPU, the additional GPU(s) are attached by risers. A few are powered risers using the USB-cable setup you mentioned, but most are the simple ribbon-style x1-to-x16 risers. As far as I can tell, every task crunched on those GPUs takes the same amount of time as tasks crunched on the GPU plugged directly into the PCI-e x16 slot. (The one exception has two GPUs plugged directly into x16 slots and one attached with a powered riser; all crunch at the same speed.) I have checked this on several different GPU projects, such as Milkyway@Home, SETI@Home, and this project. If you look at my results and see one every now and then that took longer, it's because a couple of my machines have different types of GPUs... which brings me to...

2) If you use more than one GPU from the same manufacturer but with different model numbers, you'll have a few problems. The first is that none of your projects will accurately show your GPUs when you look at your computer setup in the "My Computers" links. For example, one of my machines has a GTX 470 and a GTX 275, but on the sites it's listed as having two GTX 470s. The biggest problem, however, is if you're using models that require different drivers, as in that setup I just mentioned (the 470 uses 368.xx, while the 275 uses 341.xx). Try to avoid this if possible. I've had to reboot that computer over and over, plugging one card in, updating a driver, then rebooting and unplugging/replugging to update another, etc. It's a pain, so if you want to use all of your slots but don't have all the same models, at least try to get cards that use the same driver.

3) Be prepared for LOTS of electricity usage; those mining rigs eat up electricity like you wouldn't believe (see the cost arithmetic after this list). I watched a video of a guy running six Toxic Sapphire R7 290s, which required two 1200-watt PSUs and one 600-watt PSU. That's 3000 watts, or the equivalent of running thirty 100-watt light bulbs all day, every day. Of course, you can use GPUs that draw less... I've found a decent website where you enter the specifics of your rig and it tells you how much wattage it will consume, and then you can change things (like which GPUs you'll be running) to make comparisons. Try http://outervision.com/power-supply-calculator

4) Also be prepared for LOTS of heat. That first computer I mentioned in #1 has three GTX 285s in it, and it cranked out so much heat I had to remove it from my bedroom. After just one hour the room was over 90 degrees Fahrenheit, and that's with a window-mounted AC unit going. I'm not looking forward to how much heat will be produced when that gets upped to six cards. I'll probably have to keep it in my garage.

5) Build a frame for your super GPU cruncher. You can find specifics on any number of sites. There's a really great design on a site called Highoncoins, but it has a big problem... it's made of wood. The guy who designed it says he saw other rigs using angle aluminum for the frame, but he wanted something cheaper, and since aluminum conducts electricity he was worried about a short. Of course, he forgot that those cards are almost always mounted in computer cases made of aluminum, so I think that's a rather foolish reason to use wood, which will catch fire if it gets hot enough. Do yourself a favor and use aluminum. It's more expensive, but it has to hit roughly 1200 degrees before it even melts. I'm trying to redesign his rig for aluminum angle, but I may just give up and go with a slightly less impressive setup that's already been designed.

6) Last, but probably most importantly, you'll need to make sure your CPU can handle six cards (if that's how many you're going to use). By that I mean: most people forget that their GPU still uses some CPU when crunching BOINC tasks, as can be seen in the task window of your manager. Every GPU task will have something like "Running (0.733 CPUs and 1 NVIDIA GPU)" in the status column. As far as I can tell, the amount of CPU used is determined by both the project and the type of GPU you have... some cards use more CPU, some less. The problem comes when you attach more GPUs. In this example you're using 0.733 CPUs for each card, so two cards need 1.466 CPUs. When the total CPU reservation exceeds the cores you have, GPU tasks won't run. Using the number above, if you plan on using six GPUs, 0.733 multiplied by 6 is 4.398. If you only have a dual core, most of your GPUs will be idle; with a quad core, you'll still have at least one doing nothing if six are attached. Coin miners don't have to worry about this, so they usually buy the cheapest CPU they can get; a cruncher using a mining-rig setup won't have that option (though see the app_config.xml sketch after this list). Personally, I plan on getting a hex-core processor, but may opt for an 8-core if I can find one cheap enough on eBay.
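
Here is the cost arithmetic promised in #3, assuming a hypothetical rate of $0.12 per kWh (substitute your own local rate):

    3 kW x 24 h/day = 72 kWh/day
    72 kWh/day x $0.12/kWh = $8.64/day, or roughly $260/month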
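
And here is the app_config.xml sketch promised in #6. BOINC lets you override the per-task CPU reservation by dropping a file like this into the project's directory under the BOINC data folder. The app name below is only a guess; check client_state.xml for the real one:

    <app_config>
      <app>
        <name>collatz_sieve</name>      <!-- hypothetical name; check client_state.xml -->
        <gpu_versions>
          <gpu_usage>1.0</gpu_usage>    <!-- one task per GPU -->
          <cpu_usage>0.2</cpu_usage>    <!-- reserve 0.2 cores instead of 0.733 -->
        </gpu_versions>
      </app>
    </app_config>

Note that this only changes the scheduler's bookkeeping; the task will still use however much CPU it actually needs.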

I know all you really wanted to know about was the risers, so I'll leave you with this nugget... If you mount your GPUs in a rig and attach all of them with risers, try to get a ribbon-style x16-to-x16 riser for the card that will actually have the video cable attached. This essentially eliminates the bandwidth issue for at least that card, since that riser is really just an extension cord. I've plugged my monitor into cards attached with x1-to-x16 risers just to see if they still worked... they did, but I didn't experiment to see how much of a graphics load they could handle. As I mentioned earlier, though, they all crunch at the same speed as similar cards plugged directly into an x16 slot.

Hiigaran
Joined: 7 Mar 14
Posts: 4
Credit: 24,522
RAC: 0
Message 22613 - Posted: 23 Jun 2016, 12:57:05 UTC - in response to Message 22607.

That's all well and good, and there's some really good information here. However, you are referring to some outdated GPUs. My original plan was to use a heap of GTX 1080s (expensive, but each draws less than 190 watts at peak), but given how much performance those cards have, x1 sounds like a limiting factor, even if it is 3.0.

I have no doubt that if someone wishes to use low-end or lower mid-range cards, this will work just fine. I need to gather data from someone with a flagship GPU to see whether the limited bandwidth would hold up.

Gator 1-3
Joined: 15 Sep 15
Posts: 4
Credit: 1,025,245,439
RAC: 279,522
Message 22614 - Posted: 23 Jun 2016, 13:52:15 UTC - in response to Message 22613.

The most "high end" cards I have are GTX 580s. They're a bit old, but on a few projects they're still one of the fastest cards for crunching. Up until a few days ago, they were the 2nd fastest card for Milkyway@Home, doing WUs in 120 seconds versus GTX 980s doing them in 90 seconds. When I had two hooked up to one computer (one directly on the PCIex16 slot, and the other attached to a PCIex1 slot with a 1x16 ribbon-style riser), they both knocked out tasks in the same amount of time. I'm no computer expert (and I've never played one on TV), but from my understanding, they only time you need to worry about bandwidth is when you're using your video card for what it's actually made for... rendering graphics for output to a display. I will admit my experience is limited to the few GPU projects that I crunch, so it could be different for other projects.

Of course, since you're planning on buying these cards anyway, you could always buy the mobo/CPU/other necessities and get two cards at first. Make sure the mobo has its PCI-e slots spaced out so that you can attach both cards directly if this experiment fails. Buy two 1080s, attach one directly to the slot and the other via a riser, and see if the riser is bottlenecking the performance. If it is, switch the cards around to make sure it isn't the card doing it. If the output is still bottlenecked, you can always put the mobo into a normal case and use it like a normal computer with a pair of 1080s SLI'd. If the performance isn't affected, buy the rest of the cards and crunch on!
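
If you want a hard number rather than a feel, a rough CUDA sketch like the one below (my own untested idea, not a polished benchmark; build with nvcc probe.cu -o probe) times a pinned-memory transfer across the link, so you can compare the riser slot against the direct slot:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main(void) {
        const size_t bytes = 256 << 20;   // 256 MiB test transfer
        void *host, *dev;
        cudaMallocHost(&host, bytes);     // pinned memory, so the copy runs at link speed
        cudaMalloc(&dev, bytes);

        cudaEvent_t t0, t1;
        cudaEventCreate(&t0);
        cudaEventCreate(&t1);

        cudaEventRecord(t0);
        cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);

        float ms;
        cudaEventElapsedTime(&ms, t0, t1);
        printf("Host->Device: %.0f MB/s\n", (bytes / 1e6) / (ms / 1e3));

        cudaFree(dev);
        cudaFreeHost(host);
        return 0;
    }

A PCI-e 2.0 x1 link should top out somewhere near 500 MB/s here, while a 3.0 x16 slot should be well into the thousands.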

Finally, I think you're going to be waiting a long time to get information specific to 1080s. That's partly because they're fairly new, so not many people have multiples yet, and those who do attach them directly to PCI-e slots and use them for gaming. The other reason is that most miners use AMD cards rather than NVIDIAs in their rigs. Since risers cost about $3 for the ribbon style and $6 to $10 for the powered style, you'd probably be better off trying it yourself instead of waiting to see if others have done it. If you do, please let us know how it turns out. Personally, I'm always interested in learning new things.

Good luck!

Hiigaran
Joined: 7 Mar 14
Posts: 4
Credit: 24,522
RAC: 0
Message 22616 - Posted: 23 Jun 2016, 17:26:09 UTC - in response to Message 22614.

I do plan on getting a new computer regardless, so I'm thinking I'll have one system primarily for personal use, with DC as a secondary function. I can test that with and without x1 risers.

If things do work, then I'll probably run one or two headless systems full of GPUs and access them via SSH from the first system. If not... well, I guess I'll need to find a motherboard that supports x8/x8/x8 and build three systems.




Copyright © 2018 Jon Sonntag; All rights reserved.