And wages the GPU compute war on many fronts
Just like Nvidia, AMD provides developers with a high-level application programming interface (API) to tap into its latest graphics processors for non-graphics compute applications. Unlike Nvidia, though, AMD doesn't make very much noise about what it's doing in that area. Curious, we got on the phone with AMD Stream Computing Director Patti Harrell and asked her to shed a little light on AMD's Stream Computing initiative.
Harrell started from the beginning, explaining the evolution of AMD's general-purpose GPU computing APIs:
Initially we came out with Close-to-Metal, which was a very low-level interface. You had to know a fair bit about the GPU to use it, [but] you could get very good performance out of it.
What we've done in the two years since we came out with CTM, is come out with higher-level tools. So, last November we launched Brook+, and what that is [is] a C-level interface that is quite similar to [Nvidia's] CUDA and the OpenCL standard that's been proposed by the Khronos working group. So, it's kind of the same level, and the effort of Khronos is to try a standard API so that people don't get locked into one hardware platform or another based on initial software investment.
The other high-level tools that we have in the stack are a version of our AMD Computation and Math Library that is targeted to GPUs for those functions that can take advantage of them. So initially things like SGEMM [single-precision general matrix multiply] and DGEMM [double-precision general matrix multiply] and some of the other standard . . . functions are implemented on the GPU in ACML. And that provides scientists with a tool to program high-level and not even worry about the API, they can use this library and run their functions. And that seems to work very well, we were just in a meeting with a University Professor who did some testing on that and came away really happy about the ease of use of that methodology.
And then there are the third-party tools. We work with about half a dozen different companies on third-party development tools that give you either an alternative path or in some way augment the software stack you see here. The idea of course is to take a really open approach to this and let people approach the hardware in whatever way they like, but provide as many easy-to-use and highly performing tools for people to get good results.
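For readers unfamiliar with the SGEMM and DGEMM routines Harrell mentions, the sketch below shows the operation the standard BLAS GEMM routines compute: C = alpha * A * B + beta * C. It is a plain Python illustration of the math only, not ACML code or its interface; the function name `gemm` and the sample matrices are made up for this example, and the transpose options of the real routines are left out. SGEMM and DGEMM differ only in precision (single versus double), and Python floats are double precision, so this is closest in spirit to DGEMM.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a x b) + beta * c for matrices stored as lists of rows.

    Hypothetical helper for illustration; it mirrors the math of BLAS GEMM,
    not the calling convention of any real library.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (rows of a) x (columns of b)")
    # Each output element is an independent dot product plus a scaled term
    # from c, which is why the operation maps well onto parallel hardware.
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, a, b, 0.5, c)
print(result)  # [[38.5, 44.5], [86.5, 100.5]]
```

A library such as the one Harrell describes would expose this as a single call, so the scientist never writes the loops or touches the GPU API directly.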
Close-to-Metal came out a few months before Nvidia's CUDA GPGPU API, and the Stream SDK in its current form followed a few months later. So, I asked, what's the difference between CUDA and the Stream SDK? Harrell explained:
At their core, they're essentially a very similar idea. Brook+ was based on a graduate project out of Stanford called Brook, which has been around for years and is designed to target various architectures with a high-level API. And in this case there's back-ends for GPUs . . . What our engineering team did was take that project and bring that in-house, clean it up, write a new back-end that talks to our lower-level interface, and post the thing back out to open-source in keeping with our open systems philosophy.
Brook looks like C. . . Function calls go to the GPU very much like CUDA. In fact, the guy who was one of the core designers on Brook went to Nvidia and did CUDA. . . . And another guy who recently got his doctorate at Stanford and worked extensively on Brook at Stanford is one of the core Brook+ architects now at AMD. So, they were both born out of the same idea.
In terms of what we do differently, the one thing we've tried to do is publish all of our interfaces from top to bottom so that developers can access the technology at whatever level they want. So, underneath Brook we have what we call CAL, Compute Abstraction Layer, which you can think of as an evolution of the original CTM. It provides a run-time, driver layer, as well as an intermediate language. Think of it as analogous to an assembly language. So Brook has a back-end that targets CAL, basically, as does ACML and some of the other third-party tools that we're working on. . . . From the beginning we published the API for CAL as well as for Brook so people could program at either level. We also published the instruction set architecture . . . so [people] can essentially tune low-level performance however they want. And Brook+ itself is open-source.
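The shared idea behind Brook and CUDA that Harrell describes can be sketched roughly as follows: the programmer writes a small function (a kernel) for one data element, and the runtime applies it across whole arrays (streams) of data. The Python below is only a conceptual model under that assumption. The names `run_kernel` and `scale_add_kernel` are hypothetical, and none of this is Brook+, CAL, or CUDA syntax.

```python
def run_kernel(kernel, *input_streams):
    """Apply `kernel` once per element position and return the output stream.

    Hypothetical stand-in for a stream runtime; a real one would dispatch
    the per-element calls to the GPU instead of looping on the CPU.
    """
    if not input_streams:
        raise ValueError("at least one input stream is required")
    length = len(input_streams[0])
    if any(len(stream) != length for stream in input_streams):
        raise ValueError("all input streams must have the same length")
    # No call depends on the result of another, so the calls could run
    # in any order or all at once.
    return [kernel(*elements) for elements in zip(*input_streams)]


def scale_add_kernel(x, y):
    """Per-element work: the kernel only ever sees one element of each stream."""
    return 2.0 * x + y


xs = [1.0, 2.0, 3.0]
ys = [10.0, 20.0, 30.0]
out = run_kernel(scale_add_kernel, xs, ys)
print(out)  # [12.0, 24.0, 36.0]
```

The point of the model is that the kernel contains no loop and no indexing, which leaves the runtime free to spread the work across however many processing units the hardware has.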
To put things in perspective, here's an AMD slide showing how the different Stream SDK elements fit together:
[Slide: AMD Stream SDK software stack]
Source: AMD.
Since CUDA runs on GeForce 8- and GeForce 9-series GPUs, and Nvidia has shown mainstream general-purpose apps like video transcoding running on those cards, I asked Harrell whether AMD was doing anything similar. She replied that the Stream SDK supports Radeon HD 2900- and 3000-series graphics cards, and that mainstream apps are indeed in the pipeline:
We do have people who use [the Stream SDK] for some mainstream applications. Video encode is a really good example. And I think we're gonna see much more of that in mainstream consumer applications. Video encode, video game physics is another example, and some other consumer-level image processing would all be really well-accelerated on GPUs, and it stands to reason you want to use the GPU you already have in the system on your desk.
Last, but not least, I asked how the Stream SDK fit into AMD's Fusion initiative, which aims to integrate graphics cores into desktop and mobile microprocessors. Would we see mainstream GPGPU apps run on the integrated GPU core?
[That's] absolutely the direction you should expect, and in fact one of the big reasons for AMD to invest in this technology... Well, [there are] three reasons. One, we believe that this will be a fundamental requirement for mainstream graphics in the next few years, as we move towards a programming standard you're gonna see more and more mainstream application developers and ISVs [independent software vendors] want to take advantage of this capability, so we believe it's gonna be a requirement. Just baseline. On top of that, there certainly is [a] market for incremental GPUs in technical markets and some of the high-end professional markets and we'd certainly like to play in that space.
And, finally, this really helps to set us up and take us down a technology path to prepare for the Fusion program, our Accelerated Computing programs, and there are a couple of ways in which that happens. The software stack, which you see represented here, basically evolves into a more comprehensive and higher-level set of tools that we think the company needs to get to and the industry needs to get to, to enable developers to take advantage of heterogeneous architectures without having to be early adopters who can program anything. So we think this toolkit is a good beginning for what it ultimately needs to be to handle much broader heterogeneous architectures like those that we'll implement in the Fusion project.
Also, we learn a lot dealing with the Stream Computing customers today about the sorts of workloads that accelerate well and the directions we need to move in to design graphics cores that would be part of these future processors. And we're also working with software partners and ISVs today, who we would expect would want to take advantage of those integrated architectures when they come out. So there's a lot that we're doing today that seeds directly into the Accelerated Computing project, and in fact, we are in communication all the time because . . . they are very interested in what we're doing now that they can then grab and take advantage of in future architectures.
In other words, while AMD may not be anywhere near as chatty about its GPGPU endeavors as its competitor, it definitely isn't twiddling its thumbs on that front. The company doesn't seem as tied to the notion of having its own, semi-proprietary API as Nvidia, though, and it has high hopes for the proposed OpenCL standard. If all goes well, OpenCL might allow developers to write GPGPU code that can run on both AMD and Nvidia GPUs, among others.
Author: Cyril Kowaliski

