Way back when I was writing my OpenGL book, I had the good fortune to run across some stupendous folks at Microsoft who were forthright and open and very, very helpful to me, answering questions, explaining implementation details, and generally reviewing a lot of my work. One of these guys was Otto Berkes. Later the API wars would put him and most of the rest of the OpenGL team onto the DirectX team (which is one reason it suddenly jumped in quality between releases), and eventually he went on with Seamus Blackley, Kevin Bachus, and Ted Hase to form the core Xbox team. He’s the last of them to leave Microsoft, and I wish him all the best.
It turns out that Microsoft did something very interesting with Windows 7, something incredibly cool in fact! Windows 7 will contain something called WARP10 – WARP stands for Windows Advanced Rasterization Platform. This is, in essence, a software rasterizer for DirectX. WARP10 is a high-speed, fully conformant software rasterizer that supports DX10+. WARP allows 3D rendering in a variety of situations where hardware implementations are unavailable, including:
- When the user does not have any Direct3D capable hardware or driver, or the card or driver crashes
- When running as a service or in a server environment
Microsoft lists a bunch of other scenarios, but basically WARP10 will run anytime you do not have a working video card or driver. (Assuming you program for it, that is – that is, assuming you are targeting feature level 10.1 or lower.) What this means is that application developers are no longer constrained to using 3-D effects only when they find compatible hardware. This is going to open up a whole new realm of casual games that don’t have extreme hardware requirements, as well as applications that could be improved by rendering a static 3-D scene. When you combine this with the new Direct3D hardware requirements, applications that are not real-time 3-D will be able to first try the hardware driver and, if that’s not available, fall back to a WARP10 interface.
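Here’s a minimal sketch of that try-hardware-then-WARP pattern using the Direct3D 10.1 entry point – my own illustration, not Microsoft’s sample code, and it assumes the D3D10.1 headers (d3d10_1.h) and library from a DirectX SDK recent enough to include the WARP driver type:

```cpp
#include <d3d10_1.h>
#pragma comment(lib, "d3d10_1.lib")

// Try a real hardware driver first; if nothing capable is present,
// fall back to the WARP software rasterizer.
ID3D10Device1* CreateBestDevice()
{
    ID3D10Device1* device = NULL;

    HRESULT hr = D3D10CreateDevice1(
        NULL, D3D10_DRIVER_TYPE_HARDWARE, NULL, 0,
        D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, &device);

    if (FAILED(hr))
        hr = D3D10CreateDevice1(
            NULL, D3D10_DRIVER_TYPE_WARP, NULL, 0,
            D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, &device);

    return SUCCEEDED(hr) ? device : NULL;
}
```

The nice part is that you get back the same ID3D10Device1 interface either way, so the rest of your rendering code doesn’t have to care which path succeeded.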
This is an incredibly great thing that Microsoft has done! It brings us back to the halcyon days of software rasterizers, circa 1999, when graphics cards were measured by their bit-blt rates. Things were simpler then: you didn’t need to check for shader model support, or for a particular pixel buffer format, or for a CAPS bit.
I see the biggest use of this coming in the next few years, as WARP is multithreaded so that it can effectively use higher-end hardware – in particular, multi-core CPUs and SSE instructions – to maximize throughput. Over the past decade we’ve been maxing out GPUs while the power of the CPU has continued to grow steadily, and now that pretty much all CPUs are multicore, I continue to see game developers hitting a GPU limit and failing to take advantage of multicore CPU systems. I rail against this short-sightedness – while physics and AI can be programmed on a GPU, you really can’t do anything fancy very easily in a shader; it’s just not made for it. WARP attacks the problem from the other end, offloading some of the GPU tasks onto the CPU. There’s no reason you can’t have two rendering pipelines – one for the GPU and one for the CPU. This lets you offload simple things, like occlusion queries, shadow map generation, etc., onto the CPU, giving the GPU more time to actually render stuff.
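As a purely speculative sketch of what that dual pipeline might look like (my own illustration, not anything Microsoft ships), you could simply create two devices and let a worker thread feed the WARP one:

```cpp
// One device on the real GPU for the main scene, and one WARP device
// that the spare CPU cores can drive for secondary rendering work.
ID3D10Device1 *gpuDevice = NULL, *cpuDevice = NULL;

D3D10CreateDevice1(NULL, D3D10_DRIVER_TYPE_HARDWARE, NULL, 0,
                   D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, &gpuDevice);
D3D10CreateDevice1(NULL, D3D10_DRIVER_TYPE_WARP, NULL, 0,
                   D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, &cpuDevice);

// A worker thread could now use cpuDevice to rasterize shadow maps or
// answer occlusion queries, sharing results back through a texture
// created with D3D10_RESOURCE_MISC_SHARED, while gpuDevice renders
// the visible scene.
```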
Visit the Microsoft WARP web page here.
Having just returned from Siggraph this year, I was fielding questions from some co-workers. “Didja hear that Microsoft announced DirectX 10.1?” “Yeah.” “And that it makes all DX10.0 hardware obsolete?” – to which my witty reply was “Huh?” I was there when the DX10.1 features were described, and I’m sure I would have noticed if Microsoft had made such an announcement.
What I do remember was the description of the architecture for DX10 and why it breaks from the DX9 interface. Basically it’s the fact that the API has just gotten bigger and bigger, to the point where 1) the underlying hardware doesn’t work the way the API was originally laid out, and 2) the drivers are now these huge things that force legacy API calls to talk to the current hardware. DX10 (and OpenGL 3, for that matter) is where we make a clean break and get back to a thin driver layer over the GPU. Gone is the fixed lighting pipeline of yore. In fact, the whole BeginScene–render–EndScene architecture is gone. Caps bits are finally going away, and DX is adopting (waaaay too late) the OpenGL conformance test model. In other words, DX10.0 is an API specification. If you want your hardware to be certified as DX10.0 compatible, it has to run all the features that are in the DX10.0 spec. (And I assume it has to generate conforming output when tested against the API.) Programmers now get to code to one API, not various flavors, and the consumer gets to know that a DX10 card will run all DX10 games.
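To make the break concrete, here’s a hedged little fragment of my own (the device names are placeholders): the commented lines are the old bracketed D3D9 model, and the live lines are all D3D10 needs.

```cpp
// Old D3D9 model: every frame explicitly bracketed, caps checked up front.
//   g_dev9->BeginScene();
//   g_dev9->DrawPrimitive(D3DPT_TRIANGLELIST, 0, triangleCount);
//   g_dev9->EndScene();

// D3D10 model: no scene brackets and no caps bits - bind state and draw.
g_device->IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
g_device->Draw(vertexCount, 0);
```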
OK, so what’s the difference between DX10.0 and DX10.1? Basically, what I heard was that DX10.1 was what Microsoft wanted for the Vista ship, but not all the major hardware vendors could get all the features into the current hardware generation. So what shipped was most of the features, minus some of the more esoteric things (like how 4-sample full-screen antialiasing is implemented). The reason DX10.1 is coming out so quickly is that Microsoft wants the spec out there so developers can see what’s going to be available in the near future, as well as putting a stake in the ground that hardware vendors have to meet. The new features in DX10.1 are incremental refinements along those lines, largely aimed at rendering quality.
So, as the Microsoft guy said, it’s all about the rendering quality. I doubt that when DX10.1 comes out your DX10.0 game will suddenly stop working. These are just enhancements to the API that don’t reflect the current state of the hardware, just where the hardware will be forced to go in the near future. The DX9 API is getting a final revision and will then be frozen, so any non-Vista OS will be able to run a DX9 (or 8, or 7) game – as will Vista, since the DX9 DLL will coexist with the DX10.x DLL on Vista. If you want to try it out you’ll need the Vista SP1 beta and the D3D10.1 Tech Preview – both will be downloadable from Microsoft.
At GDC, graphics chip maker NVIDIA announced they are releasing a bunch of updated and new tools. The upgrades are FX Composer 2, PerfHUD 5, and ShaderPerf 2. They are also releasing a new GPU-accelerated texture tool, plus a Shader Library. The most interesting toolkit is the DX10 SDK. Targeted at the GeForce 8 series (currently the only GPUs that can run D3D10), it’s a collection of samples for both OpenGL and DirectX, executables and source, that demonstrate and showcase DX10 features. The installer looks and acts like the Microsoft DirectX sample browser.
The DirectX collection makes up the bulk of the samples, while the OpenGL side is a bit thin. The samples include rain, smoke, fur, shadows, cloth simulation, render-target usage, and so on. The Direct3D SDK is 256MB, while the OpenGL SDK is 45MB.
To compile the source you’ll need Microsoft Visual Studio 2005; to compile the DX samples you’ll also need the February 2007 DirectX SDK installed (which you can get here).
If you actually want to run the code you’ll need a DirectX10 video card – which currently means an NVIDIA GeForce 8. There are videos of the programs so even if you don’t have a DX10 video card you can still see the programs running.
When it rains, it pours. We’re being treated to a slew of new shader-writing tools. ATI has updated its previously available RenderMonkey. NVIDIA has jumped into the fold with FX Composer. And RTzen joins as well with a more production-oriented tool for shader writers. There are some other tools out there, but they are generally student/hobbyist products and their support is somewhat iffy. Here’s a quick summary of each product.
RTzen, Inc., a new company dedicated to empowering 3D artists, today announced it’s conducting the first public demonstration of its RT/shader editing tool at the Game Developers Conference (GDC) in numerous partner booths. RT/shader, shipping later this month, is the industry’s first 3D editing tool with the capability to automatically generate high-level shading language code in real time, leading to a dramatic reduction in development time and cost. Moreover, it enables more 3D graphic artists to leverage shaders to increase the realism and image quality of 3D digital content – like in games, product design, and environment simulation. RT/shader will be on display in the ATI Technologies (booth #827) and NVIDIA Corporation (booth #808) booths and during Alias and Discreet technical seminars. If you get it at GDC the price is US$1,595 – regularly it’s US$1,995. Get more information here.
NVIDIA has released FX Composer. FX Composer enables developers to create high-performance DirectX 9.0 HLSL shaders in an IDE with unique real-time preview and optimization features. It was designed with the goal of making shader development and optimization easier for programmers while providing an intuitive GUI for artists customizing shaders for a particular scene. FX Composer comes with dozens of sample projects, performance tutorials, and more than 120 sample shaders. You can download it here.
ATI has released RenderMonkey 1.5, and it’s the only tool out there that supports the OpenGL shading language, GLSL. There have been a lot of usability improvements since the initial release. The interface has been redone, and it’ll look familiar to anyone who uses Microsoft’s Visual Studio: lots of drag and drop, right-click property menus, and much more attention to making writing shaders as effortless as possible. There’s a review of RenderMonkey in the April Game Developer magazine (which is distributed free at GDC). Download RenderMonkey here.
Summary: From conversations with Jeremy Hubbell at RTzen, the FX Composer group at NVIDIA, and the RenderMonkey group at ATI about the capabilities of their products, I can give a quick summary. If RT/shader has done as good a job of integrating with Maya and 3ds Max as claimed, so that it really is seamless, then the price is well worth it. ATI and NVIDIA are seemingly squaring off against each other, intentionally or not, with very similar tools. Both are similar in outlook and design, but neither offers seamless integration with the usual content tools. NVIDIA has done a nice job of predicting performance using GPU cycle counts, register usage, utilization ratings, and FPS, but this requires a performance profile of the target GPU – which currently exists only for NVIDIA’s products. ATI supports OpenGL shaders on GLSL-capable ATI and 3Dlabs hardware, which pretty much decides that issue if you need to use GLSL. And NOBODY provides a real debugger in any of these tools. The closest you can get is to use the DirectX 9 reference (i.e. software emulation) driver and debug in Visual Studio.
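For those who haven’t tried it, here’s a minimal sketch of that last-resort approach – creating a device on the reference rasterizer so shaders can be stepped through in Visual Studio. The d3d9, hWnd, and pp variables are hypothetical stand-ins for your existing setup code:

```cpp
// Create a device on the DirectX 9 reference (software) rasterizer.
// It is far too slow for real use, but it runs every shader exactly
// to spec, which is what the shader debugger needs.
IDirect3DDevice9* refDevice = NULL;
HRESULT hr = d3d9->CreateDevice(
    D3DADAPTER_DEFAULT,
    D3DDEVTYPE_REF,                       // reference rasterizer, not HAL
    hWnd,
    D3DCREATE_SOFTWARE_VERTEXPROCESSING,  // REF has no hardware T&L
    &pp,
    &refDevice);
```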
DirectX Next – Oh Pleeeze!
The slides from Microsoft’s Meltdown 2003 are available here. I’ve not been a fan of DirectX’s piecemeal distribution of shader technology – not so much for the hardware folks as for the consumers. When I’d chat with the folks who write shader code for a living (outside the Evil Empire), I’d get hints as to the stuff coming “in the next release.” This was particularly annoying as I was writing a book targeting this audience at the time, and you’d think Microsoft, at the very least, would want to publicize this stuff. The hardware folks and the top-tier game writers were all in the know. They’d let me know, generally, that there was more to be had. Even when they did come out and state what was going on, I, under NDA, couldn’t disclose what I knew. It was very frustrating. For all intents and purposes, Microsoft does now seem to want to disseminate this info. Unfortunately, they don’t seem to speak with a single clear voice since Phil Taylor left for the warm arms of ATI. Sigh – instead of having someone spoon-feed this out to the public, you’ve now got to glean this stuff yourself. Let’s look at the recent Meltdown slides, for example.
What’s new with DX?
If you’re an artist learning shader programming, a programmer interested in having a shader testbed, or just interested in shader programming in general, ATI has released RenderMonkey 1.0. (To find out what RenderMonkey can do, look here.) According to ATI, RenderMonkey has undergone a major rewrite since the v0.9 beta. These changes have greatly improved the stability and usability of RenderMonkey and also provided a much more developer-friendly framework for the introduction of the RenderMonkey SDK. Find out more at the ATI site here. You’ll need DirectX 9.0b. The following features have changed or been added since the v0.9 beta:
- Completely rewritten preview window including a more extensive Trackball user interface.
- Completely rewritten HLSL and Assembly editors with improved user interface and syntax highlighting.
- Support for REFRAST.
- Additions to existing set of RenderMonkey special variables giving user control and adding functionality such as random number generation.
- Addition of Camera object types, allowing per-pass camera parameters to be stored in the workspace.
- Display of HLSL disassembly.
- Addition of many more HLSL examples.
- Improved error checking and reporting.
- Automatic mipmap generation for renderable textures.
XBitlabs reports Mercury Research’s Q2 graphics market share numbers and how they shifted from Q1. The big winner is Intel (even before they release the fricken Grantsdale chipset), increasing share from 27% to 32% thanks to the integrated P4 chipsets. ATI increased from 20% to 21%, and NVIDIA fell from 31% to 27%. The report also said that NVIDIA had 60% of the DirectX 9 market. Not bad for a company that botched its first DirectX 9 product release, although it seems that most of these were entry-level (GeForce FX 5200) parts. Still, it goes a long way toward vindicating NVIDIA’s strategy of shipping only DirectX 9 capable products in its latest lineup. That leaves ATI with 40% of the DirectX 9 market, though that’s where most of the high-end cards went. Since NVIDIA’s 27% of the overall market is 60% of the DirectX 9 segment, that segment works out to 27 ÷ 0.6, or roughly 45% of the Q2 graphics market being DirectX 9 capable cards.
The remaining 20% of the graphics market went mostly to SiS/XGI. Matrox Graphics, Trident Microsystems (now sold to XGI), S3 Graphics/VIA, Silicon Motion and 3Dlabs now occupy very small market shares.