Other than that, I've started tinkering on a little animation pipeline for XNA together with a 3D artist as a learning experience. I have no idea if this will be anywhere near finished any time soon, or if it'd even be useful for the community since there seem to be a lot of libraries out there already. Like any good software project, I settled on an acronym for the name anyway before it's anywhere near done, so I lay claim to Uxmal. It's a good fit since the artist is working in Maya and I'm on my third rewrite of the codebase already. Xna Model Animation Library also fits, but my development efforts are currently stumped by trying to come up with what the U should mean. Useful would be ideal, but it might also be Unnecessary or Useless.
Time will tell :)
I hope to find some time for the write-ups and for posting code soon. Please bear with me :)
output.Position.z = log(C*output.Position.z + 1) / log(C*Far + 1) * output.Position.w;
The C constant allows you to define the resolution you want at the near plane and Far is the value you use for the far plane. More details can be found in the journal linked above. I happened to be tinkering on a to-scale model of our solar system with a bunch of stars surrounding us (from Hipparcos data). The light-year distances for the stars and the AU distances within the system were a source of woes with the depth buffer, but using this simple line made my z-fighting stars play nice.
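To get a feel for why this helps, here is a small CPU-side sketch in Python (not shader code) that compares the depth a conventional perspective projection would store against the logarithmic mapping from the line above. The near and far values, the AU units, C = 1 and the 24-bit depth buffer are all assumptions picked for illustration; the shader multiplies by w only because the hardware divides by w afterwards, so the sketch works with the post-divide value directly.

```python
import math

DEPTH_STEPS = 2**24 - 1  # assumed: distinct values in a 24-bit depth buffer


def standard_depth(z, near, far):
    """Post-divide depth of a D3D-style perspective projection, in [0, 1]."""
    return far / (far - near) * (1.0 - near / z)


def log_depth(z, far, c=1.0):
    """Post-divide depth of the log mapping: log(C*z + 1) / log(C*Far + 1)."""
    return math.log(c * z + 1.0) / math.log(c * far + 1.0)


def quantize(depth):
    """Round a [0, 1] depth to the nearest representable buffer value."""
    return int(depth * DEPTH_STEPS + 0.5)


if __name__ == "__main__":
    AU_PER_LIGHT_YEAR = 63241.0
    near, far = 0.001, 1.0e9  # hypothetical planes, in AU
    for light_years in (4.2, 4.3):
        z = light_years * AU_PER_LIGHT_YEAR
        print(light_years,
              quantize(standard_depth(z, near, far)),
              quantize(log_depth(z, far)))
```

With these assumed planes, two stars a tenth of a light-year apart land on the same value in the conventional depth buffer, while the logarithmic version keeps them apart.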
I haven't gotten around to writing interesting samples or uncovering any deep XNA truths lately. Instead I've been tinkering on a little beat-em-up game with stylized stick men to do the fighting. These little actors are entirely procedural, so the game itself generates the geometry and animations rather than using models created by artists. Obviously nothing can replace a good artist and this proved painfully true when it came to the animations.
The animations are generated by a particle-based physics system (described here), which works by applying force to the attacking limb towards the victim, checking collisions and letting the simulation run its course. The base skeleton displayed below is set up easily enough, but without additional constraints the resulting movements are far from natural. As noted in the original article, a lot of tweaking can also be done using the mass of particles to get some control over how easily particles (i.e. joints) can move.
So if the skeleton and animations are that hard to tweak, you might be wondering what good this procedural technique is then. The beauty of this (admittedly simple) physics-based rendering setup is that you essentially get inverse kinematics for free. If I want to hit my opponent with a hand, I just apply some force on the hand towards where I want to hit him. With sufficient tweaking, this produces convincing animations for accurately hitting the victim anywhere with any part of the attacking actor. Headbutts, kicks and more exotic attacks are just a matter of picking target and subject particles on either side.
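As a rough illustration of the "inverse kinematics for free" idea, here is a minimal 2D sketch in Python. It assumes a Verlet-style particle system with fixed-length bones, which is one common way to build this kind of setup; it is not the game's actual code, and `Particle`, `pull_towards`, `step`, the damping factor and the spring-like pull are all made up for the example.

```python
import math


class Particle:
    """A joint: position, previous position (implicit velocity) and mass."""

    def __init__(self, x, y, mass=1.0, pinned=False):
        self.pos = [x, y]
        self.prev = [x, y]
        self.mass = mass
        self.pinned = pinned
        self.force = [0.0, 0.0]


def pull_towards(p, target, stiffness):
    """Accumulate a spring-like force dragging particle p towards target."""
    p.force[0] += stiffness * (target[0] - p.pos[0])
    p.force[1] += stiffness * (target[1] - p.pos[1])


def step(particles, bones, dt, damping=0.9, iterations=8):
    """Advance one frame: damped Verlet integration, then bone relaxation."""
    for p in particles:
        if not p.pinned:
            for i in range(2):
                velocity = (p.pos[i] - p.prev[i]) * damping
                p.prev[i] = p.pos[i]
                p.pos[i] += velocity + (p.force[i] / p.mass) * dt * dt
        p.force = [0.0, 0.0]
    # Push each bone back to its rest length; heavier particles move less.
    for _ in range(iterations):
        for a, b, rest in bones:
            dx, dy = b.pos[0] - a.pos[0], b.pos[1] - a.pos[1]
            dist = math.hypot(dx, dy) or 1e-9
            wa = 0.0 if a.pinned else 1.0 / a.mass
            wb = 0.0 if b.pinned else 1.0 / b.mass
            if wa + wb == 0.0:
                continue
            scale = (dist - rest) / dist / (wa + wb)
            a.pos[0] += dx * scale * wa
            a.pos[1] += dy * scale * wa
            b.pos[0] -= dx * scale * wb
            b.pos[1] -= dy * scale * wb
```

To throw a punch with this sketch, pin a shoulder particle, chain an elbow and a hand to it with two bones, and call `pull_towards(hand, target, stiffness)` followed by `step(...)` every frame. Only the hand is ever told where to go; the elbow bends on its own because the bones have to keep their length.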
Another nice benefit of this procedural approach is that the geometry is very accessible to the program and thus can be altered in a variety of ways. With a few minutes of tinkering, style variations like those below are easily implemented.
The project is still a pretty long way off from becoming a playable game, but it's already made its way around the office for passive-aggressive stress relief.
I'm afraid I can't put a playable build out anytime soon, but in the meantime here's a little movie (WMV, 7 MB) showing some basic pummeling and the style so far. Since a lot of tweaking is involved and much of the style is still up in the air, comments and/or suggestions would be much appreciated.
Foregoing lamenting my own slacking, I want to point everyone who cares to read this to a great online resource I stumbled upon today:
A free online book called "Programming Vertex, Geometry, and Pixel shaders"
It is geared towards D3D10, but it contains an incredible wealth of information about nearly every graphics topic you'd want to implement. The concise and thorough theory certainly holds for XNA, and most shaders should give you a good idea of how stuff gets implemented, even if they don't work out of the box. It's written by these guys, who deserve a truckload of cookies in my opinion!
Oh and Jack, if you ever should find your way here, we have to discuss your definition of 'not very active' :)
A bit of an off-topic post for any Dutch readers that may come across our little blog. MS decided to revamp their Dutch .NET Magazine and change the subscription to Opt-In, meaning you'll explicitly need to tell them you want to keep receiving the magazine after the next issue of September 22nd. You can find more details and re-subscribe over at this page:
http://www.microsoft.nl/netjesgeregeld
It's quite a risky step for them to take, but they want to make sure they're reaching their reader base and get a clearer picture of the interests of their readers. The magazine remains free and they've drummed up a full-blown editorial team for the revamp, so if you've enjoyed the magazine so far make sure you re-subscribe!
-The 360 has 512MB of unified memory, which means it's shared by both the CPU and the GPU. You don't have all of that available to you, since some of it is taken up by the console's "OS" and some will also be taken up by the .NET Compact Framework. You'll also be working from the managed heap, rather than directly working with native memory.
-You can only execute pure managed code on the 360. You can't, for example, P/Invoke into a non-managed DLL.
-As far as GPU shaders go, you're pretty unrestricted. You can use SM3.0 HLSL, or you can also write portions of your shaders in the GPU's native microcode. This is really very nice...it lets you do things like un-normalized texture addressing, full texturing capabilities in the vertex shader, or directly fetching an element from a vertex stream. The microcode set is referred to as xvs_3_0 and xps_3_0.
-The 360's GPU is different from your average PC GPU in that it has an eDRAM framebuffer. The eDRAM is 10MB in size, and has tremendous bandwidth (256GB/s). What this means is that writing out to the framebuffer or reading it back for blending is very very quick. Multi-sampling is also very quick, since again you don't have the bandwidth problem. In fact MSAA would be "free" if it weren't for tiled rendering...you see the downside of eDRAM is that if your render-target + z-buffer is too big to fit in eDRAM, you have to render to it in tiles. This means you render one portion of the target, then another. This isn't so bad, except for the fact that any geometry that's on the edges of 2 tiles has to be drawn twice. If you're not doing scenes with hugely complex geometry you probably won't even notice tiling (it happens automatically). To figure out whether you're going to tile you need to count the number of bytes per pixel and then multiply by resolution. So for example if you're rendering to the Color format which is 4 bytes per pixel and you're using Depth24Stencil8 which is also 4 bytes per pixel, you have 8 bytes per pixel total. When multi-sampling, you multiply this amount by the number of samples (so 4xMSAA would be 32 bytes per pixel). 1280 x 720 with 4xMSAA would be ~28MB, so you'd need 3 tiles.
-Be prepared to get CPU-bound really quick if you're doing anything non-trivial. DrawPrimitive calls are extremely expensive on the 360...I've seen my framerate go from about 70 to 30 just from going from 24 DP calls to 34. Instancing is a must if you need to draw a lot of meshes...there's a good sample on the CC website. By the same token if you're doing any really fancy logic on the CPU that's not graphics-related, you'll probably need to run it on another thread on a different core since your main thread can get bogged down pretty quick.
-Watch out for performance pitfalls with the .NET Compact Framework. Things like Garbage Collection compaction and virtual function calls are much more expensive than they are on the PC. I suggest reading this blog for tips. Just remember to keep your live object count as low as possible, and you should be okay.
-Floating-point performance is not so great on the 360 CPU. Most of the fp power is in the vector units, but you have no access to those through XNA.
-Avoid the surface formats that are larger than 32bpp, mainly HalfVector4 and Vector2. Their performance is generally pretty terrible, and I've run into all kinds of driver bugs with them. This means you can't do HDR in a straightforward way, but there are other options. There's an entry on my XNA blog where I talk about how I got around it.
-Watch your texture sampling bandwidth. Framebuffer access may be quick, but reads from textures are limited by the 22.4GB/s read bandwidth. This may be quite a bit less than what you're used to, if you're prototyping on a higher-end card like an 8800. This can be especially painful in scenarios where you want to take multiple samples per pixel, like PCF for shadow maps or SSAO.
-Prototyping and developing on the PC is a good idea since you have access to PIX, but make sure you test pretty often on the 360. You may need to optimize for quite a few scenarios if you need to keep the framerate up.
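The tiling arithmetic from the eDRAM point above can be sketched in a few lines of Python. This only reproduces the back-of-the-envelope estimate described there (bytes per pixel times resolution, divided by 10MB); the function names are made up for the example, and the real hardware may impose additional restrictions on tile sizes that this ignores.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024  # 10MB of eDRAM, as described above


def bytes_per_pixel(color_bytes=4, depth_bytes=4, msaa_samples=1):
    """Color + depth bytes per pixel, multiplied by the MSAA sample count."""
    return (color_bytes + depth_bytes) * msaa_samples


def estimate_tiles(width, height, msaa_samples=1, color_bytes=4, depth_bytes=4):
    """Rough number of tiles needed to fit the render target in eDRAM."""
    total = width * height * bytes_per_pixel(color_bytes, depth_bytes, msaa_samples)
    return math.ceil(total / EDRAM_BYTES)


if __name__ == "__main__":
    for samples in (1, 2, 4):
        print(samples, estimate_tiles(1280, 720, msaa_samples=samples))
```

Under these assumptions a 1280 x 720 target with Color and Depth24Stencil8 fits in a single tile without multi-sampling, and the estimate goes up by one tile for each doubling of the sample count up to 4x.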
My early prototyping used a standard tone-mapping chain and I didn't want to ditch that, nor did I want to move away from what I was comfortable with. This pretty much eliminated the second option for me off the bat...although I was unlikely to choose it anyway due to its other drawbacks (having nice HDR bloom was something I felt was an important part of the look I wanted for my game, and in my opinion Valve's method doesn't do a great job of determining average luminance). When I tried out the first method I found that it worked as well as it always did on the PC (I've used it before), but on the 360 it was another story. I'm not sure why exactly, but for some reason it simply does not like the HalfVector4 format. Performance was terrible, I couldn't blend, I got all kinds of strange rendering artifacts (entire lines of pixels missing), and I'd get bizarre exceptions if I enabled multi-sampling. Loads of fun, let me tell you.
This left me with option #3. I wasn't a fan of this approach initially, as my original design plan called for things to be simple and straightforward whenever possible. I didn't really want to have two versions of my material shaders to support encoding, nor did I want to integrate decoding into the other parts of the pipeline that needed it. But unfortunately, I wasn't really left with any other options after I found there were no plans to bring support for the 360's special fp10 backbuffer format to XNA (which would have conveniently solved my problems on the 360). So, I started doing my research. Naturally the first place I looked was actual released commercial games. Why? Because usually when a technique is used in a shipped game, it means it's gone through the paces and has been determined to actually be feasible and practical in a game environment. Which of course naturally led me to consider NAO32.
NAO32 is a format that gained some fame in the dev community when ex-Ninja Theory programmer Marco Salvi shared some details on the technique over on the Beyond3D forums. Used in the game Heavenly Sword, it allowed for multi-sampling to be used in conjunction with HDR on a platform (PS3) whose GPU didn't support multi-sampling of floating-point surfaces (the RSX is heavily based on Nvidia G70). In this technique, color is stored in the LogLuv format using a standard R8G8B8A8 surface. Two components are used to store X and Y at 8-bit precision, and the other two are used to store the log of luminance at 16-bit precision. Having 16 bits for luminance allows for a wide dynamic range to be stored in this format, and storing the log of the luminance allows for linear filtering in multi-sampling or texture sampling. Since he first explained it other games have also used it, such as Naughty Dog's Uncharted. It's likely that it's been used in many other PS3 games, as well.
My actual shader implementation was helped along quite a bit by Christer Ericson's blog post, which described how to derive optimized shader code for encoding RGB into the LogLuv format. Using his code as a starting point, I came up with the following HLSL code for encoding and decoding:
// M matrix, for encoding
const static float3x3 M = float3x3(
    0.2209, 0.3390, 0.4184,
    0.1138, 0.6780, 0.7319,
    0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
    6.0013, -2.700, -1.7995,
    -1.332, 3.1029, -5.7720,
    0.3007, -1.088, 5.6268);

float4 LogLuvEncode(in float3 vRGB)
{
    float4 vResult;
    float3 Xp_Y_XYZp = mul(vRGB, M);
    // Clamp to avoid dividing by zero and taking log2(0) for black pixels
    Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
    // Chromaticity goes into two 8-bit channels
    vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
    // Log luminance is split across the other two channels:
    // fractional part in w, the remainder in z
    float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
    vResult.w = frac(Le);
    vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
    return vResult;
}

float3 LogLuvDecode(in float4 vLogLuv)
{
    // Reassemble the log luminance from its two channels
    float Le = vLogLuv.z * 255 + vLogLuv.w;
    float3 Xp_Y_XYZp;
    Xp_Y_XYZp.y = exp2((Le - 127) / 2);
    Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
    Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
    float3 vRGB = mul(Xp_Y_XYZp, InverseM);
    return max(vRGB, 0);
}
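For anyone who wants to poke at the numbers without firing up a shader, here is a CPU-side Python sketch of the same encode/decode pair. It mirrors the HLSL above line for line; the only additions are the assumed `to_byte` helper, which simulates writing each channel to an 8-bit R8G8B8A8 surface, and a `luminance` helper for checking results. Treat it as a way to sanity-check round-trip precision under those assumptions, not as a reference implementation.

```python
import math

# Matrices copied from the HLSL above (row-vector convention: v * M).
M = [
    [0.2209, 0.3390, 0.4184],
    [0.1138, 0.6780, 0.7319],
    [0.0102, 0.1130, 0.2969],
]
INVERSE_M = [
    [6.0013, -2.700, -1.7995],
    [-1.332, 3.1029, -5.7720],
    [0.3007, -1.088, 5.6268],
]


def mul_row(v, m):
    """Row vector times 3x3 matrix, like HLSL mul(v, m)."""
    return [sum(v[k] * m[k][c] for k in range(3)) for c in range(3)]


def to_byte(x):
    """Simulate storing a [0, 1] value in an 8-bit channel (assumed helper)."""
    return min(255, max(0, int(x * 255.0 + 0.5)))


def luminance(rgb):
    """The Y term that the encoder stores as log luminance."""
    return mul_row(rgb, M)[1]


def logluv_encode(rgb):
    """Pack an HDR RGB triple into four 8-bit channel values."""
    xp, y, xyzp = (max(c, 1e-6) for c in mul_row(rgb, M))
    le = 2.0 * math.log2(y) + 127.0
    w = le - math.floor(le)  # frac(Le)
    z = (le - math.floor(w * 255.0) / 255.0) / 255.0
    return [to_byte(xp / xyzp), to_byte(y / xyzp), to_byte(z), to_byte(w)]


def logluv_decode(texel):
    """Unpack four 8-bit channel values back into an HDR RGB triple."""
    u, v, z, w = (b / 255.0 for b in texel)
    le = z * 255.0 + w
    y = 2.0 ** ((le - 127.0) / 2.0)
    xyzp = y / v
    xp = u * xyzp
    return [max(c, 0.0) for c in mul_row([xp, y, xyzp], INVERSE_M)]


if __name__ == "__main__":
    hdr = [4.0, 2.0, 1.0]
    texel = logluv_encode(hdr)
    print(texel, logluv_decode(texel))
```

In this sketch the luminance comes back almost exactly, since it gets 16 bits, while the individual color channels drift a little more because the chromaticity only gets 8 bits per component.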
Once I had this implemented and worked through a few small glitches, results were much improved in the 360 version. Performance was much much better, I could multi-sample again, and the results looked great. So once again things didn't exactly work out in an ideal way, but I'm pleased with the results.

Since the stuff I'm working on for XNAInfo is taking forever to finish, I thought I'd post a link to this little gem here in the meantime. On my forum rounds I find myself linking to Tom's Excellent DirectX Faq at least once a week. Obviously it's DirectX specific, but that easily translates to tons of useful information on XNA development on Windows. It's a great resource that should prove useful for just about anyone.
Some time ago a topic popped up on GameDev on how to stick a WebBrowser control on a D3D surface, or rather an XNA texture level. After some tinkering we got it to work (obviously Windows only), so it can be used to render any webpage to an XNA texture. The next step was to try and make it interactive, paving the way for HTML and Flash based GUIs. Unfortunately we ran into a strange bug here, which I haven't been able to solve. So I figured I'd post this out here, hoping someone who can help out comes across it.
Here are the (messy) demo projects:
The bug surfaces in the 2nd project. Basically it works fine until the user left-clicks anywhere on the control (doesn't have to be a link), after which WebBrowser.DrawToBitmap fails silently and only an empty white bitmap gets rendered. The strange thing is that the WebBrowser control does load the new webpage. By uncommenting line 241, mouse moves are posted to the control and the window title will show the HREF of links on the (loaded but invisible) page as you hover over them.
Anyway, perhaps someone better versed in window messages can check this out, see if the code for posting mouse presses/releases is correct. While researching this problem, I also came across this MSDN page which states DrawToBitmap isn't supported for the WebBrowser control anyway, so it seems a miracle it works in the first place. If anyone cares to comment on that, please let's hear it.
In a recent discussion on RTS AI, Matias Goldberg pointed out the quite ancient text The Art of War by Sun Tzu. Since I've been fostering an RTS pet project for years now, I decided to get a copy and it turned out to be probably the best $4.95 I ever spent (ISBN10 0-486-42557-6).
Obviously it's not a cookbook for writing RTS strategies, but after a first read I can definitely see how various stratagems and axioms could be used to construct a clever AI. In fact, I'm amazed how many clear-cut rules of thumb Sun Tzu puts forward that could be readily applied. These, however, may only serve to highlight the limited scale on which popular RTS games play out.
Food for thought at any rate.