So, I don't know if anyone has taken a look at Photon's source code (Photon is my particle engine; it comes with Brain 2, http://brain2.codeplex.com ). I did not include the source, but maybe I will in the next release.
I did some "crazy ***" in HLSL to make the particle system work. It works pretty well: I got more than 50,000 512x512 particles on the Xbox 360.
When I was making Photon, I used the Particles Sample as a starting point (only as a start). By combining it with an instancing system, I could have a lot more particles. So I had a CPU particle system that indexed matrices (16 floats each) and sent them to the GPU, which rendered them.
But then, while studying matrices, I saw that they are easy to build. So... I changed the indexing system to something like this:
-> PositionTime: XYZ = Position of the particle. W = Time of the particle
-> LinearVelocitySize: XYZ = Start linear velocity. W = Size of the particle
-> StartColor: XYZW = The color of the particle
Well, that's 4 floats fewer per particle (12 instead of 16). It may not seem like much, but when you have 50,000 particles, it adds up to around 200,000 fewer floats. I guess I can reduce this even more by using half2 (I know almost nothing about this particular part; I need to study it a little bit more).
But here is the strange thing about the renderer, which you must be wondering about: "To render a model, you need a World matrix".
So I ended up with this:
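// Builds a 4x4 identity matrix.
float4x4 Identity()
{
    float4x4 Matrix = (float4x4)0;
    Matrix._11 = 1;
    Matrix._22 = 1;
    Matrix._33 = 1;
    Matrix._44 = 1;
    return Matrix;
}

// Builds a world matrix with uniform scale and translation (no rotation).
float4x4 CreateWorld(float3 Position, float Scale)
{
    float4x4 Trans = Identity();
    Trans._11 = Scale;
    Trans._22 = Scale;
    Trans._33 = Scale;
    Trans._41 = Position.x;
    Trans._42 = Position.y;
    Trans._43 = Position.z;
    return Trans;
}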
And in the vertex shader:
float4x4 world = CreateWorld(FinalPosition, Size);
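To put that line in context, here is a rough sketch of how a vertex shader could use CreateWorld. This is not Photon's actual shader: the View and Projection parameters, the struct names, the semantics, and the ParticleVS entry point are all assumptions, and so is the way FinalPosition is computed (constant velocity, with W of PositionTime treated as the particle's age). Billboarding and the technique/pass declarations are left out.

```hlsl
// Assumed effect parameters; names are illustrative, not taken from Photon.
float4x4 View;
float4x4 Projection;

// Same result as the CreateWorld above, written with the float4x4 constructor
// (the constructor fills the matrix row by row).
float4x4 CreateWorld(float3 Position, float Scale)
{
    return float4x4(
        Scale,      0,          0,          0,
        0,          Scale,      0,          0,
        0,          0,          Scale,      0,
        Position.x, Position.y, Position.z, 1);
}

// Hypothetical input layout; the semantics are assumptions.
struct VSInput
{
    float4 Position           : POSITION0; // quad corner in model space, w = 1
    float4 PositionTime       : TEXCOORD1; // xyz = position, w = time
    float4 LinearVelocitySize : TEXCOORD2; // xyz = start velocity, w = size
    float4 StartColor         : COLOR1;    // particle color
};

struct VSOutput
{
    float4 Position : POSITION0;
    float4 Color    : COLOR0;
};

VSOutput ParticleVS(VSInput input)
{
    VSOutput output;

    // Assumption: constant-velocity motion, with w holding the particle's age.
    float3 FinalPosition = input.PositionTime.xyz
                         + input.LinearVelocitySize.xyz * input.PositionTime.w;
    float Size = input.LinearVelocitySize.w;

    // Build the world matrix per vertex from the compact data.
    float4x4 world = CreateWorld(FinalPosition, Size);

    // Row-vector convention: position * World * View * Projection.
    float4 worldPosition = mul(input.Position, world);
    output.Position = mul(mul(worldPosition, View), Projection);
    output.Color = input.StartColor;
    return output;
}
```

The point of the sketch is only that the 16-float matrix never has to leave the CPU; everything the shader needs fits in three float4 values per particle.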
Is this crazy or cool? One thing I'm sure of: it works. But is it crazy? Is there any better way?
And is there a way I can get this kind of instancing in SunBurn? Because in the Instancing Sample, SunBurn uses a skinning-like system, which would force me to rework the whole particle system...
Looks good to me!
Wow. I like your thinking. Cool.
Now I just realized one thing: it really is less memory, and I made the comparison wrong:
-> If I were using a world matrix, I would end up with a VertexDeclaration like this:
Vector4 Wor1
Vector4 Wor2
Vector4 Wor3
Vector4 Wor4
float Time
Vector4 LinearVelocitySize
Vector4 StartColor
That would be 25 floats, which is 100 bytes (800 bits); for just one particle, that is too much for me. With my system I can use fewer than half the floats, ending up with 12 floats (48 bytes).
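For comparison, the matrix-based layout listed above could look roughly like this on the HLSL side. The struct name and the semantics are assumptions made for illustration; only the field list and the float counts come from the post.

```hlsl
// Hypothetical per-particle input for the matrix-based version.
// Semantic assignments are assumptions, not Photon's declarations.
struct MatrixParticleInstance
{
    float4 Wor1               : TEXCOORD1; // world matrix row 1 (4 floats)
    float4 Wor2               : TEXCOORD2; // world matrix row 2 (4 floats)
    float4 Wor3               : TEXCOORD3; // world matrix row 3 (4 floats)
    float4 Wor4               : TEXCOORD4; // world matrix row 4 (4 floats)
    float  Time               : TEXCOORD5; // 1 float
    float4 LinearVelocitySize : TEXCOORD6; // 4 floats
    float4 StartColor         : COLOR1;    // 4 floats
};
// Total: 16 + 1 + 4 + 4 = 25 floats = 100 bytes per particle,
// versus 3 x float4 = 12 floats = 48 bytes for the compact layout.
```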
But there's still one problem. I don't know if it's slow, but building a world matrix for every vertex in the vertex shader seems kinda bad. IDK, there aren't many calculations in the VS; the heaviest thing I do is calculate the billboarding.
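On the per-vertex cost: since CreateWorld only ever produces uniform scale plus translation, multiplying a position (with w = 1) by it works out to a single multiply-add, so the matrix could be skipped entirely. A minimal sketch, assuming the row-vector convention mul(position, world); the helper name TransformParticleVertex is made up for this example.

```hlsl
// Gives the same xyz as
//   mul(float4(LocalPosition, 1), CreateWorld(Position, Scale)).xyz
// when the matrix contains only uniform scale and translation.
// Hypothetical helper, not part of Photon.
float3 TransformParticleVertex(float3 LocalPosition, float3 Position, float Scale)
{
    return LocalPosition * Scale + Position;
}
```

Whether this is measurably faster is not certain; the shader compiler may already fold away the zero entries of the matrix, so it would need to be profiled.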
Ok, that's it ;) I'm going to work more on Photon, and maybe in Brain 0.13 it will support 100,000 particles on the 360 :D
It's usually the case: you optimize the memory and you hit the performance; you optimize the performance and you hit the memory. You can optimize both at the same time only when you are fixing lame code :)
Can't comment on how much extra latency was introduced, but at least it's in the shader... and it's interesting thinking.