I've been playing around with tree models and animations.
I want a really simple and cheap animation, so that I can place a lot of trees and still get good overall performance. My previous tree animations used bones. Each tree had only two bones, but I still felt that was too expensive. Not that I ran any tests, it just felt like it had to be expensive. So I made a simple vertex offset animation. The red channel of the vertex colour controls how much each vertex moves:
But I'm only doing a simple translation along one axis, so the animation has to be subtle. It doesn't look good if I overdo the translation:
Maybe I'll want stormy weather with strong wind at some point, and I'll have to come up with something better for that. But for now subtle movement is OK.
So I've spent a couple of days fighting dynamic batching in Unity. At first my vertex offset was applied in the shader before multiplying by the MVP matrix, and that causes problems with batching. When meshes are batched, their vertices are sent to the GPU already in world coordinates, so the translation ends up being applied in a different space. For example, in Blender the local model-space Z axis is up, while in Unity the Y axis is up, so applying the same translation in local and in world coordinates moves the vertices along different axes. I fixed it by splitting the MVP multiplication in two. The first multiplication uses the model (object-to-world) matrix. Then I apply the translation, and after that I multiply by the VP matrix. This way the translation is always in world coordinates. Here's a fragment of the shader to show what I mean:
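// Model space -> world space first (the "M" part of MVP).
o.posWorld = mul(_Object2World, v.vertex);
// Sway along world Z, scaled by the red vertex colour channel; the sine term ranges from -1 to 0.
o.posWorld.xyz += (o.vertexColor.r * ( (-1 + sin(_Seed + node_4324.g)) / 2)*float3(0,0,_Amplitude));
// World space -> clip space (the "VP" part of MVP).
o.pos = mul(UNITY_MATRIX_VP, o.posWorld );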
It sucks, because now all the trees bend in the same direction. Without batching I can just rotate the trees randomly and they bend in different directions.
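To make the space mismatch concrete, here is a small Python sketch (not shader code) that tracks only the direction of the offset. The helper names, the -90 degree import rotation and the yaw angles are made up for illustration; only the rotation part of the model matrix matters here, because the model's position does not change a direction.

```python
import math

def rotate_x(v, degrees):
    """Rotate the vector v = (x, y, z) about the X axis."""
    a = math.radians(degrees)
    x, y, z = v
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))

def rotate_y(v, degrees):
    """Rotate the vector v = (x, y, z) about the Y axis (a yaw in a Y-up world)."""
    a = math.radians(degrees)
    x, y, z = v
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))

def rounded(v):
    """Round away floating point noise so directions are easy to compare."""
    return tuple(round(c, 6) for c in v)

def local_offset_in_world(model_rotation, offset):
    """Offset added in model space: the model rotation is applied to it afterwards."""
    return rounded(model_rotation(offset))

def world_offset_in_world(model_rotation, offset):
    """Offset added after the model matrix: the model rotation never touches it."""
    return rounded(offset)

# Same direction as float3(0,0,_Amplitude) in the shader, with amplitude 1.
offset = (0.0, 0.0, 1.0)

# Assumed import rotation for a model authored Z-up (hypothetical value).
def import_fix(v):
    return rotate_x(v, -90)

# Two trees with different random yaw (hypothetical values).
def tree_a(v):
    return rotate_y(v, 0)

def tree_b(v):
    return rotate_y(v, 90)

axis_mismatch = (local_offset_in_world(import_fix, offset),
                 world_offset_in_world(import_fix, offset))
unbatched = (local_offset_in_world(tree_a, offset), local_offset_in_world(tree_b, offset))
batched = (world_offset_in_world(tree_a, offset), world_offset_in_world(tree_b, offset))

print("local vs world offset under the import rotation:", axis_mismatch)
print("model-space offset, two yaws:", unbatched)
print("world-space offset, two yaws:", batched)
```

With the offset in model space each tree's yaw turns the sway direction, and with the offset in world space every tree gets the same direction, which is the trade-off described above.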
So I was curious how much batching actually improved performance. I made a scene with ~1100 trees:
1100 trees in 26 batches sounds quite good. Half of the batches are for shadow generation. I was curious how Unity's dynamic batching works. Does it sort meshes by material? Or does it just go in some random order? I checked the frame buffer. There are 3 materials for the trees, with 3 different colours. You can see (maybe not so clearly, because the colours are similar) that it renders all the medium-green trees, then all the light-green trees, and finally all the brownish trees. Pretty neat. So far so good.
The first weird thing I noticed is that the stats in Unity show 480k vertices rendered. Well, that's bullshit. My tree has 50 vertices:
1100 x 50 is 55k vertices. Then I checked the model in Unity:
244 vertices. Why? Because of flat shading. With flat shading, a vertex has a different normal for every face it belongs to, and Unity handles this by duplicating the vertex once for each distinct normal. You can actually run a simple test: create one cube with flat shading and one with smooth shading. The smooth-shaded cube will have 8 vertices and the flat-shaded one 24. I searched for a different approach, but with no luck.
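The cube test can be reproduced on paper. The sketch below is plain Python, not Unity code, and the function name is made up. It assumes a vertex is "the same" only when both its position and its normal match, which is the duplication rule described above, and it ignores UV seams, which can split vertices further. It then redoes the scene arithmetic with the imported vertex count.

```python
from itertools import product

# The 8 corners of a cube and its 6 faces as (normal, corners on that face).
corners = list(product((-1, 1), repeat=3))
faces = []
for axis in range(3):
    for sign in (-1, 1):
        normal = tuple(sign if i == axis else 0 for i in range(3))
        face_corners = [c for c in corners if c[axis] == sign]
        faces.append((normal, face_corners))

def count_mesh_vertices(faces, flat):
    """Count distinct (position, normal) pairs over all face corners."""
    unique = set()
    for normal, face_corners in faces:
        for corner in face_corners:
            if flat:
                # Flat shading: the corner carries the normal of each face it touches.
                unique.add((corner, normal))
            else:
                # Smooth shading: one shared (averaged) normal per corner.
                unique.add((corner, "shared"))
    return len(unique)

smooth_count = count_mesh_vertices(faces, flat=False)
flat_count = count_mesh_vertices(faces, flat=True)
print("smooth cube:", smooth_count, "flat cube:", flat_count)

# Scene arithmetic with the numbers from the post.
trees = 1100
modelled_vertices = 50    # per tree, as counted in the modelling tool
imported_vertices = 244   # per tree, as reported by Unity
expected_total = trees * modelled_vertices
imported_total = trees * imported_vertices
print(expected_total, imported_total, round(imported_vertices / modelled_vertices, 2))
```

Even 1100 x 244 does not reach the 480k shown in the stats, so that figure presumably also counts work this sketch does not model, such as the shadow passes.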
OK, so how good is batching? On my laptop I got 50 fps on average with batching. Without batching it's actually better: 60 fps. I thought, OK, with batching the CPU has to do some extra work and transform the vertices into world coordinates. There are actually about 5 times more vertices than I expected, and draw calls aren't as expensive on PC as they are on mobile. So I checked it on my mobile devices. On both my tablet and my phone, the version without batching was slightly better (by 1-2 fps). That was unexpected. Here are the stats from the batched and non-batched versions:
If you have any thoughts about this, let me know. I'm a little bit shocked that batching doesn't improve performance in my case.