What is the fastest way to store VBO content?

digitalgibs

Level 0

« on: April 29, 2010, 05:23:46 PM »

I want to store my vertex data in a VBO, but I am wondering what the best approach would be. Originally I created an interlaced vertex format:

Code:

struct InterlacedVertex
{
	vec3f pos;
	vec3f normal, tan, binormal;
	vec2f texcoord;
	byte color[4];
};

This worked okay, but it was pretty limiting. I couldn't use more than one set of texture coordinates, and my animation code was forced to upload the entire vertex array, since there was no easy way to update only a subset <pos, normal, tan, binormal>.
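To make the partial-update problem concrete, here is a minimal sketch of the stride and offsets of the struct above. It assumes `vec3f` is three floats, `vec2f` is two floats, and `byte` is an unsigned char; the thread does not show those definitions, and the constant names are hypothetical.

```cpp
#include <cstddef>

// Assumed definitions; the original post does not show these types.
struct vec3f { float x, y, z; };
struct vec2f { float u, v; };
typedef unsigned char byte;

struct InterlacedVertex
{
	vec3f pos;
	vec3f normal, tan, binormal;
	vec2f texcoord;
	byte color[4];
};

// Bytes from one vertex to the next in the buffer.
constexpr std::size_t kStride = sizeof(InterlacedVertex);

// pos, normal, tan and binormal occupy the bytes before texcoord.
constexpr std::size_t kAnimatedBytes = offsetof(InterlacedVertex, texcoord);

// Bytes per vertex that animation never touches (texcoord + color).
constexpr std::size_t kStaticBytes = kStride - kAnimatedBytes;
```

Under these assumptions the animated components are contiguous within one vertex, but every vertex's animated block is separated from the next by the static bytes, so a single contiguous upload of only the animated data is not possible.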

Now I am thinking about creating separate arrays for each component, but this seems like it would be a cache nightmare.

VBO Memory Layout:
N = number of verts
M = number of uv sets
[N points][N normals][N tangents][N colors][N*M texcoords]

CON - for each vertex, the driver will have to sample across huge spans of memory...
PRO - can update subsets of the model without having to re-upload unchanged data.
PRO - can support multiple texture coordinates, now that it is not interlaced.

Code:

std::vector<vec3f> points;
std::vector<Tangents> normals;
std::vector<vec2f> texcoords;
std::vector<VertexColor> colors;
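The byte offset of each block in the layout above follows directly from N and M. A minimal sketch, assuming 12-byte `vec3f`, 8-byte `vec2f`, 4-byte colors, and one `vec3f` per tangent; `PlanarLayout` and `makePlanarLayout` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Assumed element sizes; the thread does not define these types exactly.
constexpr std::size_t kVec3Size  = 3 * sizeof(float); // points, normals, tangents
constexpr std::size_t kVec2Size  = 2 * sizeof(float); // texcoords
constexpr std::size_t kColorSize = 4;                 // 4 x byte

// Byte offsets of each block in
// [N points][N normals][N tangents][N colors][N*M texcoords].
struct PlanarLayout
{
	std::size_t points;
	std::size_t normals;
	std::size_t tangents;
	std::size_t colors;
	std::size_t texcoords;
	std::size_t totalBytes;
};

inline PlanarLayout makePlanarLayout(std::size_t n, std::size_t m)
{
	PlanarLayout l;
	l.points     = 0;
	l.normals    = l.points    + n * kVec3Size;
	l.tangents   = l.normals   + n * kVec3Size;
	l.colors     = l.tangents  + n * kVec3Size;
	l.texcoords  = l.colors    + n * kColorSize;
	l.totalBytes = l.texcoords + n * m * kVec2Size;
	return l;
}
```

With offsets like these, an animation update would only need to re-upload the byte range from `points` up to `colors` (for example with `glBufferSubData`), and each block can be bound as a tightly packed attribute by passing its offset to `glVertexAttribPointer` with a stride of 0.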



David Pittman

Level 2

MAEK GAEM

Re: What is the fastest way to store VBO content?

« Reply #1 on: April 30, 2010, 08:20:01 AM »

Are you implying that those four std::vectors would be of different lengths (or that a single vertex might have a different index of each)? Because AFAIK you can't use separate indices for different elements of a vertex.




digitalgibs

Level 0

Re: What is the fastest way to store VBO content?

« Reply #2 on: May 01, 2010, 06:36:39 PM »

Yes, all the arrays would be of the same length. The only difference is that each type (pos, normal, etc.) would be stored as a packed array in the VBO and not interlaced. I just wasn't sure if that was smart...

I'm curious if anyone out there has tried both ways and noticed any performance differences. I'll most likely be using it for a lot of dynamic geometry.



zacaj

Level 3

void main()

Re: What is the fastest way to store VBO content?

« Reply #3 on: May 03, 2010, 04:46:56 AM »

I haven't tried it myself, but, at least on the iPhone, interlaced is supposed to be faster.




westquote

Level 0

I make games in Philly. How rare!

Re: What is the fastest way to store VBO content?

« Reply #4 on: May 03, 2010, 08:45:57 PM »

I think you mean 'interleaved', not 'interlaced'.

Interleaved data should generally be more efficient with regard to cache coherency and cache misses. What I do in my codebase is define a runtime structure that maps components to offsets within vertex buffers. This allows me to interleave the data all I want, or break it into separate buffers, or any combination thereof. I tend to put all my static components (those that do not change every frame) into a single interleaved buffer, and all my dynamic components (those that will change every frame) into a second interleaved buffer. This lets me take advantage of the static buffer being cached on the hardware, which is a nice bandwidth savings, while still interleaving the data that is going to change every tick.
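A runtime structure of that kind might look roughly like the sketch below. Every name here (`Component`, `AttributeBinding`, `VertexLayout`, `makeStaticDynamicLayout`) is hypothetical, and the strides in the example assume float positions, normals and texcoords plus a 4-byte color; it is an illustration of the idea, not the actual code described in this post.

```cpp
#include <cstddef>
#include <vector>

enum class Component { Position, Normal, Tangent, Color, TexCoord0 };

// Where one vertex component lives.
struct AttributeBinding
{
	Component   component;   // which component this entry describes
	std::size_t bufferIndex; // which vertex buffer holds it
	std::size_t offset;      // byte offset of the first element in that buffer
	std::size_t stride;      // bytes between consecutive elements
};

struct VertexLayout
{
	std::vector<AttributeBinding> bindings;

	// Returns the binding for a component, or nullptr if the mesh lacks it.
	const AttributeBinding* find(Component c) const
	{
		for (const AttributeBinding& b : bindings)
			if (b.component == c) return &b;
		return nullptr;
	}
};

// Example: texcoord + color interleaved in static buffer 0 (stride 12),
// position + normal interleaved in dynamic buffer 1 (stride 24).
inline VertexLayout makeStaticDynamicLayout()
{
	return VertexLayout{{
		{Component::TexCoord0, 0, 0,  12},
		{Component::Color,     0, 8,  12},
		{Component::Position,  1, 0,  24},
		{Component::Normal,    1, 12, 24},
	}};
}
```

At draw time, the renderer would walk `bindings`, bind the buffer named by `bufferIndex`, and pass `offset` and `stride` to the attribute pointer call. Switching a mesh between interleaved and separate buffers then only changes the table, not the drawing code.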

Remember that OpenGL provides you with all these options because there is no globally-optimal choice on all hardware for all use cases. Particles, sprites, static meshes, dynamic geometry, skeletally-animated meshes, etc... all represent different optimization spaces, and you will most likely find that the best data regarding optimization is the data you gather yourself. I certainly would not trust anyone else's data to the point of architecting my graphics engine around it without verifying their findings locally first.

As for all current iPhone hardware, it is misleading to say that interleaved VBOs are "supposed to be faster." VBOs do not offer you any performance gain on the iPhone over normal vertex arrays, as they are not cached in video memory. Here is an analysis that breaks this down fairly clearly:

http://blackpixel.com/blog/399/iphone-vertex-buffer-object-performance/

VBOs aside, as I mentioned above, the performance gains for a given use case can't be reduced to a heuristic as simple as "always use interleaved VBOs". Also, the performance gains associated with interleaving are unlikely to be truly significant except in very specific situations.

I recommend the following page to anyone interested in getting off to a good start in learning about vertex specifications:

http://www.opengl.org/wiki/Vertex_Specification_Best_Practices

Do post your findings as you continue your experiments!




Will Vale

Level 4

Re: What is the fastest way to store VBO content?

« Reply #5 on: May 04, 2010, 03:41:00 PM »

One pattern I've seen quite often is splitting your vertex data into a "position only" stream and an "everything else" stream. This makes it easy to do cheaper rendering for shadow maps, depth prime, etc. since you can just bind the positions in that case.
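As a rough sketch of that split, assuming a simple vertex with position, normal and one texcoord set (all type and function names here are hypothetical):

```cpp
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

// Full interleaved vertex, as it might come out of a mesh loader.
struct FullVertex { Vec3 pos; Vec3 normal; Vec2 texcoord; };

// "Everything else" once positions are pulled out.
struct VertexAttribs { Vec3 normal; Vec2 texcoord; };

struct SplitStreams
{
	std::vector<Vec3>          positions; // bound alone for depth-only passes
	std::vector<VertexAttribs> attribs;   // bound as well for full shading
};

// Copies each vertex into the two streams, keeping indices aligned.
inline SplitStreams splitStreams(const std::vector<FullVertex>& verts)
{
	SplitStreams out;
	out.positions.reserve(verts.size());
	out.attribs.reserve(verts.size());
	for (const FullVertex& v : verts)
	{
		out.positions.push_back(v.pos);
		out.attribs.push_back({v.normal, v.texcoord});
	}
	return out;
}
```

Both streams keep the same vertex order, so one index buffer serves both the depth-only pass and the full pass.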

More generally, the hardware companies provide advice on this kind of thing: ATI/AMD and NVIDIA have optimisation guides on their developer sites. That'd be my first port of call. I seem to recall one common point was to fit your vertex size to a multiple of the IA cache line size (and keep it as small as possible using compressed components) to avoid overfetching.
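A minimal sketch of the compressed-components idea, assuming a 32-byte target size; the real figure depends on the GPU, so treat `kTargetVertexSize` as a placeholder to be checked against the vendor guide. `PackedVertex` and `packSnorm8` are hypothetical names, and dropping the binormal assumes it can be rebuilt in the shader from the normal and tangent.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Assumed target size, not taken from any vendor guide.
constexpr std::size_t kTargetVertexSize = 32;

struct PackedVertex
{
	float        pos[3];      // 12 bytes, full precision
	std::int8_t  normal[4];   //  4 bytes, signed normalized, w unused
	std::int8_t  tangent[4];  //  4 bytes, w could carry binormal handedness
	std::uint8_t color[4];    //  4 bytes
	float        texcoord[2]; //  8 bytes
};

static_assert(sizeof(PackedVertex) % kTargetVertexSize == 0,
              "vertex size should be a multiple of the target size");

// Maps a float in [-1, 1] to a signed byte in [-127, 127], clamping first.
inline std::int8_t packSnorm8(float v)
{
	if (v < -1.0f) v = -1.0f;
	if (v >  1.0f) v =  1.0f;
	return static_cast<std::int8_t>(std::lround(v * 127.0f));
}
```

In OpenGL the byte components would be declared with `GL_BYTE` and the normalized flag set to `GL_TRUE` in `glVertexAttribPointer`, so the shader still sees values in [-1, 1]. The result is 32 bytes per vertex rather than the 60 that an uncompressed layout with three full float vectors for normal, tangent and binormal would take.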

HTH,

Will


