Author Topic: Optimizing drawing speed [2D]  (Read 1376 times)
kamac
Level 10
*****


Notorious posts editor


View Profile Email
« on: August 31, 2012, 02:31:05 AM »

Hi there.

I want to draw a large number of sprites and/or colored rectangles on screen.
However, when I draw 4000 colored rectangles (with no rotation or translation applied, only projection * view), I get 50-60 FPS.

I've been told to use either batching or instancing, but are those available in OpenGL 2.1?
I am targeting that version, so whatever I use has to be compatible with it.

I am drawing my rectangles roughly like this (I am not at home right now, so I don't have access to my code; this is written from memory, but I'm pretty sure that's how it looks):

Code:
glUseProgram(basic_program);

float vertex_array[] =
{
x1, y1, z,
x2, y1, z,
x2, y2, z,
x1, y2, z
};
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(vertex_array), vertex_array, GL_DYNAMIC_DRAW);

float color_array[] =
{
r, g, b, a,
r, g, b, a,
r, g, b, a,
r, g, b, a
};
glBindBuffer(GL_ARRAY_BUFFER, color_buffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(color_array), color_array, GL_DYNAMIC_DRAW);

glm::mat4 MVP = Projection * View;
glUniformMatrix4fv(uniform_mvp, 1, GL_FALSE, &MVP[0][0]);

glEnableVertexAttribArray(attrib_vertices);
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
glVertexAttribPointer(attrib_vertices, 3, GL_FLOAT, GL_FALSE, 0, 0);
glEnableVertexAttribArray(attrib_color);
glBindBuffer(GL_ARRAY_BUFFER, color_buffer);
glVertexAttribPointer(attrib_color, 4, GL_FLOAT, GL_FALSE, 0, 0);

glDrawArrays(GL_QUADS, 0, 4);

glDisableVertexAttribArray(attrib_vertices);
glDisableVertexAttribArray(attrib_color);

As you can guess, calling that 4000 times per frame tends to slow things down.

Could anybody please show me an example, or point me in the right direction, of how to use instancing, batching, or pseudo-instancing? It would be good if I could use transformations and textures with it later on for sprites.


PS.
BONUS POINTS FOR SOLVING THIS ONE:
I am also drawing lines using the same method as above, but with glDrawArrays(GL_LINES, 0, 2); etc. It slows everything down so much that drawing about 10 lines with a thickness of 1 pixel gets me about 40 FPS.
The wider the line (the higher the thickness), the lower the FPS.

I don't get this performance drop with lines when I'm not using shaders for them.
(My shaders are very basic. They only output the color and calculate the correct position, which is basically:
Code:
gl_Position = MVP * vec4(coord3d, 1.0);
)

Cheers.

<cpt obvious>Btw. I am using C++.</cpt obvious>
« Last Edit: August 31, 2012, 02:40:34 AM by kamac » Logged

Nothing to do here
Xienen
Level 3
***


Greater Good Games


View Profile WWW
« Reply #1 on: August 31, 2012, 04:30:59 AM »

There are basically two high-level factors in render pipeline speed, as I know it: drawing speed (vertex count, texture size, shader complexity, etc.) and draw count (in your case, the number of calls to glDrawArrays). Of course there can be a "devil in the details" scenario, but these are the two main categories. In 2D rendering, drawing speed is pretty much never a problem, but draw count can quickly become a crippling factor. Someone else may have other ideas for you, but the way I intend to get around this limitation in my engine is to group quads together (either statically or possibly dynamically) by putting multiple sets of vertices into a single vertex buffer and combining the textures into a single, larger texture sheet. In addition or instead, I might use a more complex shader that samples from the proper bound texture for each quad, though I haven't worked out the full details of how I'd accomplish that (maybe using UV coordinates over 1.0 to denote which texture to use?).

As for the line rendering issue, it's definitely best to combine those into a single vertex buffer and use GL_LINE_STRIP (or keep GL_LINES if the segments aren't connected to each other, since a line strip joins consecutive vertices). Note, though, that rendering lines is slow on a lot of modern hardware (because it's basically deprecated functionality that's essentially emulated). Also, on one of my machines in the past, I didn't have the latest drivers (I had recently switched cards from NVIDIA to ATI) and it was running at 1/100th of its potential.
Logged

motorherp
Level 2
**



View Profile WWW Email
« Reply #2 on: August 31, 2012, 06:04:03 AM »

What he said. If you sort your sprites by texture and combine your images into as few texture atlases as possible, you'll be able to maximise how many sprites you can combine into the same vertex buffer and minimise your draw calls. Once they are sorted into separate lists by texture, you might also want to sort each list by z depth, with closer sprites at the start and farther sprites at the end, which will reduce overdraw. If you have semi-transparent sprites, though, you'll have to sort them the opposite way to get alpha blending to work correctly, so you might want to create separate lists for opaque and transparent sprites.
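
A rough sketch of that sorting, assuming a hypothetical `Sprite` record with a texture ID, a depth where a smaller value means closer to the camera, and an opacity flag. None of these names come from the thread; they are only for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sprite record; the field names are assumptions.
struct Sprite {
    unsigned texture;  // ID of the texture (atlas) the sprite samples from
    float    depth;    // assumption: smaller value = closer to the camera
    bool     opaque;   // false for semi-transparent sprites
};

// Opaque sprites: group by texture so each group can share one draw call,
// then order front to back inside a group to reduce overdraw.
inline void sortOpaque(std::vector<Sprite>& sprites) {
    std::sort(sprites.begin(), sprites.end(),
              [](const Sprite& a, const Sprite& b) {
                  if (a.texture != b.texture) return a.texture < b.texture;
                  return a.depth < b.depth;
              });
}

// Transparent sprites: back to front, so alpha blending composites correctly.
inline void sortTransparent(std::vector<Sprite>& sprites) {
    std::sort(sprites.begin(), sprites.end(),
              [](const Sprite& a, const Sprite& b) {
                  return a.depth > b.depth;
              });
}
```

Sorting the transparent list purely by depth can interleave textures and split batches, which is one more reason to pack sprites into as few atlases as possible.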
Logged

kamac
« Reply #3 on: August 31, 2012, 06:13:58 AM »

This is good advice, I guess, though I wonder how I would merge two textures into one?
Logged

motorherp
« Reply #4 on: August 31, 2012, 06:18:37 AM »

This is good advice, I guess, though I wonder how I would merge two textures into one?

Open them both in Photoshop, position them next to each other, click save. ;)
Logged

kamac
« Reply #5 on: August 31, 2012, 06:23:42 AM »

Oh, you mean that.

Well, anyway, I am having a hard time detecting whether one rect can be merged with another...
I am not convinced it's the best way, though.
Logged

Schrompf
Level 2
**

Always one mistake ahead...


View Profile WWW
« Reply #6 on: August 31, 2012, 06:36:55 AM »

It is the best way, and IMO the only way. Group all your sprites per topic on a texture. Google for "texture atlas", there are tools available that also spit out suitable texture coords.
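
As a small illustration of the texture coordinates an atlas tool would hand back, here is a sketch that turns a sprite's pixel rectangle into normalised UVs. `AtlasRegion` and `atlasUVs` are hypothetical names, and whether v runs top-down or bottom-up depends on how the image was loaded, so treat the orientation as an assumption.

```cpp
#include <array>

// Hypothetical pixel rectangle of one sprite inside the atlas image.
struct AtlasRegion {
    int x, y;           // top-left corner in pixels (orientation is an assumption)
    int width, height;  // size in pixels
};

// Returns four (u, v) pairs in the same corner order the thread uses for
// quad vertices: (x1,y1), (x2,y1), (x2,y2), (x1,y2).
inline std::array<float, 8> atlasUVs(const AtlasRegion& r,
                                     int atlasWidth, int atlasHeight) {
    const float u1 = static_cast<float>(r.x) / atlasWidth;
    const float v1 = static_cast<float>(r.y) / atlasHeight;
    const float u2 = static_cast<float>(r.x + r.width) / atlasWidth;
    const float v2 = static_cast<float>(r.y + r.height) / atlasHeight;
    return {{u1, v1, u2, v1, u2, v2, u1, v2}};
}
```

These eight floats would go into a third per-vertex array next to positions and colors, so every sprite in the batch can use the same bound texture.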

Then, while drawing, batch multiple sprites together into a single vertex array and draw them in one call. That's the "batching".

I personally solved this by writing a set of classes. One combines all rectangular graphics you hand to it into a texture, the other collects draw calls and uploads them to a dynamic vertex buffer when you're done with it. Uses instancing and rotation/scaling in a vertex shader to minimize the amount of dynamic data. I get approx. 20 million sprites per second with this approach.
Logged

Let's Splatter it and then see if it still moves.
kamac
« Reply #7 on: August 31, 2012, 06:40:48 AM »

Let's say I am using rectangles for now.

It's only colored rectangles. I know I can put their vertices into one vector, and do the same with the colors, but how would I draw them all in one batch? Giving OpenGL all these vertices in one draw would make the spaces between my rectangles disappear (since they'd be merged).

An example would be great.
Logged

motorherp
« Reply #8 on: August 31, 2012, 06:44:59 AM »

Not all vertices in a vertex buffer need to be attached to each other. Since your draw mode is set to quads, each group of 4 verts in your vertex buffer will be interpreted as a separate quad; the spaces in between the quads aren't automatically filled in.
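
To make that concrete, here is a CPU-side sketch of a batch that collects independent quads into one position array and one color array. `QuadBatch`, `Color` and `addRect` are hypothetical names, and the OpenGL calls are only shown in comments because they need a live context.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical RGBA color; the name is an assumption.
struct Color { float r, g, b, a; };

// Collects many rectangles so they can be uploaded and drawn in one call.
struct QuadBatch {
    std::vector<float> positions;  // 3 floats (x, y, z) per vertex
    std::vector<float> colors;     // 4 floats (r, g, b, a) per vertex

    // Appends one rectangle as 4 vertices; quads stay independent because
    // GL_QUADS consumes vertices in groups of four.
    void addRect(float x1, float y1, float x2, float y2, float z,
                 const Color& c) {
        const float corners[4][2] = {{x1, y1}, {x2, y1}, {x2, y2}, {x1, y2}};
        for (const auto& corner : corners) {
            positions.push_back(corner[0]);
            positions.push_back(corner[1]);
            positions.push_back(z);
            colors.push_back(c.r);
            colors.push_back(c.g);
            colors.push_back(c.b);
            colors.push_back(c.a);
        }
    }

    std::size_t vertexCount() const { return positions.size() / 3; }
    std::size_t positionBytes() const { return positions.size() * sizeof(float); }
    std::size_t colorBytes() const { return colors.size() * sizeof(float); }

    void clear() {
        positions.clear();
        colors.clear();
    }
};

// Once per frame, with the buffers bound as in the thread's code:
//   glBufferData(GL_ARRAY_BUFFER, batch.positionBytes(),
//                batch.positions.data(), GL_STREAM_DRAW);
//   glBufferData(GL_ARRAY_BUFFER, batch.colorBytes(),
//                batch.colors.data(), GL_STREAM_DRAW);
//   glDrawArrays(GL_QUADS, 0, (GLsizei)batch.vertexCount());
```

Two rectangles added this way give 8 vertices, and the gap between them stays empty because no primitive spans vertices 3 and 4.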
Logged

kamac
« Reply #9 on: August 31, 2012, 06:54:12 AM »

Well, I can't really get it to work.

Here's how it looks at the moment.

Code:

namespace _BASICSHAPES_VARS
{
ShaderClass *shader;
static GLuint vertex_buffer;
static GLuint color_buffer;
float current_thickness;
std::vector<float> temp_vertices;
std::vector<float> temp_colors;
}

void BasicShapes::DrawRect(float x1, float y1, float x2, float y2, float z, RGB rgb, bool filled = false)
{
_BASICSHAPES_VARS::temp_vertices.push_back(x1);
_BASICSHAPES_VARS::temp_vertices.push_back(y1);
_BASICSHAPES_VARS::temp_vertices.push_back(z);
_BASICSHAPES_VARS::temp_vertices.push_back(x2);
_BASICSHAPES_VARS::temp_vertices.push_back(y1);
_BASICSHAPES_VARS::temp_vertices.push_back(z);
_BASICSHAPES_VARS::temp_vertices.push_back(x2);
_BASICSHAPES_VARS::temp_vertices.push_back(y2);
_BASICSHAPES_VARS::temp_vertices.push_back(z);
_BASICSHAPES_VARS::temp_vertices.push_back(x1);
_BASICSHAPES_VARS::temp_vertices.push_back(y2);
_BASICSHAPES_VARS::temp_vertices.push_back(z);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.r);_BASICSHAPES_VARS::temp_colors.push_back(rgb.r);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.g);_BASICSHAPES_VARS::temp_colors.push_back(rgb.g);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.b);_BASICSHAPES_VARS::temp_colors.push_back(rgb.b);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.a);_BASICSHAPES_VARS::temp_colors.push_back(rgb.a);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.r);_BASICSHAPES_VARS::temp_colors.push_back(rgb.r);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.g);_BASICSHAPES_VARS::temp_colors.push_back(rgb.g);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.b);_BASICSHAPES_VARS::temp_colors.push_back(rgb.b);
_BASICSHAPES_VARS::temp_colors.push_back(rgb.a);_BASICSHAPES_VARS::temp_colors.push_back(rgb.a);
}

void BasicShapes::DrawAll()
{
glUseProgram(_BASICSHAPES_VARS::shader->ProgramID);
glBindBuffer(GL_ARRAY_BUFFER,_BASICSHAPES_VARS::vertex_buffer);
glBufferData(GL_ARRAY_BUFFER,sizeof(_BASICSHAPES_VARS::temp_vertices.data()),
_BASICSHAPES_VARS::temp_vertices.data(),GL_DYNAMIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER,_BASICSHAPES_VARS::color_buffer);
glBufferData(GL_ARRAY_BUFFER,sizeof(_BASICSHAPES_VARS::temp_colors.data()),
_BASICSHAPES_VARS::temp_colors.data(),GL_DYNAMIC_DRAW);
glm::mat4 MVP = GLP::Projection * GLP::View;
glUniformMatrix4fv(_BASICSHAPES_VARS::shader->uniform_MVP,1,GL_FALSE,&MVP[0][0]);

glEnableVertexAttribArray(_BASICSHAPES_VARS::shader->attrib_coord3d);
glBindBuffer(GL_ARRAY_BUFFER,_BASICSHAPES_VARS::vertex_buffer);
glVertexAttribPointer(_BASICSHAPES_VARS::shader->attrib_coord3d,3,GL_FLOAT,GL_FALSE,0,0);
glEnableVertexAttribArray(_BASICSHAPES_VARS::shader->attrib_color);
glBindBuffer(GL_ARRAY_BUFFER,_BASICSHAPES_VARS::color_buffer);
glVertexAttribPointer(_BASICSHAPES_VARS::shader->attrib_color,4,GL_FLOAT,GL_FALSE,0,0);
glDrawArrays(GL_QUADS,0,_BASICSHAPES_VARS::temp_vertices.size()/3);
glDisableVertexAttribArray(_BASICSHAPES_VARS::shader->attrib_coord3d);
glDisableVertexAttribArray(_BASICSHAPES_VARS::shader->attrib_color);
_BASICSHAPES_VARS::temp_vertices.clear();
_BASICSHAPES_VARS::temp_colors.clear();
}

I get some weird shapes and weird colors instead, though.
Logged

motorherp
« Reply #10 on: August 31, 2012, 07:13:36 AM »

It's been a long time since I used raw OpenGL like this, so I might be wrong, but I think I see two potential issues. For a start, it looks like you're building your colours array incorrectly: you're pushing two red values next to each other, then two green values, etc.

Secondly I think this is wrong:

glBufferData(GL_ARRAY_BUFFER,sizeof(_BASICSHAPES_VARS::temp_vertices.data()),
      _BASICSHAPES_VARS::temp_vertices.data(),GL_DYNAMIC_DRAW);

Here you're going to be returning the size of the data pointer, i.e. the size of the memory address, not the size of the data itself. Use something like this instead:

glBufferData(GL_ARRAY_BUFFER,sizeof(float) * _BASICSHAPES_VARS::temp_vertices.size(),
      _BASICSHAPES_VARS::temp_vertices.data(),GL_DYNAMIC_DRAW);
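
The difference between the two size expressions can be checked without any OpenGL at all. This is a small sketch with hypothetical helper names (`byteSize`, `pointerSize`) that are not part of the thread's code.

```cpp
#include <cstddef>
#include <vector>

// Number of bytes occupied by the floats in the vector: the value
// glBufferData expects as its size argument.
inline std::size_t byteSize(const std::vector<float>& v) {
    return v.size() * sizeof(float);
}

// What the original call measured: the size of the pointer returned by
// data(), which does not depend on how many floats are stored.
inline std::size_t pointerSize(const std::vector<float>& v) {
    return sizeof(v.data());
}
```

For two rectangles (24 position floats) `byteSize` gives `24 * sizeof(float)`, while `pointerSize` is only the size of one pointer, so only the first float or two would have been uploaded.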
Logged

kamac
« Reply #11 on: August 31, 2012, 07:17:36 AM »

Incredible!

1024 rectangles at 800 FPS!

Oh my god oh my god oh my god!
Thanks a lot!

PS.

10,000 rectangles at 250 FPS.
20,000 rectangles at 135 FPS.
40,000 rectangles at 70 FPS.

(AMD Athlon 64 X2 2.71 GHz dual-core processor, 2 GB RAM and an ATI Radeon HD series card.)

I'll have to battle with sprites later on.
« Last Edit: August 31, 2012, 07:25:18 AM by kamac » Logged

ThemsAllTook
Moderator
Level 10
******


Alex Diener


View Profile WWW
« Reply #12 on: August 31, 2012, 10:54:04 AM »

One small tweak you might want to make: It looks like you want GL_STREAM_DRAW instead of GL_DYNAMIC_DRAW. I was confused about the difference between these two for a while, but as I understand it, it goes something like this:

  • Use GL_STATIC_DRAW if you're calling glDrawElements (or glDrawArrays, in your case) many times per call to glBufferData, without modifying the contents of the buffer (using glBufferSubData or glMapBuffer).
  • Use GL_STREAM_DRAW if you're calling glDrawElements once or twice per call to glBufferData, without modifying the contents of the buffer.
  • Use GL_DYNAMIC_DRAW if you're calling glDrawElements many times per call to glBufferData, and will be modifying the contents of the buffer between draws.

Dunno if this will make a measurable difference in performance in your case, but it should at least be more semantically correct.
Logged
kamac
« Reply #13 on: August 31, 2012, 11:11:06 AM »

Hm. Changing it to GL_STREAM_DRAW doesn't change a thing.
But I'll keep it, I guess.
Logged

kamac
« Reply #14 on: September 02, 2012, 05:56:26 AM »

A related question.

I am now trying to batch sprites, but I don't really know how I can pass many MVPs to the shader.
Currently I've added this:

Code:
attribute mat4 MVP;

to the shader, but I am unsure how to fill it.
Currently I do it this way:

Code:
glEnableVertexAttribArray(shader->attrib_MVP);
glBindBuffer(GL_ARRAY_BUFFER,MVP_buffer);
glVertexAttribPointer(shader->attrib_MVP,1,GL_FLOAT,GL_FALSE,0,0);

And that's how I fill MVP_buffer:

Code:
std::vector<glm::mat4> transformation;
for(int n=0; n<spritesToDraw.size(); n++)
{
transformation.push_back(spritesToDraw[n].MVP);
}
glBindBuffer(GL_ARRAY_BUFFER,MVP_buffer);
glBufferData(GL_ARRAY_BUFFER,sizeof(float)*transformation.size(),transformation.data(),GL_DYNAMIC_DRAW);

Not sure I can do it this way... Or can I?

Or should I rather calculate MVP * position on the CPU?
Logged

Ludophonic
Level 2
**


View Profile WWW
« Reply #15 on: September 02, 2012, 08:13:02 AM »

You're trying to pass one MVP matrix for every four vertices right? You can do that but not in OpenGL 2.1.

You need OpenGL 3.3 or the ARB_instanced_arrays extension. Then you could call glVertexAttribDivisor to set it up.
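
For reference, a mat4 vertex attribute occupies four consecutive attribute locations, one vec4 per column, so it cannot be described by a single glVertexAttribPointer call with a size of 1. The sketch below only computes the per-column layout; `ColumnLayout` and `mat4AttribLayout` are hypothetical names, and the GL calls are shown in comments since they need a context that exposes instancing.

```cpp
#include <array>
#include <cstddef>

// Layout of one vec4 column of a mat4 vertex attribute.
struct ColumnLayout {
    unsigned    location;    // attribute location of this column
    int         components;  // always 4 (one vec4)
    std::size_t stride;      // bytes between consecutive matrices
    std::size_t offset;      // byte offset of this column inside a matrix
};

// Computes the four column layouts for a tightly packed array of 4x4
// float matrices whose attribute starts at `baseLocation`.
inline std::array<ColumnLayout, 4> mat4AttribLayout(unsigned baseLocation) {
    const std::size_t columnBytes = 4 * sizeof(float);
    const std::size_t matrixBytes = 4 * columnBytes;
    std::array<ColumnLayout, 4> cols{};
    for (unsigned i = 0; i < 4; ++i) {
        cols[i] = ColumnLayout{baseLocation + i, 4, matrixBytes,
                               i * columnBytes};
    }
    return cols;
}

// With OpenGL 3.3, each entry c would be applied as:
//   glEnableVertexAttribArray(c.location);
//   glVertexAttribPointer(c.location, c.components, GL_FLOAT, GL_FALSE,
//                         (GLsizei)c.stride, (const void*)c.offset);
//   glVertexAttribDivisor(c.location, 1);  // advance once per instance
// and one quad drawn per sprite with:
//   glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, spriteCount);
// With the ARB extensions the entry points carry an ARB suffix.
```

The matrix buffer itself would need `transformation.size() * sizeof(glm::mat4)` bytes, i.e. 16 floats per sprite.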
 
Logged
kamac
« Reply #16 on: September 02, 2012, 08:52:18 AM »

Hm, so what's my solution for OpenGL 2.1?
Should I calculate MVP * position for every vertex on the CPU (from C++)?
Logged

Ludophonic
« Reply #17 on: September 02, 2012, 09:53:57 AM »

You have several options. Some will depend on whether or not you need rotation or scaling.

  • Calculate MVP * position on the CPU
  • Calculate Modelview * position on the CPU. Do projection on GPU.
  • If you don't need rotation or scaling, store the translation in each of the four vertices as an attribute. Do transforms on GPU.
  • If you only need rotation around one axis, do the above and store the rotation as the w value in the attribute.
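
As an illustration of the CPU-side options, here is a sketch that applies a per-sprite translation and z-axis rotation before the vertices go into the batch, leaving a single shared projection * view matrix for the shader. `SpriteTransform` and `appendTransformedQuad` are hypothetical names, and plain floats are used instead of glm to keep the example self-contained.

```cpp
#include <cmath>
#include <vector>

// Hypothetical per-sprite transform; the field names are assumptions.
struct SpriteTransform {
    float x, y;         // position of the sprite centre
    float rotation;     // rotation around the z axis, in radians
    float halfWidth;    // half extents; scaling can be folded in here
    float halfHeight;
};

// Rotates and translates the four corners on the CPU and appends them
// (x, y, z per vertex) to `out`, in the corner order used in the thread.
inline void appendTransformedQuad(const SpriteTransform& t, float z,
                                  std::vector<float>& out) {
    const float c = std::cos(t.rotation);
    const float s = std::sin(t.rotation);
    const float local[4][2] = {
        {-t.halfWidth, -t.halfHeight}, { t.halfWidth, -t.halfHeight},
        { t.halfWidth,  t.halfHeight}, {-t.halfWidth,  t.halfHeight}};
    for (const auto& p : local) {
        out.push_back(t.x + p[0] * c - p[1] * s);
        out.push_back(t.y + p[0] * s + p[1] * c);
        out.push_back(z);
    }
}
```

This costs a few multiplications per vertex on the CPU, but every sprite can then share one uniform matrix and one draw call.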

What OS/hardware are you targeting btw? Most things that support OpenGL 2.1 also support the ARB_instanced_arrays extension.
« Last Edit: September 02, 2012, 10:03:08 AM by Ludophonic » Logged
kamac
« Reply #18 on: September 02, 2012, 10:03:04 AM »

Thanks, these are useful tips!
I'll be sure to check them out in a sec. ;)
Logged

Polly
Level 4
****


View Profile
« Reply #19 on: September 02, 2012, 11:45:24 AM »

Hm, so what's my solution for OpenGL 2.1?

You could add an identifier attribute and pass the matrices as a uniform array.
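
One way this might look on the C++ side is sketched below. Since a uniform array has a fixed, limited size, sprites would be drawn in chunks. `splitIntoBatches`, `matrixIndices` and the array size are assumptions for illustration; the real limit should be queried with GL_MAX_VERTEX_UNIFORM_COMPONENTS.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A contiguous run of sprites drawn with one uniform array upload.
struct BatchRange {
    std::size_t first;  // index of the first sprite in the run
    std::size_t count;  // number of sprites in the run
};

// Splits `spriteCount` sprites into runs of at most `maxMatrices`, the
// assumed size of the shader's uniform mat4 array.
inline std::vector<BatchRange> splitIntoBatches(std::size_t spriteCount,
                                                std::size_t maxMatrices) {
    std::vector<BatchRange> batches;
    if (maxMatrices == 0) return batches;
    for (std::size_t first = 0; first < spriteCount; first += maxMatrices) {
        batches.push_back({first, std::min(maxMatrices, spriteCount - first)});
    }
    return batches;
}

// Per-vertex identifier: all four vertices of a quad carry the index of
// their sprite's matrix within the run (a float, as GLSL 1.20 requires).
inline std::vector<float> matrixIndices(std::size_t spritesInBatch) {
    std::vector<float> ids;
    ids.reserve(spritesInBatch * 4);
    for (std::size_t s = 0; s < spritesInBatch; ++s) {
        for (int v = 0; v < 4; ++v) ids.push_back(static_cast<float>(s));
    }
    return ids;
}

// Matching GLSL 1.20 vertex shader lines (array size 32 is an assumption):
//   uniform mat4 MVPs[32];
//   attribute float matrix_id;
//   gl_Position = MVPs[int(matrix_id)] * vec4(coord3d, 1.0);
// Per run: glUniformMatrix4fv(location, count, GL_FALSE, firstMatrixPtr);
```

This trades one draw call per sprite for one draw call per chunk, so it only pays off when the array can hold a reasonable number of matrices.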

However, it's pretty hard to give optimization suggestions without knowing what you're actually trying to do / working on.
Logged