Author Topic: OpenGL thread  (Read 2540 times)
Glaiel-Gamer
Guest
« on: March 27, 2009, 08:41:09 PM »

I'm gonna consolidate all my OpenGL questions in here from now on.



This one is slightly faster for large numbers of sprites, but slightly slower for small numbers (40 = small, 400 = large):

Code:
void Texture::bind(){
    glBindTexture(GL_TEXTURE_2D, tex);
}

as opposed to this one:
Code:
void Texture::bind(){
  if(cbound != this){
    glBindTexture(GL_TEXTURE_2D, tex);
    cbound = this;
  }
}


Anyway, as of now I'm not sorting sprites by the texture they use.

Any suggestions? Should I splice all the textures into one massively large texture and not worry about rebinding them? Or will deliberately sorting the order in which textures are swapped solve most lag problems?

(stress test = rendering 400 animated sprites, then doing a 4 pass post process shader)
« Last Edit: March 31, 2009, 04:17:37 PM by Glaiel-Gamer » Logged
Glaiel-Gamer
Guest
« Reply #1 on: March 27, 2009, 09:24:32 PM »

keeping them on the same texture actually slows it down for some reason
Logged
nihilocrat
Level 10
*****


Full of stars.


« Reply #2 on: March 28, 2009, 10:48:28 AM »

The guy who made the Canvas plugin for OGRE3D uses an "Atlas", where he uses some sort of packing algorithm to store all the sprites needed onto a single texture, and simply slices out the regions he needs for each particular sprite. He also does something to alleviate the need for material-switching, keeping batch count to one, but that's probably a higher-level OGRE-specific thing.

I'm also really confused why the one-texture approach isn't faster. I think the best way of optimizing OpenGL is to reduce communication between the CPU and GPU as much as possible, but I'm not a guru or anything. That means vertex buffers, batching, etc.
Logged

Glaiel-Gamer
Guest
« Reply #3 on: March 28, 2009, 12:01:18 PM »

Well, as of now I can set up my texture objects to have a "crop" rectangle which maps 0-1 onto that crop area, so it wouldn't be too much of a stretch to have each "texture" reference the same underlying texture, and it would propagate up through the rest of the program.

But it seems useless if it isn't gonna actually speed up much.
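
The crop-rectangle idea above can be sketched roughly like this. It is only an illustration of the 0-1 remapping, not the actual engine code: the `UV` and `CropRect` types and both function names are made up for the example.

```cpp
#include <cassert>

// Hypothetical types for illustration; not taken from the original engine.
struct UV { float u, v; };

// Sub-rectangle of an atlas, in normalized atlas coordinates (0..1).
struct CropRect { float u0, v0, u1, v1; };

// Map a sprite-local coordinate (0..1 across the sprite) into the atlas.
UV toAtlas(const CropRect& r, UV local) {
    assert(r.u1 >= r.u0 && r.v1 >= r.v0);
    return { r.u0 + local.u * (r.u1 - r.u0),
             r.v0 + local.v * (r.v1 - r.v0) };
}

// Build a crop rectangle from a pixel region inside an atlas of the given size.
CropRect cropFromPixels(int x, int y, int w, int h, int atlasW, int atlasH) {
    assert(atlasW > 0 && atlasH > 0);
    return { float(x) / atlasW, float(y) / atlasH,
             float(x + w) / atlasW, float(y + h) / atlasH };
}
```

With a 256x256 atlas, a 64x64 sprite at pixel (64, 0) gets the crop rectangle (0.25, 0) to (0.5, 0.25), and the sprite's own corner coordinates 0 and 1 land on those edges. Whether this ends up faster than separate textures still has to be measured.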
Logged
Saint
Level 3
***



« Reply #4 on: March 28, 2009, 12:16:37 PM »

keeping them on the same texture actually slows it down for some reason

If you have one very large texture instead of several small ones, you might have to do more fetches because the memory isn't laid out in a way that's optimized for the texture cache. The card will have to access memory several times and copy parts of the texture to the cache, whereas it might be able to fit an entire smaller texture in the cache and so only need to copy once. Accessing memory takes time, so this is likely why you see a slowdown.

Sorting the draw calls by texture is a good idea. Trying to keep your state changes to a minimum will also help.
Logged
Glaiel-Gamer
Guest
« Reply #5 on: March 28, 2009, 12:59:48 PM »

If I was sorting by texture, I'd have to enable the z-buffer to keep my layering correct. Is using the z-buffer slower or faster than switching texture states more times than necessary?

If I bind an already bound texture, is there overhead in that or should I manually check for that?
Logged
Saint
Level 3
***



« Reply #6 on: March 28, 2009, 01:14:49 PM »

If I was sorting by texture, I'd have to enable the z-buffer to keep my layering correct. Is using the z-buffer slower or faster than switching texture states more times than necessary?

If I bind an already bound texture, is there overhead in that or should I manually check for that?

Using the Z buffer is likely somewhat slower since that's a per-pixel test, and you also need to push an additional coordinate for each vertex resulting in lower cache efficiency. It depends on your GPU and drivers though, as most of these things do.

Yes, there is an overhead for binding an already bound texture, but at the same time there's an overhead for checking for it so it depends a lot on how often you expect the test to fail. This is also highly dependent on drivers. I would suggest doing as you have already done and simply test it with the content you will be using.
Logged
Snakey
Level 2
**


« Reply #7 on: March 28, 2009, 01:23:50 PM »

It depends on how you are drawing your actual sprites. There are three primary methods of drawing quads onto the screen for example.

1. Sending GL commands
2. Precompiling GL commands into display lists (call lists), then calling those lists each frame
3. Using extensions

Method 1 is the slowest but the most flexible. It's slow because you're sending the commands to the GPU every frame, so the GPU has no real chance of caching anything. Adopting an atlas method within this method is pretty easy.

Method 2 is quite a lot faster than method 1 since you're precompiling the commands into a display list on the GPU and then just invoking it with glCallList. Adopting an atlas method is a bit harder since you can't modify a display list easily (rebuilding the list every frame is sort of pointless).

Method 3 is pretty much as fast as you're going to get since you are then dealing directly with memory I/O, particularly if you're using vertex buffer objects with or without shaders. Vertex buffer objects with vertex shaders can adopt the atlas method really easily, provided the GPU has said extensions.

It is preferable to batch as much as you can rather than doing a lot of pointless checking. For example, if the renderer wanted to render five sprites using texture binds 1, 3, 1, 1, 5, it is much better to batch them as 1, 1, 1, 3, 5. Since you're batching, there's no need to check whether a texture is already bound, and you'll never rebind the same texture ID again. A simplistic check won't stop the renderer from rebinding the same texture when the binds follow alternating or random patterns. Batching will.
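
To make the 1, 3, 1, 1, 5 example concrete, here is a small sketch that only counts binds and makes no GL calls. The function names are invented for the example, and `countBinds` models the "skip if already bound" check from the first post.

```cpp
#include <algorithm>
#include <vector>

// Count how many bind calls a draw order would issue if a bind is skipped
// whenever that texture is already the current one.
int countBinds(const std::vector<unsigned>& textureIds) {
    int binds = 0;
    bool haveBound = false;
    unsigned bound = 0;
    for (unsigned id : textureIds) {
        if (!haveBound || id != bound) {
            ++binds;           // stand-in for glBindTexture(GL_TEXTURE_2D, id)
            bound = id;
            haveBound = true;
        }
    }
    return binds;
}

// Group draws that share a texture by sorting on the texture id.
// Assumption: draw order does not matter for these sprites (no overlap,
// or layering is handled some other way).
std::vector<unsigned> batchByTexture(std::vector<unsigned> textureIds) {
    std::stable_sort(textureIds.begin(), textureIds.end());
    return textureIds;
}
```

For the order 1, 3, 1, 1, 5 the check alone still issues 4 binds, while the batched order 1, 1, 1, 3, 5 issues 3. An alternating order like 1, 2, 1, 2 gets no help from the check at all (4 binds), but batching brings it down to 2.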

At the end of the day, reducing the number of GL commands is the best optimization. Texture binding isn't the frame killer it used to be, but even so, batching plus a reduction in GL commands is going to be the best thing you can do.

As for the z-buffer, since that's primarily handled by the GPU, it's hardware accelerated ... so I doubt it has any performance impact at all.
Logged

I like turtles.
Glaiel-Gamer
Guest
« Reply #8 on: March 31, 2009, 04:18:05 PM »

Anyone know the math behind GL_LINEAR for magnification? Also some links on how to use glReadPixels properly would be appreciated.
Logged
Glaiel-Gamer
Guest
« Reply #9 on: March 31, 2009, 06:58:45 PM »

Anyone know the math behind GL_LINEAR for magnification? Also some links on how to use glReadPixels properly would be appreciated.

Solved
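
For anyone who lands here with the same question: GL_LINEAR magnification is a weighted average of the four texels nearest the sample point (bilinear interpolation). Below is a minimal CPU-side sketch, assuming a single-channel texture and clamp-to-edge wrapping. The `Image` struct and `sampleLinear` are made up for the example, and real hardware may compute the weights at lower precision.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Single-channel texture stored row by row; a hypothetical container.
struct Image {
    int width, height;
    std::vector<float> texels;

    // Fetch a texel, clamping the indices to the image (clamp-to-edge).
    float at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return texels[y * width + x];
    }
};

// Sample at normalized coordinates (s, t) in 0..1 with bilinear filtering.
float sampleLinear(const Image& img, float s, float t) {
    assert(img.width > 0 && img.height > 0);
    // Shift by half a texel so that texel centers sit at integer positions.
    float u = s * img.width - 0.5f;
    float v = t * img.height - 0.5f;
    int i0 = static_cast<int>(std::floor(u));
    int j0 = static_cast<int>(std::floor(v));
    float a = u - i0;   // horizontal blend weight, 0..1
    float b = v - j0;   // vertical blend weight, 0..1
    float row0 = (1.0f - a) * img.at(i0, j0)     + a * img.at(i0 + 1, j0);
    float row1 = (1.0f - a) * img.at(i0, j0 + 1) + a * img.at(i0 + 1, j0 + 1);
    return (1.0f - b) * row0 + b * row1;
}
```

Sampling exactly at a texel center returns that texel unchanged, and sampling halfway between two centers returns their average.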

Anyone have a good 2-pass bloom shader? The one I have isn't really very good
Logged
Glaiel-Gamer
Guest
« Reply #10 on: March 31, 2009, 08:44:31 PM »

Anyone know the math behind GL_LINEAR for magnification? Also some links on how to use glReadPixels properly would be appreciated.

Solved

Anyone have a good 2-pass bloom shader? The one I have isn't really very good

Solved it :D

Logged
mcc
Level 10
*****


glitch


« Reply #11 on: March 31, 2009, 09:44:02 PM »

I'm not sure what's happening in that screenshot, but whatever it is it's attractive
Logged

My projects:
Games: Jumpman Retro-futuristic platforming iJumpman iPhone version Drumcircle PC+smartphone music toy
More: RUN HELLO
havchr
Level 2
**

gaming, coding, experimenting photo science party


« Reply #12 on: April 01, 2009, 05:23:00 AM »

Here are my performance tips:

 - Figure out what is outside of your view (do this step fast)
   and don't send it to OpenGL to render.

 - Sort your list of things to render front-to-back for early-z cull.
   In my engine I have a "layer number", which describes the overall rendering order.
   This allows me to render a set of transparent sprites at the very end, so they blend.

 - Only calculate new positions in a scene graph if you need to.

 - Use Vertex Buffer Objects (search for OpenGL VBO) or display lists.
   If you are not using those for rendering, you are lazy and your code runs slow :)

 - Batch draw calls together. Drawing a million boxes with a million draw calls is slow; drawing a million boxes with one draw call is fast. If you want the boxes to animate independently, store an ID per vertex and use that in the vertex shader to move each of your million boxes around.
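
The first two tips can be combined into one pass that builds the frame's draw list. This is a rough sketch rather than code from any real engine: the `Sprite` and `Rect` types are invented, and it assumes sprites within the same layer can be drawn in any order, so they can be grouped by texture to save binds.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical sprite record for illustration.
struct Sprite {
    float x, y, w, h;   // axis-aligned bounds in world space
    int layer;          // lower layers are drawn first
    unsigned texture;   // texture id used by this sprite
};

struct Rect { float x, y, w, h; };

// True if the sprite's bounds intersect the view rectangle.
bool overlaps(const Sprite& s, const Rect& view) {
    return s.x < view.x + view.w && s.x + s.w > view.x &&
           s.y < view.y + view.h && s.y + s.h > view.y;
}

// Drop off-screen sprites, then order the rest by layer and, within a
// layer, by texture so that draws sharing a texture are adjacent.
std::vector<Sprite> buildDrawList(const std::vector<Sprite>& all,
                                  const Rect& view) {
    std::vector<Sprite> visible;
    for (const Sprite& s : all) {
        if (overlaps(s, view)) visible.push_back(s);
    }
    std::stable_sort(visible.begin(), visible.end(),
        [](const Sprite& a, const Sprite& b) {
            return std::tie(a.layer, a.texture) < std::tie(b.layer, b.texture);
        });
    return visible;
}
```

Sorting by layer first keeps the layering correct without a z-buffer, and the stable sort keeps the original order for sprites that share both layer and texture.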
 
Logged

Pizza is delicious.
Oddball
Level 10
*****


David Williamson


« Reply #13 on: April 01, 2009, 02:52:46 PM »

- Use Vertex Buffer Objects (search for opengl VBO) or display lists
   if you are not using that for rendering, you are lazy and code runs slow :)
In my experience VBOs are slower than vertex arrays on systems with shared video memory. Not an issue if you're aiming at the hardcore market, but most casual gamers have integrated graphics with shared memory. Just something to consider.
Logged
