Author Topic: Generating MipMaps on the GPU, GLES2 iOS  (Read 1646 times)
PompiPompi
« Reply #15 on: March 17, 2013, 09:30:58 PM »

Ok thanks.
Well, you are still bound by the GLView's frame rate, so your test still doesn't isolate glGenerateMipmap.
Perhaps a more plausible test would be to create and fill as many textures of the same size as possible, sleep a bit, and then measure the overall time it takes to call glGenerateMipmap on all of them.
Assuming they didn't "optimize" glGenerateMipmap to run only when the texture is actually used, it might give a better result.
If they are lazy, maybe simply binding the texture would make it generate the mipmaps.

It seems odd that it would use a GPU implementation for square textures and a CPU one for non-square textures, because that would defeat the purpose of making the effort to optimize for square textures. It's like leaving the job half done; it doesn't help much.
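
The batch test proposed above could be structured roughly like this. It is only a sketch: `generate` and `finish` are hypothetical callables, assumed here to wrap a bind plus `glGenerateMipmap(GL_TEXTURE_2D)` and a blocking `glFinish()` respectively. Whether a given driver defers mipmap work in a way that escapes this measurement is not something the thread establishes.

```python
import time


def time_batch(textures, generate, finish):
    """Time generate(tex) over all textures, bracketed by blocking finish() calls.

    generate and finish are hypothetical stand-ins (assumptions, not a real API):
    generate(tex) would bind tex and call glGenerateMipmap, finish() would call glFinish.
    """
    finish()  # drain previously queued work before starting the clock
    start = time.perf_counter()
    for tex in textures:
        generate(tex)
    finish()  # block until the queued work has completed
    return time.perf_counter() - start
```

Without the final blocking call, the timer could stop while the work is still queued, so the measured time would say little about the mipmap generation itself.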

Master of all trades.
ham and brie
« Reply #16 on: March 17, 2013, 11:33:31 PM »

Quote
Well, you are still bound by the GLView's frame rate, so your test still doesn't isolate glGenerateMipmap.
This doesn't make sense. You can comment out glGenerateMipmap() and see how that changes the work done. What you are looking for is whether the call to glGenerateMipmap() adds to the GPU utilisation or the CPU time. This test program does answer the question of whether the work is being done by the GPU or CPU.
Quote
Perhaps a more plausible test would be to create and fill as many textures of the same size as possible, sleep a bit, and then measure the overall time it takes to call glGenerateMipmap on all of them.
That isn't really any more "plausible" and it makes it a bit harder to see whether the work was done by the GPU.
Quote
It seems odd that it would use a GPU implementation for square textures and a CPU one for non-square textures, because that would defeat the purpose of making the effort to optimize for square textures. It's like leaving the job half done; it doesn't help much.
This sort of thing is common when you are using a GPU, which is why it occurred to me to try non-square. Some things the hardware will do quickly. Some things it won't. It makes mipmap generation for square textures many times faster, so it's not true to say it doesn't help much. Where it's important that glGenerateMipmap() is fast, you can make the texture square.
PompiPompi
« Reply #17 on: March 18, 2013, 12:09:39 AM »

You see, the thing is... when you draw to the GLView and it is bound to 60 FPS, the CPU time will increase because the CPU is stalled, waiting for the GPU to sync to 60 FPS!
ham and brie
« Reply #18 on: March 18, 2013, 02:33:45 AM »

Though the screen update loop uses a bit of CPU work, when the CPU has nothing to do, the vast majority of the time between frames does not count as CPU time.

Anyway, the main point of the test is to see whether the GPU is doing the work. Given the way the profiler's GPU measurements work, it's better to give the GPU work each frame than to give it just a single lump.
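
The distinction between waiting and working can be illustrated outside of OpenGL. The sketch below is an analogy, not the test program from this thread: `time.sleep` is a hypothetical stand-in for a thread blocked until the next frame, and how any particular profiler attributes time is an assumption.

```python
import time


def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def wait_for_vsync_stand_in():
    # Hypothetical stand-in: sleeping models a thread blocked until the next frame.
    time.sleep(0.05)


def busy_work():
    # Real computation: this consumes CPU time as well as wall-clock time.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


if __name__ == "__main__":
    print("waiting:", measure(wait_for_vsync_stand_in))
    print("working:", measure(busy_work))
```

`time.process_time()` excludes time spent sleeping, so the blocked case adds wall-clock time without adding comparable CPU time, while the busy loop adds to both.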
PompiPompi
« Reply #19 on: March 18, 2013, 11:05:51 AM »

You measure the time the CPU waits for the GPU, not the actual time the CPU works. What is so hard to understand?
powly
« Reply #20 on: March 18, 2013, 11:15:41 AM »

I've generated mipmaps by hand in a fragment shader by just rendering from the previous level to the next one. I never tested whether it's faster than the reference implementation, since I also manipulated some extra data on each level and had to do it anyway.

Ham and brie would seem to be correct: there's no way the CPU can handle 2048x2048 smoothly and then stall on 512x256. Though you shouldn't rely on one specific implementation, as they can change - just do it by hand if you want to be sure it's GPU accelerated.
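
For reference, here is a CPU-side sketch of what "rendering from the previous level to the next one" computes. It is not powly's shader: it assumes a row-major grayscale image and a 2x2 box average, which is what a bilinear sample taken at the centre of each destination texel gives when every level is exactly half the size of the previous one. The function names are made up for illustration.

```python
def mip_chain_sizes(width, height):
    """Return [(w, h), ...] from level 0 down to 1x1, halving each level."""
    sizes = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        sizes.append((width, height))
    return sizes


def downsample_box(pixels, width, height):
    """Average each 2x2 block of a row-major grayscale image into one texel.

    Source coordinates are clamped, so a dimension that has already reached 1
    (as happens with non-square textures) needs no special case.
    """
    new_w, new_h = max(1, width // 2), max(1, height // 2)
    out = []
    for y in range(new_h):
        y0 = min(2 * y, height - 1)
        y1 = min(2 * y + 1, height - 1)
        for x in range(new_w):
            x0 = min(2 * x, width - 1)
            x1 = min(2 * x + 1, width - 1)
            total = (pixels[y0 * width + x0] + pixels[y0 * width + x1]
                     + pixels[y1 * width + x0] + pixels[y1 * width + x1])
            out.append(total / 4.0)
    return out, new_w, new_h
```

A 512x256 texture has 10 levels, and the last step goes from 2x1 to 1x1, which is where the clamping matters. On the GPU, each call to `downsample_box` would correspond to one render pass into the next mip level.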
PompiPompi
« Reply #21 on: March 18, 2013, 11:46:27 AM »

But there is a lot more going on than generating the mipmaps when drawing to the screen is involved.
Drawing to the screen is bound to a refresh rate, even on a mobile phone. The code shows he hints for 60 frames per second. His results also show that at first the frame rate is 60 FPS. This means the CPU is stalled because it's waiting for the screen rendering to sync to 60 FPS, not because the GPU is busy generating the mipmaps.

Yeah, generating the next level from the previous level with simple bilinear filtering is what I had already thought of doing. I just kept on arguing against what seems to be pseudo-logic.

In other words, the stall with the 512x256 texture might be due to syncing with 60 FPS and unrelated to generating the mipmaps.
ham and brie
« Reply #22 on: March 18, 2013, 11:52:19 AM »

OK, you've convinced me: you really are too thick to understand this and I ought to give up.
PompiPompi
« Reply #23 on: March 18, 2013, 11:59:51 AM »

Why is it so hard for you to do the same "experiment" without involving drawing to the screen? You would think that if you want to isolate generating mipmaps you would not add drawing to a screen which is also being synced to 60 FPS. Where the hell is the logic in that?
Or does measuring this without drawing to the screen not give you the desired results, so you say that it's wrong?
ThemsAllTook
Moderator
Alex Diener


« Reply #24 on: March 18, 2013, 12:15:06 PM »

Come on guys, no need to fight. If one of you is going to disagree with the other's conclusion, maybe you could bring data of your own that shows a different result?
PompiPompi
« Reply #25 on: March 18, 2013, 12:35:12 PM »

That would require me to think of all the possible optimizations, threading issues, and edge cases that might be inside the OpenGL driver, which is a black box for me.
If you bring data and claim that it proves something, I would also be happy to know how you think it is done under the hood, and to see ENOUGH data to prove the structure and logic of the OpenGL driver's implementation.

Maybe you are very smart and I am stupid, but I really can't tell from what you have shown me what the pipeline is and what happens inside the OpenGL driver.

Does generating the mipmaps always happen in glGenerateMipmap? Is it done lazily, only when required? Does the syncing to 60 FPS happen only in the function that presents to the screen? Or does the screen sync also happen inside glGenerateMipmap?
Are they triple buffering?
Does generating mipmaps make your drawing of the box wait in the queue until it finishes, like most other GPU operations?
Do you have answers to these questions?

Edit: My answer is, they could be doing anything inside their drivers, ffff if I know what.
Having slow small non-square textures makes generating mipmaps on the GPU mostly pointless, since I need both square and non-square.