Title: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 14, 2013, 11:26:10 AM

Is it possible to generate mipmaps faster than glGenerateMipmap on OpenGL ES 2 on iOS? For instance, by doing this in a fragment shader?
I assume glGenerateMipmap is computed on the CPU, since otherwise it might stall the GPU just to generate mipmaps, which is bad. I can just try it and see, but I know it will take time because the GL functions are poorly documented and OpenGL's C API has unclear side effects. Have you done this before?

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ThemsAllTook on March 14, 2013, 11:43:50 AM

Quote:
Is it possible to generate mipmaps faster than glGenerateMipmap on OpenGL ES 2 on iOS?

Pregenerating them for all of your textures and uploading each mipmap level at texture upload time might be faster.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 15, 2013, 03:21:19 AM

You are answering a question I did not ask, so it doesn't help me much.
Let's just say that I cannot do any offline calculations.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 15, 2013, 01:58:02 PM

Quote:
I assume glGenerateMipmap is computed on the CPU, since otherwise it might stall the GPU just to generate mipmaps, which is bad.

I'd actually hope it would be done on the GPU, especially if I were doing something like generating mipmaps for a texture created each frame by drawing to an FBO. Profiling on an iPad and an iPod touch, it does seem to be.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 16, 2013, 02:27:23 AM

It's very unlikely it's being done on the GPU.
If it's done on the GPU, it means OpenGL needs to either keep a texture permanently allocated for this kind of work or allocate a new texture each time glGenerateMipmap is called. It also means that OpenGL needs to compile shaders at some point in the program, which costs memory and CPU time. It also means you cannot generate mipmaps in parallel with other OpenGL operations that are being processed on the same thread. What makes you think it's being done on the GPU?

Edit: it is also possible that glGenerateMipmap runs on a separate thread but not on the GPU, which would make it look like it's on the GPU because the command doesn't block the thread you call it from.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: Polly on March 16, 2013, 03:39:14 AM

Useful article: http://www.g-truc.net/post-0256.html
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 16, 2013, 04:30:53 AM

Quote:
What makes you think it's being done on the GPU?

Because when profiling you can see CPU and GPU utilisation.

...It has just occurred to me to see what difference a non-square texture makes.

On my 3rd gen iPod touch (older device, supports OpenGL ES 2, single-core CPU), generating mipmaps for a 512x512 texture can be done at 60fps. GPU utilisation is around 44% and CPU is low. If I change it to 512x256 (area halved), it slows to less than 20fps, GPU utilisation is low, and the CPU is the clear bottleneck (nearly all time spent in glGenerateMipmap()).

Similar on my 3rd gen iPad: it can manage 2048x2048 at 60fps, but 512x256 is under 30, bottlenecked by the CPU.

So it looks like if the GPU can do the work, it is much faster at doing it (which is why I hoped it would be), but it won't do non-square textures.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 16, 2013, 07:51:45 AM

Err, I don't see from your "benchmark" that the mipmaps are generated on the GPU.
Are you just putting glGenerateMipmap inside a loop? If you put something inside a loop then you might get 100% CPU usage no matter what you put inside the loop, unless you are blocking. Getting 20 fps for half the texture instead of 60 fps sounds very wrong. Sounds like you are doing it wrong.

Polly: Except for one bullet point saying "hardware acceleration" I don't see any mention that it is actually done on the GPU.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 16, 2013, 09:31:46 AM

Quote:
Are you just putting glGenerateMipmap inside a loop?

I am using it once each frame. Each frame:
- I use a framebuffer object to clear the top layer of the texture to a colour which varies each frame
- I use glGenerateMipmap()
- I use the texture to colour a rotating cube

Quote:
Getting 20 fps for half the texture instead of 60 fps sounds very wrong. Sounds like you are doing it wrong.

That result makes good sense. What I saw from the profiler was consistent with the GPU taking the load of generating mipmaps when the texture is square. The GPU device utilisation is high, CPU load is fairly low. Changing the texture size (but keeping it square) affects the GPU load. If I try to use a 1024x1024 texture on my iPod touch, the framerate drops, GPU device utilisation is up to 100%, while CPU even goes down (because with fewer frames drawn it has even less to do).

When the texture is not square (but still power-of-two) then the load is clearly on the CPU, and the CPU is much slower at doing the work.

You seem to want to believe your guesswork and assumptions over someone who has got evidence from trying it in practice.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 16, 2013, 09:45:10 AM

Correlation does not imply causation.
Seems ridiculous that generating mipmaps on the GPU would only be possible for square textures. I could easily implement power-of-two GPU mipmap generation for both square and non-square.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 16, 2013, 10:23:32 AM

Quote:
Correlation does not imply causation.

Do you use that line to ignore any measurements that don't fit your guesswork? Do you do any profiling?

Quote:
Seems ridiculous that generating mipmaps on the GPU would only be possible for square textures. I could easily implement power-of-two GPU mipmap generation for both square and non-square.

It's not really a matter of whether it's easy to write in code. Restrictions such as something only working quickly in hardware for square textures are usually there to keep the chip simpler.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 16, 2013, 10:53:42 AM

I think you don't know what you are talking about. You measured a specific scenario that involves several things; it doesn't isolate glGenerateMipmap.
Your explanations of the cause from the very simple data you collected are misleading rationalizations. Anyway, whatever... I will keep looking for the answer myself.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: Columbo on March 16, 2013, 03:05:03 PM

Quote:
I think you don't know what you are talking about. You measured a specific scenario that involves several things; it doesn't isolate glGenerateMipmap. Your explanations of the cause from the very simple data you collected are misleading rationalizations. Anyway, whatever... I will keep looking for the answer myself.

Wow - that's a pretty rude response to someone who's taken the time to help with a question you've posted. Sounds to me like Ham and Brie at least knows to measure rather than assume, for one, and has some experience in interpreting profile results. If, with a non-square texture, he's seeing a load of CPU time in glGenerateMipmap, and with a square texture he's seeing little CPU time in glGenerateMipmap and a measurable increase in GPU utilization, then that's compelling evidence that some work is being done on the GPU.

Also, I'm not sure your reasoning for why a GPU implementation is so unlikely is sound. I don't see any obstacle to the driver allocating memory, compiling shaders and inserting the necessary items into the command list to generate the mipmaps. Any GPU stalls/flushes that it'll have to insert between the GPU mipmap generation and the first use of the texture by the GPU would be relatively small (remember this is a platform where shaders are compiled lazily, so avoiding stalls obviously isn't particularly high on the driver writers' priorities). Nor would I be that surprised if the driver writers only bothered handling the most common case and fall back to the CPU implementation for non-square textures.

Still, if you find any evidence to the contrary or if you find you can significantly beat glGenerateMipmap, that'd be interesting.
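
For reference, the amount of work in the mip chains being compared in this thread can be worked out with a little C. This is only a sketch of the arithmetic, assuming the usual rule that each level halves both sides and clamps each side at 1; the function names are made up for illustration, and nothing here says how the iOS driver actually does the work.

```c
#include <assert.h>

/* Number of levels in a full mip chain: halve each side (clamped at 1)
   until both sides are 1. Level 0 is counted. */
int mip_level_count(int width, int height)
{
    assert(width > 0 && height > 0);
    int levels = 1;
    while (width > 1 || height > 1) {
        if (width > 1)  width /= 2;
        if (height > 1) height /= 2;
        ++levels;
    }
    return levels;
}

/* Total texels in levels 1 and up, i.e. the texels a mipmap generator
   has to write. Level 0 is the source and is not counted. */
long generated_texels(int width, int height)
{
    assert(width > 0 && height > 0);
    long total = 0;
    while (width > 1 || height > 1) {
        if (width > 1)  width /= 2;
        if (height > 1) height /= 2;
        total += (long)width * (long)height;
    }
    return total;
}
```

Calling generated_texels(512, 512) and generated_texels(512, 256) shows the non-square chain has the same number of levels but roughly half the texels to fill in, so on texel count alone it should be the cheaper case.
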
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 17, 2013, 10:32:19 AM

I don't have any evidence because the OpenGL driver is a black box to me.
Unless I find documents online detailing the implementation of GLES2 on iOS, or unless I have access to the source code, I really can't tell that much. Your measurements are equivalent to judging which sports car is faster by measuring their engine revs in neutral.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 17, 2013, 04:21:27 PM

Not remotely an accurate analogy. I used profiling to answer the question of whether the load was on the GPU or CPU and show how the load on each varied with texture size. That is very much the kind of question the instrumentation is able to answer.
Perhaps it would be easier for you to believe if you could try it for yourself? I'm using Xcode 4.6.1.

Create a project using the iOS OpenGL Game template as the starting point. Use Mipmap as the class prefix. Replace the contents of the MipmapViewController.m file with:

Code:
#import "MipmapViewController.h"

@interface MipmapViewController () {}
@property (strong, nonatomic) EAGLContext *context;
@end

@implementation MipmapViewController

- (void)viewDidLoad
{
    [super viewDidLoad];

    self.context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    self.preferredFramesPerSecond = 60;

    GLKView *view = (GLKView *)self.view;
    view.context = self.context;
    [EAGLContext setCurrentContext:self.context];

    // Texture size under test; change the shifts to try other (e.g. non-square) sizes.
    const GLsizei texture_width = ( 1 << 10 );
    const GLsizei texture_height = ( 1 << 10 );

    // Allocate level 0 with no initial data and leave the texture bound.
    GLuint texture;
    glGenTextures(1, &texture);
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texture_width, texture_height, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
}

- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect
{
    // Regenerate the mip chain of the bound texture every frame, then clear the screen.
    glGenerateMipmap(GL_TEXTURE_2D);
    glClear(GL_COLOR_BUFFER_BIT);
}

@end

Since you felt I hadn't isolated glGenerateMipmap(), this is cut down to just calling glGenerateMipmap() and clearing the screen each frame.

You'll need a device attached to run on. Start profiling by pressing ⌘I and it should come up with a window that lets you choose OpenGL ES Driver from the Graphics section. The "Instruments" program should start.

Next to the chart for the "OpenGL ES Driver" there should be an "i" in a circle. Click it, then click the configure button. There should be a list of possible "statistics to list" to select from. Toggle on "Device Utilization %" and click done. Under "Statistics to Observe" there should now be a "Device Utilization %" to toggle on.
You can stop and restart profiling by clicking the red circle button.

The textual tables are more useful than the graphs. If you select the OpenGL ES Driver instrument you should see a table with Device Utilization % as one of the columns. If you select the Time Profiler instrument you can see in the call tree the CPU time used and break it down by calls. You can compare the CPU time with the time you've run for.

You can leave Instruments open, modify the code (such as changing the texture size) and then run the new version to profile it by pressing ⌘I. Instruments will keep measurements from previous runs so you can compare.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 17, 2013, 09:30:58 PM

Ok thanks.
Well, you are still bound by the GLKView's frame rate, so your test still doesn't isolate glGenerateMipmap. Perhaps a more plausible test would be to create and fill as many textures as possible of the same size, then sleep a bit, then measure the overall time it takes to perform glGenerateMipmap on all the textures. Assuming that they didn't "optimize" glGenerateMipmap to happen only when the texture is actually used, it might give a better result. Maybe simply binding the texture would make it generate the mipmaps in case they are lazy.

It seems odd that it would use a GPU implementation for square textures and the CPU for non-square, because that would defeat the purpose of making an effort to optimize for square textures. It's like leaving the job half done; it doesn't help much.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 17, 2013, 11:33:31 PM

Quote:
Well, you are still bound by the GLKView's frame rate, so your test still doesn't isolate glGenerateMipmap.

This doesn't make sense. You can comment out glGenerateMipmap() and see how that changes the work done. What you are looking for is whether the call to glGenerateMipmap() adds to the GPU utilisation or the CPU time. This test program does answer the question of whether the work is being done by the GPU or CPU.

Quote:
Perhaps a more plausible test would be to create and fill as many textures as possible of the same size, then sleep a bit, then measure the overall time it takes to perform glGenerateMipmap on all the textures.

That isn't really any more "plausible" and it makes it a bit harder to see whether the work was done by the GPU.

Quote:
It seems odd that it would use a GPU implementation for square textures and the CPU for non-square, because that would defeat the purpose of making an effort to optimize for square textures. It's like leaving the job half done; it doesn't help much.
This sort of thing is common when you are using a GPU, which is why it occurred to me to try non-square. Some things the hardware will do quickly. Some things it won't. It makes mipmap generation for square textures many times faster, so it's not true to say it doesn't help much. Where it's important that glGenerateMipmap() is fast, you can make the texture square.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 18, 2013, 12:09:39 AM

You see, the thing is... when you draw to a GLKView and cap it at 60 FPS, the CPU time will increase because it is stalled and waiting for the GPU to sync to 60 FPS!
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 18, 2013, 02:33:45 AM

Though there is a bit of CPU work used to do the screen update loop, when the CPU doesn't have work to do, the vast majority of the time between frames does not count as CPU time.
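
The difference between time spent waiting and time spent working can be seen on any POSIX system by reading two clocks around the same call. The following is a generic C sketch, not what Instruments does internally: time_call, blocked_wait and busy_work are hypothetical helpers, with a 50 ms sleep standing in for a thread blocked on vsync or the GPU and a spin loop standing in for real CPU work.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Wall-clock and CPU seconds consumed by one call. */
typedef struct {
    double wall_s;
    double cpu_s;
} timing;

/* Read a POSIX clock as seconds. */
static double seconds_now(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run fn once and report how much wall time and process CPU time it took. */
timing time_call(void (*fn)(void))
{
    double wall0 = seconds_now(CLOCK_MONOTONIC);
    double cpu0 = seconds_now(CLOCK_PROCESS_CPUTIME_ID);
    fn();
    timing t;
    t.wall_s = seconds_now(CLOCK_MONOTONIC) - wall0;
    t.cpu_s = seconds_now(CLOCK_PROCESS_CPUTIME_ID) - cpu0;
    return t;
}

/* Stand-in for a blocked thread: sleeps for 50 ms without using the CPU. */
void blocked_wait(void)
{
    struct timespec delay = { 0, 50 * 1000 * 1000 };
    nanosleep(&delay, NULL);
}

/* Stand-in for real work: spins until about 50 ms of CPU time is used. */
void busy_work(void)
{
    double start = seconds_now(CLOCK_PROCESS_CPUTIME_ID);
    while (seconds_now(CLOCK_PROCESS_CPUTIME_ID) - start < 0.05) {
        /* burn CPU */
    }
}
```

With these helpers, time_call(blocked_wait) reports about 50 ms of wall time and almost no CPU time, while time_call(busy_work) reports about 50 ms of both. A profiler column that counts CPU time behaves like the second number, which is why a thread that is merely waiting for the next frame does not show up as CPU load.
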
Anyway, the main point of the test is to see whether the GPU is doing the work. Given the way the profiler's GPU measurements work, it's better to give the GPU work each frame rather than try to give it just a single lump.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 18, 2013, 11:05:51 AM

You measure the time the CPU waits for the GPU, not the actual time the CPU works. What is so hard to understand?
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: powly on March 18, 2013, 11:15:41 AM

I've generated mipmaps by hand in a fragment shader by just rendering from the previous level to the next one. I never tested whether it's faster than the reference implementation, since I also manipulated some extra data on each level and had to do it anyway.
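
What one such pass computes can be written out on the CPU. The following C is a reference sketch only, not powly's shader: it assumes RGBA8 data, power-of-two sizes and a plain 2x2 box filter, which is what a bilinear fetch at the centre of each 2x2 block of the previous level would return. The function name downsample_rgba8 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Build the next mip level from the previous one by averaging 2x2 blocks.
   src holds src_w * src_h RGBA8 texels; dst must hold
   max(1, src_w / 2) * max(1, src_h / 2) RGBA8 texels. */
void downsample_rgba8(const uint8_t *src, int src_w, int src_h, uint8_t *dst)
{
    assert(src && dst && src_w > 0 && src_h > 0);
    int dst_w = src_w > 1 ? src_w / 2 : 1;
    int dst_h = src_h > 1 ? src_h / 2 : 1;

    for (int y = 0; y < dst_h; ++y) {
        for (int x = 0; x < dst_w; ++x) {
            /* Source footprint. When a side is already 1 texel (the tail
               of a non-square chain) the second row or column repeats
               the first, so only the other axis is averaged. */
            int x0 = 2 * x;
            int y0 = 2 * y;
            int x1 = (x0 + 1 < src_w) ? x0 + 1 : x0;
            int y1 = (y0 + 1 < src_h) ? y0 + 1 : y0;

            for (int c = 0; c < 4; ++c) {
                int sum = src[(y0 * src_w + x0) * 4 + c]
                        + src[(y0 * src_w + x1) * 4 + c]
                        + src[(y1 * src_w + x0) * 4 + c]
                        + src[(y1 * src_w + x1) * 4 + c];
                /* Rounded average of the four samples. */
                dst[(y * dst_w + x) * 4 + c] = (uint8_t)((sum + 2) / 4);
            }
        }
    }
}
```

Calling it repeatedly, feeding each output back in as the next input until the level is 1x1, gives the whole chain. The clamping of the footprint is the only place where square and non-square textures differ in this sketch.
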
Ham and brie would seem to be correct; there's no way the CPU can do 2048x2048 smoothly and then stall on 512x256. Though you shouldn't rely on one specific implementation, as they can change. Just do it by hand if you want to be sure it's GPU accelerated.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 18, 2013, 11:46:27 AM

But there is a lot more going on than generating the mipmaps when drawing to the screen is involved.
Drawing to the screen is bound to a refresh rate, even on a mobile phone. The code shows he requests 60 frames per second. His results also show that at first the frame rate is 60 FPS. This means the CPU is stalled because it's waiting for the screen rendering to sync to 60 FPS, not because the GPU is busy generating the mipmaps.

Yeah, generating the next level from the previous level simply with bilinear filtering is what I had already thought of doing. I just kept arguing against what seems to be pseudo-logic.

In other words, the stall with the 512x256 might be due to syncing with 60 FPS, not related to generating the mipmaps.

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ham and brie on March 18, 2013, 11:52:19 AM

OK, you've convinced me: you really are too thick to understand this and I ought to give up.
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 18, 2013, 11:59:51 AM

Why is it so hard for you to do the same "experiment" without involving drawing to the screen? You would think that if you want to isolate generating mipmaps you would not add drawing to a screen which is also being synced to 60 FPS. Where the hell is the logic in that?
Or does measuring this without drawing to the screen not give you the desired results, so you say that it's wrong?

Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: ThemsAllTook on March 18, 2013, 12:15:06 PM

Come on guys, no need to fight. If one of you is going to disagree with the other's conclusion, maybe you could bring data of your own that shows a different result?
Title: Re: Generating MipMaps on the GPU, GLES2 iOS
Post by: PompiPompi on March 18, 2013, 12:35:12 PM

That would require me to think of all the possible optimizations, threading issues, and edge cases that might be inside the OpenGL driver, which is a black box for me.
If you bring data and claim that it proves something, I would also be happy to know how you think it is done under the hood, and to see ENOUGH data to prove the structure and logic of the implementation of the OpenGL driver. Maybe you are very smart and I am stupid, but I really can't tell from what you have shown me what the pipeline is and what happens inside the OpenGL driver.

Does generating the mipmaps always happen in glGenerateMipmap? Is it done lazily, only when required? Does the syncing to 60 FPS happen only in the function that presents to the screen? Maybe the screen syncing also happens inside glGenerateMipmap? Are they triple buffering? Does generating mipmaps make your drawing of the box wait in the queue until it finishes, like most other GPU operations? Do you have answers to these questions?

Edit: My answer is, they could be doing anything inside their drivers; I have no idea what. Having small non-square textures be slow makes generating mipmaps on the GPU mostly pointless, since I need both square and non-square.