Author Topic: deferred vs. forward (with depth pass) debate  (Read 1754 times)
digitalgibs
« on: September 10, 2014, 12:46:30 PM »

Okay guys.  Maybe this is a solved problem, but I've mostly worked with forward rendering systems over the years, including ones that deal with real-time lighting (a la Doom 3).  The new hotness right now, however, seems to be that everyone is moving toward deferred rendering.

What's the big benefit?  I mean, fill rate is generally the hard argument, but if you are performing a depth pass first then technically we are still only processing visible fragments.  Forward rendering has a lot of triangle transform, clipping, and rasterization to perform multiple times but those are generally lightning fast on modern hardware.

I guess my question is, what are the big reasons for all the major engines moving to deferred rendering?

Krux
« Reply #1 on: September 10, 2014, 05:05:44 PM »

Deferred rendering allows you to render a lot of light sources quickly. It allows you to render small light sources only in the area where they have a notable effect. Forward rendering requires you to iterate over all light sources per fragment, and therefore you are limited to a very low number. But deferred has its disadvantages: multisample antialiasing doesn't work anymore, and transparency won't work.
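Roughly, the cost difference looks like this (a toy CPU-side C++ model with made-up names, just to illustrate the idea; real shading happens on the GPU):

[code]
#include <algorithm>
#include <vector>

struct Light { int cx, cy, radius; float intensity; };

// Forward-style: every shaded fragment iterates the full light list.
void forwardShade(std::vector<float>& fb, int w, int h,
                  const std::vector<Light>& lights) {
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            for (const Light& l : lights)
                if ((x - l.cx) * (x - l.cx) + (y - l.cy) * (y - l.cy)
                        < l.radius * l.radius)
                    fb[y * w + x] += l.intensity;
}

// Deferred-style: each light only touches the pixels its bounds cover.
void deferredShade(std::vector<float>& fb, int w, int h,
                   const std::vector<Light>& lights) {
    for (const Light& l : lights) {
        int x0 = std::max(0, l.cx - l.radius), x1 = std::min(w, l.cx + l.radius);
        int y0 = std::max(0, l.cy - l.radius), y1 = std::min(h, l.cy + l.radius);
        for (int y = y0; y < y1; ++y)
            for (int x = x0; x < x1; ++x)
                if ((x - l.cx) * (x - l.cx) + (y - l.cy) * (y - l.cy)
                        < l.radius * l.radius)
                    fb[y * w + x] += l.intensity;
    }
}
[/code]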

Geti
« Reply #2 on: September 10, 2014, 08:19:03 PM »

Why does MSAA "not work"? I mean I know it's fillrate limited but is it really that bad, or is there some other reason MSAA breaks?

JakobProgsch
« Reply #3 on: September 11, 2014, 04:52:44 AM »

Quote from: Geti
Why does MSAA "not work"? I mean I know it's fillrate limited but is it really that bad, or is there some other reason MSAA breaks?

MSAA is so fast (compared to straight-up supersampling) because it can skip expensive calculations inside polygons and only has to do more work on the edges. Deferred throws away that advantage... It does totally "work", though; you just have to write the correct MSAA resolves etc. in the render passes (so it doesn't work automatically by just enabling MSAA).
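Something like this is the shape of a custom resolve (a toy CPU-side C++ sketch with invented structures; a real renderer would do this in a resolve shader on the actual G-buffer formats):

[code]
constexpr int kSamples = 4;

struct GSample { float normalZ; float depth; };   // tiny stand-in G-buffer
using GPixel = GSample[kSamples];

float shade(const GSample& s) { return s.normalZ * (1.0f - s.depth); } // toy BRDF

float resolvePixel(const GPixel& px) {
    bool edge = false;                            // do the samples differ?
    for (int i = 1; i < kSamples && !edge; ++i)
        edge = (px[i].depth != px[0].depth) || (px[i].normalZ != px[0].normalZ);

    if (!edge)                                    // interior pixel: shade once
        return shade(px[0]);

    float sum = 0.0f;                             // edge pixel: shade every sample
    for (int i = 0; i < kSamples; ++i)
        sum += shade(px[i]);
    return sum / kSamples;                        // box-filter resolve
}
[/code]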

digitalgibs
« Reply #4 on: September 11, 2014, 06:40:35 AM »

Quote from: Krux
Forward rendering requires you to iterate over all light sources per fragment, and therefore you are limited to a very low number.

Well, you could write a shader for a single light and then multi-pass that (a la Doom 3) instead of an uber shader.  That would technically give the same result as deferred.  But I do see what you mean about small lights.  If I was rendering a dense chunk of mesh that was 50k triangles (like a character) but the light was only affecting the character's hand (like a fireball effect), then I'd have to transform, clip, and reject 50k triangles for the 200 triangles that make up his hand...  interesting.
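For reference, that Doom 3-style multipass setup looks roughly like this (a C++/OpenGL sketch; it assumes a live GL context, and drawScene()/setLightUniforms() are hypothetical helpers, not real API calls):

[code]
#include <GL/gl.h>

void drawScene();              // hypothetical: issues all scene draw calls
void setLightUniforms(int i);  // hypothetical: binds light i's parameters

void renderForwardMultipass(int numLights) {
    // Depth prepass: write depth only, no colour.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthFunc(GL_LESS);
    drawScene();

    // One additive pass per light over the resolved depth.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthFunc(GL_EQUAL);        // only already-visible fragments pass
    glDepthMask(GL_FALSE);        // don't re-write depth
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);  // accumulate light contributions
    for (int i = 0; i < numLights; ++i) {
        setLightUniforms(i);
        drawScene();              // the whole scene again, per light
    }
    glDisable(GL_BLEND);
    glDepthMask(GL_TRUE);
    glDepthFunc(GL_LESS);
}
[/code]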

I suppose the reason it's only recommended for high-end machines, though, is the extreme bandwidth cost.  It seems like there is definitely a point of diminishing returns for low-end PC or mobile.  But if the machine is low end then I guess we wouldn't be rendering dozens of tiny lights anyways =), it would be just a fill light and maybe a key light like the sun or moon.  For that kind of setup I can see benefits to just using forward rendering; otherwise it seems like using up a lot of frame buffers and copying pixels around will only cost more.

Geti
« Reply #5 on: September 11, 2014, 05:02:30 PM »

Thanks for clearing that up, Jakob, makes sense :)

Ashaman73
« Reply #6 on: September 12, 2014, 04:04:13 AM »

Quote from: digitalgibs
The new hotness right now, however, seems to be that everyone is moving toward deferred rendering.
Hot it may be, but it has already been in use for half a decade.

Quote from: digitalgibs
Well, you could write a shader for a single light and then multi-pass that (a la Doom 3) instead of an uber shader.
Theoretically it would be the same, but it would just destroy your performance.

In a forward renderer, even with a z-prepass, you need to render and blend the lit meshes so that every light that influences them is considered. So, e.g., if you have a scene with 100 lights that all influence your character, you need to render your character 100x (1 light per shader pass), or at least 10x if you handle 10 lights per shader pass, etc.

That is, the number of rendered triangles scales with the number of lights (passes); therefore, instead of rendering 1 million tris per frame, you suddenly render 10 to 100 million tris per frame. If you considered just bandwidth, you could argue that deferred rendering is really bandwidth hungry too.

But a more important disadvantage of forward rendering is that the old APIs (OpenGL, DirectX <12) have trouble with lots of draw calls (overhead). Calling the API 10000x per frame is a lot slower than calling it 1000x, even if the GPU ends up doing the same bandwidth/shader work! (This might change with newer approaches: Mantle, DX12, etc.)

Deferred rendering comes to the rescue because you render the tris only once to fill up the G-buffer, and afterwards you use simple (fullscreen) quads to render all the lights. That is, rendering the lights does not scale with the scene complexity.
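The overall flow is roughly this (a C++/OpenGL sketch; assumes a GL context, and all the helper functions are hypothetical stand-ins):

[code]
#include <GL/gl.h>

void bindGBufferTargets();     // hypothetical: FBO with albedo/normal/depth
void drawSceneGeometry();      // hypothetical: emits every scene triangle once
void bindGBufferTextures();    // hypothetical: same attachments, now sampled
void drawFullscreenQuad(int lightIndex);  // hypothetical: one lit quad per light

void renderDeferred(int numLights) {
    // Geometry pass: triangles are transformed and rasterized exactly once.
    bindGBufferTargets();
    drawSceneGeometry();

    // Lighting pass: cost scales with lights and covered pixels,
    // not with scene triangle count.
    bindGBufferTextures();
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);  // add each light's contribution
    for (int i = 0; i < numLights; ++i)
        drawFullscreenQuad(i);
    glDisable(GL_BLEND);
}
[/code]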

Nevertheless, forward rendering has a lot of benefits compared to deferred rendering (please, give me a simple way to render transparency :'( ), and most professional engines therefore use a hybrid approach. With newer APIs/approaches this might even shift back toward taking advantage of forward rendering more often.

Krux
« Reply #7 on: September 12, 2014, 05:01:55 AM »

More on MSAA: MSAA calculates multiple depth values per pixel, e.g. 4, so your depth buffer is 4 times bigger. On the edges of intersecting triangles, the portion of the pixel each triangle covers is calculated, and the resulting pixel is a blend of both fragments.

dhontecillas
« Reply #8 on: September 13, 2014, 05:41:06 AM »

You should check this article:

http://c0de517e.blogspot.ca/2014/09/notes-on-real-time-renderers.html

It is concise and goes straight to the pros and cons of every approach.

Columbo
« Reply #9 on: September 13, 2014, 10:33:53 PM »

In addition to the performance implications already mentioned, one nice feature of deferred rendering is the separation of concerns between drawing the geometry and lighting the geometry.

Rather than the number of shaders (or at least the number of code paths through your uber-shaders) being material modes × lighting modes, it becomes material modes + lighting modes, which addresses the combinatorial explosion of shaders quite nicely.
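For example (numbers invented purely for illustration): with 25 material variants and 8 lighting variants, the multiplied version needs up to 25 × 8 = 200 shader permutations, while the separated version needs only 25 + 8 = 33.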

dhontecillas
« Reply #10 on: September 14, 2014, 05:33:29 AM »

Yep, the forward render is also good to have for other effects, for example depth of field, or applying screen-space ambient occlusion (although that is considered part of the lighting process, no?), or applying other post-processing FX to the full image (edge detection, etc.).

Fallsburg
« Reply #11 on: September 14, 2014, 06:18:49 AM »

Why would you want forward rendering for depth of field or SSAO?  I guess if by forward rendering you mean not deferred rendering then I agree, but typically any of those types of post-processing are just done by ping-ponging between render targets and then blitting the final one to the screen.

dhontecillas
« Reply #12 on: September 14, 2014, 08:53:19 AM »

Ooops.. Sorry, I meant deferred rendering :-[ . When I talk about deferred rendering I always think about having all the data in separate buffers and being able to sample it to perform any modification based on that (usually normals; other things like specular and glossiness are only useful for lighting, I think).

But I see what you're saying: for some post-processing you don't care about normals (since you would have the Z-buffer in any kind of renderer), and so changing render targets would be enough.

By the way, for SSAO you actually need the normal, don't you? Or do you sample the neighbours in the Z-buffer to extract the normal? I had "problems" when sampling too much from textures (for a blur effect).
« Last Edit: September 14, 2014, 09:05:16 AM by dhontecillas »

Fallsburg
« Reply #13 on: September 15, 2014, 06:27:44 AM »

Ah, yes that makes more sense.

Most modern SSAO methods use the normal buffer, but the OG implementation in Crysis actually just got by on the depth buffer.  The problems with not using the normals are that you waste roughly half your samples on flattish surfaces by sampling behind them, and that convex corners are undersampled.
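For what it's worth, one way to get by on depth alone is to reconstruct a rough normal from depth differences, something like this (a toy C++ sketch with invented names; real code would do it in a shader with proper unprojection):

[code]
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Treat each texel as a point (x, y, depth) and take the cross product of the
// horizontal and vertical differences to approximate the surface normal.
Vec3 normalFromDepth(const std::vector<float>& depth, int w, int x, int y) {
    auto d = [&](int px, int py) { return depth[py * w + px]; };
    Vec3 dx = { 1.0f, 0.0f, d(x + 1, y) - d(x, y) };  // assumes x+1 in range
    Vec3 dy = { 0.0f, 1.0f, d(x, y + 1) - d(x, y) };  // assumes y+1 in range
    return normalize(cross(dx, dy));
}
[/code]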

Boreal
« Reply #14 on: September 17, 2014, 04:13:26 PM »

Well, the new high-tech fad has shifted from primitive deferred shading to tile-based deferred shading, where the fragment data is stored in linked lists to allow order-independent transparency.

I'm not sold on it yet but it's pretty neat.  In my game projects I'll probably stick to basic deferred rendering but moving over to a compute-based pipeline is quite intriguing.

Krux
« Reply #15 on: September 17, 2014, 04:38:31 PM »

Quote from: Boreal
Well, the new high-tech fad has shifted from primitive deferred shading to tile-based deferred shading, where the fragment data is stored in linked lists to allow order-independent transparency.

I am not sure what you mean here, but I am pretty sure that linked lists are not used at all in computer graphics; they are just too slow. Where did you get that information?

raigan
« Reply #16 on: September 17, 2014, 11:08:29 PM »

Quote from: Boreal
Well, the new high-tech fad has shifted from primitive deferred shading to tile-based deferred shading, where the fragment data is stored in linked lists to allow order-independent transparency.

Quote from: Krux
I am not sure what you mean here, but I am pretty sure that linked lists are not used at all in computer graphics; they are just too slow. Where did you get that information?

Nope, apparently GPU atomic ops are fast enough that you can actually do per-pixel linked lists on modern hardware and it's not insanely slow: http://www.cescg.org/CESCG-2011/papers/TUBudapest-Barta-Pal.pdf
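The core of the per-pixel linked list trick from that paper looks roughly like this (a CPU-side C++ sketch using std::atomic as a stand-in; on the GPU it's an atomic allocation counter plus image atomics on a head-pointer buffer):

[code]
#include <atomic>
#include <vector>

struct FragNode { float color, depth; int next; };

struct FragmentBuffer {
    std::vector<FragNode>         nodes;  // big preallocated node pool
    std::vector<std::atomic<int>> head;   // one head index per pixel, -1 = empty
    std::atomic<int>              count{0};

    FragmentBuffer(int pixels, int capacity) : nodes(capacity), head(pixels) {
        for (auto& h : head) h.store(-1);
    }

    // Called once per rasterized fragment, from any thread.
    void append(int pixel, float color, float depth) {
        int idx = count.fetch_add(1);            // atomic allocation
        if (idx >= (int)nodes.size()) return;    // pool exhausted: drop fragment
        nodes[idx].color = color;
        nodes[idx].depth = depth;
        // Atomically splice the node in as the new head of this pixel's list.
        nodes[idx].next = head[pixel].exchange(idx);
    }
};
// A later resolve pass walks each list, sorts by depth, and blends back to front.
[/code]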

Milkybar
« Reply #17 on: September 18, 2014, 04:52:14 AM »

Quote from: digitalgibs
What's the big benefit?  I mean, fill rate is generally the hard argument, but if you are performing a depth pass first then technically we are still only processing visible fragments.

In a forward rendering system you would likely end up processing the same visible fragment multiple times, so you would still end up consuming a lot more fill rate.

In forward rendering, a light's position and colour are usually going to be specified as uniform variables for a shader. This will then be used to render a mesh with appropriate lighting values calculated for each fragment. We then blend this additively into the back buffer to add the contribution of this light. So as a basic rule we will end up shading [number of fragments for a given mesh] * [number of lights affecting the mesh] fragments.

Depending on shader complexity for a surface we may be able to pass in the values for several lights in a single pass.

A major drawback to this approach is that small lights, or lights that only affect a small number of the visible fragments of a surface, are still going to perform the required lighting calculations on all the visible fragments of the mesh, even when most produce a zero result. Remember that lighting calculations are likely to include looking up several textures (depending on the lighting model), so they are not exactly cheap.

The speed of modern GPUs actually worsens this problem. As triangle rendering has become so fast, the overhead of making the draw call is appreciable. To overcome this we want to batch up geometry into a small number of meshes with many triangles each (we can even use atlas textures to push this further). The downside now is that a single mesh is probably quite large (spatially), meaning more lights are likely to be affecting the mesh, and the lights that are touching it only affect a small proportion of its visible fragments. The end result is that we are likely to end up redrawing visible fragments many times for lights that have zero contribution to those fragments.

Early fragment rejection tests can sometimes help alleviate some of this stress, but they depend on conditional statements within the shader, and the performance of these varies greatly between GPUs (overall it is pretty poor, though). Alternatively you could use the stencil buffer to mask out only the fragments affected by the light for that pass, as in the sketch below. Clearly you would have to do extra rendering and make more draw calls to maintain the stencil buffer.
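The stencil masking mentioned above would look roughly like this (a C++/OpenGL sketch; assumes a GL context with a stencil buffer, and both draw helpers are hypothetical):

[code]
#include <GL/gl.h>

void drawLightVolume(int light);  // hypothetical: e.g. a sphere around the light
void drawScene();                 // hypothetical: the lit geometry

void stencilMaskedLightPass(int light) {
    // Pass 1: mark the pixels the light's volume covers.
    glClear(GL_STENCIL_BUFFER_BIT);
    glEnable(GL_STENCIL_TEST);
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glStencilFunc(GL_ALWAYS, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
    drawLightVolume(light);

    // Pass 2: shade the scene, but only where the mask was set.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glStencilFunc(GL_EQUAL, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
    drawScene();
    glDisable(GL_STENCIL_TEST);
}
[/code]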


Deferred rendering overcomes these issues by rendering each light separately. This allows us to apply the lighting calculations only to the fragments that are actually affected by the light. It is also independent of scene geometry, meaning we are free to batch up meshes as much as we please.

The downside to deferred comes from the G-buffer. For rendering a single light, a forward rendering system can do it in a single pass, whereas deferred rendering requires us to first create our G-buffers and then render the light, requiring two passes. By using a custom lighting vertex format to store light volumes, and 3D textures to store shadow maps in slices, we can batch a large number of lights into a single draw call while still only requiring two passes, where forward rendering would need a great many more passes to accommodate all of the lights.


Overall there is no correct answer to whether forward or deferred is better. It depends on the makeup of your scene, including the structure of the geometry and the number/size of the lights. However, when trying to push graphics, complex geometry (batching) and handling large numbers of lights are two things you will probably want (for realistic rendering anyway), so it's no surprise that deferred rendering is becoming the "new hotness", as far as AAA development is concerned anyway.


One thing that does still confuse me, though, is why people think deferred rendering is bad for transparent objects. It is certainly a pain, but I don't see it as any more of a pain than transparent rendering in a forward rendering system.