TIGSource Forums » Developer » Technical (Moderator: ThemsAllTook) » Light simulation thing
Author Topic: Light simulation thing  (Read 9532 times)
BleakProspects
« on: February 04, 2013, 06:46:07 PM »

Getting derailed by this little experiment at the moment. I posted an early version of this 2D global illumination thing in "Beautiful fails". It's really basic: it just simulates photons as balls that bounce around and get their color multiplied by the color of whatever they hit. The original version was in Processing and was quite slow (about 1 minute to render a 256 x 256 image at good quality); here's a faster one I wrote in C#, which takes about 1-10 seconds for the same kind of render. I also added high dynamic range rendering + tone mapping to make things look a bit better.
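The bounce-and-tint rule described above can be sketched like this (a C++ sketch with hypothetical names, not the actual Processing/C# code):

```cpp
#include <array>

// Hypothetical minimal photon, as described: position, velocity, colour, alive flag.
struct Photon {
    float x, y, vx, vy;
    std::array<float, 3> color;   // accumulated tint, starts at the light's colour
    bool dead = false;
};

// One simulation step: move the photon; on hitting a surface, multiply its
// colour by the surface colour (absorption) and bounce it. `hit` stands in
// for a lookup into the collision data; the real bounce direction is random.
void step(Photon& p, bool hit, const std::array<float, 3>& surfaceColor) {
    p.x += p.vx;
    p.y += p.vy;
    if (hit) {
        for (int c = 0; c < 3; ++c) p.color[c] *= surfaceColor[c];
        p.vx = -p.vx;   // placeholder reflection; the original randomises
        p.vy = -p.vy;   // the outgoing velocity instead
    }
}
```

A photon that only ever hits red surfaces ends up carrying only red, which is where the color bleeding in the screenshots comes from.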

Here's a light strip:


And a point light:


With gravity:


Lights of several colors (higher contrast here):


Also supports loading in textures for collision data + materials


Adding these effects together can produce some surreal/beautiful results:


Could probably make it real-time if it were written on the GPU.
« Last Edit: February 04, 2013, 07:49:28 PM by BleakProspects »

ink.inc
« Reply #1 on: February 04, 2013, 08:00:20 PM »

Wow really cool!
BleakProspects
« Reply #2 on: February 04, 2013, 09:24:59 PM »

I have a pretty near real-time demo working now:

http://www.youtube.com/watch?v=ZaWXVK2Z5Sw&feature=youtu.be

Just had to lower the quality a bit and decrease the exposure time, plus split the photon calculations into 8 threads.
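Splitting the photon buffer across worker threads might look something like this (a hypothetical C++ sketch; the per-photon update is stubbed out):

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Divide the photon buffer into contiguous segments, one per worker thread.
// Each thread only touches its own segment, so no locking is needed here.
void updateAll(std::vector<int>& photons, int nThreads) {
    std::vector<std::thread> workers;
    const size_t chunk = (photons.size() + nThreads - 1) / nThreads;
    for (int t = 0; t < nThreads; ++t) {
        workers.emplace_back([&, t] {
            size_t begin = t * chunk;
            size_t end = std::min(begin + chunk, photons.size());
            for (size_t i = begin; i < end; ++i)
                photons[i] += 1;   // stand-in for the per-photon update
        });
    }
    for (auto& w : workers) w.join();
}
```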

Xienen
« Reply #3 on: February 04, 2013, 10:59:45 PM »

Wow, that's awesome! Any chance you're going to share some of the technical details of how it was accomplished? Possibly with some code snippets to illustrate? =)

BleakProspects
« Reply #4 on: February 04, 2013, 11:14:44 PM »

There are a few basic things:

1. A photon class (position, color, velocity, is dead)
2. A light class (photon buffer, emission rules)
3. A scene class (texture for occupancy grid + texture map + light buffer)
4. Each light has N threads, each assigned to a segment of the photon buffer. The threads just update each of their photons. A photon adds its own color to the light buffer. If a photon hits an obstacle in the occupancy grid, it gets a random velocity and its own color is multiplied by the texture color of the obstacle. If it leaves the screen, it is set to dead. If a thread sees that a photon is dead, it re-creates it with the light's emission rule.
5. The scene has a thread which is constantly doing post-processing on the light buffer. It determines the color of a pixel by multiplying the light buffer with the texture underneath. Then it does a global scaling operation and applies a gamma curve to keep everything in the range [0, 1]. This thread also "decays" the light buffer slightly, by multiplying it with a constant < 1 (if the mouse is moving the constant is lower; if not, it approaches 1).
6. The graphics card just dumbly reads the color buffer that's being modified by the post-processing thread.

Each of these steps is trivially made parallel, and I could imagine this would be 10,000x faster if I could figure out how to implement it on a GPU.
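Step 5 (global scaling + gamma + decay) could be sketched as follows; the gamma exponent and decay constant here are illustrative, not the values actually used:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Scale the HDR light buffer into [0, 1] by its peak value, apply a gamma
// curve, and decay the buffer toward black so the "exposure" fades over time.
std::vector<float> toneMap(std::vector<float>& light, float gamma, float decay) {
    float peak = 1e-6f;
    for (float v : light) peak = std::max(peak, v);      // global scaling pass
    std::vector<float> out(light.size());
    for (size_t i = 0; i < light.size(); ++i) {
        out[i] = std::pow(light[i] / peak, 1.0f / gamma); // now in [0, 1]
        light[i] *= decay;   // decay constant < 1, per the post
    }
    return out;
}
```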

Christian Knudsen
« Reply #5 on: February 05, 2013, 12:31:26 AM »

Looks really cool! Though you can see some of the photon trails in shadow areas.
_Tommo_
« Reply #6 on: February 05, 2013, 02:30:23 AM »

This is completely awesome!
What about reflective surfaces that bounce the "photons" in a focused direction? Or refractive surfaces that bend the rays! Or things that have to be activated by mixing a color with RGB lights!
I can see an awesome puzzler in there.

Needs to be realtime though!

Schrompf
« Reply #7 on: February 05, 2013, 04:36:50 AM »

This looks astonishing! The single-colour lights break it, though... looking at that picture I wondered whether you do gamma correction, but I assume you already handle that since you're doing HDR and tone mapping.

I was pondering whether this could be done on the GPU, for example by sampling mip map cascades. I haven't found a way to get such sharp silhouettes that way, though - mip mapping would spread the colours equally in all directions, while you'd want to bounce them strictly along the wall normal. Interesting topic.
JigxorAndy
« Reply #8 on: February 05, 2013, 04:48:47 AM »

That looks super pretty! Even if it runs really slow, you could use it when a level loads as some kind of baked lighting in a game.
_Tommo_
« Reply #9 on: February 05, 2013, 06:33:34 AM »

I had to experiment with this tech as it's too cool... and currently it runs realtime in FullHD!

I set up 8 threads, each one double-buffered, that simulate particles one by one, accumulating the results in their private backbuffer.
The main thread then round-robins at 60 fps between them, stopping a different one each frame, swapping its back/front buffers, and uploading the new front buffer to a texture.
Then the textures of all the threads are added and averaged using 8 alpha-blended quads layered one on top of the other.
So it runs at 60 fps, with a change being fully "updated" after 8 frames, which looks quite acceptable because it adds some nice motion blur too.

I think I'll try to use (moving) rects for the collision next, to try and add some physics puzzles.
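The round-robin part of this scheme, reduced to a toy model (a C++ sketch with hypothetical names; the synchronisation with the actual worker threads is omitted):

```cpp
#include <utility>
#include <vector>

// Each worker owns a front buffer (read by the renderer) and a back buffer
// (written by the simulation). Every frame the main loop picks the next
// worker in turn, swaps its buffers, and re-uploads that one texture, so
// after N frames every worker's results have been published once.
struct Worker {
    std::vector<float> front, back;
};

int roundRobinFrame(std::vector<Worker>& workers, int frame) {
    int idx = frame % static_cast<int>(workers.size());
    std::swap(workers[idx].front, workers[idx].back);  // publish new results
    return idx;   // which worker's texture to upload this frame
}
```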

BleakProspects
« Reply #10 on: February 05, 2013, 09:25:55 AM »

Quote from: _Tommo_ on February 05, 2013, 06:33:34 AM

Dude. Screenshots.


Quote from: Schrompf on February 05, 2013, 04:36:50 AM

I think the GPU version might be possible using just a custom vertex shader (where each vertex is a photon); the photons then write into a buffer somewhere on the GPU. The pixel shader then does the color correction + tone mapping on that buffer in a second draw call which just draws a fullscreen quad. Not sure how efficient writing texture memory would be, though. It would definitely be efficient in a CUDA-like environment.

_Tommo_
« Reply #11 on: February 05, 2013, 10:42:34 AM »

Quote from: BleakProspects: "Dude. Screenshots."

Uhm, it's not like it's really working right now...?
It can cast 1 million particles per second, but it needs to be more correct...
lights have a strange "clover" shape, and if a thread is "early" on the swap it will write twice over part of its assigned rays, because it starts again.
But screenshots will come, eventually.

Goran
« Reply #12 on: February 05, 2013, 02:29:25 PM »

Looking forward to any screenshots you post!

Pineapple
« Reply #13 on: February 05, 2013, 04:07:48 PM »

Quote from: BleakProspects:
Then it does a global scaling operation and applies a gamma curve to keep everything in the range [0, 1]. This thread also "decays" the light buffer slightly.

Sounds like this could be a major bottleneck.
powly
« Reply #14 on: February 05, 2013, 04:09:41 PM »

_Madk, what would? If I understand correctly what he meant, applying the gamma curve is both pretty trivial and fast, and decaying is just a ping-pong render - nothing a semi-decent GPU can't handle.

Quote from: BleakProspects on February 05, 2013, 09:25:55 AM

Oh, you don't need to bring CUDA into this to make it fun and fast! Rendering your particles as alpha blended lines into a texture with FBOs will be pretty much as fast as accumulating the light can get. The biggest problem would probably be finding intersections with the scene - you could go with distance fields, some octree solution or just move your light particles slowly enough so you don't run into problems with too much texture sampling. With a more elegant ray intersection system you could possibly remove the particle thing altogether and replace it with some form of path tracing, possibly with a system to detect and reduce noise and reuse old results when there are no big changes in lighting.
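The distance-field option can be sketched as classic sphere tracing (a hypothetical C++ sketch against a one-circle scene; the real scene would sample a precomputed distance texture):

```cpp
#include <cmath>

// Signed distance to a circle: negative inside, positive outside.
// The circle here is just an example scene standing in for the real SDF.
float circleSdf(float x, float y, float cx, float cy, float r) {
    return std::hypot(x - cx, y - cy) - r;
}

// March a photon along its (normalised) direction by the distance to the
// nearest surface each step, so it can never tunnel through geometry.
// Returns the distance travelled before hitting the circle, or >= maxDist.
float march(float x, float y, float dx, float dy, float maxDist) {
    float t = 0.0f;
    while (t < maxDist) {
        float d = circleSdf(x + dx * t, y + dy * t, 5.0f, 0.0f, 1.0f);
        if (d < 1e-4f) break;   // close enough: treat as a hit
        t += d;                 // safe step: nothing is nearer than d
    }
    return t;
}
```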

Just rendering with particles larger than pixel size would probably be sufficient, as you can render a great lot of them nowadays. I might give this a try later on, it seems a lot more solvable with current hardware than the 3D alternative I've been thinking about too much lately.
eigenbom
« Reply #15 on: February 05, 2013, 04:23:39 PM »

bewidful!

Pineapple
« Reply #16 on: February 05, 2013, 05:09:01 PM »

Quote from: powly on February 05, 2013, 04:09:41 PM

I would imagine it involves looping through each pixel (to find the brightest one) and that sort of thing can be demanding. Also, isn't it being done on the CPU as it currently stands?
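For what it's worth, the brightest-pixel search is a single linear pass over the buffer (a sketch, names hypothetical); whether that is a bottleneck depends on what else is running each frame:

```cpp
#include <vector>

// Find the peak value of the light buffer in one O(N) scan. For a 256x256
// float buffer that is 65536 reads per frame - cheap on its own, but it does
// touch every pixel, which is presumably what the concern is about.
float peakLuminance(const std::vector<float>& buffer) {
    float peak = 0.0f;
    for (float v : buffer)
        if (v > peak) peak = v;
    return peak;
}
```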
_Tommo_
« Reply #17 on: February 05, 2013, 05:52:27 PM »

Uhm, I don't think the GPU would be very good for the simulation part...
a particle involves colliding with a bitmap while rasterizing a line (e.g. with Bresenham), and that requires a lot of non-local reads/writes, which can really clog the GPU.
But being able to write directly to the VRAM texture probably offsets any advantage the CPU would have, so it would probably end up faster even on bad GPUs.

Another problem is that with a really big number of threads, the chance that two or more threads write to the same pixel increases, so += operations (read, increment, write) are going to be wrong all over the place (say, read1, read2, increment1, increment2, write1, write2, or worse)... so you either have to accept the noise or use atomic increment operations, which are slow as hell and would probably make everything slower than the CPU.
On the CPU this isn't a problem, because with a reasonable number of threads you can just allocate a backbuffer for each thread.
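The per-thread backbuffer approach, reduced to a minimal C++ sketch (names and workload hypothetical): each thread accumulates into a private buffer, and a single merge pass afterwards sums them, so no += ever races and no atomics are needed.

```cpp
#include <thread>
#include <vector>

// Each thread deposits into its own private buffer; a single-threaded merge
// pass at the end sums all the buffers into one. No shared writes anywhere.
std::vector<int> accumulate(int nThreads, int bufSize, int depositsPerThread) {
    std::vector<std::vector<int>> local(nThreads, std::vector<int>(bufSize, 0));
    std::vector<std::thread> workers;
    for (int t = 0; t < nThreads; ++t)
        workers.emplace_back([&, t] {
            for (int i = 0; i < depositsPerThread; ++i)
                local[t][i % bufSize] += 1;   // private buffer: no race
        });
    for (auto& w : workers) w.join();
    std::vector<int> merged(bufSize, 0);      // merge pass
    for (auto& buf : local)
        for (int i = 0; i < bufSize; ++i) merged[i] += buf[i];
    return merged;
}
```

The trade-off is memory: one full-resolution buffer per thread, which is fine for 8 threads on a CPU but not for thousands of GPU threads.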

Something the CPU is really bad at, though, is the tonemapping/gamma correction/buffer sum part, which should really be done with a shader imo...
I'm already accumulating the threads' backbuffers using alpha-blended quads, and it comes nearly for free.

Still, the more I hack at it the worse it becomes... I had that nice clover light made up of 160,000 particles; now I've got trembling lines and less than 30k particles.

BleakProspects
« Reply #18 on: February 05, 2013, 06:23:41 PM »

Screwed around with the GPU for a bit, and indeed it seems like it will be a much bigger challenge than I anticipated. Forward-simulating lights and then accumulating them into a buffer is trivial, but the collisions make everything difficult. I think the best I could do is texture-sample the collision buffer and render out to a "hasCollided" texture every frame (N x 1 pixels, black and white). The CPU would then check that buffer every now and then, reset any photon that has collided to the appropriate color + position + velocity, and send it back to the GPU. That's probably really inefficient, both because of the random reads and the constant uploading of new data to the GPU. So if I continue with this, I will focus first on doing the post-processing (HDR) on the GPU, and second on the light accumulation (as someone said earlier, I can just render lines or large boxes). With those bits taken care of by the GPU, the CPU will be freed up for more physics calculations.

Btw, profiling my original code reveals that 76% of the CPU time is spent in the light physics threads. Within each light thread, about 42% of the time goes to just accumulating the light buffer, and about 20% of the total time outside the light physics threads is spent on the HDR calculations.
« Last Edit: February 05, 2013, 06:34:37 PM by BleakProspects »

_Tommo_
« Reply #19 on: February 05, 2013, 08:02:26 PM »

Phew, turns out that Bresenham is terrible for this (it caused the cloverleaf), so I had to switch back to the plain old float position being incremented (slower, but actually circular).
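The directional bias is easy to see: Bresenham advances one grid cell per step, so a diagonal photon covers sqrt(2) times more distance per step than an axis-aligned one, while a float position stepped by a normalised velocity moves the same distance at any angle (hypothetical sketch):

```cpp
#include <cmath>

// Step a float position by a normalised direction vector and measure the
// euclidean distance covered. Unlike a one-cell-per-step rasteriser, the
// distance per step is identical for every direction, so light spreads in a
// circle rather than a clover.
float distanceAfterSteps(float dx, float dy, int steps) {
    float len = std::hypot(dx, dy);
    float x = 0.0f, y = 0.0f;
    for (int i = 0; i < steps; ++i) {
        x += dx / len;   // unit-length step along the ray
        y += dy / len;
    }
    return std::hypot(x, y);
}
```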
The backbuffers are unsigned short*s now; this removes the need for an overflow check (faster) but increases bandwidth use (slower)... overall it's worth it, with a net +10k particles per frame.
It also allows for a crude HDR, which is quite needed.
Memory-wise, the best thing would probably be a 10-10-10-2 RGBX value, but I wouldn't know how to sum two of those efficiently.

Right now it runs at ~50,000 particles per frame (1.5 million/s at 30 fps!), with a single light, no collisions & no tonemapping. Here's a screenshot:


Running at 60 fps would be possible, but I would need to offload the blur/tonemapping to a shader... right now the main thread does this stuff, and it runs in 20 ms.
« Last Edit: February 05, 2013, 08:21:33 PM by _Tommo_ »
