TIGSource ForumsDeveloperArt (Moderator: JWK5)3D thread
Author Topic: 3D thread  (Read 931684 times)
gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2980 on: August 20, 2014, 04:31:50 AM »

And augmented with single-point particles and simpler sprite scaling to help with the illusion.
Logged

Geti
Level 10
*****



View Profile WWW
« Reply #2981 on: August 20, 2014, 05:03:58 AM »

Remember they managed to get

with some corner cutting - the levels there aren't strictly polygon based so it's hard to make comparisons but still :^)
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2982 on: August 20, 2014, 05:08:22 AM »

But how many polygons of a Doom level are visible on screen at any one time, rather than in the level as a whole? (Plus they aren't constrained to triangles only.) Also, cylindrical coordinates.
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2983 on: August 20, 2014, 08:08:33 AM »

Quote:
But I also remember mode 7 is rarely used with 3D graphics on snes!
instead they use a background (generally a gradient) for the ground with dot or object to suggest movement.

Render 3D shapes to sprites and use the background for the floor (mode 7 only allows one background plane, which is the one that can be rotated and scaled). So yeah, definitely doable. Not sure how suitable the PLOT instruction is for this, though (since pixels in sprites are arranged in a different way than in backgrounds).
Logged
gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2984 on: August 20, 2014, 12:35:52 PM »

Well, rendering 3D shapes into sprites is precisely what the Super FX is designed to do. In fact it does not do 3D as such; it only does math more efficiently and draws shapes to a "surface", which is then converted into sprites for the SNES.

However, some games like Star Wars 2 on SNES have a "3D" mode 7 that has "ripples" on it (heightmap-y) but still use sprites and low frame rates.
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2985 on: August 20, 2014, 04:31:06 PM »

Actually it's more common to render 3D graphics to one of the background planes, I think this is what Star Fox does (I know the background tilts, but you can do that with the per-tile scrolling modes, which is exactly what's going on there).
Logged
pixel-boy
Level 0
**



View Profile WWW
« Reply #2986 on: August 20, 2014, 07:10:20 PM »

Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2987 on: August 20, 2014, 07:19:14 PM »

http://www.anthrofox.org/starfox/superfx.html
http://www.smwiki.net/wiki/SuperFX
http://en.wikibooks.org/wiki/Super_NES_Programming/Super_FX_tutorial

How many operations do you need per point in space for a typical corridor-based 3D engine?
Because it can do 10 million ops per second (plus plotting, plus halts during SNES rendering).

But I worked with a guy who made a Crash Bandicoot demo on GBA, and he told me about some tricks (the same as with 2 images, aka swapping the sign to mirror / rotate orthogonally), so I can imagine that there would be a hierarchy of object complexity: some are point based (position only), others are just axis aligned (no rotation), some have axis-aligned rotation, and some have complex 6 DOF rotations. In fact it's in the craziest sequences, like in Sectors X, Y, Z, that you get frame drops (and they also used line-based blocks instead of filled polys). And sure, the game goes crazier with its visuals as they made levels and learned tricks (see Fortuna, for god's sake).
« Last Edit: August 20, 2014, 07:41:34 PM by Gimym JIMBERT » Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2988 on: August 20, 2014, 09:44:08 PM »

Note that due to the instruction set most operations actually take up 2 or 3 cycles, not 1 (because FROM and TO are separate opcodes instead of arguments). Contrast this with the SVP, which was clocked at about the same speed but where every operation takes up 1 cycle. It's no wonder that Virtua Racing can push more polygons than Star Fox at a consistently higher framerate.
Logged
gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2989 on: August 20, 2014, 10:52:39 PM »

That makes me think we should look not at how many polys but at how many vertices it can push raw with full 6 DOF operations... That would give us a theoretical (impossible to reach) maximum, but still an order of magnitude.

That's beyond my competence. I used to do simple 3D transforms on Casio using plot( x/z , y/z ), but there was no rotation, and I'm not sure how to code the */z in SNES assembly at all.

In fact I forgot everything about assembly, since it's been 10+ years since I touched these things. I'm no programmer and my brain plasticity has been down for a while lol! I don't even understand the notes and code I used to write when younger, like this strange complex-number-inspired rotation without cos and sin...
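For the plot( x/z , y/z ) projection mentioned above, one common way to avoid a per-vertex divide on hardware without a divider is a reciprocal table followed by a multiply. The following is only a sketch in Python, not SNES or Super FX code: the table size, the focal length and the 256x224 screen centre are all assumptions chosen for illustration.

```python
RECIP_BITS = 16
MAX_Z = 1024  # assumed depth range covered by the table

# RECIP[z] approximates 1/z in 0.16 fixed point (index 0 is unused).
RECIP = [0] + [(1 << RECIP_BITS) // z for z in range(1, MAX_Z)]


def project(x, y, z, focal=64, cx=128, cy=112):
    """Project a camera-space point to screen pixels with integer math only.

    Returns None for points at or behind the camera, or beyond the table.
    focal, cx and cy are hypothetical values, not taken from any real game.
    """
    if z <= 0 or z >= MAX_Z:
        return None
    scale = focal * RECIP[z]            # focal / z in fixed point
    sx = cx + ((x * scale) >> RECIP_BITS)
    sy = cy - ((y * scale) >> RECIP_BITS)  # screen y grows downwards
    return sx, sy


print(project(10, 5, 64))   # point at z = focal keeps its x/y offsets
print(project(10, 5, 128))  # twice as far away, half the offset
```

The table lookup rounds 1/z down, so results can be off by a pixel compared with an exact division; whether that matters depends on the resolution you render at.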
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2990 on: August 21, 2014, 12:59:35 AM »

You have to take both into account since the SFX has to plot every pixel. Large polygons will cause slowdown (more so than several small polygons), and a large number of vertices will also cause slowdown (regardless of how many unique polygons are in the mesh, so the more vertices are shared, the faster it is).

Most of the meshes in Star Fox don't need rotation at all (being a rail shooter helps!), so it's likely only a few meshes have 6 DOF. Even then, many of the meshes that rotate only do so around a single axis, so that also reduces the number of calculations (assuming they aren't using a matrix).
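To make the single-axis point concrete, here is a rough Python sketch comparing a rotation about the Y axis only with a general 3x3 matrix transform. The function names are made up for this example, and nothing here is taken from Star Fox itself; it only counts multiplies per vertex.

```python
def rotate_y(v, s, c):
    """Rotate v about the Y axis given sin (s) and cos (c): 4 multiplies."""
    x, y, z = v
    return (x * c + z * s, y, z * c - x * s)


def y_matrix(s, c):
    """The same rotation written out as a full 3x3 matrix."""
    return [[c, 0, s],
            [0, 1, 0],
            [-s, 0, c]]


def rotate_matrix(v, m):
    """General 3x3 transform: 9 multiplies, whatever the matrix contains."""
    x, y, z = v
    return tuple(row[0] * x + row[1] * y + row[2] * z for row in m)


# A quarter turn (sin = 1, cos = 0) gives the same result both ways.
print(rotate_y((1, 2, 0), 1, 0))
print(rotate_matrix((1, 2, 0), y_matrix(1, 0)))
```

Objects with no rotation at all skip even the 4 multiplies and only need a translation before projection.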

Do you still have any of that old code you mention? o.o Maybe I can find some use for it if you do...
Logged
Geti
Level 10
*****



View Profile WWW
« Reply #2991 on: August 21, 2014, 02:41:08 AM »

Quote from: gimymblert
this strange complex number inspired rotation without cos and sin ...

Sounds like quaternions?

Also re: shared vertices and better performance - surely that'd depend on how you're storing and rendering your mesh? if you just used a simple triangle list with no shared vertices you wouldn't have to do a table lookup for each index (of course there's memory cost for any duped vertices but there's no reason for it to be slower at raster time, you've got to load the vectors either way) - is the main cost in the transform? what about for things that don't need to be transformed each frame? Surely the cost of rasterising triangles is going to dominate transforming them?

Overall it seems like everything would be super duper implementation dependent. Could be fun, haha :^)

Maybe we should move discussion of this elsewhere though?
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2992 on: August 21, 2014, 03:52:44 AM »

Quote from: Geti
Sounds like quaternions?

Doesn't that still require sin/cos at some point? (to calculate the values for making the quaternions)

Come to think of it, if you use polar coordinates you don't need to use sin/cos for rotation, just additions. That can potentially save lots of time on a slow system (since rotation normally requires several multiplications, which on weak systems are pathetically slow). You'll still need sin/cos to convert them to Cartesian coordinates in the end (needed to do the 3D projection), but you can offload that to a look-up table, so the end result is a serious speed increase (at the expense of using a rather wacky format for storing the meshes).
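A minimal sketch of the polar idea above, in Python and restricted to one plane. The 256-step angle and 8-bit fixed-point tables are assumptions for the example. Note that the conversion back to Cartesian still costs two multiplies per vertex here (radius times the table value), compared with four for a plain 2D rotation.

```python
import math

ANGLE_STEPS = 256   # assumed: one byte covers a full turn
FIX = 8             # assumed: table values are scaled by 256

SIN = [round(math.sin(2 * math.pi * i / ANGLE_STEPS) * (1 << FIX))
       for i in range(ANGLE_STEPS)]
COS = [round(math.cos(2 * math.pi * i / ANGLE_STEPS) * (1 << FIX))
       for i in range(ANGLE_STEPS)]


def rotate_polar(vertex, delta):
    """Rotation is a single addition on the angle, wrapping at a full turn."""
    angle, radius = vertex
    return ((angle + delta) % ANGLE_STEPS, radius)


def to_cartesian(vertex):
    """Table lookup plus two multiplies to get x/y for projection."""
    angle, radius = vertex
    return ((radius * COS[angle]) >> FIX, (radius * SIN[angle]) >> FIX)


v = (0, 100)                 # angle 0, radius 100
v = rotate_polar(v, 64)      # quarter turn
print(to_cartesian(v))
```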

Quote from: Geti
Also re: shared vertices and better performance - surely that'd depend on how you're storing and rendering your mesh? if you just used a simple triangle list with no shared vertices you wouldn't have to do a table lookup for each index (of course there's memory cost for any duped vertices but there's no reason for it to be slower at raster time, you've got to load the vectors either way) - is the main cost in the transform? what about for things that don't need to be transformed each frame? Surely the cost of rasterising triangles is going to dominate transforming them?

Overall it seems like everything would be super duper implementation dependent. Could be fun, haha :^)

Well, the big advantage of shared vertices is that you only need to compute each vertex once, and no, transformation is not cheap (though the SFX had dedicated fast hardware multiply so that helped a lot). Practically every mesh that's closed will share vertices (because polygons are touching), so it's in your best interests to store them as a separate list.
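The difference can be sketched by counting transform calls for the same mesh stored both ways. This is an illustrative Python example using a tetrahedron; the helper names and the trivial translate "transform" are placeholders, not anything from an actual engine.

```python
TETRA_VERTICES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
TETRA_TRIANGLES = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]


def make_counting_translate(dz):
    """Return a stand-in transform plus a counter of how often it ran."""
    calls = {"count": 0}

    def transform(v):
        calls["count"] += 1
        return (v[0], v[1], v[2] + dz)

    return transform, calls


def transform_indexed(vertices, triangles, transform):
    """Shared vertices: transform each vertex once, then look it up by index."""
    moved = [transform(v) for v in vertices]
    return [tuple(moved[i] for i in tri) for tri in triangles]


def transform_triangle_list(triangle_list, transform):
    """No sharing: every corner of every triangle is transformed again."""
    return [tuple(transform(v) for v in tri) for tri in triangle_list]


t1, indexed_calls = make_counting_translate(5)
shared = transform_indexed(TETRA_VERTICES, TETRA_TRIANGLES, t1)

flat = [tuple(TETRA_VERTICES[i] for i in tri) for tri in TETRA_TRIANGLES]
t2, flat_calls = make_counting_translate(5)
unshared = transform_triangle_list(flat, t2)

print(indexed_calls["count"], flat_calls["count"])  # 4 versus 12 transforms
```

Both paths produce the same triangles; the indexed version trades an index lookup per corner for far fewer transforms, and the gap widens on larger closed meshes.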

Oh, and yeah, you need to transform pretty much everything every frame: if the camera moves, all projected coordinates are invalidated.

Quote from: Geti
Maybe we should move discussion of this elsewhere though?

Probably.
Logged
Geti
Level 10
*****



View Profile WWW
« Reply #2993 on: August 21, 2014, 05:48:08 AM »

I'll wait for someone with the powers to do so to split things off, no harm continuing here.

Re: projections are invalid given any camera/object movement - yeah, but how invalidated? :^) We're talking shitty little meshes rendered into low res textures for a blurry screen at an already staggeringly low framerate, so you can probably get away with a lot of shortcuts. I doubt most objects even got real perspective, meaning you could just pan and scale the output a lot of the time (perhaps reprojecting one background object per frame rather than all of them). There's a lot you can do to avoid trig in your low level code as well, though yes, if you want to swap between an Euler angles rotation and a quaternion you're going to need to do some trig (could be precomputed pretty trivially though, or use LUTs like you mentioned).

I understand that transformation is not a cheap operation, especially as the models get bigger but unless we're talking tiny triangles the fillrate has to be an issue as well. Would be interesting to play around with tradeoffs between the two on the actual hardware, or even an emulator; really not sure I have time to learn a new assembly language at the moment though.
Logged

gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2994 on: August 21, 2014, 11:16:54 AM »

Yeah, I understand the impact of fillrate. I mention vertex operations only to get the raw 3D performance; then by adding the theoretical raw performance we can get a "rough" idea of performance in ideal situations, i.e. without hacks, optimizations and "pipeline" issues.

Also the camera is almost always aligned with the world, so they share a common reference (being a corridor helps twice). It's only the intro and outro that move the camera, and then you have a limited count of objects and a lower framerate!

But isn't multiplication done through bit shifting anyway? Can't imagine the base snes not having them!

@Sik
It wasn't code, it was a demo he showed and explained on GBA. He also had a rolling perspective to hide the horizon and clipping; it was very smart. The project was canned because I think it put the PSX game in a bad light (it was really close visually if you didn't notice the hack).

If you are talking about my Casio 3D program, there is nothing left, just vague memories. I don't know how grade levels translate in other countries, but in France I was in 1ère/terminale, where we learn complex numbers. I remember there were a lot of programmers, but I made clear code with an esoteric structure nobody else understood, not even me today lol.

I remember using a kind of "unit" rotation vector you multiplied to get the desired rotation, with a kind of table with multiple "rotation unit vector components" to assemble and get the desired angle. And by table I mean the weird optimization I have no memory of, because an actual table was reallllllyyyy slow. Still slow because Casio! Using memory slowed it, using instructions slowed it. There were a lot of implicit hacks to shorten the loop so it ran at a decent speed.

I wish I could do things like that again. My brain nowadays just goes "nope", so much that I can actually sleep peacefully at night (never happened before, never). You can tell: my posts on TIGS have been on the shorter, decent-size side, unlike when I joined.
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2995 on: August 21, 2014, 03:38:59 PM »

This wasn't split yet? o.O

Quote from: Geti
There's a lot you can do to avoid trig in your low level code as well, though yes if you want to swap between an eulerangles rotation and a quarternion you're going to need to do some trig (could be precomputed pretty trivially though or use LUTs like you mentioned).

Everything listed on that page is extreme overkill for what these old games were doing; there are just too many multiplications going on. Also remember that sin/cos is actually cheap, since you can replace them with look-up tables without any trouble. Multiplication is actually worse in this sense.

Quote from: gimymblert
But isn't multiplication done through bit shifting anyway? Can't imagine the base snes not having them!

Doing multiplication that way takes up a really large circuit, for the record. Old CPUs lacked multiplication like that at all (the earliest CPUs implementing multiplication used loops in microcode, making it pretty slow), while DSPs dedicated most of the die space just to the multiplication hardware (at the expense of everything else).
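For reference, this is roughly what "multiplication through bit shifting" looks like when done as a loop, whether in software or microcode: one shift and a conditional add per bit of the multiplier. It is a generic Python sketch of the textbook shift-and-add method for unsigned values, not code for any specific CPU.

```python
def shift_add_multiply(a, b, bits=16):
    """Unsigned multiply using only shifts, adds and a bit test.

    The loop runs once per multiplier bit, which is why this is slow
    without dedicated hardware: 16 iterations for a 16-bit operand.
    """
    result = 0
    for _ in range(bits):
        if b & 1:          # lowest bit set: add the shifted multiplicand
            result += a
        a <<= 1            # multiplicand moves up one bit position
        b >>= 1            # next multiplier bit
    return result & ((1 << (2 * bits)) - 1)


print(shift_add_multiply(123, 45))
```

A hardware multiplier effectively unrolls this loop into parallel adders, which is where the large circuit area comes from.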

The 65816 itself didn't have any multiplication instruction, period; it had to be done in software. The SNES hardware did provide multiplication hardware which took up 8 cycles to compute, but it could only be used with 8-bit integers, which limited its usefulness (if I recall correctly its main purpose was to aid with mode 7 transformations).

For the record, the SNES was going to have more advanced math hardware in it, but it was removed at the last moment. This is why Pilotwings has a custom chip in it: there simply wasn't enough time to rewrite the code, so instead they added the missing hardware in the cartridge.
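One reason an 8-bit-only multiplier is limiting: a 16-bit product has to be assembled from four 8-bit partial products plus shifts and adds. Below is a Python sketch of that decomposition for unsigned values; `mul8` is a hypothetical stand-in for whatever 8x8 hardware multiply is available, not a real register interface.

```python
def mul8(a, b):
    """Stand-in for an 8-bit x 8-bit unsigned hardware multiply (assumed)."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(a, b):
    """Unsigned 16x16 -> 32-bit multiply built from four 8x8 products."""
    a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF
    b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF
    low = mul8(a_lo, b_lo)
    mid = mul8(a_lo, b_hi) + mul8(a_hi, b_lo)   # both land 8 bits up
    high = mul8(a_hi, b_hi)                     # lands 16 bits up
    return low + (mid << 8) + (high << 16)


print(hex(mul16(0x1234, 0x5678)))
```

Signed operands need extra sign handling on top of this, so the real cost per 16-bit multiply is several times that of a single 8-bit one.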

Quote from: gimymblert
I remember using a kind of "unit" rotation vector you multiplied to get the desired rotation with a kind of table with multiple "rotation unit vector component" to assemble and get the desired angle.

Yeah, that sounds like polar coordinates (they're angle + distance, basically).
Logged
gimymblert
Level 10
*****


The archivest master, leader of all documents


View Profile
« Reply #2996 on: August 21, 2014, 04:02:39 PM »

It wasn't polar coordinates; the Casio only supported Cartesian coordinates. But if you add a perpendicular vector (a tangent) to another, so that it is the arc of a sphere of the same length as the vector, you are effectively rotating the original vector by the amount of that tangent vector.

Now there is a lot that is strange about that statement, as it is where my memory and understanding break down: this unit vector would need a length and would have to change (linearly) according to the length of the affected vector, and it would need to be rotated to remain a tangent vector to continue the rotation. That's where I'm at a loss; my brain doesn't process this anymore, and going back to my school exercise books of the time was jarring XD I didn't understand anything for some reason. That's my current problem: I understand things but I can no longer manipulate them to recreate a process or to form something new (damn spherical harmonics).
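One possible reading of the tangent-vector idea, sketched in 2D with Python: add a scaled copy of the perpendicular vector, then rescale to the original length. This is the same as multiplying by the complex number (1 + i*t) and normalising, and it rotates by atan(t) without calling sin or cos. It is only a guess at what the old Casio program did, not a reconstruction of it.

```python
import math


def rotate_by_tangent(x, y, t):
    """Rotate (x, y) by atan(t) using a tangent step and a rescale.

    (-y, x) is perpendicular to (x, y) and has the same length, so the
    tangent scales with the vector automatically. For a fixed t the
    rescale factor is a constant that could be precomputed.
    """
    nx = x - t * y
    ny = y + t * x
    scale = 1.0 / math.sqrt(1.0 + t * t)
    return nx * scale, ny * scale


x, y = rotate_by_tangent(1.0, 0.0, 1.0)   # t = 1 is a 45 degree step
print(x, y)
```

Applying the same step repeatedly keeps rotating by the same angle, since the perpendicular is recomputed from the current vector each time.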
Logged

Geti
Level 10
*****



View Profile WWW
« Reply #2997 on: August 21, 2014, 06:13:43 PM »

Re: too much multiplication - ignoring the matrix creation, I'm not seeing how you'd get around doing dot and cross products in the situation he describes though? His point is that the trig is superfluous in a lot of cases where you'd use it because you've got the values geometrically anyway, not that you should necessarily use matrices for transformation.

How would you go about any reorientation/projection while avoiding multiplies? How would you use the sin or cos resultants from a LUT without A) multiplying them into something later anyway and B) without incurring a lot of fetching overhead in the LUT anyway?

Setting up from/to (1-3 cycles each) + moving (2-6 cycles), so 4-12 cycles per sin lookup, and the same again for cos. A 16-bit signed multiply is 4-11 cycles for comparison (+ from/to = 6-17). Looks like there are stupid overheads everywhere with these kinds of processors, but at least it's got some pipelining (if without hazard detection).
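The cycle ranges quoted above can be checked by summing the minimum and maximum of each step. A small Python sketch using only the figures from the post; the helper name is made up.

```python
def add_ranges(*ranges):
    """Sum (min, max) cycle ranges step by step."""
    return (sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges))


FROM_TO = (1, 3)     # one FROM or TO prefix
MOVE = (2, 6)        # the table fetch itself
MULT16 = (4, 11)     # 16-bit signed multiply

lookup = add_ranges(FROM_TO, FROM_TO, MOVE)
multiply = add_ranges(MULT16, FROM_TO, FROM_TO)
print(lookup, multiply)
```

So a sin lookup plus a cos lookup costs 8-24 cycles before either value has been multiplied into anything.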

Good data on timing here (book 2, super fx).

Not being antagonistic here, just trying to see how you'd make it faster - the sin/cos lookup doesn't seem to net you anything because you need to multiply them in at some point anyway.
Logged

Sik
Level 10
*****


View Profile WWW
« Reply #2998 on: August 22, 2014, 08:29:42 PM »

Welp, redownloaded the docs (I had them in my other hard disk but not here).

Decided to just say "screw it" and see if somebody has details about what algorithms were actually used, I don't feel like spending time reverse engineering stuff right now, let's see where it goes (maybe byuu knows something, since he made an emulator?):
http://forums.nesdev.com/viewtopic.php?f=12&t=11567

Some things I noted from the documentation though:

  • Accessing the RAM is slow, like 7 cycles slow. It's pipelined though, so as long as you don't attempt to write to it twice in a row the SFX won't get halted. Accessing ROM seems to be quite a bit faster, but only works on 8-bit values (but you can extend them as both unsigned and signed).
  • Signed multiplication is fast (effectively 1 cycle against a register or 2 cycles against an immediate value), but it's 8-bit × 8-bit only (just like on the SNES' own multiplier), so I guess its only real advantage is being much faster (like 30 times as fast, so I guess that's important =P). May cause issues if you need to multiply large numbers, so it's better if you can avoid them.
  • Unsigned multiplication, on the other hand, is slower, and doesn't support immediate values, so avoid it if you can.
  • The PLOT instruction can be used with sprites (there's a setting for it), so the idea of rendering to sprites and using a mode 7 background for a textured floor seems indeed possible without much effort. Wondering if it allows rendering to multiple sprites though. PLOT can also do dithering, so that comes for free. It can also be horribly slow, but I'm not sure if coding this by hand would be any faster really...
  • The SFX can only access memory on the cartridge, so the SPPU transfer rate is probably a bottleneck too (this happens with Virtua Racing on the Mega Drive too and practically all Mega CD games that do rendering on their own, that's why they're capped to 15FPS). On top of this, while the SFX is running the 65816 can't access the cartridge, so either it has to run from its own RAM (possibly just idling if there isn't enough room for processing code), or it prevents the SFX from accessing the ROM part of the cartridge (making it run slower).
  • There's an instruction to divide by 2. It works exactly the same as ASR (signed shift right), except for one special situation (-1 becomes 0). It seems... kind of useless (if it always rounded towards 0 it'd have made more sense). ASR takes up less space and executes faster.
  • Executing from the cache is much faster than from ROM or RAM, so make sure that any high speed code is in the cache. The problem is that you only get 512 bytes for this, so make your code as small as possible.
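The divide-by-2 observation in the list above can be illustrated in Python, whose `>>` on negative integers behaves like an arithmetic shift. The `div2` function below simply encodes the behaviour as described in the post (same as ASR, except -1 gives 0); it is not verified against the hardware.

```python
def asr(value):
    """Arithmetic shift right by one: rounds towards negative infinity."""
    return value >> 1


def div2(value):
    """Divide by 2 as described in the post: ASR with a special case for -1."""
    return 0 if value == -1 else value >> 1


def truncating_half(value):
    """What a true round-towards-zero divide would give, for comparison."""
    return int(value / 2)


for v in (-3, -2, -1, 0, 1, 3):
    print(v, asr(v), div2(v), truncating_half(v))
```

For -3 both `asr` and `div2` give -2 while rounding towards zero would give -1, which matches the remark that the instruction would have made more sense if it always rounded towards 0.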

Anything else I may have missed? (I didn't take an in-depth look yet)

Also sooner or later you'll need sin/cos just to come up with a rotation value, either each axis individually or as a vector =P (although in those cases we're talking about unit-sized values)
Logged
Geti
Level 10
*****



View Profile WWW
« Reply #2999 on: August 23, 2014, 01:22:51 AM »

One thing about multiplication - you do have 16 bit signed versions, they just take a little more time (FMULT and LMULT) - got a feeling these would be used for transformations or you'd be stuck in a very low precision number space. Of course, it says that these are performed by repeating the multiplication actions but you avoid needing to swap things around in registers manually whilst doing so.

Might need to disassemble some roms to find out specific algorithms used tbh.

Re: needing sin/cos - the whole point of that article I linked is that they're encoded implicitly if you're doing rotations to/from one space to another and don't know the angles ahead of time (ie have them hardcoded or in a table of animation data).
Logged
