Author Topic: One header C99 Linear Algebra Library  (Read 2449 times)
JWki
« Reply #20 on: November 23, 2016, 05:54:14 AM »
I haven't gone through all of the above word for word, but I have to pick up on the GPU "expecting" a matrix to be in some order. To the GPU, isn't any matrix just a block of sixteen 32-bit floats, with no semantics attached to it at all? Matrix-vector operations are implemented at the language level in GLSL, HLSL, or whatever shading language your rendering API uses, just like in a CPU math library; there's no dedicated matrix hardware on a GPU chip (unless I'm really, really mistaken right now). So the GPU shouldn't care whether the bytes in the block of memory you call a matrix represent four rows or four columns in order, because it has no concept of matrices. The layout is only meaningful in the context of an API and a shading language.
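
To make that concrete, here's a quick C sketch (the accessor names and the translation example are just made up for illustration): the same sixteen floats read as four columns behave as a translation, and read as four rows they behave as the transpose, so which matrix you "have" is entirely down to the code that indexes them.

Code:
#include <stdio.h>

/* The same 16-float block, read two ways:
 *   column-major: element (row r, col c) lives at m[c*4 + r]
 *   row-major:    element (row r, col c) lives at m[r*4 + c]
 * One reading is the transpose of the other. */
static float get_col_major(const float *m, int r, int c) { return m[c * 4 + r]; }
static float get_row_major(const float *m, int r, int c) { return m[r * 4 + c]; }

/* out = M * v, with M interpreted through the given accessor. */
static void mat_vec(float (*get)(const float *, int, int),
                    const float *m, const float *v, float *out)
{
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c)
            out[r] += get(m, r, c) * v[c];
    }
}

int main(void)
{
    /* If read column-major, this is a translation by (1, 2, 3):
     * the last four floats are the fourth column. */
    float m[16] = { 1,0,0,0,  0,1,0,0,  0,0,1,0,  1,2,3,1 };
    float v[4]  = { 0, 0, 0, 1 };
    float a[4], b[4];

    mat_vec(get_col_major, m, v, a); /* (1, 2, 3, 1): the translation */
    mat_vec(get_row_major, m, v, b); /* (0, 0, 0, 1): behaves as M^T  */

    printf("col-major read: %g %g %g %g\n", a[0], a[1], a[2], a[3]);
    printf("row-major read: %g %g %g %g\n", b[0], b[1], b[2], b[3]);
    return 0;
}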
qMopey
« Reply #21 on: November 23, 2016, 10:28:55 AM »
@namandixit Yep, that all checks out. So now the question is: when we pass memory to the GPU for two matrices A and B and want to perform A * B, the memory for both A and B can be stored in either row-major or column-major order.

Code:
A = Translation * Rotation * Scaling

// And in block form:
[ R * S,   T ]
[ { 0 }^T, 1 ]

// The above is the typical column-major notation we are all
// familiar with. Commonly we *pre-transpose* it, so the
// actual memory storage becomes:
[ (R * S)^T, { 0 } ]
[   T^T    ,   1   ]
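
Just to make that pre-transpose concrete in C99 terms, here is a rough sketch (the mat4 type and function names are made up for illustration, not namandixit's actual API): build M in the column-vector notation above with its columns laid out contiguously, and transpose the storage if a consumer wants the other layout.

Code:
#include <string.h>

/* Hypothetical flat matrix type: e[0..3] is the first group of four
 * floats in memory, e[4..7] the second, and so on. */
typedef struct { float e[16]; } mat4;

/* Build M = [ R*S, T ; {0}^T, 1 ] with the *columns* of M stored
 * contiguously (rs is given row by row, rs[r][c]).  Writing the columns
 * of M into memory one after another is byte-for-byte the same as
 * writing the rows of M^T, which is the "pre-transposed" storage in the
 * block diagram above: the translation lands in e[12], e[13], e[14]. */
static mat4 mat4_trs_col_major(const float rs[3][3], const float t[3])
{
    mat4 a;
    memset(&a, 0, sizeof a);
    for (int c = 0; c < 3; ++c)          /* column c of M        */
        for (int r = 0; r < 3; ++r)
            a.e[c * 4 + r] = rs[r][c];   /* upper-left 3x3 = R*S */
    a.e[12] = t[0];                      /* fourth column = T    */
    a.e[13] = t[1];
    a.e[14] = t[2];
    a.e[15] = 1.0f;
    return a;
}

/* If a consumer expects the other convention (rows of M contiguous in
 * memory), transpose the storage before handing it over. */
static mat4 mat4_transpose(mat4 a)
{
    mat4 out;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out.e[r * 4 + c] = a.e[c * 4 + r];
    return out;
}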

Quote
So the GPU shouldn't care whether the bytes in the block of memory you call a matrix represent four rows or four columns in order, because it has no concept of matrices. The layout is only meaningful in the context of an API and a shading language.

We can call it whatever we want, sure, but the storage order will definitely be expected in a certain format by whatever code consumes it; i.e. if we transpose a matrix's storage without changing that code, we get different behavior.

...

So I did a little more reading and it looks like I was totally wrong about DX/GL expecting the same storage! To quote from Fabian here:
Quote
So, while OpenGL defaults to column-major storage and D3D defaults to row-major storage, they don’t store the same thing – the matrices themselves are different (transposed in fact) because of the different types of vectors they use.

I would definitely trust what Fabian says. When I was originally doing tests, I think I'd forgotten that at one point I was swapping the ordering to keep the storage between GL and DX consistent, so I concluded the storage had to be consistent all the time. Consequently I thought GL used row-major storage. Fabian says:
Quote
HLSL supports both row-major and column-major storage. The default if you don’t do anything is column-major, but within a shader you can either set the global default for all matrices to row-major using a pragma, or specify row_major/col_major per matrix if you want to.

Either way, there’s no performance difference between v*M and M*v.

The interesting thing was that there didn't seem to be much of a performance difference either way, so what really matters is just sticking with something consistent. Personally I think sticking with OpenGL's storage (described in my above code example) all around would be a pretty good idea, which is what namandixit is doing in his source.

So it turns out I was also wrong about what I was calling row-major. GL uses column-major storage, which is why we have to do the whole pre-transpose thing. Fabian points out that the FAQ I linked to is incorrect, and quotes directly from the OpenGL spec stating that GL uses column-major storage. So I had the storage terminology flipped due to the incorrect FAQ.

According to Fabian, DX uses row-major storage by default, but this can be changed with an HLSL #pragma.

tl;dr
GL uses column-major storage. DX uses row-major storage by default, but this can be changed by the user. As far as shaders go, whichever one you pick really doesn't matter at all in terms of performance.
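
On the GL side, the upload call already has a switch for exactly this. A minimal sketch (the uniform name "u_mvp" and the helper itself are hypothetical, and on most platforms you'd pull the GL 2.0+ functions in through a loader):

Code:
#include <GL/gl.h>  /* or your loader's header (glad, GLEW, ...) */

/* mvp points at 16 floats.  If they're stored column-major (GL's own
 * convention), pass GL_FALSE for the transpose parameter; if they're
 * stored row-major, pass GL_TRUE and the driver transposes on upload.
 * (OpenGL ES 2.0 only accepts GL_FALSE, so there you pre-transpose on
 * the CPU yourself.) */
static void upload_mvp(GLuint program, const float *mvp, int is_row_major)
{
    GLint loc = glGetUniformLocation(program, "u_mvp"); /* hypothetical name */
    glUseProgram(program);
    glUniformMatrix4fv(loc, 1, is_row_major ? GL_TRUE : GL_FALSE, mvp);
}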