Author Topic: One header C99 Linear Algebra Library  (Read 2433 times)
whoamievenidontknow
« on: November 17, 2016, 10:13:15 PM »

Working on my own engine, I found a dearth of simple-to-use and easily integrable 3D math libraries in C99. So I made my own:

https://gist.github.com/namandixit/f5818c5c5f4457b3ba3b6e055bff44c5

It is neither feature-complete nor very fast yet, since I am extending and optimizing it on an as-needed basis. However, for small prototypes or non-CPU-bound applications it should be adequate (the most blatant missing feature is an orthographic projection matrix AFAIK, though it should be easy to add).
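For reference, here is a minimal sketch of what such an orthographic projection could look like with the same flat layout (translation at indices 12, 13, 14); the function name and signature are my own, not the gist's:

Code:
/* Sketch only: glOrtho-style matrix stored as 16 contiguous floats,
 * with the translation in elements 12, 13, 14. Name and signature are
 * hypothetical, not taken from the gist. */
void mat4_orthographic(float m[16],
                       float left, float right,
                       float bottom, float top,
                       float near_z, float far_z)
{
    for (int i = 0; i < 16; ++i) m[i] = 0.0f;
    m[0]  =  2.0f / (right - left);
    m[5]  =  2.0f / (top - bottom);
    m[10] = -2.0f / (far_z - near_z);
    m[12] = -(right + left)   / (right - left);
    m[13] = -(top + bottom)   / (top - bottom);
    m[14] = -(far_z + near_z) / (far_z - near_z);
    m[15] =  1.0f;
}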

It has been tested with OpenGL, with both GCC and Clang, on Linux and Windows.
jimon_c
« Reply #1 on: November 21, 2016, 01:29:52 PM »

Hey, I would recommend moving to A[ i ][ j ] notation (where i is the row and j is the column) pretty early, instead of going with A[ k ] (where k runs from 0 to 15), because most algorithms in books/wikis/blogs/etc. are described in that notation, and it will make converting between column-major and row-major matrices easier later on.
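Something like this is enough to get there (just a sketch, type and field names are made up):

Code:
/* Sketch: one union exposes the same 16 floats both as a flat array and
 * via [row][column] subscripts. Here m[i][j] (i = row, j = column)
 * aliases e[i*4 + j], i.e. rows sit contiguously in memory. */
typedef union Mat4 {
    float e[16];
    float m[4][4];   /* m[i][j]: row i, column j */
} Mat4;

/* Example: read the translation column of a matrix stored this way. */
static inline void mat4_translation_part(const Mat4 *a, float out[3])
{
    out[0] = a->m[0][3];
    out[1] = a->m[1][3];
    out[2] = a->m[2][3];
}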

PS. maybe some other libs can give you additional inspiration, like this one https://github.com/gingerBill/gb/blob/master/gb_math.h
whoamievenidontknow
« Reply #2 on: November 21, 2016, 08:34:14 PM »

Quote from: jimon_c
Hey, I would recommend moving to A[ i ][ j ] notation (where i is the row and j is the column) pretty early, instead of going with A[ k ] (where k runs from 0 to 15), because most algorithms in books/wikis/blogs/etc. are described in that notation, and it will make converting between column-major and row-major matrices easier later on.

Actually, I am much more familiar with [row][column] notation from school, but the entire row-major/column-major thing broke my brain, and since OpenGL shaders expect a matrix as a contiguous array of memory (with the translation elements at indices 12, 13 and 14), I just decided to roll with it. But you are right, I should change the notation back ASAP.
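For concreteness, this is the layout I mean; a rough sketch with a made-up function name:

Code:
/* Sketch: identity plus translation, stored as 16 contiguous floats in
 * exactly the form the shader receives, i.e. translation in elements
 * 12, 13 and 14. The function name is hypothetical. */
void mat4_make_translation(float m[16], float x, float y, float z)
{
    for (int i = 0; i < 16; ++i) m[i] = 0.0f;
    m[0] = m[5] = m[10] = m[15] = 1.0f;   /* identity diagonal */
    m[12] = x;
    m[13] = y;
    m[14] = z;
}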

Quote from: jimon_c
PS. maybe some other libs can give you additional inspiration, like this one https://github.com/gingerBill/gb/blob/master/gb_math.h

Thanks
qMopey
« Reply #3 on: November 21, 2016, 10:53:32 PM »

Quick note: all GPUs expect row major format *in memory*. Column/row major notation is just notation. In memory you should always have row major (i.e. one row in memory followed by another, and another, creating a contiguous array of 3 contiguous arrays). This is true for OpenGL and DirectX alike.

Order of operation is a notation, so operator overloading or what have you can be reversed depending on the API. However, the memory is always the same regardless of the API (unless the API is dumb and transposes the memory, forcing the user to transpose matrices upon upload to the GPU -- neither DX nor GL is dumb enough to do this).

For these reasons we often see Matrix3 defined as:

Code:
struct mat3
{
    vec3 x;
    vec3 y;
    vec3 z;
};

struct mat4
{
    vec4 x;
    vec4 y;
    vec4 z;
    vec4 w;
};

If we peer into Bullet's math we see exactly this. It also plays nicely with SIMD instructions. Conveniently, each matrix can overload the [] operator to select a vector, and each vector can overload the [] operator to select a component. This allows for a consistent index notation with normal 2D arrays. Also, we set up mat4 to contain a w component, which appropriately stands for witchcraft.
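In plain C99, where there is no operator overloading, the same convenience can be approximated with a tiny accessor; a sketch, assuming the vec members sit back to back in memory (true on typical compilers, since they all have the same type):

Code:
/* Sketch: A[i][j]-style access over the struct-of-vectors layout above.
 * vec3 is written here with an element array just to make indexing easy;
 * names are illustrative only. */
typedef struct vec3 { float e[3]; } vec3;
typedef struct mat3 { vec3 x, y, z; } mat3;

static inline float mat3_at(const mat3 *a, int row, int col)
{
    const vec3 *rows = &a->x;   /* rows[0] == x, rows[1] == y, rows[2] == z */
    return rows[row].e[col];
}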

Cheers
« Last Edit: November 21, 2016, 11:01:19 PM by qMopey »
whoamievenidontknow
« Reply #4 on: November 22, 2016, 12:24:31 AM »

Quote from: qMopey
Quick note: all GPUs expect row major format *in memory*. Column/row major notation is just notation. In memory you should always have row major (i.e. one row in memory followed by another, and another, creating a contiguous array of 3 contiguous arrays). This is true for OpenGL and DirectX alike.

Is that correct? I thought that memory storage layout depends on whether matrices are pre-/post-multiplied regardless of API. So,

Mb*Ma*V = Column Major matrix layout
V*Ma*Mb = Row Major matrix layout

And since I am used to the first format, that's the one I went with.

Am I wrong?
qMopey
« Reply #5 on: November 22, 2016, 09:30:34 AM »

Yep, you are just straight up wrong. How else would a GPU function? The GPU cannot magically use matrices transposed in memory for one API, and not do so for another. Order of operations doesn't say anything about memory layout.

https://www.opengl.org/archives/resources/faq/technical/transformations.htm
Quote
For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix ... Column-major versus row-major is purely a notational convention ... Sadly, the use of column-major format in the spec and blue book has resulted in endless confusion in the OpenGL programming community. Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.

AKA they are *row major in memory*. Always.
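For what it's worth, the upload call itself makes this contract visible; a sketch, where "program", "u_mvp" and "mvp" are placeholder names:

Code:
/* Sketch: handing 16 contiguous floats to a mat4 uniform.
 * With transpose = GL_FALSE the driver takes the array exactly as stored
 * (translation in elements 12..14); passing GL_TRUE asks it to transpose
 * on upload instead. */
GLint loc = glGetUniformLocation(program, "u_mvp");
glUniformMatrix4fv(loc, 1, GL_FALSE, mvp);   /* mvp: const GLfloat[16] */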

P.S. if you wanted some more inspiration (not in C though) this style from Box2D is great, as it's very clear and can be translated one to one for SIMD instructions by swapping out function definitions: https://github.com/erincatto/Box2D/blob/master/Box2D/Box2D/Common/b2Math.h

I've done this for my own personal math library, and something similar can be seen on Handmade Hero a la Muratori.
« Last Edit: November 22, 2016, 09:35:59 AM by qMopey »
jimon_c
« Reply #6 on: November 22, 2016, 11:15:32 AM »

Quote from: whoamievenidontknow
Quote from: qMopey
Quick note: all GPUs expect row major format *in memory*. Column/row major notation is just notation. In memory you should always have row major (i.e. one row in memory followed by another, and another, creating a contiguous array of 3 contiguous arrays). This is true for OpenGL and DirectX alike.

Mb*Ma*V = Column Major matrix layout
V*Ma*Mb = Row Major matrix layout

And since I am used to the first format, that's the one I went with.

Am I wrong?

You're not wrong, you just need to know more details: the order of matrix multiplication is independent of column-major or row-major layout. You can use either order regardless of the memory representation.

What truly matters is how you (or in this case the GLSL/HLSL shader) implement the matrix * vector multiplication: whether it is matrix * column vector (so-called Ax notation), which looks like this:

y[i] = A[i][0]*x[0] + A[i][1]*x[1] + A[i][2]*x[2] + A[i][3]*x[3]

(each component of the result is the dot product of a row of A with the vector x)
So what happens with the ordering of matrices then? It looks like this:

position_on_screen = projection_matrix * view_matrix * translate_matrix * rotation_matrix * scale_matrix * position_of_the_vertex

The trick is to use a little "mind-trick" (tm), because what you usually see in the shader is:

position_on_screen = (projection_matrix * view_matrix * translate_matrix * rotation_matrix * scale_matrix) * position_of_the_vertex

where the stuff in brackets is the so-called modelViewProjection (or MVP) matrix.

What really happens is this:

position_on_screen = (projection_matrix * (view_matrix * (translate_matrix * (rotation_matrix * (scale_matrix * position_of_the_vertex)))))

So using this trick you can see why the matrices are multiplied in this order: the vector stands on the right (Ax notation), and you need to apply the transformations in a specific order to get correct results.

While Ax notation is popular in GLSL, our friends in D3D went with row vector * matrix (so-called xA notation) in HLSL. It looks like this:

y[j] = x[0]*A[0][j] + x[1]*A[1][j] + x[2]*A[2][j] + x[3]*A[3][j]

(each component of the result is the dot product of the vector x with a column of A)
As you can see, the difference is not huge: it's just whether every element of the result vector is a dot product with a column or a row of the matrix. But because the vector stands on the left, we get a different formula now:

position_on_screen = position_of_the_vertex * scale_matrix * rotation_matrix * translate_matrix * view_matrix * projection_matrix

Which you usually see as this in the shaders:

position_on_screen = position_of_the_vertex * (scale_matrix * rotation_matrix * translate_matrix * view_matrix * projection_matrix)

But really this is just this:

position_on_screen = ((((position_of_the_vertex * scale_matrix) * rotation_matrix) * translate_matrix) * view_matrix) * projection_matrix

So this is why matrix ordering is different in OpenGL and DirectX; column/row-major layouts have nothing to do with it. You can still roll with column-major matrices in DirectX just fine, just remember to use matrix * column-vector multiplications in the shaders.
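To make the combined MVP concrete, here is a small C-style sketch; Mat4, mat4_mul(a, b) (computing a * b), and the matrix names are hypothetical:

Code:
/* Column-vector (Ax) convention, as in the GLSL example above:
 *     clip = P * V * M * position                                  */
Mat4 mvp_ax = mat4_mul(P, mat4_mul(V, M));

/* Row-vector (xA) convention, as in the classic HLSL example.
 * Mr, Vr, Pr are the row-vector-convention matrices, i.e. the
 * transposes of M, V, P above:
 *     clip = position * Mr * Vr * Pr                               */
Mat4 mvp_xa = mat4_mul(mat4_mul(Mr, Vr), Pr);

/* Since (P*V*M)^T = M^T*V^T*P^T, mvp_xa is simply the transpose of
 * mvp_ax: the same 16 numbers, read with rows and columns swapped. */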

Hope this helps!
qMopey
« Reply #7 on: November 22, 2016, 12:50:37 PM »

To be absolutely clear, jimon's above post is talking about notation, not the in-memory layout. Notation != memory layout. Notation includes order of operations and operator overloading. In memory the GPU will expect all matrices to be laid out in row major format, despite order of operations or notation used.
jimon_c
« Reply #8 on: November 22, 2016, 01:02:27 PM »

Quote from: qMopey
To be absolutely clear, jimon's above post is talking about notation, not the in-memory layout. Notation != memory layout. Notation includes order of operations and operator overloading. In memory the GPU will expect all matrices to be laid out in row major format, despite order of operations or notation used.

Hm, do you have a reference for this? (I mean that all GPUs have only row-major layout)

GPU matrices are
Code:
float data[16];
with translation in the 13th, 14th and 15th elements; everything else (like column-or-row-major) is just a notational detail used in a particular implementation.
whoamievenidontknow
« Reply #9 on: November 22, 2016, 01:04:04 PM »

OK, so here's what I have learned in the last 3 months since I started working on my engine:

Column-major layout means that the columns are laid out one after the other (C0-C1-C2-C3). Also, according to the way we define transformation matrices, the translational elements will be in C3, meaning that in the flat memory layout they will be at positions 12, 13, 14 (starting from 0).

Row-major layout means that the rows are laid out one after the other (R0-R1-R2-R3). Again, according to the way we define transformation matrices, the translational elements will be in R3, meaning that in the flat memory layout they will be at positions 12, 13, 14 (starting from 0) in this case too.

The reason for this is: (AB)^T = B^T A^T. And since the memory layout is exactly the same, all the hullabaloo around row-major vs column-major comes down to the exact algorithm used to multiply the matrix with the vector (by algorithm, I mean which index's value is multiplied by which). This means that if we write M*v, the GPU driver is going to assume that v is a column vector and thus interpret the 16x4 bytes of memory as a column-major matrix and multiply using the (M's row)*(v's column) method of matrix multiplication. In the case of v*M, the reverse will happen.


Quote from: jimon_c
What really happens is this:

position_on_screen = (projection_matrix * (view_matrix * (translate_matrix * (rotation_matrix * (scale_matrix * position_of_the_vertex)))))

So using this trick you can see why the matrices are multiplied in this order: the vector stands on the right (Ax notation), and you need to apply the transformations in a specific order to get correct results.

But matrix multiplication is associative (http://mathworld.wolfram.com/MatrixMultiplication.html , CTRL+F for "associative"), so your parentheses are unnecessary. You can very easily multiply the matrices together first (in order) and then multiply the resulting transform by the vector. Basically, f(g(x)) == (f∘g)(x). Of course, matrix multiplication is not commutative, which means it has to be P*V*M instead of M*V*P for column-major layout.

Quote from: jimon_c
While Ax notation is popular in GLSL, our friends in D3D went with row vector * matrix (so-called xA notation) in HLSL.

In the fixed-function pipeline only, if what I have read in tutorials is correct. In the shader pipeline, matrices can be in either format (https://msdn.microsoft.com/en-us/library/windows/desktop/bb509628(v=vs.85).aspx , notice the parameter description of y; and if it is a column vector, then the matrix will have to be interpreted as a column-major matrix or all hell shall break loose).



Is all I have said correct? Because if it is, then it should mean that I can use the same column-major layout in both GLSL and HLSL as long as I use M*v and mul(M, v) respectively.

Though your point in the first post is still correct; I should implement a subscript-based notation ASAP, just for a nice abstraction and ease of use.


whoamievenidontknow
« Reply #10 on: November 22, 2016, 01:09:16 PM »

Quote from: qMopey
To be absolutely clear, jimon's above post is talking about notation, not the in-memory layout. Notation != memory layout. Notation includes order of operations and operator overloading. In memory the GPU will expect all matrices to be laid out in row major format, despite order of operations or notation used.

I'd like to see some source for this information too, since you are insinuating that my renderer should not be working at all (since I am using what you so confidently term a column-major matrix). Besides, we are not talking about operator overloading (there is no such thing in C99); we are talking about how the GPU driver interprets raw bytes.
jimon_c
« Reply #11 on: November 22, 2016, 01:20:57 PM »

Quote from: whoamievenidontknow
Is all I have said correct?

I think so; indeed you can use column-major for both OGL and D3D, this is what these folks are doing: https://github.com/bkaradzic/bgfx

IMHO I would recommend trying to write tests for your math lib and seeing how that goes, because many folks have written (and will write) math libs, but I haven't had any luck finding a test suite to "test them all", and what happens is me finding and fixing bugs in them :D
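Even something as small as this catches a lot; a sketch, where the helper names are made up rather than taken from the gist:

Code:
/* Sketch of a tiny test: transform a known point and compare against the
 * expected result with an epsilon, since float math is inexact.
 * mat4_make_translation and mat4_mul_vec4 are hypothetical helpers. */
#include <assert.h>
#include <math.h>

static int nearly_equal(float a, float b) { return fabsf(a - b) < 1e-5f; }

static void test_translation(void)
{
    float m[16];
    mat4_make_translation(m, 1.0f, 2.0f, 3.0f);

    float p[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    float r[4];
    mat4_mul_vec4(r, m, p);

    assert(nearly_equal(r[0], 1.0f));
    assert(nearly_equal(r[1], 2.0f));
    assert(nearly_equal(r[2], 3.0f));
    assert(nearly_equal(r[3], 1.0f));
}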
qMopey
« Reply #12 on: November 22, 2016, 02:27:31 PM »

Quote from: jimon_c
Quote from: qMopey
To be absolutely clear, jimon's above post is talking about notation, not the in-memory layout. Notation != memory layout. Notation includes order of operations and operator overloading. In memory the GPU will expect all matrices to be laid out in row major format, despite order of operations or notation used.

Hm, do you have a reference for this? (I mean that all GPUs have only row-major layout)

GPU matrices are
Code:
float data[16];
with translation in the 13th, 14th and 15th elements; everything else (like column-or-row-major) is just a notational detail used in a particular implementation.

I already linked to the OpenGL faq. And yes you're right, but that memory layout is actually called row major layout (see wiki link below).

Quote from: whoamievenidontknow
Quote from: qMopey
To be absolutely clear, jimon's above post is talking about notation, not the in-memory layout. Notation != memory layout. Notation includes order of operations and operator overloading. In memory the GPU will expect all matrices to be laid out in row major format, despite order of operations or notation used.

I'd like to see some source for this information too, since you are insinuating that my renderer should not be working at all (since I am using what you so confidently term a column-major matrix). Besides, we are not talking about operator overloading (there is no such thing in C99); we are talking about how the GPU driver interprets raw bytes.

I linked an OpenGL FAQ from opengl.org. That's a pretty good source! Also, the matrix you described (translation in the last few elements) is *row major* in memory. I did not confidently term it column major. I confidently termed it row major in memory, and told you your observation of the translation being in the last few elements is correct. This was also verified in my large quote from opengl.org. Here's a description of row vs column in terms of memory layout: https://en.wikipedia.org/wiki/Row-_and_column-major_order

And yes, we most definitely are talking about notation, since that's precisely what you are confused about. The order in memory is completely independent of the order in notation. For example DirectX uses "left-handed" notation, and OpenGL uses "right-handed" notation, and jimon's post talks about the differences, and yet both notations work on all GPUs because the memory layout is the same regardless of "handedness".

Edit:
Quote from: whoamievenidontknow
Is that correct? I thought that memory storage layout depends on whether matrices are pre-/post-multiplied regardless of API.

The reason this isn't correct is that pre/post multiplication does not dictate memory storage layout, and you thought it did. We can store rows contiguously in memory, or columns contiguously, and we can write multiplication functions that perform pre- or post-multiplication on either memory layout. The memory layout is completely independent of the order in which the multiplication functions need to be called.

For example, with row major in memory we can define column major multiplication (A * x) where x is a column:

A is:
Code:
mat3
{
    vec3 x;
    vec3 y;
    vec3 z;
};

Code:
// write the result into y so the input vector x isn't clobbered mid-way
y[ 0 ] = Dot( x, A.x )
y[ 1 ] = Dot( x, A.y )
y[ 2 ] = Dot( x, A.z )

Or we can define the opposite handedness, and go for row major style (x * A) where x is a row:

Code:
// again, write into y: overwriting x[ i ] would change later iterations
for i = 0; i < 3; ++i
    v = vec3( A.x[ i ], A.y[ i ], A.z[ i ] )
    y[ i ] = Dot( v, x )

It's easy to see that one is more efficient than the other (the first one) due to linear vs non-linear memory traversal. In this example the memory is laid out in row major, and we can write code to perform row OR column major style multiplication (either handedness). But the real point is we could write functions that look like x * A but perform a multiplication operation as if x was a column (A * x):

Code:
vec3 Mul( vec3 x, mat3 A )
{
    vec3 v;
    v.x = Dot( A.x, x );
    v.y = Dot( A.y, x );
    v.z = Dot( A.z, x );
    return v;
}

And now suddenly we have a column-major style notation, and we must perform an order of operations in this handedness, even though the memory layout is transposed relative to the notation.

One more link describing the same thing I'm trying to say: https://fgiesen.wordpress.com/2012/02/12/row-major-vs-column-major-row-vectors-vs-column-vectors/

Anyway I hope I'm helping, let me know if I made an error or if there's confusion.
« Last Edit: November 22, 2016, 05:13:29 PM by qMopey »
whoamievenidontknow
« Reply #13 on: November 22, 2016, 07:02:57 PM »

Quote from: qMopey
that memory layout is actually called row major layout

Please read my last post: https://forums.tigsource.com/index.php?topic=58743.msg1299095#msg1299095
qMopey
« Reply #14 on: November 22, 2016, 09:56:38 PM »

I did. Did you want me to respond to something in particular?
whoamievenidontknow
« Reply #15 on: November 22, 2016, 10:11:38 PM »

Quote from: qMopey
I did. Did you want me to respond to something in particular?

Won't both row-major and column-major matrices have the same layout (a linear array with the translational elements at positions 12, 13, 14) as long as we follow the convention that a row-major matrix is post-multiplied onto a row vector (v*M) and a column-major matrix is pre-multiplied onto a column vector (M*v)? Because if yes, then your statement that "matrices are row-major in CPU" is totally false.

PS: Wait, when you say that matrices are row-major, are you trying to say that 2D arrays in C (and C-like languages) use row-major notation? Because that's arbitrary and not dependent on the CPU in any way (Fortran used the other notation, if I remember correctly). That is the reason I used float[16] instead of float[4][4] in the library: working with column-major matrices in C notation was wonky (and yes, caused cache misses).
qMopey
« Reply #16 on: November 22, 2016, 10:33:37 PM »

Quote from: whoamievenidontknow
Won't both row-major and column-major matrices have the same layout (a linear array with the translational elements at positions 12, 13, 14) as long as we follow the convention that a row-major matrix is post-multiplied onto a row vector (v*M) and a column-major matrix is pre-multiplied onto a column vector (M*v)? Because if yes, then your statement that "matrices are row-major in CPU" is totally false.

Yeah, you can end up with the same resulting vector if you follow a convention. I believe I said GPU not CPU though. The GPU will expect row1, row2, row3... etc.

Quote from: whoamievenidontknow
PS: Wait, when you say that matrices are row-major, are you trying to say that 2D arrays in C (and C-like languages) use row-major notation? Because that's arbitrary and not dependent on the CPU in any way (Fortran used the other notation, if I remember correctly). That is the reason I used float[16] instead of float[4][4] in the library: working with column-major matrices in C notation was wonky (and yes, caused cache misses).

Yes, row major is like C arrays. Yes, Fortran had it flipped and did column major (Fortran was even the example in the wiki link section I sent). Personally I don't mind either way whether you do float[4][4] or float[16]. Both are fine to me. I just wanted to clear up the confusion between in-memory layout vs notational layout.

float[4][4] is perfectly fine if you have one row followed by another, since the first index will pick a row, and second index will pick an element (column), just like a C-style 2D array, and just like row-major format. float[16] is also fine.
whoamievenidontknow
« Reply #17 on: November 22, 2016, 10:56:57 PM »

Quote from: qMopey
I believe I said GPU not CPU though.

Right, sorry. Misread that.

Quote from: qMopey
The GPU will expect row1, row2, row3... etc.

No, the GPU doesn't care (or more precisely, the APIs we use to interact with the GPU don't care). That's why you can do mul(M, v) in HLSL and have the GPU assume that you mean a column-major matrix and a column vector, or do mul(v, M) and have it assume that you mean a row-major matrix and a row vector. All the GPU sees is 16*4 bytes. Of course, consistency is key to sanity, so you'd use only one convention in a codebase and fill your matrices according to that convention.

But yes, we both are basically stating the same fact in different ways.
qMopey
« Reply #18 on: November 22, 2016, 11:38:16 PM »

Quote from: whoamievenidontknow
No, the GPU doesn't care (or more precisely, the APIs we use to interact with the GPU don't care). That's why you can do mul(M, v) in HLSL and have the GPU assume that you mean a column-major matrix and a column vector, or do mul(v, M) and have it assume that you mean a row-major matrix and a row vector. All the GPU sees is 16*4 bytes. Of course, consistency is key to sanity, so you'd use only one convention in a codebase and fill your matrices according to that convention.

Well this isn't quite right, or maybe it is and I'm misunderstanding. Either way the GPU sees those 16*4 bytes and will expect row after row in memory. The shader language is practically irrelevant when we talk about memory storage. For example, if we take our 16*4 bytes and transpose the elements, the behavior on-screen will change. This shows the GPU absolutely cares about what exactly goes into the rows in memory.

I think what you meant is that GPUs don't care about notation, which is correct. Like you correctly pointed out, different shader languages can do Mul( A, x ) or Mul( x, A ) and get the same result on the same GPU. Both Mul operations are probably using the same GPU instruction at the end of the day when the shader is compiled, and both Mul operations are certainly using the same memory storage layout (row major in memory).

Edit:
When I said "The GPU will expect row1, row2, row3... etc." I'm just talking about memory here. Seeing code that says Mul( A, x ) or Mul( x, A ) unfortunately does not give us hints about how the matrix is stored in memory. It only shows which convention to use, and what order of operations we should perform (aka which notation).
« Last Edit: November 22, 2016, 11:47:33 PM by qMopey »
whoamievenidontknow
« Reply #19 on: November 23, 2016, 12:44:51 AM »


So this is what I was able to understand yesterday with some back-of-the-envelope calculation:

Quote from: qMopey
Well this isn't quite right, or maybe it is and I'm misunderstanding. Either way the GPU sees those 16*4 bytes and will expect row after row in memory. The shader language is practically irrelevant when we talk about memory storage.

Think of it this way. Say you are a GPU (Hi!). You get 64 bytes, and some type info tells you that these are 16 floats. Now, some chump tells you to treat this array as an abstract entity called a matrix and multiply it by a vector (another abstract entity). Since these are aggregate structures and not elementary, there is no single instruction to multiply them (let's keep it simple, maybe you are a 90's VooDoo GPU). What you have is an algorithm to multiply these two, like so:

w[0] = M[0]*v[0] + M[4]*v[1] + M[8]*v[2] + M[12]*v[3]
w[1] = ...
w[2] = ...
w[3] = ...

Note that this works for both row-major (v*M) and column-major (M*v) (try it on paper and see for yourself).
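In C that algorithm is just this (a sketch for illustration, not the gist's actual function):

Code:
/* The same loop reads as M*v if you call each contiguous group of four
 * floats a "column", and as v*M if you call it a "row". */
static void mul_vec4(float w[4], const float M[16], const float v[4])
{
    for (int i = 0; i < 4; ++i) {
        w[i] = M[i]      * v[0]
             + M[i + 4]  * v[1]
             + M[i + 8]  * v[2]
             + M[i + 12] * v[3];
    }
}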

But what happens if you try to multiply two matrices M and N? First try M*N with column major layout:

O[0] = M[0]*N[0] + M[4]*N[1] + M[8]*N[2] + M[12]*N[3] (and so on) ...(1)

Now, with row major layout:

O[0] = M[0]*N[0] + M[1]*N[4] + M[2]*N[8] + M[3]*N[12] (and so on) ...(2)

They are different since matrix multiplication is non-commutative. However, in case of N*M (column major), we have:

O[0] = N[0]*M[0] + N[4]*M[1] + N[8]*M[2] + N[12]*M[3] (and so on) ...(3)

which is exactly the same as row-major M*N (since (MN)^T = N^T M^T).

Now, imagine that GPU vendors can choose any one algorithm, and they implement algorithm (3). What this entails is that if you wanted to apply transform M first and then transform N, you can multiply M*N or N*M. At this point, you have 4 options to multiply the vector by the resultant transform: vMN, MNv, vNM, NMv (converting between row vector and column vector appropriately). However, because of the way we have defined transforms (as a property of our coordinate systems, etc.), only NMv or vMN will give correct results. And as I demonstrated just now, N*M (column-major) == M*N (row-major) (algo 2 == algo 3).


Now, you may ask, what about N*M (row-major) and M*N (column-major)? Won't they be equivalent too? And indeed, they would be; however, once again, it seems that because of the way we have defined this system of transforms, they do not give the right result.

I am guessing that it should be possible to create a mathematical system in which the correct transform is given by something weird like M^-1 * v * N^T or something. However, it is a job for better minds than me.

In any case, this is what I gathered after half an hour of headbutting. And it seems to square with my observations and previous knowledge too, but I ain't a mathematician either.
« Last Edit: November 23, 2016, 01:13:18 AM by namandixit »