Welcome, Guest. Please login or register.

Login with username, password and session length

 
Advanced search

1411432 Posts in 69363 Topics- by 58417 Members - Latest Member: gigig987

April 20, 2024, 05:30:15 AM

Need hosting? Check out Digital Ocean
(more details in this thread)
TIGSource ForumsDeveloperTechnical (Moderator: ThemsAllTook)Beautiful hacks
Pages: [1] 2 3 ... 7
Print
Author Topic: Beautiful hacks  (Read 17264 times)
Evan Balster
Level 10
*****


I live in this head.


View Profile WWW
« on: December 03, 2010, 07:46:56 AM »

So someone mentioned it was possible yesterday to swap two bytes without a third and without using the assembly command to do so.  I figured out

Quote
A ^= B;
B ^= A;
A ^= B;

Does the trick.  Now, unless we're programming microcontrollers or something (in which case we'll have a more efficient swap routine anyway) we don't have any practical need for this sort of hack.  Yet it's interesting.  It's fun to know about, as a programmer, even for those of us who are aghast at the practice of bit-twiddling.


Another fun one for those of you who don't know is Fast Inverse Square Root, which can be done like this on little-endian systems:

Quote
void fastSqrt(float arg)
{
  Uint32 val = * (Uint32*) arg;
  val = 0x5f375a86 - (val >> 1);
  return 1.0f / (* (float*) val);
}

It's an approximation but it's extremely close to the correct value and is used frequently in shaders and sometimes actual game logic.  It's faster by a huge factor than the 'proper' means of calculating a square root.  Since the division isn't necessary for operations like vector normalizing, it can be skipped for even better performance.

It works by exploiting the nature of IEEE 754 floating-point number representation on modern processors, which consists of an exponent and a 'mantissa'.  That's its own matter--all you need to know is that the magic number "0x5f375a86" manages to make this thing work like a charm.


So anyway.  For a bit of fun, let's share the silly but intriguing hacks we know, whatever they might be.
Logged

Creativity births expression.  Curiosity births exploration.
Our work is as soil to these seeds; our art is what grows from them...


Wreath, SoundSelf, Infinite Blank, Cave Story+, <plaid/audio>
JoeManaco
Level 2
**



View Profile WWW
« Reply #1 on: December 03, 2010, 08:00:02 AM »

Quote
void fastSqrt(float arg)
{
  Uint32 val = * (Uint32*) arg;
  val = 0x5f375a86 - (val >> 1);
  return 1.0f / (* (float*) val);
}

John Carmack used this in the Quake Engine, I think Smiley
Logged

bateleur
Level 10
*****



View Profile
« Reply #2 on: December 03, 2010, 08:48:38 AM »

This is my code, but not my algorithm:

Code:
 public static function isqrt(n:uint):uint {
  var rem:uint = 0;
  var root:uint = 0;
  var divisor:uint = 0;
  for (var i:uint=0; i<16; i++) {
   root = root << 1;
   rem = ((rem << 2) + (n >>> 30));
   n = n << 2;
   divisor = (root << 1) + 1;
   if (divisor <= rem) {
    rem -= divisor;
    root++;
   }
  }
  return(root);
 }

Possibly you're wondering why I needed an efficient square root algorithm using only integers. It has to do with interactions between physics simulation and networked play.

(It's AS3. Feel free to use it.)
Logged

st33d
Guest
« Reply #3 on: December 03, 2010, 08:56:17 AM »

How about an even faster InvSqrt?

http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf
Logged
raigan
Level 5
*****


View Profile
« Reply #4 on: December 04, 2010, 06:49:42 AM »

Here's a great invsqrt overview (with references to other resources): http://cbloomrants.blogspot.com/2010/11/11-20-10-function-approximation-by_20.html
Logged
slembcke
Level 3
***



View Profile WWW
« Reply #5 on: December 04, 2010, 12:01:17 PM »

People get really excited about that inverse square root function, but keep in mind that on a lot of recent processors that 1.0f/sqrtf() will probably be slightly faster and far more accurate. True at least on my 3 year old Core2 Duo. Even on my 6 year old G5, 2 year old Atom and on the iPhone the InvSqrt() is only about 10-15% faster. (at least using GCC for all the tests) Unless your program does nothing but renormalize vectors, you are probably only going to see low single digit overall performance gains.

I'd say that my favorite hack has to be checking for NaNs. Comparisons to NaN always fail, even when comparing NaN == NaN. So to check if some calculation exploded, you can do this:
Code:
if(x != x){
  error("oh noes!");
}

It's completely non-obvious what that actually does until somebody tells you. There is isnan() in the C standard math library, but I remember running into compilers that didn't define it before. MSVC or some versions of GCC maybe?

I also often feel like trigonometry identities are hacks... Maybe that's just me though.
« Last Edit: December 04, 2010, 12:24:16 PM by slembcke » Logged

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
Glaiel-Gamer
Guest
« Reply #6 on: December 04, 2010, 12:11:58 PM »

Booleans implicitly converted to ints make awesome short ways to do things


Code:
float movex = (RIGHTKEYSTATE-LEFTKEYSTATE)*speed;
float movey = (DOWNKEYSTATE-UPKEYSTATE)*speed;
Logged
Glaiel-Gamer
Guest
« Reply #7 on: December 04, 2010, 12:21:54 PM »

Also, 2D array indexing if the width and height are powers of 2:

normally:
Code:
array[row*width+column]

but wait! width is a power of 2! Also column will always be less than the width, so the last N bytes in the index are all 0 before adding column.


Code:
//N = log2(width)
array[(row<<N)|column]

also just realized you can do bounds checking with bitwise operations too

Code:
//N = log2(width)
//M = log2(height)
//you want an int where all bits > N are 1, precompute it
//i.e. if N = 5, the bits should look like
//11111111111111111111111111100000
//which is N + N<<1 + N<<2 + N<<3 + N<<4 +...+ N<<32 //dont need to check, once you shifted it all the way over you're just adding 0 which is a nop. Can also use | instead of +
//unsigned K = N + N<<1 + N<<2 +...+ N<<32
//unsigned L = M + M<<1 + M<<2 +...+ M<<32

boundcheck = (unsigned(row)&L)|(unsigned(column)&K);
if(boundcheck){
  //out of bounds
}


i just realized this today after writing this... I think im gonna have to go pop that in to my code today since bound checking wastes time in a function thats called about 50000 times per frame (related to collision detection) in closure
« Last Edit: December 04, 2010, 12:59:47 PM by Glaiel-Gamer » Logged
Pineapple
Level 10
*****

~♪


View Profile WWW
« Reply #8 on: December 04, 2010, 12:25:43 PM »

Return ((arg>0)*2-1)*arg

It's slower than the typical function on the average processor, but I think it still qualifies as a hack, in case the compiler lacks the instruction for some reason.
Logged
slembcke
Level 3
***



View Profile WWW
« Reply #9 on: December 04, 2010, 12:36:25 PM »

A C neat hack that I don't actually use because I think it's mildly unreadable and probably tricks compilers into generating less efficient code:

Code:
!!value
// is the same as
value != 0

If value is 0 it evaluates to 0, if value is any other number it evaluates to 1.

Another neat hack that you'll see in compiled assembly code is setting a register to 0. Often times you'll write this code:
Code:
int number = 0;
Which is compiled to be something like this:
Code:
int number; number ^= number

The reason being that any value XORed with itself returns 0, and the XOR instruction on x86 processors is shorter than a load instruction with a 32 or 64 bit constant for 0. It makes for a smaller and slightly faster binary.
Logged

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
Netsu
Level 10
*****


proficient at just chillin'


View Profile WWW
« Reply #10 on: December 04, 2010, 03:45:10 PM »

Booleans implicitly converted to ints make awesome short ways to do things


Code:
float movex = (RIGHTKEYSTATE-LEFTKEYSTATE)*speed;
float movey = (DOWNKEYSTATE-UPKEYSTATE)*speed;


:O <runs away to change half of his ifs for this>
Logged

ink.inc
Guest
« Reply #11 on: December 04, 2010, 06:51:13 PM »

Code:
float movex = (RIGHTKEYSTATE-LEFTKEYSTATE)*speed;
float movey = (DOWNKEYSTATE-UPKEYSTATE)*speed;

I came.
Logged
LemonScented
Level 7
**



View Profile
« Reply #12 on: December 04, 2010, 07:51:01 PM »

I was always rather taken with this technique for finding out if a number is a power of 2:

Code:
bool IsPowerOfTwo(const int num)
{
    return !(num & (num-1));
}

Not really a hack, just a nice succinct way of using the properties of binary numbers.
Logged

slembcke
Level 3
***



View Profile WWW
« Reply #13 on: December 04, 2010, 08:12:48 PM »

I've always wanted to know if there was some clever way to get the next largest power of two for a number without a for loop. Like 9 -> 16, 8 -> 8, 129 -> 256, etc. I've never really sat down to figure it out, but it seems like there should be a moderately simple way to do it.
Logged

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
raigan
Level 5
*****


View Profile
« Reply #14 on: December 04, 2010, 08:19:06 PM »

People get really excited about that inverse square root function, but keep in mind that on a lot of recent processors that 1.0f/sqrtf() will probably be slightly faster and far more accurate.

Yeah, in the link I posted the author concludes that these days in 99% of cases it's faster to use the intrinsic command for whatever platform you're on.
Logged
Pineapple
Level 10
*****

~♪


View Profile WWW
« Reply #15 on: December 04, 2010, 08:28:43 PM »

I've always wanted to know if there was some clever way to get the next largest power of two for a number without a for loop. Like 9 -> 16, 8 -> 8, 129 -> 256, etc. I've never really sat down to figure it out, but it seems like there should be a moderately simple way to do it.

return 2^ceil(log(arg)/0.30103)
Logged
Zaphos
Guest
« Reply #16 on: December 04, 2010, 08:39:50 PM »

I've always wanted to know if there was some clever way to get the next largest power of two for a number without a for loop. Like 9 -> 16, 8 -> 8, 129 -> 256, etc. I've never really sat down to figure it out, but it seems like there should be a moderately simple way to do it.
There's some discussion of it here: http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float (and in the very next entry on that list + the entries on computing an integer log)
Logged
slembcke
Level 3
***



View Profile WWW
« Reply #17 on: December 04, 2010, 08:46:53 PM »

People get really excited about that inverse square root function, but keep in mind that on a lot of recent processors that 1.0f/sqrtf() will probably be slightly faster and far more accurate.

Yeah, in the link I posted the author concludes that these days in 99% of cases it's faster to use the intrinsic command for whatever platform you're on.

On the other hand, I don't mean to rob it of it's brilliant hack status. I honestly have no clue how somebody would have figured that out. I remember thinking the newton raphson method was nifty when I learned about it in calculus, but I never would have thought to seed the estimate using some random bitwise operation on the raw floating point number. O_o

return 2^ceil(log(arg)/0.30103)

But is there a way to do it only using simple integer and bitwise operations? Using floating point math kind of cheats it of it's "hack" status. Wink On a related note, that reminds me of some of the SQL Server code from where I used to work that was used for decoding bitmasks using the floating point pow() function. heh
Logged

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
slembcke
Level 3
***



View Profile WWW
« Reply #18 on: December 04, 2010, 08:50:26 PM »

There's some discussion of it here: http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float (and in the very next entry on that list + the entries on computing an integer log)

Code:
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
Shocked Not quite as straightforward as I expected.

It looks like it basically smears the bits to the right, doubling the size of the smear each time. The -- and ++ just handle the edge case where something is already a power of two. Nifty!
Logged

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.
Toeofdoom
Level 2
**



View Profile WWW
« Reply #19 on: December 05, 2010, 02:46:52 AM »

That one does look interesting, I suppose you could stick if statements in there too but it's unlikely they'd speed much up and they would make it even less neat.
Code:
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
if(v & 0xffffff00)
{
v |= v >> 8;
v |= v >> 16;
}
v++;
Could be faster if you don't expect it to go above 256 very often.
Also the shifts can be in any order... it looks hackier if they're randomly flung around right?
Logged

Pages: [1] 2 3 ... 7
Print
Jump to:  

Theme orange-lt created by panic