Author Topic: More hand-made compression fun  (Read 3602 times)
raigan
« on: June 03, 2010, 12:25:21 PM »

Before everyone starts shouting "stop reinventing the wheel; just zip it!", this is mostly for my own edification, because I know nothing about compression/encoding. Smiley

Anyway: imagine I want to record the state of the jump button in, say, Super Mario Bros. The "raw" data would be a list of bits, each bit representing the state of the button at a particular frame, so bits[0] tells me if the button is down on the 1st frame, bits[8] tells me if it's down on the 9th frame, etc.

This seems to be ripe for run-length encoding -- because the game is running at a high frequency (say, 60 Hz), the button state is likely to look something like 00000000011111111000000111111111111111110011000000, i.e. lots of identical contiguous values.

So, let's say we encode this in the following way:
-record the starting value (0 or 1)
-record a list of N-bit numbers, where each number is the length of the next run of identical bits; 0 is a special code indicating that the count is greater than 2^N-1 -- it means "the count is 2^N-1 plus the value of the next number" (and that next number can itself be 0, and so on, for very long button presses; see the sketch just below)
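
A minimal sketch of this scheme in Python (my own naming; `bits` is a non-empty list of 0/1 ints, `n` the code width):
Code:
def encode_runs(bits, n):
    """Encode a bit list as (starting value, list of N-bit codes)."""
    max_count = (1 << n) - 1        # largest count a single code can hold
    # first, split the input into run lengths
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    # then emit the codes, using 0 as the "count continues" escape
    codes = []
    for run in runs:
        while run > max_count:      # 0 means "2^N - 1 plus the next code"
            codes.append(0)
            run -= max_count
        codes.append(run)
    return bits[0], codes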

My question is: how can I determine, for a given set of raw input, what the ideal value of N is?

Is there a direct way to calculate the optimal value of N, from e.g. a histogram of the input (i.e. the sizes of all contiguous runs) or something similar? It seems like there might be some sort of formula relating the size of the compressed output to N and the frequency at which the input changes.

Any keywords or links would be appreciated; I'm sure this is a well-studied problem, but as I said, I know nothing.

Rest assured that I will be benchmarking this against "just run the raw data through zlib" to make sure it's worthwhile Smiley

thanks,
raigan
Glaiel-Gamer
« Reply #1 on: June 03, 2010, 12:34:54 PM »

Code:
outputfile = null
minsize = infinity
for(N=1 to 64){                 // try every candidate bit width
    file = compress(N)          // encode the input using N-bit counts
    if(size(file) < minsize){   // keep the smallest result so far
        minsize = size(file)
        outputfile = file
    }
}

or you could do it the SWF way, and have 5 bits be what N is (or 2 bits, and add 10 to it) for each block; perhaps keep the same N until you read a certain bit at the beginning of each block

i.e. each block:

1 bit: does N need to be updated
[if N needs to be updated, read 5 bits and add 1 to it, this is your new N]
[if N>1] N bits: the size of the block, add 1 to this (since 0 can't be a size) [alternatively, use 0 to signify the next block will be raw data, and the next 10 bits specify how long it is]
1 bit: the value of the block


hence, if your string is 010101001010101100000000000000111111111111 you only really need to set N twice: 1 for the first half and 4 for the second half

now determining when you should / shouldn't set N and the value of what N should be here would be a much more interesting problem...


010101001010101100000000000000111111111111 compresses to:

Code:
1 00000 0 0 //n = 0+1 = 1
0 0 1
0 0 0
0 0 1
0 0 0
0 0 0
0 0 0
0 0 1
0 0 0
0 0 1
0 0 0
0 0 1
0 0 0
0 0 1
0 0 1
1 00011 1100 0 //n=3+1 = 4
0 1010 1

1000000000100000100000000000100000100000100000100110001111000010101
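
A decoder sketch for this block format, under my reading of it (Python; a size field is read even when N = 1, which matches the worked example, and truncated trailing bits are ignored):
Code:
def take(bits, k):
    """Read k bits, MSB first, from an iterator of '0'/'1' chars."""
    v = 0
    for _ in range(k):
        v = (v << 1) | int(next(bits))
    return v

def decode(bitstring):
    # assumes the stream sets N before the first size field
    bits, out, n = iter(bitstring), [], None
    try:
        while True:
            if take(bits, 1):            # flag: does N need updating?
                n = take(bits, 5) + 1
            size = take(bits, n) + 1     # run length (0 can't be a size)
            out.extend([take(bits, 1)] * size)
    except StopIteration:
        return ''.join(map(str, out))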
« Last Edit: June 03, 2010, 12:52:39 PM by Glaiel-Gamer »
raigan
« Reply #2 on: June 03, 2010, 01:26:11 PM »

I don't think I want to get into the world of "variable N", too scary!

So is there really no solution better than just brute-force encoding using all possible values of N? Given a histogram that tells me how many contiguous runs of M bits there are in the input, isn't there a direct way to find the optimal value of N? The cost of encoding for a particular value of N would be:

output_bit_count = Sum_M( num_M * N * ceil(M / (2^N - 1)) )

where num_M = the number of contiguous runs of length M, 2^N - 1 is the largest count a single N-bit code can hold, and the sum runs over all the measured values of M.

You're right though, might as well just brute-force it. I just figured that there was maybe a simple solution to this problem..
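
For what it's worth, that formula can be evaluated straight from the histogram -- a Python sketch with my own naming (`histogram` maps run length M to num_M), which also picks the best N without re-encoding anything:
Code:
import math

def encoded_bits(histogram, n):
    """Total output bits for code width n, given {run_length: count}."""
    max_count = (1 << n) - 1      # largest count one N-bit code can hold
    return sum(count * n * math.ceil(m / max_count)
               for m, count in histogram.items())

def optimal_n(histogram, max_n=16):
    """Try every candidate width and keep the cheapest."""
    return min(range(1, max_n + 1), key=lambda n: encoded_bits(histogram, n))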
Tycho Brahe
« Reply #3 on: June 03, 2010, 01:58:02 PM »

To be honest, this seems perfect for run length encoding.

However, it might be an idea to simply record the keydown and keyup times and the keycode; that way you only store two values per button press. This wouldn't work if there were lots of rapid presses, for something like Guitar Hero, but it would be perfect for a racing sim, where you hold one key for a long time and use occasional presses of other buttons.

just my two cents/ideas
raigan
« Reply #4 on: June 03, 2010, 04:04:31 PM »

@14113: I think that this is somehow the same as what Tyler suggested: if you had an ordered list of event times (you don't need to store up/down with each event -- you just need to know the starting state (up or down), since each event is then the opposite of the previous one), you could then delta-encode the list so that, instead of absolute time, each event defines its time relative to the previous event.
Tycho Brahe
« Reply #5 on: June 03, 2010, 04:27:03 PM »

Looking back at it, they do seem very similar. Mainly, though, I was suggesting changing how you record it, not how you compress it, as it seems counterintuitive to continually save data which changes so little over the course of a few frames.
However, if you are set on recording/compressing, why not take a look at some video compression algorithms? They operate (IMHO) in a similar set of circumstances: data changing frame by frame, normally by small amounts, with drastic changes at unpredictable intervals.
Pineapple
« Reply #6 on: June 04, 2010, 02:51:23 AM »

You'd probably be able to just store the RLE data as one byte per run: 7 bits for the run length and 1 bit for the value. However, I do agree with 14113.
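
A tiny sketch of that packing (assuming the encoder splits any run longer than 127 frames):
Code:
def pack_runs(runs):
    """runs: list of (length, value) pairs, 1 <= length <= 127, value 0 or 1."""
    return bytes((length << 1) | value for length, value in runs)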
Ben Kuhn
« Reply #7 on: June 06, 2010, 11:21:29 AM »

Madk, that leads to inefficiency when you have lots of rapid state transitions. Specifically, if the state changes more often than once per 8 frames, it will be more efficient not to encode it at all.

I'm going to be radical and suggest that RLE isn't what you want at all. Rather, you should encode the frame number for each button press/release. If you have fewer than 16 buttons, you can spend 4 bits on which button, 16 bits on which frame (as long as you don't go over 1092 seconds at 60 fps -- perfectly reasonable for a Mario style game) and you don't have to spend anything on recording button state -- since it's binary you can just flip the state every time. That's 20 bits per button event or 5 bytes per button press - pretty reasonable in my book.

Actually, even better: spend 8 bits on the time, giving you a total of 12 bits per button event. Every 256 frames (4.3 seconds), start an event with button code 1111, to indicate that the clock has rolled over. Since that code is never used except to indicate a rollover, you don't have to put a time after it. In fact, if you have fewer than 12 buttons, you can just record 11. Hey presto, 3 bytes per button press plus <1 bit/second. I think that's superior to RLE in most cases, although some constraints might make RLE or a limited form thereof better (for example, if whether the jump button is still pressed only matters for the first N frames, as is the case in Mario, then you might do better by just attaching the run length to the button-press record rather than recording the button release separately).
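
A rough sketch of that 12-bit event scheme (my own naming; button code 1111 is the reserved rollover marker and carries no timestamp):
Code:
ROLLOVER = 0b1111                # reserved 4-bit code: the clock rolled over

def encode_events(events):
    """events: sorted (frame, button) pairs, with button codes 0..14.
    Returns (code, time) tuples; time is None for rollover markers."""
    out, window = [], 0
    for frame, button in events:
        while frame >= (window + 1) * 256:    # passed a 256-frame boundary
            out.append((ROLLOVER, None))      # 4-bit marker, no timestamp
            window += 1
        out.append((button, frame - window * 256))   # 4 + 8 = 12 bits
    return out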


edited because 16 = 2^4, not 2^5. doh.
edited again: apparently I decided not to read the part where other people suggested this idea. I guess if you're interested in the math this is still useful. Just ignore the "I'm going to be radical" part.
« Last Edit: June 06, 2010, 11:35:25 AM by Ben Kuhn »
Tycho Brahe
« Reply #8 on: June 06, 2010, 11:27:19 AM »

I think that's basically what's been suggested already, in that you only store the button up/down times. However, you've gone further and worked out the maths. Awesome! (I love me some maths)
bateleur
« Reply #9 on: June 06, 2010, 11:37:37 PM »

Instead of rolling the clock over, another possibility would be to encode the length of time between state changes for the button in place of the frame number. To handle rollover there you just need to reserve a single value as a kind of escape character to denote exceptional cases generally.

raigan
« Reply #10 on: June 07, 2010, 02:52:01 PM »

Quote
Instead of rolling the clock over, another possibility would be to encode the length of time between state changes for the button in place of the frame number. To handle rollover there you just need to reserve a single value as a kind of escape character to denote exceptional cases generally.

I think this is what I was trying to figure out, actually; I shouldn't have said "RLE", since that was a bit of a misnomer Smiley

If you store the time of each event, you end up with something like 29,34,190,204, etc., where the values are in increasing order; you might as well encode the delta between consecutive values instead of the absolute time, since the deltas are smaller and thus require fewer bits.
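
For instance, with the numbers above (a two-line Python sketch):
Code:
times  = [29, 34, 190, 204]                               # absolute frames
deltas = [times[0]] + [b - a for a, b in zip(times, times[1:])]
# deltas == [29, 5, 156, 14] -- smaller values, fewer bits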

So we then return to the real problem: how many bits should we use to encode the deltas? If N = # of bits, each "spillover" costs an extra N bits, and a delta of size d needs roughly floor(d/(2^N - 1)) of them. If N is too small you end up with lots of spillovers, but if it's too big you waste unused bits on every delta.

If the input is known, there should be a way to calculate the optimal value of N. I just figured there would be a formula or something, but for now brute-force as glaiel suggested is good enough Smiley

Of course, using a non-constant number of bits is probably best, but that makes the optimization problem a lot harder.
iopred
« Reply #11 on: June 07, 2010, 10:51:18 PM »

I use a half-RLE system, where if the string is:

0,0,1,0,1,0,1,1

then it's the raw on/off value for each frame, but if a number is negative, that is the RLE flag.

so:

0,0,1,0,-5,1,0,1,1

expands to

0,0,1,0,1,1,1,1,1,0,1,1

This works great because, when it's encoded to a string, you only use the RLE when there is a net saving (a run of 3 or more, in my case).
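
A sketch of the expansion step as I read it (my own naming; -n means "repeat the next value n times"):
Code:
def expand(tokens):
    """Expand half-RLE: a negative token -n repeats the next token n times."""
    out, it = [], iter(tokens)
    for t in it:
        if t < 0:
            out.extend([next(it)] * -t)   # RLE flag: run of the next value
        else:
            out.append(t)                 # literal frame value
    return out

# expand([0,0,1,0,-5,1,0,1,1]) == [0,0,1,0,1,1,1,1,1,0,1,1]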
shrimp
« Reply #12 on: June 07, 2010, 11:44:46 PM »

If you were storing those 1s and 0s as single bits rather than text characters, you would get a file at worst 8 (or 16?) times smaller, without using RLE. Of course, the ease of writing to a text buffer might outweigh the saving, but if size is your main concern then it's something to consider.
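
A minimal sketch of that bit packing (my own naming; the final byte is zero-padded):
Code:
def pack_bits(bits):
    """Pack a list of 0/1 ints into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for j, b in enumerate(bits[i:i + 8]):
            byte |= b << (7 - j)          # place each bit, MSB first
        out.append(byte)
    return bytes(out)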

muku
« Reply #13 on: June 08, 2010, 01:02:51 AM »

Quote
Of course, using a non-constant number of bits is probably best, but that makes the optimization problem a lot harder.

You could do something like this: if your delta is in the range 0-127, just write it as a single byte; the highest-order bit will be zero. Otherwise, determine how many additional bytes your integer needs and set that many high-order bits of the first byte to 1. You need an additional 0 bit to separate this width field from the actual number, but the rest of the bits are available to store your number in.

E.g.
Code:
00000011 = 3                         (no extra bytes)
10000000 10000000 = 128              (one extra byte)
10000001 00000001 = 257              (one extra byte)
11000001 00000000 11111111 = 65791   (two extra bytes)

This is under the assumption that short deltas would be most common, so those will be encoded as just a single byte.
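
An encoder sketch that matches those examples (my own naming; the prefix construction is valid for up to 7 extra bytes, i.e. values below 2^56):
Code:
def encode_delta(v):
    """k leading 1-bits on the first byte mean k extra bytes follow;
    the payload is 7 + 7k bits, stored big-endian."""
    k = 0
    while v >= 1 << (7 + 7 * k):          # smallest k whose payload fits v
        k += 1
    first = ((0xFF << (8 - k)) & 0xFF) | (v >> (8 * k))   # prefix + top bits
    rest = [(v >> (8 * i)) & 0xFF for i in range(k - 1, -1, -1)]
    return bytes([first] + rest)

# encode_delta(3)     -> b'\x03'          (00000011)
# encode_delta(257)   -> b'\x81\x01'      (10000001 00000001)
# encode_delta(65791) -> b'\xc1\x00\xff'  (11000001 00000000 11111111)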

Frankly, I think in practice this would be a very premature optimization until you've determined that this data is actually taking up way too much space, but since you said it's a learning experience, that's cool I guess Smiley
st33d
« Reply #14 on: June 08, 2010, 03:19:34 AM »

I wrote a method for this into a genetic algorithm in Java I was working on, to calculate the best number of bits to use for a given number of characteristics:

Code:
// utility for finding the number of digits a number has in base 2
// (assumes num > 0; note the floating-point log can be off by one
// for very large values, where an integer bit count would be safer)
 
  public static final int base2(int num){
    return (int)Math.round(Math.log(num)/Math.log(2)+0.5);
  }

So why not use that equation to answer your problem?
muku
« Reply #15 on: June 08, 2010, 03:33:04 AM »

Quote
So why not use that equation to answer your problem?

The thing is, every delta will be a different size, so you either have to choose one fixed bitwidth and deal with wasted space and overflows, or encode the exact size in bits for every individual delta, which seems pretty wasteful. My approach above is a compromise between those two extremes in that you can specify the width for every delta in bytes, which doesn't use up too many bits. Also you always stay aligned to byte boundaries, which makes en-/decoding a bit easier.
muku
« Reply #16 on: June 08, 2010, 05:51:34 AM »

Ok, so I decided to indulge in some math geekery. Warning, the following may have adverse effects on those who do not enjoy this kind of thing.

Let's say you use my encoding scheme generalized to a bitwidth of N bits. Then you can represent the numbers from

0 .. 2^(N-1)-1                in N bits,
2^(N-1) .. 2^(2(N-1))-1       in 2N bits,
2^(2(N-1)) .. 2^(3(N-1))-1    in 3N bits,

and so on. (Every unit gives up one bit to the length marker, hence the N-1 payload bits per unit.)

(EDIT: Seems this is not really optimal, and perhaps also not quite correct when your length marker gets longer than N bits... Anyway, let's roll with this for now.)

The other thing we need is some stochastic representation of the button events as a random variable. Let's say the sequence of button down/up events is modeled by a Poisson process, that is, they occur independently of each other, much like ticks on a Geiger counter. (This assumption is maybe not ideal, but I think it can work as an approximation. It would be interesting to capture some real-life data and verify how it is actually distributed.)

The waiting times between two events then follow an exponential distribution, which has the cumulative distribution function P(x) = 1 - exp(-lambda*x). Here lambda is a parameter such that 1/lambda is the average waiting time between events in frames, i.e. 60*lambda is the average number of events per second for a game at 60fps.

If we put all this together, we can compute the expected length in bits for encoding one delta value. We have two parameters to play with here: the lambda and the N. Let's first say the game/player is rather chilled out and there are, on average, two events (i.e. one jump, button down and up again) per second, which gives us lambda = 2/60. If we plot N on the x axis and the expected number of bits on the y axis, we get this:

[plot: expected bits per encoded delta vs. N, for lambda = 2/60 -- the minimum is around N = 4]
That is, using something around N = 4 bits would be optimal here. If, on the other hand, the game is very button-mashey and we have around 10 events/s, giving us lambda=10/60, we get this picture:

[plot: expected bits per encoded delta vs. N, for lambda = 10/60 -- the minimum is around N = 3]
Here the minimum lies somewhere around 3 bits, which would be the optimal choice for N then.

That's all for now. I can give out the Mathematica notebook if anyone wants to have a look.
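
For those without Mathematica, a rough Python re-creation of that computation (my own naming; continuous exponential approximation, truncated after max_units width steps, valid for N >= 2):
Code:
import math

def expected_bits(lam, n, max_units=50):
    """E[bits per delta]: band j (j units of n bits each) covers
    deltas in [2^((j-1)(n-1)), 2^(j(n-1)))."""
    total, lo = 0.0, 0.0
    for j in range(1, max_units + 1):
        hi = 2 ** (j * (n - 1))
        p = math.exp(-lam * lo) - math.exp(-lam * hi)   # P(lo <= delta < hi)
        total += j * n * p
        lo = hi
    return total

# chilled-out play, ~2 events/s at 60fps:
# min(range(2, 17), key=lambda n: expected_bits(2/60, n))  ->  4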
« Last Edit: June 08, 2010, 06:09:43 AM by muku »
raigan
« Reply #17 on: June 08, 2010, 08:33:39 AM »

Awesome! Instead of modeling/predicting the button input as you do, could you apply the same solution to e.g. a histogram of time-between-events?

Also, it's really cool to see how non-linear the response is as you vary N.. totally informative! Smiley
muku
« Reply #18 on: June 08, 2010, 02:01:36 PM »

Quote
Awesome! Instead of modeling/predicting the button input as you do, could you apply the same solution to e.g. a histogram of time-between-events?

Definitely. The best thing would be to capture some real data first and get some idea how it is distributed; the exponential distribution I assumed may be way off base. Then, once that is confirmed, estimating lambda from an actual data set should be easy: just compute the average time between events and take the reciprocal of that.
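
In Python, with `deltas` holding the measured frame gaps (an assumed variable):
Code:
lam = len(deltas) / sum(deltas)   # reciprocal of the mean gap: events per frame
events_per_second = 60 * lam      # for a 60fps game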

Quote
Also, it's really cool to see how non-linear the response is as you vary N.. totally informative! Smiley

Yes, that surprised me quite a bit too. I thought there would be a convex function with a clear minimum, but I guess the discrete nature of the bitwidth function creates those hills and valleys. To be fair, though, as I noted, my encoding widths aren't completely correct, which could shift the balance slightly towards more bits.
Zaphos
« Reply #19 on: June 08, 2010, 09:00:33 PM »

It seems like you could go directly from a histogram to the cost for each N, for a given data set, without going through the exponential distribution assumption?  For each bin in the histogram and a given N you know how much it costs to compress each element in that bin, so just multiply cost by quantity ...

Unless that's what you mean by the brute force approach?  It should be better than doing the full compression process for each N though.