Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length

 
Advanced search

879459 Posts in 32981 Topics- by 24367 Members - Latest Member: bastion_music

May 24, 2013, 06:24:26 AM
TIGSource ForumsDeveloperTechnical (Moderators: Glaiel-Gamer, ThemsAllTook)Storing Game Data Efficeintly
Pages: 1 [2] 3 4
Print
Author Topic: Storing Game Data Efficeintly  (Read 7960 times)
handCraftedRadio
The Ultimate Samurai
Level 10
*


curt kling

handCraftedRadio
View Profile WWW Email
« Reply #15 on: July 15, 2007, 11:07:34 AM »

PoV: The method that you used increases the file size to 3.0MB, double what it was before. I think I'm going to have to take it to the next level and encode it as binary data like you said, but I have no idea how to do it.
Logged

DrDerekDoctors
THE ARSEHAMMER
Level 7
******


70463070 graham@duketastrophy.demon.co.uk drderekdoctors
View Profile WWW
« Reply #16 on: July 15, 2007, 11:57:42 AM »

If you can blow chunks out of it in the game (or deform it in some other way) you'd be crazy not to opt for run length encoding of some form given how chunky it is.

I think you meant "you'd be crazy to opt for...", because of the run adding/removal problem.   That's why I tried to clarify.

Nope, because he's talking about saving it to disc, not how it's stored in memory. Obviously you wouldn't store it in memory in RLE form because you've plenty of memory. But when you save it to disc, given that the data is comprised of large runs of values, RLE seems the natural choice.
Logged

Me, David Williamson and Mark Foster do an Indie Games podcast. Give it a listen. And then I'll send you an apology.
http://pigignorant.com/
Chris Whitman
Sepia Toned
Level 10
*****


A master of karate and friendship for everyone.


View Profile Email
« Reply #17 on: July 15, 2007, 12:50:50 PM »

Hell, if you've got two values, 'on' and 'off,' you could do an RLE scheme where the high bit determines which value you're writing and the rest of the entry encodes the length information. It's simple and homogeneous enough to be interpreted by an extremely reduced set of states.

Additionally, if you have only two values and you're worried about memory space, storing every 1 or 0 in a separate byte is massively wasteful. You could just pack it all in there, bit by bit, and you'd save memory by a factor of eight.

Furthermore, I would avoid using magic numbers in your reading or writing routines (by 'magic numbers,' I mean explicit numbers like 768 or 1024). You could substitute variables there, or predefined constants at the very least, so you're not restricting yourself to a particular size, in case you want to change it later on. As a general rule, don't let low level routines make important choices that should be made by their callers.

Even furthermore, I'd like to point out that the next step, as it is in all algorithm development, is the first step, which is designing the entire thing and doing it correctly the first time. If you'd like to do a run length encoding scheme, sit down and figure out how it will work on paper, then implement that.
Logged

Formerly "I Like Cake."
PoV
Level 5
*****


Click on the eye and something might happen..maybe


View Profile WWW
« Reply #18 on: July 15, 2007, 04:24:33 PM »

If you can blow chunks out of it in the game ...

Well then, don't hint at "blowing chunks" out of anything, whilst supporting the idea of RLE.  Tongue

PoV: The method that you used increases the file size to 3.0MB, double what it was before. I think I'm going to have to take it to the next level and encode it as binary data like you said, but I have no idea how to do it.

It might sound strange, but that's actually good.  Understanding why is important to the "next step", the idea of binary encoding. 

So, in your original example, you were writing text to the file.  An ASCII '0' or '1', followed by an ASCII newline character (0x13 and/or 0x10, for simplicity, lets assume it's only 1 byte for the newline).  That's two bytes per pixel.  To contrast, what we've written with the binary file example is the entire 4 bytes of of an integer, which is as you noted, twice the size of the text representation.

So, the solution to this problem is to somehow write less data per pixel to the file.  Given that you're only need 1 bit of information per pixel, that's our ideal size.  However, a first step could be to go from writing 2-4 bytes, and reduce that to writing 1 byte per pixel.

Bytes can hold values between 0-255 (or -128 and 127), and a byte is 8 bit's long.  Bytes are our basic measurement in computers, and are out minimum unit of space we can use.  What that means, is even though we only want 1 bit of information, we're required to use up at least 1 whole byte to do it.  Bytes represent basic text characters (A, B, C, 1, 2, 3, ...).  "C style strings" are arrays of bytes, followed by a terminator character ('H', 'e', 'l', 'l', 'o', 0).  Integers, pointers, and float's are built up of 4 bytes.

In C/C++, our unit of bytes is represented by the types 'char' and 'unsigned char'.  So the easy/lame solution is to move your data to a 'char', then write it.

Code:
ofstream MapFile("MyMap.map", ios::out | ios::binary);
int MyVariable = 25;
char MyByte = (char)MyVariable;
MapFile.write( &MyByte, sizeof(char) );

Code:
ifstream MapFile("MyMap.map", ios::in | ios::binary);
char MyByte;
int MyVariable;
MapFile.read( &MyByte, sizeof(char) );
MyVariable = (int)MyByte;

However, all this does for us is reduce the data to 1 byte in size, half of your smallest file so far.

Now, how to encode in to binary.

As it's been mentioned, binary is that whole 0 and 1 thing.  As far as whole numbers are concerned, binary is also known as the "base 2" number system.  Decimal is "base 10".  That means for every digit (i.e. character), there are that many options.  Every digit in a decimal number is between 0-9, and every digit in binary is between 0-1.

http://en.wikipedia.org/wiki/Binary_numeral_system

I'm going to assume you know what binary is for the rest of this, as well as it's basic operations (addition, subtraction, shifting, "OR", "AND", "XOR").  Review the above link to learn more/brush up on it.

Now, as it was said earlier, our smallest unit we can deal with on computers is the byte, not the bit.  To use bits, we need to cram/fit them in to a byte.  Since a byte is 8 bits in size, we can fit 8 of them in one.

So, what we want to do is change our reading and writing code to work with 8 pixels at one time.

Code:
  ofstream LFile("LevelFile.lvl", ios::out | ios::binary);
  for (int SCheckX = 0; SCheckX < 1024; SCheckX+=8)
  {
   for (int SCheckY = 0; SCheckY < 768; SCheckY++)
   {
     // ??
   }
  }

Inside that, we'll be doing something with 8 of our pixels.  Note that I changed the X to increment by 8, instead of 1.

To work with the idea of binary, we need to be sure our numbers are 0 and 1.  The number 1 in an integer or a byte is actually the equivalent of setting the first bit on.  2 in an integer or byte is the 2nd bit on.  4 in an integer or byte is the 3rd bit on, and so on.

We can move bit's around an integer or byte using shifts.  Down shift ">>" and up shift "<<".  An easy way to remember which is which is to think of the symbol as an arrow, and to look at a number.  "500".  If I wan to add another zero (5000), that'd be bigger.  If I was to remove a zero (50), that'd be smaller.  Up and down.  Up shifting makes that number bigger (doubles it to be exact).  So moving the number that way (<-), and it gets bigger (hence, "up").  "500 << 1" becomes "1000" after up shifting.  Down shifting half's the number, rounding down.  "500 >> 1" becomes "250".  Multiple shifts is like applying a shift multiple times.  "500 >> 2" would be 125, a quarter the size of the original, or 4x the size if shifting up.  Doing it again would be an 8th the size shifting down, or 8x the size if shifting up.

This is actually a binary operation behind the scenes.  Your number 50 in binary is "110010".  Shifting up, the binary number becomes "1100100", aka 100 in decimal.  Shifting down, the binary number becomes "11001", aka 25.  Shift that down again, and that right side 1 gets clipped off ("1100"), resulting in 12.  That's also an easy way to know if a number is odd or even (checking that first bit).

So, to encode our 8 pixels in to a byte, we need to fit each one in to a binary slot of a byte.  The first one is fine where it is (MyNumbers[X]).  The 2nd one needs to be shifted up by 1 (MyNumbers[X+1] << 1).  The 3rd one needs to be shifted up 2 slots (MyNumbers[X+2] << 2).  And so on, up to the last one (MyNumbers[X+7] << 7).  That'll give us individual binary bits.  Now to merge them in to one byte.

There's actually 3 ways to do this.  Adding the numbers together (+), XOR'ing the numbers together (^), or OR'ing the numbers together (|).  Technically in this case, they'll all do the same thing, but if there was bit overlap, the results of each operation would be different.  Our ideal operation for merging bit's is OR'ing.

Code:
int Bits = (Num[X+0] << 0);
Bits |= (Num[X+1] << 1);
Bits |= (Num[X+2] << 2);
Bits |= (Num[X+3] << 3);
Bits |= (Num[X+4] << 4);
Bits |= (Num[X+5] << 5);
Bits |= (Num[X+6] << 6);
Bits |= (Num[X+7] << 7);

I used the "OR Equals" operator because this wouldn't have fit cleanly on 1 line.

The above is what we'd do to combine all the bits for each pixel together in to a single number.  Convert that to a "char", and write it as binary to a file.

This post is long, so I'll do reading in a new one.
Logged

Mike Kasprzak | Sykhronics Entertainment - Smiles (HD), PuffBOMB, towlr, Ludum Dare - Blog
PoV
Level 5
*****


Click on the eye and something might happen..maybe


View Profile WWW
« Reply #19 on: July 15, 2007, 04:37:37 PM »

Reading.  Smiley

If we were to write that data all packed in to a single byte to the file, we'd now need to decode that to use it.

So like the modified writer, each single byte you read would contain information for every 8 pixels.

To extract bits from a bit-mask (another name for what we just created), we need to use shifts and AND's (&).

Code:
unsigned char Number = 50; // technically, binary number "110010"
int Bit[8];
Bit[0] = (Number >> 0) & 1;
Bit[1] = (Number >> 1) & 1;
Bit[2] = (Number >> 2) & 1;
Bit[3] = (Number >> 3) & 1;
Bit[4] = (Number >> 4) & 1;
Bit[5] = (Number >> 5) & 1;
Bit[6] = (Number >> 6) & 1;
Bit[7] = (Number >> 7) & 1;

Bit[1], Bit[4] and Bit[5] would be 1, and the rest would be 0.

What AND'ing does is only allows bits through that match the bit mask.  Our bit mask is "1", which is only the first bit.  And since we shifted up to place each bit, we now shift down to place them in the 1's spot.

A lot to take in, but that's binary encoding.  Writing a physical single byte, for 8 bits of information, instead of more.
Logged

Mike Kasprzak | Sykhronics Entertainment - Smiles (HD), PuffBOMB, towlr, Ludum Dare - Blog
handCraftedRadio
The Ultimate Samurai
Level 10
*


curt kling

handCraftedRadio
View Profile WWW Email
« Reply #20 on: July 15, 2007, 08:23:33 PM »

Wow! Thanks for all the help! I still have a problem though. Whenever I try to read it i'm getting all ones. Here is the code I have:

Writing:
Code:
int zBits;
  ofstream LFile("LevelFile.lvl", ios::out | ios::binary);
  for (int SCheckX = 0; SCheckX < 1024; SCheckX = SCheckX + 8)
  {
   for (int SCheckY = 0; SCheckY < 768; SCheckY++)
   {
    zBits |= (MapCo[SCheckX][SCheckY]<<0);
    zBits |= (MapCo[SCheckX+1][SCheckY]<<1);
    zBits |= (MapCo[SCheckX+2][SCheckY]<<2);
    zBits |= (MapCo[SCheckX+3][SCheckY]<<3);
    zBits |= (MapCo[SCheckX+4][SCheckY]<<4);
    zBits |= (MapCo[SCheckX+5][SCheckY]<<5);
    zBits |= (MapCo[SCheckX+6][SCheckY]<<6);
    zBits |= (MapCo[SCheckX+7][SCheckY]<<7);
   
    LFile.write((char *)(&zBits),sizeof(int));
 
   }
   
  }

Reading:
Code:
int zBit[8];
 unsigned char rNum;
   for (int SCheckX = 0; SCheckX < 1024; SCheckX = SCheckX + 8)
  {
   for (int SCheckY = 0; SCheckY < 768; SCheckY++)
   {
    LFile.read((char *) &rNum, sizeof(rNum));
   
    zBit[0] =  (rNum >> 0) & 1;
    zBit[1] =  (rNum >> 1) & 1;
    zBit[2] =  (rNum >> 2) & 1;
    zBit[3] =  (rNum >> 3) & 1;
    zBit[4] =  (rNum >> 4) & 1;
    zBit[5] =  (rNum >> 5) & 1;
    zBit[6] =  (rNum >> 6) & 1;
    zBit[7] =  (rNum >> 7) & 1;
   
    MapCo[SCheckX][SCheckY] = zBit[0];
    MapCo[SCheckX + 1][SCheckY] = zBit[1];
    MapCo[SCheckX + 2][SCheckY] = zBit[2];
    MapCo[SCheckX + 3][SCheckY] = zBit[3];
    MapCo[SCheckX + 4][SCheckY] = zBit[4];
    MapCo[SCheckX + 5][SCheckY] = zBit[5];
    MapCo[SCheckX + 6][SCheckY] = zBit[6];
    MapCo[SCheckX + 7][SCheckY] = zBit[7];
   
   }
   
  }

I must be doing something wrong but I don't know what it is.
Logged

PoV
Level 5
*****


Click on the eye and something might happen..maybe


View Profile WWW
« Reply #21 on: July 15, 2007, 09:28:54 PM »

You're not clearing your temporary variable each time.  OR'ing will eventually fill it all up with 1's.

Also, you might want to write a byte, and not an integer, else your file will be 4x larger than it needs to be.

Code:
  ofstream LFile("LevelFile.lvl", ios::out | ios::binary);
  for (int SCheckX = 0; SCheckX < 1024; SCheckX = SCheckX + 8)
  {
   for (int SCheckY = 0; SCheckY < 768; SCheckY++)
   {
    unsigned char zBits = 0;
    zBits |= (MapCo[SCheckX][SCheckY]<<0);
    zBits |= (MapCo[SCheckX+1][SCheckY]<<1);
    zBits |= (MapCo[SCheckX+2][SCheckY]<<2);
    zBits |= (MapCo[SCheckX+3][SCheckY]<<3);
    zBits |= (MapCo[SCheckX+4][SCheckY]<<4);
    zBits |= (MapCo[SCheckX+5][SCheckY]<<5);
    zBits |= (MapCo[SCheckX+6][SCheckY]<<6);
    zBits |= (MapCo[SCheckX+7][SCheckY]<<7);
   
    LFile.write((char *)(&zBits),sizeof(unsigned char));
 
   }
  }

And in your read code, you should be able to do away with the zBit array, and set MapCo directly to the values.
Logged

Mike Kasprzak | Sykhronics Entertainment - Smiles (HD), PuffBOMB, towlr, Ludum Dare - Blog
ravuya
Level 7
**


Yip yip yip yip yip

ravuya@gmail.com ravuya
View Profile WWW
« Reply #22 on: July 15, 2007, 09:40:42 PM »

Hey, don't forget about endianness, you jerks.
Logged

PoV
Level 5
*****


Click on the eye and something might happen..maybe


View Profile WWW
« Reply #23 on: July 15, 2007, 10:09:32 PM »

It should be fine.
Logged

Mike Kasprzak | Sykhronics Entertainment - Smiles (HD), PuffBOMB, towlr, Ludum Dare - Blog
inmatarian
Level 0
*


2993 forever


View Profile
« Reply #24 on: July 15, 2007, 10:25:02 PM »

If you want to link against zlib, you could also use their inflate and deflate functions. I have a generic inflate function here, which should be sort of the same with a deflate. (the error codes were removed to make it terse, but you could find them in the documentation)

Code:
int unzip( const char * from, int fromlen, const char * to, int maxlen )
{
  int ret;
  z_stream zstrm;

  /* allocate inflate state */
  zstrm.zalloc    = Z_NULL;
  zstrm.zfree     = Z_NULL;
  zstrm.opaque    = Z_NULL;
  zstrm.avail_in  = fromlen;
  zstrm.next_in   = (Bytef*) from;

  /* Start Stream (Allow GZIP and ZLIB Headers) */
  ret = inflateInit2(&zstrm, 15 + 32);

  /* Set Target */
  zstrm.avail_out = maxlen;
  zstrm.next_out  = (Bytef*) to;

  /* Decompress */
  ret = inflate(&zstrm, Z_FINISH);

  /* Shutdown */
  inflateEnd(&zstrm);
  return 1;
}
« Last Edit: July 15, 2007, 10:27:26 PM by inmatarian » Logged
DrDerekDoctors
THE ARSEHAMMER
Level 7
******


70463070 graham@duketastrophy.demon.co.uk drderekdoctors
View Profile WWW
« Reply #25 on: July 15, 2007, 10:42:30 PM »

If you can blow chunks out of it in the game ...

Well then, don't hint at "blowing chunks" out of anything, whilst supporting the idea of RLE.  Tongue

Wuh? Wuh? Wuh?

I don't see why we're arguing here. We're talking about saving to disc versus storage in real time. In real time the data sits in an array, you blow holes out of it or otherwise deform it. As you blow holes out of it, it gets more complex to a point, but probably not massively.

THEN you need to save it to disc. At that point you write it out in an RLE format. It takes up a few K at most and it's forgotten about.

THEN you need to load it, you read it in, in RLE format and blow it up to the usual full-sized array.

It's only ever in RLE format when it's on the disc, which is when it's not being updated. How is this a problem?
Logged

Me, David Williamson and Mark Foster do an Indie Games podcast. Give it a listen. And then I'll send you an apology.
http://pigignorant.com/
handCraftedRadio
The Ultimate Samurai
Level 10
*


curt kling

handCraftedRadio
View Profile WWW Email
« Reply #26 on: July 16, 2007, 05:28:27 AM »

Thanks a lot PoV! Grin Your method works perfectly. Down to 96.0kb. Problem solved. Thanks for helping me out, everyone.


Logged

PoV
Level 5
*****


Click on the eye and something might happen..maybe


View Profile WWW
« Reply #27 on: July 16, 2007, 11:02:37 AM »

I don't see why we're arguing here. We're talking about saving to disc versus storage in real time. In real time the data sits in an array, you blow holes out of it or otherwise deform it. As you blow holes out of it, it gets more complex to a point, but probably not massively.

Wow!  Oh noes!  I am wrong!   Embarrassed

For some reason I neglected the whole idea that this is static on disk, and not some memory efficient system too.  Sorry Graham.  I'm completely at fault for this stupid debate.

Can we still be friends??  :D :D
Logged

Mike Kasprzak | Sykhronics Entertainment - Smiles (HD), PuffBOMB, towlr, Ludum Dare - Blog
DrDerekDoctors
THE ARSEHAMMER
Level 7
******


70463070 graham@duketastrophy.demon.co.uk drderekdoctors
View Profile WWW
« Reply #28 on: July 16, 2007, 12:09:12 PM »

Aaah, I assumed you had to be reading it wrong but I was trying not to sound utterly patronising. But if you'd persisted in your vein the next step was writing some code with offensive comments in it. Wink

:D We're still bestest chums. Smiley
Logged

Me, David Williamson and Mark Foster do an Indie Games podcast. Give it a listen. And then I'll send you an apology.
http://pigignorant.com/
Chris Whitman
Sepia Toned
Level 10
*****


A master of karate and friendship for everyone.


View Profile Email
« Reply #29 on: July 16, 2007, 12:11:06 PM »

Not to flog a dead horse or nothing, but I would still really recommend using run length encoding. 96K per level is pretty wasteful for a two colour level, especially if people are going to be downloading it. I guarantee you could cut that down by a factor of ten, at least.
Logged

Formerly "I Like Cake."
Pages: 1 [2] 3 4
Print
Jump to:  

Theme orange-lt created by panic