Author Topic: Procedurally Generated Compression  (Read 18077 times)
Pineapple
« Reply #120 on: June 03, 2010, 02:35:58 PM »

Not to quote a Batman film, but:
Quote from: Thomas Wayne
Why do we fall, Bruce? So we can learn to pick ourselves up.

You just made my day - I love that film.


@Zarasutra
It's completely unnecessary to attribute that many bytes to the possible arrangements. As I said in a previous post, I'm just trying to figure out how many of the arrangements really do need to be available without sacrificing compression to a horrid extent. As for that ABABABAB example you provided, the algorithm is designed to transform it into a run of A's and a run of B's, or, if that exact arrangement isn't available, it will be smart enough to know that one or two misplaced values is not the end of the world. Also, since I'm handling values as 4 bits each rather than 8, it's much harder to stumble upon an uncompressible portion of data.

Also, about going over the same file many times, which Overkill pointed out could be counterproductive: any chunk where no arrangement could be found that reduced the total size will be saved in a different format, most likely a per-value shift (0 would become 2, 1 -> 3, 2 -> 4, 13 -> 15, 15 -> 1, etc.). That way, in addition to sitting at a different offset (and therefore spreading its contained data across the end of one chunk and the beginning of the next), the values themselves will change, hopefully resulting in a greater chance of significant compression.
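For what it's worth, here's a minimal sketch of what I mean by that per-value shift, assuming the offset simply wraps around modulo 16 (the function name and the sample data are just for illustration, not from the actual compressor):

Code:
SuperStrict

Framework brl.StandardIO 'Standard console IO

' Rotate each 4-bit value by a fixed offset, wrapping modulo 16,
' so with offset 2: 0 -> 2, 1 -> 3, ..., 13 -> 15, 14 -> 0, 15 -> 1.
Function shiftvalues(values:Int[], offset:Int=2)
	For Local i:Int=0 Until values.length
		values[i]=(values[i]+offset) Mod 16
	Next
End Function

Local chunk:Int[]=[0,1,2,13,15]
shiftvalues(chunk)
For Local v:Int=EachIn chunk
	Print v 'prints 2, 3, 4, 15, 1
Next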



Also, I will be mindful to begin developing the decompressor alongside the compressor. I do thank you for the advice.
BorisTheBrave
« Reply #121 on: June 03, 2010, 02:50:52 PM »

I see no need for ridicule.

No, I see plenty of cause for ridicule. Not for experimenting - that is fine. Nor for ignorance of general compression knowledge - that's precisely the point of experimenting. _Madk's sin is one of arrogance: he didn't take the time to consider that he might be wrong, which shows in every action, but particularly in his failure to grasp the impossibility proof repeated at every opportunity.

But he's owning up to his mistakes, somewhat, and he's never been insulting, so that's good. I'm sure this will eventually be a learning experiment for him, though perhaps not the one he wanted (or, more likely, in addition to the one he wanted).
Rob Lach
« Reply #122 on: June 03, 2010, 07:35:44 PM »

_Madk, I don't mean to be offensive, but you seem to have fallen into the same trap that has caught many inventors of perpetual motion machines: you come across a very ingenious idea, but you either look past or refuse to believe the point where that idea fails.
gimymblert
« Reply #123 on: June 03, 2010, 08:47:51 PM »

But there are also the tales of Langley vs. the Wright brothers.
You never know who the Langley is.

The problem is also that "naysayers" assume ideal conditions, as in theory, but what is the real degree of entropy of most files? I may be wrong, and be proven wrong, and yet find something that works in most practical cases. I have seen that happen a LOT.

EDIT: I don't believe the claim that all files can be reduced to 69 bytes.
« Last Edit: June 03, 2010, 09:00:24 PM by neoshaman »

Alex May
« Reply #124 on: June 04, 2010, 01:31:42 AM »

The way I see it (in my mathematically uncultured mind), you can't do it because it's a look-up, either into an algorithm that describes the file completely or into a database. Either way, you need to have all possible files described at the decompression location before you "decompress" the file. A database is out of the question, because if you already have all the files then you don't need to decompress anything. Good luck finding an algorithm that can reproduce any given megabyte-long file yet is so trivially short that a) it can be redistributed and b) it can be computed in the first place.
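To put rough numbers on that (just my own back-of-the-envelope counting):

$$256^{1\,048\,576} \approx 10^{2\,525\,000} \ \text{distinct 1 MB files}, \qquad \sum_{k=0}^{69} 256^{k} < 256^{70} \approx 10^{169} \ \text{files of at most 69 bytes.}$$

A decompressor maps each input to exactly one output, so at most about 10^169 of those 10^2,525,000 megabyte-long files could ever come out of it; every other one simply has no 69-byte representation.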

edit: hadn't read some essential posts
« Last Edit: June 04, 2010, 01:42:41 AM by Alex May »

Pineapple
« Reply #125 on: June 04, 2010, 02:43:26 AM »

Quote from: gimymblert
EDIT: I don't believe the claim that all files can be reduced to 69 bytes.

Neither do I - that was a delusional claim, thanks to my mistake of thinking that 64 factorial was 64+63+62... rather than 64*63*62...
However, I'm still confident the compression will perform better than other algorithms; we will see. I've had a lot going on lately, so I haven't been able to work on it much.
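For the record, the actual numbers:

$$64 + 63 + \dots + 1 = \frac{64 \cdot 65}{2} = 2080, \qquad 64! \approx 1.27 \times 10^{89}, \qquad \log_2(64!) \approx 296\ \text{bits} \approx 37\ \text{bytes},$$

so an index that can name any one of the 64! arrangements of 64 distinct values already takes about 37 bytes, more than the 32 bytes the nibbles themselves occupy.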
Balrog
« Reply #126 on: June 04, 2010, 10:21:31 AM »

I've been kicking around an idea for compression. Basically, you would scan a file for long, frequently occurring patterns of bits, and also for short patterns that never occur, then replace the large patterns with the small ones and define the mapping in some kind of header at the start of the file. Something like:

00010101111111001111111 Original

Map 1111111(large occurring pattern) to 11 (smaller non-occurring pattern)

0001010110011 End result

I dunno, it seems like any file of significant size would have some large pattern that you could map to a smaller one.
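Here's the example above as a toy sketch, treating the data as a string of '0'/'1' characters purely for illustration (a real version would work on actual bits, and the mapping here is hard-coded rather than discovered by scanning):

Code:
SuperStrict

Framework brl.StandardIO 'Standard console IO

Local original:String = "00010101111111001111111"
Local compressed:String = original.Replace("1111111", "11") 'map the long pattern to the short one

Print original + " (" + String(original.length) + " bits)"
Print compressed + " (" + String(compressed.length) + " bits)"

' The catch: to reverse the mapping, the short code must never occur in the
' leftover data (or the header has to escape it somehow).

It works out for this particular string, but telling the mapping codes apart from ordinary data in general is the hard part.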
Chromanoid
« Reply #127 on: June 04, 2010, 10:51:53 AM »

The problem with these approaches is how to separate the "mapping keys" from the normal data.
Tycho Brahe
« Reply #128 on: June 04, 2010, 11:57:56 AM »

This seems a lot like RLE, with the added benefit that you can apply it to patterns as well as blocks of identical bits.
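A quick sketch of the plain run-length case for comparison (my own toy example, nothing to do with _Madk's code):

Code:
SuperStrict

Framework brl.StandardIO 'Standard console IO

' Classic run-length encoding: each run of identical characters becomes
' its length followed by the character itself.
Function runlengthencode:String(data:String)
	Local out:String = ""
	Local i:Int = 0
	While i < data.length
		Local runlen:Int = 1
		While i + runlen < data.length
			If data[i + runlen] <> data[i] Then Exit
			runlen :+ 1
		Wend
		out :+ String(runlen) + data[i..i+1]
		i :+ runlen
	Wend
	Return out
End Function

Print runlengthencode("AAAABBBCCD") 'prints 4A3B2C1D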
shrimp
« Reply #129 on: June 04, 2010, 01:10:59 PM »

See LZW and Huffman coding
st33d
« Reply #130 on: June 04, 2010, 03:51:17 PM »

Quote from: Tycho Brahe
This seems a lot like RLE, with the added benefit that you can apply it to patterns as well as blocks of identical bits.

Oh, so like LZW?

I mean, come on, _Madk - either you have something here or you don't. If you posted a snippet, people would jump on it in a good way.

We all like computer science on this part of the board. Even if you've got it wrong, we can be grown-ups and say, well, that's an interesting problem.

You don't cite references to established compression algorithms, and that's bad. Always cite references; that's how you pass a degree.

Put some code on the table that will run. We'll solve the problem together. Either it will be wrong, or we'll all go, "shit, that _Madk kid had a point." It doesn't matter that you solved it; it matters that WE solved it, that WE then knew you were right, and we'll support you.

No code. No dice.
Pineapple
« Reply #131 on: June 04, 2010, 04:21:12 PM »

Alright.

Here's what I've got down in code at the moment, it's written in BlitzMax.

Code:
SuperStrict

Framework brl.StandardIO 'Standard console IO
Import brl.FileSystem
Import brl.Stream



Global chunksize%=32  '# of values to store in each chunk. Lower values are faster, but higher values (up to 256) result in more efficient compression. This does not consider the availability of different arrangements.

Global persist:Byte=0 '0 is quick, 1 is regular, and 2 is dedicated for the persistence setting.

    'PERSISTENCE:

' QUICK: would search only 1/32 of all available seeds, would use the first detected to result in a size any smaller,
' and would leave the data uncompressed if none of the seeds tested resulted in a size reduction.

' REGULAR: would search exactly 1/4 of all available seeds and use the most efficient of those tested, or leave
' uncompressed if no seeds succeeded.

' DEDICATED: would not stop searching until it found a suitable seed or exhausted all possibilites, in which case it
' would leave the data uncompressed.



Global maxtries!
'If persist=




Global combos!=1 'number of possible ways to arrange the values in a chunk (chunksize factorial). Starts at 1 and uses a Double, since 32! overflows any integer type.
For Local calc_combos%=1 To chunksize 'calculate
combos:*calc_combos
Next


Global fil:TStream
Function compress(in$,out$)
fil=ReadFile(in) 'open the input file first, so a missing file is caught before anything is allocated
If Not fil error "Could not open input file"

Local a%=FileSize(in)*2 'number of 4-bit values needed to hold the file (two per byte)
Local c%=a/chunksize 'number of complete chunks in the file
Local extra_bytes%=a Mod chunksize 'leftover values at the end, too few to form a full chunk

Local mem@[a] 'original data array
Local sec@[c,chunksize] 'shuffled & organized data array
Local cmp@[c] 'per-chunk workspace used when seeking to rearrange each chunk

'put file data into mem array
Local mem_array_index%=0 'index into mem; advances two values per byte read
Repeat
Local b@=ReadByte(fil) 'read the byte from the file stream
mem[mem_array_index+0]=b Mod 16 'first memory address is the right 4 bits. ( [.][.][.][.][!][!][!][!] )
mem[mem_array_index+1]=(b Shr 4) Mod 16 'second memory address is the left 4 bits. ( [!][!][!][!][.][.][.][.] )
mem_array_index:+2
Until Eof(fil)
CloseFile fil

'organize mem array data into second (sec) array
For Local into%=0 To c*chunksize-1 'only complete chunks; the extra_bytes leftover values are not chunked here
sec[into/chunksize,(into Mod chunksize)]=mem[into]
Next

'find smallest compression possibility for each chunk
Local newloc@@[chunksize] 'where each value would most desirably be moved to

'

fil=WriteFile(out) 'write the output file
If Not fil error "Could not create output file"

End Function



Function expand(in$,out$)

End Function


Function error(txt$)
If fil Then CloseFile fil
Print txt
Print "Press enter to exit.."
Input
End
End Function


Eraser
« Reply #132 on: June 04, 2010, 09:47:46 PM »

Ha, I have a newfound respect for anyone who can create a working file compression algorithm. I got really bored today and spent a few hours creating an LZW-like algorithm that indexes 8-byte segments. Unfortunately, it has no real application. Aside from the compressor exponentially eating memory and being slower than a snail, most files are not so perfect as to have lots of repeating 8-byte patterns, so it raised the file size 95% of the time (because it uses a 4-byte tag to describe how each 8-byte segment should be decompressed). Anyway, this was my first and probably last attempt at rolling my own compression, heh.

_madk, thanks for sharing your code; perhaps (if the compressor code is complete) someone can write a decompressor to see whether it works properly, if you don't get around to it.
bateleur
« Reply #133 on: June 05, 2010, 12:24:13 AM »

Quote from: Pineapple
Here's what I've got down in code at the moment, it's written in BlitzMax.

First thing I notice is that you seem to be calculating the number of combos. No need for that. It's pretty obvious which ordering will compress most effectively: just placing all equal-valued digits adjacent to one another. So all you need to do is find a way to describe the resulting permutation.

(I say "all"... actually the most efficient encoding in the general case is going to be the uncompressed data! I'll leave you to find ways around that issue, if possible!)
Tycho Brahe
« Reply #134 on: June 05, 2010, 01:15:16 AM »

Quote from: st33d
Oh, so like LZW?
I hadn't heard of that, thanks!
drChengele
« Reply #135 on: June 06, 2010, 05:26:13 AM »

What a coincidence, when I was 12 I wrote a kick-ass compression algorithm that was able to compress every file in the world into a 256-byte file.

Now if only there was a way to decompress them...  Well, hello there!
baconman
« Reply #136 on: June 06, 2010, 05:40:24 AM »

 Roll Eyes

1. Understand the multiple efficient compression algorithms that already exist.

2. Analyze file contents to determine the optimal algorithm for each particular file (images, sound, text, video, code, executables, and other file types all have different consistency patterns anyway), and group the files accordingly.

3. Make compressed file groups based on the optimal choice, and index the files and formats accordingly.

4. Bundle these groups into a single file/format for convenient transfer and storage, indexing them by compression algorithm; prepare the file for optimal decompression (i.e. add a quick switch command to change the algorithm on the fly as the main compressed file is decompressing); and finally, add a decompressing agent that responds to this.

5. Rock on.

I could be wrong, but isn't this pretty much what "maximal" RAR compression does, btw? (And why it sometimes takes so long? "Quicker" settings just try fewer compression possibilities, resulting in bigger files but quicker decompression.)