Author Topic: Procedurally Generated Compression  (Read 18125 times)
muku
« Reply #40 on: June 01, 2010, 10:59:52 AM »

The algorithm works. I've been doing some mathematical analysis: it is mathematically impossible for the algorithm (using a different though still brute force method I've conceived) to be unable to compress a file over 128 bytes.

So what you're saying is that you can compress any file in the world into no more than 128 bytes, by applying the algorithm over and over. You just lost any shred of credibility you might still have had. The proof that this is impossible is very easy: it works via the pigeonhole principle, which even non-mathematicians usually don't have too much trouble grasping, and it was given at the link Overkill posted above. The logic is very simple: if you compress all, say, 129-byte files into ones of at most 128 bytes, then there simply aren't enough different 128-byte files to represent all the different 129-byte files. It's logically impossible, there is no high-brow mathematics at work, and there are no "loopholes" to be found or exploited. I'd love to hear your reasoning why this argument doesn't apply to your case, though.
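To make that counting concrete, here is a minimal illustrative sketch (Python; just the arithmetic of the argument above, nothing from the algorithm under discussion):
Code:
# Count every possible 129-byte file versus every file of 128 bytes or less.
files_of_129_bytes = 256 ** 129
files_of_at_most_128_bytes = sum(256 ** k for k in range(129))  # lengths 0..128
assert files_of_at_most_128_bytes < files_of_129_bytes
# By the pigeonhole principle, any map from the larger set into the smaller
# one must send two different 129-byte files to the same output, so no
# lossless compressor can shrink every one of them.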
Eraser
« Reply #41 on: June 01, 2010, 11:06:39 AM »

Muku, I think he meant that he can compress any file over 128 bytes, at least by a portion -- he's not compressing all files to 128 bytes, that would be foolish to believe.
increpare
« Reply #42 on: June 01, 2010, 11:09:04 AM »

Muku, I think he meant that he can compress any file over 128 bytes, at least by a portion --
The point is that, by his claim, you can apply this compression repeatedly to any finite portion of data until you have it at 128 bytes or under.

Actually, muku's point is that it's mathematically impossible to have an algorithm that compresses all files with 129 bytes to 128 bytes (or less).

_Madk, you understand that you're very far off the course of logic and rational discourse here, right?  (Incredible claims for an algorithm that you can't get to work).  Please be careful, and please reflect on your behaviour.  I've seen a lot of people fall prey to very strange (and incorrect) beliefs in the fields of science and mathematics - it's not fun to watch, and it can completely ruin people's lives : (
« Last Edit: June 01, 2010, 11:16:31 AM by increpare »
Overkill
« Reply #43 on: June 01, 2010, 11:11:33 AM »

Muku, I think he meant that he can compress any file over 128 bytes, at least by a portion -- he's not compressing all files to 128 bytes, that would be foolish to believe.
But sadly those are the same thing. That would imply that ANY file over 128 bytes can be compressed. And then, if that compressed blob is still over 128 bytes, since we can still compress ANYTHING over 128 bytes, it is eligible for further compression. So by repeated application, we must eventually end up with a file of at most 128 bytes. This is definitely not possible without massive data loss.
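A minimal sketch of that repeated-application point (Python; the compress argument is a hypothetical stand-in for the claimed algorithm):
Code:
def compress_until_small(data, compress):
    # 'compress' stands in for the claimed algorithm, assumed to shrink
    # ANY input longer than 128 bytes by at least one byte per pass.
    passes = 0
    while len(data) > 128:
        data = compress(data)  # each pass strictly reduces the length
        passes += 1
    # Note: the pass count is extra information the decompressor needs,
    # so it is really part of the output size as well.
    return data, passes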

EDIT: oh man post collision.
« Last Edit: June 01, 2010, 11:15:11 AM by Overkill »

Pineapple
« Reply #44 on: June 01, 2010, 11:14:59 AM »

@Pitt
 Corny Laugh Hand Money Right

@muku & MindEraser
Stop posting so much; every time I edit this post to include a reply to you guys, I find out you've posted again o.o;

@muku
MindEraser's pretty much on target. The thing is, the compressor's own output is included these 100% of files it is mathematically impossible to be unable to compress. Shannon's Law applies to random data - you can't reduce every file by even 1 bit. The trick is that I'm rearranging the order of bits in the file so that I can compress it; as long as more than 35 bytes (after some new refinements on the algorithm) exist in the file, they can be compressed. What this doesn't mean is that every file in the world will compress to 35 bytes, it just means that after enough passes, every file in the world should be able to be compressed to 35-69 bytes.

Don't ask. I honestly don't know why the algorithm can work so well, but I've tested it in concept and analysis and I'm rewriting the code to adhere to the adjustments I've made to the algorithm.
BorisTheBrave
« Reply #45 on: June 01, 2010, 11:19:02 AM »

Ok, this has descended to full on crank status now. These claims are truly ludicrous. I cannot decide whether my favourite part is where the OP refuses to discuss any technical aspect, except that we must take it on faith that he's right, or the part where he doesn't acknowledge his strong resemblance to classical cranks.

Hint, Madk: It is traditional at this point to cast aspersions on the founders and leading figures of the field in question for being so dense as to have not only missed this algorithm, but to have written a law which has misled thousands of amateurs into thinking breaking it is impossible. Or alternatively, you can start praising yourself as a paragon of genius for spotting something so cunning that we couldn't even comprehend its amazingosity.
muku
« Reply #46 on: June 01, 2010, 11:25:54 AM »

The thing is, the compressor's own output is included these 100% of files it is mathematically impossible to be unable to compress.

Sorry, couldn't parse that sentence?

Quote
Shannon's Law applies to random data - you can't reduce every file by even 1 bit.

Please reread my post; I wasn't talking about Shannon's coding theorem (which BTW is different from Shannon's law). I was talking about basic logical implications, no coding theory required.

Quote
What this doesn't mean is that every file in the world will compress to 35 bytes, it just means that after enough passes, every file in the world should be able to be compressed to 35-69 bytes.

Please explain the difference between these two statements? You're still claiming that every file in the world can be compressed to at most 69 bytes. Again: not possible, and very obviously so.

But, sigh. It starts to feel like talking to a wall. You're either lying to us or to yourself, I can't tell at this point. How is your decompressor coming along?
Glaiel-Gamer
« Reply #47 on: June 01, 2010, 11:33:29 AM »

You also have to store the random seed for each chunk; that's part of the file size too. You can't ignore that and say "well, not counting the seed, it can compress everything."

Rearranging the data is still PART of the compression algorithm, hence Shannon's law still applies to the original file.

But whatever, go ahead and test it out: brute-force compress, decompress, and diff every possible 256-byte file, and I guarantee you'll find many which don't compress.
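A minimal round-trip harness in that spirit (Python, illustrative; exhaustively testing 256-byte files is infeasible, so this checks every file of up to two bytes, and compress/decompress are hypothetical stand-ins):
Code:
from itertools import product

def round_trip_test(compress, decompress, max_len=2):
    # Exhaustively check all short inputs: every file must come back
    # byte-for-byte identical, and we count how many actually shrank.
    shrunk = total = 0
    for length in range(max_len + 1):
        for values in product(range(256), repeat=length):
            original = bytes(values)
            packed = compress(original)
            assert decompress(packed) == original, "lossy!"
            total += 1
            if len(packed) < len(original):
                shrunk += 1
    print(shrunk, "of", total, "files got smaller")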
Overkill
« Reply #48 on: June 01, 2010, 12:07:50 PM »

MindEraser's pretty much on target. The thing is, the compressor's own output is included these 100% of files it is mathematically impossible to be unable to compress. Shannon's Law applies to random data - you can't reduce every file by even 1 bit. The trick is that I'm rearranging the order of bits in the file so that I can compress it; as long as more than 35 bytes (after some new refinements on the algorithm) exist in the file, they can be compressed. What this doesn't mean is that every file in the world will compress to 35 bytes, it just means that after enough passes, every file in the world should be able to be compressed to 35-69 bytes.

Don't ask. I honestly don't know why the algorithm can work so well, but I've tested it in concept and analysis and I'm rewriting the code to adhere to the adjustments I've made to the algorithm.

Entirely rubbish. You realize a lossless compression method is required to be invertible, yes? And a function is left-invertible (i.e. a compression is lossless) if and only if it is an injection.

But for a function to be an injection, each input value must map to a unique output. Similarly, for any possible uncompressed input string, there must be a unique output string if a compression algorithm is lossless. (If a function maps two values to the same output, it fails to be an injection.)

If you claim that any input of length n can be output in at most m bits (or M = m/8 bytes), with n > m, then by the pigeonhole principle, at least one file of length n will map to the same output string as another file (there are 2^n pigeons and 2^m holes, and n > m, so 2^n > 2^m -- the pigeonhole principle applies). Therefore, any compression algorithm that makes this claim is not an injection, and thus fails to be left-invertible, and as a result cannot be decompressed in a lossless manner.
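To see the injection failure concretely, a minimal collision finder (Python, illustrative; the lambda at the bottom is just a hypothetical stand-in for any claimed compressor mapping 2-byte inputs to at most 1 byte):
Code:
from itertools import product

def find_collision(compress):
    # There are 256**2 = 65536 two-byte inputs but only 257 possible outputs
    # of at most one byte (the empty string plus 256 single bytes), so the
    # pigeonhole principle guarantees this returns before the loop finishes.
    seen = {}
    for pair in product(range(256), repeat=2):
        data = bytes(pair)
        out = compress(data)
        if out in seen:
            return seen[out], data  # two distinct inputs, same output
        seen[out] = data

print(find_collision(lambda data: data[:1]))  # hypothetical stand-in compressor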

EDIT: minor correction.
EDIT 2: modified after increpare pointed out a few things.
« Last Edit: June 01, 2010, 01:33:11 PM by Overkill »

Pineapple
« Reply #49 on: June 01, 2010, 12:20:14 PM »

@muku
I'm rewriting the code after I've refined the algorithm. I never did solve the bugs; as a matter of fact, I ended up with more.  Lips Sealed

@Boris
I had the full details earlier, but after receiving nothing but criticism and coming closer to producing the finished program, I removed them from my posts.

@all
Thank you all for the feedback! Maybe I'll fail, I've said several times I haven't ruled out the possibility. However, the algorithm is painfully simple and there is absolutely no reason, if you dare think outside the bounds of these compression laws, why it wouldn't work.

Here's a brief explanation of why.

RLE compression is very simple. Instead of representing each value normally, pieces of the file are represented by each value and the quantity of that same value following. For example, "HELLO" would translate into something like "1H 1E 2L 1O". In this case, the file size would actually increase; that's why the compressor only works with chunks of data where it is mathematically impossible to become larger than the original. Saywhat? Suppose we had 4 possible values - the compressor only works with 16 (4 bits at a time) - 0, 1, 2, and 3. If the string of data is greater than 8 numbers long, it will be compressible, given that similar numbers are grouped together.

The reason RLE compression is so inapplicable is that this almost never happens. Our data looks more like "1231012030213230321" than "000011111222223333". The first, when RLE'd, is much larger than the original. The second, however, is smaller. That's why the algorithm reorganizes the data. Before, I was relying on a very brute-force, chance-based method (I've come to a better method, though) to just randomly mix everything up.

But basically, what happens is each chunk, after being organized into that "0000111112222233333", is RLE compressed, along with the 3 bytes of data necessary to return the uncompressed "0000111112222233333" to the original "1231012030213230321". There's no mystery about it. What happens is the file is broken down into many chunks of a manageable size, then compressed. After that, you compress that output, and continue to make many consecutive passes.
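For reference, a minimal sketch of plain run-length encoding as described above (Python, illustrative; byte-level rather than 4-bit chunks, and without the reordering step):
Code:
def rle_encode(data):
    # Collapse runs of equal bytes into (count, value) pairs.
    runs = []
    for b in data:
        if runs and runs[-1][1] == b:
            runs[-1] = (runs[-1][0] + 1, b)
        else:
            runs.append((1, b))
    return runs

def rle_decode(runs):
    return b"".join(bytes([value]) * count for count, value in runs)

print(rle_encode(b"HELLO"))                     # 4 runs for 5 bytes: H, E, LL, O
print(len(rle_encode(b"0000111112222233333")))  # sorted-looking data: 4 runs
print(len(rle_encode(b"1231012030213230321")))  # mixed data: 19 runs, i.e. bigger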


I'll post again when I've finished the program.



If anybody steals my process in the slightest degree, I will kill you all. Tongue


« Last Edit: June 01, 2010, 12:27:36 PM by _Madk »
Christian Knudsen
« Reply #50 on: June 01, 2010, 12:28:38 PM »

Uhm, why are you now reposting the same explanations that you've just <snipped> from your previous posts? Are you no longer afraid that people will steal it? Lips Sealed

EDIT:

Maybe I'll fail, I've said several times I haven't ruled out the possibility.

Going over your posts in this thread, I can't find a single example of this...
Zaratustra
« Reply #51 on: June 01, 2010, 12:32:24 PM »

What this doesn't mean is that every file in the world will compress to 35 bytes, it just means that after enough passes, every file in the world should be able to be compressed to 35-69 bytes.



Uh, information about each of these "passes" is stored where?

increpare
« Reply #52 on: June 01, 2010, 12:46:45 PM »

Entirely rubbish. You realize a lossless compression method is required to be invertible, yes? And a function is invertible (ie. a compression is lossless) if and only if it is a bijection.
No dude.  A function has an inverse if it's injective (it might have many). 
muku
« Reply #53 on: June 01, 2010, 12:55:59 PM »

Uh, information about each of these "passes" is stored where?

I've just come up with an algorithm which compresses every file in the world to just a single byte: Let's treat every file as a single number; just read all of its bytes as a sequence of hex digits, and you get a potentially huge number. One pass of the compression now simply consists of subtracting 1 from this number. Now, repeating this algorithm a lot of times, you'll eventually reach 0, which you can store in a single byte. One pass of the decompressor consists of adding 1 to the file, and to decompress the original file, just run this decompressor as often as you ran the compressor before!
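Spelled out as a sketch (Python, illustrative, with a deliberately tiny "file" so the loop finishes quickly):
Code:
# Treat the file as one big number and "compress" by counting it down to zero.
file_as_number = int.from_bytes(b"hi", "big")  # a deliberately tiny file

n = file_as_number
passes = 0
while n > 0:
    n -= 1          # one "compression" pass
    passes += 1
# The "compressed" file is now a single zero byte. The catch: to decompress
# you must remember 'passes', which is exactly the original number -- the
# information has merely been moved, not removed.
assert passes == file_as_number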


@increpare: I believe that's a matter of definition; depending on whether you want left- and/or right-inverses, I think?
Overkill
« Reply #54 on: June 01, 2010, 12:57:28 PM »

Entirely rubbish. You realize a lossless compression method is required to be invertible, yes? And a function is invertible (ie. a compression is lossless) if and only if it is a bijection.
No dude.  A function has an inverse if it's injective (it might have many).  

Argh you're right. What I meant to say was a function f: X -> Y has an inverse function f': Y -> X (not partial inverse) if and only if f is a bijection. IIRC, this means we can construct f'': X -> Y, which is the inverse of f', and is equal to f.
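For reference, a brief summary of the standard facts in play here, assuming the usual definitions (LaTeX notation):
Code:
\text{For } f : X \to Y \text{ with } X \neq \emptyset:
f \text{ is injective} \iff \exists\, g : Y \to X \text{ with } g \circ f = \mathrm{id}_X
  \quad\text{(a left inverse; not necessarily unique).}
f \text{ is bijective} \iff \exists\, f^{-1} : Y \to X \text{ with }
  f^{-1} \circ f = \mathrm{id}_X \text{ and } f \circ f^{-1} = \mathrm{id}_Y
  \quad\text{(a unique two-sided inverse).}
Lossless compression only needs the first: the compressor must be injective so that a (left-inverse) decompressor exists.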
« Last Edit: June 01, 2010, 01:02:10 PM by Overkill »

David Pittman
« Reply #55 on: June 01, 2010, 12:59:53 PM »

If anybody steals my process in the slightest degree, I will kill you all. Tongue

Counter-challenge! If anybody (including you) makes this work, I will buy him a beer. Beer!

increpare
« Reply #56 on: June 01, 2010, 01:02:09 PM »

Argh you're right. What I meant to say was a function f: X -> Y has an inverse function f': Y -> X (not partial inverse) if and only if f is a bijection.
Stick a 'unique' in there and you get a gold star.
muku
« Reply #57 on: June 01, 2010, 01:06:16 PM »

Argh you're right. What I meant to say was a function f: X -> Y has an inverse function f': Y -> X (not partial inverse) if and only if f is a bijection.
Stick a 'unique' in there and you get a gold star.

Hum. MathWorld disagrees with that terminology, and it's not what I was taught either. Probably just a case of different authors/different fields.
Pineapple
« Reply #58 on: June 01, 2010, 01:07:17 PM »

Sow the doubt, folks. It helps ward away the thieves. Ninja


@chrk
I intentionally left out some important details concerning the actual coding of the algorithm that were there earlier.

@Pitt
I will hold you to that! Tongue
BorisTheBrave
« Reply #59 on: June 01, 2010, 01:14:25 PM »

As it happens, I did not get a chance to see the snipped post. You took it down because of criticism? What you mean is, it didn't stand up to the criticism, or else you could have rebutted it.

Anyway yes, everyone here understands how RLE works. The part I questioned before, and you have still glossed over, is how do you know that there is such a seed that will re-order the string into a more compressible one?

And you are obliged not just to explain why you think your algorithm works, but why it circumvents the elementary counting proofs that it could not possibly work. Explain, not assert. Not sure you've quite grokked that part yet.
Just to reiterate that proof, tailored just for you:

There are 256^70 distinct 70-byte files.
You claim all files can be compressed to 69 bytes or less.
There are fewer than 256^70 files of 69 bytes or less (summing over every length from 0 to 69 gives (256^70 - 1)/255 of them).
Thus, as the number of possible inputs outnumbers the number of possible outputs, at least two inputs compress to one output.
Thus, the decompressor cannot possibly reproduce both of those files.


No information theory in sight in that statement (just the pigeonhole principle for the penultimate step). Surely you can deign to answer that? (also, please try to ignore the repeat application part, it is not relevant. Either you can produce a pair of programs for compression/decompression or you cannot. If it involves repeat application of the base algorithm, fine, but in the end you still just have one application of the total program).

PS It is nice that you are responding to me, thanks for the courtesy at least