TIGSource Forums › Developer › Technical (Moderator: ThemsAllTook) › Compression by means of a binary diff to a common file
Author Topic: Compression by means of a binary diff to a common file  (Read 929 times)
Dataflashsabot
Level 2
« on: June 04, 2010, 02:04:36 AM »

This idea popped into my head while reading one of the many threads about compression. How about distributing a 4 MB file as a diff against D3DX9_40.dll? Or: think of a deterministic algorithm that produces the same large file every time, distribute a small program that generates that file, and make the installer a diff against it (applied by another small EXE). I know almost nothing about compression, so let me know if I'm being stupid...
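Mechanically, the scheme could be sketched like this (all names here are made up for illustration, and byte-wise XOR stands in for a real binary diff tool such as bsdiff; it also assumes base and target are the same length):

```python
import random

def common_file(seed: int, size: int) -> bytes:
    # deterministically regenerate the same "large file" on every machine
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))

def make_patch(base: bytes, target: bytes) -> bytes:
    # byte-wise XOR stands in for a real binary diff; only this is shipped
    return bytes(b ^ t for b, t in zip(base, target))

def apply_patch(base: bytes, patch: bytes) -> bytes:
    # the small installer EXE regenerates `base` and applies the patch
    return bytes(b ^ p for b, p in zip(base, patch))

base = common_file(seed=42, size=1024)   # recreated at install time
target = bytes(range(256)) * 4           # the 1 KB file we actually want
patch = make_patch(base, target)

assert apply_patch(base, patch) == target
```

The round trip works; the open question (answered below in the thread) is whether `patch` is any smaller or more compressible than `target` itself.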
Pineapple
Level 10
« Reply #1 on: June 04, 2010, 02:46:55 AM »

Personally, I think it's possible if the creation program could be made complex enough (it would probably be wise to incorporate a neural network in there somewhere, just to be safe), but it would take a very long time, brute force or otherwise.

If you want, I can lend you the code for the neural network I wrote a while back.
slembcke
Level 3
« Reply #2 on: June 04, 2010, 06:06:06 AM »

No, because the diff is almost certainly going to have the same amount of entropy (i.e. randomness) as the original file. The only reason the diff would be more compressible is if the files are very similar to begin with. So if you had two similar files, diffing one against the other and then compressing the diff would probably work much better than just compressing the file. Most video compression algorithms use some sort of diff to compress each frame; this works because each frame is usually very similar to the one before it. If you diffed a frame against random data, you would just get random noise, and it wouldn't compress well at all.
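This point is easy to check in a few lines, with zlib standing in for a general-purpose compressor and byte-wise XOR as a crude diff:

```python
import os
import zlib

def xor_diff(a: bytes, b: bytes) -> bytes:
    # byte-wise "diff": zero wherever the two inputs agree
    return bytes(x ^ y for x, y in zip(a, b))

common = os.urandom(64 * 1024)            # stand-in for a shared file (a DLL, say)
similar = bytearray(common)
for i in range(0, len(similar), 4096):    # flip a handful of bytes
    similar[i] ^= 0xFF
similar = bytes(similar)
unrelated = os.urandom(64 * 1024)         # random data, shares nothing

direct = len(zlib.compress(similar, 9))
vs_similar = len(zlib.compress(xor_diff(similar, common), 9))
vs_unrelated = len(zlib.compress(xor_diff(similar, unrelated), 9))

print(direct, vs_similar, vs_unrelated)
```

The diff against the near-identical file is mostly zeros and compresses to almost nothing, while the diff against unrelated data is just as incompressible as the file itself.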

It's also worth pointing out that the best diffing algorithm depends on the type of data. A text diff would work terribly for images, and an image diff wouldn't work well for text. A lot of data compression comes down to rearranging the data so that it can be better compressed by a general-purpose algorithm. With lossy compression (video, sound, images, etc.) it's also important to understand how to rearrange the data so that you can discard the least important parts.
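A tiny illustration of the "rearrange first" idea, in the spirit of PNG's delta filters (the "scanline" here is synthetic):

```python
import zlib

# a smooth gradient with small wiggles, like one row of a photograph
row = bytes((i // 16 + (i * i) % 3) % 256 for i in range(4096))

# PNG-style trick: store byte-to-byte differences instead of raw values
deltas = bytes([row[0]]) + bytes(
    (row[i] - row[i - 1]) % 256 for i in range(1, len(row))
)

raw = len(zlib.compress(row, 9))
filtered = len(zlib.compress(deltas, 9))
print(raw, filtered)
```

The raw bytes sweep through the whole 0–255 range and barely compress, but the deltas are drawn from a tiny repeating alphabet, so the same compressor does far better after the rearrangement.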

Scott - Howling Moon Software Chipmunk Physics Library - A fast and lightweight 2D physics engine.