He mistook nothing. The point is that the entire purpose of the algorithm is to eliminate, or at least minimize, entropy. You can't reduce the entropy of a file without removing information from it. What you are actually doing is shuffling all of the file's entropic content into a different representation, which may be larger or smaller than the original file.
Let me see if I understand: this algorithm takes blocks of 64 bytes, sorts the bytes into groups of equal value (presumably from smallest to largest), then stores the groups RLE-compressed (somewhere between 2 and 80 bytes per block), along with the data needed to undo the shuffle (296 bits, or 37 bytes).
This algorithm will obviously achieve no compression if there are no repeated bytes in the block. Furthermore, since the position information alone takes 37 bytes, the scheme has to beat regular RLE by at least that much just to break even.
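To sanity-check that break-even point, here is a minimal Python sketch of the scheme as I read it: sort the 64-byte block, RLE the sorted bytes, and add the 37 bytes of position data needed to undo the sort. The two-byte (count, value) run coding is my own stand-in, not necessarily whatever produces the 2-to-80-byte range quoted above.

    BLOCK_SIZE = 64          # block size described above
    POSITION_BYTES = 37      # 296 bits of shuffle/position data

    def rle(data: bytes) -> bytes:
        """Naive RLE: one (count, value) pair per run, run length capped at 255."""
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes([run, data[i]])
            i += run
        return bytes(out)

    def scheme_size(block: bytes) -> int:
        """Sorted-then-RLE size plus the position data needed to restore byte order."""
        return len(rle(bytes(sorted(block)))) + POSITION_BYTES

    no_repeats = bytes(range(BLOCK_SIZE))        # 64 distinct byte values
    repetitive = bytes([0] * 32 + [255] * 32)    # only two distinct byte values

    print(scheme_size(no_repeats), len(rle(no_repeats)))   # 165 vs. 128: worse than plain RLE
    print(scheme_size(repetitive), len(rle(repetitive)))   # 41 vs. 4: still worse here

Even the block with only two distinct values loses to plain RLE here, because its runs are already contiguous; the 37 bytes of position data only pay off when sorting collapses many short runs into a few long ones.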
The only case I can envision where the sort pays for itself is a block containing something like:
ABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABAB
Yes, that stores horribly under RLE.
Which is why most compressors do not use RLE.
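To put numbers on "stores horribly" (a quick sketch, using the same naive two-byte-per-run coding assumed in the sketch above): plain RLE on a 64-byte alternating block emits 64 runs of length one and doubles the data, while sorting first collapses it to two runs, at the cost of the 37 bytes of position data.

    def run_count(data: bytes) -> int:
        """Number of runs a naive RLE encoder would emit for this data."""
        return 1 + sum(1 for a, b in zip(data, data[1:]) if a != b)

    block = b"AB" * 32                          # a 64-byte alternating block like the one above

    plain_rle = 2 * run_count(block)            # 64 runs -> 128 bytes: twice the input size
    sorted_scheme = 2 * run_count(bytes(sorted(block))) + 37   # 2 runs -> 4 bytes, plus 37 bytes of positions

    print(plain_rle, sorted_scheme)             # 128 vs. 41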