

This is not new knowledge and predates the current LLM fad.
See the Hutter Prize, where “machine learning” based compressors have led the rankings for some time: http://prize.hutter1.net/
It’s important to note that when a model is applied to compression, it does produce a code (aka encoding) from which the input can be reproduced exactly. But the same model is unlikely to compress a different input as impressively.
I could have said it better.
By compressor I mean one half of a compression/decompression algorithm. A better way to word it: when you apply machine learning to a compression problem, you can do it losslessly. The decompressed output will be identical to the input, every time.
“NNCP” is a good search term to learn more, specifically about how this works.
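
For anyone who wants to see the lossless part concretely, here is a minimal Python sketch. To be clear about the assumptions: this is not NNCP or any Hutter Prize entry. A simple adaptive byte-frequency model stands in for the neural network, and exact rational arithmetic (`fractions.Fraction`) stands in for a practical arithmetic coder. The names `AdaptiveModel`, `encode`, and `decode` are made up for illustration. The point it tries to show is that the decoder runs the same model in lockstep with the encoder, so the predictions match at every step and the input comes back bit for bit.

```python
from fractions import Fraction
from math import ceil, floor


class AdaptiveModel:
    """Order-0 adaptive byte model (hypothetical stand-in for a learned predictor)."""

    def __init__(self):
        self.counts = [1] * 256  # every byte starts with a count of 1
        self.total = 256

    def interval(self, symbol):
        """Cumulative-count range [lo, hi) assigned to `symbol`."""
        lo = sum(self.counts[:symbol])
        return lo, lo + self.counts[symbol]

    def find(self, target):
        """Symbol whose cumulative-count range contains the integer `target`."""
        cum = 0
        for symbol, count in enumerate(self.counts):
            if cum + count > target:
                return symbol, cum, cum + count
            cum += count
        raise ValueError("target outside the model's range")

    def update(self, symbol):
        """Learn from the symbol just coded; encoder and decoder both call this."""
        self.counts[symbol] += 1
        self.total += 1


def encode(data: bytes):
    """Narrow [low, low + width) once per byte, then name a point inside it."""
    model = AdaptiveModel()
    low, width = Fraction(0), Fraction(1)
    for byte in data:
        lo, hi = model.interval(byte)
        low += width * Fraction(lo, model.total)
        width *= Fraction(hi - lo, model.total)
        model.update(byte)
    # Find the shortest binary fraction k / 2**nbits inside the final interval.
    nbits = 0
    while True:
        k = ceil(low * (1 << nbits))
        if Fraction(k, 1 << nbits) < low + width:
            return k, nbits
        nbits += 1


def decode(k: int, nbits: int, length: int) -> bytes:
    """Replay the same model to recover exactly `length` bytes."""
    model = AdaptiveModel()
    value = Fraction(k, 1 << nbits)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(length):
        # Where does the coded value fall within the current interval?
        target = floor((value - low) / width * model.total)
        symbol, lo, hi = model.find(target)
        low += width * Fraction(lo, model.total)
        width *= Fraction(hi - lo, model.total)
        model.update(symbol)
        out.append(symbol)
    return bytes(out)
```

Calling `k, nbits = encode(data)` and then `decode(k, nbits, len(data))` should give back `data` exactly. How small `nbits` ends up depends on how well the model predicts the input, which is the other half of the point above: a model tuned to one kind of data will not do as well on another. The original length is passed to the decoder separately here to keep the sketch short.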