2025-05-24

Thoughts on entropy, compression and predictors

I will prefix this post by saying that none of the ideas here are new and it's all well-covered ground by now. Like much of this blog, I am mostly writing to clarify ideas and to serve as a reference for future me.

Entropy and Prediction

Entropy, as considered in information theory, can be thought of as a measure of how unpredictable each symbol in a stream is. A stream of perfectly random bits has maximal entropy because each symbol is equally likely to be a 1 or a 0. If however we impose rules on such a binary stream, e.g. after a run of three 1s there must be a 0, then the symbol following such a run is certain, and so the entropy of the stream is reduced. This change in entropy can be considered information.

Viewed through this lens, entropy and our ability to predict the next symbol based on what has come before are related concepts. If we can perfectly predict the next symbol every time, then the entropy of the data stream is the entropy of the predictor itself, i.e. how many bits we need to make the predictions.
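This can be made concrete with a small sketch. The function below (the name and example streams are my own) estimates first-order Shannon entropy from symbol frequencies, ignoring any correlation between symbols:

```python
from collections import Counter
from math import log2

def entropy_per_symbol(stream):
    """First-order Shannon entropy: average bits per symbol,
    estimated from the symbol frequencies in the stream."""
    counts = Counter(stream)
    total = len(stream)
    return -sum(c / total * log2(c / total) for c in counts.values())

# A fair 50/50 bit stream needs a full bit per symbol...
print(entropy_per_symbol("01" * 500))    # 1.0
# ...but a stream where 0s are three times as common needs fewer.
print(entropy_per_symbol("0001" * 250))  # ~0.811
```

A real entropy measurement would also have to account for inter-symbol correlation (conditional entropy), which is exactly what the predictors below exploit.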

Compression and Predictors

To compress a data stream is to reduce the number of bits required to represent it, towards the limit dictated by the entropy of the stream. This can be done by looking for repeated symbols and using a more efficient encoding for those symbols, as is typically done with Huffman codes. A further refinement of this technique is to use a predictor to predict the next symbol, and to encode the difference between the actual symbol that occurs and the predicted one.

To see why this is an improvement, consider a perfect predictor - the only symbol we would need to encode is 0, which makes for very high compression ratios. In practice this rarely happens, but predictors do generally reduce the number of distinct symbols. Consider the sequence [10, 9, 8, 5, 3, 1]. This has 6 unique symbols we need to compress. However, if we use a simple predictor that says "the next symbol is the same as the current symbol", then the differences are [10, -1, -1, -3, -2, -2], where we assume the first prediction is 0. This difference-from-prediction sequence only uses 4 unique symbols and is easier to compress.
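The worked example above can be sketched in a few lines of Python (the function names are mine). Note that the scheme is lossless: accumulating the residuals recovers the original sequence exactly:

```python
def encode(seq):
    """Encode as differences from a 'next equals current' predictor;
    the first prediction is assumed to be 0."""
    prev = 0
    residuals = []
    for x in seq:
        residuals.append(x - prev)
        prev = x
    return residuals

def decode(residuals):
    """Invert encode() by accumulating the residuals."""
    prev = 0
    seq = []
    for r in residuals:
        prev += r
        seq.append(prev)
    return seq

data = [10, 9, 8, 5, 3, 1]
residuals = encode(data)          # [10, -1, -1, -3, -2, -2]
assert decode(residuals) == data  # lossless round trip
print(len(set(data)), len(set(residuals)))  # 6 unique symbols down to 4
```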

The use of predictors allows us to exploit correlation between symbols to achieve better compression than more efficient coding alone. Predictors can be applied repeatedly, with a different predictor each time, to exploit multiple levels of correlation. The use of predictors can be said to "decorrelate" the data, because the next symbol in the difference-from-prediction stream is harder to predict than before. E.g. if we were to apply the same predictor as before to [10, -1, -1, -3, -2, -2] we get [10, -11, 0, -2, 1, 0], and now we have 5 unique symbols instead of the 4 we started with, i.e. the second use of the predictor was detrimental. Whereas the original data had neighbouring elements that differed by 1 or 2, the difference-from-prediction stream does not have this property. This also illustrates that the choice of predictor matters, as a bad predictor can make things worse by trying to force correlations where none exist.
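The diminishing returns of repeated application are easy to demonstrate by tracking the unique-symbol count per pass (a sketch, using a compact form of the same "next equals current" predictor):

```python
def residuals(xs):
    """Difference from a 'next equals current' predictor; first prediction is 0."""
    return [x - p for p, x in zip([0] + xs[:-1], xs)]

level0 = [10, 9, 8, 5, 3, 1]
level1 = residuals(level0)  # [10, -1, -1, -3, -2, -2]
level2 = residuals(level1)  # [10, -11, 0, -2, 1, 0]

# Unique symbol counts per pass: 6 -> 4 -> 5.
# The first pass helps; the second hurts, because level1 no longer
# has the neighbour-to-neighbour correlation the predictor assumes.
print([len(set(xs)) for xs in (level0, level1, level2)])
```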

Predictors can be specified ahead of time, or "trained" on the data and transmitted along with the compressed data so the receiver can decompress it.

Adventures in Compressed DNG

I have been working with industrial cameras and frame-grabbers that simply give me a dump of their sensor data. To make that data useful it needs to undergo demosaicing, ideally using one of the modern demosaic algorithms like RCD rather than simple interpolation or the rather outdated VNG. Unfortunately the latter two are all that OpenCV supports.

To gain access to more advanced demosaic algorithms and sophisticated image processing functions like colour calibration, micro-contrast, dehaze, etc, I encoded the raw colour filter array (CFA) data into DNG using the PiDNG library. With compression disabled, this library produces DNGs that darktable accepts. However, if I enable compression, darktable is unable to open them.

Running darktable from the command line, the following error is observed:

    71.5724 [rawspeed] (compression-test.dng) void rawspeed::AbstractDngDecompressor::decompress() const, \
    	line 250: Too many errors encountered. Giving up. First Error: virtual Buffer::size_type \
        rawspeed::LJpegDecoder::decodeScan(), line 108: Unsupported predictor mode: 6

This suggests that the predictor used is not supported. By editing the source code of PiDNG, I set the predictor to 1, which results in yet another error:

    10.7300 [rawspeed] (compression-test.dng) void rawspeed::AbstractDngDecompressor::decompress() const, \
    	line 250: Too many errors encountered. Giving up. First Error: virtual Buffer::size_type \
        rawspeed::LJpegDecoder::decodeScan(), line 141: Maximal output tile size is not a multiple of LJpeg frame size

This error happens because PiDNG uses an optimisation trick to improve the compression ratio. Consider a typical CFA pattern:

RG
GB

The raw sensor data is essentially the above repeated horizontally and vertically:

RGRGRG
GBGBGB
RGRGRG
GBGBGB
RGRGRG
GBGBGB

Lossless JPEG compresses by using a predictor based on previous pixel values, including those in the previous row. Such a predictor works better if correlated values are closer together. So instead of saving the image as-is, the DNG specification (as of 1.4.0) allows the image data to be stored as an image that is twice as wide but half as high (preserving the total number of pixels):

RGRGRGGBGBGB 
RGRGRGGBGBGB
RGRGRGGBGBGB

In this new image, correlated pixels are closer together. E.g. in the original layout the first red pixel has no immediate neighbour that is also red, but in the reshaped image it has one in the next row. The rawspeed library however doesn't support this optimisation trick (or predictors other than 1). To be fair, the trick is useless with predictor 1 anyway, since predictor 1 only looks at the previous value on the same row; it is most useful with predictors that look at values from previous rows, e.g. mode 6, which is the default mode used by PiDNG.
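In numpy terms, the row-pair packing shown above amounts to a simple reshape: row 2i and row 2i+1 of the mosaic end up side by side on one row. This is a sketch of the layout transform only, not PiDNG's actual code:

```python
import numpy as np

# Hypothetical 4x6 RGGB mosaic; the values are just row-major indices
# so the effect of the reshape is easy to see.
cfa = np.arange(24).reshape(4, 6)
h, w = cfa.shape

# Pack each pair of rows side by side: twice as wide, half as high,
# same total number of pixels.
packed = cfa.reshape(h // 2, w * 2)

assert packed.shape == (2, 12)
# Row 0 of the packed image is row 0 of the mosaic followed by row 1,
# so vertically-neighbouring same-colour pixels now share a row.
assert (packed[0] == np.concatenate([cfa[0], cfa[1]])).all()
```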

With another modification to the PiDNG library that encodes the sensor data as a JPEG image of the original dimensions, I was able to produce a losslessly compressed DNG file that darktable opens. The compression ratio isn't as good - 80% vs 75% - but hopefully darktable will gain support for predictors 2-7 and we can get better compression. In the meantime I cleaned up the changes to PiDNG and have submitted a PR, which will hopefully be accepted.

Rant

DNG v1.4.0 was released in 2012, a time when DEFLATE compression in TIFF files was already widely used, yet for some reason the DNG specification restricts its use to floating-point image data, leaving the esoteric lossless JPEG, defined 20 years earlier in 1992, as the only other lossless compression option. It was not until 2023, with DNG v1.7.0, that JPEG XL was added for integer image data. Unfortunately JPEG XL is exotic enough that darktable still doesn't support it in 2025. Why we couldn't have allowed DEFLATE for integer image data is beyond me. So much space could have been saved.

2025-04-10

Python One-liner for Local QR Code Decoding

Because KeePassXC lacks the ability to decode QR codes for TOTP setup, and not all MFA implementations show the required secret in text form, I use the following snippet to read the QR code information locally:


  python -c 'from cv2 import QRCodeDetector; from PIL.ImageGrab import grabclipboard; import numpy as np; print(QRCodeDetector().detectAndDecode(np.array(grabclipboard()))[0])'

While there are many websites that can do the same, the TOTP secret is sensitive information that shouldn't be transmitted willy-nilly to random websites.

You will need the following libraries installed: opencv-python>4, numpy, pillow. If you use uv then it is possible to have this as a nice script:

#!/usr/bin/env -S uv run --quiet --script
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "numpy",
#     "pillow",
#     "opencv-python>3",
# ]
# ///

from cv2 import QRCodeDetector
from PIL.ImageGrab import grabclipboard
import numpy as np

def main() -> None:
    im = grabclipboard()
    if im is None:
        print('Clipboard does not contain an image')
        return


    data, _, _ = QRCodeDetector().detectAndDecode(np.array(im))
    if data == '':
        # QR codes cannot encode the empty string (apparently)
        print('No QR code detected')
        return

    print(data)


if __name__ == "__main__":
    main()


Put this in your path somewhere and you can execute it like any other script; uv will handle installing dependencies etc.