May 24
2025
I will preface this post by saying that none of the ideas here are new and it's all well-covered ground by now. Like much of this blog, I am mostly writing to clarify ideas and to serve as a reference for future me.
Entropy and Prediction
Entropy, as considered in information theory, can be thought of as a measure of how many possible values each symbol can take on, weighted by how likely each value is. A stream of perfectly random bits has maximal entropy because each symbol is equally likely to be a 1 or a 0. If, however, we impose rules on such a binary stream, e.g. after three 1s there must be a 0, then after a run of three 1s the next symbol is fully determined, and so the entropy of such a stream is reduced. This change in entropy can be considered information.
Viewed through this lens, entropy and our ability to predict the next symbol, based on what has come before, are related concepts. If we can perfectly predict the next symbol every time, then the entropy of the data stream is the entropy of the predictor, i.e. how many bits we need to make the predictions.
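As a rough sketch of the idea, the usual Shannon formula, H = sum of p * log2(1/p) over all symbols, can be estimated from symbol frequencies. The function name `empirical_entropy` is my own invention for illustration. It only looks at how often each symbol occurs, so it will not notice rules like "after three 1s comes a 0"; capturing those would require conditioning on the preceding symbols.

```python
from collections import Counter
from math import log2


def empirical_entropy(symbols):
    """Estimate entropy in bits per symbol from observed symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    # Each symbol contributes p * log2(1/p), where p is its relative frequency.
    return sum((n / total) * log2(total / n) for n in counts.values())


# Two equally likely symbols: one full bit of uncertainty per symbol.
balanced = empirical_entropy([0, 1, 0, 1])

# Only one symbol ever occurs: nothing left to learn from the next symbol.
constant = empirical_entropy([0, 0, 0, 0])
```

Here `balanced` comes out as 1.0 and `constant` as 0.0, matching the intuition that a fully predictable stream carries no information.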
Compression and Predictors
To compress a data stream is to reduce the number of bits required to represent it towards the limit dictated by the entropy of the data stream. This can be done by looking for frequently occurring symbols and giving those symbols shorter codes, as is typically done with Huffman codes. A further refinement of this technique is to use a predictor to predict the next symbol and encode the difference between the actual symbol that occurs and the predicted symbol.
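To make the "more efficient encoding" part concrete, here is a minimal sketch that computes Huffman code lengths for a sequence. It is only an illustration: the function name `huffman_code_lengths` is hypothetical, and it returns the length in bits of each symbol's code rather than the actual bit patterns.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A lone symbol still needs one bit so that it can be written out.
        return {next(iter(counts)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth so far}).
    # The unique tiebreak stops Python from ever comparing the dicts.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one bit deeper.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


lengths = huffman_code_lengths("aaaabbc")
total_bits = sum(lengths[s] for s in "aaaabbc")
```

For "aaaabbc" the most common symbol gets a 1-bit code and the other two get 2-bit codes, for 10 bits in total versus 14 bits with a fixed 2-bit code.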
To see why this is an improvement, consider a perfect predictor: the only symbol we would ever need to encode is 0, which makes for very high compression ratios. In practice this rarely happens, but predictors do generally reduce the number of possible symbols. Consider the sequence [10, 9, 8, 5, 3, 1]. This has 6 unique symbols we need to compress. However, if we use a simple predictor that says "the next symbol is the same as the current symbol", then the differences from the prediction are [10, -1, -1, -3, -2, -2], where we assume the first prediction is 0. This sequence of differences uses only 4 unique symbols and is easier to compress.
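The "same as the current symbol" predictor can be sketched in a few lines. The function names and the `first_prediction` parameter are my own choices for illustration. The decoder runs the same predictor and adds each difference back, which is what makes the scheme lossless.

```python
def delta_encode(symbols, first_prediction=0):
    """Replace each symbol with its difference from the predicted value."""
    residuals = []
    prediction = first_prediction
    for s in symbols:
        residuals.append(s - prediction)
        prediction = s  # predictor: the next symbol equals the current one
    return residuals


def delta_decode(residuals, first_prediction=0):
    """Invert delta_encode by running the same predictor on the decoded output."""
    symbols = []
    prediction = first_prediction
    for r in residuals:
        s = prediction + r
        symbols.append(s)
        prediction = s
    return symbols


original = [10, 9, 8, 5, 3, 1]
encoded = delta_encode(original)  # [10, -1, -1, -3, -2, -2]
decoded = delta_decode(encoded)   # [10, 9, 8, 5, 3, 1]
```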
The use of predictors allows us to exploit correlation between symbols to achieve better compression than simply using more efficient coding. Predictors can be applied repeatedly, with a different predictor each time, to potentially exploit higher-order correlation. The use of predictors can be said to "decorrelate" the data, because the next symbol in the difference-from-prediction stream is harder to predict than the next symbol in the original. E.g. if we were to apply the same predictor as before to [10, -1, -1, -3, -2, -2] we get [10, -11, 0, -2, 1, 0], and now we have 5 unique symbols instead of the 4 we started with, i.e. use of the predictor was detrimental. Whereas the original data had elements that correlated with each other by a difference of 1 or 2, the difference-from-prediction data does not have this property. This also illustrates that the choice of predictor matters, as a bad predictor can make things worse by trying to force correlations where none exist.
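A small sketch of applying the same "previous symbol" predictor more than once, counting distinct symbols after each pass. The helper name `distinct_symbols_per_pass` is hypothetical, and the distinct-symbol count is only a crude stand-in for how compressible the data is.

```python
def delta_encode(symbols, first_prediction=0):
    """Difference of each symbol from the previous one (first prediction is 0)."""
    residuals = []
    prediction = first_prediction
    for s in symbols:
        residuals.append(s - prediction)
        prediction = s
    return residuals


def distinct_symbols_per_pass(symbols, passes):
    """Count distinct symbols in the input and after each predictor pass."""
    counts = [len(set(symbols))]
    current = list(symbols)
    for _ in range(passes):
        current = delta_encode(current)
        counts.append(len(set(current)))
    return counts


print(distinct_symbols_per_pass([10, 9, 8, 5, 3, 1], 2))  # [6, 4, 5]
```

The first pass helps (6 symbols down to 4), while the second pass makes things worse (4 up to 5), which is the effect described above.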
Predictors can be specified ahead of time, or "trained" on the data and transmitted along with the compressed data so the receiver can decompress it.
Jul 21
2022
Assembly4 and SubShapeBinders make it possible to assemble Parts, then use their position in the assembly to generate new Parts. However, because new Parts depend on Parts in the assembly, if you try to add them to the assembly FreeCAD 0.19 will complain about cyclic redundancy, which makes sense …
Jul 21
2022
Scenario
- There exists a policy that snapshots directory ABC
- Directory ABC now no longer exists
- You want to keep existing snapshots
- You want to stop kopia from attempting to take any more snapshots
In order to achieve the last point:
- Set the policy to manual only
- Disable inheritance from parent/global policy …
Mar 05
2022
In astrophotography the combination of a goto mount, a camera and plate-solving is a powerful one. It allows you to do all kinds of neat things, like polar alignment without having a clear view of the south, extremely accurate goto functionality, and automated capture of multiple predefined targets.
There are roughly …
Jan 02
2022
I have been using SpiderOak One since Edward Snowden recommended it in 2013, almost a decade ago. I have over 1.5T of deduplicated data spread across 6 or so devices. This year, in 2022, I will not be renewing my subscriptions.
The main issues:
- Lack of updates: the last …
Nov 24
2021
On an ASUS E203M
BunsenLabs Lithium
Installed fine but the kernel (4.9?) would immediately shut down after decrypting the root volume due to incorrect thermal readings. Was not able to disable this via the thermal.nocrt=1 boot param
AntiX 21
Live system did not have working mouse. Did not proceed to …
Jun 24
2019
T102HAAS.303 Breaks Suspend
After upgrading my Asus Transformer Mini's BIOS to T102HAAS.303 suspend was completely broken. Closing the "lid" would cause some kind of suspend that cannot be disabled in software and from which Linux 4.15.0 cannot successfully resume (actually it does resume but the …
Jun 24
2019
Installation
Create a USB drive with a single FAT32 partition then extract the installation ISO into it using 7z:
7z x /path/to/ISO -o/path/to/usb
Note that there is no space between -o and the following path.
Plug in the USB drive then turn on the …
Aug 05
2018
GPS Timing
Carrier-phase detection is supposed to yield better timing information than tracking the pseudorandom code stream. The reason for this is supposedly that the higher frequency carrier allows for more accurate measurements of the …
Mar 23
2016
Some issues I ran into when trying to move a Linux install from a 500 GB drive to a smaller 120 GB SSD:
- When duplicating the filesystem, make sure that /proc exists, otherwise when you boot the new drive the kernel will complain that /proc is missing and it can't …
Mar 23
2016
I recently acquired a used Lenovo X220 for use as a Linux laptop, and needed the following in my Openbox configuration XML to make the hardware speaker and mic mute buttons work:
<!-- Modified for X220 -->
<keybind key="XF86AudioMute">
<action name="Execute">
<command>pactl set-sink-mute 0 toggle</command>
</action>
</keybind>
<keybind …
Oct 04
2015
Introduction
I recently started playing around with software defined radio using a USB TV tuner dongle utilising the popular RTL2832U chipset. After playing around with software like CubicSDR and gqrx I was somewhat frustrated at the opaqueness of what is going on under the hood. As such I resolved to …
Nov 06
2013
Simply set the language for non-unicode programs to Chinese (PRC) and it will magically work.
N.B. You must have first installed files for East Asia Languages in Languages Tab for this to work.
Cheers,
Steve
Aug 19
2012
- Download SheepShaver
- Download "New World PPC ROM" from redundantrobot.com
- Extract the zip. This should produce newworld86.rom. Rename this file to ROM and put it in the same directory as SheepShaver.app
- At this point SheepShaver.app should run, showing you a folder with a blinking ? inside it
- Download …
Jul 21
2012
Here are some notes I made while trying to understand the opcodes in this fun article on generating small ELF binaries. Hopefully they will be of use.
00000000 B801000000 mov eax, 1
00000005 BB2A000000 mov ebx, 42
0000000A CD80 int 0x80
In the above snippet:
- B8 is the move instruction …
Feb 12
2012
Replacing the SuperDrive with a hard disk seems to be a Bad Idea at least in my 13" mid-2010 MBP: my 320GB WD Scorpio Blue took over a minute to copy 60MB. That is so 80s.
Interestingly, when I put the hard drive back to where it belongs, and put …
Mar 26
2011
The installers Ralink provides for some of their chipsets, like the RT2770, do something silly: they attempt to unload a kext in a pre-install script, and when it fails the script fails and the entire install fails.
This means on a new machine, or one that never had the kext …
Feb 25
2011
I took delivery of my Linksys WUSB600N V2 today and was very excited to get my G4 Mac Mini online. Many online sources suggest all I need to do is change a few values in an Info.plist. However, as it turns out, that wasn't enough. In order for …
Jan 27
2011
I often mock up iPhone interfaces in GIMP, then export each layer for use as backgrounds. For iOS 4 each image needs to be exported twice: once at full resolution with @2x in its filename, and once at half-resolution without the @2x. e.g. nav_bg@2x.png and nav_bg.png …
Jun 15
2010
Recently unboxed my Ben NanoNote, and I am impressed. The packaging was top-notch and classy as hell. I would not hesitate to put my name to it.
The NanoNote itself is tiny, and feels fairly solid despite the shiny appearance which I have come to associate with cheapo devices. It's a …