From time to time, we hear that users want to compare files as binary using Kaleidoscope.
Why compare files as binary?
There are a number of valid reasons why you might want to compare files as binary data. The most obvious case is for file types that are binary by nature, such as executables. For an executable, there is no easy way to “show” the contents of the file to a human, as executable files are typically read by computers, not humans. That’s why Kaleidoscope doesn’t open executable files.
As a developer, you might want to check aspects of an executable down to the bit-level detail. So you need a way to look at those bits and even check two sets of bits for differences.
Then there’s another use case, more closely related to the core functionality of Kaleidoscope: files that look identical in Kaleidoscope might not be identical on disk. Consider two very simple text files Mac.txt and Win.txt. When comparing them in Kaleidoscope, their text content is the same, with no change to spot:
But when comparing the raw binary data of those files, something curious surfaces:
One byte is different. Why is that? Of course, this is an example that’s specifically crafted to make a point. But there’s a long and complex history behind that little difference.
I saved the two files in different encodings: Mac.txt in Western (Mac OS Roman), Win.txt in Western (Windows Latin 1). In the first case, the character ß is located in position A7 of the code page. In the latter, it’s in DF.
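To reproduce that single differing byte outside Kaleidoscope, here is a minimal sketch. The sample word "Straße" and the temporary directory are assumptions made for illustration, not the contents of the original files. The byte values are the ones ß has in each code page: A7 in Mac OS Roman and DF in Windows Latin 1.

```shell
# Assumed sample text "Straße"; the real Mac.txt and Win.txt may contain something else.
dir=$(mktemp -d)
printf 'Stra\247e\n' > "$dir/Mac.txt"   # ß stored as 0xA7 (octal 247), Mac OS Roman
printf 'Stra\337e\n' > "$dir/Win.txt"   # ß stored as 0xDF (octal 337), Windows Latin 1

# cmp -l prints one line per differing byte:
# the 1-based offset, then both byte values in octal.
# cmp exits with status 1 when the files differ, hence the "|| true".
difference=$(cmp -l "$dir/Mac.txt" "$dir/Win.txt" || true)
echo "$difference"
```

With these two files, `cmp` should report byte 5 with the octal values 247 and 337, which are A7 and DF in hexadecimal.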
If that all sounds strange and unfamiliar to you, and you have never heard the term code page, enjoy the fact that you are living in 2023! Typically, we no longer have to deal with gory details of data storage for text. Historically, when storage space was highly constrained, this used to be a major nightmare in computing. Getting a simple plain text file from a Windows computer to a Mac typically meant you needed to know such details and convert the format along the way, or you would just receive garbage on the other end. If you are interested in learning more about text and encodings, watch this terrific talk titled There’s No Such Thing As Plain Text.
Kaleidoscope focuses on content, so it doesn’t bother users with details like differing text encodings. But if you need to investigate details, now you can. Read on…
What is “binary”?
Files on disk are always in a binary format. They are composed of bits that have two states: 0 and 1. In fact, all computers (most of them, anyway) store all data as a sequence of zeros and ones. Computers can’t do anything other than read, write, and compare sequences of zeros and ones, which are typically grouped into sets of 8, each called a byte.
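To make the grouping into bytes concrete, the following sketch spells out the eight bits of a single byte using only shell arithmetic. The value 223 (DF in hexadecimal) is just an illustrative choice.

```shell
byte=223   # one byte, 0xDF in hexadecimal (illustrative value)
bits=""
i=7
# Walk from the most significant bit (7) down to the least significant bit (0).
while [ "$i" -ge 0 ]; do
  bits="$bits$(( (byte >> i) & 1 ))"
  i=$((i - 1))
done
echo "$bits"   # 11011111
```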
Your iPhone only deals with bits all day long, but an awful lot of bits in a very short amount of time. When your finger taps the shutter button in the camera app, not only does iOS deal with millions of bits to manage that finger tap on the display and execute an action in response, it also writes millions of zeros and ones to its Flash storage, in the format of a HEIC image file.
A full-size ProRAW photo taken on iPhone 14 Pro is approximately 75MB in size. That’s 629,145,600 bits, just to give you an idea.
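The bit count can be checked with a line of shell arithmetic. This assumes 1 MB means 1024 × 1024 bytes, which is the convention that yields the figure quoted above.

```shell
megabytes=75
bytes=$((megabytes * 1024 * 1024))   # 78643200 bytes, assuming 1 MB = 1024 * 1024 bytes
bits=$((bytes * 8))                  # 8 bits per byte
echo "$bits"                         # 629145600
```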
For many file formats that Kaleidoscope deals with, such as text and image files, well-known methods exist that translate that binary data into text or images. Most of the time, you only care about the difference in content (text or images) when using Kaleidoscope. When you drop two files onto Kaleidoscope, the app uses operating system libraries to read the files and decode their content into text or image data and shows a comparison of that. The entire binary layer has been abstracted out of sight in modern computers.
Kaleidoscope and comparing binary data
Using Shortcuts (and the Finder)
As we have done before on several occasions, we’re going to employ macOS Shortcuts to create a solution for comparing binary data of files in Kaleidoscope, without the need to change Kaleidoscope itself.
Shortcuts is an automation technology on macOS (and other Apple platforms), to be used with an app by the same name. To clarify the naming mess, a thing you make in Shortcuts is called a shortcut (small s), which itself consists of actions (or shortcut actions). Kaleidoscope supplies several actions to Shortcuts. This means that data coming out of other actions can be sent into Kaleidoscope for doing comparisons.
The beauty is that these individual shortcuts can be used directly from the Finder, or the menu bar, to act on a selection of files.
In 2018, Till Toenshoff created a workflow based on Automator to compare binary data using Kaleidoscope, and posted it to GitHub. Our solution is inspired by his work.
The only real processing that our shortcut needs to do is run each file through a tool called xxd, which transforms the binary data into readable hexadecimal characters.
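For a feel of what xxd produces, here is a small sketch you can run in Terminal. The input string "Hello" is an arbitrary example. The actual shortcut may call xxd with different options.

```shell
# Default layout: offset, hex bytes in groups of two, then a printable-character column.
printf 'Hello\n' | xxd

# -p prints only the hex digits, without offsets or the character column.
plain=$(printf 'Hello\n' | xxd -p)
echo "$plain"   # 48656c6c6f0a
```

Each pair of hex digits stands for one byte: 48 is H, 65 is e, and the trailing 0a is the newline.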
Here’s a screenshot of the completed shortcut:
Installing the Shortcut
Before installing the shortcut, ensure you have a recent version of Kaleidoscope on your Mac.
Due to security and privacy concerns, getting a shortcut installed is a bit of work.
Get the Shortcut
First, you need to download the Shortcut into your Shortcuts app.
Click the Add Shortcut button.
Allow Running Scripts
Open the Shortcuts settings, select Advanced in the toolbar, and ensure that Allow Running Scripts is enabled.
Set as Quick Action
Find the Shortcut that you added above and double-click it. Click the ⓘ button in the toolbar, then ensure Use as Quick Action and Finder are enabled.
Click the ▶︎ button in the toolbar, select a file when prompted, and choose Always Allow in any permission prompt that follows.
Follow the instructions in the sidebar to download and install the shortcut, in case you are not yet familiar with the process. Once set up, you can select a number of files of any kind in Finder, bring up the contextual menu, and select Quick Actions > Compare Binary Files.
Doing so will process those files using the shortcut and compare their binary representations in Kaleidoscope. This can be helpful in the circumstances described above. Note that you are not limited to two files. Use the File Shelf in Kaleidoscope to compare various combinations.
Using the command line
Some Kaleidoscope users prefer dealing with the command line. Almost the same functionality that took us several setup steps in Shortcuts can be achieved using an elegant single-line command in Terminal:
ksdiff <(xxd A.txt) <(xxd B.txt)
Refer to the xxd man page to fine-tune the layout to your needs. 🙂
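As one possible way to fine-tune the layout, the sketch below wraps the idea in a small function. The name hexcompare and the temporary .hex files are hypothetical. The xxd options -c 8 (eight bytes per line) and -g 1 (one byte per group) are just one choice of layout. Writing the dumps to files instead of using <( … ) is an assumption made so that the function also works in shells without process substitution.

```shell
# Hypothetical helper: dump two files as hex and open the dumps in Kaleidoscope.
hexcompare() {
  dir=$(mktemp -d) || return 1
  # -c 8: eight bytes per line; -g 1: one byte per group.
  xxd -c 8 -g 1 "$1" > "$dir/A.hex" || return 1
  xxd -c 8 -g 1 "$2" > "$dir/B.hex" || return 1
  # Hand both hex dumps to Kaleidoscope's command-line tool.
  ksdiff "$dir/A.hex" "$dir/B.hex"
}
```

Once defined, it would be called as hexcompare A.txt B.txt. Shorter lines mean that a single changed byte marks a smaller line as different, which can make the changed spot easier to find.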
Do you need to compare files that can’t be read by Kaleidoscope? Maybe there’s a way to automate the process in a tool like Shortcuts. Let us know. We are always interested to learn from our users…