Tech: High Efficiency Image Compression - Apple vs Open Source

Lars Poulsen - 2023-05-01

Apple and Linux collide to screw us …

This is a story about the joy and the anguish of using Linux.

The joy is that thanks to the generosity of the people in the Open Source development community, it is possible to have access to a very large amount of high quality software. Where Linux was once an operating system usable only by people with a high level of interest in computer technology, it is by now no harder to install and use than Microsoft Windows.

But along with this joy come a few frustrations as well.

One of them comes from the need to keep large corporations with many lawyers from shutting down open source projects for unlicensed use of their claimed intellectual property. To that end, the publicly available distribution packages of "free" software contain only program code that has been vetted: anything covered by a patent stays out unless the rights holders have explicitly cleared it for use in this manner.

The History of Computer Picture Formats: TIFF, GIF and PNG

Computer picture formats have a long history of patent fights. The first computer pictures were just long sequences of ones and zeroes: a 1 bit where you wanted a black point (or where you wanted the screen to light up), and a 0 bit where you wanted to see the white paper (or the dark screen background). This was also what you would do for FAX (facsimile transmission - the original "electronic mail"). Someone wrote down how to package this in a file in a fairly obvious way. Someone else gradually expanded this to allow for grayscale pictures, such as black and white photographs, and then for color pictures, where each of the 3 primary colors got an 8-bit graduated level in each picture cell (pixel).

But the files were getting rather large, and took a long time to transmit over the dial-up modems used to connect to the internet at the time. For example, a picture of 4 x 6 inches at 72 dpi (dots per inch) screen resolution would be about 375 KB and would take about 41 minutes to transfer across a dial-up line with a 1200 bps modem, which was a common modem speed in the 1980s.

So when someone figured out a way to compress the picture files dramatically by identifying repeated sequences and, in effect, saying "the next 30 pixels are light blue" instead of repeating "light blue, light blue, light blue" 30 times, it was a HUGE step forward, indeed worthy of a patent. In fact, the stepwise refinements involved happened at several different companies, each of which tried to assert patent rights over its specific contributions and to block the use of the resulting GIF file format unless you paid royalties. Eventually, the Open Source community developed the PNG (Portable Network Graphics) file format to get around this. The last of those patents expired in 2004.
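The size and transfer-time figures above, and the "next 30 pixels are light blue" idea, can be checked with a short Python sketch. The function names are made up for illustration. The timing assumes 8 bits per byte with no modem or protocol overhead, and the compression shown is plain run-length encoding, which is the simplest form of the idea; GIF itself uses the more elaborate LZW scheme.

```python
def raw_image_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size of a picture with 8 bits for each of 3 primary colors."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


def transfer_minutes(size_bytes, bits_per_second):
    """Minutes needed to send size_bytes, ignoring any line overhead."""
    return size_bytes * 8 / bits_per_second / 60


def run_length_encode(pixels):
    """Collapse each run of identical pixels into a (count, value) pair."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1          # same color as before: extend the run
        else:
            runs.append([1, value])   # new color: start a new run
    return [(count, value) for count, value in runs]


size = raw_image_bytes(4, 6, 72)
print(size)                                 # 373248 bytes, roughly 375 KB
print(round(transfer_minutes(size, 1200)))  # 41
print(run_length_encode(["light blue"] * 30 + ["white"] * 2))
# [(30, 'light blue'), (2, 'white')]
```

Run-length encoding only pays off when neighboring pixels really are identical, which is one way to see why this family of formats suits cartoon-like images better than photographs.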

The History of Computer Picture Formats: JPEG, MPEG and HEIF/HEVC/HEIC

While GIF and PNG work very well for "cartoon-like" images, where solid blocks of color have the same shade in areas of some size, they work less well for photographs, where even blocks of seemingly the "same color" turn out to be graduated and textured when you examine them closely. Several standards organizations formed the "Joint Photographic Experts Group" to solve this problem, and in 1992, they presented their result, the JPEG (or JPG) file format. This worked very well, and was adopted by almost all websites. Even so, there were several lawsuits by people who claimed to have invented and patented some of the techniques even before the JPEG committee started its work, and one of them managed to extract over $100,000,000 in license fees before the courts and the US Patent Office finally declared their patent invalid in 2006.

A similar evolution has been happening with video file formats, under MPEG (the Moving Picture Experts Group), producing first the MPEG-2 and then the MPEG-4 video encoding standards. Along the way, a sound compression scheme (MPEG-1 Audio Layer III - or MP3 for short) became an important file format in its own right, beginning around 1995.

When Apple was preparing iOS 11 in 2017, coinciding with the release of iPhone X, they incorporated "the next generation" of photo and video encoding in the form of HEIC photo formats, and HEVC video formats. These are more efficient (create smaller files for the same apparent quality) than their predecessors. While the major patent holder has agreed to allow the use of their patented compression technology in software distributed at no cost, there is some fear that one of the other patent holders might still be preparing to assert their right to extort fees once the formats are widely adopted, and the Open Source community is trying to keep its distance from these possible future problems. (There are dozens of individual patent holders registered in the licensing pool for this technology.)

How does this affect me?

I have a large repository of photos on my Linux home server … about 100,000 images.

Both my wife and I use iPhones for everyday photos. My wife likes to put hers on iCloud and make nice albums on her phone; she hardly ever moves them anywhere else (except by sending them in MMS text messages). I don’t trust that stuff, so I copy both hers and mine onto the Linux server for long-term safekeeping, and I have written CGI programs to browse them from anywhere.

Apple likes the HEIF family of image formats (High Efficiency Image File Format - with photos that have an HEIC suffix), but almost nothing else will decode those. So I recode the photos to .JPG format on the way in, with a script that invokes the heif-convert command, and I was very happy with this for a few years. But last week, this stopped working. After some googling, I found the problem.
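My own script is not reproduced here, but the idea can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the function names, the two-directory layout, and the rule of skipping photos that already have a JPEG. It also assumes heif-convert is on the PATH and is called with the input file followed by the output file.

```python
import subprocess
from pathlib import Path


def jpeg_target(heic_path, out_dir):
    """Return the .jpg path that corresponds to one HEIC file."""
    return Path(out_dir) / (Path(heic_path).stem + ".jpg")


def convert_new_photos(in_dir, out_dir, run=subprocess.run):
    """Convert every HEIC file in in_dir that has no JPEG in out_dir yet."""
    converted = []
    for heic in sorted(Path(in_dir).iterdir()):
        if heic.suffix.lower() != ".heic":
            continue                  # not an iPhone photo, leave it alone
        jpg = jpeg_target(heic, out_dir)
        if jpg.exists():
            continue                  # already converted on an earlier run
        # Assumed call form: heif-convert <input> <output>
        run(["heif-convert", str(heic), str(jpg)], check=True)
        converted.append(jpg)
    return converted
```

Because check=True is passed, a failing heif-convert raises an exception instead of being silently ignored, which is how a breakage like the one described next would show up right away.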

The holders of the patents behind the HEIF formats appear to have started to litigate to stop the free distribution of the libraries that can decode these formats, and the people supporting the various Linux distributions are deleting the affected decoding routines from the library modules. The result was that my image processing suite stopped working after a routine dnf update command.

Once you know what is happening, it is possible to undo the damage by issuing the command dnf downgrade libheif (which rolls this library module back to the previous version) - but it will break again the next time you do a dnf update. (Yes, somewhere there almost certainly is an option that can be applied on the dnf command line to update everything except libheif, but I have not found it yet.)

This sucks!

I hope that someone will figure out a workaround involving the RPM Fusion non-free repository, so I can go back to applying weekly patches again!

Update

A couple of days have passed, and I have learned a few more things:

This is what the RPM Fusion repositories are for, and indeed, the better solution involves RPM Fusion. They have a pair of matching versions of competing plugins: libheif-freeworld, which does not contain the “offending” codecs - and libheif-hevc, which is a codec pack that does contain those codecs. You install libheif-hevc, and it comes in from RPM Fusion. But after that, when you remove the exclude=libheif setting that was holding the downgraded library in place and do dnf update, libheif-hevc and libheif-freeworld compete. In my case, libheif-freeworld won out, and I was back where I started. So I had to put back the exclude, but now it reads exclude=libheif-freeworld, and it will do the right thing even in a system-upgrade (i.e. a Fedora version update, which happens twice a year).

Another update 2023-06-09

I tried to use it again, and again it failed. Apparently the heif-convert program now lives in a new, separate package, libheif-tools. After I installed that, it started working again.