Jpegli for JPEG Encoding: Dark Magic in Software Encodings
I’ve had one version or another of my “Responsive Images to Hugo” program for a few years now. It’s a pretty simple program: it takes some images, runs some processing, and then uploads them so that I can easily reference them in this blog.
I’d just finished a new iteration of the program1 (one that converts images to WebP files), when Google announced their new Jpegli library. And so, in a ritual I know far too well, I spent the best part of this weekend working on rewriting yet another version of this program - this time with Jpegli.
Why the fuss?
JPEG encoding isn’t exactly the most interesting thing in the world, but Jpegli more resembles art than code. It’s strictly better than previous JPEG libraries, in the kind of way Michael Phelps is better at swimming than other humans. The files it produces aren’t just smaller, they’re much smaller. The roughly 35% improvement in compression means that, for reasonably sized files, modern encodings like AVIF and WebP become irrelevant. Smaller images on the web mean faster pages.
Jpegli images also look better. That’s partly due to better heuristics, but also because of the arcane software wizardry that lets Jpegli encode 10 bits of information into an 8-bit JPEG standard so old it could theoretically have kids. It’s not a new JPEG encoding in the sense that other JPEG decoders, the kinds made a decade ago, won’t understand the output. Jpegli produces images entirely compatible with all other JPEG decoders; it just encodes the information more efficiently, so that if you encode and then decode an image using Jpegli you’ll have access to richer information.
You might think the trade-off for better, smaller images would be that encoding takes a long time2, and you’d be wrong. Jpegli not only produces smaller files that look better, in an entirely backwards-compatible fashion3; it also does so at comparable speeds, and with an entirely compatible API/ABI. You can drop a compiled Jpegli library into any place that libjpeg-turbo or mozjpeg is being used and theoretically produce better, smaller files without any downside.
Dark. Magic.
Jpegli is plain better, and theoretically easy to integrate. So why wouldn’t I give it a go?
Down the Rabbit Hole
My program was written in Rust, so my starting point was the unofficial Jpegli crate for Rust, which uses (mostly) the same API as the mozjpeg-rust crate. Since the two crates share the same API on the Rust side as well, leveraging Jpegli should just be a matter of swapping out mozjpeg-rust for a basically identical Jpegli module.
However, there was one snag. My program didn’t actually use mozjpeg-rust. Instead, it uses rimage, which in turn uses zune-jpeg for encoding and decoding JPEGs by default. After digging a little, I discovered that there was a feature flag for encoding JPEGs using mozjpeg! So, a short while later I had a PR swapping encoding over to Jpegli instead. Writing the encoding portion of the code was mostly find-and-replace; however, I couldn’t get Jpegli decoding working. Testing my version, I kept running into out-of-bounds errors, and I was pretty sure my initial sketch of the code didn’t handle things like colour profiles, colour spaces, and EXIF data well.
Trying to find some mozjpeg-rust decoding reference code to copy, I found an example in the load_image crate, which is another crate written by the author of mozjpeg-rust. The relevant file has been reproduced in part below.
let thread_res = panic::catch_unwind(move || {
    let which_markers = if self.metadata {
        ALL_MARKERS
    } else {
        &[
            Marker::APP(1), /* Exif */
            Marker::APP(2), /* Profile */
        ]
    };
    let dinfo = Decompress::with_markers(which_markers).from_mem(data)?;
    let width = dinfo.width();
    let height = dinfo.height();
    if width * height > 10000 * 10000 {
        return Err(crate::Error::ImageTooLarge);
    }
    let (orientation, is_adobe_1998) = Self::get_exif_data(&dinfo);
    let profile = if self.profiles == Profiles::None {
        None
    } else if let Some(embedded) = self.get_jpeg_profile(&dinfo) {
        Some(embedded)
    } else if is_adobe_1998 {
        Profile::new_icc(profiles::ADOBE1998).ok()
    } else {
        None
    };
    let chunks = dinfo.markers().map(|m| {
        (ChunkType::JPEG(m.marker), m.data.to_vec())
    }).collect();
    let meta = ImageMeta::new(Format::Jpeg, chunks, fs_meta);
    let img = match dinfo.image()? {
        mozjpeg::Format::RGB(mut dinfo) => {
            let mut rgb: Vec<RGB8> = dinfo.read_scanlines()?;
            rgb.to_image(profile, width, height, true, meta)
        },
        mozjpeg::Format::CMYK(mut dinfo) => {
            let cmyk: Vec<CMYK> = dinfo.read_scanlines()?;
            cmyk.as_slice().to_image(profile, width, height, true, meta)
        },
        mozjpeg::Format::Gray(mut dinfo) => {
            let mut g: Vec<GRAY8> = dinfo.read_scanlines()?;
            g.to_image(profile, width, height, true, meta)
        },
    };
    Ok((img, orientation))
});
let (img, orientation) = thread_res.map_err(|e| {
    let string = e.downcast::<String>().map(|e| *e)
        .or_else(|e| e.downcast::<&'static str>().map(|s| String::from(*s)));
    if let Ok(e) = string {
        crate::Error::Jpeg(e)
    } else {
        crate::Error::UnsupportedJpeg
    }
})??;
Ok(img.rotated(Rotate::from_exif_orientation(orientation)))
Looking at the code, I realised that maybe this was a little more complicated than I’d imagined, and decided to change tack: offload all the responsibility to load_image by changing load_image to use Jpegli. Simple! Now my program could depend on load_image for Jpegli decoding, and it could continue to use rimage for encoding, just now with added Jpegli.
The only difficulty was transforming the struct returned by load_image into the format of zune-image (the underlying image management crate for rimage).
load_image uses imgref under the hood, which provides an abstraction over a Vec<u8> for storing image buffers. Other image crates aren’t interoperable with imgref though; to load in image data to manipulate, they expect a plain Vec<u8>. And imgref doesn’t give you a simple way to get a Vec<u8> back when you want one.
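The shape of that conversion, pulling pixels out of a struct-per-pixel buffer and flattening them into interleaved bytes, can be sketched with plain std types. Rgb8 and flatten_rgb8 below are hypothetical stand-ins, not types from imgref or zune-image:

```rust
// Hypothetical stand-in for the pixel struct an imgref buffer holds.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rgb8 {
    r: u8,
    g: u8,
    b: u8,
}

// Flatten a slice of pixel structs into the interleaved byte buffer
// (r, g, b, r, g, b, ...) that a crate expecting a plain Vec<u8> wants.
fn flatten_rgb8(pixels: &[Rgb8]) -> Vec<u8> {
    pixels.iter().flat_map(|p| [p.r, p.g, p.b]).collect()
}

fn main() {
    let pixels = [Rgb8 { r: 1, g: 2, b: 3 }, Rgb8 { r: 4, g: 5, b: 6 }];
    assert_eq!(flatten_rgb8(&pixels), vec![1, 2, 3, 4, 5, 6]);
}
```

This is essentially what the flat_map calls in the real code below are doing, just without the generic pixel traits.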
The code ended up looking like this.
fn zune_image_from_loaded_image(image: load_image::Image) -> Image {
    match image.into_imgvec() {
        ImgVecKind::RGB8(pixels) => {
            let (pixels, width, height) = pixels.into_contiguous_buf();
            zune_image::image::Image::from_u8(
                &pixels
                    .into_iter()
                    .flat_map(|px| px.as_slice().to_owned())
                    .collect::<Vec<u8>>(),
                width,
                height,
                ColorSpace::RGB,
            )
        }
        ImgVecKind::RGBA8(pixels) => {
            let (pixels, width, height) = pixels.into_contiguous_buf();
            zune_image::image::Image::from_u8(
                &pixels
                    .into_iter()
                    .flat_map(|px| px.as_slice().to_owned())
                    .collect::<Vec<u8>>(),
                width,
                height,
                ColorSpace::RGBA,
            )
        }
        ImgVecKind::RGB16(pixels) => {
            let (pixels, width, height) = pixels.into_contiguous_buf();
            zune_image::image::Image::from_u16(
                &pixels
                    .into_iter()
                    .flat_map(|px| px.as_slice().to_owned())
                    .collect::<Vec<u16>>(),
                width,
                height,
                ColorSpace::RGB,
            )
        }
        ImgVecKind::RGBA16(pixels) => {
            let (pixels, width, height) = pixels.into_contiguous_buf();
            zune_image::image::Image::from_u16(
                &pixels
                    .into_iter()
                    .flat_map(|px| px.as_slice().to_owned())
                    .collect::<Vec<u16>>(),
                width,
                height,
                ColorSpace::RGBA,
            )
        }
        // Removed for brevity
    }
}
The key elements in the code above are:
- Matching on the enum, so that you can handle different numbers of colour channels (RGB is 3 channels, RGBA is 4 channels, and greyscale is 1 channel) as well as different integer sizes (8-bit vs 16-bit)
- Using into_contiguous_buf to produce a buffer where pixels are laid out row by row (i.e. the pixel for coordinate (x, y) is found at index y * width + x)
- Once you have access to a pixel, you can flat_map it into RGB(A) order
Nothing too hard, just a bunch of simple but arcane knowledge required!
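As a sanity check on that row-by-row layout, the index arithmetic can be captured in a tiny helper. The names here are illustrative, not from any of the crates above:

```rust
// Row-major layout: row y starts at y * width, so the pixel at
// coordinate (x, y) lives at index y * width + x.
fn pixel_index(x: usize, y: usize, width: usize) -> usize {
    y * width + x
}

fn main() {
    let width = 4;
    // In a 4-pixel-wide image, (x = 2, y = 1) is the seventh pixel: index 6.
    assert_eq!(pixel_index(2, 1, width), 6);
    // The first pixel of the second row sits right after the first row ends.
    assert_eq!(pixel_index(0, 1, width), width);
}
```

Note that width, not height, is the stride between rows; mixing the two up is an easy way to get exactly the kind of out-of-bounds errors I was hitting.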
Now I can transform images in peace, until the next great leap in JPEG encoding.
- This one avoids the need to use responsivebreakpoints.com by ditching art direction (for now) and generating resizes locally. ↩︎
- This is true for Guetzli, a JPEG encoder that produces fantastic (best-in-class) encodings, but it takes an eternity and demands huge amounts of memory. ↩︎
- Okay, so this is a bit of a lie. 10-bit encoding and decoding uses new APIs/ABIs, but even without 10-bit there is both a visual and compression improvement from Jpegli. ↩︎