Some Notes on Compression and Image Editing
This page collects some random information about editing, sharing, and
printing images, along with some observations about the JPEG compression scheme and
its implications for how to edit JPEGs.
First off, "why JPEG?" JPEG is by far the most common image format
for the web, and since it offers great compression, many people use it for image
storage as well. In addition, most digital cameras store their images as JPEG
files. As a result, we need to discuss JPEG and how it relates to editing your images.
JPEG (Joint Photographic Experts Group) is a compression scheme formalized in
the early 1990s. It is designed to store 24-bit (i.e. 8 bits per channel) images,
compressing them with a variable bit-rate lossy compression method. The
important concepts in that last sentence are "variable bit-rate" and
"lossy." Let's tackle them in reverse order.
JPEG is Lossy
"Lossy" in this context means that image data is thrown away
irrecoverably during the compression process. Consider everyone's favorite compression
format for storing files, namely ZIP archives. When you unZIP an archive, what
you get is exactly what you started with. In other words, ZIP is a
"lossless" compression scheme. The same is not true of JPEG. The JPEG
scheme actually consists of five different compression methods all rolled into one
process. Three of those steps involve throwing away information, and once
you've thrown away information, you can't recover it. That's the bad news. The
good news is that JPEG is designed to throw away information that your eye +
brain have a hard time detecting in the first place, so the effects of the loss
may not be obvious unless you're specifically looking for them.
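The difference between lossless and lossy is easy to sketch in a few lines of Python. This is an illustration only: zlib implements DEFLATE, the method most ZIP archives use, and the rounding function is a stand-in for a lossy step, not JPEG's actual algorithm. The function names are made up for this example.

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with DEFLATE; the output equals the input."""
    return zlib.decompress(zlib.compress(data))

def lossy_roundtrip(samples, step):
    """Stand-in for a lossy step: round each sample to a multiple of step."""
    return [round(s / step) * step for s in samples]

data = b"the same bytes come back " * 10
print(lossless_roundtrip(data) == data)  # lossless: nothing is lost

# Two different inputs collapse to the same output, so the original
# values cannot be recovered afterwards.
print(lossy_roundtrip([9, 11], step=8))
```

Once 9 and 11 have both been rounded to the same value, nothing in the output says which one you started with. That is what "irrecoverably" means here.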
JPEG Uses Variable Bit-rate Compression
What this means is that the JPEG scheme allows the user to specify how much
information to throw away. Strictly speaking, JPEG's compression is not
"variable bit-rate" since there is no provision for actually
specifying and/or controlling the size of the final JPEG file, but that's a
nit-pick at best. JPEG allows the user, through a control often called
"quality", to choose small files (and lower image quality) or large
files (and higher image quality). To understand exactly how JPEG does this, you
have to understand the inner workings of the JPEG scheme. A separate page has
the down-n-dirty info about how JPEG works.
JPEG: It's an 8x8 World
The important thing to know about JPEG, assuming that you skipped the
down-n-dirty page, is this: everything in JPEG works on 8x8 blocks of pixels,
and the lossiest part of the compression affects entire 8x8 blocks at a time.
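For the curious, here is a rough sketch of what "lossy on an 8x8 block" can look like, in plain Python. It rests on some loudly stated assumptions: a single uniform rounding step stands in for the "quality" control (a real JPEG encoder uses a table of per-frequency steps), the block is one channel of made-up pixel values, and all function names are invented for this example.

```python
import math

N = 8  # JPEG works on 8x8 blocks

def _scale(k):
    # Scale factors that make the transform orthonormal.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct_1d(v):
    """Orthonormal DCT-II of a length-8 sequence."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N))
            for k in range(N)]

def idct_1d(c):
    """Inverse of dct_1d."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(block):
    """Transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])

def idct_2d(coeffs):
    rows = [idct_1d(r) for r in coeffs]
    return transpose([idct_1d(c) for c in transpose(rows)])

def quantize(coeffs, step):
    """The lossy part: round every coefficient to a multiple of step."""
    return [[round(c / step) * step for c in row] for row in coeffs]

def roundtrip(block, step):
    """Compress and decompress one block with the given step."""
    return idct_2d(quantize(dct_2d(block), step))

def rms_error(a, b):
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return math.sqrt(total / (N * N))

# A made-up 8x8 block of pixel values.
block = [[i * i + 3 * j for j in range(N)] for i in range(N)]

# A bigger step plays the role of a lower "quality" setting.
for step in (2, 16, 64):
    print(step, rms_error(block, roundtrip(block, step)))
```

Each rounded coefficient contributes to every pixel in the block, which is why the damage shows up across the whole 8x8 block rather than at single pixels. In this sketch the RMS error stays at or below half the step, since the transform is orthonormal.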
Implications For Editing
Have you ever seen a photocopy of a photocopy? Or a copy of a copy of a copy?
The image starts to look pretty bad, doesn't it? This is called "generational
loss," and is the bane of editors of any kind of material. You want the
best quality at any given stage, and you don't want future versions of an image
to be limited by edits done to previous versions. Unfortunately, JPEG works
against you on that front. When you open a JPEG, you're de-compressing
compressed data. When you save a new JPEG, you're re-compressing that data. If
you open a JPEG and then save it again immediately, without doing anything, you
have re-compressed it...and we know that JPEG is lossy, so we know that some
image data will be thrown away in the process. If you opened, re-saved, closed,
and re-opened the same image enough times, you'd be sure to see the effects.
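Here is a toy model of generational loss, under a big simplifying assumption: a "save" just rounds every sample to a multiple of 8, and an "edit" just brightens every sample by 3. Real JPEG rounds DCT coefficients rather than pixels, so treat this only as a picture of how rounding errors pile up when edits and saves alternate. All names are invented for the example.

```python
def lossy_save(samples, step=8):
    """Toy 'save': round each sample to the nearest multiple of step."""
    return [round(s / step) * step for s in samples]

def brighten(samples, amount=3):
    """Toy 'edit': add a constant to every sample."""
    return [s + amount for s in samples]

def edit_with_saves(samples, sessions, step=8):
    """Save after every editing session (one generation per session)."""
    for _ in range(sessions):
        samples = lossy_save(brighten(samples), step)
    return samples

def edit_then_save_once(samples, sessions, step=8):
    """Do all the edits first, then make a single save at the end."""
    for _ in range(sessions):
        samples = brighten(samples)
    return lossy_save(samples, step)

def max_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

original = [10, 50, 90]
ideal = [s + 4 * 3 for s in original]  # four edits, no lossy saves at all

many = edit_with_saves(original, 4)
once = edit_then_save_once(original, 4)

print("ideal:", ideal)
print("saved every session:", many, "max error", max_error(many, ideal))
print("saved once:", once, "max error", max_error(once, ideal))
```

In this toy the version saved after every session ends up further from the ideal result than the version saved once, because each intermediate save rounds away part of the edit that came before it.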
So how do we avoid this problem?
The best way to avoid the problem is not to make your master images JPEGs. If
you're scanning images with a scanner, save them as TIFF files, Photoshop PSDs,
or whatever uncompressed (or lossless-compressed, as TIFF compression is) format
is available to you. GIF is limited to 256 colors, which is terrible for photos; don't even think about using it for photo
images. Almost every image editor will support TIFF files, so they're a good way
to go. TIFFs are large, but disk space is cheap.
If you're using a digital camera, which starts you off with a JPEG, re-save
it as a TIFF before doing anything else. Trust me on this one: the first thing
you should do is open the JPEG and "save as" the thing into TIFF
format. Then do your edits on the TIFF.
If TIFF is not an option and you have to work with JPEGs, there are a
couple of things you can do to help yourself. First of all, make all your early
saves at the highest image quality possible. This will reduce the amount of data
discarded with every save, which in turn will minimize generational losses. You
should also strive to minimize the number of generations in the editing process;
don't work with an image for 5 minutes, save it, come back the next day, re-open
it, work another 5 minutes, etc. Do as much editing as possible in one session,
and make only one "save" at the end. Also (this is a big one): if
you're going to crop your image, do the crop as the first operation. Remember
that JPEGs work on 8x8 pixel blocks; if you crop an image, you change the
boundaries on all the pixel blocks (unless you crop an exact multiple of 8 pixels off the top and left edges!),
which means all the DCT blocks get recomputed (see the JPEG
details page if you don't understand this comment and really want to do so).
So the absolute worst thing you can do to a JPEG image is crop it. For best
results, do this first, and do it only once.
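The block-boundary point about cropping can be seen in a deliberately crude sketch. Assume a caricature of a very low-quality save that keeps only the average of each block of 8 samples, applied to a single row of made-up pixel values. Real JPEG keeps far more than the average, and the names here are invented, but the alignment effect is the same idea.

```python
BLOCK = 8

def toy_save(samples):
    """Caricature of a low-quality save: keep only each block's average."""
    out = []
    for start in range(0, len(samples), BLOCK):
        chunk = samples[start:start + BLOCK]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out

row = [float(v) for v in range(32)]  # one row of pixels, a simple ramp
saved = toy_save(row)

aligned_crop = saved[8:]     # crop a multiple of 8: block grid unchanged
misaligned_crop = saved[3:]  # crop 3 pixels: every block boundary moves

# Re-saving the aligned crop changes nothing further in this toy...
print(toy_save(aligned_crop) == aligned_crop)
# ...but re-saving the misaligned crop mixes neighboring blocks together.
print(toy_save(misaligned_crop) == misaligned_crop)
```

After the 3-pixel crop, each new block straddles two old ones, so the second save has to average across values that the first save had kept separate. Encoders that subsample color may work in larger units such as 16x16, so even a multiple of 8 is not always a safe crop.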
Finally, "resize" should be the very last thing you do to an image...and as you might expect from the previous paragraph, you should do it only once. If you need two versions of the image at different resolutions, you're best off starting from a TIFF, resizing to the first target res, re-opening the TIFF, and then resizing to the second target res. This is true not just for JPEGs, but resizing a JPEG is especially tough on the image quality.