An Overview of Image Metadata - How It Affects Web Performance and Security
If your website's images are taking forever to load, the data dwelling behind them could be partially to blame. Even worse, this data could contain sensitive information that you don't want visitors to know. This guide will explain how to make sure your image metadata isn't weighing down your applications and leaving them vulnerable to exploitation.
What is metadata?
Metadata is any auxiliary information stored within a file, which may include when the file was created and last edited. Web developers sometimes add short descriptions, or tags, to metadata so that search engines can identify images. According to research conducted by Dexecure, such metadata can account for more than 15 percent of a JPEG file's total size.
Image files actually contain several separate blocks of data, and different types of metadata are stored in different blocks (loosely called "files" here). For example, in addition to visual data, JPEG images might also contain:
- EXIF files, which hold information about the camera settings and manufacturer.
- IPTC files, which hold user-added metadata.
- 8BIM files, which are added by Photoshop.
- ICC files, which contain information about embedded color profiles.
EXIF files alone can include dozens of specific details such as when the picture was taken, the shutter speed used and whether the flash was enabled. Most of this information isn't necessary for a browser to render the image, so it can be disposed of without worries.
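To see which of these blocks a particular JPEG carries, and how much of the file they take up, you can walk the file's segment headers. The following is a minimal sketch using only Python's standard library. The function names (`classify`, `list_segments`, `metadata_share`) are made up for illustration, and the sketch assumes a well-formed file in which every segment before the image data has a length field.

```python
import struct


def classify(marker: int, payload: bytes) -> str:
    """Label a segment by its marker byte and the signature at its start."""
    if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
        return "EXIF"
    if marker == 0xE2 and payload.startswith(b"ICC_PROFILE\x00"):
        return "ICC profile"
    if marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
        return "Photoshop 8BIM / IPTC"
    if marker == 0xFE:
        return "comment"
    return "other"


def list_segments(data: bytes):
    """Return (marker, label, size in bytes) for each segment before the scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        segments.append((marker, classify(marker, payload), length + 2))
        pos += 2 + length
    return segments


def metadata_share(data: bytes) -> float:
    """Fraction of the file occupied by the recognized metadata segments."""
    meta = sum(size for _, label, size in list_segments(data) if label != "other")
    return meta / len(data)
```

Calling `metadata_share(open("photo.jpg", "rb").read())` on one of your own images (the file name is a placeholder) gives a rough idea of how much of the download is metadata rather than pixels.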
Examples of image metadata
Below are some specific things to look out for when trimming metadata from your images.
- Thumbnails: Image thumbnails are sometimes stored as EXIF data. Since browsers don't make use of thumbnails, this data should be removed.
- Copyright information: IPTC files sometimes contain information about the photographer or copyright holder. This information can come in handy, but keep in mind that many social media websites automatically omit such data when images are shared.
- Orientation instructions: EXIF data may also include a flag telling software how to rotate the image for display. Older browsers ignored this flag for images embedded in a page and applied it only when users visited the image's URL directly, but current browsers generally honor it everywhere. If you strip the flag, make sure the pixels themselves are already rotated correctly.
- Color profiles: Color profiles stored as ICC metadata are intended to ensure that an image looks consistent across different devices with various hardware capabilities. That said, not all browsers support ICC color profiles, so you should weigh the costs and benefits of including or excluding them.
The downsides of too much metadata
In addition to unnecessarily increasing the download size of your image files, storing too much metadata can cause other problems for your application. Browsers rely on height and width information from images to lay out web pages. Because EXIF data appears before the image's dimensions in a JPEG file, a browser has to wade through all that metadata before it knows how much space the image needs, which can delay rendering. Visitors take just a few seconds to decide whether or not they want to wait for a page to finish loading, so an excess of metadata can lead to higher bounce rates.
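To make the ordering concrete, the sketch below scans a JPEG for the frame header that holds the height and width and reports the byte offset at which it starts. The larger that offset, the more data has to arrive before the dimensions are known. This is an illustration in standard-library Python, not a description of how any particular browser parses images; `find_dimensions` is a made-up name, and the code assumes a well-formed file.

```python
import struct

# Start-of-frame markers carry the image dimensions. 0xC4, 0xC8 and 0xCC
# fall in the same numeric range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def find_dimensions(data: bytes):
    """Return (width, height, offset of the frame header) for JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # reached the image data without a frame header
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # After the length: 1 byte precision, 2 bytes height, 2 bytes width.
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height, pos
        pos += 2 + length
    raise ValueError("no frame header found")
```

Comparing the returned offset before and after stripping metadata from the same image shows how far the dimensions move toward the front of the file.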
Malefactors can use image metadata to carry out security attacks, which is why any data provided by your users should be sanitized before being used in security-sensitive situations. In some contexts, image metadata may tell viewers more than you want them to know. For example, in 2012 the whereabouts of John McAfee, then a person of interest to police in Belize, were exposed because geotagging was enabled on the iPhone used to photograph him. The image file that was published online included his exact location.
Likewise, Apple accidentally left metadata in material it released, including editorial comments from Apple employees. The comments, in this case, were innocuous, but such information could have opened the door for liability issues if any sensitive company secrets were revealed.
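Before publishing a photo, you can check whether it carries location data. The sketch below looks for the GPS pointer (tag 0x8825) in the first directory of a JPEG's EXIF block, using only Python's standard library. The function names are made up for illustration, the code assumes a well-formed file, and it only reports that GPS data is referenced without decoding the coordinates.

```python
import struct

GPS_INFO_TAG = 0x8825  # EXIF tag in the first directory that points at GPS data


def exif_has_gps(tiff: bytes) -> bool:
    """Check a TIFF-formatted EXIF block (without its 6-byte "Exif" prefix)."""
    if tiff[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("unknown byte order in EXIF header")
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header")
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        # Each directory entry is 12 bytes and starts with a 2-byte tag.
        start = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack(endian + "H", tiff[start:start + 2])
        if tag == GPS_INFO_TAG:
            return True
    return False


def jpeg_has_gps(data: bytes) -> bool:
    """Find the EXIF segment in JPEG bytes and check it for a GPS pointer."""
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF and data[pos + 1] != 0xDA:
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if data[pos + 1] == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return exif_has_gps(payload[6:])
        pos += 2 + length
    return False
```

A check like this could run in an upload handler so that geotagged images are flagged or cleaned before they are stored.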
Of course, image metadata isn't the only metadata developers need to worry about.
Documents and database files often retain timestamped modifications within their metadata, so content you thought you deleted may still be accessible. Of course, sometimes you want to keep information about a file's revision history, so you don't want to throw out the digital baby with the digital bathwater. Thus, all software developers should adopt some best practices for redacting metadata before releasing their applications to the world.
Removing image metadata
Some photo hosting services automatically scrub EXIF data. If yours doesn't, then you have plenty of methods to get rid of metadata on your own:
1. Removing metadata manually with Windows
If you have a lot of photos on your desktop that you want to upload online, you might as well go ahead and remove unwanted metadata in advance. In Windows, this is fairly simple. Just right-click on an image, select "Properties" and click on the "Details" tab. Then, click on the "Remove Properties and Personal Information" link. You'll be given the option of creating a copy of the image with the metadata removed, or you can specify which properties you want to throw out.
It's possible to select multiple images in File Explorer and perform this process on them all at once. Some EXIF data will remain, but the personal details that Windows lists, such as location and author, will be gone. For most purposes that is sufficient, although it's worth checking the result with another tool if the images are sensitive.
2. Removing metadata on a Mac
Getting rid of metadata on a Mac requires an extra application. Fortunately, there is ImageOptim. Once it's set up, you can drag multiple images into ImageOptim at once. When the process is finished, just drag the images back to your desktop.
3. Removing metadata with GIMP
For a more thorough metadata cleansing, you can use GIMP. Launch GIMP, open an image, and select "Export As" from the "File" menu. Give the image a name with a JPEG extension and click "Export." In the next window, open the "Advanced Options" panel, and then uncheck the box that says "Save EXIF data." Finally, hit "Export" again.
If you have Photoshop, its export dialogs offer a comparable option for leaving metadata out of the exported file.
4. Removing metadata with mobile apps
If your phone or tablet is your primary camera, you can download an app that removes EXIF data. Before you do that, however, look into your camera's settings for the option to disable EXIF data generation. Some cameras will only allow you to disable geotagging, so you may still need an app to finish the job.
More image compression tools that remove metadata
Aside from GIMP and ImageOptim, there are dozens of other compression tools to help reduce the size of your images by removing useless metadata. The Optimus plugin for WordPress, for example, automatically removes metadata and compresses images without affecting their visual appearance. If you're using Optimus, check the plugin's settings and leave any option for keeping image metadata unchecked.
Furthermore, EXIF Purge is a lightweight application that lets you remove EXIF data from multiple images at one time.
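If you would rather script the cleanup, the sketch below shows the general idea behind such tools: copy a JPEG segment by segment and leave out the ones that hold metadata. It is a simplified illustration in standard-library Python, not the method used by any of the tools named above. The name `strip_metadata` and the `keep_icc` parameter are made up, and the code assumes a well-formed file. Note that it drops the EXIF orientation flag along with everything else, and that `keep_icc=False` removes every APP2 segment, not only color profiles.

```python
import struct

# Segments treated as removable metadata in this sketch:
# APP1 (EXIF), APP13 (Photoshop 8BIM / IPTC) and COM (comments).
STRIP_MARKERS = {0xE1, 0xED, 0xFE}


def strip_metadata(data: bytes, keep_icc: bool = True) -> bytes:
    """Return a copy of the JPEG bytes without its metadata segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: copy the compressed image data unchanged.
            out += data[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        drop = marker in STRIP_MARKERS or (marker == 0xE2 and not keep_icc)
        if not drop:
            out += data[pos:end]
        pos = end
    raise ValueError("no image data found")
```

Because the image data is copied byte for byte, the pixels are not recompressed and the visual result is unchanged apart from any effect of the removed orientation flag or color profile.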
Image metadata and SEO
As mentioned earlier, image metadata can play an important role in search engine optimization, or SEO. Of course, the most important piece of information is the alt text: the alternative text, set in the page's HTML rather than in the image file, that gets displayed when the browser cannot load the image.
Google looks at alt text when ranking search results, so make sure yours is descriptive without being repetitive. If you upload images to WordPress, the media uploader will prompt you to include a description and a caption, which both get stored as metadata. Check out our ultimate guide for image optimization to learn more about using image metadata to make WordPress websites more crawlable and SEO-friendly.
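A quick way to find images that lack alt text is to scan a page's HTML for `img` elements whose `alt` attribute is absent or empty. The sketch below does this with Python's built-in `html.parser` module. The class and function names are made up for illustration. Keep in mind that an empty `alt` is legitimate for purely decorative images, so treat the results as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collect the src of every <img> whose alt attribute is absent or empty."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An attribute written without a value is reported as None.
        if not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str):
    """Return the src values of images in `html` that have no alt text."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.missing
```

For example, passing in a page that contains `<img src="b.jpg">` alongside properly described images returns a list containing `"b.jpg"`.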
Summary
Eliminating metadata is an easy way to deliver images more promptly to your users. It will also make your applications more secure. Whichever removal method you choose, taking the few extra seconds to strip your photos of unnecessary data will pay off in the long run.