Technology comes and goes faster than we realize: products and services that were mainstays just a decade or two ago are now extinct. The key to preserving digital media is intelligent data archival, and that involves more than just the physical media used to store information. The file format used to archive information is equally important, and not every file type is great for longevity. There isn't a perfect method for predicting which formats will outlast the hardware storing them, but the most likely candidates store unencrypted and uncompressed data, or are openly documented standards rather than proprietary formats.

In fact, notable archival bodies like the U.S. Library of Congress have put together lists of file formats best suited for preservation. We can also look at the past to predict which file formats have a strong probability of surviving in the future. This list of enduring formats includes file types that not only meet the characteristics outlined above, but have also already survived for decades. If you use these file formats for your archival needs, there's a good chance your digital data will be around longer than the storage media you're currently using to hold it.

TXT

A simple, plain-text document that should last forever

TXT files are some of the most basic computer document types available. They contain plain text stored in a standard character encoding such as ASCII or UTF-8, which can be read by virtually any computer, word processor, or document viewer. This plain text file type lacks the modern amenities we've come to expect from text documents, like formatting, hyperlinks, or embedded images.

However, this simplicity is also what makes TXT files excellent for preservation. TXT files are small and can be read without any specialized software, and their basic nature makes them readable across operating systems and technological eras.
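As a minimal sketch of what that simplicity looks like in practice, the Python snippet below writes a note to a TXT file and reads it back. The helper names and the sample note are hypothetical, and choosing UTF-8 with plain line feeds is an assumption about what makes a sensible archival default, not a rule taken from any standard.

```python
import tempfile
from pathlib import Path


def write_text_archive(path, text):
    """Write text as UTF-8 with plain '\\n' line endings (an assumed archival default)."""
    # Naming the encoding explicitly avoids depending on the platform default.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)


def read_text_archive(path):
    """Read the file back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample content.
note = "Grandma's recipe: 2 cups flour, 1 tsp salt\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "recipe.txt"
    write_text_archive(path, note)
    raw_bytes = path.read_bytes()      # exactly what sits on disk
    restored = read_text_archive(path)
```

The bytes on disk are nothing more than the encoded characters, which is why any tool that understands the encoding can open the file.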

XML

A markup language for structuring and tagging text information

XML stands for "Extensible Markup Language," and it's a more versatile plain-text document type than TXT. While TXT is excellent for basic archival of unstructured data, XML is superior for organizing and structuring digital information. XML files contain tags that label and group the information contained within them. Like CSV, XML is well suited to structuring large data sets, such as database exports or metadata.

The beauty of using XML files is that they make logical sense when read as plain text, since the tags reveal exactly what each grouping of data contains. In other words, an XML document can be opened with any basic text editor, and a human or software tool will have no problem parsing the data and understanding what it all means.
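To illustrate, here is a small sketch that parses a made-up metadata record with Python's standard library. The tag names (photo, title, date, location) are invented for this example and don't come from any particular schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical metadata record; the tags say what each value means.
record = """<photo>
  <title>Family reunion</title>
  <date>1998-07-04</date>
  <location>Lake house</location>
</photo>"""

root = ET.fromstring(record)

# Turn each child element into a tag -> text pair.
fields = {child.tag: child.text for child in root}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Even without the parser, a person opening the same text in a basic editor could tell which value is the title and which is the date.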

PDF

The do-it-all file format with support for fonts, formatting, and images

One important characteristic of file types best for long-term preservation that we haven't covered yet is ubiquity. A popular format that is proprietary might have a better shot at surviving than a niche open-source file type that no one uses. This is one of the reasons that the Portable Document Format (PDF), created by Adobe, is likely to last despite being an imperfect file format for archival purposes. Standard PDFs aren't required to embed the fonts they use and can instead rely on fonts installed on your device, which is problematic. If certain fonts become uncommon or extinct, older PDFs may no longer display as intended on newer devices.

Luckily, there's a solution, and it's called PDF/A, a version of PDF designed for archiving. PDF/A files bake all the information needed to open and read the document into the file itself. This includes fonts, character sets, and color profiles. By design, PDF/A files exclude things like audio recordings, video recordings, and encryption that standard PDFs allow. Thanks to the ubiquity of PDF and the future-proofing of PDF/A, these files are well-positioned to last for a while. If given the choice, you should use PDF/A in situations where preservation is the goal.

CSV

A plain-text document with organized, tabular data

The best way to preserve information typically contained in a spreadsheet is a CSV file, which stands for Comma-Separated Values. In simple terms, CSV files are plain-text documents in which each line is a row and commas separate the values within it, usually with a first line that names the columns. CSV files can be read in graphical form by spreadsheet applications like Microsoft Excel or Google Sheets, but they can also be opened with any text editor or document viewer. The commas are key to isolating each data entry and aligning it with its column. Because CSV files are plain text, they should stay readable for as long as computers can read text files at all.
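The sketch below shows this layout using Python's built-in csv module. The column names and rows are hypothetical. One detail worth noticing is that a value containing a comma gets wrapped in quotes so it isn't mistaken for two separate entries.

```python
import csv
import io

# Hypothetical spreadsheet rows; every value is kept as text.
rows = [
    {"title": "Budget, 2004", "pages": "12"},
    {"title": "Holiday letter", "pages": "3"},
]

# Write the rows as CSV into an in-memory text buffer.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["title", "pages"])
writer.writeheader()          # first line names the columns
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read the same text back into dictionaries keyed by column name.
restored = list(csv.DictReader(io.StringIO(csv_text, newline="")))

print(csv_text)
```

Reading the text back yields the same rows that went in, and the printed output is legible on its own in any text editor.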

TIFF

An uncompressed and unencrypted image format

Finding the right image format for preservation is tricky. JPEG looks good on paper, but it relies on lossy compression, which discards data and can introduce artifacts. (JPEG 2000 is a separate standard rather than a subformat of JPEG, and it does offer a lossless mode.) Instead, TIFF is a viable alternative for long-term image storage. The name stands for Tagged Image File Format, and crucially, it can store image data uncompressed and unencrypted. This makes it ideal for storing visual information in rich quality, up to 32-bit color.

Unfortunately, uncompressed TIFF files are very large, and the format is size-limited: a standard TIFF file can't exceed 4GB. If storage efficiency matters more to you than fidelity, you'll need a compressed format like JPEG, and images larger than 4GB call for the BigTIFF variant or another format. But for the best shot at your files lasting for the long haul, TIFF is the superior option.
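The 4GB ceiling comes from classic TIFF storing positions inside the file as 32-bit offsets, while the BigTIFF variant switches to 64-bit offsets. As a rough sketch, the hypothetical function below inspects the first bytes of a file to tell the two apart. The sample headers are built in memory purely for illustration, and no real image is read.

```python
import struct


def describe_tiff_header(header):
    """Report whether a header looks like classic TIFF or BigTIFF."""
    # The first two bytes give the byte order: 'II' little-endian, 'MM' big-endian.
    byte_order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if byte_order is None:
        raise ValueError("not a TIFF header")

    (version,) = struct.unpack(byte_order + "H", header[2:4])
    if version == 42:
        # Classic TIFF: a 4-byte offset points to the first image directory.
        (offset,) = struct.unpack(byte_order + "I", header[4:8])
        return {"variant": "TIFF", "offset_bits": 32, "first_ifd_offset": offset}
    if version == 43:
        # BigTIFF: the offset is 8 bytes wide and starts at byte 8.
        (offset,) = struct.unpack(byte_order + "Q", header[8:16])
        return {"variant": "BigTIFF", "offset_bits": 64, "first_ifd_offset": offset}
    raise ValueError("unknown TIFF version number")


# Illustrative little-endian headers, not taken from real files.
classic_header = struct.pack("<2sHI", b"II", 42, 8)
big_header = struct.pack("<2sHHHQ", b"II", 43, 8, 0, 16)

classic = describe_tiff_header(classic_header)
big = describe_tiff_header(big_header)

# A 32-bit offset can address at most 2**32 bytes, which is the 4GB limit.
max_classic_bytes = 2 ** classic["offset_bits"]
```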

WAV

A digital file format containing raw and uncompressed audio

Need an audio file format built to last? AIFF (Audio Interchange File Format) or WAV (Waveform Audio File Format) is the way to go. AIFF was developed by Apple, while WAV was made by Microsoft and IBM, and WAV is the more common of the two. A WAV file typically stores uncompressed PCM audio, so nothing is lost to compression, which makes it ideal for preservation. WAV files can be played on virtually any operating system. Like TIFF files, though, they are much larger than compressed alternatives such as MP3.
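To show what "raw and uncompressed" means here, this sketch uses Python's standard wave module to store a short generated tone and read it back. The tone, sample rate, and helper name are arbitrary choices for the example, and the file lives in memory rather than on disk.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (an arbitrary but common choice)


def make_tone(seconds=0.01, frequency=440.0):
    """Return 16-bit mono PCM samples for a short sine tone."""
    count = int(SAMPLE_RATE * seconds)
    samples = [
        int(32767 * math.sin(2 * math.pi * frequency * i / SAMPLE_RATE))
        for i in range(count)
    ]
    return struct.pack("<%dh" % count, *samples)


pcm = make_tone()

# Write the samples into an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)           # mono
    writer.setsampwidth(2)           # 2 bytes = 16 bits per sample
    writer.setframerate(SAMPLE_RATE)
    writer.writeframes(pcm)

# Read the file back and pull out the stored audio frames.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    frame_count = reader.getnframes()
    frames = reader.readframes(frame_count)
```

The frames that come out match the samples that went in byte for byte, since the file is a short header followed by the samples themselves.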

AVI

A basic multimedia container storing both audio and video data

Popular video formats like MP4 are a weaker fit for preservation because they usually rely on lossy compression, and the Audio Video Interleave (AVI) format can sidestep this problem by storing uncompressed video. AVI is a container that holds the video and audio information necessary for playback. It's particularly suited to archiving digitized copies of very old videos from physical media, such as movies on VHS tapes. File sizes are large, and features like rich metadata support are largely absent. However, AVI is likely to be playable decades from now, and is recommended for video preservation by archival bodies like Northwestern University.

How important is having a future-proof file format?

The longevity of your digital file formats isn't as consequential today as it was in the past. As long as there are open-source conversion tools capable of reading older formats and transforming the data they contain into new ones, preservation isn't much of a worry. Common file formats like JPEG or FLAC will probably last decades, even if they don't meet every characteristic of an archival-grade file type. Still, going with a simple, openly documented format without unnecessary encryption or compression will set you up for success. Avoiding proprietary file formats and files locked with digital rights management (DRM) is recommended for archival purposes, too.

You'll notice that many of these file formats are older or more niche than the popular file types in their category. That's because they put simplicity ahead of the efficiency and features their successors and alternatives offer. When it comes to digital data preservation, though, simplicity and accessibility usually win.