As the number of federal records continues to increase, and agencies are required to create and maintain electronic versions of a larger portion of these documents, it’s important to choose “sustainable formats.” Consider what happens to a record slated for long-term storage if it’s kept in a format that becomes obsolete in the coming years: the record may become unusable or unintelligible, or be stripped of important information (like metadata).

Sustainable formats, by contrast, increase the likelihood that electronic records will remain accessible and usable throughout the record’s lifecycle and the organization’s lifespan.

What makes a format sustainable?

Sustainable formats are usually standardized, which means many developers recognize and use the formats. As a result, their use tends to be widespread enough that the market itself supports continued use of the formats. They are also “self-describing” in that the format itself stores relevant metadata needed to understand and interpret the information contained within the record.

There are also some less obvious considerations. For example, even though many records need to be protected, requiring retention of user IDs and/or passwords to access records will limit their sustainability. It’s better to find other ways to protect the contents of the record. On that note, be careful how you handle encryption. Encryption is an excellent way to secure records that contain sensitive, private, or confidential information. However, it should be handled by the Electronic Records Management (ERM) system, not at the file level, because a given encryption technique or technology may become outdated or inaccessible.

Another example of a non-obvious consideration is whether the format is “lossless.” Many digital formats “compress” the data they contain in order to shrink file size and save storage space. Some compression techniques, however, may remove some information from the file. A “lossless” file format preserves all information within the record, a key component of long-term viability.
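
To make the idea of “lossless” concrete, here is a minimal Python sketch. It assumes a made-up record (the text and record number are purely illustrative) and uses the standard library’s zlib module, one lossless compression technique among many, to show that the restored data matches the original byte for byte.

```python
import zlib

# Hypothetical record content, repeated so there is something to compress.
original = ("Record 2024-001: quarterly inspection report. " * 50).encode("utf-8")

compressed = zlib.compress(original)    # DEFLATE, a lossless technique
restored = zlib.decompress(compressed)  # reverse the compression

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("Identical after round trip:", restored == original)
```

A lossy technique, by contrast, would not be able to pass that final comparison, because some of the original information is discarded to save space.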

What formats are not sustainable?

Before we dive into examples of sustainable formats, what are some formats to avoid? In general, any proprietary format that is owned and/or controlled by a single corporation or organization will not be suitable for long-term storage. What happens if that organization is dissolved at some point in the future? Similarly, any proprietary format that can only be used in one or a small number of software programs is less likely to be sustainable. What happens if the software becomes unavailable or cannot be run on the computers of the future?

Note that it’s fine to use proprietary file formats (like Microsoft Office documents) while the record is being created and used, as long as it can later be converted or exported into a sustainable format for long-term retention.

What are examples of sustainable formats for text files?

HTML (Hypertext Markup Language) and XML (Extensible Markup Language): HTML is the markup language used for web content. It is an open standard and widely used, but it offers only limited support for descriptive metadata. That’s where XML comes in: it’s another open markup format that can store a record’s content together with the metadata that describes it.
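
As a rough illustration of how XML can carry metadata alongside content, the Python sketch below builds a small record and reads a metadata field back out. The element names (record, metadata, title, creator, created, body) and the values are assumptions made up for this example, not part of any official records schema.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical record: descriptive metadata plus the content itself.
record = ET.Element("record")
metadata = ET.SubElement(record, "metadata")
ET.SubElement(metadata, "title").text = "Quarterly Inspection Report"
ET.SubElement(metadata, "creator").text = "Records Office"
ET.SubElement(metadata, "created").text = "2024-01-15"
ET.SubElement(record, "body").text = "All inspected sites met requirements."

# Serialize to text, as it would be stored in a file.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# Any XML-aware program can later recover the metadata from the file alone.
parsed = ET.fromstring(xml_text)
title = parsed.findtext("metadata/title")
print("Title:", title)
```

Because the labels travel with the data, a future reader does not need the original software to work out what each value means.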

Plain Text: The most basic type of text file, plain text (usually denoted by “.txt” at the end of the filename) is universal and easy for a huge variety of programs to read. It doesn’t capture formatting like bold or italics, and special characters like emojis display correctly only when the file’s character encoding (such as UTF-8) is known, but there’s little doubt plain text files will stand the test of time.

ODF (Open Document Format): An alternative to proprietary file formats (like those used with Microsoft Office), ODF is an XML-based format for word processing documents, spreadsheets, and presentations. It’s less widely used than other text formats, but it’s an open standard and supported by most software that can read those kinds of files. ODF files are denoted by .odt, .ods, and .odp in the filename.

PDF (Portable Document Format): PDF is one of the most common formats for records intended for widespread use. Unlike plain text, PDF can include formatting and media elements like graphics and sometimes even audio, video, and interactive 3D objects. Originally created by Adobe, PDF is now an openly published standard (ISO 32000).

PDF/A (Portable Document Format / Archival): A specialized variant of the PDF format, PDF/A is better suited for genuinely long-term storage. Unlike the standard PDF format, it can’t embed multimedia like video and audio files, but it does embed the fonts it uses and stores descriptive metadata within the document itself, so the file is completely self-contained.

What are examples of sustainable formats for spreadsheets?

CSV (Comma Separated Values): Despite the fact that spreadsheets can contain a lot of complex information, CSV files can be a simple way to capture and store that data. Think of it as the spreadsheet equivalent of plain text. CSV files can be opened by almost any text editor or spreadsheet software. However, the CSV format is simplistic and does not offer advanced features, so it may be best suited for relatively basic spreadsheets.
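
Here is a small Python sketch of what “simplistic” means in practice. The spreadsheet rows are invented for illustration. The example writes them out as CSV with the standard csv module and reads them back, showing that the values survive but everything returns as plain text, with no number types, formulas, or formatting.

```python
import csv
import io

# Hypothetical spreadsheet rows: a header followed by two data rows.
rows = [
    ["item", "quantity", "unit_cost"],
    ["Folders", 120, 0.45],
    ["Archive boxes", 30, 2.10],
]

# Write the rows as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the CSV back: every cell is returned as a string.
restored = list(csv.reader(io.StringIO(csv_text, newline="")))
print(restored[1])
```

The numbers 120 and 0.45 come back as the strings "120" and "0.45", so any program that reads the file has to decide for itself how to interpret each column.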

ODF: As discussed above, ODF (here, the .ods format specifically) is a good option for storing more advanced and complex spreadsheets. ODF is better than, say, .xls (Microsoft Excel format) for long-term, ongoing storage because it is an open standard (not proprietary) and therefore readable by a wider range of software.
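
As a rough sketch of how the formats covered in this post might be applied, the Python snippet below flags files that may need conversion before long-term retention. The function name, the extension list, and the sample filenames are all assumptions for illustration. A file extension is only a hint: for instance, it cannot tell a PDF/A file from an ordinary PDF.

```python
from pathlib import Path

# Extensions for the formats discussed in this post (an illustrative list,
# not an official one).
SUSTAINABLE_EXTENSIONS = {
    ".html", ".xml", ".txt", ".odt", ".ods", ".odp", ".pdf", ".csv",
}

def needs_conversion(filename: str) -> bool:
    """Return True if the extension is not on the illustrative list above."""
    return Path(filename).suffix.lower() not in SUSTAINABLE_EXTENSIONS

# Hypothetical filenames to check.
for name in ["budget.xls", "minutes.odt", "inventory.CSV", "memo.doc"]:
    status = "convert before archiving" if needs_conversion(name) else "ok"
    print(f"{name}: {status}")
```

A check like this fits the point made earlier: proprietary formats are fine while a record is in active use, as long as they are converted before the record goes into long-term storage.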

In our next post in this series, we’ll look at sustainable file formats for multimedia, including audio, video, and image records, as well as touch on long-term sustainability in storage media.

About PSL

PSL is a global outsource provider whose mission is to provide solutions that facilitate the movement of business-critical information between and among government agencies, business enterprises, and their partners. For more information, please email info@penielsolutions.com.