Open file formats help ensure access to your data over the long term.
Guidelines for selecting file formats
Open file formats typically have the following characteristics:
- Non-proprietary
- Open, documented standard
- Common usage by the research community
- Standard representation (e.g. ASCII, Unicode)
- Unencrypted
- Uncompressed
It is also important to document, in a README file, the software needed to access and use the data.
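As a minimal sketch of what such documentation might look like, the Python snippet below builds a plain-text README section listing the software needed for each file. The function name `software_section`, the section heading, and the example entries are hypothetical and chosen only for illustration; adapt them to your own project.

```python
def software_section(entries):
    """Build a plain-text README section.

    `entries` is a list of (filename, software, version) tuples.
    The layout here is an assumption, not a required README standard.
    """
    lines = ["Software needed to access the data", ""]
    for filename, software, version in entries:
        lines.append(f"- {filename}: {software} {version}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example entries.
    example = [
        ("results.csv", "any text editor or spreadsheet program", "(no specific version)"),
        ("interview01.flac", "an audio player with FLAC support", "(no specific version)"),
    ]
    print(software_section(example))
```

The returned text can be pasted into, or appended to, the README that accompanies the data.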
Examples of preferred file formats
The following list is not exhaustive (source: UK Data Service):
- Text: PDF/A, RTF, TXT, XML
- Audio: FLAC
- Image: TIFF
- Spreadsheet: CSV, TAB
- Video: MP4, OGV, OGG, MJ2
- Geospatial: SHP, SHX, DBF, PRJ, TIF, TFW, DWG, GML
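The list above can be turned into a quick check of the files in a project. The following Python sketch is illustrative only: the names `PREFERRED_FORMATS`, `preferred_category`, and `non_preferred` are hypothetical, and the check looks at file extensions only, so it cannot confirm that a file's contents actually conform to the format.

```python
from pathlib import Path

# Extensions transcribed from the list above. PDF/A is left out because
# a ".pdf" extension alone does not show that a file conforms to PDF/A.
PREFERRED_FORMATS = {
    "Text": {"rtf", "txt", "xml"},
    "Audio": {"flac"},
    "Image": {"tiff"},
    "Spreadsheet": {"csv", "tab"},
    "Video": {"mp4", "ogv", "ogg", "mj2"},
    "Geospatial": {"shp", "shx", "dbf", "prj", "tif", "tfw", "dwg", "gml"},
}


def preferred_category(filename):
    """Return the category of a listed format, or None if the extension is not listed."""
    ext = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in PREFERRED_FORMATS.items():
        if ext in extensions:
            return category
    return None


def non_preferred(filenames):
    """Return the filenames whose extensions are not in the list above."""
    return [name for name in filenames if preferred_category(name) is None]
```

For example, `non_preferred(["survey.csv", "notes.docx"])` flags `notes.docx` as a candidate for conversion before deposit.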
More information:
Some resources for identifying preferred long-term preservation file formats include:
- Library and Archives Canada Guidelines on File Formats
- US Library of Congress Recommended Formats Statement
- US Library of Congress Sustainability of Digital Formats