Organizing your Data

Organizing Files

Version control, a consistent file system and a consistent file naming structure make your data easier to understand, share and preserve. Some advice for naming files:

  • Avoid special characters
  • Use capitals or underscores instead of periods or spaces; try CamelCase
  • Use 32 or fewer characters
  • Use the ISO 8601 date format: YYYYMMDD
  • Include version information (if applicable)
  • Use meaningful names
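The naming advice above can be sketched as a small Python helper. This is only an illustration, not an official tool: the function names build_name and check_name, the ProjectDescription_YYYYMMDD_vNN pattern, and the choice to treat anything other than letters, digits, underscores and hyphens as a "special character" are all assumptions made for the example. It also assumes the 32-character limit counts the extension.

```python
import re
from datetime import date, datetime

MAX_LENGTH = 32  # limit from the advice above; assumed to include the extension


def build_name(project: str, description: str, day: date, version: int, ext: str) -> str:
    """Assemble a name such as SoilSurvey_20240305_v02.csv (hypothetical pattern)."""
    stem = f"{project}{description}_{day.strftime('%Y%m%d')}_v{version:02d}"
    return f"{stem}.{ext}"


def check_name(filename: str) -> list[str]:
    """Return a list of problems found in a file name; an empty list means none."""
    problems = []
    # Split off the extension so its period is not counted as a problem.
    stem, dot, _ext = filename.rpartition(".")
    if not dot:
        stem = filename

    if len(filename) > MAX_LENGTH:
        problems.append("longer than 32 characters")
    if " " in stem or "." in stem:
        problems.append("contains spaces or periods")
    # Assumption: only letters, digits, underscores and hyphens are "safe".
    if re.search(r"[^A-Za-z0-9_\- .]", stem):
        problems.append("contains special characters")

    # Look for a standalone run of eight digits and confirm it is a real date.
    match = re.search(r"(?<!\d)(\d{8})(?!\d)", stem)
    if match is None:
        problems.append("no YYYYMMDD date found")
    else:
        try:
            datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            problems.append("date is not a valid YYYYMMDD value")
    return problems


good = build_name("Soil", "Survey", date(2024, 3, 5), 2, "csv")
print(good, check_name(good))
print("my data.final.csv", check_name("my data.final.csv"))
```

A check like this cannot judge whether a name is meaningful, so it complements rather than replaces a written naming convention for your project.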

Documenting Data

Data documentation provides context on how your data was created, generated and processed. Documentation may take the form of a codebook, data dictionary, "readme" file, or formal metadata.
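As one possible illustration, a simple data dictionary can be kept as a CSV file with one row per variable. The sketch below uses Python's standard csv module. The column set (variable, description, type, units, allowed_values) and the sample entries are hypothetical choices for the example, not a required standard.

```python
import csv
import io

# Hypothetical column set for a minimal data dictionary.
FIELDS = ["variable", "description", "type", "units", "allowed_values"]


def write_data_dictionary(entries, handle):
    """Write one row per variable to an open text file or buffer."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for entry in entries:
        writer.writerow(entry)  # missing keys are written as empty cells


# Made-up entries, for illustration only.
entries = [
    {"variable": "site_id", "description": "Sampling site identifier",
     "type": "string", "units": "", "allowed_values": "S01-S20"},
    {"variable": "temp_c", "description": "Water temperature",
     "type": "float", "units": "degrees Celsius", "allowed_values": ""},
]

buffer = io.StringIO()
write_data_dictionary(entries, buffer)
print(buffer.getvalue())
```

To save the dictionary alongside your data, pass a file opened with open(path, "w", newline="") instead of the in-memory buffer.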

Security and Storage

The security of the location where you store your data is important, especially if you are dealing with any sort of sensitive data. In general, storage resources provided by your department are more secure than personal storage, and storage provided by campus service providers is more secure still. The Office of Research and Innovation provides online training on the secure handling of data as it relates to human subjects. Much of what is discussed in this training is relevant to general data practices as well.

Confidential and Sensitive Data

Confidential data refers to data gathered about human subjects. Any research involving human subjects must go through the U of T Research Ethics Board.

Sensitive data refers to any data that, if released to the public, would have an adverse effect (for example, an individual's racial or ethnic origin or political opinions, geospatial data related to endangered species nesting sites, etc.).

If you have confidential or sensitive data, you will face unique data management and dissemination requirements. Confidential or sensitive data should be password-protected and stored with encryption on a secure server with role-based access rights.

For information on working with confidential or sensitive data, and for related policies, refer to the Office of Research and Innovation's Policies, Guidelines and Procedures.

Data Management Plans

A Data Management Plan (DMP) documents your data organization decisions. A DMP is a short document created at the start of your research which addresses how you will work with your data.

Common topics addressed in a DMP include:

  • Types of data that will be created
  • Policies (funding, legal and institutional) that apply to the data
  • Who will own, have access to, and be responsible for managing the data
  • What equipment and methods will be used to capture, process and document the data
  • How the data will be organized and documented
  • Where the data will be stored during and after the research
  • How the data will be shared and under what provisions

Free tools are available to help draft a DMP.

These tools have templates that reflect criteria to meet funding requirements, and can help design a plan that is relevant to data in your research domain. You can also draft your own DMP rather than relying on a template.