
Questions and answers

2564
How can I store and name files so that it is easy to find them again?


When dealing with a large amount of data, it is important to be able to find the specific file you want quickly and easily. One of the best ways to do this is to devise a consistent file and folder naming system and stick to it. This will help you avoid confusion and rapidly identify the data you are looking for.

It is perfectly acceptable to split data up between a group of related folders which are then stored within a parent folder (which could itself be stored within a grandparent folder).

For example, all of the data from a particular project could be stored in a folder called:

  • research-data

This folder could then contain a number of different folders to group the data by year:

  • research-data
    • 2007
    • 2008
    • 2009
    • 2010

Each year folder could then contain subfolders to further group the data by month:

  • research-data
    • 2007
    • 2008
    • 2009
    • 2010
      • 01
      • 02
      • 03
      • ...

Creating a folder structure such as this from the start ensures that you don't need to move files later on, which can be time-consuming and risky when large or complex files are involved.
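If you would rather not create these folders by hand, the year and month layout described above can be set up with a short script. The following is only a minimal sketch in Python: the function name create_year_month_folders is made up for this illustration, and the root folder name is simply the one used in the example above.

```python
from pathlib import Path


def create_year_month_folders(root, years):
    """Create root/YYYY/MM folders for every year given.

    Hypothetical helper for illustration only. Returns the folder paths
    in the order they were created.
    """
    created = []
    for year in years:
        for month in range(1, 13):
            # Zero-pad the month so that 02 sorts before 10.
            folder = Path(root) / f"{year:04d}" / f"{month:02d}"
            # exist_ok=True means re-running the script is harmless.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling create_year_month_folders("research-data", range(2007, 2011)) would create the folders for 2007 to 2010 inside the current working directory.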

Sorting

The above example illustrates how useful it is to name files and/or folders in such a way that they are sorted in a meaningful manner.

If you wish to use the date to sort them (as above) then it is best to include the date in the file or folder name as follows:

YYYY-MM-DD or YYYYMMDD

where YYYY is the full year, MM is the month (from 01 for January to 12 for December) and DD is the day of the month (from 01 to 31). This will ensure they are always sorted correctly.
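To see why this works, note that names written year first and zero-padded sort into date order even when they are sorted as plain text. The short Python sketch below shows this; the helper name date_stamp and the sample dates are assumptions made up for the illustration.

```python
from datetime import date


def date_stamp(d, dashes=True):
    """Format a date as YYYY-MM-DD, or as YYYYMMDD when dashes is False.

    Hypothetical helper for illustration only.
    """
    return d.strftime("%Y-%m-%d" if dashes else "%Y%m%d")


# Sample dates, deliberately out of order.
sample_dates = [date(2010, 1, 5), date(2008, 3, 23), date(2009, 12, 31)]
names = [date_stamp(d) + ".txt" for d in sample_dates]

# An ordinary text sort puts the names into date order.
ordered = sorted(names)
print(ordered)  # ['2008-03-23.txt', '2009-12-31.txt', '2010-01-05.txt']
```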

However, elements of the date can be omitted if the folder structure already indicates them. For example, a file stored in a folder named for the year it refers to need only be named for the month and day:

  • research-data/2008/0323.txt

This illustrates another important naming consideration: avoid repeating information, so that paths to data stay as simple and short as possible.
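As a rough sketch of how such a path could be built from a date in Python, the example below puts the year in the folder name and only the month and day in the file name. The function name dated_path and the default "txt" extension are assumptions for this illustration, not part of any existing tool.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_path(root, d, extension="txt"):
    """Build root/YYYY/MMDD.extension without repeating the year.

    Hypothetical helper for illustration only. PurePosixPath is used so
    the result is written with forward slashes on every platform.
    """
    file_name = f"{d.month:02d}{d.day:02d}.{extension}"
    return PurePosixPath(root) / f"{d.year:04d}" / file_name


print(dated_path("research-data", date(2008, 3, 23)))
# research-data/2008/0323.txt
```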

Version numbers can also be used. When starting a versioning scheme, make sure you give yourself enough room for future expansion. For example, naming the first iteration v001 rather than v1 gives you 998 possible future versions rather than just 8.
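Padding the version number with zeros also keeps versions in the right order when they are sorted as text. The Python sketch below compares the two approaches; the function name version_label and the width of three digits are assumptions chosen to match the v001 example.

```python
def version_label(number, width=3):
    """Return a zero-padded label such as v001.

    Hypothetical helper for illustration only. Raises ValueError if the
    number does not fit in the chosen width.
    """
    if not 1 <= number < 10 ** width:
        raise ValueError(f"version {number} does not fit in {width} digits")
    return f"v{number:0{width}d}"


# Without padding, a text sort places v10 between v1 and v2.
unpadded = sorted(f"v{n}" for n in (1, 2, 10))
print(unpadded)  # ['v1', 'v10', 'v2']

# With padding, the text sort matches the numerical order.
padded = sorted(version_label(n) for n in (1, 2, 10))
print(padded)  # ['v001', 'v002', 'v010']
```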

What not to use

It is best to restrict filenames to alphanumeric characters (plus the dash as a word separator, described below), as other characters may not be supported on all platforms or may be reserved for use by the operating system.

When manually naming files, try to avoid using spaces, underscores or periods (aside from the period used to separate the filename from the file extension). If you need to separate words, use a dash character (-). This improves readability, and can have additional benefits for documents placed on the web, as search engines will treat words separated by dashes as discrete keywords.

Case sensitivity

File names on some platforms are case sensitive, whereas on others they are not. To avoid any problems this may cause, it is best to use lowercase only when naming files and folders.
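If you have existing files that do not follow these rules, the character and case guidance above could be applied with a small script. The following Python sketch is one possible approach rather than a definitive one: the function name tidy_filename is made up for this illustration, and it assumes that anything other than the letters a to z and the digits 0 to 9 (including accented letters) should be turned into a dash.

```python
import re


def tidy_filename(name):
    """Return a lowercase name using only letters, digits and dashes.

    Hypothetical helper for illustration only. The last period is kept
    as the separator before the file extension.
    """
    stem, dot, extension = name.rpartition(".")
    if not dot:
        # No period at all, so the whole name is the stem.
        stem, extension = name, ""
    # Replace each run of disallowed characters with a single dash.
    stem = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    extension = re.sub(r"[^a-z0-9]+", "", extension.lower())
    return f"{stem}.{extension}" if extension else stem


print(tidy_filename("Field Notes_2008.03.23 FINAL.TXT"))
# field-notes-2008-03-23-final.txt
```

Note that two different original names can end up with the same tidied name, so it is worth checking for clashes before renaming anything.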


This is question number 2564.

Created by Chris Limb on 25 April 2013 and last updated by Chris Limb on 17 June 2014