Boost Filesystem Library

This Document
    Introduction
    Two-minute tutorial
    Examples
    Definitions
    Requirements
    Race-condition danger
    Implementation
    Acknowledgements
Other Documents
    Library Design
    FAQ
    Portability Guide
    path.hpp documentation
    operations.hpp documentation
    fstream.hpp documentation
    exception.hpp documentation
    Do-list

Introduction

The Boost Filesystem Library provides portable facilities to query and manipulate paths, files, and directories.

The motivation for the library is the need to be able to perform portable script-like operations from within C++ programs. The intent is not to compete with Python, Perl, or shell languages, but rather to provide portable filesystem operations when C++ is already the language of choice. The design encourages, but does not require, safe and portable filesystem usage.

Filesystem Library components are supplied by several headers, all in directory boost/filesystem, including path.hpp, operations.hpp, fstream.hpp, and exception.hpp.

Two-minute tutorial

First some preliminaries:

#include "boost/filesystem/operations.hpp" // includes boost/filesystem/path.hpp
#include "boost/filesystem/fstream.hpp"    // ditto
#include <iostream>                        // for cout
namespace fs = boost::filesystem;

A class path object can be created:

fs::path my_path( "some_dir/file.txt" );

The string passed to the path constructor is in a portable generic path format. Access functions make my_path contents available in an operating system dependent format, such as "some_dir:file.txt", "[some_dir]file.txt", "some_dir/file.txt", or whatever is appropriate for the operating system.
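The split between a generic format and an operating system dependent format can be illustrated with a minimal sketch. It assumes C++17 std::filesystem, the standardized descendant of this library, rather than the 2002 Boost interface described here; the helper names generic_form and native_form are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Hypothetical helper: the path spelled in the portable generic format,
// which always uses '/' between names.
std::string generic_form(const fs::path& p) { return p.generic_string(); }

// Hypothetical helper: the path spelled the way the host operating system
// prefers (a backslash between names on Windows, '/' on POSIX systems).
std::string native_form(fs::path p) {
  p.make_preferred();  // rewrites separators to the platform's preferred one
  return p.string();
}
```

Calling generic_form( fs::path( "some_dir/file.txt" ) ) gives back the same spelling on every platform, while the result of native_form() depends on the operating system.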

Class path has conversion constructors from const char* and const std::string&, so that even though the Filesystem Library functions in the following code snippet take const path& arguments, the user can just code C-style strings:

fs::remove_all( "foobar" );
fs::create_directory( "foobar" );
fs::ofstream file( "foobar/cheeze" );
file << "tastes good!\n";
file.close();
if ( !fs::exists( "foobar/cheeze" ) )
  std::cout << "Something is rotten in foobar\n";

Additional class path constructors provide for an operating system dependent format, useful with user-provided input:

int main( int argc, char * argv[] ) {
fs::path arg_path( argv[1], fs::system_specific );

To make class path objects easy to use in expressions, operator<< appends paths:

fs::ifstream file1( arg_path << "foo/bar" );
fs::ifstream file2( arg_path << "foo" << "bar" );

Note that expressions arg_path << "foo/bar" and arg_path << "foo" << "bar" yield equivalent results.
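The equivalence of the two append forms can be checked with a short sketch. It is written against C++17 std::filesystem, where appending is spelled operator/ instead of the operator<< used in this document; that substitution is an assumption made so the example compiles with a current standard library, and join_once and join_twice are hypothetical names.

```cpp
#include <filesystem>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Appends two names in a single step, as one generic-format string.
fs::path join_once(const fs::path& base) { return base / "foo/bar"; }

// Appends the same two names one at a time.
fs::path join_twice(const fs::path& base) { return base / "foo" / "bar"; }
```

Both helpers produce paths that compare equal, because path comparison works element by element rather than on the raw string.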

Class directory_iterator is an important component of the library. It provides input iterators over the contents of a directory, with the value type being class path.

The following function, given a directory path and a file name, recursively searches the directory and its sub-directories for the file name. It returns a bool and, if successful, places the path to the file that was found in path_found. The code below is extracted from a real program, slightly modified for clarity:

bool find_file( const fs::path & dir_path,     // in this directory,
                const std::string & file_name, // search for this name,
                fs::path & path_found )        // placing path here if found
{
  if ( !fs::exists( dir_path ) ) return false;
  fs::directory_iterator end_itr; // default construction yields past-the-end
  for ( fs::directory_iterator itr( dir_path );
        itr != end_itr;
        ++itr )
  {
    if ( fs::is_directory( *itr ) )
    {
      if ( find_file( *itr, file_name, path_found ) ) return true;
    }
    else if ( itr->leaf() == file_name ) // see below
    {
      path_found = *itr;
      return true;
    }
  }
  return false;
}

The expression itr->leaf() == file_name, in the line commented // see below, calls the leaf() function on the path object returned by the iterator. leaf() returns a string which is a copy of the last (closest to the leaf, farthest from the root) file or directory name in the path object.

Notice that find_file() does not do explicit error checking, such as verifying that the dir_path argument really represents a directory. All Filesystem Library functions throw filesystem_error exceptions if they do not complete successfully, so there is enough implicit error checking that this application doesn't need to include additional error checking code.

The tutorial is now over; hopefully you are now ready to write simple, script-like programs using the Filesystem Library!

Examples

Until a custom-made example is available, see compiler_status.cpp, an actual program which uses the library.

Test programs are also sometimes useful in understanding a library, as they illustrate what the developer expected to work and not work.

Definitions

directory - A container provided by the operating system, containing the names of files, other directories, or both. Directories are identified by directory path.

directory tree - A directory and file hierarchy viewed as an acyclic graph.

path - A possibly empty sequence of names. Each element in the sequence, except the last, names a directory which contains the next element. The last element may name either a directory or file. The first element is closest to the root of the directory tree, the last element is farthest from the root.

It is traditional to represent a path as a string, where each element in the path is represented by a name, and some operating system defined syntax distinguishes between the name elements. Other representations of a path are possible, such as each name being an element in a std::vector<std::string>.
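The alternative representation mentioned above, each name as an element of a std::vector<std::string>, can be sketched as follows. The sketch assumes C++17 std::filesystem, whose path class can be iterated name by name, and path_elements is a hypothetical helper.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Hypothetical helper: splits a path into its sequence of names, ordered
// from the element closest to the root to the element farthest from it.
std::vector<std::string> path_elements(const fs::path& p) {
  std::vector<std::string> names;
  for (const fs::path& element : p) names.push_back(element.string());
  return names;
}
```

For the relative path "some_dir/sub/file.txt" the helper yields the three names some_dir, sub, and file.txt, in that order.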

file path - A path whose last element is a file.

directory path - A path whose last element is a directory.

name - A file or directory name, without any directory path information to indicate the file or directory's actual location within a directory tree. For some operating systems, files and directories may have more than one valid name, such as a short-form name and a long-form name.

Requirements

Unless otherwise specified, all Filesystem Library functions are required to throw a filesystem_error exception if the implementation cannot successfully complete operations required to meet the function's specifications. Such exceptions are in addition to any exceptions specified in the function's "Throws" paragraph.

Filesystem Library functions are permitted to call C++ Standard Library functions, so std::bad_alloc exceptions may also be thrown, unless otherwise specified.

There is no rollback guarantee; a Filesystem Library function which throws an exception may leave the external file system in an altered state.
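How such a failure surfaces to the caller can be sketched with C++17 std::filesystem, used here as a stand-in for the 2002 interface. The function name try_listing is hypothetical, and the code() accessor belongs to the standard library's filesystem_error; whether the class documented in exception.hpp offers the same accessor is not assumed.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;  // assumption: C++17 standard library, not Boost

// Hypothetical helper: tries to walk dir and reports how the attempt went.
std::error_code try_listing(const fs::path& dir) {
  try {
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
      (void)entry;  // the entries themselves are not needed here
    }
    return {};  // a default-constructed error_code means success
  } catch (const fs::filesystem_error& e) {
    return e.code();  // the reason reported for the failed operation
  }
}
```

Passing a path that does not exist makes the directory_iterator constructor throw, so the helper returns a non-zero error code instead of letting the exception propagate.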

Race-condition danger

The state of files and directories is often globally shared, and thus may be changed unexpectedly by other threads, processes, or even other computers which have access to the filesystem. As an example of the difficulties this can cause, consider that the following asserts may fail:

assert( exists( "foo" ) == exists( "foo" ) );  // (1)

remove_all( "foo" );
assert( !exists( "foo" ) );  // (2)

assert( is_directory( "foo" ) == is_directory( "foo" ) ); // (3)

(1) will fail if a non-existent "foo" comes into existence, or an existent "foo" is removed, between the first and second call to exists(). This could happen if, during the execution of the example code, another thread, process, or computer is also performing operations in the same directory.

(2) will fail if between the call to remove_all() and the call to exists() a new file or directory named "foo" is created by another thread, process, or computer.

(3) will fail if another thread, process, or computer removes an existing file "foo" and then creates a directory named "foo", between the example code's two calls to is_directory().

A program which needs to be robust when operating on potentially-shared file or directory resources should be prepared for filesystem_error exceptions to be thrown from any filesystem function except those explicitly specified as not throwing exceptions.

Implementation

The current implementation (September, 2002) supports operating systems that have either the POSIX or Windows APIs available.

The following tests are provided:

As of September, 2002, these tests succeed for the following compilers on Windows:

As of September, 2002, some limited use has been successful on Linux using GCC and on IBM/AIX using Visual Age C++.

Acknowledgements

The Filesystem Library was designed and implemented by Beman Dawes, except for the directory_iterator and filesystem_error classes which were based on prior work from Dietmar Kühl, as modified by Jan Langer.

Key design requirements and design realities were developed during extensive discussions on the Boost mailing list, followed by comments on the actual implementation.  Participants included (in more-or-less chronological order) Beman Dawes, Jan Langer, Darin Adler, Michiel Salters, Jani Kajala, Jason Stewart, Carl Daniel, David Abrahams, Bill Kempf, Jonathan Caves, George Heintzelman, Ken Hagen, Eric Jensen, Joel de Guzman, Jim Hyslop, John Maddock, Matt Austern, Peter Dimov, Davlet Panech, Dylan Nicholson, Tom Harris, Giovanni Bajo, Baptiste Lepilleur, Thomas Witt, Keith Burton, Mattias Flodin, Daniel Frey, Vladimir Prus, Toon Knapen.

Specific improvements for a preliminary design document came from Dan Nuffer and Jeff Garland.

A lengthy discussion on the C++ committee's library reflector illuminated the "illusion of portability" problem, particularly in postings by P.J. Plauger and Pete Becker.


© Copyright Beman Dawes, 2002

Revised 13 September, 2002