The Hackerlab at regexps.com

A Virtual Unix File-System Interface

VU provides a virtual file-system interface -- a way for programs to redefine the meaning of the basic file-system functions for a particular file descriptor or for a particular part of the file-system namespace. When a process defines a virtual file-system, its definitions apply to that process only.


The VU File-System Functions

For the purposes of this section, the unix file system interface is defined as these functions:

     access chdir chmod chown chroot close closedir fchdir fchmod
     fchown fstat fsync ftruncate link lseek lstat mkdir open
     opendir read readdir readlink rename rmdir stat
     symlink truncate unlink utime write fcntl
     dup dup2

For each of those functions, there is a corresponding vu_ function:

     vu_access vu_chdir vu_chmod ...

The prototypes of the vu_ functions differ slightly from those of the traditional unix functions. For one thing, all vu_ functions accept an output parameter, errn, as the first argument:

     int vu_open (int * errn, char * path, int flags, int mode)

That parameter takes the place of the global variable errno. When an error occurs in a vu_ function, the error number is returned in *errn.
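The errn convention can be illustrated without VU itself. The following is a minimal sketch, not VU's implementation: sketch_open is a hypothetical stand-in that wraps the ordinary open and reports a failure through *errn, so the caller never consults the global errno.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a vu_-style function (not part of VU):
 * the same shape as vu_open, with the error number returned in *errn. */
int
sketch_open (int * errn, const char * path, int flags, int mode)
{
  int fd;

  fd = open (path, flags, mode);
  if (fd < 0)
    *errn = errno;              /* report the error through the parameter */
  return fd;
}

/* Usage: the caller inspects its own variable, not the global errno. */
int
sketch_open_demo (void)
{
  int errn = 0;
  int fd;

  fd = sketch_open (&errn, "/nonexistent-dir/no-such-file", O_RDONLY, 0);
  assert (fd == -1);
  assert (errn == ENOENT);
  return errn;
}
```

Because each call site supplies its own errn, an error number cannot be overwritten by an unrelated library call made between the failure and the check.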

The other interface difference is in the functions for reading directories. The vu_ functions return an integer to indicate success (0) or failure (-1) and return other information through output parameters:

     int vu_opendir (int * errn, DIR ** retv,  char * path)
     int vu_readdir (int * errn,
                     struct alloc_limits * limits,
                     char ** file_ret,
                     DIR * dir)
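A sketch of that calling pattern, built on the ordinary opendir and readdir, may make it concrete. The sketch_ names are hypothetical and not part of VU. Two assumptions are made here that the text above does not state: the limits parameter is dropped in favor of plain malloc, and end-of-directory is reported as -1 with *errn set to 0.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins (not part of VU).  The status comes back as
 * 0 or -1; results come back through output parameters. */
int
sketch_opendir (int * errn, DIR ** retv, const char * path)
{
  DIR * dir;

  dir = opendir (path);
  if (!dir)
    {
      *errn = errno;
      return -1;
    }
  *retv = dir;
  return 0;
}

/* Assumption of this sketch: end-of-directory is reported as -1 with
 * *errn set to 0.  The name is copied with malloc; the caller frees it. */
int
sketch_readdir (int * errn, char ** file_ret, DIR * dir)
{
  struct dirent * ent;
  char * copy;

  errno = 0;
  ent = readdir (dir);
  if (!ent)
    {
      *errn = errno;            /* 0 at end of directory */
      return -1;
    }
  copy = malloc (strlen (ent->d_name) + 1);
  if (!copy)
    {
      *errn = ENOMEM;
      return -1;
    }
  strcpy (copy, ent->d_name);
  *file_ret = copy;
  return 0;
}

/* Usage: count the entries of a directory; returns -1 on error. */
int
sketch_count_entries (const char * path)
{
  int errn = 0;
  DIR * dir;
  char * name;
  int n = 0;

  if (sketch_opendir (&errn, &dir, path) < 0)
    return -1;
  while (sketch_readdir (&errn, &name, dir) == 0)
    {
      ++n;
      free (name);
    }
  closedir (dir);
  return errn ? -1 : n;
}
```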

Aside from those differences, vu_ functions can be substituted freely for their unix counterparts. If no other changes are made, the modified program will run just the same as the original.

There are two additional vu_ functions:

     vu_read_retry vu_write_retry

which are similar to read and write but which automatically restart after an EINTR or EAGAIN.


Virtual File-System Functions

Programs can provide their own definitions of the vu_ file-system functions. These definitions can be made to apply to specific file descriptors or to a subset of the names in the file-system namespace of the process.

To define a set of virtual functions, a program defines functions having the same return types and almost the same parameters as the vu_ virtual functions. The virtual functions take one extra (final) argument, void * closure. For example:

     int my_virtual_open (int * errn, 
                          char * path, 
                          int flags,
                          int mode,
                          void * closure)

All of these functions must be named consistently by adding a prefix to the root name of the file system functions:

     my_virtual_access my_virtual_chdir my_virtual_chmod ...

Two additional functions must be provided:

     void * my_virtual_make_closure (void * closure)
     void my_virtual_free_closure (void * closure)

These are explained below.

Having defined a set of virtual file system functions, a program must declare a vtable for the functions:

     #include "hackerlab/vu/vu.h"
     ...

     struct vu_fs_discipline my_fs_vtable
      = { VU_FS_DISCIPLINE_INITIALIZERS (my_virtual_) };

Note that the prefix used to name virtual functions is used by the macro VU_FS_DISCIPLINE_INITIALIZERS to initialize the vtable.
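The definition of VU_FS_DISCIPLINE_INITIALIZERS is not shown here. As an illustration only, the following sketch shows one way a prefix can name every entry of a vtable, using preprocessor token pasting. The struct, the macro, and the demo_ functions are all hypothetical and much smaller than the real struct vu_fs_discipline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a vtable (not VU's struct vu_fs_discipline). */
struct sketch_discipline
{
  void * (*make_closure) (void * closure);
  void (*free_closure) (void * closure);
  int (*close) (int * errn, int fd, void * closure);
};

/* Paste the prefix onto each root name, in member order. */
#define SKETCH_INITIALIZERS(PREFIX) \
  PREFIX ## make_closure, \
  PREFIX ## free_closure, \
  PREFIX ## close

void *
demo_make_closure (void * closure)
{
  return closure;
}

void
demo_free_closure (void * closure)
{
  (void) closure;
}

int
demo_close (int * errn, int fd, void * closure)
{
  (void) errn;
  (void) fd;
  (void) closure;
  return 0;
}

/* Expands to { demo_make_closure, demo_free_closure, demo_close }. */
struct sketch_discipline demo_vtable
  = { SKETCH_INITIALIZERS (demo_) };
```

With a scheme like this, a misspelled or missing function shows up as a compile-time error at the vtable declaration, which is one reason to insist on consistent naming.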

Finally, the functions may be applied to either a portion of the file-system namespace or to a particular descriptor:

     // Applying `my_virtual_' functions to a particular 
     // descriptor:
     //
     vu_set_fd_handler (fd, &my_fs_vtable, closure);

     // Applying `my_virtual_' functions to a part
     // of the file-system namespace.  Note that
     // preg is a `regex_t' -- a regular expression
     // compiled by `regcomp'.  `eflags' is like the
     // parameter of the same name to `regexec':
     //
     vu_push_name_handler (name, doc, preg, eflags, 
                           &my_fs_vtable, closure, 0);

After those calls, a call to a vu_ file-system function using descriptor fd will do its work by calling a my_virtual_ function (e.g. vu_read will call my_virtual_read to read from fd).

A call to a vu_ function with a path that matches the regexp preg will also do its work by calling a my_virtual_ function (e.g. vu_open will call my_virtual_open to open a matching file-name). If a new file is successfully opened this way, VU automatically calls vu_set_fd_handler to establish the same vtable as the handler for the new descriptor.

Name-space handlers are kept on a stack and searched from the top of stack down. Each handler has a name and an array of strings which are its documentation.
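The top-down search can be sketched with the POSIX regcomp and regexec functions. This is an illustration under stated assumptions, not VU's internal representation: the fixed-size array, the sketch_ names, and the choice to return the handler's name are all hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_HANDLERS 16

/* Hypothetical handler record (not VU's internal representation). */
struct sketch_name_handler
{
  const char * name;
  regex_t preg;
  int eflags;
};

static struct sketch_name_handler sketch_stack[SKETCH_MAX_HANDLERS];
static int sketch_depth = 0;

/* Push a handler for paths matching `pattern'; 0 on success, -1 on error. */
int
sketch_push (const char * name, const char * pattern, int eflags)
{
  struct sketch_name_handler * h;

  if (sketch_depth == SKETCH_MAX_HANDLERS)
    return -1;
  h = &sketch_stack[sketch_depth];
  if (regcomp (&h->preg, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  h->name = name;
  h->eflags = eflags;
  ++sketch_depth;
  return 0;
}

/* Search from the top of the stack down; the newest match wins.
 * Returns the handler's name, or NULL if no handler matches. */
const char *
sketch_dispatch (const char * path)
{
  int i;

  for (i = sketch_depth - 1; i >= 0; --i)
    {
      struct sketch_name_handler * h = &sketch_stack[i];

      if (regexec (&h->preg, path, 0, NULL, h->eflags) == 0)
        return h->name;
    }
  return NULL;
}

/* Usage: a later, more specific handler shadows an earlier one. */
int
sketch_dispatch_demo (void)
{
  if (sketch_push ("all-virtual", "^/virtual/", 0) < 0
      || sketch_push ("logs-only", "^/virtual/logs/", 0) < 0)
    return -1;
  assert (strcmp (sketch_dispatch ("/virtual/logs/today"), "logs-only") == 0);
  assert (strcmp (sketch_dispatch ("/virtual/data"), "all-virtual") == 0);
  assert (sketch_dispatch ("/etc/passwd") == NULL);
  return 0;
}
```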


VU Closures

When a set of virtual file system functions is installed by vu_set_fd_handler or vu_push_name_handler, the caller provides a closure. This closure is an opaque value to VU itself. A copy of the closure is passed as the final parameter to the caller's virtual functions. For example, to open a file that has matched a regular expression passed to vu_push_name_handler, vu_open uses the sequence:

    fd = handler->vtable->open (errn, path, flags, mode, 
                                handler->closure);

VU doesn't save a copy of the closure directly. Instead, it calls the make_closure function from the vtable to create the value to save and the free_closure function when that copy is being discarded. make_closure is called once when vu_push_name_handler is called, and once each time vu_set_fd_handler is called. free_closure is called each time vu_close or vu_closedir is called.
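A pair of closure functions might look like the following sketch. It assumes, purely for illustration, that the closure is a NUL-terminated string and that each saved copy should be private. The sketch_ names and the live-copy counter are hypothetical. A program could equally well return the same pointer from make_closure and count references.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-function slice of a vtable (not VU's struct). */
struct sketch_closure_ops
{
  void * (*make_closure) (void * closure);
  void (*free_closure) (void * closure);
};

static int sketch_live_closures = 0;    /* copies made but not yet freed */

/* make_closure: return the value to save; here, a private copy of a string. */
void *
sketch_make_closure (void * closure)
{
  const char * src = closure;
  char * copy = malloc (strlen (src) + 1);

  if (copy)
    {
      strcpy (copy, src);
      ++sketch_live_closures;
    }
  return copy;
}

/* free_closure: discard a value that make_closure returned. */
void
sketch_free_closure (void * closure)
{
  if (closure)
    {
      free (closure);
      --sketch_live_closures;
    }
}

/* Usage: what installing a handler and later closing the descriptor
 * would do with the caller's closure. */
int
sketch_closure_demo (void)
{
  struct sketch_closure_ops ops
    = { sketch_make_closure, sketch_free_closure };
  char original[] = "state";
  void * saved;

  saved = ops.make_closure (original);  /* when the handler is installed */
  assert (saved != NULL && saved != (void *) original);
  assert (sketch_live_closures == 1);

  ops.free_closure (saved);             /* when the descriptor is closed */
  return sketch_live_closures;
}
```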

Type vu_handler

struct vu_handler;

For each VU namespace handler and file-descriptor handler, there is a struct vu_handler that indicates the vtable and closure to use for that portion of the file-system.

   struct vu_handler
   {
     struct vu_fs_discipline * vtable;
     void * closure;
   };




Pseudo-Descriptors

The vu_ functions can operate on file descriptors which are created by the program itself without the knowledge of the operating system kernel. The advantage of such descriptors is that they take up no kernel resources so it is practical to create a large number of them.

Pseudo-descriptors which are guaranteed to be distinct from all kernel descriptors can be created using the function reserv_pseudo and destroyed using the function unreserv_pseudo. For example:

     fd = reserv_pseudo ();
     vu_set_fd_handler (fd, &my_fs_vtable, 0);

and:

     int
     my_virtual_close (int * errn, int fd, void * closure)
     {
       unreserv_pseudo (fd);
       return 0;
     }

One way to use a pseudo-descriptor is to combine it with buffered-I/O to create a file that corresponds to a string in the program's address space. (See Buffered File Descriptors.)
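To suggest what the handler behind such a string-backed descriptor could look like, here is a sketch of a virtual read function with the vu_read parameter list plus the final closure argument. It is not taken from VU. The closure type and the sketch_ names are hypothetical, and the sketch is not wired to reserv_pseudo or vu_set_fd_handler, so the descriptor number it receives is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical closure type: a string and a read position. */
struct sketch_string_file
{
  const char * data;
  size_t length;
  size_t pos;
};

/* A virtual read with the vu_read parameters plus the final closure.
 * It never touches the kernel; `fd' only identifies the pseudo-descriptor. */
ssize_t
sketch_string_read (int * errn, int fd, char * buf, size_t count,
                    void * closure)
{
  struct sketch_string_file * f = closure;
  size_t avail = f->length - f->pos;

  (void) errn;                  /* this sketch has no failure case */
  (void) fd;
  if (count > avail)
    count = avail;
  memcpy (buf, f->data + f->pos, count);
  f->pos += count;
  return (ssize_t) count;       /* 0 signals end-of-file */
}

/* Usage: read a 12-character string in pieces until end-of-file. */
int
sketch_string_read_demo (void)
{
  struct sketch_string_file f = { "hello, world", 12, 0 };
  char buf[8];
  int errn = 0;

  assert (sketch_string_read (&errn, 1000, buf, 5, &f) == 5);
  assert (memcmp (buf, "hello", 5) == 0);
  assert (sketch_string_read (&errn, 1000, buf, 8, &f) == 7);
  assert (sketch_string_read (&errn, 1000, buf, 8, &f) == 0);
  return 0;
}
```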


Establishing VU Handlers

Function vu_push_name_handler

void vu_push_name_handler (t_uchar * name,
                           t_uchar ** doc,
                           regex_t * preg,
                           int eflags,
                           struct vu_fs_discipline * vtable,
                           void * closure,
                           int is_optional);

vu_push_name_handler establishes a vtable of virtual file system functions for a portion of the file-system namespace.

name is the name for the handler recognized by vu_enable_optional_name_handler. Conventionally, this name may be used as an option argument to the command line options -N or --namespace. For optimal help message formatting, name should be no longer than 30 characters. A pointer to name is kept by this function.

doc is the documentation for the handler, an array of strings printed in the output of vu_help_for_optional_handlers. For optimal help message formatting, each line of doc should be no longer than 40 characters. The last element of the array doc must be 0. A pointer to doc is kept by this function.

File-names matching preg are handled by the functions in vtable. Matching is performed by regexec:

     regexec (preg, path, 0, 0, eflags)

If a matching file is successfully opened, vu_open and vu_opendir call:

     vu_set_fd_handler (new_fd, vtable, closure)

If is_optional is not 0, the file-name handler is recorded but not enabled. It can be made active by a call to vu_enable_optional_name_handler.



Function vu_enable_optional_name_handler

int vu_enable_optional_name_handler (t_uchar * name);

Push the named namespace handler on the VU namespace stack.

The named handler must have previously been established by a call to vu_push_name_handler with a non-zero is_optional argument.

Return 0 on success, -1 if the named handler was not found.
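The record-then-enable behavior can be pictured with a small registry. This is only a sketch with hypothetical names. It marks an entry as enabled instead of pushing it onto a namespace stack, and it is not how VU stores its handlers.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_OPTIONAL 8

/* Hypothetical registry entry (not VU's internal representation). */
struct sketch_optional_handler
{
  const char * name;
  int enabled;
};

static struct sketch_optional_handler sketch_optional[SKETCH_MAX_OPTIONAL];
static int sketch_n_optional = 0;

/* Record a handler without enabling it; 0 on success, -1 if full. */
int
sketch_record_optional (const char * name)
{
  if (sketch_n_optional == SKETCH_MAX_OPTIONAL)
    return -1;
  sketch_optional[sketch_n_optional].name = name;
  sketch_optional[sketch_n_optional].enabled = 0;
  ++sketch_n_optional;
  return 0;
}

/* Enable a recorded handler by name; 0 on success, -1 if not found. */
int
sketch_enable_optional (const char * name)
{
  int i;

  for (i = 0; i < sketch_n_optional; ++i)
    if (strcmp (sketch_optional[i].name, name) == 0)
      {
        sketch_optional[i].enabled = 1;
        return 0;
      }
  return -1;
}
```

A program could call the enabling function once for each -N or --namespace argument it receives and report an error when the result is -1.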



Function vu_set_fd_handler

void vu_set_fd_handler (int fd,
                        struct vu_fs_discipline * vtable,
                        void * closure);

Establish a vtable of virtual file system functions for a particular descriptor or pseudo-descriptor.

The handler is automatically removed by vu_close.



Function vu_move_state

int vu_move_state (int * errn, int fd, int newfd);

Move the VU handler for fd to newfd.




Looking Up VU Handlers

Function vu_path_dispatch

struct vu_handler * vu_path_dispatch (char * path);

Return the vtable and closure that handle the file-name path. (See vu_handler.)



Function vu_fd_dispatch

struct vu_handler * vu_fd_dispatch (int fd);

Return the vtable and closure that handle the descriptor fd. (See vu_handler.)



Function vu_dir_dispatch

struct vu_handler * vu_dir_dispatch (DIR * dir);

Return the vtable and closure that handle the directory dir. (See vu_handler.)




Stacking Descriptor Handlers

The function vu_fd_dispatch returns a pointer to a struct vu_handler which in turn holds a pointer to the vtable and closure used to handle vu_ functions for a particular descriptor (see vu_handler).

Using vu_fd_dispatch, descriptor handlers can be stacked. For example, the buffered-I/O functions (vfdbuf_) work by imposing a file-system vtable that maintains a buffer but that performs actual I/O by calling functions from an underlying vtable. To establish the buffering vtable, the function vfdbuf_buffer_fd uses a sequence of operations like:

     int
     vfdbuf_buffer_fd (int * errn, int fd,
                       long bufsize, int flags, int zero_buffer)
     {
       struct vu_handler * sub_handler;

       ...
       sub_handler = vu_fd_dispatch (fd);

       /* Remember that sub_handler does I/O for fd: */
       bufs[fd].sub_handler = sub_handler;
       ...

       /* Establish the buffering functions as the new vtable
        * for `fd':
        */
       vu_set_fd_handler (fd, &vfdbuf_vtable, 0);
       return 0;
     }

The vfdbuf_ file-system functions follow this example:

     int
     vfdbuf_fsync (int * errn, int fd, void * closure)
     {
       /* Empty the buffer. */
       if (vfdbuf_flush (errn, fd) < 0)
         return -1;

       /* Perform the actual `fsync' using the handler set aside
        * in vfdbuf_buffer_fd (sub_handler is a pointer):
        */
       return bufs[fd].sub_handler->vtable->fsync
                     (errn, fd, bufs[fd].sub_handler->closure);
     }

Note that when closing a file, it is the responsibility of the vfdbuf_ functions to free the closure for the underlying vtable:

     int
     vfdbuf_close (int * errn, int fd, void * closure)
     {
       int got;
       int ign;

       ...
       got = bufs[fd].sub_handler->vtable->close
                     (errn, fd, bufs[fd].sub_handler->closure);

       bufs[fd].sub_handler->vtable->free_closure
             (bufs[fd].sub_handler->closure);
       ...
     }
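The same stacking pattern can be shown end to end in a self-contained sketch. Nothing here is VU code: the one-entry vtable, the per-descriptor tables, and the sketch_ names are hypothetical. Unlike the excerpts above, the lower handler is copied by value rather than kept as a pointer. The upper layer counts bytes and then delegates to the handler it set aside.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical one-entry vtable and handler (not VU's structs). */
struct sketch_vtable
{
  ssize_t (*write) (int * errn, int fd, const char * buf, size_t count,
                    void * closure);
};

struct sketch_handler
{
  struct sketch_vtable * vtable;
  void * closure;
};

#define SKETCH_MAX_FD 16
static struct sketch_handler sketch_handlers[SKETCH_MAX_FD]; /* current */
static struct sketch_handler sketch_sub[SKETCH_MAX_FD];      /* set aside */
static size_t sketch_counted[SKETCH_MAX_FD];

/* Bottom layer: append to a string buffer passed as the closure. */
ssize_t
sketch_sink_write (int * errn, int fd, const char * buf, size_t count,
                   void * closure)
{
  (void) errn;
  (void) fd;
  strncat ((char *) closure, buf, count);
  return (ssize_t) count;
}

/* Upper layer: count the bytes, then delegate to the set-aside handler. */
ssize_t
sketch_counting_write (int * errn, int fd, const char * buf, size_t count,
                       void * closure)
{
  (void) closure;
  sketch_counted[fd] += count;
  return sketch_sub[fd].vtable->write (errn, fd, buf, count,
                                       sketch_sub[fd].closure);
}

struct sketch_vtable sketch_sink_vtable = { sketch_sink_write };
struct sketch_vtable sketch_counting_vtable = { sketch_counting_write };

/* Stack the counting layer on top of whatever currently handles fd. */
void
sketch_count_fd (int fd)
{
  sketch_sub[fd] = sketch_handlers[fd];
  sketch_handlers[fd].vtable = &sketch_counting_vtable;
  sketch_handlers[fd].closure = NULL;
  sketch_counted[fd] = 0;
}

/* Usage: `out' must have room for at least 6 characters. */
size_t
sketch_stacking_demo (char * out)
{
  int errn = 0;
  int fd = 3;
  struct sketch_handler * h = &sketch_handlers[fd];

  out[0] = 0;
  h->vtable = &sketch_sink_vtable;
  h->closure = out;
  sketch_count_fd (fd);
  h->vtable->write (&errn, fd, "abc", 3, h->closure);
  h->vtable->write (&errn, fd, "de", 2, h->closure);
  return sketch_counted[fd];
}
```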


The VU File-system Interface

These functions approximately mirror the traditional unix system call interface, but have these improvements:

     1. Error numbers are not stored in a global, but
        are returned through an output parameter.

     2. Functions dispatch on file names and descriptors,
        permitting these functions to work on objects other than
        ordinary files and sockets and to work on ordinary files
        and sockets in unusual ways.

Function vu_access

int vu_access (int * errn, char * path, int mode);

See the manual page for access.



Function vu_chdir

int vu_chdir (int * errn, char * path);

See the manual page for chdir.



Function vu_chmod

int vu_chmod (int * errn, char * path, int mode);

See the manual page for chmod.



Function vu_chown

int vu_chown (int * errn, char * path, int owner, int group);

See the manual page for chown.



Function vu_chroot

int vu_chroot (int * errn, char * path);

See the manual page for chroot.



Function vu_close

int vu_close (int * errn, int fd);

See the manual page for close.



Function vu_closedir

int vu_closedir (int * errn, DIR * dir);

See the manual page for closedir.



Function vu_fchdir

int vu_fchdir (int * errn, int fd);

See the manual page for fchdir.



Function vu_fchmod

int vu_fchmod (int * errn, int fd, int mode);

See the manual page for fchmod.



Function vu_fchown

int vu_fchown (int * errn, int fd, int owner, int group);

See the manual page for fchown.



Function vu_fstat

int vu_fstat (int * errn, int fd, struct stat * buf);

See the manual page for fstat.



Function vu_fsync

int vu_fsync (int * errn, int fd);

See the manual page for fsync.



Function vu_ftruncate

int vu_ftruncate (int * errn, int fd, off_t where);

See the manual page for ftruncate.



Function vu_link

int vu_link (int * errn, char * from, char * to);

See the manual page for link.



Function vu_lseek

off_t vu_lseek (int * errn, int fd, off_t offset, int whence);

See the manual page for lseek.



Function vu_lstat

int vu_lstat (int * errn, char * path, struct stat * buf);

See the manual page for lstat.



Function vu_mkdir

int vu_mkdir (int * errn, char * path, int mode);

See the manual page for mkdir.



Function vu_open

int vu_open (int * errn, char * path, int flags, int mode);

See the manual page for open.



Function vu_dir_fd

int vu_dir_fd (DIR * dir);

Return the pseudo-descriptor associated with dir.



Function vu_opendir

int vu_opendir (int * errn, DIR ** retv,  char * path);

See the manual page for opendir.



Function vu_read

ssize_t vu_read (int * errn, int fd, char * buf, size_t count);

See the manual page for read.



Function vu_read_retry

ssize_t vu_read_retry (int * errn, int fd, char * buf, size_t count);

Use vu_read to read from fd. Read repeatedly (even if a read returns early from EINTR or EAGAIN) until count characters are read, or the end-of-file is reached.

Return the number of characters read or -1 on error.
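The retry behavior can be sketched as a loop. This is a hypothetical stand-in built on the ordinary read rather than on vu_read, and it is not VU's code. Note that retrying immediately after EAGAIN on a non-blocking descriptor amounts to busy-waiting.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in (not VU's code): built on read, not vu_read. */
ssize_t
sketch_read_retry (int * errn, int fd, char * buf, size_t count)
{
  size_t total = 0;

  while (total < count)
    {
      ssize_t got = read (fd, buf + total, count - total);

      if (got < 0)
        {
          if (errno == EINTR || errno == EAGAIN)
            continue;           /* restart the read */
          *errn = errno;
          return -1;
        }
      if (got == 0)
        break;                  /* end-of-file */
      total += (size_t) got;
    }
  return (ssize_t) total;
}

/* Usage: data arriving through a pipe, followed by end-of-file. */
int
sketch_read_retry_demo (void)
{
  int fds[2];
  int errn = 0;
  char buf[16];
  ssize_t got;

  if (pipe (fds) < 0)
    return -1;
  if (write (fds[1], "hello", 5) != 5)
    return -1;
  close (fds[1]);               /* the reader sees EOF after 5 bytes */
  got = sketch_read_retry (&errn, fds[0], buf, sizeof buf);
  close (fds[0]);
  assert (got == 5 && memcmp (buf, "hello", 5) == 0);
  return (int) got;
}
```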



Function vu_readdir

int vu_readdir (int * errn,
                struct alloc_limits * limits,
                char ** file_ret,
                DIR * dir);

See the manual page for readdir.

Note that in VU, the file name is returned in *file_ret and is dynamically allocated using limits. It is up to the caller to free the file name.



Function vu_readlink

int vu_readlink (int * errn, char * path, char * buf, int bufsize);

See the manual page for readlink.



Function vu_rename

int vu_rename (int * errn, char * from, char * to);

See the manual page for rename.



Function vu_rmdir

int vu_rmdir (int * errn, char * path);

See the manual page for rmdir.



Function vu_stat

int vu_stat (int * errn, char * path, struct stat * buf);

See the manual page for stat.



Function vu_symlink

int vu_symlink (int * errn, char * from, char * to);

See the manual page for symlink.



Function vu_truncate

int vu_truncate (int * errn, char * path, off_t where);

See the manual page for truncate.



Function vu_unlink

int vu_unlink (int * errn, char * path);

See the manual page for unlink.



Function vu_utime

int vu_utime (int * errn, char * path, struct utimbuf *times);

See the manual page for utime.



Function vu_write

ssize_t vu_write (int * errn, int fd, char * buf, size_t count);

See the manual page for write.



Function vu_write_retry

ssize_t vu_write_retry (int * errn, int fd, char * buf, size_t count);

Use vu_write to write to fd. Write repeatedly (even if a write returns early from EINTR or EAGAIN) until count characters are written.

Return the number of characters written or -1 on error.
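A corresponding sketch for writing follows. As before, it is a hypothetical stand-in that calls the ordinary write rather than vu_write. It keeps writing after short writes and restarts after EINTR or EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in (not VU's code): built on write, not vu_write. */
ssize_t
sketch_write_retry (int * errn, int fd, const char * buf, size_t count)
{
  size_t total = 0;

  while (total < count)
    {
      ssize_t put = write (fd, buf + total, count - total);

      if (put < 0)
        {
          if (errno == EINTR || errno == EAGAIN)
            continue;           /* restart the write */
          *errn = errno;
          return -1;
        }
      total += (size_t) put;    /* a short write: go around again */
    }
  return (ssize_t) total;
}

/* Usage: write through a pipe and read the bytes back. */
int
sketch_write_retry_demo (void)
{
  int fds[2];
  int errn = 0;
  char buf[8];
  ssize_t put;

  if (pipe (fds) < 0)
    return -1;
  put = sketch_write_retry (&errn, fds[1], "hello", 5);
  close (fds[1]);
  assert (put == 5);
  assert (read (fds[0], buf, sizeof buf) == 5);
  assert (memcmp (buf, "hello", 5) == 0);
  close (fds[0]);
  return (int) put;
}
```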



Function vu_fcntl

int vu_fcntl (int * errn, int fd, int cmd, long arg);

See the manual page for fcntl.



Function vu_dup

int vu_dup (int * errn, int fd);

See the manual page for dup.



Function vu_dup2

int vu_dup2 (int * errn, int fd, int newfd);

See the manual page for dup2.



libhackerlab: The Hackerlab C Library