Object Based Storage HOWTO

Andreas Dilger & Peter J. Braam `{adilger,braam}@stelias.com`

v1.2, Dec 23, 1999

Object based storage centers around the idea of storage devices that manage file objects (read: inodes) instead of blocks. This document explains the configuration and operation of a suite of object based storage software we are experimenting with. The software contains drivers for "simulated" object based disks, logical drivers (for snapshots) and an object based file system on Linux. We expect this to be part of a growing suite of software to manage storage through object interfaces.

1. Disclaimer and License

This software forms an experimental file system. It contains kernel code and daemons running with root permissions and is known to have bugs. Please back up all data when using or experimenting with Object Based Storage software.

This software may be redistributed and/or modified under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The file COPYING contains version 2 of the GPL.

Copyright on the Object Based Storage software is held by a large number of developers because the code derives from other parts of the Linux kernel. Specific copyright holders are listed in the source files.

2. Introduction

2.1 Project WWW site

This project is further documented at http://www.lustre.org.

2.2 What is an Object Based Device?

An Object Based Disk (OBD) or Object Based Storage Device (OBSD) is one that works at the level of files ("storage objects"), rather than at the level of individual blocks as conventional storage devices do. The OBD keeps track of allocated objects, which blocks belong to each object, free space, etc. internally, rather than exposing these details to the operating system.

An OBD could be a real hardware device, should disk vendors decide that such devices are worth producing. We have written a simulated object based device, based on the "lower half" of the Ext2 file system. (Other file systems could easily be used as well.)

A utility called obdcontrol allows for direct manipulation of objects. More interesting is the object based file system (OBDFS), which uses an object based device as its storage device.

OBDFS communicates to the underlying ext2obd device by means of object id's and logical blocks inside objects - NOT physical blocks on the storage device. It lets the ext2obd device handle block and inode allocation. Roughly speaking the combination of OBDFS and ext2obd equals Ext2, and indeed OBDFS is another file system that can access Ext2 formatted drives.

However, instead of gluing them straight on top of each other, one can insert logical object drivers in the middle. These receive object commands from "above", e.g. from OBDFS and speak to other object driver(s). Examples of such configurations are RAID and snapshots.

Because of the object abstraction used in OBDs, it is possible to layer OBD drivers on top of each other. A logical object driver is a client of a lower level driver (or of the direct device driver), but is itself the target of a higher layer driver or of an application such as OBDFS, which issues object methods to the driver it uses.

This makes it possible to stack OBD drivers and implement different functionality in each layer. The snapshot and network layers are simply OBD drivers stacked on top of a base OBD driver. Also under discussion and/or development are OBD drivers for RAID0, RAID1, Object Volume Management, and others.

As an example we have implemented a snapshot driver that can be used in conjunction with ext2obd and OBDFS. The current OBDFS implementation has the ability to create multiple timed snapshots of a filesystem, allowing historical views of a filesystem, or consistent filesystem backups for a mounted filesystem. The way that snapshots are currently implemented, however, means that the underlying ext2 filesystem is not a valid filesystem for the normal ext2 driver while any snapshots exist (it becomes a valid ext2 filesystem again once all of the snapshots have been removed).

It will also be possible to use OBDFS in a network mode, like NFS, to access files on a remote system; code for a SUN RPC driver for the storage object protocol is forthcoming. (Of course, faster interconnects such as FC or InfiniBand are attractive too.) This will form the basis for the Lustre file system - a Linux Cluster file system based on object based storage.

In addition to the highly modular architecture for storage management, another possible benefit of an OBD over a conventional block-based storage device can be likened to using an accelerated graphics adapter to handle drawing circles, filled rectangles, etc., instead of having the CPU draw each pixel individually. OBDFS can pass a few high-level commands to an OBD when creating, copying, or deleting a file, instead of being concerned with keeping track of hundreds or thousands of individual blocks for each file on a device. In clusters, such devices avoid sharing the allocation metadata among all cluster nodes, which is a cause of complexity. Precisely how beneficial all this is remains to be evaluated.

2.3 Update History

1999/12/02 Initial Draft
1999/12/20 First release

2.4 Credits

This project is led by Peter J. Braam <braam@stelias.com> at Stelias Computing (most of the code and bugs to date are his responsibility).
Phil Schwan (no longer with Stelias) did the initial work on ext2obd. Andreas Dilger <adilger@stelias.com> wrote a substantial part of the snapshot driver, added iterators, fixed numerous bugs, and wrote most of this document.

3. What is this document all about?

In this HOWTO, I will go over how to use the Object Based Device filesystem under Linux. Since OBDFS is still under development, it should not be used on production systems, or to store important data. It is expected that anyone using OBDFS knows how to patch and compile a kernel.

We will cover the basic installation and configuration of OBD devices and filesystems under Linux, as well as go through some examples of how OBDFS might be used. Since OBDFS is still under development, there are likely bugs to be found if you venture off the beaten path. Please report these bugs to the obd-devel mailing list (see Contacting the Authors for more information).

4. How to use OBD software

4.1 Configuring OBD under Linux

In order to use OBDFS under Linux you need kernel 2.3.31 and must compile several modules. You will also need to have loop devices enabled in the kernel in order to do safe filesystem testing. The user-space tool (obdcontrol) is written in Perl, which is normally installed, but it requires the Term-ReadLine-Gnu Perl module, version 1.04 (see below).

This code only works with linux-2.3.31; in particular, it won't work with 2.2 versions of Linux. It compiles as a set of modules and you should NOT need to patch your kernel. The modules install a character device on major 186 (allocated to us for this purpose) and a file system named OBDFS. So create some character devices:


# mknod /dev/obd0 c 186 0
# mknod /dev/obd1 c 186 1
# mknod /dev/obd2 c 186 2
# mknod /dev/obd3 c 186 3
# mknod /dev/obd4 c 186 4
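
The five mknod commands above follow one pattern, so they can also be generated in a loop. The following is a minimal sketch that only prints the commands instead of running them; the helper name obd_mknod_commands is ours and is not part of the OBD distribution.

```shell
# Print the mknod commands for OBD character devices with minors 0..last.
# obd_mknod_commands is an illustrative helper, not part of the OBD tools.
obd_mknod_commands() {
    major=$1    # character major number (186 in this HOWTO)
    last=$2     # highest minor number to create
    minor=0
    while [ "$minor" -le "$last" ]; do
        echo "mknod /dev/obd${minor} c ${major} ${minor}"
        minor=$((minor + 1))
    done
}

# Dry run: show what would be created for /dev/obd0 .. /dev/obd4.
obd_mknod_commands 186 4
```

To actually create the nodes, the printed lines can be piped to sh by root.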

In order to use OBDFS, you also need to compile the kernel modules obdext2, obdclass, obdfs, and obdsnap (if you are using the snapshot facility). The OBD drivers are closely tied to the kernel version, as there was a major change to the VFS layer around kernel version 2.3.25.
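
Since the modules are tied to one kernel release, it may be worth checking the running kernel before building. A small sketch follows; the helper name is_supported_kernel is hypothetical, and accepting a local suffix such as 2.3.31-something is our assumption.

```shell
# Succeed only for the kernel release this HOWTO targets (2.3.31).
# is_supported_kernel is an illustrative helper, not part of the OBD tools.
is_supported_kernel() {
    case "$1" in
        2.3.31|2.3.31-*) return 0 ;;   # exact release, or with a local suffix (assumed)
        *)               return 1 ;;
    esac
}

if is_supported_kernel "$(uname -r)"; then
    echo "kernel $(uname -r): expected to work with the OBD modules"
else
    echo "kernel $(uname -r): not 2.3.31, the OBD modules will probably not build or load"
fi
```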

For an initial configuration and compile, run:


# cd /path/of/obdcode
# make config all

The configuration will ask some basic questions about your system configuration. For symbols, I suggest you tell the script to find them in your Linux kernel tree. It should then proceed to compile the various OBD modules.

One other piece of software is needed: a Perl readline package that makes the commandline tool obdcontrol ever so much nicer to use. Get the package from:


 
ftp://ftp.lustre.org/pub/lustre/Term-ReadLine-Gnu-1.04.tar.gz

Installation is easy:


# tar xzf Term-ReadLine-Gnu-1.04.tar.gz
# cd Term-ReadLine-Gnu-1.04    # directory name assumed to match the tarball name
# perl Makefile.PL
# make install

Ready to play!

4.2 Using the OBDFS file system

The quick way to get OBDFS mounted is:


# mkdir /mnt/obd
# cd top-of-the-source/demos
# ./obdfssetup.sh

If you type mount you will see your new file system. Copy a couple of files in there to test it out. To clean up again, use:


# ./obdfsclean.sh

Interestingly the underlying file system is still a good old Ext2 file system. Let's run e2fsck on it, to see that we didn't corrupt it:


# losetup /dev/loop0 /tmp/obdfs.tmpfile
# e2fsck /dev/loop0

You could go on and mount /dev/loop0 as an ext2 file system to verify. Instead, let's go on and play with snapshots.
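
When the check is scripted, the result of e2fsck can be read from its exit status, which the e2fsck manual page documents as a bit mask (0 for no errors, 1 and 2 for corrected errors, 4 for uncorrected errors, 8 and above for failures of e2fsck itself). The sketch below classifies such a status; the helper name fsck_verdict is ours.

```shell
# Classify an e2fsck exit status (a bit mask, per the e2fsck manual page).
# fsck_verdict is an illustrative helper, not part of e2fsprogs or OBD.
fsck_verdict() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean"
    elif [ $((status & 4)) -ne 0 ]; then
        echo "errors left uncorrected"
    elif [ $((status & 3)) -ne 0 ]; then
        echo "errors found and corrected"
    else
        echo "e2fsck itself failed (status $status)"
    fi
}

# Typical use (as root, with /dev/loop0 set up as above):
#   e2fsck -f -n /dev/loop0; fsck_verdict $?
fsck_verdict 0
```

The -n option opens the filesystem read-only, so the check itself cannot modify the image.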

4.3 Using snapshots

Again, we have provided a quick way to get you off the ground:


# cd top-of-the-source/demos
# sh obdfsclean.sh
# rm /tmp/obdfs.tmpfile
# ./snapsetup.sh

When you type mount you will see two file systems. One uses /dev/obd1 and the other /dev/obd2. Both of these OBD devices are talking to an obdext2 on /dev/obd0, which is configured, as before, to talk to /dev/loop0 on /tmp/obdfs.tmpfile.

It is instructive to investigate the inodes of the file system with debugfs:


# debugfs /dev/loop0
debugfs: stat <2>    # look at the contents of the root inode
debugfs: ls <2>      # see the hello file we created?  it's inode 12
debugfs: stat <12>   # let's look at the block assigned to hello
debugfs: q

This shows that objects (inodes) 2 and 12 have a block attached to them, holding the directory and file data, respectively.

Now we can make a few changes to the /mnt/obd filesystem and see what effect this has on the two filesystems (which both share one device).


# echo "today" >> /mnt/obd/hello

Now run debugfs again.


# debugfs /dev/loop0
debugfs: stat <12>   # file 12 (hello) looks like it has 3 blocks
debugfs: stat <18>   # file 18 (old hello) has old data block
debugfs: stat <19>   # file 19 (new hello) has a new data block
debugfs: q

For the /mnt/obd/hello file, the first "block" listed is actually a magic number which indicates to the snapshot driver that this inode has multiple versions. The second "block" is (in this case) the object id of the current snapshot of this inode. (That snapshot is mounted on /mnt/obd.) The last "block" is the object id for this file corresponding to a snapshot that was timed to preserve state just after we created the file /mnt/obd/hello in the file system (look in snapsetup.sh for details).

What has happened is that the inode was made into an indirect object that refers the caller to either the old data (in object 18) or the new data in object 19. Of course, the numbers 18 and 19 can change if you do extra file system operations.
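
The redirection can be pictured as a small lookup table attached to the inode. The following toy model, written for this document and not taken from the snapshot driver, hard-codes the object numbers seen in the debugfs session above; the function name resolve_hello and the view names are our own.

```shell
# Toy model of the redirection for inode 12 (hello) after the write.
# The object numbers 18 and 19 are the ones from the debugfs session above;
# resolve_hello is an illustrative name, not a function of the driver.
resolve_hello() {
    case "$1" in
        current)  echo 19 ;;   # read-write view mounted on /mnt/obd
        snapshot) echo 18 ;;   # timed snapshot mounted on /mnt/snap
        *)        echo "unknown view: $1" >&2; return 1 ;;
    esac
}

for view in current snapshot; do
    echo "$view view of hello -> object $(resolve_hello "$view")"
done
```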

In the file system we can see this too:


 
# cat /mnt/obd/hello 
# cat /mnt/snap/hello

The final test is of course:


# rm /mnt/obd/hello
# ls /mnt/obd
# cat /mnt/snap/hello

Finally we will restore the old world with our snaprest.sh shell script:


# ./snaprest.sh
# ls /mnt/obd
# cat /mnt/obd/hello

To clean up afterwards, run snaprestclean.sh. More fun with snapshots, along with a fuller explanation of how they operate, can be found below. First, let's explain in some more detail how the basic setup works.

4.4 Explanations of the magic

The first step in using the OBD software is to load the modules into the kernel. For basic usage, you need to install the obdclass, obdext2, and obdfs modules. The following examples assume you are in the main OBD directory.


# insmod class/obdclass.o
# insmod ext2obd/obdext2.o
# insmod obdfs/obdfs.o

The obdclass module provides a dispatching service for the object-type-dependent methods used by the various OBD drivers. The obdext2 module is the low-level block device driver which simulates an object based device using an ext2 filesystem on disk. It only works with blocks, inodes, and bitmaps, and has no understanding of directories or filenames. The obdfs module is the filesystem which manipulates files and directories, and presents a view of the underlying device to the user.

In order to test OBD stuff, you need to create a small test filesystem for the obdext2 driver to work with.

This can be done using the normal ext2 tools found in the e2fsprogs package (which is part of the base install of virtually every Linux system). The easiest way to do this is with a loopback device:


# dd if=/dev/zero of=/tmp/obdfs.tmpfile bs=1k count=10k
# insmod loop
# losetup /dev/loop0 /tmp/obdfs.tmpfile
# mke2fs -b 4096 /dev/loop0

Note that the ext2 filesystem currently needs to be created with a 4k block size because the obdext2 driver assumes that the block size matches the page size. A later release of the obdext2 driver should lift this restriction and allow ext2 filesystems with 1k and 2k block sizes.
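
Whether a 4k block size really matches the page size of the machine at hand can be checked with getconf. This is a small sketch of such a check; the helper name block_matches_page is ours.

```shell
# Compare a proposed ext2 block size with the system page size.
# block_matches_page is an illustrative helper, not part of the OBD tools.
block_matches_page() {
    block_size=$1
    page_size=$2
    [ "$block_size" -eq "$page_size" ]
}

page_size=$(getconf PAGESIZE)   # 4096 on i386
if block_matches_page 4096 "$page_size"; then
    echo "page size is $page_size: mke2fs -b 4096 is the right choice"
else
    echo "page size is $page_size: a 4096-byte block size will not match it"
fi
```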

The majority of configuration of OBDFS is through the control program obdcontrol. This is a relatively complete command-line interface, with basic help, command completion, and command history. It allows you to (un)configure basic OBD and snapshot devices, as well as do debugging and testing of OBD devices and objects.

The most common commands in obdcontrol are (in matching pairs) attach and detach, setup and cleanup, connect and disconnect, help, and quit. To get a complete listing of available commands, type help at the obdcontrol prompt. To get basic help on the meaning and syntax of a command, type help command. Command completion is activated with the TAB key, and command history is available via the up- and down-arrow keys.

Attach will attach the specified OBD driver (ext2_obd or snap_obd) to the current OBD device (by default /dev/obd0; you can change the device with the device command). This serves two purposes. First, the device /dev/obdX acquires the methods of the driver type named in the attach command. Second, in some cases we also pass some data in to the system, for example to indicate which snapshot view /dev/obdX should give.

We need the ext2_obd driver so we can attach to the test filesystem we created.


# class/obdcontrol
Device now /dev/obd0
obdcontrol > attach ext2_obd /dev/loop0

Setup will complete the configuration of the current OBD device. For ext2_obd a setup command initializes an inode and buffer cache which the obd driver exploits.


obdcontrol > setup
obdcontrol > quit

At this point, you should be able to mount the OBDFS filesystem:


# mkdir /mnt/obd
# mount -t obdfs -o device=/dev/obd0 none /mnt/obd
# df -k /mnt/obd
Filesystem           1k-blocks      Used Available Use% Mounted on
none                 9668        20      9148   0% /mnt/obd

These steps are included in the script demos/obdfssetup.sh. Light usage of the file system (such as rebuilding the obd code) is usually possible.

NOTE: Reams of debugging output are produced by the various OBD components. This can be quelled with:

# echo 0 > /proc/sys/obd/debug
# echo 0 > /proc/sys/obd/trace
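
In a script it is safer to write to these files only when they exist, since they are presumably absent until the OBD modules are loaded. A hedged sketch follows; the helper name quiet_obd and its optional directory argument are ours, the latter added only so the function can be tried against an ordinary directory.

```shell
# Write 0 to each OBD debug knob that exists and is writable; report the rest.
# quiet_obd is an illustrative helper; the directory argument exists only so
# the function can be tried out against an ordinary directory.
quiet_obd() {
    dir=${1:-/proc/sys/obd}
    for knob in debug trace; do
        if [ -w "$dir/$knob" ]; then
            echo 0 > "$dir/$knob"
        else
            echo "skipping $dir/$knob (module not loaded, or not root?)" >&2
        fi
    done
}

quiet_obd
```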

For example, create a few files in /mnt/obd for testing:


# echo "yesterday" > /mnt/obd/hello
# echo "test" > /mnt/obd/bye
# touch /mnt/obd/a /mnt/obd/b
# ln -s hello /mnt/obd/link
# cat /mnt/obd/link
yesterday
# ls -li /mnt/obd
total 23
     15 -rw-r--r--    1 root     root            0 Dec 16 16:43 a
     16 -rw-r--r--    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           10 Dec 16 16:43 hello
     14 lrwxrwxrwx    1 root     root            5 Dec 16 16:43 link -> hello
     11 drwxr-xr-x    1 root     root        16384 Dec 16 16:43 lost+found

Connect will establish a unique connection to the OBD device. This allows the device to keep track of parameters and resources on a per-client basis. Most operations require such a connection to have been made. For example, to get the attributes of inode 12 (the file hello in the previous listing) we need to first connect:


obdcontrol > connect
Client ID     : 2
Finished (success)
obdcontrol > getattr 12
Inode: 12  Mode:  100644
User:      0   Group:      0   Size: 10
ctime: 3859792b -- Thu Dec 16 23:43:39 1999
atime: 00000000 -- Thu Jan  1 00:00:00 1970
mtime: 3859792b -- Thu Dec 16 23:43:39 1999
flags: 3859792b
Finished (success)
obdcontrol > disconnect
Finished (success)
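
The ctime and mtime fields in the getattr output are seconds since the epoch, printed in hexadecimal. A short sketch of decoding one by hand follows; the helper name hex_to_epoch is ours.

```shell
# Convert a hexadecimal timestamp, as printed by getattr, to decimal seconds
# since the epoch. hex_to_epoch is an illustrative helper.
hex_to_epoch() {
    printf '%d\n' "0x$1"
}

secs=$(hex_to_epoch 3859792b)
echo "$secs"    # prints 945387819
# With GNU date, the seconds can be turned back into a calendar date:
#   date -u -d "@$secs"
```

The value 945387819 corresponds to Thu Dec 16 23:43:39 1999 in UTC, which is the date obdcontrol prints next to the hexadecimal value; the ls listing shows the same moment in local time.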

The OBDFS file system makes a connection when it is mounted. An important purpose of connections is to release pre-allocation data from obdext2 when the connection is closed.

4.5 Making snapshots with OBDFS

The power of the object storage paradigm can be seen in storage management modules which reside between our file system (OBDFS) and storage driver (obdext2). Snapshots are read-only clones of file systems, which are present in addition to the current copy. Snapshots are associated with a certain point in time, enabling consistent views of older versions of the file system as well as un-assisted retrieval of old files after accidental deletion.

An uninteresting way to produce a type of snapshot is to simply copy the entire filesystem to a certain location. With the OBDFS snapshots we maintain the read-only clones through a "copy on write" mechanism. With this mechanism snapshots require much less space.

The stack of drivers is now different. The file system uses the snapshot OBD driver as its device and NOT ext2_obd. The snapshot driver type is associated with several devices. There is a current snapshot which is read/write, and then one snapshot device can be instantiated for each timed snapshot.

Leave some files around in the file system you mounted above. These will be the "old" copies, preserved in the snapshots, while the new ones are maintained in the current snapshot. First install the snapshot OBD driver and unmount the file system, then configure (attach and set up) the snapshot drivers:


# cd top-of-the-source
# insmod snap/obdsnap.o
# umount /mnt/obd 
# class/obdcontrol
obdcontrol > snaptable
enter file name: /tmp/obdfs.snaptable
Add, Delete or Quit [adq]: a
enter index where you want this snapshot: 1
enter time or 'now' or 'current': current
Time: current -- Index 1
Add, Delete or Quit [adq]: a
enter index where you want this snapshot: 2
enter time or 'now' or 'current': now
Time: current -- Index 1
Time: Thu Dec 16 16:32:37 1999 -- Index 2
Add, Delete or Quit [adq]: q
OK with new table? [Yn]: y

All that we've done so far is create a table, which in real use would be a configuration file somewhere in /etc, but for now was placed in /tmp/obdfs.snaptable. It records the snapshot times and which OBD slot is associated with each snapshot. Every inode can remember up to 12 snapshots and we allocate each snapshot to a slot. We always need to have a "current" snapshot, the read-write snapshot to which updates to the filesystem go; in this case we placed it at index 1.

We also created a "historical" snapshot (at index 2), which means that the state of all files stored in the OBD filesystem before 16:32:37 (the time when I created the "now" snapshot) will be preserved in that snapshot. Deletions will leave those old files around, and writing to a file created before that timestamp and not modified after will cause a copy (the COW - Copy on Write) to be left behind in the historical snapshot. Updates to the atime are so frequent that we have eliminated them from the causes of COW.
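
The copy-on-write rule just described can be stated as a one-line test. The sketch below is our own restatement of the rule in shell, not code from the snapshot driver; the name needs_cow is hypothetical, and treating an object changed exactly at the snapshot time as "not older" is an assumption.

```shell
# Decide whether a write must first preserve the old version of an object.
# A sketch of the rule described above, not the snapshot driver's code:
# an object last changed before the snapshot time still belongs to the
# historical snapshot, so writing to it triggers a copy.
needs_cow() {
    last_change=$1     # seconds since the epoch of the object's last change
    snapshot_time=$2   # seconds since the epoch when the snapshot was taken
    [ "$last_change" -lt "$snapshot_time" ]
}

snapshot_time=1000
for last_change in 900 1100; do
    if needs_cow "$last_change" "$snapshot_time"; then
        echo "object changed at $last_change: copy before writing"
    else
        echo "object changed at $last_change: write in place"
    fi
done
```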

Now we load the newly created snapshot table into the snapshot driver. We will load this into snapshot table 0, with the snapset command. We also want to attach the snapshot OBD driver to OBD devices, one device for each snapshot. We will attach /dev/obd1 to snapshot index 1 (current), and /dev/obd2 to snapshot index 2 (historical). In both cases we use /dev/obd0 as the underlying data storage area.


obdcontrol > snapset 0 /tmp/obdfs.snaptable
Time: current -- Index 1
Time: Thu Dec 16 16:47:16 1999 -- Index 2
Snapcount 2
type snap_obd (len 8), datalen 24 (24)
Finished (success)
obdcontrol > device /dev/obd1
Device now /dev/obd1
obdcontrol > attach snap_obd 0 1 0
type snap_obd (len 8), datalen 12 (12)
Finished (success)
obdcontrol > setup snap_obd
Finished (success)
obdcontrol > device /dev/obd2
Device now /dev/obd2
obdcontrol > attach snap_obd 0 2 0
type snap_obd (len 8), datalen 12 (12)
Finished (success)
obdcontrol > setup snap_obd
Finished (success)

For the first attach command, we attach the current OBD device (/dev/obd1) to type snap_obd and give attachment data in the form of 3 parameters. The first names the underlying object device to use (0, for /dev/obd0), the second the snap index to use (snap index 1 in this case), and the third the table giving the times (table 0). The second attach command is similar. Now we are ready to try out our snapshots as devices. Having to ask for /dev/obd2 to be mounted read-only is a deficiency in our software; read-only access will be enforced automatically in a future release.


# mount -t obdfs -o device=/dev/obd1 none /mnt/obd
# mkdir /mnt/snap
# mount -t obdfs -o ro,device=/dev/obd2 none /mnt/snap
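
The obdcontrol commands used for the two snapshot devices differ only in the device name and the snapshot index. The sketch below prints the command sequence for one snapshot view so that it can be reviewed and then typed at the obdcontrol prompt; the helper name snap_attach_commands is ours, and it prints text only.

```shell
# Print the obdcontrol commands that attach one snapshot view, following the
# pattern "attach snap_obd <underlying device> <snap index> <table>" used above.
# snap_attach_commands is an illustrative helper; it only prints text.
snap_attach_commands() {
    device=$1       # OBD device that will present the snapshot, e.g. /dev/obd1
    underlying=$2   # number of the underlying device (0 for /dev/obd0)
    index=$3        # snapshot index from the snapshot table
    table=$4        # snapshot table number given to snapset
    echo "device $device"
    echo "attach snap_obd $underlying $index $table"
    echo "setup snap_obd"
}

snap_attach_commands /dev/obd1 0 1 0   # current (read-write) snapshot
snap_attach_commands /dev/obd2 0 2 0   # historical snapshot
```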

The previous steps for configuring the snapshot device are included in the demos/snapsetup.sh script. Finally we will see the snapshot in operation. First we take a look at the files in the two directories, and note that they have the same inode numbers for both the read-write and read-only devices:


# ls -li /mnt/snap /mnt/obd
/mnt/snap:
total 19
     15 -rw-r--r--    1 root     root            0 Dec 16 16:43 a
     16 -rw-r--r--    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           10 Dec 16 16:43 hello
     14 lrwxrwxrwx    1 root     root            5 Dec 16 16:43 link -> hello
     11 drwxr-xr-x    1 root     root        16384 Dec 16 16:43 lost+found

/mnt/obd:
total 20
     15 -rw-r--r--    1 root     root            0 Dec 16 16:43 a
     16 -rw-r--r--    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           10 Dec 16 16:43 hello
     14 lrwxrwxrwx    1 root     root            5 Dec 16 16:43 link -> hello
     11 drwxr-xr-x    1 root     root        16384 Dec 16 16:43 lost+found

It is instructive to investigate the inodes of the file system with debugfs:

# debugfs /dev/loop0
debugfs: stat <2>     # look at the blocks assigned to the root inode
debugfs: ls <2>       # list the root directory
debugfs: stat <12>    # stat the file "hello", inode 12 above

This shows that objects (inodes) 2 and 12 have a block attached to them, holding the directory or file data.

Now we can make a few changes to the /mnt/obd filesystem and see what effect this has on the two filesystems (which both share one device):


# chmod 777 /mnt/obd
# echo "today" >> /mnt/obd/hello
# cp /etc/hosts /mnt/obd
# rm /mnt/obd/a
# chmod 777 /mnt/obd/b
# ls -li /mnt/snap /mnt/obd
/mnt/snap:
total 19
     15 -rw-r--r--    1 root     root            0 Dec 16 16:43 a
     16 -rw-r--r--    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           10 Dec 16 16:43 hello
     14 lrwxrwxrwx    1 root     root            5 Dec 16 16:43 link -> hello
     11 drwxr-xr-x    1 root     root        16384 Dec 16 16:43 lost+found

/mnt/obd:
total 24
     16 -rwxrwxrwx    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           16 Dec 16 18:28 hello
     19 -rw-r--r--    1 root     root          394 Dec 16 19:32 hosts
     24 lrwxrwxrwx    1 root     root            7 Dec 16 19:34 link -> bye
     11 drwxr-xr-x    1 root     root        16384 Dec 16 16:43 lost+found
# cat /mnt/snap/hello
yesterday
# cat /mnt/obd/hello
yesterday
today
# cat /mnt/obd/link
test

We can see that /mnt/snap has stayed constant (inode numbers, file size, mtime), while /mnt/obd shows the changes we have made to the various files, yet they also have the same inode numbers (very important for directory lookups, NFS, etc). This is all handled in the snap OBD driver, where it does copy-on-write for modified objects, and handles redirection to the proper underlying object, depending on the context of the object request.

Debugfs shows the details:


 
# debugfs /dev/loop0
debugfs: stat <2> 
debugfs: ls <17>
debugfs: ls <18>

What is seen here is that the inode no longer points to blocks: it has a magic constant (stating "I'm a snapshot inode") and it contains referrals to two other inodes, numbers 17 and 18 in my case (if you do more things with the file system, the allocated inode numbers can of course be different). Doing the two ls's reveals the two copies of the directory introduced by the snapshot driver.

Because of the redirection in the snapshot layer, the underlying ext2 filesystem is not in a valid ext2 state (this may be fixed in a later release of OBDFS). However, we can delete a read-only (old) snapshot and leave the "current" state as a clean ext2 filesystem. We can also restore the filesystem to its former state.

Note: In the current release of OBDFS, it is possible to add a snapshot while a filesystem is mounted, but it is not possible to remove the snapshot while the filesystem is mounted. While it may appear to work in many cases, it will likely corrupt the filesystem.


obdcontrol > device /dev/obd2
Device now /dev/obd2
obdcontrol > connect
Client ID     : 2
Finished (success)
obdcontrol > snapdelete
type snap_obd (len 8), datalen 4 (4)
Finished (success)
obdcontrol > cleanup
Finished (success)
obdcontrol > detach
Finished (success)
obdcontrol > device /dev/obd1
Disconnecting active session (2)...Finished (success)
Device now /dev/obd1
obdcontrol > cleanup
Finished (success)
obdcontrol > detach
Finished (success)
obdcontrol > quit
# rmmod obdsnap
# rmmod obdfs
# rmmod obdext2
# rmmod obdclass
# mount /dev/loop0 /mnt/obd
# ls -li /mnt/obd
total 32
     16 -rwxrwxrwx    1 root     root            0 Dec 16 16:43 b
     13 -rw-r--r--    1 root     root            5 Dec 16 16:43 bye
     12 -rw-r--r--    1 root     root           16 Dec 16 18:28 hello
     19 -rw-r--r--    1 root     root          394 Dec 16 19:32 hosts
     24 lrwxrwxrwx    1 root     root            7 Dec 16 19:34 link -> bye
     11 drwxr-xr-x    2 root     root        16384 Dec 16 16:43 lost+found

After removing the snapshot, we did a bit of cleanup on the devices we had previously configured, so that we can safely remove the loaded modules. Finally, we remounted the underlying filesystem as ext2 again without any problems, and can see that it is in the state of the current snapshot.

Finally we mention the snap restore operation (see the shell script demos/snaprest.sh for how it is used). This allows you to revert a file system to the state in a previous snapshot.

5. Our next few steps

Over the next months we will be working on other aspects of these systems. We hope to release the following:

A page flush daemon for OBDFS. This should make OBDFS comparable in performance to Ext2.
A reorganization of the snapshot data layout, following suggestions from Ted Ts'o and others.
Remote access to OBD devices using SUN RPC and possibly other interfaces.
A cluster file system based on a locking API and object drivers.
Recovery, hopefully exploiting journal based techniques as used in ext3 and ReiserFS.

If you want to help, let us know!

6. Known issues and bugs

While every effort is made to have a functioning system, because the software is under development, there may be dated releases which do not work. In general, the versioned releases should be working releases.

Some known issues/bugs with obdfs at the time this document was created:

The file system writes synchronously to the disk. This is done so that we can develop and debug the early versions of OBDFS with some kind of sanity. For this reason, you should not make judgements regarding OBD filesystem performance. In a future release, we will be using a page cache and flush daemon to do asynchronous disk I/O, which should improve performance.
The ext2 filesystem currently needs to be created with a 4k block size because the OBDFS assumes the block size matches the page size. This will be fixed in a later release of OBDFS.
In the current release of OBDFS, it is possible to add a snapshot while a filesystem is mounted, but it is not safe to remove a snapshot while any snapshot of that filesystem is mounted. While it may appear to work in many cases, it will likely corrupt the filesystem.

7. Contacting the Authors

There are two mailing lists for OBD, one for questions and development of the OBDFS software, and a second low-volume list for the announcement of new OBDFS releases. We read email sent to both of these lists regularly.

In order to send email to these lists, you must be subscribed. To subscribe to the obd-devel list, send email to obd-devel-request@lustre.org with the body:


subscribe your@email.addr obd-devel

The process for subscribing to obd-announce is the same.

To contact the authors directly, send email to braam@stelias.com or adilger@stelias.com