NFSP

WIP Wed Sep 10 14:58:23 CEST 2003

The most up-to-date information should be available in the packages on the NFSP web page.

What is NFSP ?

NFSP aims at building a distributed NFS server. There are currently two versions (a user-mode and a kernel-mode version). A few sub-flavors may be enabled at compilation time by means of Makefile options.
The two main versions are UNFSP (user mode) and KNFSP (kernel mode). Currently this package only ships UNFSP. Older packages shipped KNFSP, but it has to be adapted to work with the latest kernels (2.4.22) and the latest nfs-utils (1.0.5). NFSP may be viewed as a mix of PARL's PVFS (Parallel Virtual File System) and an NFS server.

Our objectives are simple:

Current requirements

Some limitations of the code will probably cause problems when compiling or running it on different hardware or OSes.
To summarize, NFSP has been tested on/with:
It may work with lower versions (this was not tested), so feel free to let me know if it works on your system.

Compilation

A quick build method is available through a script: simply launch script/batchbuild.
You may also prefer to build one component at a time (a la mano mode ;):
Yet, you may copy only the needed files according to the type of the node:

Installation: server and I/O daemons side

Three entities have to be distinguished: the server, the I/O daemons (iods) and the clients.
Among this set of nodes, there must be exactly one dedicated node chosen to act as the server: this is the only host your clients have to know (just as with a plain NFS server).
The other nodes may be iod(s) and/or client(s), provided there is at least 1 iod (the upper limit is fixed in default.h by MAX_IODS; edit it to suit your needs).


On the server:

On the I/O servers (iod's):

There are two ways to start the I/O storage servers (iods), depending on the NFS clients' implementation.
If the client checks IP packets loosely (as Linux 2.4.x does, at least), root privileges are not required and the following steps have to be performed (this is the default mode):
If your clients enforce some kind of IP checking, root privileges are needed to enable UDP spoofing. The steps are not much altered, though.
Instead of starting iodng as the user, start it as root and use the -user ioduser flags. Once the application has its low-level sockets open, it changes its (e)uid/(e)gid to those of the ioduser user.

Back on the server:

As there are several NFSP flavors there are several ways to launch them...

User-mode without VIOD support
User-mode with VIOD support
VIODs are groups of iods that replicate the information they receive; they let users add some kind of data redundancy for the iods.
XXX
If every step was successful, you now have a working NFS share available for your clients (see the section below to learn how to use it)!
You can verify that it is visible from your clients with the showmount utility, which is supposed to list the NFS shares made available by a server, as in the example below:
$ /sbin/showmount -e metasrv
Export list for metasrv:
/META 192.168.0.0/255.255.255.0
$
If not, go to the troubleshooting section.
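The showmount check above can also be scripted. The following is only a minimal sketch: export_listed is a hypothetical helper, not part of NFSP, and it assumes the output layout shown in the example (one header line, then one export per line with the directory in the first column).

```shell
#!/bin/sh
# Hypothetical helper (not part of NFSP): succeeds if the given directory
# appears as an export in `showmount -e` output read from standard input.
export_listed() {
    # $1: exported directory to look for, e.g. /META
    # Skip the "Export list for ..." header line, then compare the first field.
    awk -v dir="$1" 'NR > 1 && $1 == dir { found = 1 } END { exit !found }'
}
```

It could be used as: /sbin/showmount -e metasrv | export_listed /META && echo "share is exported"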

Installation on the clients

Well... It's quite simple: the only thing the clients require is kernel support for the NFS protocol.
As root on all of your clients, issue the following command (metasrv being the host holding the directory /META):
root@clientbox# mount -t nfs metasrv:/META /mnt/mymountpoint/ -o rsize=8192,wsize=8192,hard,intr
Other options for your NFS client are described on Linux systems in the nfs(5) man page, but YMMV. If you use this NFS share often, you may also add it to /etc/fstab so that it is mounted automagically.
And voilà, it's done.
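For the /etc/fstab route, the entry uses the same server, directory, mount point and options as the mount command above. The sketch below only prints such a line; fstab_entry is a hypothetical helper, not part of NFSP, and the trailing "0 0" (no dump, no fsck pass) is the usual choice for network shares rather than something NFSP prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not part of NFSP): print an /etc/fstab entry for an NFS share.
fstab_entry() {
    # $1: server, $2: exported directory, $3: local mount point, $4: mount options
    # Fields: device, mount point, filesystem type, options, dump flag, fsck pass.
    printf '%s:%s\t%s\tnfs\t%s\t0\t0\n' "$1" "$2" "$3" "$4"
}
```

Running fstab_entry metasrv /META /mnt/mymountpoint rsize=8192,wsize=8192,hard,intr prints the line; review it before appending it to /etc/fstab as root.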

Troubleshooting

Type Component Trouble/Fix
hint all it's still a work in progress so do expect crashes, tears and desolation for your valuable data :)
hint all most executables will display help messages if you use the -h/--help flags (recommended)
hint all for performance purposes, you may increase the number of processes since blocking I/O is used (check with -h/--help)
hint all you may tweak applications by editing the hardcoded values in the default.h file
hint all nfsp specific #define's are at the beginning of iodng.h so if some limitations annoy you, feel free to tweak them in this file
hint all location does not matter (or else this is a bug ;) provided it is specified on the command line
hint all most applications have a man page (*.man) though these may be incomplete/inaccurate/not up to date
ts iod some paranoid setups may not allow IP spoofing, so it may not work (check /proc/sys/net/ipv4/conf/*/rp_filter and /usr/src/linux/Documentation/filesystems/proc.txt)
ts iod the iodng_ping utility may be used to "ping" an iod and to test whether this host can do IP spoofing (check the iodng's -s option)
ts iod the directory in which the iod lives must have at least mode 0700 and must belong to the uid/gid specified with the -u option (defaults to user nobody)
ts all most commands support a -F option (as in "Foreground") to tell the applications not to daemonize and thus help debugging
ts mount if you cannot mount your NFS share: check the permissions in the files /etc/hosts.{allow,deny} for portmap and mountd, check that your metafile directory exists (/META above), test the iods with iodng_ping, run in the foreground, enable debug mode (set SHOWINFO to 1 in dbg.h), check /var/log/syslog
BUG iodng_ping spoof option (-s) may not work if you use "special" addresses to be spoofed (for instance 127.0.0.0/8 and 224.0.0.0/3) since it will interact strangely with routing tables. (You don't want to do that anyway, do you ? :)
BUG iod only 2 network interfaces are currently supported for the iods (lo and eth0). They are probed by means of ioctl calls to get their MTUs and to fragment the IP packets correctly.
BUG all this has only been tested on Linux x86 with Linux x86 (32-bit) clients: there are probably issues with endianness and/or 64-bit architectures...
BUG nfsd you may create a special block device, if you are root (mknod foo b 12 23), but you won't be able to remove it on a mounted partition
BUG nfsd if you truncate() a file, then make it grow by writing something beyond its new limit, then read between the old end offset and the new end offset, there will most likely be stale data (understand: "undefined behavior") and not "0" as expected.
BUG all if an iod breaks, the system will break and clients will hang. You may restart the iod and operations should work again (not much tested), but you should have mounted the NFS share with the intr option (or soft, even though it's not really advised), as stated above
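
The rp_filter check mentioned in the iod rows above can be done by hand or with a small loop. This is a sketch only: rp_filter_enabled is a hypothetical helper, not part of NFSP. It relies on the Linux convention that a value of 0 means no reverse-path filtering, so any other value marks an interface where spoofed packets may be dropped.

```shell
#!/bin/sh
# Hypothetical helper (not part of NFSP): list interfaces whose rp_filter
# setting is non-zero, i.e. where reverse-path filtering may drop spoofed packets.
rp_filter_enabled() {
    # $1: base directory; defaults to the path named in the troubleshooting table
    base="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$base"/*/rp_filter; do
        # Skip entries that do not exist or cannot be read.
        [ -r "$f" ] || continue
        value=$(cat "$f")
        if [ "$value" != "0" ]; then
            iface=$(basename "$(dirname "$f")")
            printf '%s %s\n' "$iface" "$value"
        fi
    done
}
```

Calling rp_filter_enabled with no argument on an iod host prints one "interface value" pair per filtered interface; empty output means no interface has the filter switched on.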

Todo

Contact

Please, add the word '[nfsp]' somewhere in the subject of your mail and do not forget to set a real subject if you want a quicker answer.
The NFSP team within the ID-IMAG laboratory gathers Yves Denneulin (general coordination), Adrien Lebre (tools and performance evaluations), Pierre Lombard (prototype) and Olivier Valentin (kernel port). To contact a specific member, feel free to use firstname.lastname@imag.fr or just mail me pierre.lombard@imag.fr.

