Transparent Access Method for Distributed Harddisks

Christian Widmer

Diploma Thesis Winter 2001/2002
Supervisors: F. Rauch, Prof. T. Stricker
Institute for Computer Systems, ETH Zürich


Objectives

What is the topic?

The goal is to provide transparent access to all disks in a cluster of PCs connected by Gigabit Ethernet. This is to be achieved by writing kernel modules and/or user-space applications for a Linux-based operating system (kernel version 2.4.x).

What is the motivation?

Since PC hardware is becoming more and more powerful, clusters of PCs are gaining popularity. Developing applications for such clusters can be complex, because many different aspects of parallel and distributed programming must be taken into account; applications such as distributed and parallel databases are particularly difficult to develop. The idea is to distribute the data across the cluster and to see whether this makes the development of distributed and parallel databases easier.

What are the goals?

A system should be designed and implemented that allows transparent access to the SCSI disks installed in the client and in all servers. Applications should be able to access these disks through the file system without any modifications. The system should use Gigabit Ethernet for communication without excluding other services on the same wire, and it has to work with Linux kernel version 2.4.x.

Which problems have to be solved?

One question to answer is at which level in the kernel the system should be inserted. There are two possible levels: one is to implement a special SCSI driver that forwards all SCSI commands to the server using some network protocol; the alternative is to implement a block device driver and forward the block-device read and write requests. Another question is which kind of network protocol should be used. Since the system runs in a cluster with switched Gigabit Ethernet, an IP-based protocol is not strictly necessary; moreover, TCP's slow-start mechanism might slow the system down. One should also consider how the server should be implemented: a user-level daemon would be easier to develop than a kernel-based daemon, but would be much more affected by the scheduler.

Results

What was accomplished?

The system developed consists of four kernel modules. Because considerable time was needed to identify some serious hardware bugs in the Gigabit network adapter, the server's access is currently limited to an internal RAM disk. All sources are in a reasonably clean, stable and robust state.

What are the solutions to the posed problems?

A block device driver was developed that forwards all block requests over the network using the newly developed Burst Transfer Protocol. This protocol implements virtual channels: to transport data between machines, data buffers are preallocated at the receiver. An attempt was made to implement true zero copy. A special Gigabit network interface card was chosen that has several RX/TX hardware descriptor rings; depending on the user priority in the VLAN tag, the hardware decides on which descriptor ring to store arriving data. The kernel-based block device server uses several kernel threads to execute the I/O requests.

What are the remaining problems?

The block device server should be extended so that it can access real hard disks. Some parts of the Burst Transfer Protocol, especially its interface to the network interface card, are too simple and need optimization to minimize latency; the interface to the network adapter should therefore be extended. The Burst Transfer Protocol already includes some error handling, but the block device driver and the block device server do not use the error messages yet, and the network interface also needs some extensions to fully support error handling.



ETH Zürich: Department of Computer Science

Comments to Felix Rauch <rauch@inf.ethz.ch>
March 2002