Distributed Filesystem for a High Speed Network
Andreas Fleuti
Semester Thesis Summer 1999
Supervisors: Assistant Felix Rauch, Prof. T. Stricker
Institute for Computer Systems, ETH Zürich
Objectives
In his diploma thesis, Michael Psarros wrote the Myrinet Distributed
File System (MDFS). The basic idea of MDFS is to combine a fast network
with several hard disks that are accessed in parallel. The goal is to
achieve high throughput by using several distributed servers, each one
sending data. MDFS was implemented for Linux. However, its performance
was not optimal. The system was not scalable: the throughput stayed
constant, independent of the number of servers. The measured throughput
was well below what the network can deliver, so there is considerable
potential for improvement. My task was to find out why the system was
not scalable and, in a second step, to implement the necessary changes.
Results
At the beginning, I focused on Linux and the way it treats device
requests. I was able to handle the requests in parallel. To access the
individual servers, the Linux RAID (Redundant Array of Inexpensive Disks)
system was used at first. This setup showed a significantly lower
throughput, about half the throughput measured without the Linux RAID,
even with only one server. I therefore implemented my own simple RAID
system in the MDFS driver. This way, I could check scalability and exclude
any influence of the Linux RAID system. The lower performance was gone,
but scalability was still not reached. The parallel handling of device
requests resulted in a throughput of 17 MB/s. I was able to increase the
throughput by adapting the part of the program that sends and receives
data packets. The new measurement showed 20 MB/s, which is about three
times the throughput of the original MDFS. However, the system was still
not scalable. There is no easy answer to the question of why the system
does not scale. Excluding the components that are known to scale shows
where the unresolved issues must lie. The program running on the Myrinet
processor LANai is a likely candidate, but how well Linux creates and
handles requests is also an open issue.
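To illustrate what a simple RAID layer spread over several servers has to
compute, the following sketch maps logical block numbers to servers in a
round-robin fashion (RAID 0 style striping). This is only an assumption
about the layout: the summary does not state which RAID scheme or stripe
size the MDFS driver uses, and the function names locate_block and
split_request are hypothetical. The actual driver is part of the Linux
kernel, so Python serves here purely as illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def locate_block(block: int, num_servers: int) -> Tuple[int, int]:
    """Map a logical block to (server index, block number on that server).

    Assumes round-robin striping with a stripe size of one block,
    which is an illustrative choice and not taken from the thesis.
    """
    if num_servers < 1:
        raise ValueError("need at least one server")
    if block < 0:
        raise ValueError("block number must be non-negative")
    return block % num_servers, block // num_servers


def split_request(first_block: int, count: int,
                  num_servers: int) -> Dict[int, List[int]]:
    """Split one logical request into per-server lists of local blocks.

    Each per-server list could then be sent to its server independently,
    so that all servers work on the same request in parallel.
    """
    per_server: Dict[int, List[int]] = defaultdict(list)
    for block in range(first_block, first_block + count):
        server, local = locate_block(block, num_servers)
        per_server[server].append(local)
    return dict(per_server)
```

With three servers, a request for the four logical blocks 4 to 7 touches
every server, so a driver that issues the per-server parts concurrently can
keep all disks busy at once.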
ETH Zürich: Department of Computer Science
Comments to Felix Rauch <rauch@inf.ethz.ch>
Date: June 28, 1999