
TCP Adaptation for MPI on Long-and-Fat Networks

Motohiko Matsuda (AIST), Tomohiro Kudoh (AIST), Yuetsu Kodama (AIST), Ryousei Takano (AIST), Yutaka Ishikawa (University of Tokyo)

IEEE International Conference on Cluster Computing (Cluster 2005)
Boston, Massachusetts, USA, September 27 - 30, 2005


Abstract

Typical MPI applications alternate between phases of computation and communication, and exchange messages in relatively small chunks. This pattern is a poor fit for TCP, which is designed to handle only a continuous flow of data efficiently. The mismatch is well known, but fixes have not been integrated into today's TCP implementations, even though it seriously degrades performance, especially for MPI applications. This paper proposes three improvements to the Linux TCP stack: pacing at start-up, shortening the retransmission timeout (RTO), and switching TCP parameters at the transitions between computation phases in an MPI application. Evaluation using the NAS Parallel Benchmarks shows that the BT, CG, IS, and SP benchmarks achieved 10 to 30 percent improvements. In contrast, the FT and MG benchmarks showed no improvement because their communication is steady, as TCP assumes, and the LU benchmark became slightly worse because it has very little communication.
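
To make the idea of pacing at start-up concrete, the following is a minimal illustrative sketch, not the implementation described in the paper. It assumes that pacing means spreading one window of packets evenly over one round-trip time instead of sending them as a burst. The function names, the 1460-byte segment size, and the 1 Gbps / 10 ms link used in the example are assumptions chosen for illustration only.

```python
import math


def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bytes that must be in flight to keep a link of the given bandwidth full."""
    if bandwidth_bps <= 0 or rtt_s <= 0:
        raise ValueError("bandwidth and RTT must be positive")
    return round(bandwidth_bps * rtt_s / 8)


def pacing_gap_seconds(rtt_s: float, window_bytes: int, mss_bytes: int = 1460) -> float:
    """Gap between packet departures when one window is spread evenly over one RTT.

    mss_bytes is an assumed segment size, not a value taken from the paper.
    """
    if rtt_s <= 0 or window_bytes <= 0 or mss_bytes <= 0:
        raise ValueError("RTT, window, and MSS must be positive")
    packets = max(1, math.ceil(window_bytes / mss_bytes))
    return rtt_s / packets


if __name__ == "__main__":
    # Hypothetical long-and-fat link: 1 Gbps with a 10 ms round-trip time.
    window = bandwidth_delay_product_bytes(1_000_000_000, 0.01)
    gap = pacing_gap_seconds(0.01, window)
    print(f"window = {window} bytes, gap = {gap * 1e6:.2f} microseconds")
```

On a long-and-fat network the window needed to fill the link is large, so sending it as a single burst at the start of each communication phase can overflow queues along the path; a paced sender would instead wait roughly the computed gap between packets.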


  