Slide 18 of 24
For the prototype of our CoPs, interconnected with Dolphin’s Scalable Coherent Interface (SCI) and Myrinet, peak bandwidths of 70 and 128 MB/s are achieved for contiguous transfers.
But transfers that combine contiguous loads with remote strided stores suffer from too many PCI bus arbitrations and DMA initializations.
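To make the access pattern concrete, here is a minimal sketch of a contiguous load paired with a strided store. The function names `strided_store` and `count_destination_runs` are hypothetical and not part of the measured system. The run count is only a simplified model: it assumes that every separate contiguous destination run needs its own transfer setup, which is the intuition behind the arbitration and initialization overhead, not a description of the actual hardware.

```python
def strided_store(src, dst, stride):
    """Read src contiguously and write element i to dst[i * stride]."""
    for i, value in enumerate(src):
        dst[i * stride] = value
    return dst


def count_destination_runs(n_elements, stride):
    """Count separate contiguous runs in the destination (simplified model).

    Assumption: with stride 1 the whole block is one contiguous run;
    with any larger stride every element lands in its own run.
    """
    if n_elements == 0:
        return 0
    return 1 if stride == 1 else n_elements


if __name__ == "__main__":
    source = [1, 2, 3]
    destination = strided_store(source, [0] * 6, stride=2)
    print(destination)
    print(count_destination_runs(len(source), 1))
    print(count_destination_runs(len(source), 2))
```

In this model a contiguous transfer of n elements pays the setup cost once, while a strided one pays it n times, which matches the qualitative drop described here.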
The sloped curve from stride 2 to stride 8 with SCI can be explained by the mechanics of stream buffers.
With Myrinet we see about the same picture as with SCI, but here the stream buffers play no role.
As a reference, the local copy performance is shown by the black bullets.
The results show that the remote performance clearly suffers from the I/O system.