Title:
Shared-Memory Multiprocessor Trends and the Implications for Parallel Program Performance
Publisher Information:
University of Rochester. Computer Science Department.
Publication Year:
2004
Collection:
University of Rochester, New York: UR Research
Document Type:
Report
Language:
English
Rights:
This item is protected by copyright, with all rights reserved.
Accession Number:
edsbas.5229EF14
Database:
BASE

Abstract:

The last decade has produced enormous improvements in processor speeds without a corresponding improvement in bus or interconnection network speeds. As a result, the relative costs of communication and computation in shared-memory multiprocessors have changed dramatically. An important consequence of this trend is that many parallel applications, which depend on a delicate balance between the cost of communication and computation, do not execute efficiently on today's shared-memory multiprocessors. In this paper we quantify the effect of this trend in multiprocessor architecture on parallel program performance. Our experiments on bus-based, cache-coherent machines like the Sequent Symmetry, and large-scale distributed-memory machines like the BBN Butterfly, demonstrate that applications scale much better on previous-generation machines than on current machines. In addition, we show that some scalable machines support fine-grain, shared-memory programs better than some bus-based, cache-coherent machines, without significantly greater programming effort. From our experiments we conclude that communication has become a dominant source of inefficiency in shared-memory multiprocessors, with serious consequences for system software involved in scheduling and decomposition decisions. In particular, we argue that shared-memory programming models that could be implemented efficiently on the machines of yesterday do not readily port to state-of-the-art machines, and that current software trends in support of fine-grain parallel programming are at odds with hardware trends.
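To make the abstract's central claim concrete, the following is a minimal illustrative model, not taken from the report itself. It assumes computation parallelizes perfectly across p processors while all communication serializes on a shared bus, giving a speedup of roughly p / (1 + p*R/g), where R is the cost of one communication relative to one unit of computation and g is the grain size (units of computation per communication). The function name, parameters, and values below are illustrative assumptions, not measurements from the Sequent Symmetry or BBN Butterfly.

# Illustrative sketch only: a toy bus-contention model of the trend the
# report describes, not the report's own methodology or data.

def speedup(p: int, grain: float, comm_ratio: float) -> float:
    """Speedup when computation parallelizes but bus traffic serializes.

    p          -- number of processors
    grain      -- units of computation performed per communication
    comm_ratio -- cost of one communication relative to one unit of computation
    """
    return p / (1.0 + p * comm_ratio / grain)

if __name__ == "__main__":
    p, grain = 16, 32.0
    # As processors get faster while the bus does not, the relative cost of
    # one communication (comm_ratio) grows and speedup collapses.
    for comm_ratio in (1.0, 4.0, 16.0, 64.0):
        print(f"comm/comp ratio {comm_ratio:5.1f}: "
              f"speedup on {p} CPUs = {speedup(p, grain, comm_ratio):5.2f}")

Under this toy model, raising the communication-to-computation cost ratio from 1 to 64 at a fixed grain size drops 16-processor speedup from roughly 10.7 to below 1, mirroring the mismatch between processor and interconnect speeds that the report argues has made fine-grain shared-memory programs inefficient on current machines.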