Operating System Support for High-Performance Shared-Memory Multiprocessing.

A primary motivation behind building multiprocessors is to improve system performance cost-effectively, but in practice it has proven difficult to obtain good performance from parallel applications. The goal of this project has been to devise and prototype solutions to several problems that have limited multiprocessor system performance. A common theme in each of these solutions is that a multiprocessor operating system should be different from (although it need not be more complicated than) a uniprocessor operating system that happens to run on a multiprocessor. Specifically, the research has concerned:

 
  • Efficient threads.
Threads are the vehicle for concurrency in many approaches to parallel programming. I have shown that thread management overhead can be reduced to within an order of magnitude of the cost of a procedure call. The goal is to make the expression and control of parallelism sufficiently cheap that the ``natural parallel decomposition'' of an application can be exploited by the programmer or compiler with acceptable overhead, even when that decomposition is relatively fine-grained. With high-performance threads, however, small differences in thread management can have a significant performance impact for applications with fine-grained parallelism, often posing a tradeoff between throughput and latency. Per-processor data structures can be used to improve throughput and, in some circumstances, to avoid locking, improving latency as well (sketched at the end of this item).

  Efficient thread management is impossible without efficient low-level synchronization primitives. In the work on thread management we discovered that, contrary to intuition, the obvious methods of spin-waiting for busy critical sections have poor performance. Instead, slightly more sophisticated methods, such as Ethernet-style backoff or explicit software queueing, are needed to avoid having busy processors slowed because of the activity of spin-waiting processors.
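
  To make the spin-waiting point concrete, here is a minimal sketch of one of the ``slightly more sophisticated methods'': a test-and-set lock with Ethernet-style exponential backoff. This is an illustration in C11, not the project's actual code; the atomic primitives and backoff constants are assumptions chosen for clarity.

    #include <stdatomic.h>

    typedef struct {
        atomic_int held;                 /* 0 = free, 1 = held */
    } spinlock_t;

    static void spin_lock(spinlock_t *l)
    {
        unsigned delay = 1;

        for (;;) {
            /* One atomic test-and-set attempt. */
            if (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0)
                return;                  /* got the lock */

            /* Failed: back off (doubling each time, Ethernet-style) so that
             * waiting processors do not flood the memory system with
             * atomic operations while the lock holder is working. */
            for (volatile unsigned i = 0; i < delay; i++)
                ;                        /* idle; real code might use a pause hint */
            if (delay < 1024)
                delay *= 2;

            /* Spin on an ordinary read until the lock at least looks free. */
            while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
                ;
        }
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_store_explicit(&l->held, 0, memory_order_release);
    }

  The point of the sketch is that a waiting processor spins on an ordinary read and delays progressively longer between attempts, so its memory traffic does not slow the processor that actually holds the lock.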
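
  The per-processor data structures mentioned above can be illustrated in the same spirit: each processor keeps its own free list of thread control blocks, so the common case of creating or destroying a thread touches only local data and needs no lock. The names here (MAX_PROCS, my_processor_id, tcb_t) are invented for the illustration, not taken from the project's code.

    #include <stddef.h>
    #include <stdlib.h>

    #define MAX_PROCS 64

    typedef struct tcb {
        struct tcb *next;                /* link in the per-processor free list */
        /* ... saved registers, stack pointer, thread state ... */
    } tcb_t;

    /* One free list per processor.  Only processor p touches free_list[p]
     * (assuming allocation runs with the caller bound to a processor),
     * so the fast path needs no locking at all. */
    static tcb_t *free_list[MAX_PROCS];

    extern int my_processor_id(void);    /* assumed; returns 0 .. MAX_PROCS-1 */

    static tcb_t *tcb_alloc(void)
    {
        int p = my_processor_id();
        tcb_t *t = free_list[p];

        if (t != NULL) {                 /* fast path: reuse a local block */
            free_list[p] = t->next;
            return t;
        }
        /* Slow path: the local list is empty; fall back to a shared allocator,
         * which in a real system would need a lock or a balancing policy. */
        return malloc(sizeof(tcb_t));
    }

    static void tcb_free(tcb_t *t)
    {
        int p = my_processor_id();
        t->next = free_list[p];          /* push onto the local list, no lock */
        free_list[p] = t;
    }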

 
  • Multiprocessor operating system structure.
Threads can be supported either by the operating system kernel or by user-level library code in each application's address space, but neither approach has been fully satisfactory. The performance of kernel-level threads, although at least an order of magnitude better than that of traditional UNIX-like processes, has been at least an order of magnitude worse than that of user-level threads. User-level threads, on the other hand, have suffered from a lack of system integration, leading to poor performance or even incorrect behavior in the presence of ``real world'' factors such as multiprogramming, I/O, and page faults. Thus the parallel programmer has been faced with a difficult dilemma: employ kernel-level threads, which ``work right'' but perform poorly, or employ user-level threads, which typically perform well but are sometimes seriously deficient. We have designed and prototyped a new kernel interface, called Scheduler Activations, and a user-level thread package that together provide the same functionality as kernel-level threads without compromising the performance and flexibility advantages of managing parallelism at the user level within each application's address space.
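
  To give a feel for this division of labor, here is a rough C sketch of a Scheduler Activations style interface. The identifiers and types are invented for this illustration and are not the prototype's actual header: the kernel notifies the user-level thread system of scheduling events through upcalls, and the address space asks for or returns processors through corresponding calls to the kernel.

    /* All names below are illustrative assumptions, not the real API. */

    typedef int activation_id;

    typedef struct {
        unsigned long regs[32];          /* saved user-level machine state */
    } machine_state_t;

    /* Kernel -> user upcalls: implemented by the user-level thread system,
     * run on a fresh activation whenever the corresponding event occurs. */
    void upcall_add_this_processor(int processor);
    void upcall_processor_preempted(activation_id preempted,
                                    machine_state_t *state);
    void upcall_activation_blocked(activation_id blocked);
    void upcall_activation_unblocked(activation_id unblocked,
                                     machine_state_t *state);

    /* User -> kernel calls: the address space tells the kernel how many
     * processors it could use, or gives one back when it has nothing to run. */
    void sa_add_more_processors(int additional);
    void sa_this_processor_is_idle(void);

  With events like these, thread scheduling decisions (which user-level thread runs on which processor, and when) stay in the user-level library, while the kernel retains control of how many processors each address space receives.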


 
 
 
 
