In computers, parallel processing is the processing of
program instructions by dividing them among multiple
processors with the objective of running a program in
less time. In the earliest computers, only one program ran
at a time. A computation-intensive program that took one hour and a tape-copying program that also took one hour would take a total of two hours to run.
An early form of parallel processing allowed the
interleaved execution of both programs together. The
computer would start an I/O operation, and while it was
waiting for the operation to complete, it would execute
the processor-intensive program. The total execution
time for the two jobs would be a little over one hour.
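In modern terms, the same overlap can be sketched with threads. In the rough Python sketch below, a sleep stands in for the tape I/O and an arbitrary arithmetic loop stands in for the computation; the program starts the I/O-bound job, does the processor-intensive work while that job waits, and finishes in roughly the time of the longer job rather than the sum of the two.

```python
import threading
import time

def copy_job():
    # Stands in for the tape-copying job: mostly waiting on I/O.
    # time.sleep() is a placeholder for real blocking reads and writes.
    for _ in range(5):
        time.sleep(1)              # the processor is idle during this wait

def compute_job():
    # Stands in for the computation-intensive job: pure processor work.
    total = 0
    for i in range(10_000_000):
        total += i * i
    return total

start = time.time()
io_job = threading.Thread(target=copy_job)
io_job.start()                     # start the I/O-bound job first
compute_job()                      # do processor work while the I/O job waits
io_job.join()
print(f"elapsed: {time.time() - start:.1f}s")  # roughly the longer job, not the sum
```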
The next improvement was multiprogramming. In a
multiprogramming system, multiple programs submitted
by users were each allowed to use the processor for a
short time. Then the operating system would allow the
next program to use the processor for a short time, and
so on. To users it appeared that all of the programs were
executing at the same time. Problems of resource
contention first arose in these systems. Explicit requests
for resources led to the problem of deadlocks.
Competition for resources on machines with no tie-breaking instructions led to the development of critical section routines.
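A critical section is any stretch of code that manipulates a shared resource and must be executed by only one task at a time. A minimal modern sketch, using Python threads and a lock (the counter and the thread count are arbitrary):

```python
import threading

counter = 0                  # shared resource
lock = threading.Lock()      # plays the role of a tie-breaking mechanism

def worker():
    global counter
    for _ in range(100_000):
        with lock:           # the critical section: one thread at a time
            counter += 1     # read-modify-write of the shared operand

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)               # 400000 every time; without the lock, updates can be lost
```

Deadlock arises from the same machinery: if two such routines each hold one resource and wait for the other's, neither can ever proceed.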
Vector processing was another attempt to increase
performance by doing more than one thing at a time. In
this case, capabilities were added to machines to allow a
single instruction to add (or subtract, or multiply, or ...)
two arrays of numbers. This was valuable in certain
engineering applications where data naturally occurred in
the form of vectors or matrices. In applications with less
well-formed data, vector processing was not so valuable.
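The same idea survives in today's SIMD instructions and array libraries. As a rough illustration (using NumPy, which is an assumption of this sketch rather than anything the original vector machines offered), a single array expression operates on every element pair at once:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)

# One array expression adds all one million element pairs;
# the element-by-element loop happens inside the library, not in Python.
c = a + b

print(c[:3])   # [0. 2. 4.]
```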
The next step in parallel processing was the introduction
of multiprocessing. In these systems, two or more
processors shared the work to be done. The earliest
versions had a master/slave configuration. One processor
(the master) was programmed to be responsible for all of
the work in the system; the other (the slave) performed
only those tasks it was assigned by the master. This
arrangement was necessary because it was not then
understood how to program the machines so they could
cooperate in managing the resources of the system.
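The division of labor can be sketched in modern terms as one coordinating process handing out tasks to workers that do only what they are assigned. Python's process pool is used here purely as an illustration, and the task itself is a placeholder:

```python
from concurrent.futures import ProcessPoolExecutor

def assigned_task(n):
    # Work performed only because the coordinating process handed it out.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # The main process decides what work exists and assigns each piece;
    # the worker processes do only what they are given.
    with ProcessPoolExecutor(max_workers=2) as workers:
        futures = [workers.submit(assigned_task, n)
                   for n in (100_000, 200_000, 300_000)]
        print([f.result() for f in futures])
```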
Solving these problems led to symmetric multiprocessing
systems (SMP). In an SMP system, each processor is
equally capable and responsible for managing the flow of
work through the system. Initially, the goal was to make
SMP systems appear to programmers to be exactly the
same as single processor, multiprogramming systems.
(This standard of behavior is known as sequential
consistency). However, engineers found that system
performance could be increased by somewhere in the range of 10% to 20% by executing some instructions out of
order and requiring programmers to deal with the
increased complexity. (The problem can become visible
only when two or more programs simultaneously read
and write the same operands; thus the burden of dealing
with the increased complexity falls on only a very few
programmers and then only in very specialized
circumstances.) The question of how SMP machines
should behave on shared data is not yet resolved.
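The kind of trouble described above can be made concrete with a small sketch: two Python processes incrementing one shared counter with no synchronization (the counts are arbitrary). Each increment is a read followed by a write, and when the two interleave on the same operand, updates are lost; this is the situation in which the burden of imposing order falls on the programmer.

```python
from multiprocessing import Process, Value

def bump(shared, n):
    for _ in range(n):
        # Unsynchronized read-modify-write of a shared operand: two processes
        # can read the same old value, and one of the two updates is lost.
        shared.value += 1

if __name__ == "__main__":
    counter = Value('i', 0, lock=False)   # shared integer, deliberately unprotected
    p = Process(target=bump, args=(counter, 100_000))
    q = Process(target=bump, args=(counter, 100_000))
    p.start(); q.start()
    p.join(); q.join()
    print(counter.value)                  # usually well short of 200000
```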
As the number of processors in SMP systems increases,
the time it takes for data to propagate from one part of
the system to all other parts grows also. When the
number of processors is somewhere in the range of
several dozen, the performance benefit of adding more
processors to the system is too small to justify the
additional expense. To get around the problem of long
propagation times, message passing systems were
created. In these systems, programs that share data send
messages to each other to announce that particular
operands have been assigned a new value. Instead of a
broadcast of an operand's new value to all parts of a
system, the new value is communicated only to those
programs that need to know it. Instead of a
shared memory, there is a network to support the
transfer of messages between programs. This
simplification allows hundreds, even thousands, of
processors to work together efficiently in one system. (In
the vernacular of systems architecture, these systems
"scale well.") Hence such systems have been given the
name of massively parallel processing (MPP) systems.
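On a single machine, the pattern can be sketched with Python's multiprocessing queues standing in for the interconnect: one program announces an operand's new value by sending a message, and only the program that needs the value receives it (the operand name and value below are made up for illustration).

```python
from multiprocessing import Process, Queue

def producer(outbox):
    # This program computes a new value for operand "x" and announces it
    # with a message instead of writing to a shared memory.
    outbox.put(("x", 42))
    outbox.put(None)                  # sentinel: no further updates

def consumer(inbox):
    # This program needs to know about "x", so the update is sent to it;
    # programs that do not care about "x" never see the message.
    while True:
        message = inbox.get()
        if message is None:
            break
        operand, value = message
        print(f"received update: {operand} = {value}")

if __name__ == "__main__":
    network = Queue()                 # stands in for the message-passing network
    p = Process(target=producer, args=(network,))
    c = Process(target=consumer, args=(network,))
    p.start(); c.start()
    p.join(); c.join()
```

Nothing in the sketch requires a shared memory; the queue could just as well be a network connection between machines, which is what lets the approach scale.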
The most successful MPP applications have been for
problems that can be broken down into many separate,
independent operations on vast quantities of data. In data
mining, there is a need to perform multiple searches of a
static database. In artificial intelligence, there is the need
to analyze multiple alternatives, as in a chess game. Often
MPP systems are structured as clusters of processors.
Within each cluster the processors interact as in an SMP
system. It is only between the clusters that messages are
passed. Because operands may be addressed either via
messages or via memory addresses, some MPP systems
are called NUMA machines, for Non-Uniform Memory Access.
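A toy version of that kind of workload, under the assumption that the data can be split into independent partitions: each worker searches its own slice, and the partial results are combined at the end (the data and the search predicate below are invented for illustration).

```python
from multiprocessing import Pool

def search_chunk(chunk):
    # Each worker scans its own partition of the data independently;
    # no partition depends on any other.
    return [x for x in chunk if x % 99_991 == 0]

if __name__ == "__main__":
    data = list(range(1_000_000))
    partitions = [data[i::8] for i in range(8)]      # eight independent slices
    with Pool(processes=8) as pool:
        partial = pool.map(search_chunk, partitions) # searches run in parallel
    hits = sorted(x for part in partial for x in part)
    print(hits[:5])
```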
SMP machines are relatively simple to program; MPP
machines are not. SMP machines do well on all types of
problems, provided the amount of data involved is not too large. For certain problems, such as data mining of vast databases, only MPP systems will serve.