Next: 9.1 Future Work Up: Using Parallel Program Characteristics Previous: 8 Combining Work and

9 Conclusions and Future Research

In this paper we examine the problem of processor allocation in multiprogrammed multiprocessors. We intentionally use relatively simple models of a parallel program's execution, the workload, and the system, in order to gain a first-order understanding of how job characteristics might be used to make better processor allocation decisions. The experiments include a wide range of workload characteristics, including large variations in the amount of work jobs execute as well as jobs that execute work at different rates.

Using our job, system, and workload models we make the following observations:

Policies that use only information about the amount of work each job executes do not reduce mean response time when compared with an equipartition policy unless the average efficiency of the jobs is relatively high. The size of the improvements observed by making partitioning decisions based on the amount of work each job executes increases with the average efficiency of the jobs, the coefficient of variation of the work, and the system load. Improvements are maximized and are substantial when all jobs are perfectly efficient.
Policies that use only information about the efficiency with which a job executes can provide small but statistically significant reductions in mean response time when compared with equipartition, if extreme partitioning policies are avoided. With the particular workload used in these experiments we found that using a value of yielded reductions of up to 14% in some instances. However, these policies are unable to produce further improvements because they fail to consider the amount of remaining work each job has to execute. At first this conclusion might appear to conflict with results from recent work by Nguyen, Vaswani, and Zahorjan [25]. However, their significant performance gains are obtained under workloads for which equipartition can naively allocate too many processors (thus actually increasing the execution time of applications). If similar workload models were used here we would see similarly large improvements.
Under the execution rate function used in this paper, policies that make processor partitioning decisions using only information about the knee of the execution time - efficiency profile of each job provide the same advantages and disadvantages of policies that consider only the efficiency of a job.
Policies that consider both a job's remaining work and a job's efficiency when making processor allocation decisions can yield substantial improvements when compared with the equipartition policy. However, the improvements are limited by the coefficient of variation of the work jobs execute, the average efficiency with which jobs execute their work, and the system load. Our new algorithm is based on using characteristics about the amount of work a job executes and the efficiency with which it can be executed. Although such information may not be readily available we believe that it is important to understand the properties of effective processor allocation policies in order to develop more effective, practical scheduling algorithms.
We demonstrate that it is not desirable to blindly equipartition processors, and that considerable improvement in mean response time can be obtained by using information about a job's work and efficiency. However, the performance benefits obtained by using partitioning techniques based on job characteristics is likely bounded and equipartitioning processors is a compromise that is safe in the absence of such characteristics. Equipartition is also likely to remain the practical algorithm of choice until more studies of multiprocessor workloads are performed and better techniques for obtaining job characteristics are developed.

While true evidence of the applicability of our results requires an experimental evaluation, we believe that the insights from our work can be used to guide the implementation of scheduling algorithms in production multiprogrammed multiprocessors. Nguyen, Vaswani and Zahorjan [25, 24] have shown experimentally that estimates of a job's efficiency can be obtained and used effectively at run-time, and that the cost of reallocating processor in response to changes in job and workload characteristics is not prohibitive. Leland and Ott [17] point out, and Harchol-Balter and Downey confirm their study using modern UNIX workloads [14], that under sequential UNIX workloads there is a strong correlation between the length of time a job has been executing and its expected remaining execution time. This observation is part of the inspiration behind the multilevel feedback queue schedulers currently used on many UNIX systems [2] and we expect that similar observations will hold for multiprocessor workloads.

These previous studies demonstrate that it is possible to obtain estimates of a job's efficiency and of its remaining work. They also provide evidence that these estimates are reasonably accurate and that the estimates can be applied efficiently. We believe that such estimates can be combined with the scheduling techniques outlined in our algorithms and that this combination can be used to reduce mean response time when compared with existing multiprogrammed multiprocessor scheduling algorithms.

9.1 Future Work

Next: 9.1 Future Work Up: Using Parallel Program Characteristics Previous: 8 Combining Work and

Tim Brecht
Thu Jul 24 14:20:29 EDT 1997