next up previous
Next: 6 Considering Efficiency Up: Using Parallel Program Characteristics Previous: The Generalized Algorithm:

5 Considering Work

 

We begin by considering jobs that execute with perfect efficiency. This is an admittedly unrealistic assumption, but one that is useful in understanding and quantifying the importance of the service demand characteristic of a job (i.e., its work). This is done by comparing a spectrum of allocation algorithms that adaptively reallocate processing power to jobs according to their remaining work, tex2html_wrap_inline825 . The basis for comparison is equipartition (i.e., tex2html_wrap_inline785 ). Since jobs execute with perfect efficiency we use tex2html_wrap_inline933 in order to determine processor allocations and examine a range of tex2html_wrap_inline779 values.

The results shown in Figure 2 have been obtained by simulating a system with P = 100 processors, using a mean work requirement of tex2html_wrap_inline893 , and the coefficient of variation of work being tex2html_wrap_inline979 . The graph plots the mean response time as a function of the scheduling policy ( tex2html_wrap_inline779 value) and shows results for loads (mean processor utilizations) of 30%, 50%, 70%, and 90%.

   figure182
Figure 2: Mean response time vs. scheduling algorithm ( tex2html_wrap_inline779 ); tex2html_wrap_inline985 , perfectly efficient jobs

We first note, as expected, that the mean response times, tex2html_wrap_inline987 , obtained with positive values of tex2html_wrap_inline779 are very large when compared with values of tex2html_wrap_inline779 less than or equal to zero. This is because positive values of tex2html_wrap_inline779 will assign a larger portion of processors to larger jobs (i.e., jobs with larger tex2html_wrap_inline825 ). The larger the tex2html_wrap_inline779 value, the larger the portion of processors assigned to the largest job. With large enough tex2html_wrap_inline779 , jobs are essentially executed in a Most Remaining Work First (MRWF) fashion. We include positive tex2html_wrap_inline779 values in this experiment mainly to point out that MRWF is not pessimal. We see that in this experiment, with a load of 90%, tex2html_wrap_inline987 is maximized (there is a peak in the curve) when tex2html_wrap_inline1005 . This is because the mean response time of a MRWF first policy can be increased by introducing processor sharing among the largest jobs, thus increasing the mean response time.

We also note that the difference in mean response time across different scheduling policies ( tex2html_wrap_inline779 values) decreases as the load decreases. This is because as the load gets lighter the degree of multiprogramming decreases. Under extremely low loads there will only be one job in the system at any point in time, in which case all of the allocation policies behave in the same fashion (i.e., the job is allocated all P processors).

In Figure 3 we more closely examine the range tex2html_wrap_inline1011 . This graph has been produced by running experiments using a load of 90% and increasing the coefficient of variation of work. We consider tex2html_wrap_inline1013 . In each case, as tex2html_wrap_inline779 increases in the negative direction the mean response time decreases. This is not surprising since (as mentioned previously) all jobs execute with perfect efficiency and a Least Remaining Work First (LRWF) policy is optimal under these conditions [32, 3]. As well, as the coefficient of variation increases so does the difference between tex2html_wrap_inline785 and tex2html_wrap_inline787 , demonstrating the increased benefits of knowing and appropriately using the job characteristic tex2html_wrap_inline825 . We also point out that the shape of the curves becomes sharper with increases in tex2html_wrap_inline789 . This is because with a small variation in tex2html_wrap_inline825 , higher values of tex2html_wrap_inline779 are required in order to approximate a LRWF policy, while with larger variations in tex2html_wrap_inline825 smaller values of tex2html_wrap_inline779 can be used to approximate LRWF.

   figure194
Figure 3: Mean response time vs. scheduling algorithm ( tex2html_wrap_inline779 ); load=90%, perfectly efficient jobs

In Table 1 we provide a rough quantification of the performance gains that can be achieved by knowing and utilizing tex2html_wrap_inline825 when jobs are perfectly efficient and the load is 90%. This table contains one column for the mean response times and 90% confidence intervals for the two policies tex2html_wrap_inline785 and tex2html_wrap_inline787 , as well as one column showing the percentage improvement that is obtained by using tex2html_wrap_inline787 instead of tex2html_wrap_inline785 . It shows that for tex2html_wrap_inline979 an improvement of roughly 64% can be expected and that the improvements likely increase with tex2html_wrap_inline789 (note the larger confidence intervals for larger tex2html_wrap_inline789 ).

This rough quantification is of interest because it provides an approximate bound on the reduction in mean response time that can be obtained by using tex2html_wrap_inline825 (or tex2html_wrap_inline777 ). This is because workloads with less efficient jobs will receive less benefit from allocating all processors to one job at a time (assuming a non-decreasing execution rate function), thus reducing the difference between an equipartition and an optimal policy. As well, previous analytic comparisons between LRWF and equipartition policies [3] show that the reduction in mean response time obtained when all jobs arrive simultaneously can be as large as 50% and that the difference can grow with the number of jobs (i.e., it is not competitive) when new arrivals are considered. Since this previous result assumes that arrival times and job sizes are designed to demonstrate maximum differences between the two policies, our comparison is important because it provides more reasonable bounds on the performance improvements that might be expected.

   table203
Table 1: Comparing mean response times for tex2html_wrap_inline785 and tex2html_wrap_inline787 for different tex2html_wrap_inline789 ; load = 90%

These same experiments have also been conducted under the assumption that the work, tex2html_wrap_inline825 , is not known a priori (see [12] for details). By keeping track of each job's accumulated execution time and using that as an estimate of a job's remaining work, influenced by studies performed by Leland and Ott on uniprocessor systems [17] (a portion of this study was recently confirmed for modern workloads by Harchol-Balter and Downey [14]), the response time is significantly reduced when compared with equipartition, although the reductions are not as large as when tex2html_wrap_inline825 is known.

We now explore the consequences of using only information about a job's remaining work, tex2html_wrap_inline825 , if jobs do not execute with perfect efficiency. In this experiment we again choose tex2html_wrap_inline825 with a mean of tex2html_wrap_inline893 and tex2html_wrap_inline1013 . Now instead of having all jobs execute with perfect efficiency, each job executes work at the rate defined by the execution rate function and the parameter tex2html_wrap_inline777 ( tex2html_wrap_inline849 ). In the following experiments we generate tex2html_wrap_inline849 values uniformly between tex2html_wrap_inline901 and tex2html_wrap_inline903 and consider different workloads with different tex2html_wrap_inline901 and tex2html_wrap_inline903 values.

The graph on the left side of Figure 4 shows that for workloads with relatively low average efficiency ( tex2html_wrap_inline1097 ), using only tex2html_wrap_inline825 to make scheduling decisions leads to increased mean response times (versus equipartitioning processors, tex2html_wrap_inline785 ). This is because these policies ( tex2html_wrap_inline1103 ) are allocating larger portions of the processors to small jobs even though they may not be capable of utilizing them effectively, and are therefore essentially wasting processing power. However, the graph on the right side of Figure 4 shows that for workloads with high average efficiency ( tex2html_wrap_inline1105 ), using only tex2html_wrap_inline825 to make scheduling decisions can decrease mean response times (versus using no job characteristics and equipartitioning processors, tex2html_wrap_inline785 ). This is because, under this workload, the average efficiency of jobs is high enough that the benefits of allocating large portions of the processors to these jobs (reduced response time) outweigh the costs (under-utilized processors).

   figure225
Figure 4: Mean response time vs. scheduling algorithm ( tex2html_wrap_inline779 ); load=90%, jobs are not perfectly efficient


next up previous
Next: 6 Considering Efficiency Up: Using Parallel Program Characteristics Previous: The Generalized Algorithm:

Tim Brecht
Thu Jul 24 14:20:29 EDT 1997