Abstract:
|
One important research area in high-performance computing (HPC) is the management of performance variability, which is affected by complicated interactions among numerous factors, such as CPU frequency, the number of I/O threads, file size, and record size. In this paper, we are interested in I/O variability, which is measured by the I/O throughput that varies from run to run. Given a system configuration, one way to describe the variability is to use the cumulative distribution function (cdf) of the throughputs. Predicting the cdf of throughputs for a new system configuration is often of interest, but it is a challenging task. To overcome this challenge, computer scientists conducted large-scale experiments and collected a large amount of data on the distribution of variability under various system configurations. We develop a Gaussian process model to predict the cdf under a new configuration using the experimental data. We also impose a monotonicity constraint so that the predicted function is monotonically increasing, which is a desired property of a cdf. We evaluate the performance of the proposed method using the experimental data. We also discuss the methodology.
|
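To make the abstract's approach concrete, here is a minimal sketch, not the paper's exact constrained GP: it assumes a squared-exponential GP regressed on configuration-plus-throughput inputs, with monotonicity enforced afterwards by an isotonic projection of the predicted curve. All variable names, features, and data below are hypothetical placeholders.

```python
# A minimal sketch (assumed, not the paper's method): fit a GP to empirical
# cdf values observed at training configurations, predict the cdf curve for
# a new configuration, then project onto monotone functions in [0, 1].
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Hypothetical training data: each row is (cpu_freq, io_threads, file_size,
# record_size, throughput t), all rescaled to [0, 1]; the response is the
# empirical cdf F(t) at that configuration, estimated from repeated runs.
X_train = rng.uniform(0.0, 1.0, size=(200, 5))
y_train = np.clip(X_train[:, 4] + 0.05 * rng.standard_normal(200), 0.0, 1.0)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-3)
gp.fit(X_train, y_train)

# Predict the cdf over a throughput grid for one new configuration.
t_grid = np.linspace(0.0, 1.0, 50)
config = np.array([0.4, 0.7, 0.3, 0.6])  # hypothetical new configuration
X_new = np.column_stack([np.tile(config, (t_grid.size, 1)), t_grid])
f_hat = gp.predict(X_new)

# Enforce the monotone increasing, [0, 1]-bounded shape a cdf must have.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True)
cdf_hat = iso.fit_transform(t_grid, f_hat)
```

The post-hoc isotonic step stands in for the paper's monotone constraint; a constrained GP would instead build monotonicity into the model itself, but the projection keeps the sketch short and self-contained.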