Abstract:
|
Computation-intensive methods are conventionally handled by parallel computing on computer clusters or supercomputers. Two new trends have emerged in the past 12 years. The first is to make parallel computing more accessible: thanks to the ubiquity of multicore CPUs, statistical languages such as R now have packages that enable parallel computing on a laptop. The second is to leverage the Graphics Processing Unit (GPU). A GPU can have thousands of computing units and is efficient for vector computation. We use two examples to illustrate these two trends. The first comes from late-stage clinical data analysis, which is computation intensive because the bootstrap must be used to estimate the confidence interval of a key parameter. An existing R package allowed us to bootstrap in parallel on a laptop, reducing the computing time from 3 hours to 1. The second example comes from drug discovery, where a deep neural network was used to predict compounds' efficacy and safety from their chemical structures. The model has millions of parameters to estimate. By using a GPU, the model training time was reduced from 4 days on a powerful desktop to 2 hours.
|