Abstract:
|
Data analysis algorithms increasingly must run at the same scale at which data are produced, and moreover at the same time and place. This is because the costs of moving and storing data, in terms of time, power, and capacity, overshadow all other costs at the largest data sizes. For scientific simulations and experiments, this translates to computing statistical analyses in situ on high-performance supercomputing platforms. I will outline our infrastructure for writing highly parallel distributed-memory algorithms, then give a few examples of algorithms built this way and of science applications that use them.
|