Online Program


All Times EDT

Thursday, June 4
Data Visualization
Divide and Recombine for Big Data Analysis and Visualization
Thu, Jun 4, 1:20 PM - 2:55 PM
TBD
 

Rethinking Climate Data Analysis and Visualization in the Era of Big Data (308169)

*Wen-wen Tung, Purdue University 

Keywords: big data, data analysis methods, visualization, distributed parallel computing, environmental data

Climate change, growth in the world population, and extreme weather are increasingly taxing Earth's limited natural resources and exacerbating weather and climate risks, especially to vulnerable populations. Solutions to these complex global issues demand an interdisciplinary approach. Scientists need to create a new paradigm, one that harnesses the power of data science, for understanding and predicting these issues.

In this talk, we demonstrate how data science has transformed climate data analysis through big data analysis of surface precipitation associated with atmospheric rivers (ARs) over the North American West Coast and the Midwest. Atmospheric rivers, long, narrow filaments of enhanced water vapor transport in the lower troposphere, are known to accompany extreme rain and winds. They are important weather systems for US water resources on the West Coast and in the Midwest.

Currently, there are many AR "detection" algorithms for creating AR indices. It has been presumed that, once an AR index is obtained, one can study all aspects of the downstream impacts of atmospheric rivers. We instead take a solution-driven approach: we first ask which impacts, in which region, and on what time scale and over what period are of concern. Then, we use an algorithm combining climatological significant- or extreme-event criteria, image processing, and statistical methods to create an ensemble of O(100) AR indices for answering these questions with detailed visualization.

This approach is made possible by distributed parallel computing on the data and, specifically, by the divide and recombine approach using the R-based DeltaRho (http://deltarho.org) backed by a Hadoop system. We will discuss the procedures and outcomes of creating and selecting AR indices for studying the hydro-meteorological impacts of ARs making landfall on the North American West Coast and in the central US, using ten years of NASA MERRA2 data from 2006 to 2015.
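
For readers unfamiliar with divide and recombine, the general pattern can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use DeltaRho or Hadoop, the function names (divide, apply_to_subsets, recombine) are hypothetical, and the toy precipitation records are made up for the example rather than taken from MERRA2.

```python
from collections import defaultdict
from statistics import mean


def divide(records, key):
    """Divide: split the records into subsets by a conditioning variable."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[key(rec)].append(rec)
    return dict(subsets)


def apply_to_subsets(subsets, analytic):
    """Apply the same analytic method to each subset independently.

    In a distributed setting, this is the step that runs in parallel.
    """
    return {k: analytic(v) for k, v in subsets.items()}


def recombine(results):
    """Recombine: merge the per-subset outputs (here, a simple average)."""
    return mean(results.values())


# Hypothetical toy records: (year, daily precipitation in mm).
records = [(2006, 2.0), (2006, 4.0), (2007, 6.0), (2007, 10.0)]

by_year = divide(records, key=lambda r: r[0])
yearly_mean = apply_to_subsets(by_year, lambda recs: mean(p for _, p in recs))
overall = recombine(yearly_mean)
```

Here the data are divided by year, a mean is computed within each year, and the yearly means are averaged. In practice the division variable, the per-subset method, and the recombination rule are all chosen to suit the question being asked.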