All Times EDT

Abstract Details

Activity Number: 587 - Ocean Statistical Methodology and Application
Type: Topic Contributed
Date/Time: Thursday, August 6, 2020, 3:00 PM to 4:50 PM
Sponsor: Section on Statistics and the Environment
Abstract #313657
Title: Marine Data Mining with CMAP Using R
Author(s): Aditya Mishra* and Christian L. Müller
Companies: Flatiron Institute and Flatiron Institute, Simons Foundation
Keywords: marine data; robust regression; regularization; multivariate analysis
Abstract:

Recent advances in experimental techniques and scientific instruments have enabled the collection of biological, biogeochemical, and imaging data of the ocean on a global scale. The Simons CMAP, a large-scale open-access marine database currently under development, hosts a multitude of such marine datasets, including remote-sensing satellite observations, large-scale integrated in-situ biogeochemical cruise measurements, amplicon sequencing data, and complex synthetic ocean simulation data. To give statisticians and data scientists easy access to these rich datasets, we have developed cmap4r, an R package for downloading, analyzing, and visualizing datasets from the Simons CMAP in a fast and structured manner. Integrated analysis of marine data is challenging due to several factors, including the presence of outliers, missing entries, differing spatial and temporal resolutions, spatiotemporal dependencies, high dimensionality, and, for amplicon sequencing data, the absence of absolute species abundance measurements due to experimental limitations. These challenges present a unique opportunity for both the development and the application of novel statistical methods for marine data.


Authors who are presenting talks have a * after their name.
