Abstract Details

Activity Number: 644 - Statistical Computing on Parallel Architectures
Type: Invited
Date/Time: Thursday, August 2, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract #326500
Title: Success with OpenMP in R Package data.table
Author(s): Matt Dowle*

Matt will share his positive experience of parallelizing C code with OpenMP in the R package data.table. He will cover several tasks that are complete and released to CRAN: fwrite, fread, sort and shuffle. The focus will be on general techniques (e.g. OpenMP's ordered clause) that a wider audience can apply in their own fields. The examples will be from R, but the same principles apply in Python, Julia and any environment where OpenMP can be used at C level. Problems overcome will include: how to halt with an error from within a thread, given that raising the error is not thread safe; how to reorder a character column in parallel even though the R API function SET_STRING_ELT() is not thread safe; and how to reason about and tackle the fact that even on a server with 32 CPUs we still typically have only 32 KB of L1D, a mere 512 64-byte cache lines per thread. The talk will contain OpenMP example code and one or two references to Ulrich Drepper's 2007 paper, "What Every Programmer Should Know About Memory".
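
As an illustration of two of the techniques named in the abstract (the ordered clause, and reporting an error that cannot safely be raised from inside a thread), here is a minimal sketch in C. It is not data.table's code: the names write_column and format_chunk, the fixed 4096-byte buffer and the "%.3f" format are all assumptions made for the example. Each chunk of rows is formatted in parallel into a thread-private buffer, the buffers are written out in iteration order inside an ordered region, and any failure is recorded in a flag and reported only after the parallel loop has finished.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format rows [start, end) of x into buf, one value per
 * line. Returns the number of bytes used, or -1 if buf is too small. */
static int format_chunk(const double *x, int start, int end,
                        char *buf, size_t cap)
{
    size_t used = 0;
    for (int i = start; i < end; i++) {
        int n = snprintf(buf + used, cap - used, "%.3f\n", x[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;              /* formatting error or output truncated */
        used += (size_t)n;
    }
    return (int)used;
}

/* Hypothetical writer: returns 0 on success, -1 if any chunk failed.
 * Compile with OpenMP enabled (e.g. -fopenmp) to run the loop in parallel;
 * without it the pragmas are ignored and the loop runs serially. */
int write_column(const double *x, int n, int chunk_rows, FILE *out)
{
    int nchunks = (n + chunk_rows - 1) / chunk_rows;
    bool failed = false;            /* shared; only touched inside 'ordered' */

    #pragma omp parallel for ordered schedule(dynamic)
    for (int c = 0; c < nchunks; c++) {
        char buf[4096];             /* private to this iteration */
        int start = c * chunk_rows;
        int end = start + chunk_rows < n ? start + chunk_rows : n;

        /* The expensive part: runs concurrently across chunks. */
        int len = format_chunk(x, start, end, buf, sizeof buf);

        /* The cheap part: runs one iteration at a time, in loop order,
         * so the output stays in row order and 'failed' is not raced on. */
        #pragma omp ordered
        {
            if (len < 0)
                failed = true;      /* record the problem; do not raise here */
            else if (!failed &&
                     fwrite(buf, 1, (size_t)len, out) != (size_t)len)
                failed = true;
        }
    }

    /* Back on the calling thread, outside the parallel region. In an R
     * package, this is where the caller could safely raise the R error. */
    return failed ? -1 : 0;
}
```

Calling write_column(x, 5, 2, stdout) on five values would format three chunks, possibly on different threads, yet print the rows in their original order. The design choice in this sketch is to keep everything that must be serial (the write and the error flag) inside the ordered region and everything expensive outside it.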

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program