Abstract:
|
Matt will share his positive experience of parallelizing C code using OpenMP in the R package data.table. He will cover several tasks that are complete and released to CRAN: fwrite, fread, sort and shuffle. The focus will be on the general techniques used (e.g. OpenMP's ordered clause), which a wider audience may apply in their own fields. The examples will be from R, but the same principles apply in Python, Julia and any environment where OpenMP can be used at C level. Problems overcome will include: how to halt with an error from within a thread (which is not thread safe); how to reorder a character column in parallel even though the R API function SET_STRING_ELT() is not thread safe; and how to reason about and tackle the fact that even on a server with 32 CPUs we still typically only have 32KB of L1D, a mere 16 cache lines per thread. The talk will contain OpenMP example code and one or two references to Ulrich Drepper's 2007 paper, "What Every Programmer Should Know About Memory".
|