Abstract:
|
Open-source software (OSS) has a long and rich history in academia and research. Over the last twenty years, it has moved beyond internet-based startup companies to running the New York Stock Exchange. Statistical production environments have also adopted open-source languages, such as R and Python, and open-source big data processing engines, such as Hadoop and Apache Spark. Adopting this software provides access to a large number of state-of-the-art statistical techniques and data processing methods that are not necessarily available in commercial software. However, adopting OSS comes with its own risks. This roundtable will discuss the benefits and challenges of adopting, integrating, and maintaining OSS in statistical production environments. Use cases of R in production will be presented, and some potential solutions will be examined.
|