Keywords: Data Verbs; Grammar of Data; Curriculum
Wickham (2014) introduced fundamental data verbs for manipulating data. These verbs evolved into a grammar of data manipulation and led to the development of the widely used dplyr package in R. This work shares our experiences teaching these verbs throughout an entire undergraduate data science curriculum. The data verbs are 1) initially introduced in a non-coding environment, 2) repeatedly emphasized in various coding environments, e.g., dplyr, pandas, and SQL, and 3) essential to the teaching of distributed computing via pyspark. The deliberate use of a common set of data verbs across a variety of coding environments has given our students more confidence and greater ability to manage data.
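As a rough illustration of what a common set of data verbs can look like independent of any one package, the sketch below writes five verbs as plain Python functions over a list of dictionaries. The verb names are borrowed from dplyr (filter, select, mutate, arrange, summarize) as an assumption about which verbs are meant; the function names, signatures, and toy data are hypothetical and are not the API of dplyr, pandas, SQL, or pyspark.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
rows = [
    {"student": "A", "course": "stats", "score": 90},
    {"student": "B", "course": "stats", "score": 70},
    {"student": "C", "course": "cs", "score": 80},
    {"student": "D", "course": "cs", "score": 60},
]

def filter_rows(data, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [r for r in data if predicate(r)]

def select(data, *columns):
    """Keep only the named columns of every row."""
    return [{c: r[c] for c in columns} for r in data]

def mutate(data, **new_columns):
    """Add columns, each computed from the original row by a function."""
    return [{**r, **{name: fn(r) for name, fn in new_columns.items()}}
            for r in data]

def arrange(data, key, descending=False):
    """Sort rows by the value in one column."""
    return sorted(data, key=lambda r: r[key], reverse=descending)

def summarize(data, by, column, fn):
    """Group rows by one column and reduce another column within each group."""
    groups = defaultdict(list)
    for r in data:
        groups[r[by]].append(r[column])
    return {k: fn(v) for k, v in groups.items()}

# Chain the verbs: keep scores of at least 70, add a proportion column,
# keep two columns, and sort from highest to lowest.
passed = filter_rows(rows, lambda r: r["score"] >= 70)
scaled = mutate(passed, pct=lambda r: r["score"] / 100)
ranked = arrange(select(scaled, "student", "pct"), "pct", descending=True)

# Grouped summary: mean score per course.
mean_by_course = summarize(rows, "course", "score", lambda v: sum(v) / len(v))
```

The same sequence of steps maps onto each coding environment with only the syntax changing, for example `filter()` in dplyr, boolean indexing in pandas, and a `WHERE` clause in SQL.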