Statistical analysis of spatio-temporal data has evolved over time to handle increasingly large data sets. E.g., the North American CORDEX program is producing daily values of climate-related variables on spatial grids with approximately 100,000 locations over 150 years. In order to perform a functional data analysis of this type of data, one must represent the discretely observed process as a function. Traditional tensor-based methods for representing functional data can quickly break down under the size of such massive data. We propose a penalized spline method for representing continuous, smoothly varying spatio-temporal functional data using a generalization of the sandwich smoother proposed Xiao et al. (2013). Unlike the original method, the generalization treats the spatial and temporal dimensions distinctly and can be applied to non-gridded data while retaining computational efficiency. A simulation study is used to demonstrate the accuracy of the methodology. The methodology is then applied to the North American CORDEX data, in which nearly 5.5 billion data values are smoothed in a timely fashion on a basic personal computer.