Urban mass transit systems generate large volumes of data through automated systems that record smart card transactions, signaling, and operations. There is considerable motivation within the transit industry to harness these data for performance analytics and real-time prediction. In this paper we compare machine learning (ML) and semiparametric (SP) regression techniques for analyses of travel times and flows across different lines, times of day, and operating conditions. The ML algorithms considered include Regression Trees, Kalman filters, Random Vector Functional Link Networks, Fuzzy Descriptive Logic, and Random Forests. Aspects of system performance to be modeled include travel times, reliability, crowding, capacity utilization, and passenger flows. The computational expense of the ML and SP methods is assessed to facilitate the choice of appropriate methods for real-time prediction. The results can help improve the planning and implementation of transit services and reduce the potential for mismatch between passenger flows and the capacity supplied under varying travel conditions.