Design and analysis of cluster randomized trials must take into account correlation among outcomes from the same clusters. When applying generalized estimating equations (GEE), the first-order (e.g. treatment) effects can be estimated consistently even with a misspecified covariance structure. In settings for which the correlation is of interest, one could estimate this quantity via second-order generalized estimating equations (GEE2). However, this procedure is (1) biased under informative missing data and (2) computationally infeasible as cluster sizes grow. We first introduce a stochastic variant to fitting GEE2 which alleviates reliance on parameter starting values and provides substantially faster speeds and higher convergence rates than the standard Newton-Raphson method. Under certain conditions, the proposed method can reduce time complexity from O(n^2) to O(n), where n is the maximum cluster size. Secondly, we extend the missing data framework to GEE2, for which we incorporate a "second-order" inverse-probability weighting (IPW) and "second-order" doubly robust (DR) schemes. We highlight the need to model correlation among missingness indicators in such settings.