Community detection in networks has been studied extensively as the problem of finding a single partition into a “correct” number of communities. In large networks, however, a multi-scale hierarchy of communities is much more realistic. We show that a hierarchical tree of communities, in addition to being more interpretable, is also potentially more accurate and more computationally efficient. We construct this tree with a simple top-down recursive algorithm, at each step splitting the nodes into two communities with a non-iterative spectral algorithm, until a stopping rule suggests there are no more communities. The algorithm is model-free, extremely fast, and requires no tuning other than selecting a stopping rule. We propose a natural model for this setting, a binary tree stochastic block model, and prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. As a by-product, we obtain explicit and intuitive results for fitting the stochastic block model under model misspecification. We illustrate the algorithm on a dataset of statistics papers, constructing a highly interpretable tree of statistics research communities.
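The top-down recursion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the split uses the sign of the adjacency-matrix eigenvector whose eigenvalue is second largest in magnitude (one possible non-iterative spectral split), and `eigen_stop` is a hypothetical placeholder stopping rule chosen only to keep the example self-contained. The names `split_by_sign`, `eigen_stop`, and `recursive_bipartition` are likewise illustrative.

```python
import numpy as np


def split_by_sign(A):
    """Bipartition nodes by the sign of one eigenvector of A (no iteration).

    Assumption: we use the eigenvector whose eigenvalue is second largest
    in absolute value; other spectral splits could be swapped in here.
    """
    vals, vecs = np.linalg.eigh(A)        # A is symmetric
    order = np.argsort(-np.abs(vals))     # indices sorted by |eigenvalue|
    v = vecs[:, order[1]]
    return v >= 0                         # boolean mask: True = first child


def eigen_stop(A, min_size=4):
    """Hypothetical placeholder stopping rule, not the rule from the paper.

    Stops when the block is tiny or when the second-largest |eigenvalue|
    does not exceed the square root of the average degree.
    """
    n = A.shape[0]
    if n < min_size:
        return True
    mags = np.sort(np.abs(np.linalg.eigvalsh(A)))[::-1]
    avg_degree = A.sum() / n
    return mags[1] <= np.sqrt(avg_degree)


def recursive_bipartition(A, stop=eigen_stop):
    """Return {node: path}, where path is a string of 0/1 tree branches."""
    A = np.asarray(A, dtype=float)
    labels = {}

    def recurse(nodes, path):
        sub = A[np.ix_(nodes, nodes)]     # induced subgraph on `nodes`
        if len(nodes) < 2 or stop(sub):
            for node in nodes:
                labels[int(node)] = path  # leaf community
            return
        mask = split_by_sign(sub)
        if mask.all() or not mask.any():  # degenerate split: treat as leaf
            for node in nodes:
                labels[int(node)] = path
            return
        recurse(nodes[mask], path + "0")
        recurse(nodes[~mask], path + "1")

    recurse(np.arange(A.shape[0]), "")
    return labels


# Toy graph: two 4-node cliques joined by a single edge between nodes 3 and 4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1

labels = recursive_bipartition(A)
print(labels)
```

On this toy graph the two cliques end up in different leaves after a single split. Which clique receives "0" and which receives "1" depends on the arbitrary sign of the computed eigenvector, so only the grouping is meaningful. Passing the stopping rule as a callable mirrors the statement that it is the only tuning choice.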