Aim: Citizen science data are increasingly used for modelling species distributions because they offer broad spatiotemporal coverage of local observations. However, such data are often collected without experimental design or set survey methods, raising the risk that bias and noise will compromise modelled predictions. We tested whether species distribution models (SDMs) built from these low-structure citizen science data can match the quality of SDMs built from systematically collected data, and whether stringent data filtering improves predictions.
Location: Northeastern USA.
Methods: We evaluated models built from eBird, a rapidly growing dataset of avian occurrences reported by birders, against models built from four independent, systematically collected datasets. We developed SDMs for 96 species using both data sources and compared their predictive abilities. We also tested whether culling eBird data by applying stringent filters on survey effort or observer expertise improved predictions.
Results: SDMs built from low-structure citizen science data matched or exceeded the performance of SDMs from systematically collected datasets for 12%–31% of species (mean = 22%), depending on the dataset. At least one culling option produced equivalent or better performance for 40%–70% of species (mean = 49%). Culling by restricting survey effort improved predictions more than culling by observer expertise. The optimal effort restriction differed by dataset, and for three of the datasets was further informed by species traits.
Main conclusions: Species distribution models developed using low-structure citizen science data sometimes performed as well as those from systematic data. Culling generally improved models, but results were heterogeneous, precluding clear recommendations for how to cull. Our results indicate that the growing availability of citizen science data holds potential for creating high-quality spatial predictions, but that time should be invested in determining how best to cull datasets and that one-size-fits-all solutions beyond basic outlier filtering may be hard to find.