Real-time reactor power monitoring is critical for a variety of nuclear applications, spanning safety, security, operations, and maintenance. While machine learning methods have shown promise in monitoring reactor power levels, there is limited research on their efficacy in label-starved environments. The goal of this work is to assess the feasibility of classifying nuclear reactor power level from multisource data when few labels are available. Data were collected using low-resolution multisensors at four nuclear reactor facilities: two large research reactors and two TRIGA reactors. Within each pair, one reactor dataset served as the source and the other as the target in a transfer learning paradigm. Twenty-three supervised models were trained on labeled sequences of magnetic field and acceleration data from each of the target sites. Self-learning and transfer learning methods were then applied to the top-performing models to assess their classification performance with increasing amounts of labeled data. With only 400 sequences per power state, reactor power level classification reached a Matthews correlation coefficient (MCC) of up to 0.739 ± 0.003 for the large research reactor target site and 0.622 ± 0.009 for the TRIGA target site. However, self-learning and transfer learning leveraging source site data did not improve target classification performance. These findings suggest that alternative approaches, such as higher-sensitivity sensors, digital twins, or physics-informed models, are required to enable high-performance machine learning classification for reactor monitoring when target ground truth is scarce.
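
For readers unfamiliar with the Matthews correlation coefficient reported above, the following is a minimal sketch of how it can be computed for a multiclass problem such as power state classification. It uses the standard multiclass generalization of the coefficient; the function name `matthews_corrcoef` and the power state labels (`"off"`, `"low"`, `"high"`) are illustrative assumptions, not details taken from this work, and the abstract does not state which implementation was used.

```python
from collections import Counter
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    Illustrative helper (hypothetical, not from the paper). Returns a value
    in [-1, 1]; 1 is perfect agreement and 0 is chance-level agreement.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)                                       # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))       # correct predictions
    t_k = Counter(y_true)                                 # true count per class
    p_k = Counter(y_pred)                                 # predicted count per class
    labels = set(t_k) | set(p_k)

    # Covariance-style terms of the multiclass formula.
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_k.values())
    cov_pp = s * s - sum(v * v for v in p_k.values())

    denom = sqrt(cov_tt * cov_pp)
    # Convention: return 0.0 when the coefficient is undefined
    # (for example, when every prediction is the same class).
    return cov_tp / denom if denom else 0.0


# Hypothetical power state labels, for illustration only.
truth = ["off", "low", "high", "off", "low", "high"]
preds = ["off", "low", "high", "off", "high", "high"]
score = matthews_corrcoef(truth, preds)
```

Unlike plain accuracy, this coefficient accounts for how often each class occurs in both the labels and the predictions, so a classifier that always predicts a single power state scores 0 regardless of class balance.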