The Listening Sentence Span Task is a widely used measure of working memory capacity for children. However, this measure has not been analyzed within an Item Response Theory (IRT) framework, nor has it been adapted to non-English languages. Study 1 of this paper examined the Classical Test Theory summed-score statistics, construct equivalence via a structural equation modeling framework, item parameter estimation using IRT, and concurrent validity of a newly adapted Spanish version of the Listening Sentence Span Task (LSST-S) for 491 English language learners (ELLs) in grades 1-3. The analysis showed that the majority of items on the measure displayed low item-total correlations and low internal consistency reliability. In addition, a very low coefficient α was obtained for the overall measure. A confirmatory item factor analysis demonstrated that the LSST-S measured a latent construct distinct from that of its English predecessor, implying construct non-equivalence between the two measures. Lastly, the LSST-S exhibited poor concurrent validity with measures of reading comprehension, fluid intelligence, and arithmetic computation. Study 2 examined differential item functioning (DIF) of the English version of the Listening Sentence Span Task (LSST-E) in a mixed language-status sample composed of ELL (n = 491) and non-ELL (n = 315) children. This analysis demonstrated that both uniform and non-uniform DIF were present. Recommendations for improving the LSST-S and LSST-E for use with ELLs are provided.