Gender status, gender roles, and gender values vary widely across cultures. Anthropology has provided qualitative accounts of economic, cultural, and biological factors that impact social groups, and international organizations have gathered indices and surveys to help quantify gender inequalities in states. Concurrently, machine learning research has recently characterized pervasive gender biases in AI language models, rooted in biases in their textual training data. While these machine biases produce sub-optimal inferences, they may help us characterize and predict statistical gender gaps and gender values in the culture(s) that produced the training text, thereby helping us understand cultural context through big data. This paper presents an approach to (1) construct word embeddings (i.e., vector-based lexical semantics) from a region's social media, (2) quantify gender bias in word embeddings, and (3) correlate biases with survey responses and statistical gender gaps in education, politics, economics, and health. We validate this approach using 2018 Twitter data spanning 143 countries and 51 U.S. territories, 23 international and 7 U.S. gender gap statistics, and seven international survey results from the World Values Survey. Integrating these heterogeneous data across cultures is an important step toward understanding (1) how biases in culture might manifest in machine learning models and (2) how to estimate gender inequality from big data.