Decades of research have shown how infants and young children learn through action in connection with their sensory systems. However, this research has not adequately addressed cultural diversity or taken into account the everyday cultural experiences of young learners across different communities. Diversifying the scholarship of early learning calls for paradigm shifts that extend beyond analysis at the individual level, connect closely with real-world experience, and place culture front and center. Meanwhile, cultural research that specifies diversity in caregiver guidance and scaffolding, while providing insights into young learners’ cultural experiences, has been conducted separately from research on action-based cross-modal learning. Taking everyday activities as contexts for learning, in this chapter we summarize seminal work on cross-modal learning in infants and young children that connects action and perception, review empirical evidence of cultural variation in caregiver guidance for early action-based learning, and recommend research approaches for advancing the scientific understanding of cultural ways of learning across diverse communities.