A new AI model, based on the PV-RNN framework, learns to generalize language and actions in a manner similar to toddlers by integrating vision, proprioception, and language instructions.