On the effect of curriculum learning with developmental data for grammar acquisition
Published in CoNLL Shared Task: BabyLM Challenge, 2023
We examine whether Elman’s idea of curriculum learning can be revitalised using developmentally plausible data. We found no benefit over random sampling on the BabyLM pre-training corpora, and we were able to attribute this to the majority of the data being high utility (i.e., simple transcribed speech). When we switched to a training setting with a larger share of more complex data and a lower proportion of transcribed speech, we found some preliminary evidence that it might be worth starting small after all.