- TalkBank: The goal of TalkBank is to foster fundamental research in the study of human communication. It contains a number of diverse speech and text corpora. Some are public and some require contacting TalkBank for permission.
- BYU corpora: Collection of free and commercial corpora by Mark Davies in English, Spanish and Portuguese.
- Wordbank: Open database of children’s vocabulary development.
- Sketch Engine: Sketch Engine lists free and commercial corpora in many different languages.
- DWDS: The Digitales Wörterbuch der deutschen Sprache (Digital Dictionary of the German Language) offers a number of corpora and other resources in German.
- Taaluniversum corpora: A variety of Dutch text corpora (e.g. SoNaR) provided by the Dutch Taaluniversum.
- Children’s Printed Word Database: Printed word frequencies as read by children aged between 5 and 9.
- DeveL: Developmental Lexicon Project.