Linguistic Society of America: The LSA works to advance the scientific study of language. The organization supports research on the documentation and preservation of languages.
Linguistics Research Center: The LRC provides linguistic resources for the study of Indo-European languages and cultures.
Modern Language Association: Provides opportunities for its members to share their scholarly findings and teaching experiences with colleagues and to discuss trends in the academy. Works to strengthen the study and teaching of language and literature.
SIL International: SIL International is a faith-based nonprofit organization that works with communities worldwide on sustainable language development.
LINGUIST List: Online forum for academic linguists to discuss linguistic issues.
Linguistic Data Consortium: The Linguistic Data Consortium (LDC), hosted by the University of Pennsylvania, is an open consortium of universities, libraries, corporations and government research laboratories. The organization creates and distributes a variety of linguistic resources, including data, tools and standards.
BYU Corpora: Open-access corpora created by Mark Davies, Professor of Linguistics at Brigham Young University. Most of the corpora cover American or British English, but the collection also includes Spanish and Portuguese corpora. The corpora span a wide range of chronological periods.
Helsinki Corpus of English Texts: The Helsinki Corpus of English Texts is a structured multi-genre diachronic corpus, which includes text samples from Old, Middle and Early Modern English, organized by period. Each sample is preceded by a list of parameter codes giving information on the text and its author. The Corpus is particularly useful for studying how linguistic features change over long time spans. It can be used as a diagnostic corpus giving general information on the occurrence of forms, structures and lexemes in different periods of English.
Lancaster Corpus of Mandarin Chinese: The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora for modern British and American English. The corpus is suitable for use in both monolingual research into modern Mandarin Chinese and cross-linguistic contrast of Chinese and British/American English. The corpus samples 15 written text categories, including news, literary texts, academic prose and official documents, published in the P.R. China in the early 1990s.
The EMILLE Lancaster Corpus: The EMILLE Corpus has been constructed as part of a collaborative venture between the EMILLE project (Enabling Minority Language Engineering), Lancaster University, UK, and the Central Institute of Indian Languages (CIIL), Mysore, India. EMILLE is distributed by the European Language Resources Association.
The corpus consists of three components: monolingual, parallel and annotated corpora. There are fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Oriya, Punjabi, Sinhala, Tamil, Telugu and Urdu. The EMILLE monolingual corpora contain approximately 92,799,000 words (including 2,627,000 words of transcribed spoken data for Bengali, Gujarati, Hindi, Punjabi and Urdu). The parallel corpus consists of 200,000 words of text in English and its accompanying translations in Hindi, Bengali, Punjabi, Gujarati and Urdu.
Using the top search box will search text, audio and film. Many older books and dictionaries can be found digitized in this collection.
The Oxford Text Archive: The OTA collects, catalogues, preserves and distributes high-quality digital resources for research and teaching.
The Online Books Page: Index to more than one million works in English. Most are older materials no longer under copyright, although more recent items are included when permission for free access has been granted.