Corpora Collection
Go back to main download site
Download Corpora: Karachay-Balkar
To download a corpus, select a corpus size (given in number of sentences) and download the corresponding data file.
Wikipedia
The text material was taken from Wikipedia dumps.
Year    Downloads (corpus size in sentences)
2014    10K
2016    10K
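Corpus sizes are given as short labels such as 10K. When scripting against the list of available corpora, such a label can be turned into a sentence count. The following is only a minimal sketch: the function name `parse_corpus_size` is hypothetical, and the multipliers (K = 1,000, M = 1,000,000) are an assumption, since this page itself only shows 10K.

```python
import re

# Assumed multipliers for size labels; this page only shows "10K".
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_corpus_size(label: str) -> int:
    """Convert a size label such as '10K' into a number of sentences."""
    match = re.fullmatch(r"\s*(\d+)\s*([KkMm]?)\s*", label)
    if match is None:
        raise ValueError(f"unrecognized corpus size label: {label!r}")
    digits, suffix = match.groups()
    return int(digits) * _MULTIPLIERS[suffix.upper()]


# The Wikipedia corpus sizes listed on this page, keyed by year.
available = {2014: "10K", 2016: "10K"}
sizes = {year: parse_corpus_size(label) for year, label in available.items()}
print(sizes)  # {2014: 10000, 2016: 10000}
```

Labels that do not match the assumed pattern raise a `ValueError` rather than being guessed at.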
Language
A
Achinese
Acoli
Afar
Afrikaans
Akan
Albanian
Amharic
Anaang
Arabic
Aragonese
Armenian
Assamese
Asturian
Aymara
Azerbaijani
B
Balinese
Bambara
Banjar
Bashkir
Basque
Bavarian
Belarusian
Bemba (Zambia)
Bengali
Bihari
Bikol
Bishnupriya
Bosnian
Breton
Buginese
Bulgarian
Buriat
Burmese
C
Catalan
Cebuano
Central Bikol
Central Kurdish
Chechen
Chinese
Chuvash
Corsican
Croatian
Czech
D
Danish
Dari
Dhivehi
Dimli
Dutch
Dyula
E
Eastern Maninkakan
Eastern Mari
Eastern Yiddish
Egyptian Arabic
Emiliano-Romagnolo
English
Erzya
Esperanto
Estonian
Ewe
Extremaduran
F
Faroese
Fiji Hindi
Finnish
Fon
French
Fulah
G
Galician
Gan Chinese
Ganda
Georgian
German
Gilaki
Goan Konkani
Guarani
Gujarati
H
Haitian
Halh Mongolian
Hausa
Hebrew
Hiligaynon
Hindi
Hungarian
I
Ibibio
Icelandic
Ido
Igbo
Iloko
Indonesian
Interlingua
Interlingue
Iranian Persian
Irish
Italian
J
Japanese
Javanese
K
Kabardian
Kabiyé
Kabuverdianu
Kabyle
Kalaallisut
Kannada
Karachay-Balkar
Kashmiri
Kashubian
Kazakh
Khmer
Kikuyu
Kinyarwanda
Kirghiz
Kituba (Congo)
Komi
Komi-Permyak
Kongo
Konkani
Koongo
Korean
Kurdish
Kölsch
L
Ladino
Lao
Latin
Latvian
Ligurian
Limburgan
Lingala
Lithuanian
Lombard
Lomwe
Low German
Lower Sorbian
Lugbara
Lumbu
Lushai
Luxembourgish
M
Macedonian
Madurese
Maithili
Makonde
Malagasy
Malay
Malayalam
Maltese
Mandarin Chinese
Manx
Maori
Marathi
Mazanderani
Min Dong Chinese
Min Nan Chinese
Minangkabau
Mingrelian
Mirandese
Modern Greek
Mongolian
Mossi
N
Navajo
Ndonga
Neapolitan
Nepali
Newari
Nigerian Pidgin
North Azerbaijani
Northern Frisian
Northern Sami
Northern Uzbek
Norwegian
Norwegian Bokmål
Norwegian Nynorsk
Nyanja
Nyankole
O
Occitan (post 1500)
Oriya
Oromo
Ossetian
P
Pampanga
Pangasinan
Panjabi
Papiamento
Pedi
Persian
Pfaelzisch
Piemontese
Plateau Malagasy
Polish
Pontic
Portuguese
Pulaar
Pushto
Q
Quechua
R
Romanian
Romansh
Romany
Rundi
Russian
Rusyn
S
S'gaw Karen
Sami
Samogitian
Sanskrit
Saraiki
Sardinian
Scots
Sena
Serbian
Serbo-Croatian
Shona
Sicilian
Silesian
Sindhi
Sinhala
Slovak
Slovenian
Somali
Soninke
South Ndebele
Southern Sotho
Spanish
Standard Estonian
Standard Latvian
Standard Malay
Sukuma
Sundanese
Susu
Swahili
Swahili (macrolanguage)
Swati
Swedish
Swiss German
T
Tagalog
Tajik
Tamil
Tatar
Telugu
Thai
Tibetan
Tigrinya
Timne
Tiv
Tosk Albanian
Tsonga
Tswana
Tulu
Tumbuka
Turkish
Turkmen
Tuvinian
U
Udmurt
Uighur
Ukrainian
Upper Sorbian
Urdu
Uzbek
V
Venda
Venetian
Vietnamese
Vlaams
Volapük
Võro
W
Walloon
Waray (Philippines)
Welsh
Western Frisian
Western Mari
Western Panjabi
Wolof
Wu Chinese
X
Xhosa
Y
Yakut
Yiddish
Yoruba
Z
Zeeuws
Zhuang
Zulu