CLARIN-LV

CLARIN (Common Language Resources and Technology Infrastructure) is a research infrastructure founded on the vision that all digital language resources and tools from across Europe and beyond should be accessible through a single sign-on online environment, in support of researchers in the humanities and social sciences.

11 to 20 of 122 Results

Corpus of Contemporary Latgalian Speech (MuLaR)
Mar 3, 2026
Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura, 2025, "Corpus of Contemporary Latgalian Speech (MuLaR)", https://hdl.handle.net/20.500.12574/118, Rēzekne Academy of Technologies
The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.
This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.

Corpus of Contemporary Latgalian Speech (MuLaR) (2026-03-02)
Mar 3, 2026
Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura, 2026, "Corpus of Contemporary Latgalian Speech (MuLaR) (2026-03-02)", https://hdl.handle.net/20.500.12574/153, Rēzekne Academy of Technologies
The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.

Historical Dictionary of Latvian Given Names
Feb 18, 2026
Siliņa-Piņķe, Renāte; Rapa, Sanda; Jansone, Ilga; Kazakevičs, Ņikita, 2026, "Historical Dictionary of Latvian Given Names", https://hdl.handle.net/20.500.12574/152, Latvijas Universitātes Humanitāro zinātņu fakultātes Latviešu valoda institūts
"Historical Dictionary of Latvian Given Names" (LPVV) is an online scientific dictionary that collects and describes Latvian given names documented in written sources spanning more than eight centuries. This dictionary focuses on names that entered the Latvian given name system before the end of the 19th century.

LATE Conversational Speech Corpus V1 (LATE-sarunas)
Jan 23, 2026
Auziņa, Ilze; Darģis, Roberts; Rābante-Buša, Guna; Ļaksa-Timinska, Ilze; Gailīte, Elīna; Auziņa, Arta, 2024, "LATE Conversational Speech Corpus V1 (LATE-sarunas)", https://hdl.handle.net/20.500.12574/113, AiLab IMCS UL
Corpus contains recordings of informal conversations, interviews and public speeches and their transcripts in orthographic transcription. Metadata has been added to each audio recording: gender and age group of the speaker, information about the form of speech – dialogue, monologue, spontaneous or prepared speech, etc.

Tēzaurs.lv 2025 (Autumn Edition)
Dec 22, 2025
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis, 2025, "Tēzaurs.lv 2025 (Autumn Edition)", https://hdl.handle.net/20.500.12574/137, AiLab IMCS UL
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, corpus examples, and integrated with the Latvian WordNet data. This dataset is availab...

Dictionary of Contemporary Latvian Language (MLVV) (2025-09-22)
Dec 22, 2025
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs, 2025, "Dictionary of Contemporary Latvian Language (MLVV) (2025-09-22)", https://hdl.handle.net/20.500.12574/138, Latvian Language Institute of the University of Latvia
“Contemporary dictionary of Latvian language” (MLVV), developed by the Latvian Language institute of University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, as well as, on last decade’s encyclopaedias and...

Dictionary of Latvian Literary Language (LLVV) (2025-12-21)
Dec 22, 2025
Ceplītis, Laimdots; Spektors, Andrejs, 2025, "Dictionary of Latvian Literary Language (LLVV) (2025-12-21)", https://hdl.handle.net/20.500.12574/149, AiLab IMCS UL
In the 20th century, the Latvian Language Institute of the University of Latvia (UL LLI, former Language and literature institute of the Academy of Sciences) has produced the largest lexicographic source of Latvian language, which has been digitalized (2001–2022) by the Institute of Mathematics and Computer Sciences, UL. The dictionary contains wor...

Dictionary of Latvian Literary Language (LLVV) (2025-03-05)
Dec 22, 2025
Ceplītis, Laimdots; Spektors, Andrejs, 2025, "Dictionary of Latvian Literary Language (LLVV) (2025-03-05)", https://hdl.handle.net/20.500.12574/126, AiLab IMCS UL
In the 20th century, UL Latvian language institute (former Language and literature institute of the Academy of Sciences) has produced the largest lexicographic source of Latvian language, which has been digitalized (2001–2022) by UL Institute of Mathematics and Computer Sciences. The dictionary contains words of standard Latvian used since 19th cen...

Latvian word frequency dataset
Dec 19, 2025
Grasmanis, Mikus; Valkovska, Baiba; Levāne-Petrova, Kristīne, 2025, "Latvian word frequency dataset", https://hdl.handle.net/20.500.12574/148, AiLab IMCS UL
This frequency list contains the 25,000 most frequent Latvian lemmas, obtained from 18 morphologically annotated corpora totalling 1.5 billion tokens from the Latvian National Corpora Collection (Korpuss.lv) and Tēzaurs.lv. Supporting academic and practical applications, including language teaching, machine translation, and speech technologies, the...

Latvian Folk Legend Corpus of LPT (in Latvian)
Dec 19, 2025
Reinsone, Sanita; Kaščejeva, Simona; Spektors, Andrejs; Pakalns, Guntis, 2025, "Latvian Folk Legend Corpus of LPT (in Latvian)", https://hdl.handle.net/20.500.12574/147, Digital Humanities Center of the University of Latvia
The corpus includes Latvian legends published in volumes 13, 14, and 15 of "Latvian Folk Tales and Legends" (1925–1937), compiled by Pēteris Šmits. The volumes were digitised in the late 1990s; a revised version and the preparation of the German-language texts were carried out in 2012. Metadata refinement and the development of a new corpus version...
