CLARIN-LV

CLARIN (Common Language Resources and Technology Infrastructure) is a research infrastructure founded on the vision that all digital language resources and tools from across Europe and beyond should be accessible through a single sign-on online environment, in support of researchers in the humanities and social sciences.

81 to 90 of 124 Results

All datasets listed below are harvested from our partners; each link leads directly to the archival source of the data.

SELMA Open Source Platform (UC0)
Feb 9, 2024
Goško, Didzis; Bārzdiņš, Guntis, 2024, "SELMA Open Source Platform (UC0)", https://hdl.handle.net/20.500.12574/97, AiLab IMCS UL
The SELMA Open-Source Software (OSS) offers effective means to test and compare the performance of various language models used in multilingual media monitoring and content production. The SELMA OSS Platform (also referred to as Use Case 0, UC0, or The Basic Testing and Configuration Interface) provides: * automatic speech recognition (ASR) from au...

Dictionary of Latvian Literary Language (LLVV) (2024-01)
Feb 2, 2024
Ceplītis, Laimdots; Spektors, Andrejs, 2024, "Dictionary of Latvian Literary Language (LLVV) (2024-01)", https://hdl.handle.net/20.500.12574/96, AiLab IMCS UL
In the 20th century, the UL Latvian Language Institute (formerly the Language and Literature Institute of the Academy of Sciences) produced the largest lexicographic source of the Latvian language, which was digitized (2001–2022) by the UL Institute of Mathematics and Computer Sciences. The dictionary contains words of standard Latvian used since 19th cen...

Dictionary of Latvian Literary Language (LLVV)
Feb 2, 2024
Spektors, Andrejs, 2010, "Dictionary of Latvian Literary Language (LLVV)", https://hdl.handle.net/20.500.12574/53, AiLab IMCS UL
In the 20th century, the UL Latvian Language Institute (previously the Language and Literature Institute of the Academy of Sciences) produced the largest lexicographic source of the Latvian language, which was digitized by the UL Institute of Mathematics and Computer Sciences. The dictionary includes words of standard Latvian used since the 1870s...

Dictionary of Contemporary Latvian Language (MLVV) (2023-09-21)
Feb 2, 2024
Jērāne, Santa; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Zuicena, Ieva; Pretkalniņa, Lauma; Auziņa, Ieva; Briede, Santa; Šmidebergs, Imants; Timuška, Agris, 2023, "Dictionary of Contemporary Latvian Language (MLVV) (2023-09-21)", https://hdl.handle.net/20.500.12574/94, Latvian Language Institute of the University of Latvia
The “Contemporary Dictionary of Latvian Language” (MLVV), developed by the UL Latvian Language Institute, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and the last decade’s encyclopaedias and dictionaries...

Dictionary of Contemporary Latvian Language (MLVV) (2023-12-22)
Feb 2, 2024
Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Zuicena, Ieva; Pretkalniņa, Lauma; Auziņa, Ieva; Briede, Santa; Timuška, Agris; Jansone, Irēna Ilga; Rapa, Sanda, 2024, "Dictionary of Contemporary Latvian Language (MLVV) (2023-12-22)", https://hdl.handle.net/20.500.12574/95, Latvian Language Institute of the University of Latvia
The “Contemporary Dictionary of Latvian Language” (MLVV), developed by the UL Latvian Language Institute, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, and the last decade’s encyclopaedias and dictionaries...

Corpus of Students' Essays
Dec 29, 2023
Levāne-Petrova, Kristīne; Pokratniece, Kristīne; Darģis, Roberts, 2021, "Corpus of Students' Essays", https://hdl.handle.net/20.500.12574/51, AiLab IMCS UL
A specialized corpus containing 468 student essays written for the 12th-grade Latvian language exam.

Corpus of Latvian Pandemic Diaries 2020–2021
Dec 27, 2023
Reinsone, Sanita; Ļaksa-Timinska, Ilze; Jaudzema, Justīne, 2021, "Corpus of Latvian Pandemic Diaries 2020–2021", https://hdl.handle.net/20.500.12574/48, Institute of Literature, Folklore and Art of the University of Latvia
The Archives of Latvian Folklore invited anyone to document their life during the pandemic and contribute to the collection "Diaries in the Time of Pandemic 2020-2021". The corpus consists of diary entries, with one file per author. Entry dates are given in dd/mm/yyyy format.

Corpus of Latvian Women Writers’ Short Fiction
Dec 27, 2023
Kārkla, Zita; Matulis, Haralds, 2022, "Corpus of Latvian Women Writers’ Short Fiction", https://hdl.handle.net/20.500.12574/69, Institute of Literature, Folklore and Art of the University of Latvia
The corpus consists of short fiction by Latvian women writers published from 1893 to 2002.

Latgalian Corpus (MuLa)
Dec 27, 2023
Sperga, Ilze; Pokratniece, Kristīne; Briška, Anna, 2013, "Latgalian Corpus (MuLa)", https://hdl.handle.net/20.500.12574/8, AiLab IMCS UL
The Special Latgalian Corpus (MuLa) comprises specific types of written texts from the period of national awakening (1987-1989) until 2013. The corpus includes three types of texts: literary texts, technical texts, and information texts. The textual sources selected in defined proportions, based on the chronological principle and text genres th...

LVMED: Latvian Speech Transcripts of the Medical Domain
Dec 27, 2023
Auziņa, Ilze; Saulīte, Baiba; Akmane, Agate; Millere, Elīna; Naļivaiko, Inga; Stepanovs, Kaspars; Darģis, Roberts; Grūzītis, Normunds, 2021, "LVMED: Latvian Speech Transcripts of the Medical Domain", https://hdl.handle.net/20.500.12574/67, AiLab IMCS UL
A text corpus of orthographic transcriptions of a Latvian medical speech corpus. It consists of 900 transcripts (documents) of a ~35-hour radiology speech corpus. Modalities covered: CT, MR, MG, CR, US.
