May 12, 2026
Štekeļs, Jorens, 2026, "ConLoan-LV: A Contrastive Dataset for Latvian Language Loanwords, Code-switching, and Named Entities", https://hdl.handle.net/20.500.12574/158, University of Latvia
ConLoan-LV is a multi-purpose contrastive dataset designed for the classification and analysis of Latvian language loanwords, code-switching, and named entities. Replicating and extending the ConLoan methodology, the dataset contains 353 manually validated sentences in the baseline version and 676 in the extended version, with all sentences sourced... This dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.
Apr 20, 2026
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs, 2026, "Dictionary of Contemporary Latvian Language (MLVV) (2026-04-08)", https://hdl.handle.net/20.500.12574/157, Latvian Language Institute, Faculty of Humanities, University of Latvia
The “Dictionary of Contemporary Latvian Language” (MLVV), developed by the Latvian Language Institute of the Faculty of Humanities at the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, as well as on...
Apr 20, 2026
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs, 2025, "Dictionary of Contemporary Latvian Language (MLVV) (2025-12-21)", https://hdl.handle.net/20.500.12574/150, Latvian Language Institute, Faculty of Humanities, University of Latvia
The “Dictionary of Contemporary Latvian Language” (MLVV), developed by the Latvian Language Institute of the Faculty of Humanities at the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, as well as on...
Apr 20, 2026
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis, 2025, "Tēzaurs.lv 2026 (Winter Edition)", https://hdl.handle.net/20.500.12574/151, AiLab IMCS UL
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is availab...
Apr 20, 2026
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis, 2026, "Tēzaurs.lv 2026 (Spring Edition)", https://hdl.handle.net/20.500.12574/156, AiLab IMCS UL
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 410,000 entries based on 350 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, and corpus examples, and is integrated with the Latvian WordNet data. This dataset is availab...
Apr 8, 2026
Darģis, Roberts, 2022, "The Index of Aggressive Communication in Internet Portal Comments", https://hdl.handle.net/20.500.12574/45, AiLab IMCS UL
A corpus containing comments from the Internet news sites tvnet.lv, delfi.lv, and apollo.lv. The specialized corpus platform and its tools are designed to study aggression in the comments of news portals. The toolkit makes it possible to identify the news items from Internet portals that attract the most aggressive comments, as well as to study aggressive communication trends...
Apr 7, 2026
Kalnača, Andra; Pakalne, Tatjana; Auziņa, Ieva; Balmane, Vanesa; Butāne, Anita; Hoplíček, Milan; Horiguchi, Daiki; Jansone, Laura Paula; Levāne‑Petrova, Kristīne; Lokmane, Ilze; Miķelsone, Paula; Otomers, Oskars; Ozola, Paula; Urbanoviča, Inta, 2026, "Database of Latvian Morphemes and Derivational Models (DLMDM)", https://hdl.handle.net/20.500.12574/155, University of Latvia
"The Database of Latvian Morphemes and Derivational Models (DLMDM)" is a corpus-based derivational morphology resource developed at the Department of Latvian and Baltic Studies, Faculty of Humanities, University of Latvia. The core of the database consists of lemmas imported from the Balanced Corpus of Modern Latvian (LVK2018), with additional lemm...
Apr 7, 2026
Babaņins, Vladislavs, 2026, "Latvian Communist Leaflet Corpus (1934–1940)", https://hdl.handle.net/20.500.12574/154, University of Latvia
The Latvian Communist Leaflet Corpus (1934–1940) is a structured digital corpus of underground political leaflets produced by illegal communist organizations in Latvia between January 1934 and July 1940, covering the final months of the parliamentary period and the authoritarian regime of Kārlis Ulmanis. The corpus contains 251 unique leaflet texts...
Mar 3, 2026
Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura, 2025, "Corpus of Contemporary Latgalian Speech (MuLaR)", https://hdl.handle.net/20.500.12574/118, Rēzekne Academy of Technologies
The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.
Mar 3, 2026
Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura, 2026, "Corpus of Contemporary Latgalian Speech (MuLaR) (2026-03-02)", https://hdl.handle.net/20.500.12574/153, Rēzekne Academy of Technologies
The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.
