Mar 26, 2025
Rābante-Buša, Guna; Grūzītis, Normunds; Bārzdiņš, Guntis; Mendes, Afonso, 2022, "SELMA Latvian NER Dataset", https://hdl.handle.net/20.500.12574/98, AiLab IMCS UL
A dataset of hierarchically annotated named entities in Latvian news articles (provided by the Latvian Information Agency LETA) for the development and evaluation of transition-based parsers for named entity recognition (NER). This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.
Mar 26, 2025
Darģis, Roberts; Znotiņš, Artūrs; Auziņa, Ilze; Rābante-Buša, Guna, 2024, "LATE Dev&Test Set V1 for Latvian ASR", https://hdl.handle.net/20.500.12574/99, AiLab IMCS UL
A Latvian speech corpus for the development (validation), testing and comparison of ASR models. The audio data is segmented and aligned with the corresponding orthographic transcriptions which are human verified. The LATE-media subset contains both verbatim (raw) and formatted transcriptions (with punctuation, capitalisation, numbers, abbreviations...
Mar 26, 2025
Nešpore, Gunta; Rituma, Laura, 2023, "Word sense annotated "The Little Prince" fragments in Latvian 1.0", https://hdl.handle.net/20.500.12574/80, AiLab IMCS UL
Annotation of word senses for a running text corpus of 1200 tokens (beginning of The Little Prince by Antoine de Saint-Exupéry) as an evaluation corpus for Latvian WSD systems. Data is provided in a tab-separated format similar to CoNLL, indexing senses to the Tēzaurs.lv word sense IDs as of Tēzaurs.lv 2022 (http://hdl.handle.net/20.500.12574/66) d...
Mar 26, 2025
Laizāns, Mārtiņš; Pretkalniņa, Lauma, 2015, "Latvian Blog Corpus 2015", https://hdl.handle.net/20.500.12574/79, AiLab IMCS UL
Automatically harvested Latvian blog corpus.
Jan 30, 2025
Trumpa, Edmunds; Ozola, Anete; Jansone, Laura Paula, 2024, "Dataset for Latvian Phonetic Analysis", https://hdl.handle.net/20.500.12574/122, Latvian Language Institute of the University of Latvia
The dataset is intended for the characterization, classification and visualization of the phonetic features of syllable intonation characteristic of the modern Latvian language. The dataset contains the following folders: (1) Questionnaires (4 questionnaires with 171 sentences); (2) Recordings (855 utterances spoken by five speakers); (3) Graphs of...
Jan 16, 2025
Zuicena, Ieva; Auziņa, Ieva; Briede, Santa; Jansone, Irēna Ilga; Kuplā, Ieva; Lejniece, Gunta; Migla, Ilga; Oldere, Laimdota; Ozola, Ārija; Požarnova, Vija; Rapa, Sanda; Roze, Anitra; Šmidebergs, Imants; Šnē, Dorisa; Šnē, Māra; Timuška, Agris; Grasmanis, Mikus; Pretkalniņa, Lauma; Znotiņš, Artūrs, 2024, "Dictionary of Contemporary Latvian Language (MLVV) (2024-09-22)", https://hdl.handle.net/20.500.12574/109, Latvian Language Institute of the University of Latvia
“Contemporary dictionary of Latvian language” (MLVV), developed by the Latvian Language Institute of the University of Latvia, is a new explanatory dictionary based on Latvian language materials obtained during the last decade. The analysis of the word stock is based on MLVV card files, internet sources, as well as on the last decade’s encyclopaedias and...
Jan 16, 2025
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Lokmane, Ilze; Klints, Agute; Stāde, Madara; Grasmanis, Mikus; Auziņa, Ilze; Znotiņš, Artūrs; Darģis, Roberts; Bārzdiņš, Guntis, 2024, "Tēzaurs.lv 2024 (Autumn Edition)", https://hdl.handle.net/20.500.12574/110, AiLab IMCS UL
Tezaurs.lv is the largest open machine-readable dictionary for Latvian. This version contains more than 405,000 entries based on 345 sources. The dictionary is enriched with phonetic, morphological, derivational, semantic and other annotations, inflection tables, corpus examples, and it is integrated with the Latvian WordNet data. This dataset is a...
Jan 7, 2025
Martena, Sanita; Nau, Nicole; Kļavinska, Antra; Juško-Štekele, Angelika; Kociņš-Kūceņš, Armands; Sprukte, Ausma; Briška, Anna; Gusāns, Ingars; Mazure, Laura, 2024, "Corpus of Contemporary Latgalian Speech", https://hdl.handle.net/20.500.12574/105, Rēzekne Academy of Technologies
The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech, including field research recordings, interviews, TV and radio broadcasts.
Dec 14, 2024
Baklāne, Anda; Saulespurēns, Valdis; Ozols, Artis, 2022, ""Karogs" corpus", https://hdl.handle.net/20.500.12574/83, National Library of Latvia
The corpus contains texts of the magazine "Karogs" from 1940 to 1994.
Dec 14, 2024
Darģis, Roberts, 2022, "Corpus of Latvian PhD Theses (Disertācijas)", https://hdl.handle.net/20.500.12574/93, AiLab IMCS UL
The corpus consists of PhD theses and summaries published at the University of Latvia, Riga Technical University, Riga Stradins University and Liepaja University until 2020.
