1 to 10 of 109 Results
Dec 11, 2025
Auziņa, Ilze; Darģis, Roberts; Rābante-Buša, Guna; Levāne-Petrova, Kristīne; Saulīte, Baiba, 2017, "Annotated longitudinal corpus of Latvian children's language", https://hdl.handle.net/20.500.12574/7, AiLab IMCS UL
The collection contains three longitudinal corpora of monolingual Latvian-speaking children and one longitudinal corpus of a simultaneous Latvian-Russian bilingual child. Participants were recorded for 30 minutes each week for 16 months, resulting in 134 hours of speech. Of these, 34 hours of speech samples are orthographically transcribed. This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.
Dec 3, 2025
Reinsone, Sanita; Matulis, Haralds; Ļaksa-Timinska, Ilze; Žvarte, Elvīra, 2025, "Corpus of Latvian Autobiographies", https://hdl.handle.net/20.500.12574/145, Institute of Literature, Folklore and Art of the University of Latvia
The corpus consists of 74 unpublished autobiographies, life stories, and memoirs in Latvian, written between 1900 and 2024. All materials have been collected, digitised, and are preserved in the Autobiography Collection of the Archives of Latvian Folklore, Institute of Literature, Folklore and Art, University of Latvia. The corpus has been created...
Nov 27, 2025
Kļavinska, Antra; Martena, Sanita; Nau, Nicole; Šuplinska, Ilga; Anna, Briška, 2025, "Latgalian Tezaurs 2026 (Winter Edition)", https://hdl.handle.net/20.500.12574/144, AiLab IMCS UL
Latgalian Tezaurs (LTG T) is a lexical database and online dictionary of Latgalian (ISO 639-3 ltg). This version contains more than 750 entries, including many idioms and other multi-word units. Entries include spelling variants and dialect forms and name the sources where the lexical unit has been documented. Audio recordings illustrate pronunciat...
Nov 27, 2025
Kļavinska, Antra; Martena, Sanita; Nau, Nicole; Šuplinska, Ilga; Anna, Briška, 2024, "Latgalian Tezaurs 2025 (Winter Edition)", https://hdl.handle.net/20.500.12574/116, Rēzekne Academy of Technologies
Latgalian Tezaurs (LTG T) is a lexical database and online dictionary of Latgalian (ISO 639-3 ltg). The pilot version of December 2024 contains more than 450 entries, including many idioms and other multi-word units. Entries include spelling variants and dialect forms and name the sources where the lexical unit has been documented. Audio recordings...
Nov 26, 2025
Pretkalniņa, Lauma; Nešpore-Bērzkalne, Gunta; Pokratniece, Kristīne; Rituma, Laura, 2025, "Latvian and Latgalian Parallel Sample Treebank (Cairo)", https://hdl.handle.net/20.500.12574/143, AiLab IMCS UL
This corpus contains 20 Latvian and Latgalian sample sentences annotated with the same hybrid annotation model that is used in the Latvian Treebank. The sentences in this corpus are the same as those used in the "Cairo" sample corpora that showcase annotation choices for Universal Dependencies treebanks, and this corpus serves as a basis for both UD-Latvia...
Nov 25, 2025
Rituma, Laura; Pretkalniņa, Lauma; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Grūzītis, Normunds; Znotiņš, Artūrs, 2025, "LVTB - Latvian Treebank v2.17", https://hdl.handle.net/20.500.12574/142, AiLab IMCS UL
Latvian Treebank (LVTB) has been under development since 2010. It is manually annotated according to a hybrid dependency-constituency grammar model. This version of LVTB contains the data used for deriving the corresponding version of the Latvian UD Treebank (UDLV-LVTB).
Nov 25, 2025
Rituma, Laura; Pretkalniņa, Lauma; Saulīte, Baiba; Nešpore-Bērzkalne, Gunta; Grūzītis, Normunds; Znotiņš, Artūrs, 2025, "LVTB - Latvian Treebank v2.16 (2025-05-15)", https://hdl.handle.net/20.500.12574/129, AiLab IMCS UL
Latvian Treebank (LVTB) has been under development since 2010. It is manually annotated according to a hybrid dependency-constituency grammar model. This version of LVTB contains the data used for deriving the corresponding version of the Latvian UD Treebank (UDLV-LVTB).
Nov 20, 2025
Andronova, Everita; Spektors, Andrejs; Vanags, Pēteris; Baltiņa, Maija; Trumpa, Anta; Trumpa, Edmunds; Grūzītis, Normunds; Siliņa-Piņķe, Renāte; Frīdenberga, Anna; Skrūzmane, Elga; Ķauķīte, Sintija; Pretkalniņa, Lauma, 2022, "The Corpus of Early Written Latvian (2022)", https://hdl.handle.net/20.500.12574/90, AiLab IMCS UL
The Corpus of early written Latvian ‘SENIE’ provides access to the texts of written Latvian of the 16th–18th century, and its aim is to facilitate studies of early Latvian in general (e.g. the lexis, morphology and syntax of the texts) and to serve as the basis for "The Historical dictionary of Latvian (16th–17th cc.)". The Corpus was first launche...
Nov 20, 2025
Andronova, Everita; Baltiņa, Maija; Frīdenberga, Anna; Grūzītis, Normunds; Ķauķīte, Sintija; Pokratniece, Kristīne; Pretkalniņa, Lauma; Siliņa-Piņķe, Renāte; Skrūzmane, Elga; Spektors, Andrejs; Spektors, Mārtiņš; Štrausa, Ilze; Trumpa, Anta; Trumpa, Edmunds; Vanags, Pēteris, 2025, "The Corpus of Early Written Latvian (2025)", https://hdl.handle.net/20.500.12574/141, AiLab IMCS UL
The Corpus of early written Latvian 'SENIE' provides access to the texts and facsimiles of written Latvian of the 16th–18th century. Its aim is to facilitate studies of early Latvian in general and to serve as the basis for 'The Historical dictionary of Latvian (16th–17th cc.)'. The corpus serves as a unique digital repository of early Latvian texts, w...
Nov 6, 2025
Pretkalniņa, Lauma; Andronova, Everita; Frīdenberga, Anna; Skrūzmane, Elga; Siliņa-Piņķe, Renāte; Trumpa, Anta; Vanags, Pēteris, 2025, "Spelling normalization tool for Latvian 18th century texts", https://hdl.handle.net/20.500.12574/140, AiLab IMCS UL
The spelling normalization tool (pilot converter) converts any Unicode-encoded 18th century Latvian text into a more modern spelling. This version of the tool normalizes the roots of words, so it is intended to facilitate user-friendly corpus search in tools like Sketch Engine. The tool consists of 134 uni...
