Metrics
268 Downloads
191 to 200 of 202 Results
Nov 23, 2022 - CLARIN-LV
Grūzītis, Normunds; Pretkalniņa, Lauma; Saulīte, Baiba; Rituma, Laura; Nešpore-Bērzkalne, Gunta; Paikens, Pēteris; Auziņa, Ilze; Znotiņš, Artūrs; Levāne-Petrova, Kristīne; Darģis, Roberts, 2019, "Full Stack of Latvian Language Resources for NLU", https://hdl.handle.net/20.500.12574/5, AiLab IMCS UL
This repository contains a multilayer text corpus of Latvian. The multilayer corpus is anchored in cross-lingual state-of-the-art representations: Universal Dependencies (UD), FrameNet, PropBank and Abstract Meaning Representation (AMR).
This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.
Nov 23, 2022 - CLARIN-LV
Leikuma, Lidija; Bernāne, Līga; Cibuļs, Juris; Butkus, Alvydas; Butkienė, Violeta; Vaisvalavičienė, Kristina; Sperga, Ilze, 2013, "The Lithuanian-Latvian-Latgalian Dictionary", https://hdl.handle.net/20.500.12574/52, Rēzekne Academy of Technologies
"The Lithuanian-Latvian-Latgalian Dictionary" (hereinafter "the LLL dictionary") has been compiled on the basis of the "Lithuanian Language Written Sources Frequency Dictionary" ("Dažninis rašytinės lietuvių kalbų žodynas", hereinafter "the Frequency dictionary"; comp. by Utka A., Kaunas, VDU, 2009). It consists of 42,061 headwords based on...
Nov 23, 2022 - CLARIN-LV
Auziņa, Ilze; Darģis, Roberts; Rābante-Buša, Guna; Levāne-Petrova, Kristīne; Saulīte, Baiba, 2017, "Annotated longitudinal corpus of Latvian children's language", https://hdl.handle.net/20.500.12574/7, AiLab IMCS UL
The collection contains three longitudinal corpora of monolingual Latvian-speaking children and one longitudinal corpus of a simultaneous Latvian-Russian bilingual child. Participants were recorded for 30 minutes each week for 16 months, resulting in 134 hours of speech. Of these, 34 hours of speech samples are orthographically transcribed.
Oct 26, 2022 - CLARIN-LV
Darģis, Roberts; Akmane, Agate; Naļivaiko, Inga; Grūzītis, Normunds; Auziņa, Ilze; Saulīte, Baiba; Stepanovs, Kaspars, 2021, "LVMED: Latvian Pronunciation Dictionary of the Medical Domain", https://hdl.handle.net/20.500.12574/68, AiLab IMCS UL
A machine-readable pronunciation dictionary of the medical domain derived from a large text corpus of historical medical records. Consists of 109k entries in the CSV format: first column - a wordform; second column - its pronunciation in the IPA encoding. The dictionary contains Latvian words and terms used in the medical domain, as well as abbrevi...
Oct 5, 2022 - CLARIN-LV
Spektors, Andrejs; Pretkalniņa, Lauma; Grūzītis, Normunds; Paikens, Pēteris; Rituma, Laura; Saulīte, Baiba, 2019, "Tēzaurs.lv 2020", https://hdl.handle.net/20.500.12574/9, AiLab IMCS UL
Tezaurs is a machine-readable lexicon and an online dictionary for Latvian. The initial human-oriented version of this resource was made publicly available in 2009, comprising more than 125,000 entries. Since then, Tezaurs has been updated once every three months, and so far it has grown to more than 300,000 entries referring to more than 280 sources. The dic...
Apr 19, 2022 - CLARIN-LV
Znotiņš, Artūrs, 2020, "LVBERT - Latvian BERT", https://hdl.handle.net/20.500.12574/43, AiLab IMCS UL
LVBERT is the first publicly available monolingual BERT language model pre-trained for Latvian. For training we used the original implementation of BERT on TensorFlow with the whole-word masking and the next sentence prediction objectives. We used BERT-BASE configuration with 12 layers, 768 hidden units, 12 heads, 128 sequence length, 128 mini-batc...
Apr 15, 2022 - CLARIN-LV
Znotiņš, Artūrs; Paikens, Pēteris; Grūzītis, Normunds, 2020, "Latvian AMR Sembank", https://hdl.handle.net/20.500.12574/40, AiLab IMCS UL
An automatically derived AMR annotation layer of the FullStack multi-layer text corpus of Latvian. First, Latvian UD Treebank (v2.5) sentences were translated to English using a state-of-the-art Latvian-English neural MT system (Hugo.lv). Second, a state-of-the-art AMR parser for English (AMREager) was applied to the MT-translated sentences. Additi...
Apr 12, 2022 - CLARIN-LV
---, ---, 2017, "Rendering of personal names in Latvian: database", https://hdl.handle.net/20.500.12574/61, Latvian Language Agency
The application „Rendering of personal names in Latvian” is an electronic multilingual dictionary of names. The website currently offers information on the rendering of personal names, variant renderings, rendering rules, and further reading for 28 languages. The dictionary is based on the principles of rendering of proper names pu...
Apr 6, 2022 - CLARIN-LV
Nešpore-Bērzkalne, Gunta; Skadiņa, Inguna; Grūzītis, Normunds; Znotiņš, Artūrs; Goško, Didzis, 2021, "LUIS: data collection for task oriented dialogue system creation", https://hdl.handle.net/20.500.12574/47, AiLab IMCS UL
This multi-targeted dataset contains several datasets for training goal-oriented dialogue systems for the student service domain in Latvian. It contains a manually annotated dataset of domain-specific dialog intents, a manually created and annotated dataset of generalised and formalised dialog scenarios based on corpus evidence, dataset...
Jan 18, 2022 - CLARIN-LV
Vanags, Pēteris; Trumpa, Edmunds; Laumane, Benita; Markus, Dace; Šuplinska, Ilga; Ernštreits, Valts; Rapa, Sanda; Pūtele, Iveta; Frīdenberga, Anna; Kazakeviča, Agita; Markus-Narvila, Liene; Leikuma, Lidija, 2016, "The Linguistic Map", https://hdl.handle.net/20.500.12574/60, Latvian Language Agency
„The Linguistic Map” has been designed as an electronic informative learning aid, providing an overview of the history of Latvian linguistics and delving into its chronology and themes as well as its branches, sub-branches, and the individuals involved in this work. „The Linguistic Map” currently contains entries about individuals, events, places,...