datasets

Many domains lack multilingual terminological resources. Making data and services accessible and usable in the social sciences and humanities (SSH) depends heavily on providing terminology and vocabularies across languages. A shortage of multilingual terminologies and vocabularies is an obstacle to accessing and reusing information, whereas appropriate vocabularies can greatly improve both discovery and classification. It is therefore important for SSHOC to address this issue for the SSH domain. For the development of the European Open Science Cloud (EOSC), terminologies pertaining to data management are recognized as particularly important, because they can be used to enrich dataset descriptions as well as other types of documentation. The topic of Data Curation and Stewardship, in particular, is of the utmost importance to all research infrastructures operating within the framework of the EOSC.

The challenge was twofold: first, to investigate how and to what extent language technologies (in this case, tools for automatic term extraction and machine translation) can assist in creating domain-specific terminologies, specifically a multilingual terminology for the domain of Data Stewardship; second, to integrate the extracted, validated and translated terms into existing lexical-semantic, terminological and ontological resources, with the goal of providing background resources for the basic access functionalities.
