Date: 
02 June 2021 - 10:00 to 11:30
 

Dataverse is repository software developed by the Institute for Quantitative Social Science of Harvard University to enable researchers to archive and publish research data. It's supported by an active Dataverse Development Community with contributors around the world. From fixing bugs to writing documentation as well as creating integrations and client libraries, the community is a major part of what makes the Dataverse software successful. Fifty-nine institutes worldwide are using the software to establish a research data management solution for their own community.

Currently, the User Interface (UI) is in English. Task 5.2 of the SSHOC project integrated a tool (Weblate) into the installation pipeline of Dataverse to make translating the UI easier. This procedure can be used by organisations that are interested in having a UI in their national language(s). On June 2, we will organise a workshop to explain the ins and outs of the tool, discuss procedures and plan collaboration among translators.

 

Program

The content of the workshop will be as follows:

  • Short introduction about SSHOC task 5.2 by Marion Wittenberg (DANS)
  • Introduction of the tool by Laura Huis in ‘t Veld (DANS)
  • Experiences from translating to German by Veronika Heider (AUSSDA)
  • Q and A 
  • Procedures of translation 
  • Planning and follow up

 

Event Report

 

Questions asked at registration

How do you ensure a sustainable long-term model for coordinated maintenance of the translation work?

Final translation files should be saved in the Dataverse Community GitHub, so that everyone can build on them. However, if you are the only one interested in a specific language, you will probably have to maintain it by yourself. CESSDA is investigating how to support sustainability within CESSDA.

It would be good to make a joint 'Flemish' and Dutch version. In Belgium we are obliged to translate to Dutch and French.

At DANS, we have not made a final decision about the need for a Dutch interface. If we decide we need it, it would of course be a good idea to cooperate. It is good to look for opportunities to work together on translations.

How can we start the translation process? Will it be automatically included in the official Dataverse version?

The translation process can be started independently from IQSS/Harvard. How to start a translation in Weblate is described in the Guide created for this workshop. If there is no project yet in Weblate for your Dataverse version, please let us know by sending an e-mail to training@cessda.eu. We can create projects for previous versions, but also for the most recent release (currently version 5.5). Once the translation is final, you can save it in the Dataverse Global Community GitHub. This GitHub exists next to the official Dataverse GitHub and is managed by the Global Dataverse Community Consortium (GDCC), not just Harvard. It is up to each Dataverse installation to decide whether or not to include additional languages.

How many resources (in terms of persons/hours) do you need to allocate for the translation process?

From past experience, we can say that a professional translator with experience in technical translation needed around 15 working days to translate Dataverse version 4.8 (including metadata blocks and SOLR search fields). A data curator who was very familiar with the front end of Dataverse, but worked on the translation only every now and then, needed several weeks.

How can the Weblate tool be integrated with other systems/databases?

You can connect Weblate with Git (and GitHub). From the official Weblate documentation: “Weblate currently supports Git (with extended support for GitHub, Gerrit and Subversion) and Mercurial as version control back-ends.” Weblate also has an API, so you can develop a connection yourself.
As an alternative, you can download the final translation file from Weblate, and upload it manually to your own system.
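As an illustration of the API route, the sketch below builds (but does not send) a request for a translation file through the Weblate REST API. The host name, project slug, component slug and token are hypothetical placeholders, not the values of the workshop installation, and the endpoint path should be checked against the API documentation of your Weblate version.

```python
import urllib.request


def build_download_request(base_url, project, component, language, token):
    """Build (without sending) a request for one translation file in Weblate."""
    # Path layout of the Weblate REST API translation file endpoint;
    # verify it against the documentation of your Weblate version.
    url = (
        f"{base_url.rstrip('/')}/api/translations/"
        f"{project}/{component}/{language}/file/"
    )
    # Weblate expects the API key in a token-style Authorization header.
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})


# Hypothetical values, for illustration only.
request = build_download_request(
    "https://weblate.example.org", "dataverse", "bundle", "fr", "YOUR-API-TOKEN"
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual download. The sketch stops before that step so it can be tried without network access.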

How does Dataverse operate in multiple languages?

If you want to offer the Dataverse UI in a language other than the default English, you can configure this in the Dataverse settings. This is described in the Dataverse Guide.
The user will be presented with a dropdown menu in the top navigation bar to choose the desired language. Here is one example of an English and French User Interface.
The metadata itself will stay in the original language. If the metadata was entered in English, switching to, for example, a French User Interface does NOT result in seeing the metadata in French. Only the metadata field labels are translated.
For interested organisations that provide metadata in both their national language and in English, there is no solution yet. Dataverse does not have a language attribute for the metadata fields. It would be good if this could be changed. The language issue for the content of metadata, e.g. in keywords, will also be addressed in the ongoing work to implement controlled vocabulary support in Dataverse.
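As a rough sketch of the language configuration mentioned above: to our understanding of the Dataverse Installation Guide, the languages offered in the dropdown are defined by a database setting named `:Languages`, which holds a JSON list of locale/title pairs and is set through the admin settings API. Both the setting name and the JSON layout are assumptions here and should be verified against the guide for your Dataverse version. The helper function below is hypothetical and only prepares the JSON body.

```python
import json


def languages_setting(languages):
    """Return the JSON body for the language dropdown from (locale, title) pairs."""
    entries = [{"locale": locale, "title": title} for locale, title in languages]
    # ensure_ascii=False keeps titles such as "Français" readable.
    return json.dumps(entries, ensure_ascii=False)


# Example: offer English and French in the dropdown.
payload = languages_setting([("en", "English"), ("fr", "Français")])
print(payload)
```

The resulting string would be sent as the body of a PUT request to the settings endpoint of your installation. The translated properties files themselves also need to be made available to the installation, as described in the Dataverse Guide.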

Questions asked during the workshop

How can we translate 'our' language (Lithuanian)? Our language is not yet on the list and I do not have permission to add a language to Weblate.

Apparently, something went wrong with your user settings. How to add a language is described in the Guide written for this workshop. See paragraph 2.2.

Could you share your screen and show how to start translating into a new language?

Go to Languages. If your language is not there, click ‘Start new translation’ and wait for the application to create a new language (and fetch the source language from GitHub). This will take some time. Another option is to go to ‘Files’ and upload a file with translated strings. This is useful when you already have a partly translated Bundle.properties file. You can choose how the already translated strings are uploaded to Weblate. For example, you can decide to mark these strings with a ‘Needs Editing’ status automatically.
The translation made in Weblate can be downloaded from the application any time. (Please see the Guide for screenshots.)

Is the list of translated languages available?

Translations that are available can be found in the GitHub repository of the Global Dataverse Community Consortium. Please note that some translations are only available for older versions of Dataverse. Sometimes there are multiple files available for the same language. In that case, you can check who did the translation and choose the most trustworthy one.

Will the translations made with Weblate be shared with the Global Dataverse Community?

Yes, that is the plan. This will enhance collaboration. For this workshop we have used a French translation in Weblate that we downloaded from this community GitHub. Currently there is no automated upload from Weblate to GitHub, but we may resolve this in the future. For now we can do the upload manually. It is recommended to add the available languages to the community GitHub as soon as they are created, playing by community rules. This also helps others stay informed about work in progress on translations. We also need to check the licence for translations: perhaps the licence states that you need to contribute back, for example when a GPL licence is used. The Dataverse software is licensed under the Apache License, Version 2.0.

Can you reuse translations of past (other) versions for new versions (example use of Swedish 4.9 to Norwegian 5.X)?

Yes, this is possible. There are two options. You could import the Swedish 4.9 language file as if it were a Norwegian translation. You can import it in Weblate with the ‘Needs Editing’ status, for example. In Weblate, you can then select these strings and alter them to create the Norwegian translation.
The second option is to upload the Swedish translation file not under the Norwegian language, but as Swedish. If you indicate in your Weblate profile settings that you are proficient in Swedish, Weblate will show you the Swedish translation while you are working on your Norwegian translation. If new lines were added to the source language file between version 4.9 and version 5.x, you would need to translate these specific lines without any previous Swedish input.
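To get an idea of how many lines would have to be translated without previous input, the keys of the new source file can be compared with those of the older translation file. The sketch below uses a deliberately simplified parser for properties files (no line continuations or escape sequences), and the keys and strings in the example are hypothetical, not taken from the real Bundle.properties.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments.

    Simplification: line continuations and escape sequences are not handled.
    """
    entries = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!")):
            continue
        key, separator, value = stripped.partition("=")
        if separator:
            entries[key.strip()] = value.strip()
    return entries


def keys_without_translation(source_text, old_translation_text):
    """List source keys that are missing or empty in the older translation."""
    source = parse_properties(source_text)
    old = parse_properties(old_translation_text)
    return [key for key in source if not old.get(key)]


# Hypothetical excerpts of a newer source file and an older Swedish file.
new_source = """
# newer English source
dataset.title=Title
dataset.publish=Publish
dataset.curation=Curation status
"""
old_swedish = """
dataset.title=Titel
dataset.publish=Publicera
"""
missing = keys_without_translation(new_source, old_swedish)
print(missing)  # ['dataset.curation']
```

Keys reported here are the ones for which Weblate would have no Swedish suggestion to show.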

What about Cyrillic letters and translation into languages that use these letters?

Cyrillic options are available, but they are difficult for us to test because of different keyboard settings. Please contact the team if you would like to test this together or if you encounter any problems with it. It is also possible to add frequently used symbols to the menu in the Weblate interface, so you can easily select them while you are translating.

What decisions does an organisation have to make before translating?

Some important questions you should consider are:
● Which version of Dataverse do we need to translate?
● Who will do the translation? Do we have resources to hire a professional translator with a background in technical translations?
● Does our language have special forms that we need to discuss early on (for example female/male gender in German; Cyrillic alphabet)?
The user guide should be extended with a section with experiences from translators. This could be a community effort to gather this information.

How do we proceed with communication about this topic?

The user guide created for this workshop will be updated with the input from this Q&A session. We will use the CESSDA Dataverse basecamp for communication. If you are not part of CESSDA and would like to join, please contact the Task 5.2 team by sending an e-mail to training@cessda.eu. We will send you a follow-up e-mail that will also contain information about the communication channels.

 

Speakers

Veronika Heider is senior data curator at AUSSDA, the Austrian Social Science Data Archive. She contributes her expertise in making metadata, data and documentation in the AUSSDA Dataverse findable and usable to SSHOC task 5.2.

 

Laura Huis in ‘t Veld is Information Systems Officer at DANS. She is responsible for user support, configuration and testing of the DataverseNL platform. She is involved in task 5.2 of the SSHOC project and was previously involved in European projects, such as the CESSDA DataverseEU project. 

 

Marion Wittenberg is a service manager at DANS for DataverseNL, a repository service for Dutch universities and research organisations. She is also task leader of task 5.2 of the SSHOC project which adjusts the Dataverse software to the needs of the European SSH community.

 


 

More SSHOC resources on Dataverse

See SSHOC Service Catalogue for more information on the SSHOC Dataverse Service.
