2nd Translation Inference Across Dictionaries (TIAD) shared task

at LDK 2019 in Leipzig, Germany, on May 20, 2019


About

The second shared task on Translation Inference Across Dictionaries (TIAD 2019) explores methods and techniques for automatically generating new bilingual (and multilingual) dictionaries from existing ones. It provides a coherent experimental framework that enables reliable validation of results and solid comparison of the processes used. The initiative also aims to foster further research on inferring translations across languages.

A workshop presenting the results of the shared task will be held in conjunction with the second conference on Language, Data and Knowledge (LDK 2019) in Leipzig, Germany, on May 20, 2019.

Task definition

The objective of the TIAD shared task is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual resources. Such techniques would help to generate new bilingual and multilingual dictionaries automatically from existing ones.

In particular, the participating systems will be asked to generate new translations automatically among three languages (English, French and Portuguese), based on known translations contained in the Apertium RDF graph. As these languages (EN, FR, PT) are not directly connected in this graph, no translations among them can be obtained from it directly. Based on the available RDF data, the participants will apply their methodologies to derive translations, mediated by any other language in the graph, between the pairs EN/FR, FR/PT and PT/EN.
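As an illustration of what "mediated by another language" can mean, the following is a minimal sketch of naive pivot-based inference, assuming the bilingual data has already been loaded into plain Python dictionaries. The function name `infer_via_pivot`, the choice of Spanish as the pivot, and the toy word lists are all hypothetical; they are not taken from the Apertium RDF data or from the task guidelines.

```python
from collections import defaultdict


def infer_via_pivot(src_to_pivot, pivot_to_tgt):
    """Chain two bilingual dictionaries through a shared pivot language.

    Returns {source_word: {target_word: number_of_pivot_words_linking_them}}.
    """
    inferred = defaultdict(lambda: defaultdict(int))
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            # Every target translation of the pivot word becomes a candidate.
            for tgt in pivot_to_tgt.get(pivot, ()):
                inferred[src][tgt] += 1
    return {src: dict(tgts) for src, tgts in inferred.items()}


# Toy data (hypothetical): EN -> ES and ES -> FR entries.
en_es = {"bank": {"banco", "orilla"}, "river": {"río"}}
es_fr = {
    "banco": {"banque", "banc"},
    "orilla": {"rive"},
    "río": {"fleuve", "rivière"},
}

candidates = infer_via_pivot(en_es, es_fr)
for word, targets in sorted(candidates.items()):
    print(word, sorted(targets))
```

Note that the toy example also yields "banc" (bench) for "bank", because the pivot word "banco" is polysemous. Filtering out such spurious candidates is the kind of problem participating systems need to address.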

Participants may also make use of other freely available sources of background knowledge (e.g. lexical linked open data and parallel corpora) to improve performance, as long as no direct translation among the target language pairs is applied.

Evaluation of the results will be carried out by the organisers against manually compiled pairs of K Dictionaries and other resources.
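This page does not state which measures the organisers use. As a rough sketch only, assuming that inferred translations are compared with a gold standard as sets of (source, target) pairs, precision, recall and F1 could be computed as follows. The function name and the sample pairs are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted translation pairs with gold-standard pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical system output and gold standard for EN -> FR.
predicted = {("bank", "banque"), ("bank", "banc"), ("river", "rivière")}
gold = {("bank", "banque"), ("river", "rivière"), ("river", "fleuve")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```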

Publication of results

Participants will submit a system description paper including a description of their system, the way data have been processed, the applied algorithms, the obtained results, as well as the conclusions and ideas for future improvements. The papers will be peer reviewed prior to publication to confirm all aspects are well covered.

In addition, the workshop welcomes regular papers from authors who are not participating in the shared task but have worked on translation inference and want to publish novel results or ideas, using datasets and an experimental basis different from the ones proposed in the TIAD shared task. Such papers will be peer reviewed on the basis of their scientific quality.

Both types of papers should be 6-8 pages long, be formatted according to the LNCS guidelines, and be submitted through EasyChair. All accepted papers will be presented at the workshop and published on CEUR-WS.

How to participate in the shared task

1. Contact us so we can be aware of your participation and inform you about any possible change, issue, etc. (see contact details at the bottom of this page)
2. Read the task and data description
3. Get the input data (initial translations) 
4. Run your system on the input data
5. Get the output results (inferred translations) and format them according to the guidelines (see the task and data description section)
6. Send the output data to the organisers and wait for the evaluation results
7. Write and submit a system description paper
8. Present your paper at the workshop

Important dates

13/11/2018 – First call for participation
06/12/2018 – Technical description of the evaluation process and data provided by organisers
08/03/2019 (previously 01/03/2019) – Submission of results by participants / submission of regular papers
01/04/2019 – Notification of regular papers
08/04/2019 (previously 01/04/2019) – Evaluation results communicated by organisers
22/04/2019 (previously 15/04/2019) – Submission of system description papers
20/05/2019 – Workshop day

Organisers

  • Jorge Gracia, University of Zaragoza, Spain
  • Besim Kabashi, Friedrich-Alexander University of Erlangen-Nuremberg and Ludwig-Maximilian University of Munich, Germany
  • Ilan Kernerman, K Dictionaries, Tel Aviv, Israel

Reviewing committee

  • Julia Bosque-Gil, Universidad Politécnica de Madrid, Spain
  • Thierry Fontenelle, Translation Centre for the Bodies of the EU, Luxembourg
  • Jorge Gracia, University of Zaragoza, Spain
  • Besim Kabashi, Friedrich-Alexander University of Erlangen-Nuremberg & Ludwig-Maximilian University of Munich, Germany
  • Ilan Kernerman, K Dictionaries, Israel
  • Nikola Ljubešić, University of Zagreb, Croatia
  • Shervin Malmasi, Harvard University, USA
  • Elena Montiel-Ponsoda, Universidad Politécnica de Madrid, Spain
  • John McCrae, National University of Ireland, Galway
  • Georg Rehm, German Research Center for Artificial Intelligence (DFKI), Berlin
  • Arvi Tavast, Institute of the Estonian Language, Tallinn
  • Liling Tan, Saarland University, Germany & Nanyang Technological University, Singapore
  • Marcos Zampieri, University of Köln, Germany

References

Some papers describing previous work on translation inference across dictionaries:
  • McCrae J. P., Bond, F., Buitelaar, P., Cimiano, Ph., Declerck, Th., Gracia, J., Kernerman, I., Montiel Ponsoda, E., Ordan, N. and Piasecki, M. (Eds.): Proceedings of the Workshop “Shared Task on Translation Inference Across Dictionaries”, co-located with the 1st Conference on Language, Data and Knowledge (LDK 2017). Galway, Ireland 2017. See http://ceur-ws.org/Vol-1899/.
  • Villegas, M., Melero, M., Gracia, J., and Bel, N. 2016. Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries. In LREC 2016 Proceedings: 613–622. http://repository.dlsi.ua.es/242/1/pdf/175_paper.pdf
  • Saralegi, X., Manterola, I. and San Vicente, I. 2011. Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 846–856. ACL. http://dl.acm.org/citation.cfm?id=2145526.
  • Shezaf, D. and Rappoport, A. 2010. Bilingual Lexicon Generation Using Non-Aligned Signatures. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 98–107. ACL. http://dl.acm.org/citation.cfm?id=1858692
  • Kaji, H., Tamamura, S. and Erdenebat, D. 2008. Automatic Construction of a Japanese-Chinese Dictionary via English. In LREC 2008 Proceedings: 699–706.
  • Mausam, Soderland, S., Etzioni, O., Weld, D., Skinner, M. and Bilmes, J. 2008. Compiling a Massive, Multilingual Dictionary via Probabilistic Inference. In Annual Meeting of the Association of Computational Linguistics. ACL. https://www.cs.washington.edu/sites/default/files/ai/papers/tmpiVvJEg.pdf
  • Tanaka, K. and Umemura, K. 1994. Construction of a Bilingual Dictionary Intermediated by a Third Language. In Proceedings of the 15th Conference on Computational Linguistics, Volume 1, 297–303. ACL. http://dl.acm.org/citation.cfm?id=991937

Supported by

Lynx and Prêt-à-LLOD have received funding from the Horizon 2020 European Union (EU) Research and Innovation programme.

Contact

To inquire about any aspect of this shared task, please send an email to Jorge Gracia.