ODIN

Overview

ODIN stands for the Online Database of Interlinear Text. It is a collection of interlinear glossed text (IGT) instances extracted from linguistic documents on the Web.

As of version 2.0, ODIN is distributed in the Xigt format (as well as in plain text) and is licensed under the Creative Commons CC-BY 4.0 license.

Version 2.1 includes enriched data from the INTENT project, as well as numerous improvements to the cleaning and normalization of the original data.
A flowchart describing how INTENT enriches IGTs is available here.

Download

Version  Date        Description
v2.1     2016-03-14  IGT instances in the plain text format and in the Xigt format, as well as Xigt data enriched by INTENT.
                     Contains 158,007 IGT instances from 2,027 documents covering 1,496 languages.
                     [Download] [Changelog] [Readme]
v2.0     2014-07-05  IGT instances in the plain text format and the Xigt format.
                     Contains 158,007 IGT instances from 2,027 documents covering 1,496 languages.
                     [Download] [Changelog]
v1.0                 First release. A GUI search interface is hosted by The LINGUIST List website. [View]

Citation

If you make use of ODIN in your research, please cite the following papers:

  • William D. Lewis and Fei Xia, 2010. Developing ODIN: A Multilingual Repository of Annotated Language Data for Hundreds of the World’s Languages, in Journal of Literary and Linguistic Computing (LLC), 25(3):303-319.
    [pdf]
  • Fei Xia, William D. Lewis, Michael W. Goodman, Joshua Crowgey, and Emily M. Bender, 2014. Enriching ODIN, in Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), Reykjavik, Iceland.
    [pdf]

Related Publications

  • Michael Wayne Goodman, Joshua Crowgey, Fei Xia, and Emily M. Bender, 2015. Xigt: Extensible Interlinear Glossed Text for Natural Language Processing, in Language Resources and Evaluation, 49(2):455-485.
    [pdf] [bib]
  • Fei Xia, Michael Wayne Goodman, Ryan Georgi, Glenn Slayden, and William D. Lewis, 2015. Enriching, Editing, and Representing Interlinear Glossed Text, in Computational Linguistics and Intelligent Text Processing, 9041:32-46.
    [pdf] [bib]
  • Fei Xia, William Lewis, Michael Wayne Goodman, Joshua Crowgey, and Emily M. Bender, 2014. Enriching ODIN, in the Proceedings of LREC 2014, p3151-3157.
    [pdf] [bib]

Example

The following Icelandic [isl] example is from:

Sigurðsson, Halldór Ármann.
“The Icelandic Noun Phrase: Central Traits.”
Arkiv för nordisk filologi 121 (2006): 193-236.
[pdf]

The example has been converted into the Xigt format and enriched by INTENT.
Not all annotations are shown; the original XML file is here.
The example is visualized with the XigtViz IGT renderer.
Interlinear annotations are shown in columns, and all annotations can be
seen by hovering your mouse cursor over an item. The immediate target of
annotation has a blue border, while ancestors are lightly shaded.
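
The relationship between an annotation, its immediate target, and its ancestors can also be traced programmatically. The sketch below is only an illustration: it uses Python's standard `xml.etree.ElementTree` rather than any ODIN or Xigt tooling, and the XML fragment is an invented, heavily simplified Xigt-style example (German rather than the Icelandic example above), not data taken from ODIN. It assumes that items point at their targets through `alignment` or `segmentation` attributes of the form `id` or `id[start:end]`. The helper names `index_items`, `ancestors`, and `resolve_text` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Invented, simplified Xigt-style fragment (NOT taken from the ODIN data).
SAMPLE = """
<xigt-corpus>
  <igt id="i1">
    <tier type="phrases" id="p">
      <item id="p1">der Hund bellt</item>
    </tier>
    <tier type="words" id="w" segmentation="p">
      <item id="w1" segmentation="p1[0:3]"/>
      <item id="w2" segmentation="p1[4:8]"/>
      <item id="w3" segmentation="p1[9:14]"/>
    </tier>
    <tier type="glosses" id="g" alignment="w">
      <item id="g1" alignment="w1">the</item>
      <item id="g2" alignment="w2">dog</item>
      <item id="g3" alignment="w3">bark.3SG</item>
    </tier>
  </igt>
</xigt-corpus>
"""

# Assumption: a reference is an item id with an optional [start:end] span.
REF = re.compile(r"^(?P<id>[A-Za-z][\w\-]*)(?:\[(?P<start>\d+):(?P<end>\d+)\])?$")


def index_items(igt):
    """Map item ids to their <item> elements within one <igt>."""
    return {item.get("id"): item for item in igt.iter("item")}


def ancestors(item, items):
    """Return the ids reached by following alignment/segmentation references."""
    chain = []
    current = item
    while True:
        ref = current.get("alignment") or current.get("segmentation")
        match = REF.match(ref) if ref else None
        if match is None or match.group("id") not in items:
            break  # no reference, or an expression this sketch does not handle
        current = items[match.group("id")]
        chain.append(current.get("id"))
    return chain


def resolve_text(item, items):
    """Return an item's own text, or the span it selects from its target."""
    text = (item.text or "").strip()
    if text:
        return text
    ref = item.get("segmentation")
    match = REF.match(ref) if ref else None
    if match is None or match.group("id") not in items:
        return ""
    base = resolve_text(items[match.group("id")], items)
    if match.group("start") is None:
        return base
    return base[int(match.group("start")):int(match.group("end"))]


root = ET.fromstring(SAMPLE.strip())
items = index_items(root.find("igt"))

# The gloss g2 targets the word w2, which in turn is a span of the phrase p1.
print(ancestors(items["g2"], items))     # ['w2', 'p1']
print(resolve_text(items["w2"], items))  # Hund
```

Here the first id in the returned chain corresponds to the immediate target of annotation, and the remaining ids correspond to its ancestors. Real ODIN files may use richer reference expressions than this sketch handles.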