The goal of this project is to explore the possibility of extending OpenOffice.org (an open source suite of office productivity software) with interlinear glossing capabilities. The extension will allow linguists to annotate texts for certain kinds of grammatical information and to link the words and morphemes in those texts to a lexical database, so that they can build a lexicon and a collection of annotated texts in a working environment that is likely already familiar: that of a modern word processor. Furthermore, since OpenOffice.org’s native document format, OpenDocument Format, is XML-based, resources produced using this system will already be in a form that facilitates archiving and resource interoperation (in keeping with the recommendations of E-MELD’s School of Best Practices).
The outputs of this project can be summarized as follows:
No project we are aware of has attempted to build a linguistics-specific toolkit within an existing office suite, even though this would clearly be desirable from the standpoint of the ordinary working linguist, who typically already makes extensive use of such a suite. At the conclusion of the project, we will be in a position to assess the feasibility of using OpenOffice.org as a general platform for linguistic data manipulation tools and, if the results of our preliminary research are promising, to expand the project’s scope.
Completion of primary research and programming by June 2008.