THDL Tools Status
Developing a broad spectrum of tools, ranging from work flow management and input tools to data repositories to final delivery systems, has been one of the major foci of the THDL from the beginning. Rather than taking the easy path of commercial systems and shortcuts with quick results but long-term chaos, we have from the start taken the difficult and slow tack of building integrated systems that adhere to international standards. While this has greatly delayed the posting of content, it means that the infrastructure will endure for years to come and that data can be easily migrated and maintained in flexible ways.
Fedora: Digital Library Infrastructure
The entirety of THDL is ultimately destined to be stored and delivered using the FEDORA digital library system. FEDORA was first released in a functional form in mid-2003 by the University of Virginia Library, and 2004 will be the first year that THDL begins to directly implement it for its own collections. Much of THDL has been built with an eye towards the protocols and standards of FEDORA, but until now none of THDL has actually been implemented within FEDORA. This is one of the main technical challenges for 2004.
Systems
While Fedora will provide the infrastructure for the integrated digital library, the actual specific systems offering particular functions still have to be built by THDL. Most of these are either MYSQL or XML databases which we have custom-designed for use within Tibetan and Himalayan Studies.
Audio-Video database: from 2000 to 2003 we built a sophisticated system for cataloging audio-video materials, managing related work flow, building collections of related titles, and delivering views of titles combining media and transcripts. This system is now working smoothly and will be migrated from Cold Fusion to PHP-MYSQL in the fall of 2003. Once the migration has been completed, we will focus on integrating it with Savant so that the transcripts are dynamically linked line by line to the corresponding audio-video titles.
Image database: in September 2003 we finally integrated our present collection of 25,000+ images into a revamped online Filemaker Pro database. For the next year we are focusing on migrating collections of images one by one from that database into sophisticated XML collections which use a University of Virginia Library scheme known as GDMS ("general descriptive modeling system").
Gazetteer: the gazetteer was released in a very simple form in 2002 as an XML repository of place names. Version 2.0 was released in September 2003 with dramatic updates. The gazetteer now allows searching and browsing, and more importantly allows a place name to be dynamically plotted on a map of the area. During 2004 our focus will be on connecting the gazetteer to the audio-video and image databases so that users can automatically request all media for a given place located in the gazetteer.
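The planned link between the gazetteer and the media databases can be pictured as a lookup keyed on a shared place identifier. The following is only a minimal sketch under that assumption, not a description of the actual THDL design: the place IDs, record titles, field names, and the functions `build_place_index` and `media_for_place` are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: IDs, names, and titles are invented for illustration.
GAZETTEER = {"p001": "Sera Monastery", "p002": "Lhasa"}
AV_TITLES = [
    {"title": "Sample debate recording", "place_id": "p001"},
    {"title": "Sample street interview", "place_id": "p002"},
]
IMAGES = [
    {"title": "Sample courtyard photograph", "place_id": "p001"},
]


def build_place_index(*collections):
    """Group media records from several collections by their place ID.

    Each argument is a (kind, records) pair, e.g. ("audio-video", AV_TITLES).
    """
    index = defaultdict(list)
    for kind, records in collections:
        for record in records:
            index[record["place_id"]].append((kind, record["title"]))
    return index


def media_for_place(name, gazetteer, index):
    """Return every (kind, title) pair linked to the named gazetteer place."""
    place_ids = [pid for pid, place in gazetteer.items() if place == name]
    return [item for pid in place_ids for item in index.get(pid, [])]


index = build_place_index(("audio-video", AV_TITLES), ("image", IMAGES))
print(media_for_place("Sera Monastery", GAZETTEER, index))
```

The design choice in this sketch is that each media record stores only a place identifier, so a correction to a place name in the gazetteer would not require touching the media databases.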
Flash Maps: the system for connecting interactive Flash maps with databases of information on features such as buildings has been implemented for Sera Monastery. There is still further work to do on Sera, and then we plan to extend the work to redo the Flash maps of Tibet and other areas in a true data-driven fashion.
Literary databases: this is THDL's oldest tools initiative, and was designed originally for Tibetan Buddhist literature. It is currently in good shape for cataloging Tibetan Buddhist literature.
Dictionary: this custom-designed MYSQL database has been built and deployed, but still lacks a decent user interface. We are currently working on interface issues with the goal of a December 2003 release of a more friendly interface.
Bibliographical database: we currently use the external Scout Portal project for the Web-based management and display of bibliographies of Web sites. We are working on adapting the system so that it can be used for bibliographies of text resources as well, and hope that the system will be on-line by the end of 2004.
Roster: this custom-designed MYSQL database has been built and deployed, but will likely need revisions as it is used more intensively.
Discussion forums: this is a commercial system adapted for THDL's use by the University of Virginia Library. It is currently functional, though intensive use of the forums by users has not yet begun.
E-folio: this is a Cold Fusion-based system built by Yitna Firdyiwek at the University of Virginia for the collaborative management of a classroom space, with posting of assignments, commenting on each other's work, on-line creation of HTML web pages and so forth. However, at present it is only available for internal use at the University of Virginia, and it is unclear if we will secure funding to generalize it and help offer it more widely.
Fonts
Our special focus to date has been on Tibetan fonts. In partnership with Tony Duff of Tibetan Computer Company and the Trace Foundation, we have helped make available two high quality Tibetan fonts (Tibetan Machine/Web) designed by Mr. Duff, as well as a variety of powerful input and presentation tools using those fonts (see below).
However, the future of Tibetan computing lies in Unicode, which is the international standard for encoding text. By using Tibetan Unicode fonts, users will be able to use Tibetan in a wide variety of software. Nathaniel Garson is currently working intensively on creating a Unicode OpenType font from Tibetan Machine under the guidance of Chris Fynn. This is due for completion by the spring of 2004, and will be made available for free to the public.
We have also separately created tools for using Unicode diacritic fonts to render the special diacritic marks necessary for writing many Asian languages in roman script. These include a system keyboard for Windows.
Software
Our software includes tools to use Tibetan within Word, type it over the Web, and create transcripts of video/audio.
Converters: David Chandler and Tashi Tsering are both working on robust converters, in Java and C respectively, which will convert to and from Unicode. These converters will allow users with materials input in non-Unicode fonts to convert their materials into Unicode Tibetan; conversely, they will be able to take Unicode Tibetan materials and convert them back into older "legacy" fonts if they have some reason to do so. Finally, they will allow for conversion back and forth between Tibetan and THDL Extended Wylie transliteration.
Chandler's conversion routines can be seen as embedded within Jskad, and are currently available in beta version. They continue to be updated monthly. Tashi's converters (supported by the Trace Foundation) are currently still in an alpha form, and we do not anticipate a public testing release until early 2004.
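To give a feel for what conversion between Wylie and Unicode Tibetan involves, here is a deliberately tiny sketch in Python. It is not Chandler's or Tashi's code and does not implement THDL Extended Wylie: it assumes syllables made of one consonant, one vowel, and an optional single suffix consonant, and it rejects everything else (stacked letters, prefixes, punctuation). The function names `wylie_to_unicode` and `unicode_to_wylie` are invented for this illustration.

```python
import re

# Subset of Wylie consonants mapped to Unicode Tibetan letters.
CONSONANTS = {
    "k": "\u0f40", "kh": "\u0f41", "g": "\u0f42", "ng": "\u0f44",
    "c": "\u0f45", "ch": "\u0f46", "j": "\u0f47", "ny": "\u0f49",
    "t": "\u0f4f", "th": "\u0f50", "d": "\u0f51", "n": "\u0f53",
    "p": "\u0f54", "ph": "\u0f55", "b": "\u0f56", "m": "\u0f58",
    "ts": "\u0f59", "tsh": "\u0f5a", "dz": "\u0f5b", "w": "\u0f5d",
    "zh": "\u0f5e", "z": "\u0f5f", "y": "\u0f61", "r": "\u0f62",
    "l": "\u0f63", "sh": "\u0f64", "s": "\u0f66", "h": "\u0f67",
}
# The inherent vowel "a" is not written in Tibetan script.
VOWELS = {"a": "", "i": "\u0f72", "u": "\u0f74", "e": "\u0f7a", "o": "\u0f7c"}
TSHEG = "\u0f0b"  # the dot that separates syllables

REV_CONSONANTS = {v: k for k, v in CONSONANTS.items()}
REV_VOWELS = {v: k for k, v in VOWELS.items() if v}

# Longest consonant spellings first, so "tsh" is tried before "ts" and "t".
_CONS = "|".join(sorted(map(re.escape, CONSONANTS), key=len, reverse=True))
_SYLLABLE = re.compile(rf"({_CONS})([aiueo])({_CONS})?")


def wylie_to_unicode(text):
    """Convert space-separated simple Wylie syllables to Unicode Tibetan."""
    syllables = []
    for syllable in text.split():
        match = _SYLLABLE.fullmatch(syllable)
        if match is None:
            raise ValueError(f"unsupported syllable in this sketch: {syllable!r}")
        root, vowel, suffix = match.groups()
        syllables.append(
            CONSONANTS[root] + VOWELS[vowel] + (CONSONANTS[suffix] if suffix else "")
        )
    return TSHEG.join(syllables)


def unicode_to_wylie(text):
    """Convert tsheg-separated simple Unicode Tibetan syllables back to Wylie."""
    syllables = []
    for syllable in filter(None, text.split(TSHEG)):
        root, rest = syllable[0], syllable[1:]
        if root not in REV_CONSONANTS:
            raise ValueError(f"unsupported syllable in this sketch: {syllable!r}")
        vowel = "a"  # assume the inherent vowel unless a vowel sign follows
        if rest and rest[0] in REV_VOWELS:
            vowel, rest = REV_VOWELS[rest[0]], rest[1:]
        if len(rest) > 1 or (rest and rest not in REV_CONSONANTS):
            raise ValueError(f"unsupported syllable in this sketch: {syllable!r}")
        suffix = REV_CONSONANTS[rest] if rest else ""
        syllables.append(REV_CONSONANTS[root] + vowel + suffix)
    return " ".join(syllables)


tibetan = wylie_to_unicode("bod yig")
print(tibetan, unicode_to_wylie(tibetan))
```

A real converter has to handle consonant stacks, prefixes, multiple suffixes, punctuation, and legacy font encodings, which is why robust conversion is a substantial project in its own right.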
Wylie Word: this software tool was originally formulated by Nathaniel Garson to enable users to type in Wylie within Microsoft Word using Tibetan Machine Web. In September 2003, version 2.0 will be released, which includes the ability to switch back and forth between Tibetan and THDL Extended Wylie, as well as to convert Tibetan into phonetic renderings. It also enables searching of the Tibetan. This new release was done by David Chapman on the basis of previous work by Garson. After the early fall 2003 release, a relaxed pace will be set for further revisions, including most prominently a Unicode version.
QuillDriver-Savant: this Java-XML tool is designed to enable the transcription, translation, annotation and timecoding of audio-video media using Tibetan script or other languages. In its "Savant" mode, it then allows users to play back the audio-video with the transcript dynamically connected. Currently only Tibetan and roman scripts are supported, but it is actively being used within THDL. A new grant for its development from 2003-6 has been awarded to Roger Andersen at UCLA, and thus there will be updates every few months as new functionality is added.
Jskad: this Java software tool is designed to facilitate the input of Tibetan Machine font on the Web or off-line using a variety of keyboards. In addition, it now has David Chandler's converters embedded within it (see above), thereby enabling users to switch between Tibetan Machine, Unicode, Wylie and other formats. Jskad continues to be updated and revised each month.
TiblEdit: this is a tool used by participants to give a user-friendly view of XML cataloging files of Tibetan literature for easy modification. It is not currently available to the public, but seems to work stably. Interested THDL participants should contact thdl@virginia.edu. We hope to put out a public release in 2004.
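The idea of a transcript that is dynamically connected to playback can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than Savant's actual implementation: it supposes that each transcript line carries a start time in seconds, and the sample lines and the function name `line_at` are invented.

```python
from bisect import bisect_right

# Hypothetical timecoded transcript: (start time in seconds, text of the line).
TRANSCRIPT = [
    (0.0, "First line of the transcript"),
    (4.5, "Second line of the transcript"),
    (9.2, "Third line of the transcript"),
]


def line_at(transcript, seconds):
    """Return the transcript line being spoken at the given playback time.

    Assumes the transcript is sorted by start time. Returns None when the
    playback position is before the first timecoded line.
    """
    starts = [start for start, _ in transcript]
    position = bisect_right(starts, seconds) - 1
    return transcript[position][1] if position >= 0 else None


print(line_at(TRANSCRIPT, 5.0))
```

Going the other way, from a clicked transcript line to a playback position, only requires reading that line's start time, which is what makes line-by-line timecoding useful in both directions.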
THDL Toolbox
This section of tools is building comprehensive documentation for use by participants ranging from manuals on photography in the field, to inputting e-texts, to marking up essays for the Web. When finished, it will be a huge repository of manuals and other documentation that will be of benefit to scholars and technologists alike. However, we have been slow to edit and organize the large amounts of documentation we have been accumulating. Getting an initial robust version up will be a major goal in 2004 with the hope that by the Spring this will be more than just a few isolated documents. As items are finished, they will be posted as discrete links in this section’s table of contents.
Creating Text (XML) for THDL: We have begun marking up large numbers of essays and reprints in XML using a TEI-derived system and an adaptation of the free Morphon XML editor. The documentation for how this is done in the THDL, most of which is itself in XML, is in the Web Development section of the THDL Toolbox. This page contains links to other useful pages such as:
- Introduction to XML
- Creating an XML Document
- Creating an XML Essay for THDL
- XML Markup Manual for THDL
- THDL Essay Term Glossary Table Creation Manual.doc
- XML Editors
- XML Resources
Collaboration
This section of tools is intended for the administration of THDL and is restricted to collaborators. Ultimately some sections of it as now listed will be migrated to "From Field to Web" when finished. At present this section is useful, but not well organized for non-insiders. From now through 2004 it will be organized into a more visually clear format, and new materials will be posted to facilitate the specific administration of THDL.