The Transliteration of Tibetan
Quick link to documentation on THDL's Extended Wylie transliteration system.
Background on Transliteration
All human languages have complex histories, with individual words often stemming from a bewildering but discernible array of borrowings from other languages in an ongoing process. In addition to such complex genealogies of individual words – genealogies quite unknown to most speakers of the language – speakers of any given language will often have occasion to self-consciously utilize another language's terms within their speech. Obvious examples are personal names, toponyms, products or items unique to the culture in question, and so forth.
In determining the original source "language" of a given term, i.e. whether it is an original "Tibetan" term or a rendering of a foreign word, it must be realized that the lines between "etymology", "loan words" and simple representations of a foreign word are quite arbitrary. Many words of a given language can be shown historically to stem from other languages, but are by now so completely integrated into the language that the typical language user has no sense at all of the word being "foreign". In contrast, words typically thought of as "loan" words tend to be more recent borrowings, and/or their use is more selective and limited within the language, so that speakers retain a clear sense of the word's "foreignness". In contrast to both of these situations, in many contexts we quite deliberately refer to foreign terms in our own texts without any sense that they are part of our language – examples are language-instruction textbooks, essays (journalistic or academic) on a certain culture, and so forth. While in many cases the boundaries between these three phenomena are very clear, in other cases they are not – the important thing in a process of scholarly documentation is that one is consistent and explicit in application. The issue of transliteration – rendering a term from another language in one's own script – is thus most pertinent to the third case, namely self-conscious representations of another language's terms as deriving from that language.
In fact, the way foreign words are incorporated into a language also differs significantly according to the type of word and the socio-political circumstances of the language. For example, words for concepts tend, perhaps, to be translated rather than borrowed, while words for new products – such as tofu – tend to be incorporated wholesale and rapidly. Where a culture is dominated politically by another culture with a separate language, one also finds a more rapid explosion of new loan words and, for obvious reasons, a tendency to borrow rather than translate.
One of the more interesting types of words is toponyms, i.e. names for specific places. For obvious reasons – and especially for travelers – we tend to try to learn the toponyms employed by the people who live in the place in question, albeit rendered in our own script in a way that we can pronounce easily (popular uses) and/or that reproduces the native orthography (scholarly uses). In addition, because the place in question continues to be anchored in a particular geographical part of the world inhabited by specific groups of people speaking a specific language or languages, references to a toponym from outside of a given culture tend to be very conservative in retaining their distinct status as foreign language terms even after extensive and widespread use. Thus whereas "tofu" might have quickly become in essence an English word, "Lhasa" or "Beijing" remains clearly a foreign language toponym even after years of use.
When self-conscious references to foreign language terms occur in written documents, one is faced with the challenge of how best to represent those terms in one's own script. This process of representation is termed transliteration ("to represent or spell in the characters of another alphabet") or, more broadly, transcription ("to represent (speech sounds) by means of phonetic symbols", "to make a copy of (dictated or recorded matter) in longhand or on a machine (as a typewriter)", or "to make a written copy"). There are two distinct methodologies one can pursue: one can attempt to represent the native spelling of the term in question, or one can attempt to represent the sound of the term in question, in each case utilizing one's own script. We refer to these two methodologies as orthographic transliteration and phonetic transliteration, respectively. The former aims to represent terms' standard spelling, or orthography, while the latter aims to represent terms' standard pronunciation, or phonology. For some languages, a single transliteration system can simultaneously serve both orthographic and phonetic needs because the orthography reasonably approximates the pronunciation. An example is Sanskrit, which can be easily represented in an orthographic transliteration in Roman script that in turn can be easily pronounced and remembered by a foreign reader without special training or knowledge. In contrast, Tibetan requires two distinct transliteration schemes – orthographic and phonetic – due to the considerable divergence between spelling and pronunciation.
Orthographic transliteration in general is relatively straightforward. The major issues are threefold:
- Disambiguation
- Comprehensiveness
- Graphical particularities
The first is to ensure that the system is unambiguous, so that a given transliterated letter and/or word cannot be reconstructed back into the original language in two alternative ways. The second is to ensure that the system is comprehensive, and deals with all possible characters and modifications that appear in the original language. This is the more difficult issue, especially when the transliteration system is meant to account for the full history of the language, as well as special uses. Often transliteration systems only account for "standard" uses of the language in question, and offer no guidelines for more uncommon uses – such as how the source language and its script might themselves be used to represent other languages, special diacritic marks inherent to the script, and archaic forms. The third issue pertains to the need to indicate not only the spelling proper, but also the graphical details of the original script. For example, a given word with the same spelling might in different centuries represent a given conjunction of two letters in graphically different ways, or it might utilize abbreviations. The orthographic transliteration system must then be able to indicate these graphical particularities in addition to the spelling. It should be noted that many languages have no tradition of using their own script to represent foreign language words in a way that documents the original orthography of the term, but rather only represent these words phonetically.
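The disambiguation requirement can be illustrated with a minimal Python sketch. It relies on a well-known Wylie convention: a period separates a prefixed g from a root y ("g.ya"), so that the result is not confused with a root g carrying a subjoined y ("gya"). The four-part syllable tuple and the function names below are hypothetical simplifications introduced for illustration only; they are not part of any THDL tool and ignore most of Tibetan syllable structure.

```python
from collections import defaultdict

# Each syllable is a simplified (prefix, root, subjoined, vowel) tuple.
# This layout is an assumption for illustration, not a full model of Tibetan.

def render_naive(prefix, root, subjoined, vowel):
    """Concatenate the parts with no disambiguating mark."""
    return prefix + root + subjoined + vowel

def render_dotted(prefix, root, subjoined, vowel):
    """Insert a period between prefix g and root y, as Wylie does in "g.ya"."""
    separator = "." if (prefix, root) == ("g", "y") else ""
    return prefix + separator + root + subjoined + vowel

def find_collisions(render, syllables):
    """Return every output string produced by more than one source syllable."""
    by_output = defaultdict(list)
    for syllable in syllables:
        by_output[render(*syllable)].append(syllable)
    return {text: sources for text, sources in by_output.items() if len(sources) > 1}

SYLLABLES = [
    ("g", "y", "", "a"),  # prefix g + root y
    ("", "g", "y", "a"),  # root g + subjoined y
]

print(find_collisions(render_naive, SYLLABLES))   # both syllables collapse to "gya"
print(find_collisions(render_dotted, SYLLABLES))  # no collisions remain
```

A scheme passes this check only when every distinct source spelling yields a distinct Roman string, which is what allows the original spelling to be reconstructed.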
Phonetic transliteration is somewhat more complex. The main issue is that, unlike the black and white issue of representing spelling, representing pronunciation is not an exact science. To begin with, authors aim at different degrees of accuracy. Thus a popular publication may desire simply a rough approximation of pronunciation using simple characters that anyone could pronounce, leading to schemes often referred to as "simplified phonetic schemes", or "simplified transliteration". In contrast, dictionaries and linguistic research in general aim at more precise modes of representing the sound, which often use special diacritic marks in conjunction with the relevant alphabet in order to allow for more precise indication of sounds. Despite the increased accuracy, such schemes have several problems: they require some degree of study, which can mean they are simply ignored by many readers/users; their degree of divergence from ordinary practices in the target language means that many if not most readers have difficulty in remembering and thereby utilizing them; and finally the use of diacritics can hinder the use of these schemes in various digital contexts. An international standard for precise phonetic transcription has evolved that goes by the abbreviation IPA, or International Phonetic Alphabet. However, while it is used widely by linguists, few others are able to utilize this technical scheme, and hence a wide variety of other phonetic schemes have continued to be used in various contexts. Finally, representing sound with scripts leaves room for disagreement as to the best representation in any given context. In addition, it should be noted that each language is characterized by diverse spoken practices based upon regional location and social class. Thus any phonetic scheme must be capable of representing the full spectrum of sounds possible in all dialectal variations.
Just as importantly, it means that no use of a phonetic scheme can represent the language in general; it can only represent a specific dialect of the language at a specific time period.
Practically speaking, written sources are overall extremely inconsistent in the degree to which they employ systematic transliteration schemes. There are three broad tendencies:
- Inconsistency: any given term can be rendered in multiple and inconsistent ways within the same source without rhyme or reason.
- Term-based consistency: the spelling for any given term – a place name, for example – is consistently the same in a given source, but no system with consistent guidelines determines the spelling of each term/toponym in a regular process; for example, some spellings might be phonetic transliterations while others are orthographic transliterations, the practice for expressing phonetic renderings might be non-systematic and haphazard, or multiple systems might even be applied to different terms.
- Systematic consistency: a scientific system is being used consistently for the representation of each term from the source language in the target language, so that not only are consistent spellings used for each term, but those spellings are all determined by a single unambiguous system with explicit principles.
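The first of these tendencies can be detected mechanically. The following is a rough Python sketch, assuming the occurrences in a source have already been paired by hand with the term they render (here keyed by a Wylie spelling); the function name and the sample data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

def inconsistent_terms(occurrences):
    """Map each term to its renderings, keeping only terms rendered in more than one way.

    `occurrences` is an iterable of (term, rendering) pairs, where `term`
    identifies the source-language word and `rendering` is how it was
    spelled at one place in the source being audited.
    """
    renderings = defaultdict(set)
    for term, rendering in occurrences:
        renderings[term].add(rendering)
    return {term: seen for term, seen in renderings.items() if len(seen) > 1}

# Illustrative sample only; not drawn from any actual publication.
OCCURRENCES = [
    ("lha sa", "Lhasa"),
    ("lha sa", "Lasa"),
    ("lha sa", "Lhasa"),
    ("bkra shis", "Tashi"),
]

print(inconsistent_terms(OCCURRENCES))
```

An empty result shows only that the source is at least term-consistent. Telling term-based consistency apart from systematic consistency would additionally require checking each rendering against the explicit rules of a single scheme, which this sketch does not attempt.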
Tibetan Transliteration
Tibetans have used a variety of their own scripts to record and transmit their own language since at least the eighth century. However, few people outside of their own culture will ever learn those scripts, and thus it is necessary when dealing with Tibetan language terms outside of Tibetan culture to represent Tibetan with other scripts. This is true whenever a non-Tibetan language publication needs to refer to the name of a Tibetan person, a Tibetan region is mapped with Tibetan place names, any publication refers to a Tibetan passage or technical term, and so forth. Unfortunately, Tibetan requires two distinct transliteration schemes – orthographic and phonetic – due to the considerable divergence between spelling and pronunciation. Thus we must deal with two distinct challenges:
- How does one write a Tibetan term in another script in such a way that its original Tibetan spelling can be easily and unambiguously discerned?
- How does one write a Tibetan term in another script in such a way that the original sound of Tibetan speech, whether originally oral speech or the pronunciation of written Tibetan passages, can be easily approximated?
The present document focuses on both types of transliteration as they apply to the use of Roman script to transcribe Tibetan, and further on the pronunciation of Roman script by English-language speakers. Our goal is to survey the systems presently in use, and to propose drafts of standards for both types of systems as currently used within THDL. We hope ultimately to expand the document to include other scripts and speakers of other languages as they pertain to representing Tibetan. However, we believe the initial focus on Roman script and English is justified, since they function as an international medium of communication more than any other script or language in the world. A general standard has emerged for orthographic transliteration in Roman script, which is termed "Wylie," and we have adopted this with refinements to make it comprehensive. Unfortunately, no such standard has emerged for phonetic transliteration, and a bewildering array of imprecise and precise systems has produced a chaos of alternative renderings. We have formulated an easy-to-use system that is concordant with Wylie and generally consonant with the most widespread practices.