utf-skt
Processing Sanskrit texts in utf-8 notation with Omega TeX
Introduction
Strictly speaking (La)TeX does not require a Sanskrit package, because all
diacritics can be produced by TeX commands: a-macron (ā) can be
produced through \=a, d-underdot (ḍ) as \d{d} and so forth. Although
there are individuals who, presumably as a sort of
cyber-prāyaścitta, still use this cumbersome system, there are
now various ways to ensure that the input text remains
readable. Unfortunately many Sanskritists who tried TeX a long time
ago, when none of these tools were available, discarded it for reasons
that no longer apply.
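To illustrate how cumbersome this input is, here is a minimal sketch that uses only the standard accent commands and loads no Sanskrit package. The sample words are chosen purely for illustration.

```latex
\documentclass{article}
\begin{document}
% Plain accent commands; no Sanskrit package is loaded.
Mah\=abh\=arata,    % Mahābhārata
K\d{r}\d{s}\d{n}a,  % Kṛṣṇa
pr\=aya\'scitta     % prāyaścitta
\end{document}
```

Every diacritic interrupts the word with a backslash command, which makes longer passages hard to read and to search.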
It is well known that the sudden recognition that computers would be
used in languages other than English led to the construction of many
"code pages", and few Indologists interested in computers could resist
constructing a more ingenious font layout than their predecessors for
use with Sanskrit and other Indian languages. As long as text files were
kept on a private hard disk this had no detrimental effects, but for
sharing documents, sending electronic texts to a publisher, creating
databases of texts, and for internet presentation, the existence of
more than a dozen ways to encode texts in one language was a
problem.
One attempt to resolve this was through the production of fonts in a
unified layout for different programs and platforms, as for instance
in the so-called CSXp (Classical Sanskrit Extended plus) convention,
which was also widely used with TeX. For typesetting texts in
Devanāgarī there is the devnag package, which consists of a
high-quality font and works with a preprocessor. This preprocessor
expects text in a specific coding: "aa" for a-macron, ".d" for
d-underdot etc. The disadvantage of this was that different input
files were required, one encoding for transcription and another for
Nāgarī, which had a negative effect on searching and conversion. If we
take into account that Sanskrit is only one in the canon of languages
used in Indology, the growing set of converters and preprocessors was
soon to become difficult to handle.
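The difference between the two input forms can be sketched with a single word. This fragment is only an illustration: it writes the transliteration with plain accent commands rather than an 8-bit font encoding, and it assumes the {\dn ...} markup of the devnag package, whose content must pass through the preprocessor before TeX sees it.

```latex
% Transliteration, written here with plain TeX accent commands:
sa\d{m}sk\d{r}ta

% The same word in the ASCII coding read by the devnag preprocessor
% (".m" for m-underdot, ".r" for r-underdot); not valid until preprocessed:
{\dn sa.msk.rta}
```

A search for one form will not find the other, which is the practical problem of keeping two encodings of the same text.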
Understandably it was expected that Unicode would be the solution to
some of these problems in the near future. The creation of a
Unicode-enabled TeX, Omega, was a step in this direction, but the unpredictable
cycles of development and support led to various successors, such as aleph
or mem. Another attempt to harmonize Unicode and LaTeX without depending
on Omega is the package ucs, which works with normal TeX and
translates a large set of utf-8 input into the correct TeX codes. It is a good
alternative to utf-skt for those who wish to stick with normal TeX, but only
for transliterated Sanskrit.
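For comparison, a minimal sketch of the ucs route might look as follows. It assumes that the file is saved as utf-8, that ucs covers the characters used, and that the chosen font encoding can build the accented glyphs; the sample words are illustrative only.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{ucs}              % Unicode support for ordinary LaTeX
\usepackage[utf8x]{inputenc}  % utf8x: the input encoding that comes with ucs
\begin{document}
% utf-8 input is translated into the corresponding accent commands.
Mahābhārata, Kṛṣṇa, prāyaścitta
\end{document}
```

The source stays readable, but the output is limited to transliteration; Devanāgarī is not covered by this approach.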
The present package is the result of occasional contacts between some, mostly
Indological, TeX users, which led to an informal discussion group,
which eventually included (in alphabetical order)
Stefan Baums, Jürgen Hanneder, Richard Mahoney, Norbert Preining, John
D. Smith, and Toru Tomabechi. In the course of these discussions it
became clear that most of the components for a Sanskrit environment
for OmegaTeX had already been written by Stefan Baums and Toru
Tomabechi and "only" had to be harmonized, with the exception of
an external translation program for Transliterated Sanskrit Unicode ->
Devanagari Unicode (ur2ud), which was written by J.D.
Smith.
The package described here is an attempt to put all these elements
into a simple LaTeX interface and add a few extensions, for instance to cover
Tamil transliteration.
Installation and Download
Usage
Components
Versions
J Hanneder (hanneder@indologie.uni-halle.de )