The paradox of encoding new scripts in Unicode

Posted on Wed 09 October 2013 in unicode

Over the past several years I have developed Unicode standards for numerous historical and minor scripts of South Asia. Some of my formally proposed encodings have been approved and are making their way through various stages of balloting as part of the process established by the International Organization for Standardization (ISO), where Unicode is also known as ISO/IEC 10646, while others have been formally published in the standard.

The inclusion of a script in Unicode is a major milestone. Native user and scholarly communities await these new encodings from the day I contact them to discuss my plans for encoding a particular script, and some have already been waiting for years by then. It takes nearly two years for an encoding to be published in Unicode after being approved by both the Unicode Technical Committee and ISO’s WG2 group. But once a script is in Unicode, the general perception of what “inclusion in the standard” means is far removed from what such inclusion actually delivers.

The disjunction between expectation and reality is truly a paradox: the encoding of a script in Unicode does not automatically imply that it will be supported out of the box. As is the case with other historical and minor scripts encoded in Unicode, it is unlikely that major software houses will rush to provide support for these scripts right away, if they do so at all.

If these companies do not implement support for these scripts, users will be unable to produce documents in their writing systems using common applications such as Microsoft Word and Adobe InDesign. Users will be unable to view websites containing text in these scripts using Internet Explorer, Firefox, and other browsers. Moreover, typographers wishing to design new, modern typefaces for these scripts will be hampered because their creations will be unusable on ubiquitous platforms such as Windows. The reason is that on Windows, OpenType, the common font standard, depends on Microsoft’s Uniscribe rendering engine, and a font will not work there until Uniscribe can handle the underlying script. Certainly, those who use Linux systems can rely on HarfBuzz, an open source OpenType rendering engine. Ultimately, though, from a practical perspective, Uniscribe support is important because Windows is the most commonly used operating system worldwide.
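To make the gap concrete, here is a minimal Python sketch of the layers described above. It assumes that a character is only usable when three things hold at once: the code point is assigned in Unicode, some installed font has a glyph for it, and the platform's rendering engine handles the script. The `Platform` class, its fields, and `support_report` are hypothetical names invented for illustration, and the script name is passed in by hand because Python's standard `unicodedata` module does not expose the script property. A real check would have to inspect actual fonts and the actual shaping engine.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class Platform:
    """Hypothetical description of one system's text stack (illustration only)."""
    name: str
    # Code points for which some installed font is assumed to have a glyph.
    font_coverage: set = field(default_factory=set)
    # Script names the rendering engine is assumed to be able to shape.
    shaper_scripts: set = field(default_factory=set)


def is_encoded(code_point: int) -> bool:
    """True if this Python build's Unicode database treats the code point as assigned.

    General category "Cn" means "not assigned" (it also covers noncharacters).
    """
    return unicodedata.category(chr(code_point)) != "Cn"


def support_report(code_point: int, script: str, platform: Platform) -> dict:
    """Report each layer separately, plus whether all three hold together."""
    encoded = is_encoded(code_point)
    has_glyph = code_point in platform.font_coverage
    shaper_ok = script in platform.shaper_scripts
    return {
        "encoded": encoded,
        "has_glyph": has_glyph,
        "shaper_ok": shaper_ok,
        "usable": encoded and has_glyph and shaper_ok,
    }


if __name__ == "__main__":
    # U+0915 DEVANAGARI LETTER KA, checked against a made-up platform that
    # has neither a font nor shaping support for the script.
    bare = Platform(name="demo-platform")
    print(support_report(0x0915, "Devanagari", bare))
```

In this sketch the first layer is the only one the standard itself controls. The other two depend entirely on what font designers and software vendors choose to ship, which is the point of the argument here.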

What good is a Unicode standard for a script if there is no way for the general public to actually make use of it? What good is a Unicode standard for a script if implementation of it is dependent upon the economic cost-benefit decisions of major software companies? Although historical and minority scripts and related linguistic ecologies may not be profit generators, they are certainly valuable from humanist perspectives.

So, what is the reality if Microsoft, Apple, Google, and others don’t support historical and minor scripts in their operating systems when a new version of the Unicode standard is published… or ever? For now, thankfully, we can provide font solutions using the Graphite rendering engine developed by SIL. However, Graphite fonts are not supported by Windows itself and, therefore, will not work in Microsoft Office, Internet Explorer, and other commonly used applications. Applications like LibreOffice and XeTeX support Graphite, but the user base for these is limited. In the end, at least Graphite exists, and those of us who work with lesser-used writing systems owe gratitude to SIL for having the foresight and imagination to provide a means of rendering writing systems independent of the whims of software houses.

I maintain the hope that Microsoft will continue to expand Uniscribe in order to support all scripts in the Unicode standard. But until the time comes when there is mandatory synchronization between releases of the standard and implementation, we must turn to other, albeit limited, solutions. As part of this solution, I will build Graphite fonts for all the scripts I have brought into Unicode in order to provide a basic level of support for these scripts in the Windows environment, even if the range of usage is limited. I am not a professional typographer, nor do I pretend to be, but such efforts are just an extension of the work I’ve started.

But I am an individual with limited resources and numerous constraints. I encourage Microsoft, Apple, Google, and others to meet me halfway. After all, what good is a standard if no one cares to support it?