What is Unicode?

Over the course of the past year I have found myself explaining Unicode many times, and I have written various introductions for different groups. While I was in Liberia recently I put together an introduction for LIBTRALO and decided to post it on my blog for all who may be interested. You can also download a better designed version of this article from What is Unicode.

To understand what Unicode is, we need to understand how computers think. They only think in numbers. Letters are just a representation for humans of what the numbers are. These numbers are the codes for the corresponding letter.
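As a small illustration of the letter-to-number idea, here is a minimal Python sketch. It uses only the built-in `ord` and `chr` functions; the sample text "Abc" and the variable names are just my own choices for the example.

```python
# Every character has a numeric code; ord() shows it, chr() goes back.
sample = "Abc"

for ch in sample:
    print(ch, ord(ch))          # e.g. A 65

# Turn the text into its list of numbers...
codes = [ord(ch) for ch in sample]

# ...and rebuild the same text from those numbers.
text = "".join(chr(n) for n in codes)
print(codes, text)
```

The computer only stores the numbers; the letters appear when a program looks up which character each number stands for.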

Since the inception of computers there have been many standards for how letters should be coded. The problem is that each was designed for a specific major language. Most of these legacy standards stored each character in a single byte, so they could only handle a maximum of 256 characters.
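To see why this was a problem, here is a rough Python sketch that reads one and the same byte under three older 8-bit standards. The three encodings (Latin-1, Windows-1251 and ISO 8859-7) are simply ones I picked for illustration; any set of legacy code pages would show the same effect.

```python
# One stored byte, three legacy standards, three different letters.
raw = bytes([0xE9])

legacy_encodings = ["latin-1", "cp1251", "iso8859_7"]
decoded = {name: raw.decode(name) for name in legacy_encodings}

for name, letter in decoded.items():
    print(f"{name:10} 0xE9 -> {letter}")

# A single byte can hold only this many distinct values:
max_characters = 2 ** 8
print(max_characters)
```

The file on disk is identical in all three cases. What the reader sees depends entirely on which standard the software assumes, which is why files could not be shared without knowing the right font or code page.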

People who worked in minority languages, as LIBTRALO does, had to develop their own standards and in so doing created uniquely modified fonts. This worked fine within a language group, but data was not able to be shared outside of that group without also passing along the necessary fonts and keyboards. Sometimes even two organizations that were working in the same language were not able to share their files.

To combat these problems a new system has been developed that assigns a distinct code to every character in every writing system in the world. Unicode has room for more than one million different characters (1,114,112 code positions, numbered U+0000 through U+10FFFF).
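As a rough check of how much room Unicode has, this short Python sketch asks the interpreter for the highest code it accepts. `sys.maxunicode` is a standard Python attribute; the variable names are mine.

```python
import sys

# The highest code point Python (and Unicode) allows.
highest = sys.maxunicode
print(hex(highest))

# Codes start at zero, so the number of available positions is one more.
total_positions = highest + 1
print(total_positions)

# Asking for a code past the end of the range is an error.
try:
    chr(highest + 1)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
print(out_of_range_rejected)
```

Not every one of those positions is assigned to a character yet, which leaves plenty of space for scripts that are still being added.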

All of this is to say that, with Unicode, there is now a level playing field. Unicode is an international standard. It allows data to be shared among various organizations in Liberia and throughout the world. Once the linguistic analysis is done in a given language, the information can then be shared with missionaries and other organizations working in similar languages.

Benefits of Unicode:

  • As a well-developed international standard, it works across almost all computer programs and operating systems throughout the world. Something that is typed in Microsoft Word on a Windows computer in Liberia can be read in OpenOffice on a Macintosh in China. Information can be easily shared. It even works on the internet, so web pages can be published in any language.
  • Because it is an international standard, there are more choices available to us when we develop materials using Unicode. We are not limited to one font, but now have many choices. The best ones are from SIL, but even companies like Microsoft and Adobe are now developing very nice fonts that display Liberian characters.
  • Unicode improves our ability to archive. It used to be that when you archived a document, you had to make sure that you saved the specific font with that document so that it could be read. Now, if the file is saved using Unicode you can be sure that when it is opened again by somebody else it will be readable.
  • Unicode is not simply about how a character looks. It also records what a character is. Take, for example, “ɔ”. In Unicode it is stored as code point U+0254, officially named “LATIN SMALL LETTER OPEN O”. Because that identity is unambiguous, linguistic analysis tools can recognize it as a vowel and know what sound it represents.
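The last point can be tried out with Python's standard `unicodedata` module, which exposes the information Unicode keeps about each character. This is only a minimal sketch. Note that Unicode itself records the name and general category; knowing that “ɔ” is a vowel is something a linguistic tool would add on top, which is my assumption about how such tools work rather than something Unicode provides.

```python
import unicodedata

ch = "\u0254"  # the open o used in many Liberian languages

code_point = f"U+{ord(ch):04X}"        # the character's number, in Unicode notation
name = unicodedata.name(ch)            # the official character name
category = unicodedata.category(ch)    # "Ll" means lowercase letter

print(ch, code_point, name, category)

# The file form most programs exchange today is UTF-8.
# Encoding and decoding gives back exactly the same character.
stored = ch.encode("utf-8")
restored = stored.decode("utf-8")
print(stored, restored == ch)
```

Because the character is identified by its number and name rather than by a particular font, any Unicode font that includes U+0254 will display it correctly.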

More Information:
The Unicode Consortium (http://www.unicode.org/standard/principles.html)
Non-Roman Script Initiative (http://scripts.sil.org/unicode)
Keyboard Manager (http://www.tavultesoft.com)
