|
What is Unicode, why was it developed, and what problems are there? Both ASCII and EBCDIC have a severe problem: they don't represent enough characters. How so?
But there are only 8 bits available! Exactly. That is the problem. Even if we use all 8 bits, we still have only 256 different combinations (2^8 = 256).
We need to add more bits. But how many? Well, since the basic addressable unit in RAM is a byte, why not add another 8 bits? With 16 bits we would have 2^16 = 65,536 different combinations. A fair number. Is someone doing that?
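The arithmetic above can be checked with a short Python sketch. The function name `combinations` and the choice of the euro sign as a sample character are illustrative assumptions, not part of these notes.

```python
def combinations(bits: int) -> int:
    """Return how many distinct values fit in the given number of bits."""
    return 2 ** bits

# Compare the number of combinations for one byte and for two bytes.
for bits in (8, 16):
    print(f"{bits:2d} bits -> {combinations(bits):,} combinations")

# Sample character (an assumption for illustration): the euro sign, U+20AC.
code = ord("\u20ac")

# The largest value that fits in n bits is combinations(n) - 1.
fits_in_8_bits = code <= combinations(8) - 1
fits_in_16_bits = code <= combinations(16) - 1

print(f"code point {code}: fits in 8 bits? {fits_in_8_bits}, in 16 bits? {fits_in_16_bits}")
```

The sample character has a code point above 255, so it cannot be numbered with a single byte, but it fits comfortably in two.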
For some time, people have been developing schemes to expand the set of symbols that can be represented on computers. It wasn't until 1988, however, that the Unicode Project began; the Unicode Consortium was incorporated in 1991. Unicode is the international standard whose goal is to provide the means to encode the text of every document people want to store in computers. This includes all scripts still in active use today, many scripts known only to scholars, and symbols that do not strictly represent scripts, such as those used in mathematics, linguistics, and APL. Despite technical problems, limitations, and criticism of its process, Unicode is today considered the most complete character set and one of the largest, and it has become the dominant encoding scheme in the internationalization of software and in multilingual environments. (Wikipedia)
What characters are included in Unicode? Unicode is still an ongoing project, and probably will be for a long time. Some (there are many more) of the items being considered for inclusion are:
It's a very long list.
Who is involved in deciding what gets included? The Unicode Consortium consists of governments, corporations (mostly from the information and technology sectors), research and educational institutions, industry groups and associations, and individuals (if you wish, YOU could become a member). As you can imagine, there are a lot of problems.
What problems? Aside from the technical problems (and there are many of those), there are political problems (national and corporate interests are involved), disagreements about what should be included, font problems (no fonts, no characters), and storage and processing problems (by doubling the number of bytes used to represent a character, we are doubling the storage and processing requirements). It's going to take some time.
Some good references include:
At this point in time, you should be able to answer the following questions:
|