Chris Kern @ 2006-10-19 20:43:00
I'm going to go ahead and write a paper about the 幽霊文字 (ghost characters) that I talked about in an older journal entry. Since then I've read more about them and I have a much clearer idea of what they are. Hopefully this explanation makes sense.
The JIS set is a set of characters defined for use on computers (and similar devices). The standard exists so that software developers and the like can know which characters are guaranteed to be displayable on a given operating system or computer. There are 4 "levels" of JIS (I believe), with level 1 containing the most common characters. Most computer systems only support levels 1 and 2, which together contain 6,355 kanji (closer to 6,900 characters once the non-kanji symbols are counted).
At some point after the propagation of the JIS level 2 standard, people started to notice that there were characters on the list that could not be found in any dictionary. A committee was formed (I think a government committee) to investigate these characters and determine what their readings and meanings should be, and where the JIS set compilers had found them. Many of the obscure ones were traced to a reference work called 国土行政区画総覧, which is supposed to contain the names of all administrative districts in Japan (i.e. all cities, towns, and districts within towns). The compilers apparently consulted this book because one of the goals of level 2 JIS was to include all kanji used in Japanese place names, even those of small districts within towns (aza).
妛 is probably the most famous one. There is a place name, "akenbara", that is written with two characters, the first of which is 山 stacked on top of 女. When the compilers of the reference work mentioned above were typesetting the book, they did not have that character in their set, so they had to make it by cutting out parts of other characters (e.g. perhaps the top part of 催 and the bottom of 安). When the book was actually printed, this cutting and pasting left a black line on the page, which the JIS makers interpreted as a stroke, and so the new character 妛 was created. Also, 彁 and 椦 appear to be mistakes by the committee, but I'm not completely sure about that.
Aside from those three, there are 16 remaining characters that are called "ghost characters": 穃粫挧橸膤袮閠暃軅鵈恷碵駲墸壥蟐. All sixteen of them appear in the place-name book mentioned above, each in only one name. However, the existence of the place names containing those kanji cannot be independently verified. This does not mean for certain that they don't exist. It just means that if they do exist, or did at one time, whatever records the compilers of the place-name book were relying on cannot be found or are no longer available. It could also be that further mistakes were made in the compilation of the work (for instance, 閠 could very easily be a mistake for 閏), but this has not been confirmed.
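For anyone who wants to poke at these characters on their own machine, here is a rough Python sketch that looks up where each one sits in the JIS table. It leans on a few assumptions of mine that are not from the sources I read: that Python's built-in `euc_jp` codec is a fair stand-in for the level 1 and 2 set (JIS X 0208), that a two-byte EUC-JP sequence maps to a row/cell position by subtracting 0xA0 from each byte, and that rows 16 to 47 are level 1 and rows 48 to 84 are level 2. The names `GHOSTS`, `kuten`, and `level` are just ones I made up for the example.

```python
# The three characters discussed first, followed by the sixteen listed above.
GHOSTS = "妛彁椦穃粫挧橸膤袮閠暃軅鵈恷碵駲墸壥蟐"


def kuten(ch):
    """Return the (row, cell) position of ch in JIS X 0208, or None.

    Assumes Python's euc_jp codec: JIS X 0208 characters encode to exactly
    two bytes, each in the range 0xA1-0xFE.
    """
    try:
        raw = ch.encode("euc_jp")
    except UnicodeEncodeError:
        return None
    # ASCII is one byte, half-width katakana starts with 0x8E, and
    # JIS X 0212 characters take three bytes; none of those count here.
    if len(raw) != 2 or raw[0] < 0xA1 or raw[1] < 0xA1:
        return None
    return raw[0] - 0xA0, raw[1] - 0xA0


def level(ch):
    """Return 1 or 2 for a level 1 / level 2 kanji, otherwise None."""
    pos = kuten(ch)
    if pos is None:
        return None
    row = pos[0]
    if 16 <= row <= 47:
        return 1
    if 48 <= row <= 84:
        return 2
    return None


if __name__ == "__main__":
    for ch in GHOSTS:
        print(ch, kuten(ch), level(ch))
```

This only shows that the characters are in the set, which is the whole problem: the table says they exist, but it can't say where they came from.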
The information on this topic is a little hard to find, though, and some of it is contradictory. I'm not completely sure how I'm going to represent all that in the paper; I guess we'll see.