Why the Chinese still use characters

The history of writing began in the late fourth millennium BC in Egypt and Iraq. In Egypt, the number of hieroglyphs was never more than two thousand (1,071 hieroglyphs are in Unicode). In Iraq, the number of signs was of a similar order of magnitude; Unicode contains 922 cuneiform signs, and the number in use fell over the course of the third millennium BC. Cuneiform was the primary international script of the Middle East during the mid-to-late second millennium BC, though it had lost this status by the early first millennium BC.

The alphabet was invented in Egypt on the basis of Egyptian hieratic around one and a half millennia after the invention of writing. Though not widely adopted in Egypt until the Persian conquest, it spread to the Syro-Lebanese coast by the late second millennium BC and into South Arabia by the ninth century BC, and from there into Ethiopia. Due to the Syro-Lebanese coast’s strategic location, the Phoenician alphabet quickly spread across Syria and Palestine during the tenth century BC, and, by the eighth century BC, into Greece. The Latin script emerged from the Greek in the sixth century BC, and the runic script in northern Europe emerged from Latin around the time of Augustus.

After the Persian conquest of Iraq, the migration of eastern Syrians into southern Iraq resulted in the disappearance of the Akkadian language and its replacement with Aramaic, written in a spinoff of the Phoenician alphabet. The Aramaic alphabet and language were adopted by the Achaemenids for the primary administration of their empire, resulting in the spread of the alphabet into India and Egypt. However, the conquest of Egypt by Alexander the Great resulted in a switch to the Greek language for administrative purposes, a switch that became even more entrenched with the Roman conquest.
Experiments in writing Egyptian using the Greek alphabet began in the first century AD, and Coptic proper, a language using large amounts of Greek vocabulary, became increasingly widely used in the third century. Thus, by the time of the fall of the Western Roman Empire, hieroglyphic, hieratic, demotic, and cuneiform had all vanished from use, Egypt was under the rule of the Romans, and Iraq was under the rule of the Persians. The alphabet continued to spread eastward into Cambodia, Thailand, Indonesia, and Tibet during the seventh century. By the ninth century, it had spread to the Philippines. The Arab conquests then led the Arabic script to retrace many of the alphabet’s earlier movements across Eurasia.

The situation in East Asia was very different. The Chinese script started in the late second millennium BC already with more than 4000 characters, with each character standing for the meaning of one syllable (at this time, each Chinese word had only one syllable; polysyllabic words would be developed several hundred years later), and the number simply grew over time at a rate of about one character every two weeks (not a joke). Despite modern standard Mandarin containing fewer than 1200 syllables (far fewer than Middle Chinese), the government of China promulgates a list of 8105 “general standard” characters, some 6.8 characters per spoken syllable. Even this was a substantial decline from the 80,000+ characters of the eighteenth century (47035 in the 康熙字典). Just as spoken Chinese contains many homonyms, so China developed a writing system with numerous characters, many of them commonly used, that look almost identical.
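The two rates quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the character and syllable counts come from the paragraph, but the start and end years (1200 BC and AD 1750) are my own assumed stand-ins for “late second millennium BC” and “eighteenth century”, and the variable names are made up for illustration.

```python
# Rough sanity check of the figures quoted in the paragraph above.

# Counts taken from the text.
GENERAL_STANDARD_CHARS = 8105   # characters on the official list
MANDARIN_SYLLABLES = 1200       # upper bound ("fewer than 1200")
START_CHARS = 4000              # characters at the script's first attestation
PEAK_CHARS = 80_000             # eighteenth-century total

# ASSUMPTIONS (not from the text): representative years for the two periods.
START_YEAR = -1200              # "late second millennium BC"
PEAK_YEAR = 1750                # "eighteenth century"

# Characters per spoken syllable; a lower bound, since the syllable
# count is itself an upper bound.
chars_per_syllable = GENERAL_STANDARD_CHARS / MANDARIN_SYLLABLES

# Average growth rate of the character inventory.
years = PEAK_YEAR - START_YEAR
new_chars_per_year = (PEAK_CHARS - START_CHARS) / years
days_per_new_char = 365.25 / new_chars_per_year

print(f"at least {chars_per_syllable:.2f} characters per syllable")
print(f"one new character every {days_per_new_char:.1f} days on average")
```

Under these assumed dates the ratio comes out a little under 6.8 and the growth rate at roughly one character a fortnight, so both figures in the text are plausible. Shifting either year by a century or two changes the second number only slightly.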

Unlike the case with Iraq, the Chinese language was not replaced by that of any immigrant barbarians; rather, it expanded south into southern China and north into Manchuria. And unlike the case with Egypt, attempts at alphabetization by China’s barbarian conquerors were short-lived and half-hearted. The first attempt at a comprehensive alphabetization of Chinese (and all the other languages of the Empire) was under Kublai Khan, who in 1269 promoted a new 国字 designed by the Tibetan monk Phagpa. The new script was used on currency and some monuments, but never percolated to any great degree to the general Chinese population, even though it was primarily used to write Chinese, rather than Mongolian. The attempt died with the fall of the Yuan dynasty in 1368.

After the Manchu conquest of China, the Qing court sponsored transliterations from Chinese into Manchu, but never attempted to promote its script and language for the use of private Chinese individuals (with the exception of some usage on currency) or to add knowledge of either to the standard imperial examinations. By the end of the nineteenth century, the court had largely transitioned to speaking purely in Chinese. The Qing imperial examinations were on the classics of literary Chinese, a language as foreign to any Sinitic language existing today as classical Latin is to French, and one which naturally could not be detached from characters due to the loss of ancient pronunciations. Since neither the Manchu script nor the Manchu language spread widely outside the imperial court, the constituency for changing the Chinese writing system never became large. A dramatically simplified Chinese script was used by an unknown number of women in Hunan Province, but it never spread far due to lack of state support.

The Communist conquest of China did not result in any great reforms, such as greatly reducing the number of Chinese characters to more closely match the number of Chinese syllables, or promoting the widespread use of romanization among domestic audiences in addition to or as a replacement for characters. Rather, the government merely simplified 6247 characters into 6235 simplified characters in order to make them easier to write. This is surely an idea that would never have been thought of in the computer age; had the Cangjie keyboard been invented just thirty years earlier, it is very unlikely we would have seen mainland character simplification. The invention of the computer led to near-universal use of the alphabet to type Chinese characters, with shape-based methods such as Wubi (a keyboard with 298 components, 23 of them exclusively traditional) and Cangjie (a keyboard with 121 components) generally declining over time because their rules are insufficiently straightforward. Likewise, the invention of the computer has led to a steady decline in Chinese people’s ability to write the characters by hand, an ability that peaked sometime in the late 1990s. Today, Chinese ability to recognize the characters has never been higher, but neither has use of the alphabet ever been more prevalent. Undoubtedly the trends of the past two decades will continue into the next few.

Author: pithom

An atheist with an interest in the history of the ancient Near East. Author of the Against Jebel al-Lawz Wordpress blog.

3 thoughts on “Why the Chinese still use characters”

  1. I don’t think I would ever have studied Chinese if it didn’t have the characters. It was the script that sparked my interest, and if it were like Vietnamese and written in the Latin-letter-based Pinyin, I don’t think I ever would have studied it, even though it’s becoming a more prominent and important language in general.

    Having said that, the best way to learn Chinese is not to waste any time memorizing the characters and writing them, but rather to exclusively stick to listening to audio and studying the Pinyin transcriptions of audio. Once you’ve done a lot of listening and know a good deal of the sentence patterns and basic vocab by ear and in Pinyin, you can easily pick up reading the characters by studying the Chinese character versions of those Pinyin transcriptions. Remember, Chinese characters generally are encountered in the context of multi-character words, phrases, sentences, not in isolation, so learning basic Chinese vocab and sentence patterns by ear and Pinyin first makes learning characters later much easier.

    I recommend using Glossika’s mass sentences in Chinese. This can easily be found and downloaded online on torrent and pirate sites. It’s 3,000 random sentences that build upon each other in vocab and sentence patterns. They’re divided into 50-sentence blocks. It’s audio, with the sentences read out in fluent Chinese followed by the English translation, and it comes with Pinyin, English, and Chinese character transcriptions of the sentences. Listen to the sentences over and over (you can do it while doing other things like working out, running errands, driving, etc.), follow along and study the Pinyin & English transcriptions as needed (leave the Chinese characters for much later), and you will get quite good fast. The most effective language learning method is exposure to audio input, and these Glossika mass sentences provide that.

    Another source for Chinese mass sentences that can be used in addition to or instead of Glossika is the “Spoonfed” Anki deck of Chinese sentences. There are over 8,000 sentences in the deck. You can download free here: https://ankiweb.net/shared/info/53920083 or buy here for $3: https://gumroad.com/l/IEmpwF

    Regarding the massive number of Chinese characters, note that the vast majority of them in those old dictionaries listing tens of thousands of characters never appeared anywhere else except in those dictionaries. And many of them were for obscure individual species of plants, animals, etc. College level literacy in Chinese characters is around 5,000 to 8,000 characters, depending on one’s academic field or occupation or interests. But the chance of encountering characters beyond that is virtually nil, and in the event you did, you’d just look them up in the dictionary.

    Most of the characters are radical-phonetic compounds. The radicals become easy to recognize since there are “only” 214 of them and they’re derived from pictographs that aid memory and recognition. Each character is actually a combination of 224 basic graphical elements that are ultimately pictographs. Many of them have been altered and no longer function as pictographs, but knowing that, and knowing what pictograph each one was, can help you recognize the character overall.

    1. “but rather to exclusively stick to listening to audio and studying the Pinyin transcriptions of audio.”

      Learning characters is, in my humble opinion, a necessary evil, since you have to learn them to understand the subtitles to most Chinese videos, as well as signboards and image-based memes.

      Yes; agree that learning the components is important.

      I certainly didn’t start learning Chinese for the characters, but because China, Taiwan, and Singapore are important.

      I like your suggestion for the mass sentences.

      1. Agree that the characters are necessary for literacy and full fluency. I do think it’s easier tackling after some audio and Pinyin though.

        There are some systems for remembering the characters. James Heisig is the author of good books on remembering their meaning:


        This blogger devised a memory system combining Heisig’s system for meaning with a pronunciation system:


