A (very) brief introduction to Chinese


by Eric Leung

Illustration provided by the editor, Mary Tzoannou.

I am currently conducting an ethnographic project to better understand the language situation faced by ethnic minorities in Hong Kong. Locally, this term mainly refers to people of ‘South Asian’ origin, a label applied to people of non-Chinese (Han) ethnicity from an area spanning from Pakistan to Indonesia. Part of the problem they face is inadequate proficiency in Chinese, one of the territory’s official languages. While many of them are fluent in English, the other official language, and speak Cantonese, the main vernacular, quite well too, their grades in ‘Chinese’ are often quite low, and Chinese characters are unintelligible to older residents as well. Having a clearer idea of what Chinese is, then, is essential to an accurate view of the entire situation.

Nowadays, it is increasingly common for people, both native and non-native speakers, to declare proficiency in Chinese. Nonetheless, as with Arabic, another macro-language with many mutually unintelligible spoken varieties (Kamusella 2017), it is all but impossible to know it in its entirety[1]. Varieties differ by locale, whether it be Hokkien (Bân-lâm-ōe) in Taiwan and Fujian, Mandarin (Putonghua) or Cantonese (Jyut jyu). On the international stage today, ‘Chinese’ mostly means Mandarin, with the People’s Republic of China’s (PRC) standardised version being one of the United Nations’ official languages. However, this was not always the case: earlier records focused on Hokkien (Douglas & Barclay 1873) or Cantonese (Bolton 2002), given their importance as trade languages at the time. Even the word ‘Chinese’ alone is complicated. In English, it can stand for an ethnicity, a nationality, a language or a geographical descriptor, among many other meanings. Clarifying any of these definitions is a potential minefield, mainly due to current cross-strait relations and the associated disputes. In Hong Kong the term zungman/zhongwen (中文) is used, an emic macro term covering the entire Sinitic language branch as well as multiple scripts, from traditional to multiple simplified systems.

Illustration provided by the editor, Mary Tzoannou. 

The Chinese writing system, due to its pictographic origins, is markedly different from other common writing systems, such as the Latin, Cyrillic and Arabic alphabets. To be literate requires memorisation of thousands of characters. While spoken fluency among ethnic minorities of Hong Kong is common, literacy is another hurdle. From the traditional script, the PRC and Japan have developed separate simplified scripts called jiantizi and shinjitai respectively. Singapore once used its own version but switched to the PRC's version in 1976, and the PRC once attempted a second round of simplification which was abandoned (West 2009). Therefore, we currently have two major sets in use: the traditional set in Taiwan, Hong Kong and Macau, and the simplified set in the PRC, Malaysia and Singapore.

Then, there are multiple writing registers, the most common ones being classical (文言) and vernacular (白話), the latter developed after 1911 based on Mandarin varieties. In addition, there are multiple works written in regional varieties, such as The Sing-song Girls of Shanghai, written in the Soochow/Suzhou variety [2], regional operas, and a tradition of incorporating Cantonese in novels which were written in a mixed form of classical and vernacular (Wong 2014).

Moreover, a character has many valid pronunciations due to the historical spread of the script across many nations in East Asia. An example is the character 京, meaning capital. There are at least 15 ways to pronounce it across Sinitic varieties, Japanese, Korean and Vietnamese. Even for a single character within a single variety, pronunciation can vary with context. For example, in Cantonese 婦, meaning woman, is normally pronounced fu5. Yet in the phrase 新婦, meaning bride, it is pronounced pou5 instead (Chan 1998: 100–101).

In a script with pictographic origins, there is no reliable link between a character's written form and its pronunciation, and the many situational exceptions, such as the aforementioned one, provide additional difficulties for people learning the language. The character 包, Romanised in Wade-Giles as Pao, could stand for a range of meanings from abalone (鮑) to artillery (砲) after the addition of parts for fish and rock respectively (Lin 1989: 225). This way of creating characters, combining a part for the sound on the right or bottom with a part signifying the meaning on the left or top, is called jingsing, literally shape-sound. Still, the bar for literacy is higher than for a script which employs phonetic principles throughout. Therefore, we encounter a complicated situation where different varieties are intelligible on paper since the written standard is identical, but when spoken the exact same phrases become as distinct as Romanian and Portuguese [3].

To conclude, the Chinese language encompasses many different dialects that are unintelligible across major branches, as well as two scripts which have an obvious chronological order and are not fully intelligible to someone who knows only one of them. It is a system which has enabled an extreme separation between speech and writing, with the result that all varieties are simultaneously Chinese and not, since local varieties, when written, may not be intelligible to speakers of other varieties. The label ‘Chinese speaker’ is thus somewhat moot, and ‘Chinese literate’ is perhaps closer to the truth regarding Sinitic languages[4]. That, however, opens a new can of worms regarding other languages previously written in classical Chinese, in particular Korean and Vietnamese, not to mention Japanese, which still utilises many Kanji today. Perhaps the determining factor is whether a dialect has an army and navy, a quip attributed to Weinreich.


[1] A possible exception is Yuen Ren Chao, who was fluent in many varieties including Changchow, Fukien, Northern Mandarin and Cantonese, as well as a number of other languages such as French and German (Levenson 1977). In addition, he wrote Lion-Eating Poet in the Stone Den, a poem famous for using only the sound shi in Mandarin throughout.

[2] The original can be read at https://zh.wikisource.org/zh-hant/海上花列傳

[3] For the case of Taiwanese and Mandarin (Guoyu) in Taiwan, where the former is increasingly displaced by the latter, see Mair (2003). Intelligibility is mainly found within main branches, such as between Beijing and Nanjing Mandarin. Teochew, in the Min family, and Cantonese, however, are not mutually intelligible, whereas Minnan/Hokkien is somewhat intelligible to speakers of the former.

[4] For a detailed account of the debate over the terms used for Chinese varieties, such as topolect, dialect and language, see Mair (1991). Hence I have mostly used ‘varieties’ throughout the piece as a relatively controversy-free term.



Ansheles, A. 2017. Ansheles treats mom to Herbal tea (available on-line: https://www.youtube.com/watch?v=giSC7vFZzQo, accessed 25 June 2018).

Bolton, K. 2002. Chinese Englishes: from Canton jargon to global English. World Englishes 21, 181–199.

Chan, P. F. 1998. Lun Jyut fong jin ci bun zi hau sik [An investigation into the original characters present in Cantonese]. Hong Kong: Zhonghua book company.

Douglas, C. & T. Barclay 1873. Chinese-English Dictionary of the Vernacular Or Spoken Language of Amoy, with the Principal Variations of the Chang-Chew and Chin-Chew Dialects. Trübner [Suppl.:] Commercial Press in Shanghai.

Kamusella, T. 2017. The Arabic Language: A Latin of Modernity? Journal of Nationalism, Memory & Language Politics 11, 117–145.

Levenson, R. 1977. Chinese linguist, phonologist, composer and author, Yuen Ren Chao (available on-line: http://content.cdlib.org/view?docId=hb8779p27v&brand=calisphere&doc.view=entire_text, accessed 27 June 2018).

Lin, Y. 1989. Wuguo yu wumin [My country and my people]. Taipei: Fu Hsin.

Ma, B., D. Zhu & R. Tong 2006. Chinese dialect identification using tone features based on pitch flux, vol. 1, I 1029-1032. IEEE.

Mair, V. H. 1991. What is a Chinese “dialect/topolect”?: Reflections on Some Key Sino-English Linguistic Terms. Sino-Platonic papers.

Mair, V. H. 2003. How to Forget Your Mother Tongue and Remember Your National Language (available on-line: http://www.pinyin.info/readings/mair/taiwanese.html, accessed 27 June 2018).

West, A. 2009. Proposal to Encode Obsolete Simplified Chinese Characters (available on-line: http://www.babelstone.co.uk/CJK/N3695.html, accessed 27 June 2018).

Wong, C. M. P. 2014. Kong zin hau Hoeng Gong dik Jyut jyu siu syut [Cantonese novels after World War II]. In Jyut jyu dik zing zi: Hoeng Gong jyu jin man faa dik ji zat jyu do jyun [The politics of Cantonese: divergence and multiplicity of language culture in Hong Kong] (ed) K.-W. Man, 133–152. The Chinese University Press.