The Chinese script is a logographic script structured so that each character represents a single concept; characters are then combined to form compound words. Although several distinct languages (or "dialects") are spoken in China, including Mandarin and Cantonese (Hong Kong), their speakers can all read the same written words because the script is based on meaning, not on sound.
See the links below for more information.
There are several variants of the Chinese script used in different locations.
See the Other Language/Dialects section for information on forms like Cantonese and Wu.
In order to integrate foreign scripts into your computer, you must set up "keyboard" or input utilities in your operating system. These utilities will allow you to switch between typing English and other languages in word processors and Web tools. This process will also make sure the correct fonts are installed and available on your operating system.
See instructions for Setting up Keyboards for details.
Microsoft provides a variety of free keyboard utilities, but they must be installed from the disk, then activated from the Regional Control Panel.
Student Computing Labs - The utilities are installed in the University Park Student Computing Labs, but students must install the utilities by going to the Start menu then International Language Support » Microsoft » Office Microsoft Office Asian Character Input Support.
Home Computers - Several Asian and Middle Eastern keyboards are available in Windows, but you may have to install them from the Windows System disk because Chinese is a complex script. After that, you can activate the keyboards from the Regional Control Panel.
See Windows East Asian Keyboards for detailed instructions with screen captures.
Once the keyboards have been installed, they must be activated in the Regional Control Panel. Read the summary instructions below or go to Windows East Asian Keyboards for detailed instructions with screen captures.
See Detailed Windows Instructions for complete instructions with screen capture images.
See PinyinJoe.com for more information.
Student Computing Labs - Many language keyboards have been activated in the labs and are available through the flag menu on the upper right. Skip to Step #4 in the instructions below.
Home Computers - A variety of keyboards are available from Apple, but you may have to install them from the Macintosh System disk. They can then be activated through the International System Preferences. See details below.
See the Macintosh Keyboard Activation for complete instructions with screen captures.
For Unicode-compliant applications, you can activate the U.S. Extended keyboard (10.3/10.4) or the Extended Roman keyboard (10.2) to type the long marks, but only some applications, such as Microsoft Office 2004, Text Edit (free with OS X), Dreamweaver, or Netscape 7 Composer/Mozilla Composer, support it.
Once you switch to the Extended keyboards, you can use the following codes.
See the Extended Keyboard Accent Codes for more information.
If you have your browser configured correctly, the Web sites above should display the correct characters. If you have difficulties, see the list below for font and browser configuration instructions.
Please note which fonts are needed for each platform before viewing the instructions to configure your browser in the Preferences or Tools menu. Most current browsers work well, but older browsers such as Netscape 4.7 may need more adjustments.
All modern browsers support this script. Click a link in the list to view configuration instructions. In some cases, you will be asked to match a script with a font.
If you see Roman character gibberish instead of Chinese (such as at www.csssm.org), you will need to manually switch from the Western encoding view to the Chinese Simplified (or Chinese Traditional) encoding under the View menu of your browser.
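The "Roman character gibberish" effect can be reproduced in a few lines of Python. This is a minimal sketch, not a description of what any particular browser does internally; the sample word and the variable names are chosen only for illustration, and the codec names are the ones built into Python.

```python
# Illustrative sample: the word for "Chinese (written language)".
text = "中文"

# The bytes a server might send for a Simplified Chinese (GB2312) page.
raw = text.encode("gb2312")

# A viewer that assumes a Western (Latin-1) encoding turns every byte
# into one Roman character, which produces gibberish.
garbled = raw.decode("latin-1")

# Choosing the Chinese Simplified encoding reinterprets the same bytes correctly.
restored = raw.decode("gb2312")

print(raw.hex())          # the four bytes, shown as hexadecimal
print(len(garbled))       # one garbled Roman character per byte
print(restored == text)   # the original text is recovered
```

The bytes themselves never change; only the encoding used to interpret them does, which is why switching the encoding in the View menu fixes the display.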
See Using Encoding and Language Codes for more information on the meaning and implementation of these codes.
These are the codes which allow browsers and screen readers to process data as the appropriate language. All letters in the codes are lower case. Note also that a Chinese encoding system includes the Latin alphabet, the Cyrillic alphabet, the Greek alphabet, and other scripts. These are included to enhance compatibility with non-Chinese files.
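As a rough illustration of the point that a Chinese encoding also covers other scripts, the sketch below encodes one sample character per script with Python's built-in GB2312 codec. The choice of sample characters and the variable names are assumptions made for the example.

```python
# One sample character per script; the selection is illustrative only.
samples = {"Latin": "A", "Greek": "Ω", "Cyrillic": "Я", "Chinese": "中"}

# Encode each character as GB2312 and record how many bytes it takes.
byte_lengths = {script: len(ch.encode("gb2312")) for script, ch in samples.items()}

for script, n in byte_lengths.items():
    print(script, n)
```

All four characters encode without error. The Latin letter keeps its one-byte ASCII form, while the Greek, Cyrillic, and Chinese characters each take two bytes.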
Note on GB 18030 Encoding: The People's Republic of China has mandated that any newly developed software support the GB 18030 encoding. GB 18030 compliant fonts are available from both Microsoft and Apple.
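GB 18030 extends the older GB encodings so that every Unicode character can be represented. The sketch below is a small check using Python's built-in codecs; the helper name can_encode and the sample character are hypothetical choices for the example.

```python
def can_encode(ch, codec):
    """Return True if the character can be represented in the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# U+20000, a rare ideograph from CJK Unified Ideographs Extension B.
rare = "\U00020000"

print(can_encode(rare, "gb2312"))    # outside the older GB2312 repertoire
print(can_encode(rare, "gb18030"))   # representable in GB 18030
print(len(rare.encode("gb18030")))   # number of bytes used
```

A character that the older encoding cannot represent still round-trips through GB 18030, which is one reason a page that needs rare characters may declare gb18030 or utf-8 rather than gb2312.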
One option is to use Dreamweaver, Microsoft Expression or other Web editor and change the keyboard to the correct script. This will allow you to type content in directly with the appropriate script. However, it is important to verify that the correct encoding is specified in the Web page header.
Another option is to compose the basic text in an international or foreign language text editor or word processor and export the content as an HTML or text file with the appropriate encoding. This file could be opened in another HTML editor such as Dreamweaver or Microsoft Expression, and edited for formatting.
For Web tools such as Blogs at Penn State, Facebook, Twitter, del.icio.us, Flickr, and others, users can typically change the keyboard and input text. In most cases, this content will be encoded as Unicode.
For best cross-browser support, horizontal text is recommended. There is a way to specify vertical text in CSS, but it's only supported in Internet Explorer for Windows.
Computers process text by assuming a certain encoding or a system of matching electronic data with visual text characters. Whenever you develop a Web site you need to make sure the proper encoding is specified in the header tags; otherwise the browser may default to U.S. settings and not display the text properly.
To declare an encoding, insert or inspect the following meta-tag at the top of your HTML file, then replace "???" with one of the encoding codes listed above. If you are not sure, use utf-8 as the encoding.
Generic Encoding Template
<meta http-equiv="Content-Type" content="text/html; charset=???">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
If you are using XHTML, the closing slash must be included after the final quotation mark in the encoding meta tag.
Declare Unicode in XHTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If no encoding is declared, the browser uses its default setting, which in the U.S. is typically Latin-1, and Chinese text may then display incorrectly.
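To confirm which encoding a page actually declares, the meta tag can be read programmatically. The following is a minimal sketch using Python's standard html.parser module; the names CharsetFinder and declared_charset are hypothetical, and the logic is a simplified check, not the full detection procedure that browsers follow.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Records the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        # Short form: <meta charset="utf-8">
        if attrs.get("charset"):
            self.charset = attrs["charset"].strip().lower()
            return
        # Long form: <meta http-equiv="Content-Type" content="text/html; charset=...">
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the declared charset in lower case, or None if none is declared."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


page = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
print(declared_charset(page))
print(declared_charset("<html><body>no declaration</body></html>"))
```

A result of None means the page relies on the browser default, which is the situation in which display errors are most likely.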
Language tags are also suggested so that search engines and screen readers can identify the language of a page. These are metadata tags which indicate the language of a page, not devices to trigger translation. Visit the Language Tag page to view information on where to insert them.
Spoken Chinese comes in a variety of "dialects" which are so distinct that linguists classify them as a series of closely related languages. "Standard" Chinese corresponds to Mandarin, but each city in China has its own language. Examples include Cantonese of Hong Kong and Wu of Shanghai, but there are about seven to ten language groups in total. Chinese communities in the United States, Britain, Australia, Singapore and elsewhere often speak one of these forms at home.
Because written Chinese is not phonetically based (particularly Traditional Chinese), these speakers can read and write Chinese, but cannot necessarily speak with each other. The written form is fairly standardized, but there are occasional regional differences.
Since all the language forms use the same script, development of the pages is much the same. However, you can add a language code to pages for different dialects, especially when words are spelled out phonetically in a Roman script.
The script tags are
Note: A -- indicates no IANA or ISO-639-3 code registered.
* Min includes Fuzhou, Hokkien, Amoy, and Taiwanese.
©Penn State University, 2000-2013.
This Web page maintained by Teaching and Learning with Technology, a unit of Information Technology Services. For questions or comments on this Web page, please contact Elizabeth J. Pyatt (firstname.lastname@example.org).
Last Modified: Tuesday, 04-Jun-2013 12:39:39 EDT