Teaching and Learning with Technology

Computing With Accents and Foreign Scripts

Devanāgarī for Hindi, Sanskrit, Marathi, Nepali, etc.

This Page

  1. About the Devanagari Script
  2. Activate Keyboards for Typing
  3. Browser and Font Recommendations
  4. Web Development and Language Codes
  5. Devanagari Unicode Chart (New Page)
  6. Links

About the Script

Devanagari is a syllabic alphabet: it consists of consonant letters that carry vowel signs. It is used for several major Indian languages, including Hindi, Sanskrit, Konkani, Marathi, Nepali, Sindhi, Sherpa, and others, but it is only one of many scripts used in India. For more information on the script, see the following pages.
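
The consonant-plus-vowel-sign structure is visible in how text is stored. The following is a minimal Python sketch, not part of the original page, that lists the Unicode name of each code point in the syllable "ki"; the helper name describe is only illustrative.

```python
import unicodedata

# The syllable "ki" is stored as a consonant letter followed by a vowel sign.
syllable = "\u0915\u093f"  # कि


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


names = describe(syllable)
for name in names:
    print(name)
```

Although the vowel sign is drawn to the left of the consonant on screen, it follows the consonant in the stored text, which is why rendering software must support vowel placement.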

In the past, proper encoding was not always considered a major issue and many larger news or information sites would offer their own fonts to download, but they would only work with those sites. Now there is more emphasis on developing Unicode fonts and Web pages, but implementation may still be patchy on some systems.

Activating Keyboards for Fonts

Basic Setup

In order to integrate foreign scripts into your computer, you must set up "keyboard" or input utilities in your operating system. These utilities will allow you to switch between typing English and other languages in word processors and Web tools. This process will also make sure the correct fonts are installed and available on your operating system.

See instructions for Setting up Keyboards for details.

Windows

Other Applications

Microsoft includes several Devanagari keyboards for Hindi, Sanskrit, Marathi and Konkani, but they may need to be installed from the Windows System disk.

See the Windows Complex Scripts Keyboard Instructions for details on how to activate the keyboard. To see where the critical keys are, go to the Microsoft Keyboard Layouts Page.

Macintosh

Apple provides two Devanagari keyboards, Devanagari and Devanagari QWERTY (phonetic), plus one for Nepali (as of 10.4/Tiger). See instructions for activating Macintosh keyboards.

Recommended Applications

The following applications most fully support vowel placements.

Browser and Font Recommendations

Test Sites

If your browser is configured correctly, the Web sites above should display the correct characters. If you have difficulties, see the list below for font and browser configuration instructions.

Fonts by Platform

Third Party Fonts

Read each font's page to find out whether it is Windows compliant or Linux compliant.
Note on OS X: These fonts can be installed on a Mac, but vowel marks may not display correctly.

See also

Recommended Browsers

Browsers that fully support Unicode are strongly recommended. Click a link in the list to view configuration instructions. You will be asked to match a script with a font.

Note on OS X: Only Opera displays most vowel signs correctly on the Mac. Vowel signs are visible in Firefox/Mozilla or Safari, but are displaced off the letter. Users of these browsers can cut and paste text into TextEdit if the content is not clear.

Note on System 9: Because Unicode support is incomplete in System 9, it may be beneficial to upgrade to OS X if you need to work with Unicode.

Manually Switch Encoding

If you see Roman character gibberish instead of a South Asian script, you will need to manually switch from Western encoding view to the Unicode encoding under the View menu of your browser.

Web Development

Devanagari Encoding and Language Tags

These are the codes which allow browsers and screen readers to process data as the appropriate language. All letters in codes are lower case.

Encoding: utf-8 (Unicode), ISCII (older), ITRANS (older)
Use Unicode to develop new pages.

Using Encoding and Language Codes

Computers process text by assuming a certain encoding or a system of matching electronic data with visual text characters. Whenever you develop a Web site you need to make sure the proper encoding is specified in the header tags; otherwise the browser may default to U.S. settings and not display the text properly.
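
As an illustration of what "matching electronic data with visual text characters" means, here is a small Python sketch (not from the original page) that converts one Devanagari letter to its UTF-8 bytes and back. The variable names are only illustrative.

```python
text = "\u0915"  # क, DEVANAGARI LETTER KA

# Encoding turns the character into the bytes that are stored or transmitted.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)

# Decoding with the same encoding restores the original character.
decoded = utf8_bytes.decode("utf-8")
print(decoded == text)

# A Western encoding such as Latin-1 has no slot for this letter at all.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)
```

A single Devanagari letter occupies three bytes in UTF-8, so the reader of those bytes has to know the encoding in order to reassemble the letter.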

To declare an encoding, insert or inspect the following meta-tag at the top of your HTML file, then replace "???" with one of the encoding codes listed above. If you are not sure, use utf-8 as the encoding.

Generic Encoding Template

<head>
<meta http-equiv="Content-Type" content="text/html; charset=???">
...
</head>

Declare Unicode

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>

XHTML

If you are using XHTML, the final close slash must be included after the final quote mark in the encoding header tag.

Declare Unicode in XHTML

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
...
</head>
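
To inspect which encoding an existing page declares, a short script can look for the meta-tag shown above. The following Python sketch uses the standard library html.parser module; the class name CharsetFinder and the function declared_charset are hypothetical names chosen for this example, and it only checks the http-equiv form of the tag used on this page.

```python
import re
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the charset named in a Content-Type meta-tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta-tag is considered.
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        match = re.search(r"charset\s*=\s*([\w-]+)",
                          attributes.get("content") or "", re.IGNORECASE)
        if match:
            self.charset = match.group(1).lower()


def declared_charset(html_text):
    """Return the declared charset in lower case, or None if absent."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


sample = ('<head>\n<meta http-equiv="Content-Type" '
          'content="text/html; charset=utf-8" />\n</head>')
print(declared_charset(sample))
```

The same function handles both the HTML and the XHTML form of the tag, because the parser treats a self-closing tag as an ordinary start tag.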

No Encoding Declared

If no encoding is declared, then the browser uses the default setting, which in the U.S. is typically Latin-1. In that case many Unicode characters could be displayed incorrectly. Also, older browsers such as Netscape 4.7 may not be able to process the entity codes correctly without the "utf-8" declaration.
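
The effect of a wrong default can be reproduced outside a browser. This Python sketch, which is an illustration rather than a description of any particular browser, decodes UTF-8 bytes as Latin-1 to produce the kind of Roman-character gibberish a reader would see, then reverses the mistake.

```python
original = "\u0939\u093f\u0928\u094d\u0926\u0940"  # हिन्दी ("Hindi")

# The page is saved as UTF-8 ...
raw = original.encode("utf-8")

# ... but read as Latin-1, so every byte becomes a separate Western character.
garbled = raw.decode("latin-1")
print(len(original), len(garbled))

# Reading the same bytes as UTF-8 restores the text.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == original)
```

Nothing in the file itself is damaged in this scenario; only the interpretation of the bytes is wrong, which is why declaring the encoding fixes the display.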

Language Tags

Language tags are also suggested so that search engines and screen readers can identify the language of a page. These are metadata tags that indicate the language of a page, not devices to trigger translation. Visit the Language Tag page to view information on where to insert them.
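
As a rough sketch of how a language tag might be applied, the Python snippet below builds an opening html tag with a lang attribute. The two-letter codes are the standard ISO 639-1 codes for these languages and are supplied here as an assumption, since the code list is not reproduced in this text; the function name html_open_tag is hypothetical.

```python
# ISO 639-1 codes (assumed here; all letters are lower case).
LANGUAGE_TAGS = {
    "Hindi": "hi",
    "Sanskrit": "sa",
    "Marathi": "mr",
    "Nepali": "ne",
}


def html_open_tag(language):
    """Return an opening html tag that declares the page language."""
    return f'<html lang="{LANGUAGE_TAGS[language]}">'


print(html_open_tag("Hindi"))
```

The lang attribute can also be placed on a smaller element, such as a span, when only part of a page is in the tagged language.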

Inputting and Editing Text in an HTML Editor

One option is to use Dreamweaver, Microsoft Expression, or another Web editor and change the keyboard to the correct script. This will allow you to type content directly in the appropriate script. However, it is important to verify that the correct encoding is specified in the Web page header.

Another option is to compose the basic text in an international or foreign language text editor or word processor and export the content as an HTML or text file with the appropriate encoding. This file could be opened in another HTML editor such as Dreamweaver or Microsoft Expression, and edited for formatting.
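
If the content is produced by a script rather than a word processor, the same principle applies: state the encoding explicitly when the file is written. A minimal Python sketch, using a temporary folder and a hypothetical file name sample.html:

```python
import tempfile
from pathlib import Path

# A tiny page whose header declares the same encoding used to save it.
page = ('<head>\n'
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        '</head>\n'
        '<body>\u0928\u092e\u0938\u094d\u0924\u0947</body>\n')  # नमस्ते

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "sample.html"
    # newline="" keeps line endings exactly as written on every platform.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(page)
    raw = path.read_bytes()
    with open(path, "r", encoding="utf-8", newline="") as handle:
        round_trip = handle.read()

print(round_trip == page)
```

Passing encoding="utf-8" avoids relying on the operating system default, which may be a Western encoding.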

Other Web Tools

For Web tools such as Blogs at Penn State, Facebook, Twitter, del.icio.us, Flickr, and others, users can typically change the keyboard and input text. In most cases, this content will be encoded as Unicode.

Unicode Chart with HTML Entity Codes

For short texts, such as the yoga om sign (ॐ = &#2384;), it may be desirable to enter HTML entity codes for the Devanagari Unicode characters instead of typing them directly.
Note: The appearance of conjunct letters is not discussed.
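
The relationship between a character and its numeric entity code can be checked with Python's standard html module. In this sketch the helper name to_entities is hypothetical; it converts every non-ASCII character to a decimal entity code.

```python
import html


def to_entities(text):
    """Replace each non-ASCII character with a decimal HTML entity code."""
    return "".join(f"&#{ord(ch)};" if ord(ch) > 127 else ch for ch in text)


# Decoding the entity code from the text yields the om sign, U+0950.
om = html.unescape("&#2384;")
print(om == "\u0950")

# Converting in the other direction gives back the same entity code.
print(to_entities("Om \u0950"))
```

Because entity codes contain only ASCII characters, text written this way displays correctly whenever the browser supports the numeric codes and has a suitable font.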

ISCII vs. Unicode

Before the development of Unicode encoding, the government of India had developed a standard called ISCII (Indian Script Code for Information Interchange). In this standard, corresponding characters in different scripts were assigned the same character number. For instance, Devanagari क (ka) and Gujarati ક (ka) would be assigned the same code point. However, most modern development is in Unicode.
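
In Unicode, by contrast, each script has its own block, so the two letters have different code points. The Python sketch below shows only the Unicode side (it does not read ISCII data): the Devanagari and Gujarati ka are distinct characters that sit at the same position within their respective blocks.

```python
import unicodedata

devanagari_ka = "\u0915"  # क
gujarati_ka = "\u0a95"    # ક

for letter in (devanagari_ka, gujarati_ka):
    print(f"U+{ord(letter):04X}", unicodedata.name(letter))

# The two blocks are laid out in parallel, so the letters differ by a
# constant offset rather than sharing one number.
offset = ord(gujarati_ka) - ord(devanagari_ka)
print(hex(offset))
```

Because the code points differ, a Unicode document can mix both scripts without any extra marker for which script is meant.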

PDF and Image Files

In some cases, your best option may be to use PDF files or image files. See the Web Development Tips section for more details.

Links

Script Basics

Open Type Fonts (Windows)


General Computing

Last Modified: Tuesday, 04-Jun-2013 12:39:41 EDT