The Linguist

The Linguist 61,2 April/May 2022

The Linguist is a languages magazine for professional linguists, translators, interpreters, language professionals, language teachers, trainers, students and academics with articles on translation, interpreting, business, government, technology

Issue link: https://thelinguist.uberflip.com/i/1463531



The Linguist, Vol 61, No 2, 2022, p. 18 (thelinguist.uberflip.com)

ONLINE INCLUSION

Deborah Anderson looks at why it's so important for languages to be in Unicode, and the work being done to get them there

Most speakers of modern European languages today can send text messages, email and documents back and forth over the internet, and be relatively confident that the letters and symbols will be received as typed. This is because the major European languages use the Roman script, which is generally very well supported on computers and devices. However, users of languages that are written with less commonly used scripts typically encounter problems: instead of the letters and symbols they expect, nonsense characters or boxes (known as 'tofu') may appear, making text difficult or even impossible to read.

The problem of sending and receiving texts in different scripts electronically was apparent by the 1970s and 1980s, when businesses, governments, linguists and others were not able to exchange text data easily or reliably. To resolve this problem, an international standard was developed: the Unicode Standard and its close relative, ISO/IEC 10646.

Unicode is today supported on all modern operating systems. It serves as the backbone for sending and receiving text electronically in the various languages of the world, and is the foundation upon which fonts, keyboards and software rely. In essence, Unicode enables typing in different languages in text messages, emails, webpages and word-processing documents across platforms, and it also makes search and cut-and-paste capabilities possible.

The major scripts, such as Latin, Cyrillic, and Chinese/Japanese/Korean ideographs, are included, but many less common scripts are not. As a result, sending critical health information in the Bété script, for example, would not be possible unless one employed a workaround, such as a non-standard font – and this still doesn't guarantee that the original text is received as expected.
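The difference between an encoded script and a font workaround can be illustrated with a minimal Python sketch. It relies only on the standard `unicodedata` module; the helper name `describe` and the sample strings are illustrative assumptions, not taken from the article. The second sample assumes a common kind of workaround font, one that draws its own glyphs in the slots of ordinary Latin characters, which may not be how any particular Bété font works.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text.

    Characters without a standard name (for example private-use code
    points) are reported as "<no name>".
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name>"))
        for ch in text
    ]


# Encoded scripts: every character has its own standard code point and name,
# so any Unicode-aware device can identify, display and search for it.
for row in describe("aж中"):
    print(row)

# Hypothetical workaround: the text is stored as ordinary Latin characters
# and only *looks* like another script while the special font is installed.
# A device without that font shows the underlying Latin symbols instead.
stored = "¶ÆÈ"  # illustrative sample, assumed for this sketch
for row in describe(stored):
    print(row)
```

In the second loop the reported names are those of Latin symbols, not of letters in the intended script, which is why the receiving device has no way of recovering the original text.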
A vital initiative

To address the problems facing communities of lesser-used scripts, I started a project called the Script Encoding Initiative (SEI) at UC Berkeley's Department of Linguistics in 2002, which has had support from the National Endowment for the Humanities. The project

Supported scripts

INCOMPREHENSIBLE 'TOFU': Text typed in Bété script with a non-Unicode font (left); and (right) how the text appears when sent via email to a mobile device

© SHUTTERSTOCK
