A synthetic speech system is composed of two parts: the synthesizer that does the speaking, and the screen reader that tells the synthesizer what to say.
The synthesizers used with PCs are text-to-speech systems. Their programming includes the phonemes and grammatical rules of a language, which allows them to pronounce most words correctly. Names and compound words can still cause problems, because they often contain unusual spellings and letter combinations.
The synthesizer can be a card that is inserted into the computer, a box attached to the computer by a cable, or software that works with the computer's sound card. Some synthetic speech sounds robotic, although some can sound almost human. Software synthesizers are routinely included with the purchase of a screen reader.
A screen reader is a program, loaded into the computer's memory, that reads the text displayed on the screen. Commands telling the speech synthesizer what to say are sent in one of three ways: (1) when the user presses key combinations on the computer keyboard; (2) when the user presses keys on a separate keypad; or (3) automatically, when changes occur on the computer screen. These commands instruct the synthesizer to read a word, a line, or a full screen of text. Different key combinations give the commands to spell a word, find a string of text on the screen, announce the location of the PC cursor or focused item, and so on. Screen readers can also perform more advanced functions, such as locating text that is written in a certain color, reading pre-designated parts of the screen on demand, or reading highlighted text, which lets the user know which choice on a menu is active. They also permit the user to use the spell checker in a word processor or to read the cells of a spreadsheet.
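The command model described above can be pictured as a lookup from key combinations to reading functions. The following is a minimal sketch in Python, not how any particular screen reader is implemented: the `Screen` class, the key names, and every function name are hypothetical, and the returned strings stand in for text that would be sent to the synthesizer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Screen:
    """Toy stand-in for the on-screen text and the cursor position (hypothetical)."""
    lines: list[str]
    row: int = 0
    col: int = 0


def read_line(screen: Screen) -> str:
    """Return the line the cursor is on."""
    return screen.lines[screen.row]


def read_word(screen: Screen) -> str:
    """Return the word under the cursor, or "" if the cursor is on whitespace."""
    line = read_line(screen)
    if screen.col >= len(line) or line[screen.col].isspace():
        return ""
    start = end = screen.col
    while start > 0 and not line[start - 1].isspace():
        start -= 1
    while end < len(line) and not line[end].isspace():
        end += 1
    return line[start:end]


def spell_word(screen: Screen) -> str:
    """Return the current word letter by letter."""
    return " ".join(read_word(screen))


def read_screen(screen: Screen) -> str:
    """Return the full screen of text."""
    return "\n".join(screen.lines)


def announce_cursor(screen: Screen) -> str:
    """Return the cursor location using 1-based row and column numbers."""
    return f"row {screen.row + 1}, column {screen.col + 1}"


# Hypothetical key bindings; each real screen reader defines its own.
COMMANDS = {
    "ctrl+w": read_word,
    "ctrl+l": read_line,
    "ctrl+a": read_screen,
    "ctrl+s": spell_word,
    "ctrl+p": announce_cursor,
}


def handle_key(screen: Screen, key: str) -> Optional[str]:
    """Return the text to speak for a key, or None if the key is not a command."""
    command = COMMANDS.get(key)
    return command(screen) if command else None


screen = Screen(["File Edit View", "Hello world"], row=1, col=7)
print(handle_key(screen, "ctrl+w"))
print(handle_key(screen, "ctrl+s"))
print(handle_key(screen, "ctrl+p"))
```

Keeping the bindings in a table rather than in the functions themselves reflects the point that each product has its own command structure: only the table would change, not the reading logic.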
Screen access programs are currently available for PCs running DOS, Windows 95, Windows 98, and Windows NT, as well as for Macs and UNIX systems. Each uses a different command structure, and most support a variety of speech synthesizers.
How Windows-based Screen Readers Work
The graphical and visual nature of the Windows operating environment makes it necessary for the screen reader to do more than simply lift material from the screen and send it to the synthesizer. Its functions can be divided into five categories:
- Identifying and Reading Text and Graphics
- Once text has been displayed on the screen, Windows 95 stores it only as a matrix of pixels, or tiny dots. From those pixels alone, the screen reader cannot interpret the information or determine what is text and what is a picture. Instead, Windows-based screen readers intercept all information as Windows applications send it to the screen and store it in a memory construct known as the off-screen model (OSM). The screen reader then reads from the OSM rather than from the graphical image drawn on the screen itself.
- Identifying and Announcing the Function of Windows Constructs
- Windows maintains the type, or class, of each element in an application, and most screen readers are capable of retrieving this information and delivering it to the user. In a typical Windows dialog box there may be a button that the user must select to proceed with a task. The Windows screen reader can identify the item as a button, rather than simply reading out its text and color along with the other text.
- Identifying Graphics
- Many Windows features are not labeled with text, but are simply displayed as icons or pictures on the screen. Windows screen readers label these graphics so that they can be spoken in meaningful terms. A picture of a wastebasket can be labeled "Delete," for example.
- Serving as a Mouse or Pointing Device
- Some features of Windows 95 applications are available only by clicking with a mouse. To overcome the difficulty of positioning the mouse on a particular point of the screen, Windows 95 screen readers incorporate features that move the mouse pointer in straight rows and columns or by meaningful units such as words or characters, that find specified text and place the mouse pointer on it, and that provide keystrokes to simulate the clicking of a mouse button.
- Providing the Information Efficiently
- The screen reader must provide an alternative interface to the user that gives efficient access. A synthetic speech program that reads the entire screen from top to bottom may eventually divulge the essential information, but it may take several minutes to do so. At the same time, it must be easy for the user to determine which of the items being spoken is the "current" item and which is additional, essential information. For example, if the speech program reads an entire dialog box, which of the controls is the focused item?
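The five functions above can be illustrated with a toy off-screen model. This is only a sketch under simplifying assumptions: real screen readers intercept drawing information inside Windows, whereas here the "interception" is an ordinary method call, and the names `Item`, `OffScreenModel`, `record`, `label_graphic`, and the identifier `icon_17` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """One thing an application drew: its text, position, and control class."""
    text: str
    x: int
    y: int
    control_class: str = "text"  # for example "button" or "graphic"


class OffScreenModel:
    """Toy OSM: keeps what was drawn as structured items instead of pixels."""

    def __init__(self) -> None:
        self.items: list[Item] = []
        self.graphic_labels: dict[str, str] = {}
        self.focus: Optional[Item] = None

    def record(self, text: str, x: int, y: int, control_class: str = "text") -> Item:
        """Store one piece of output as it is sent toward the screen."""
        item = Item(text, x, y, control_class)
        self.items.append(item)
        return item

    def label_graphic(self, graphic_id: str, label: str) -> None:
        """Attach a spoken label to an otherwise unlabeled graphic."""
        self.graphic_labels[graphic_id] = label

    def describe(self, item: Item) -> str:
        """Return what would be spoken: the name plus the control class."""
        if item.control_class == "graphic":
            # For graphics, item.text holds an identifier, not readable text.
            name = self.graphic_labels.get(item.text, "unlabeled")
        else:
            name = item.text
        if item.control_class == "text":
            return name
        return f"{name} {item.control_class}"

    def read_all(self) -> list[str]:
        """Read every item from top to bottom, then left to right."""
        ordered = sorted(self.items, key=lambda i: (i.y, i.x))
        return [self.describe(i) for i in ordered]

    def read_focus(self) -> str:
        """Read only the focused item, the efficient alternative to read_all."""
        return self.describe(self.focus) if self.focus else "no focus"

    def find(self, text: str) -> Optional[tuple[int, int]]:
        """Return the (x, y) where a mouse pointer could be placed on the text."""
        for item in self.items:
            if text in item.text:
                return (item.x, item.y)
        return None


osm = OffScreenModel()
osm.record("Save changes?", 10, 10)
ok_button = osm.record("OK", 10, 40, "button")
osm.record("Cancel", 60, 40, "button")
osm.record("icon_17", 120, 10, "graphic")
osm.label_graphic("icon_17", "Delete")
osm.focus = ok_button

print(osm.read_all())
print(osm.read_focus())
print(osm.find("Cancel"))
```

In this sketch, `read_all` corresponds to reading the whole dialog box, while `read_focus` answers the narrower question of which control is current, which is usually what the user needs first.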
When testing and reviewing a Windows screen reader for purchase, several questions must be answered:
- What version of Windows will be used? Is the screen reader compatible with the version of Windows to be used?
- Are there standard system configurations with which the screen reader does not work (color schemes, common video cards, etc.)?
- Which synthesizers are supported, and which are not?
- Among the applications that will likely be used, are there some with which the screen reader does not work, regardless of the user's skill level?
- How much "automatic" speech does the screen reader give when the user is performing standard Windows functions such as selecting menu items or moving through items in dialog boxes? Can the amount of speech be adjusted to suit the user's skill level and preferences?
- How difficult is it to change simple standard features such as voice rate or the choice of a reading key?
- What must the user do in order to make an unfriendly program work well enough to be usable?
- What useful and unique features does the screen reader have?
- What problems does the screen reader add to Windows use?
- Is the manual accessible and accurate?
- Is there a tutorial in a usable format?
Source: AFB Copyright © American Foundation for the Blind 2005. All rights reserved. Used with permission.