LING 270: Homework for Unit 7

Available: Monday March 24

Due: Monday March 31

The following problems pertain to interactions with real speech synthesizers. The goal of this exercise is to get a sense of how well systems that are deployed today work.

The answers to the questions should be in the form of a couple of pages of text. Use as much or as little as you need, but in each case I would like to see a fairly thorough analysis of how you felt the technology worked. In particular, if there were problems, what were those problems? Speculate on what might have caused them: I am not interested in whether the speculation is correct (in any case, I won't always know), but I want you to be thinking deeply about how this technology works (or doesn't work).

  1. You should visit at least three of the listed websites. If you speak some language other than English well, then feel free to pick systems for that language. You also need not restrict yourself to the sites I have listed, if you can find others.

    If you have a receiver for NOAA weather radio (or otherwise have a radio that can receive WXJ76 at 162.550 MHz), which uses a TTS system from SpeechWorks (now ScanSoft), you may substitute that for one of the websites.

    If you have a Macintosh running OS X, a synthesizer ships with the system. You can use that synthesizer if you wish.

    In any event, what you should do is try out various texts with the systems. Try to break them. (In the case of NOAA weather radio, you don't have that option, of course: in that case, listen to about 15 minutes of it.) If you succeed in breaking them, explain what you typed and what was wrong with the output. If the system has a canned greeting (e.g. "Hello, and welcome to the XYZ text-to-speech system"), how good does the canned greeting sound compared with other similar text that you select? Write a description of your general impression of the systems. How "natural" are they? How intelligible are they? (In order to test the latter, it helps if you don't know what text the system is reading: you might try snarfing a random bit of text and pasting it into the system's input without looking at it.) List at least three problems that you find. (These might be mispronounced words, pops or clicks, strange discontinuities, intonational oddities, and so forth.) You will find that one of the sites uses a cute trick to improve the perceived quality.

  2. Pick whichever English TTS system you like best from the above and do a side-by-side comparison between it and the voice of HAL, some audio for which can be found here. You should feed the TTS system the same text that HAL (or, more properly, the actor Douglas Rain) read. Try to be as precise as you can about what the differences are. The TTS systems will do things wrong. The question is: what do they do wrong? You might think about the following criteria, but this should not be taken as an exhaustive list: