A Global Village? Not Yet...
Jeffrey R. Harrow
Principal Technologist, The Harrow Group
Shrinking Our World.
The Internet has shrunk our world in many ways. Information from around the globe that most folks had never imagined existed is now just a click away. There's more international travel, business, and diplomacy than at any time in history. Yet our global village is a very illiterate one.
I was traveling this month, spending almost two weeks in a country where I neither understand nor speak a word of the local language. (This is my fault, of course; it's I who should be speaking their language when I'm visiting. My years of high school language courses didn't go very far towards preparing me to actually do so, but that's another discussion.)
Traveling where I'm language-challenged is not a new experience for me; I've given speeches (in English) in many countries throughout the Western and Eastern world. But this time I was traveling without colleagues who knew the local language, and I spent most of the time away from large cities where many people do speak at least a little English.
I was, truly, illiterate. I couldn't interact with the people. I couldn't read signs that warned me about dangers, or gave me information, or provided instructions on how to use things like automated train ticket dispensers. I couldn't call someone to make an advance reservation since I couldn't make my needs understood, but then I couldn't even get the phone number I needed to place the call since I couldn't read the phone book. Not to mention that my Verizon cell phone wouldn't work there anyway.
It's a very scary thing, being illiterate on a personal basis. But when we expand this condition to the international business of government, and to the increasingly international business environment, being internationally illiterate is a Very Bad Thing.
Technologically Taming The Tower Of Babel.
So how could things be different? Short of the still mythical "RNA pill" that would teach me a language (and so much more) overnight, existing technology could still have eased my linguistic pain.
Tapping For Words And Phrases.
First, let's explore a very crude solution. I occasionally had access to the Internet in a hotel. When I did, I was able to "talk" with the desk clerks by using good old Google's "translation" feature: we'd each type our sentences into the box, and a not-very-good but good-enough translation made all the difference in my being able to communicate. To say that this was refreshing would be an understatement. Cumbersome and problematic, yes, but still wonderful.
I could have used similar online translation services while out and about through a Web-enabled cell phone, except there was rather poor cell phone coverage where I was. So the online option wasn't viable.
Another language assist could have come from one of several dedicated pocket translators (such as those from Merriam-Webster) that would let me type in some words or pre-defined phrases in my language and then show the translation on a screen (or even speak the translated words or phrases). But these devices are limited in that there are only so many phrases that they know, and I'd have to find and then pick the phrase of interest to me from a menu on the device. While potentially valuable, to me this is just a small step up from using an old paper translation dictionary, and it falls more into an emergency communication category.
There are also one-way devices, such as the "Phraselator," that would let me speak one of many pre-defined phrases and would then speak that phrase for me in the target language.
Combine And Conquer.
But what about combining several existing, perhaps "good enough" technologies to help me actually converse with a non-English speaker? The concept is pretty simple, so let's see how it works.
- First, we need a speaker-independent program that can convert human speech into its text equivalent. One example of speech-to-text software that has been on the market for years is Nuance's Dragon NaturallySpeaking. (It's hardly a perfect process, but it can be "good enough" for many tasks, including this one.)
- Now that we have our words in a format that computers can crunch (text), we feed them into any of the standalone translation programs that are on the market; one example is Language Weaver's Cross-Lingual Chat, repurposed to take the text from our speech-to-text software and turn out translated text.
We could now display this speech-to-translated-text on a screen for the other person to read, but our goal is to allow natural verbal communications.
- So we need a third part, a "text-to-speech" program that can read the translated text and speak it in the new language. Happily, there are numerous such products on the market. In fact, if you're using Windows you already have a very basic text-to-speech capability, called Narrator, as part of Windows' "Accessibility" features. Other, more advanced text-to-speech programs are available, such as one offered by ReadPlease. (ReadPlease offers samples of its output in several languages; you may be surprised at how good it can be.)
Now we're ready to string the three pieces together. You speak in your language and the first element converts your words into text. The second element translates your text into the target "foreign" language, and finally the third element speaks the translated text to your listener. When your listener replies in her language, the process is reversed so that the device speaks back to you in your language. Automatically.
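To make the "string the three pieces together" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any product named above actually works: the class name SpeechTranslator, the three stage functions, and the tiny phrase table are all hypothetical stand-ins, and the "audio" is simply text encoded as bytes so that the example can run without a real speech engine.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is just a function, so a real engine could be swapped in later.
SpeechToText = Callable[[bytes, str], str]   # (audio, language) -> text
Translate = Callable[[str, str, str], str]   # (text, source, target) -> text
TextToSpeech = Callable[[str, str], bytes]   # (text, language) -> audio


@dataclass
class SpeechTranslator:
    """Hypothetical wrapper that chains the three stages in order."""
    recognize: SpeechToText
    translate: Translate
    speak: TextToSpeech

    def relay(self, audio: bytes, source: str, target: str) -> bytes:
        text = self.recognize(audio, source)               # stage 1
        translated = self.translate(text, source, target)  # stage 2
        return self.speak(translated, target)              # stage 3


# --- Toy stand-ins (assumptions) so the sketch runs on its own ---
def fake_recognize(audio: bytes, language: str) -> str:
    # Pretend the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


PHRASES = {
    ("en", "fr"): {"hello": "bonjour", "thank you": "merci"},
    ("fr", "en"): {"bonjour": "hello", "merci": "thank you"},
}


def fake_translate(text: str, source: str, target: str) -> str:
    table = PHRASES.get((source, target), {})
    # Unknown phrases pass through untranslated.
    return table.get(text.lower(), text)


def fake_speak(text: str, language: str) -> bytes:
    # Pretend that encoding the text is "speaking" it.
    return text.encode("utf-8")


device = SpeechTranslator(fake_recognize, fake_translate, fake_speak)
outgoing = device.relay(b"hello", "en", "fr")  # me -> listener
reply = device.relay(b"merci", "fr", "en")     # listener -> me
```

The reply uses the same relay call with the two languages swapped, which is the "process is reversed" step. Because each stage is an ordinary function, a weak recognizer or translator could be replaced without touching the other two.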
Kludging those three pieces together is not a process for the faint of heart, but there are already dedicated devices on the market that begin to do just that. For example, Ectaco is one company already offering several models of pocket devices that perform this translating speech-to-speech solution, such as the SpeechGuard TL-4. I haven't used one of these myself, but I hope to bring one on my next trip to see how the hype meets the reality. (Not to mention that I'll enjoy people's reactions to my using what might seem like an early Star Trek goodie!)
To See Is To Believe.
OK, speech-to-translated-speech devices, at least as they mature, might deal with verbal international illiteracy, but what about the visual version? The inability to read a foreign language impacts everything from driving to ordering in a restaurant to reading a contract.
But this problem, too, has the potential to fall to technology's march forward, even though it won't happen nearly as soon.
The first thing we need is software that can "look" at an image delivered by a camera, and in real time recognize that certain elements in the picture represent writing. (Although lots of work is going on in this direction, a general solution is still beyond our abilities. Today.)
But once we do have such software, the picture of any writing can be converted to text (a process often called Optical Character Recognition, or OCR; your scanner probably came with software that does a fair job of this).
Then, as before, existing translation software can convert the "foreign" text into your language, which you could read. But safety could easily become an issue if I spent much time looking at a translation display as I drove around, or even as I walked the streets (buses and the like are not very forgiving of someone stepping off a curb). This, then, is where "augmented reality" comes in.
This is the concept of adding computer-generated information TO our normal senses. In this case, imagine that my translation computer was integrated with a pair of eyeglasses that contained a camera to see writing, and a microphone to listen to words. The glasses would also contain a display that adds information into my field of view. MicroOptical Corp. has produced a concept drawing of such a device.
In fact, if it were sophisticated enough, the system might actually obscure any "foreign" writing with its translated version in my language -- the sign would simply appear as if it were in my language!
It's Not Entirely Sci. Fi!
Consumer eyeglasses-mounted displays are on the way. Consider, for example, a working eyeglass display prototype from Olympus. It's being used to overlay information about a train's schedule as the wearer watches the train roar into the station.
We're not ready for these displays, much less the full audio and visual translation solutions, to spread like iPods, but we're certainly moving in that direction. Many groups are working towards machine visual recognition, and the military already uses sophisticated augmented reality displays to enhance fighter pilots' reaction times, and for other uses.
As these technologies mature, it’s a foregone conclusion that the pieces will come together to take down that Tower of Babel.
The Global Village.
The global village is a nice and valuable concept. And in some ways, such as cross-national product development, manufacturing, and customer service, it's already a reality. But until we break the multiple language barrier by one means or another, we will continue to find it very difficult to play in a global pond and get along with our "playmates."
The Internet has shrunk many aspects of our world. Now we need to become comfortable, efficient, and effective in traveling through and working in it.
This essay is original and was specifically prepared for publication at Future Brief. A brief biography of Jeff Harrow can be found at our main Commentary page. Other essays written by Jeff Harrow can be found at his web site. Jeff receives e-mail at email@example.com. Other websites are welcome to link to this essay, with proper credit given to Future Brief and Mr. Harrow. This page will remain posted on the Internet indefinitely at this web address to provide a stable page for those linking to it.
© 2006, Jeffrey Harrow, all rights reserved.