Boring conversation? Let your computer listen for you

MOST of us talk to our computers, if only to curse them when a glitch destroys hours of work. Sadly the computer doesn't usually listen, but new kinds of software are being developed that make conversing with a computer rather more productive.

The longest established of these is automatic speech recognition (ASR), the technology that converts the spoken word to text. More recently it has been joined by subtler techniques that go beyond what you say, and analyse how you say it. Between them they could help us communicate more effectively in situations where face-to-face conversation is not possible.

ASR has come a long way since 1964, when visitors to the World's Fair in New York were wowed by a device called the IBM Shoebox, which performed simple arithmetic calculations in response to voice commands. Yet people's perceptions of the usefulness of ASR have, if anything, diminished.

"State-of-the-art ASR has an error rate of 30 to 35 per cent," says Simon Tucker at the University of Sheffield, UK, "and that's just very annoying." Its shortcomings are highlighted by the plethora of web pages poking fun at some of the mistakes made by Google Voice, which turns voicemail messages into text.

What's more, even when ASR gets it right the results can be unsatisfactory, as simply transcribing what someone says often makes for awkward reading. People's speech can be peppered with repetition, or sentences that just tail off.

"Even if you had perfect transcription of the words, it's often the case that you still couldn't tell what was going on," says Alex Pentland, who directs the Human Dynamics Lab at the Massachusetts Institute of Technology. "People's language use is very indirect and idiomatic," he points out.

Despite these limitations, ASR has its uses, says Tucker. With colleagues at Sheffield and Steve Whittaker at IBM Research in Almaden, California, he has developed a system called Catchup, designed to summarise in almost real time what has been said at a business meeting so the latecomers can... well, catch up with what they missed. Catchup is able to identify the important words and phrases in an ASR transcript and edit out the unimportant ones. It does so by using the frequency with which a word appears as an indicator of its importance, having first ruled out a "stop list" of very common words. It leaves the text surrounding the important words in place to put them in context, and removes the rest.
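The approach described above (rank words by frequency after discarding a stop list, then keep only the text around the top-ranked words) can be sketched in a few lines of Python. This is a minimal illustration rather than Catchup's actual implementation: the stop list, the function names `important_words` and `compress`, and the `top_n` and `context` parameters are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative stop list only; the article does not say which words Catchup excludes.
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "we",
    "that", "for", "on", "so", "was", "be", "this", "with", "i", "you",
}


def important_words(words, top_n=3):
    """Return the top_n most frequent words that are not on the stop list."""
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {word for word, _ in counts.most_common(top_n)}


def compress(transcript, top_n=3, context=2):
    """Keep each important word plus `context` words either side; drop the rest."""
    tokens = transcript.split()
    # Strip punctuation and lower-case each token for counting and matching.
    normalised = [re.sub(r"[^\w']", "", t).lower() for t in tokens]
    keywords = important_words([w for w in normalised if w], top_n)

    keep = set()
    for i, word in enumerate(normalised):
        if word in keywords:
            start = max(0, i - context)
            end = min(len(tokens), i + context + 1)
            keep.update(range(start, end))

    # Emit the surviving tokens in their original order.
    return " ".join(tokens[i] for i in sorted(keep))


if __name__ == "__main__":
    meeting = "the budget is late and the budget is over so we cut the budget"
    print(compress(meeting, top_n=1, context=1))
```

Raising `context` keeps more of the surrounding speech and lowering `top_n` shortens the summary, which is the trade-off between length and intelligibility that the tests described below explore.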

A key feature of Catchup is that it then presents the result in audio form, so the latecomer hears a spoken summary rather than having to plough through a transcript. "It provides a much better user experience," says Tucker.

In tests of Catchup, its developers reported that around 80 per cent of subjects were able to understand the summary, even when it was less than half the length of the original conversation. A similar proportion said that it gave them a better idea of what they had missed than they could glean by trying to infer it from the portion of the meeting they could attend.

One advantage of the audio summary, rather than a written one, is that it preserves some of the social signals embedded in speech. A written transcript might show that one person spoke for several minutes, but it won't reveal the confidence or hesitancy in their voice. These signals "can be more important than what's actually said", says Steve Renals, a speech technologist at the University of Edinburgh, UK, who was one of the developers of the ASR technology used by Catchup.

First Carrier-Deployed Voice-to-SMS Application Hits iPhone App Store

Promptu Systems Corporation today announced that the Italian version of its fully automated voice-to-text messaging application created for Telecom Italia Mobile (TIM) is now available to Italian iPhone owners from Apple's App Store.

Like Promptu's forthcoming ShoutOUT, dettaSMS lets Italian speakers dictate their text messages in fluent, natural speech, instead of typing on the iPhone's touch-screen keypad. Transcribed SMS messages can be reviewed, edited and appended to before being sent.

"dettaSMS is the world's first carrier-deployed voice-to-text SMS iPhone application," said Giuseppe Staffaroni, Promptu's CEO. "To insure privacy, security, and scalability, message transcription is completely automatic -- no human is involved."

dettaSMS is integrated with Telecom Italia Mobile's billing system and built on Promptu's network speech recognition (NSR(TM)) architecture for speed, accuracy and scalability. Promptu's fully automated speech recognition delivers high accuracy, low latency and unparalleled security. User privacy is assured because the real-time voice signal is never processed manually.

 

M*Modal's Advanced Speech Recognition Technology is Incorporated into Scribe's Web-based Document Solutions

M*Modal today announced Scribe Healthcare Technology has incorporated the company's Speech Understanding technology into its web-based medical dictation, transcription and archival solutions.

Scribe's technology offerings simplify the business of medicine by providing web-based solutions for clinical information production, workflow management and analysis to healthcare providers and medical transcription service organizations that service them.

AppTek Bolsters Media Monitoring with Hybrid Machine Translation

In addition to providing coverage of dialects for automated speech recognition, MediaSphere provides AppTek’s hybrid machine translation system for customers.

MediaSphere is a software solution that offers multilingual transcripts of various television and radio stations for many domestic and international news bureaus. To adjust to changes in dialect and language in real time, MediaSphere makes use of AppTek’s speaker-adaptive speech recognition engine. Offering a unified and scalable solution, the updated media monitoring software seamlessly integrates AppTek’s HMT system with its ASR engine.

Voice Recognition Software Helps Florida Caseworkers Work Faster


A solution is emerging to enable reporting efficiency. Since July 2008 the department has been deploying voice recognition technology, designed to let fieldworkers dictate their notes while in the field. Software converts the dictation into typed copy, letting investigators spend more time on the road. Once back in the office, the fieldworker plugs the dictation device into a PC and gets a printed report.

Terre Haute company to test dictation software

InfraWare Inc., a medical transcription software company, announced Thursday it soon will begin beta testing its new dictation recognition engine aimed at increasing the speed and reducing the cost per line of medical transcription through intelligent back-end automation. The Terre Haute-based software developer and transcription ASP provider will marry its existing InfraWare 360 transcription speech recognition platform and its newly developed artificial intelligence engine to generate more accurate and less expensive first-draft text versions of physician audio dictations. Of the $1.2 million project budget, $871,000 was supplied by a grant from the state’s 21st Century Research and Technology Fund.

blinkx Named "Star Performer" by Speech Technology Magazine

blinkx today announced that it has been named a winner of Speech Technology magazine's 2008 Speech Industry Awards in the "Star Performers" category. Awards are presented to individuals and companies for extraordinary efforts made to increase benefits, acceptance and adoption of speech technologies and for outstanding accomplishments in bringing new products and services to the marketplace. The awards were presented recently at the 14th Annual SpeechTEK Conference and Exposition in New York City.

Philips, Oy Konttorityö introduce speech recognition-based hospital report management solution in Finland

Royal Philips Electronics announced today the release of a report management solution for Finnish hospitals that is expected to significantly speed up the availability of medical reports and information. Philips SpeechMagic Executive Advanced (SMEA) caters to a variety of medical documentation use cases from dictation and transcription to speech recognition. The system seamlessly integrates with any healthcare infrastructure and features the new Finnish MultiMed ConText for hospital-wide speech recognition. The solution strengthens Philips' position as the only company to provide healthcare speech recognition solutions for the major Nordic countries, including Finland, Norway, Denmark and Sweden.

Euromed to integrate SpeechMagic for accurate, convenient and efficient capturing of healthcare information

Euromed Networks today announced that it has signed a partnership with Philips Speech Recognition Systems for the integration and distribution of the company’s SpeechMagic platform in the United Kingdom and Ireland. Euromed Networks’ digital dictation and image management software MedSpeech now features Philips industrial-grade speech recognition for accurate, convenient and efficient information capturing. MedSpeech powered by SpeechMagic has been designed to seamlessly integrate with healthcare IT applications, such as hospital or radiology information systems (HIS/RIS) or picture archiving and communications systems (PACS). Highly scalable and network-based, the solution supports large volumes of dictations and a high number of simultaneous users.

East Midlands Procurement Hub Selects SRC for Digital Dictation and Speech Recognition

SRC has been awarded preferred supplier status for the provision of Digital Dictation and Speech Recognition solutions across the East Midlands Strategic Health Authority. The award enables all NHS Trusts across Derbyshire, Lincolnshire, Nottinghamshire, Leicestershire County & Rutland and Northamptonshire to contract with SRC for the provision of cost-effective Digital Dictation and Speech Recognition solutions without the need to carry out their own time-consuming and costly tendering process.