A real-time system, distributed between a client and a server, that incorporates speech recognition and linguistic processing for recognizing a query spoken by a user is disclosed. The system accepts the user's queries in the form of speech at the client, where minimal processing extracts a sufficient number of acoustic speech vectors representing the utterance. These vectors are sent via a communications channel to the server, where additional acoustic vectors are derived. Using Hidden Markov Models (HMMs), together with appropriate grammars and dictionaries conditioned by the selections made by the user, the speech representing the user's query is fully decoded into text (or some other suitable form) at the server. The text corresponding to the user's query is then sent simultaneously to a natural language engine and to a database processor, where optimized SQL statements are constructed for a full-text search of a database for a recordset of several stored questions that best match the user's query. Further processing in the natural language engine narrows the search to a single stored question. The answer corresponding to this single stored question is then retrieved from the file path and sent to the client in compressed form. At the client, the answer to the user's query is articulated to the user by a text-to-speech engine in his or her native natural language. The system requires no training and can operate in several natural languages.
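
By way of illustration only, the two-stage matching described above (a SQL search that returns a recordset of candidate stored questions, followed by narrowing to a single stored question) might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the table `qa`, its columns, the stopword list, the answer file paths, and every function name are hypothetical. Simple `LIKE` matching stands in for the optimized full-text search, and a Jaccard keyword overlap stands in for the natural language engine.

```python
import sqlite3

# Hypothetical stopword list; a real system would use a fuller linguistic analysis.
STOPWORDS = {"a", "an", "the", "is", "are", "what", "how", "do", "does",
             "can", "i", "my", "to", "of", "in"}


def keywords(text):
    """Lowercase the text, replace punctuation with spaces, drop stopwords."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return [w for w in cleaned.split() if w not in STOPWORDS]


def build_db(pairs):
    """Create an in-memory table of (stored question, answer file path) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE qa (id INTEGER PRIMARY KEY, question TEXT, answer_path TEXT)"
    )
    conn.executemany("INSERT INTO qa (question, answer_path) VALUES (?, ?)", pairs)
    return conn


def candidate_recordset(conn, query, limit=5):
    """Stage 1: build a SQL statement that returns stored questions sharing
    at least one keyword with the decoded query text (assumed stand-in for
    the full-text search)."""
    kws = keywords(query)
    if not kws:
        return []
    where = " OR ".join("question LIKE ?" for _ in kws)
    sql = f"SELECT question, answer_path FROM qa WHERE {where} LIMIT ?"
    params = [f"%{k}%" for k in kws] + [limit]
    return conn.execute(sql, params).fetchall()


def best_match(query, recordset):
    """Stage 2: narrow the recordset to one stored question using Jaccard
    overlap of keyword sets (assumed stand-in for the natural language engine)."""
    q = set(keywords(query))

    def score(row):
        s = set(keywords(row[0]))
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    return max(recordset, key=score) if recordset else None


def answer_path_for(conn, query):
    """Return the answer file path of the single best stored question, or None."""
    match = best_match(query, candidate_recordset(conn, query))
    return match[1] if match else None


if __name__ == "__main__":
    # Hypothetical stored questions and answer paths, for demonstration only.
    demo = build_db([
        ("How do I reset my password?", "answers/reset_password.txt"),
        ("How do I change my email address?", "answers/change_email.txt"),
        ("How do I change my password?", "answers/change_password.txt"),
    ])
    print(answer_path_for(demo, "I want to change my password"))
```

In this sketch the first stage is deliberately permissive (any shared keyword admits a stored question into the recordset), and the second stage alone decides the single stored question whose answer path is returned.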