Speech Application Language Tags full report
computer science technology
Active In SP
**

Posts: 740
Joined: Jan 2010
#1
29-01-2010, 08:18 AM



.doc   Speech Application Language Tags.doc (Size: 72.5 KB / Downloads: 448)

ABSTRACT:
Speech Application Language Tags (SALT) is a small number of XML
elements that may be embedded into host programming languages to
speech-enable applications. Enabling users to speak and listen to a
computer will greatly enhance the ability of users to access
computers at any time from nearly any place. SALT may be
used to develop telephony (speech input and output only) applications
and multimodal
applications (speech input and output, as well as keyboard and mouse
input and display
output). SALT and the host programming language provide control
structures not available in
VoiceXML, the current standard language for developing speech
applications.
INTRODUCTION:
Speaking and listening are so
fundamental that people take them for granted.
Every day, people ask questions. They give instructions. Speaking and
listening are necessary
for learning and training, for selling and buying, for persuading and
agreeing, and for most
social interactions. For the majority of people, speaking and
understanding spoken speech is
simply the most convenient and natural way of interacting with other
people.
So, is it possible to speak and listen to a computer?
Yes.
Emerging technology enables users to speak and listen to the computer
now. Speech
recognition converts spoken words and phrases into text, and speech
synthesis converts text
to human-like spoken words and phrases. While speech recognition and
synthesis have long
been in the research stage, three recent advances have enabled speech
recognition and
synthesis technologies to be used in real products and services: (1)
faster, more powerful
computer technology, (2) improved algorithms using speech data
captured from the real
world, and (3) improved strategies for using speech recognition and
speech synthesis in
conversational dialogs enabling users to speak and listen to the
computer.
MOTIVATION FOR SPEAKING AND LISTENING TO A COMPUTER:
Speech applications enable users to speak and listen to a computer
despite physical
impairments such as blindness or poor physical dexterity. Speaking
enables impaired callers
to access computers. Callers with poor physical dexterity (who cannot
type) can use speech to
enter requests to the computer. The sight-impaired can listen to the
computer as it speaks.
When visual and/or mechanical interfaces are not an option, callers
can perform transactions
by saying what they want done and supplying the appropriate
information. If a person with
impairments can speak and listen, that person can use a computer.
TO BYPASS THE LIMITATIONS OF SMALL KEYBOARDS AND SCREENS:
As devices become smaller, our fingers do not. Keys on the
keypad often shrink to the point where people with thick fingers
press two or more keys with
one finger stroke. The small screens on some cell phones may be
difficult to see, especially
in extreme lighting conditions. Even PDAs with QWERTY keyboards are
awkward.
(QWERTY is a sequence of six keys found on traditional keyboards used
by most English and
Western-European language speakers.) Users hold the device with one
hand and "hunt and
peck" with the forefinger of the other hand. It is impossible to use
both hands to touch-type
and hold the device at the same time. By speaking, callers can bypass
the keypad (except
possibly for entering private data in crowded or noisy environments).
By speaking and
listening, callers can bypass the small screen of many handheld
electronic devices.
IF THE DEVICE HAS NO KEYBOARD:

Many devices have no keypad or keyboard. For
example, stoves, refrigerators, and heating and air conditioning
thermostats have no
keyboards. These appliances may have a small control panel with a
couple of buttons and a
dial. The physical controls are good for turning the appliance on and
off and adjusting its
temperature and time. Without speech, a user cannot specify complex
instructions such as,
"Turn the temperature in the oven to 350 degrees for 30 minutes, then
change the temperature
to 250 degrees for 15 minutes, and finally leave the oven on warm."
Without speech, the
appliance cannot ask questions such as, "When on Saturday morning do
you turn the heat
on?" Any sophisticated dialog with these appliances will require
speech input. And speech
can be used with rotary phones, which do not have a keypad.
WHILE CALLERS WORK WITH THEIR HANDS AND EYES:

Speaking and listening are
especially useful in situations where the caller's eyes and/or hands
are busy. Drivers need to
keep their eyes on the road and their hands on the steering wheel. If
they must use a computer
when driving, the interface should be speech only. When driving
machines requiring their
hands to operate controls and their eyes to focus on the machine
activities, machine operators
can also use speech to communicate with a computer. (Although it is
not recommended that
you hold and use a cell phone while driving a car.) Mothers and
caregivers with children in
their arms may also appreciate speaking and listening to a doctor's
Web page or medical
service. If a person can speak and listen to others while they work,
they can speak and listen
to a computer while they work.
AT ANY TIME DURING THE DAY:
Many
telephone help lines and receptionists are
available only during working hours. Computers can automate much of
this activity, such as
accepting messages, providing information, and answering callers'
questions. Callers can
access these automated services 24 hours a day, 7 days a week via a
telephone by speaking
and listening to a computer. If a person can speak and listen, they
can interact with a
computer at any time
during the day or night.
WITH INSTANT CONNECTION WITHOUT BEING PLACED ON HOLD:

Callers become
frustrated when they hear "your call is very important to us" because
this message means
they must wait. "Thanks for waiting, all of our operators are busy"
means more waiting.
When using speech to interact with an application, there are no hold
times. The computer
responds quickly. (However, computers can become saturated which
results in delays; but
these occur less frequently than callers waiting for a human
operator.) Because many callers
can be serviced by voice-enabled applications, the human operators
are freed to resolve more
difficult caller problems.
USING LANGUAGES THAT DO NOT LEND THEMSELVES TO KEYBOARDING:
Some languages do not lend themselves to data entry using the
traditional QWERTY
keyboard. Rather than force Asian language users to mentally
translate their words and
phrases to phonetic sounds and then press the corresponding keys on
the QWERTY
keyboard, a much better solution is to speak and listen. Speech and
handwriting recognition
will be the key to enabling Asian language speakers to gain full use
of computers. If a person
can speak and listen to an Asian language, they can interact with a
computer using that
language.
TO CONVEY EMOTION:
To convey emotions in written text, callers
frequently use emoticons, keyboard symbols that enhance their text
messages. Example emoticons include :) for happy or a joke and :( for
sad. With speech,
these emotions can be conveyed naturally by changing the inflection,
speed, and volume of
the speaking voice.
TO USE MULTIPLE CHANNELS OF COMMUNICATION BETWEEN USER AND COMPUTER:
Speech enhances traditional GUI user
interfaces by enabling users to speak as well as click
and type, and hear as well as read. Multimodal user interfaces will
improve the exchange of
information between users and computers by transferring information
in the most appropriate
mode: speech for simple requests and simple answers, and GUIs for
complex requests and
graphical and pictorial answers.
LANGUAGES FOR SPEECH APPLICATIONS:
This new environment led to the creation of VoiceXML, an XML-based
declarative language
for describing the exchange of spoken information between users and
computers, and of related
languages. The related languages include the Speech Recognition
Grammar Specification
(SRGS) for describing what words and phrases the computer should
listen for and the Speech
Synthesis Markup Language (SSML) for describing how text should be
rendered as verbal
speech. VoiceXML is widely used to develop voice-only user interfaces
for telephone and
cell phone users.
VoiceXML uses predefined control structures, enabling developers to
specify what should be
spoken and heard, but not the low level details of how those
operations occur. As is the case
with many special-purpose declarative languages, developers sometimes
prefer to write their
own procedural instructions. Speech Application Language Tags (SALT)
was developed to
enable Web developers to use traditional Web development languages to
specify the control
and use a small number of XML elements for managing speech. In
addition to its use in
telephony applications, SALT can be used for multimodal
applications where people use
multiple modes of input: speaking, as well as typing and selecting
(pointing).
SALT:
The SALT Forum [saltforum], originally consisting of Cisco,
Comverse, Intel, Microsoft, Philips, and SpeechWorks (now ScanSoft),
published the initial specification in June 2002. This specification
was contributed to the World Wide Web Consortium (W3C) in August of
that year. Later, in June 2003, the SALT Forum contributed a
SALT profile for Scalable Vector Graphics (SVG) to the W3C.
The SALT specification contains a small number of XML elements
enabling speech output to the user, called prompts, and speech input
from the user, called responses. SALT elements include:
• <prompt>: presents audio recordings and synthesized speech to
the user. SALT also contains a prompt queue and commands for managing
the presentation of prompts on the queue to the user.
• <listen>: recognizes spoken words and phrases. There are three
listen modes:
   Automatic: used for recognition in telephony or hands-free
   scenarios. The speech platform, rather than the application,
   controls when to stop the recognition facility.
   Single: used for push-to-talk applications. An explicit stop from
   the application returns the recognition result.
   Multiple: used for open-microphone or dictation applications.
   Recognition results are returned at intervals until the application
   makes an explicit stop.
• <grammar>: specifies the words and phrases a user might speak
• <dtmf>: recognizes DTMF (telephone touch-tones)
• <record>: captures spoken speech, music, and other sounds
• <bind>: integrates recognized words and phrases with application
logic
• <smex>: communicates with other platform components
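The three listen modes differ mainly in who ends recognition and how many results come back. The following JavaScript sketch models only that difference. The function simulateListen and the event names (phrase, silence, stop) are hypothetical and are not part of SALT; a real platform decides these things inside the speech engine.

```javascript
// Hypothetical model of the three <listen> modes described above.
// Input: a mode name and an ordered list of events, where each event is
//   { type: "phrase", text: "..." }  the user said something recognizable
//   { type: "silence" }              the platform detected the end of speech
//   { type: "stop" }                 the application made an explicit stop
// Output: the recognition results delivered to the application, in order.
function simulateListen(mode, events) {
  const results = [];
  let pending = []; // phrases heard since the last delivered result

  // Deliver everything heard so far as one recognition result.
  const flush = () => {
    if (pending.length > 0) {
      results.push(pending.join(" "));
      pending = [];
    }
  };

  for (const event of events) {
    if (event.type === "phrase") {
      pending.push(event.text);
    } else if (event.type === "silence") {
      if (mode === "automatic") {
        flush();          // the platform decides recognition is over
        return results;
      }
      if (mode === "multiple") {
        flush();          // deliver an interim result and keep listening
      }
      // In "single" mode silence is ignored: only the application stops.
    } else if (event.type === "stop") {
      flush();            // an explicit stop returns what was heard
      return results;
    }
  }
  return results;
}

const events = [
  { type: "phrase", text: "flights" },
  { type: "phrase", text: "to Boston" },
  { type: "silence" },
  { type: "phrase", text: "on Saturday" },
  { type: "silence" },
  { type: "stop" },
];
console.log(simulateListen("automatic", events));
console.log(simulateListen("single", events));
console.log(simulateListen("multiple", events));
```

With the same event stream, the automatic mode returns one result at the first silence, the single mode returns one result at the explicit stop, and the multiple mode returns a result at each pause until the stop.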
SALT designers divided the SALT functionality into multiple
profiles that can be implemented
and used independently of the remaining SALT modules. Various devices
may use different
combinations of profiles. Devices with limited processor power or
memory need not support
all features (for example, mobile devices do not need to support
dictation). Devices may be
tailored to particular environments (for example, telephony support
may not be necessary for
television set-top boxes). While full application portability is
possible within devices using
the same profile, there is limited portability across devices with
different profiles.
SALT has no control elements, such as <for> or <goto>, so developers
embed SALT
elements into other languages, called host languages. For example,
SALT elements may be
embedded into languages such as XHTML, SVG, and JavaScript. Developers
use the host
language to specify application functions and execution control while
the SALT elements
provide advanced input and output using speech recognition and speech
synthesis.
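Because SALT has no control elements of its own, the host-language script decides what to ask next. The JavaScript sketch below illustrates that division of labor for a system-directed dialog. Everything in it is hypothetical: the steps table, the field names, and the recognize callback, which merely stands in for a <listen> element and is not a SALT API.

```javascript
// Hypothetical system-directed dialog: the host-language script chooses
// the next question; SALT elements would only play the prompt and
// recognize the answer. All names here are illustrative.
const steps = [
  { field: "origin", prompt: "Where are you flying from?" },
  { field: "destination", prompt: "Where are you flying to?" },
  { field: "date", prompt: "On what date?" },
];

// Return the first step whose field is still empty, or null when done.
function nextStep(form) {
  return steps.find((step) => !form[step.field]) || null;
}

// Drive the dialog. `recognize` stands in for a <listen> element: it
// receives the prompt text and returns the recognized answer, or null
// when nothing was recognized. `maxTurns` bounds the loop.
function runDialog(recognize, maxTurns = 10) {
  const form = {};
  const transcript = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = nextStep(form);
    if (step === null) break;        // every field is filled
    transcript.push(step.prompt);    // the prompt that would be played
    const answer = recognize(step.prompt);
    if (answer) form[step.field] = answer; // otherwise ask again next turn
  }
  return { form, transcript };
}

// Usage with canned answers; the second answer simulates a failed
// recognition, so the destination question is asked twice.
const canned = ["San Francisco", null, "Boston", "Saturday"];
const outcome = runDialog(() => canned.shift());
console.log(outcome.form);
console.log(outcome.transcript);
```

The loop, the test for an empty field, and the retry are all ordinary host-language code, which is exactly the control structure that SALT leaves to the host.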
ARCHITECTURES FOR SALT APPLICATIONS:

Users interact with telephony
applications using a telephone, cell phone, or other mobile device
with a microphone and
speaker. The hardware architecture for telephony applications,
illustrated in Figure 1,
contains:
• Web server: contains HTML, SALT, and embedded scripts. The
scripts control the dialog flow, such as the order for playing audio
prompts to the caller.
• Telephony server: connects the IP network (and the speech
server) to the telephone network.
• Speech server: contains a speech recognition engine, which
converts spoken speech into text; a speech synthesis engine, which
converts text to human-sounding speech; and an audio subsystem for
playing prompts and responses back to the user.
• Client devices: the devices to which the user listens and speaks,
such as mobile telephones and telephony-enabled PDAs.
There are numerous variations for the architecture shown in Figure 1.
A small speech
recognition engine could reside in the user device (for example, to
recognize a small number
of command and control instructions), or it may be distributed across
the device and speech
server (the device performs DSP functions on spoken speech,
extracting speech features
that are transmitted to the speech server which concludes the speech
recognition processing).
The various servers may be combined or replicated depending upon the
workload. And the
telephony server could be replaced by internet connections to
speech-enabled desktop
devices, bypassing the telephone communication system entirely.
Some mobile devices, and most desktop devices, have screens and input
devices such as
keyboard, mouse, and stylus. These devices support multimodal
applications, which support
more than one mode of input from the user, including keyed text,
handwriting and pen
gestures, and spoken speech.
TELEPHONY AND MULTIMODAL APPLICATIONS USING SALT:
Figure 2 illustrates a sample telephony application written with SALT
elements embedded in
HTML. The bolded code in Figure 2 will be replaced by the bolded code
in Figure 3, which
illustrates the same application as a multimodal application.
Figure 3 illustrates a typical multimodal application written with
SALT embedded in HTML.
In this application, the user may either speak or type to enter
values into the text boxes. Note
that the code in Figure 3 is somewhat different from the code in
Figure 2. This is because
many telephony applications are system-directed (the system guides
the user by asking
questions which the user answers), while as with visual-only
applications, multimodal
applications are often user-directed (the user indicates which data
will be entered by clicking
a mouse or pointing with a stylus, and then entering the data).
Programming with SALT is different from programming traditional
visual applications in the
following ways:
• If the developer does not like how the speech synthesizer
renders text as human-understandable voice, the developer may add
Speech Synthesis Markup Language (SSML) elements to the text to
provide hints for the speech synthesis system. For example, the
developer could insert a <break time = "500ms"/> element to instruct
the speech synthesizer to remain silent for 500 milliseconds. SSML is
a W3C standard and is used by both SALT and VoiceXML 2.0/2.1.
• The developer must supply a grammar to describe the words and
phrases users are likely to say. Grammars help the speech recognition
system recognize words faster and more accurately. SALT (and VoiceXML
2.0/2.1) developers specify grammars using the Speech Recognition
Grammar Specification (SRGS), another W3C standard. An example
grammar is illustrated in Figure 2, lines 44-54. Application
developers should spend effort to fine-tune the grammars to recognize
the words frequently spoken by the user at each point in the dialog,
and to fine-tune the wording of the prompts to encourage users to
speak those words and phrases.
• Speech recognition systems do not understand spoken speech
perfectly. (Even humans occasionally misunderstand what others say.)
In the best circumstances, speech recognition engines fail to
accurately recognize three to five percent of spoken words.
Developers compensate for poor speech recognition by writing event
handlers that help users overcome recognition problems by prompting
the user to speak again, often rephrasing the question so the user
responds with different words. Example event handlers are illustrated
in Figure 2, lines 35-37 and lines 38-40. Developers may spend as much
as 30 to 40 percent of their time writing event handlers, which are
needed only occasionally but are essential when the speech
recognition system fails.
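The recovery strategy described in the last point (ask again, with different wording, a bounded number of times) can be sketched in the host language. The JavaScript below is a minimal illustration under stated assumptions: askWithRetries and the recognize callback are hypothetical names, and recognize simply stands in for whatever a real no-match or silence event handler would report.

```javascript
// Hypothetical no-match handling: rephrase the question on each failed
// attempt and give up once the rephrasings run out. `recognize` stands
// in for the speech recognizer; it returns the recognized string, or
// null when nothing matched the grammar.
function askWithRetries(prompts, recognize) {
  const spoken = []; // every prompt actually played, in order
  for (const prompt of prompts) {
    spoken.push(prompt);
    const result = recognize(prompt);
    if (result !== null) {
      return { value: result, attempts: spoken.length, spoken };
    }
  }
  // All rephrasings failed; the caller might transfer to an operator.
  return { value: null, attempts: spoken.length, spoken };
}

// Usage: the first attempt fails, the rephrased question succeeds.
const cityPrompts = [
  "Which city?",
  "Sorry, I did not understand. Please say the name of the city.",
  "Please say a city name, for example, Boston.",
];
let calls = 0;
const answer = askWithRetries(cityPrompts, () => (++calls === 1 ? null : "Boston"));
console.log(answer.value, answer.attempts);
```

Keeping the rephrasings in a list makes the fallback wording easy to tune alongside the grammar, which is where the text says much of the development effort goes.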
COMPARISON OF SALT WITH VOICEXML:
SALT and VoiceXML enable very different approaches for developing
speech applications.
SALT tags control the speech medium (speech synthesis, speech
recognition, audio capture,
audio replay, and DTMF recognition). SALT tags are often embedded
into another
language that specifies flow control and turn taking. On the other
hand, VoiceXML is a
stand-alone language which controls the speech medium as well as flow
control and turn-
taking.
In VoiceXML the details of flow control are managed by a special
algorithm called the Form Interpretation Algorithm. For this reason,
many developers consider VoiceXML a declarative language. SALT, on
the other hand, is frequently embedded into a host programming
language that many developers consider procedural. It should be
noted, however, that
SALT can be used as
a stand-alone declarative language by using the assignment and
conditional features of the
<bind> statement. Thus, SALT can be used in resource-scarce platforms
such as cell phones
that cannot support a host language. For details, see section 2.6.1.3
in the SALT specification.
While SALT and VoiceXML make it easy to implement speech-enabled
applications, it is
difficult to design a quality speech application. An HTML programmer
easily learns how to
write SALT applications, but designing a usable speech or multimodal
application is still
more of an art than a science. [Balentine] and [Cohen] present
guidelines and heuristics for
designing effective speech dialogs. A series of iterative designs and
usability tests is
necessary to produce speech applications that users both enjoy
and use efficiently to
perform their desired computer tasks.
CONCLUSION:
When this article was written, it was not clear whether SALT would
overtake and replace VoiceXML
as the most widely used language for writing telephony applications.
It was also not clear whether
SALT or some other language would become the preferred language for
developing multimodal
applications. The availability of high-level design tools, code
generators, and system
development environments that hide the choice of development language
from the speech
application developer may minimize the importance of programming
language choice.
FURTHER READING:
Balentine, B., and Morgan, D. P. (2004). How to Build a Speech
Recognition Application: A
Style Guide for Telephony Dialogues (2nd edition), 1999, San Ramon,
CA: Enterprise
Integration Group.
Cohen, M. H., Giangola, J. P., and Balogh, J. (2004). Voice User Interface
Design, Addison
Wesley.
Speech Application Language Tags Specification, Version 1.0, 15 July
2002,
saltforum
Speech Recognition Grammar Specification (SRGS), Version 1.0, W3C
Recommendation,
16 March 2004, w3TR/2004/REC-speech-grammar-20040316/
Speech Synthesis Markup Language (SSML), Version 1.0, W3C Proposed
Recommendation,
15 July 2004, w3TR/2004/PR-speech-synthesis-20040715/
Voice Extensible Markup Language (VoiceXML), Version 2.0, W3C
Recommendation, 16
March 2004, w3TR/2004/REC-voicexml20-20040316/
seminar topics
Active In SP
**

Posts: 559
Joined: Mar 2010
#2
22-03-2010, 07:12 PM


Definition
Advances in several fundamental technologies are making possible mobile computing platforms of unprecedented power. In the speech and voice technology business fields SALT has been introduced as a new tool. SALT supplies a critical missing component, facilitating intuitive speech-based interfaces that anyone can master. Verizon Wireless has joined the SALT Forum to make speech applications more accessible to wireless customers. The SALT specification defines a set of lightweight tags as extensions to commonly used Web-based programming languages, strengthened by incorporating existing standards from the World Wide Web Consortium (W3C) and the Internet Engineering Task Force. In multimodal applications, the tags can be added to support speech input and output either as standalone events or jointly with other interface options such as speaking while pointing to the screen with a stylus. In telephony applications, the tags provide a programming interface to manage the speech recognition and text-to-speech resources needed to conduct interactive dialogs with the caller through a speech-only interface.

SALT (Speech Application Language Tags) is a speech interface markup language. It extends HTML and other markup languages (XHTML, WML) with a powerful speech interface for Web pages, while maintaining and leveraging all the advantages of the Web application model. The tags are designed for both voice-only browsers (for example, a browser accessed over the telephone) and multimodal browsers. SALT is a small set of XML elements, with associated attributes and DOM object properties, events, and methods, which may be used in conjunction with a source markup document to apply a speech interface to the source page. The SALT formalism and semantics are independent of the nature of the source document, so SALT can be used equally effectively within HTML and all its flavors, with WML, or with any other SGML-derived markup. SALT targets speech applications across a wide range of devices, including telephones, PDAs, tablet computers, and desktop PCs, and it takes into account that these devices have different methods of inputting data.

SALT provides multimodal access, in which users can interact with an application in a variety of ways: input with speech, a keyboard, keypad, mouse, and/or stylus; and output as synthesized speech, audio, plain text, motion video, and/or graphics. Each of these modes can be used independently or concurrently. For example, a user might click on a flight info icon on a device and say "Show me the flights from San Francisco to Boston after 7 p.m. on Saturday" and have the browser display a Web page with the corresponding flights.

There are three major challenges that SALT will help address.

1. Input on wireless devices:
Wireless devices are becoming pervasive, but lack of a natural input mechanism hinders adoption as well as application development on these devices.

2. Speech-enabled application development:
By enabling integration between existing Web browser software, server and network infrastructure, and speech technology, SALT will allow many more Web sites to be reachable through telephones.

3. Telephony applications:
There are 1.6 billion telephones in the world, but only a relatively small fraction of Web applications and services are reachable by phone.

.ppt   SPEECH APPLICATION LANGUAGE TAGS.ppt (Size: 1.4 MB / Downloads: 295)

OVERVIEW

Introduction
Who developed SALT
What kind of application can be built with SALT
Elements of SALT
How SALT works
Architecture of SALT
SALT standards
SALT-Adding speech to GUI based application
Benefits of SALT
Examples of SALT
Voice XML versus SALT
Conclusion
References
INTRODUCTION

• SALT (Speech Application Language Tags)
  - is an extension of HTML
  - consists of a small set of XML elements (tags)
  - adds a powerful speech interface to Web pages.
• SALT can be used for both
  - voice-only browsers
  - multimodal browsers.
WHO DEVELOPED SALT?

• The SALT spec (version 1.0) was developed by the SALT Forum
  - saltforum.org /
and later contributed to the W3C
  - w3

• The SALT Forum was founded by
  - Microsoft, Cisco, SpeechWorks, Philips, Comverse and Intel.
What kind of applications can we build with SALT?

SALT can be used to add speech recognition, synthesis, and telephony capabilities to HTML- or XHTML-based applications, making them accessible from telephones or from GUI-based devices such as PCs, tablet PCs, and wireless personal digital assistants (PDAs).
Hello SALT

<html xmlns:salt = "saltforum2002/SALT">
<body onload = "hello.Start()">
<salt:prompt id = "hello">
Hello World
</salt:prompt>
</body>
</html>
EXPLANATION

• SALT tags have been added to the HTML document:
xmlns:salt defines a namespace
<salt:prompt> defines a speech prompt

• The document needs to be loaded in a SALT 1.0 compatible browser.

• Methods such as Start() initiate SALT tags.

• The page would say "Hello World" using a text-to-speech engine.
The main top-level elements
<prompt ...>
For speech synthesis configuration and prompt playing
<listen ...>
For speech recognizer configuration, recognition execution and post-processing, and recording
<dtmf ...>
For configuration and control of DTMF collection
<smex ...>
For general-purpose communication with platform components

The input elements <listen> and <dtmf> also contain grammars and binding controls

<grammar ...>
For specifying input grammar resources

<bind ...>
For processing of recognition results

<record ...>
For recording audio input

<prompt>
• Simple TTS prompt
<salt:prompt id = "Welcome">
Welcome to your SALT application.
What would you like to do?
</salt:prompt>
• Pre-recorded audio
<salt:prompt id = "RecordedPrompt">
<content href = "welcome.wav"/>
</salt:prompt>
<listen>
• Using <listen> for speech recognition:
<salt:listen id = "listenEmployeeName">
<grammar src = "MyGrammar.grxml"/>
<bind targetelement = "txtName"
value = "//employee_name"/>
</salt:listen>

• Note: once the utterance is recognised, the value selected by "//employee_name" is bound to "txtName".
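What the <bind> element in this example does can be approximated in plain JavaScript. The sketch below is only an analogy: it treats the recognition result as a nested object instead of an XML document, supports only queries of the form //name, and uses hypothetical helper names (findDescendant, bind) and a made-up result shape. None of this is SALT's actual implementation.

```javascript
// Find the first property called `name` anywhere in a nested object,
// searching depth-first. This loosely mimics a query such as "//name".
function findDescendant(node, name) {
  if (node === null || typeof node !== "object") return undefined;
  for (const [key, value] of Object.entries(node)) {
    if (key === name) return value;
    const found = findDescendant(value, name);
    if (found !== undefined) return found;
  }
  return undefined;
}

// Copy the selected value into the target's `value` property, roughly
// what a bind with a targetelement and a value query does for a text box.
// The target is left unchanged when the query selects nothing.
function bind(result, query, target) {
  if (!query.startsWith("//")) {
    throw new Error("this sketch only supports //name queries");
  }
  const selected = findDescendant(result, query.slice(2));
  if (selected !== undefined) target.value = String(selected);
  return target;
}

// Usage: a made-up recognition result and a stand-in for the text box.
const recoResult = { result: { employee_name: "Alice Smith", confidence: 0.9 } };
const txtName = { value: "" };
bind(recoResult, "//employee_name", txtName);
console.log(txtName.value);
```

The point of the declarative form is that the page author writes only the query and the target; the copying step shown here is performed by the SALT-capable browser.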
<listen>
• Using <listen> for voice recording:
<salt:listen id = "recordMessage"
onreco = "processMessage()">
<record beep = "true"/>
</salt:listen>
<script>
<![CDATA[
function processMessage() {
... ;
}
]]>
</script>
How SALT Works
MULTIMODAL

For multimodal applications, SALT can be added to a visual page to support speech input and/or output. This is a way to speech-enable individual controls, or to add more complex mixed-initiative capabilities if necessary.
A SALT recognition may be started by a browser event such as pen-down on a textbox.

How SALT Works
TELEPHONY

For applications without a visual display, SALT manages the interactional flow of the dialog and the extent of user initiative by using the HTML eventing and scripting model.
In this way, the full programmatic control of client-side (or server-side) code is available to application authors for the management of prompt playing and grammar activation.


SALT ARCHITECTURE

A Web server. This Web server generates Web pages containing HTML, SALT, and embedded script. The script controls the dialog flow for voice-only interactions.
A telephony server. This telephony server connects to the telephone network. The server incorporates a voice browser interpreting the HTML, SALT, and script. The browser can run in a separate process or thread for each caller.
A speech server. This speech server recognizes speech and plays audio prompts and responses back to the user.
The client device. Clients include, for example, a Pocket PC or desktop PC running a version of Internet Explorer capable of interpreting HTML and SALT.


SALT builds on existing Standards

SALT Standards


Speech interface XML: a lightweight addition to any mark-up that enhances the web model

Clean integration with mark-up

Uses W3C Voice Browser Group Standards


SALT: Adding speech to a GUI based application

Adds powerful tags to HTML, WML, cHTML, xHTML
• Speech Recognition: <listen>, <grammar>, <bind>, <record>, <param>
• Prompts and TTS: <prompt>, <value>, <content>, <param>

SALT Code Sample: Adding Multimodality
Benefits of SALT

• Benefits of using SALT:
  - reuse of application logic
  - rapid development
  - Speech+GUI

• Anybody wanting to speech-enable an application can use SALT.

• SALT markup is a good solution for adding speech.



Example
Code:
<html xmlns:salt = "saltforum2002/SALT">
    ...
    <input name = "txtBoxCity" type = "text" />
    <input name = "buttonCityListen" type = "button"
          onClick = "listenCity.Start();"/>
    ...

<!-- Speech Application Language Tags -->
  <salt:listen id = "listenCity">
         <salt:grammar name = "g_city" src = "city.grxml" />
         <salt:bind targetelement = "txtBoxCity" value = "//city" />
  </salt:listen>
</body>
</html>
Example: Multimodal Application
This example shows the <audiometer> element in action:

Example: Telephony Application
• This example uses Microsoft's Speech Application SDK 1.0:

VoiceXML versus SALT
• VoiceXML and SALT are both
  - markup languages
  - that describe speech interfaces.

• VoiceXML is designed for telephony applications:
  - interactive voice response applications are the focus.

• SALT targets speech applications across a whole spectrum:
  - multimodal interactions are the focus.

• VoiceXML contains a large number of elements
since it defines a data and execution model
in addition to a speech interface.

• VoiceXML deals not only with the user interface (e.g. <prompt>)
but also with data models (e.g. <form>, <field>) and
procedural programming (e.g. <if>, <goto>).

• SALT has only a handful of tags (e.g. <prompt>, <listen>)
because it focuses on the speech interface.

• SALT
  - does not define an execution model
but instead uses existing execution models (HTML +
JavaScript).
  - builds speech applications out of existing Web
applications.
  - enables multimodal dialogs on a variety of devices.
CONCLUSION


SALT: an open standard
Extends and embeds in existing web languages

Empowers millions of web developers

Spans voice-only and multimodal applications

Addition of speech brings REAL benefits to applications

SALT makes it easy to add speech and telephony to web applications

Microsoft .NET Speech supports SALT directly or indirectly through ASP.NET controls


psrujana
Active In SP
**

Posts: 2
Joined: Mar 2011
#3
03-03-2011, 07:53 PM

Please send me this seminar report and the presentation PPTs.
psrujana
Active In SP
**

Posts: 2
Joined: Mar 2011
#4
07-03-2011, 11:13 PM

Please send me the documentation for this topic.
seminar class
Active In SP
**

Posts: 5,361
Joined: Feb 2011
#5
24-03-2011, 09:50 AM

PRESENTED BY
TYSON SUNNY


.doc   tyson s.doc (Size: 407 KB / Downloads: 64)
INTRODUCTION
Speaking and listening are so fundamental that people take them for granted. Every day, people ask questions. They give instructions. Speaking and listening are necessary for learning and training, for selling and buying, for persuading and agreeing, and for most social interactions. For the majority of people, speaking and understanding spoken speech is simply the most convenient and natural way of interacting with other people.
So, is it possible to speak and listen to a computer?
Yes.
Emerging technology enables users to speak and listen to the computer now. Speech recognition converts spoken words and phrases into text, and speech synthesis converts text to human-like spoken words and phrases.
While speech recognition and synthesis have long been in the research stage, three recent advances have enabled speech recognition and synthesis technologies to be used in real products and services:
(1) faster, more powerful computer technology, (2) improved algorithms using speech data captured from the real world, and (3) improved strategies for using speech recognition and speech synthesis in conversational dialogs enabling users to speak and listen to the computer.
NEED FOR SPEAKING AND LISTENING TO A COMPUTER
Speech applications enable users to speak and listen to a computer despite physical impairments such as blindness or poor physical dexterity. Speaking enables impaired callers to access computers. Callers with poor physical dexterity (who cannot type) can use speech to enter requests to the computer. The sight-impaired can listen to the computer as it speaks.
When visual and/or mechanical interfaces are not an option, callers can perform transactions by saying what they want done and supplying the appropriate information. If a person with impairments can speak and listen, that person can use a computer.
TO BYPASS THE LIMITATIONS OF SMALL KEYBOARDS AND SCREENS:
As devices become smaller, our fingers do not. Keys on the keypad often shrink to the point where people with thick fingers press two or more keys with one finger stroke. The small screens on some cell phones may be difficult to see, especially in extreme lighting conditions. Even PDAs with QWERTY keyboards are awkward. (QWERTY is a sequence of six keys found on traditional keyboards used by most English and Western-European language speakers.)
Users hold the device with one hand and “hunt and peck” with the forefinger of the other hand. It is impossible to use both hands to touch-type and hold the device at the same time. By speaking, callers can bypass the keypad (except possibly for entering private data in crowded or noisy environments). By speaking and listening, callers can bypass the small screen of many handheld electronic devices.
SPEECH APPLICATIONS
IF THE DEVICE HAS NO KEYBOARD:

Many devices have no keypad or keyboard. For example, stoves, refrigerators, and heating and air conditioning thermostats have no keyboards. These appliances may have a small control panel with a couple of buttons and a dial. The physical controls are good for turning the appliance on and off and adjusting its temperature and time.
Without speech, a user cannot specify complex instructions such as, “turn the temperature in the oven to 350 degrees for 30 minutes, then change the temperature to 250 degrees for 15 minutes, and finally leave the oven on warm.” Without speech, the appliance cannot ask questions such as, “When on Saturday morning do you turn the heat on?” Any sophisticated dialog with these appliances will require speech input. And speech can be used with rotary phones, which do not have a keypad.
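As a rough illustration of why such an instruction is hard to enter with two buttons and a dial, the sketch below turns the recognized text of the oven example into an ordered list of timed steps. It assumes a speech recognizer has already produced the text. The `OvenStep` type, the `parse_oven_command` function, and the regular expression are hypothetical names made up for this example; they are not part of SALT or of any appliance API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OvenStep:
    """One step of a spoken oven program (hypothetical structure)."""
    setting: str             # e.g. "350" (degrees) or "warm"
    minutes: Optional[int]   # None means "until told otherwise"


# Matches phrases such as "to 350 degrees for 30 minutes".
_TIMED = re.compile(r"to (\d+) degrees for (\d+) minutes")


def parse_oven_command(text: str) -> List[OvenStep]:
    """Turn already-recognized text into an ordered list of steps."""
    steps = [OvenStep(temp, int(mins)) for temp, mins in _TIMED.findall(text)]
    # A trailing "on warm" is treated as an open-ended final step.
    if re.search(r"\bon warm\b", text):
        steps.append(OvenStep("warm", None))
    return steps


utterance = (
    "turn the temperature in the oven to 350 degrees for 30 minutes, "
    "then change the temperature to 250 degrees for 15 minutes, "
    "and finally leave the oven on warm"
)
program = parse_oven_command(utterance)
for step in program:
    print(step)
```

A single spoken sentence yields three steps here, which is the kind of multi-part request a small control panel cannot express in one action. A real application would rely on a recognition grammar rather than a regular expression.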
WHILE CALLERS WORK WITH THEIR HANDS AND EYES:
Speaking and listening are especially useful in situations where the caller’s eyes and/or hands are busy. Drivers need to keep their eyes on the road and their hands on the steering wheel. If they must use a computer when driving, the interface should be speech only. (However, holding and using a cell phone while driving a car is not recommended.)
Machine operators whose hands must work the controls and whose eyes must stay on the machine can also use speech to communicate with a computer.
Mothers and caregivers with children in their arms may also appreciate speaking and listening to a doctor’s Web page or medical service. If a person can speak and listen to others while they work, they can speak and listen to a computer while they work.
AT ANYTIME DURING THE DAY:
Many telephone help lines and receptionists are available only during working hours. Computers can automate much of this activity, such as accepting messages, providing information, and answering callers’ questions. Callers can access these automated services 24 hours a day, 7 days a week via a telephone by speaking and listening to a computer. If a person can speak and listen, they can interact with a computer anytime during the day or night.
WITH INSTANT CONNECTION WITHOUT BEING ON “HOLD”:
Callers become frustrated when they hear “your call is very important to us” because this message means they must wait. “Thanks for waiting, all of our operators are busy” means more waiting. When using speech to interact with an application, there are no hold times.
The computer responds quickly. (Computers can become saturated, which results in delays, but this happens less often than callers having to wait for a human operator.) Because many callers can be served by voice-enabled applications, human operators are freed to resolve more difficult caller problems.
USING LANGUAGES THAT DO NOT LEND THEMSELVES TO KEYBOARDING:
Some languages do not lend themselves to data entry using the traditional QWERTY keyboard. Rather than forcing Asian language users to mentally translate their words and phrases to phonetic sounds and then press the corresponding keys on the QWERTY keyboard, a much better solution is to let them speak and listen.
Speech and handwriting recognition will be the key to enabling Asian language speakers to gain full use of computers. If a person can speak and listen to an Asian language, they can interact with a computer using that language.
TO CONVEY EMOTION:
To make written text convey emotion, callers frequently add emoticons (keyboard symbols that express feelings) to their text messages. Example emoticons include :-) for happy or a joke and :-( for sad. With speech, these emotions can be conveyed naturally by changing the inflection, speed, and volume of the speaking voice. This shows how important speech is for conveying emotion, and it makes speech valuable for anyone who wants to express feelings naturally. Speech applications have excellent scope today.
Reply
seminar paper
Active In SP
**

Posts: 6,455
Joined: Feb 2012
#6
14-02-2012, 02:48 PM

To get information about the topic Speech Application Language Tags full report, PPT, and related topics, refer to the links below.

topicideashow-to-speech-application-language-tags-full-report

topicideashow-to-speech-application-language-tags

topicideashow-to-speech-application-language-tags-salt

topicideashow-to-salt-speech-application-language-tags--3477

topicideashow-to-salt-speech-application-language-tags--1667

topicideashow-to-speech-application-language-tags-full-report?page=2

Reply
seminar paper
Active In SP
**

Posts: 6,455
Joined: Feb 2012
#7
29-02-2012, 12:01 PM

To get information about the topic Speech Application Language Tags full report, PPT, and related topics, refer to the links below.

topicideashow-to-speech-application-language-tags-full-report

topicideashow-to-speech-application-language-tags-salt

topicideashow-to-speech-application-language-tags

topicideashow-to-salt-speech-application-language-tags--3477

topicideashow-to-speech-application-language-tags-full-report?page=2

topicideashow-to-speech-application-language-tags-full-report?pid=13702
Reply
