Open Source Speech Sessions

Wednesday, July 25

Will the Next Generation Internet Still Depend on Open Source?
Fred Baker, Cisco Systems
Track: Keynote
Date: Wednesday, July 25
Time: 9:00am - 9:30am
Location: Grande Ballrooms ABC in the East Tower

Sponsored by: IBM

How important is open source to the future of the Internet? The Internet evolved as it did because of open source software and open standards. The spirit of open source is best expressed by the Internet Engineering Task Force, which operates on the basis of "rough consensus and running code." However, today's Internet is not the playground it was a decade ago. While some applications, like Napster and AIM, use the open Internet effectively, the sacrifice of the end-to-end model makes deployment of innovative applications challenging. The introduction of so-called "middle boxes" - firewalls, translators, caches, and application layer gateways - means that new applications must either actively circumvent them or gain their cooperation.

In a highly competitive market, with a lot at stake, developing consensus as well as running code can be difficult. Industry consortia and business models may determine how the future of the Internet gets decided - and who makes those decisions. Cisco Fellow Fred Baker will talk about the challenges that will shape the Internet, and whether Open Source will play as big a role as it has in the past.


An Open Source Success Story on Wall Street
W. Phillip Moore, Open Source Advocate
Track: Keynote
Date: Wednesday, July 25
Time: 9:45am - 10:15am
Location: Grande Ballrooms ABC in the East Tower

Sponsored by: IBM

Morgan Stanley has what is widely recognized as one of the best IT departments in the financial industry, and has built one of the world's largest integrated and truly "Enterprise-wide" technology platforms for application deployment.

This infrastructure was architected with a combination of Open Source and proprietary software. This presentation will discuss the challenges faced, both technical and political, when deploying OSS on such a large scale and the problems managed as the environment changes and grows.

The discussion contrasts the OSS experience with that of proprietary closed source products in the same environment, and covers the lessons learned from this experience and how the OSS community can help make OSS a continued success.


Thursday, July 26

Shared Source vs. Open Source: Debate and Panel Discussion
Craig Mundie, Microsoft; Michael Tiemann, Red Hat, Inc.
Track: Keynote
Date: Thursday, July 26
Time: 8:45am - 10:15am
Location: Grande Ballrooms ABC in the East Tower

Sponsored by: Sun Microsystems

Microsoft Senior Vice President Craig Mundie and Red Hat CTO Michael Tiemann set the stage for a wide-open panel discussion about Microsoft's Shared Source program and the response from the open source community, when they square off in this shared source vs. open source debate.

Mundie set off a far-reaching discussion recently when he introduced Microsoft's Shared Source program, which blends access to source code with the preservation of strong intellectual property rights by software developers, and contrasted Shared Source to Open Source and the GNU General Public License.

There's been a strong response from the open source and free software communities, accusing Microsoft of trying to co-opt the momentum of open source with a program that offers superficial similarities, but few of the real benefits. Microsoft counters that it is trying to find a balance between the needs of commercial developers and the lessons learned from the open source movement.

Mundie will discuss ways in which Shared Source differs from Open Source, and why Microsoft believes that the Shared Source philosophy supports a strong software business case for commercial software developers and their customers.

Red Hat CTO Michael Tiemann will then discuss the industry's experience with open source vs. pseudo-open licensing, and why he believes that the future will favor stronger (rather than weaker) licenses to protect choice for users and freedom for developers.

His speech will be followed by a panel discussion with Tiemann, Mundie, and other experts on intellectual property and the software industry.

Tim O'Reilly will moderate the panel.


Friday, July 27

Big Hairy Problems: Open Source Challenges in the Enterprise
Michael Tiemann, Red Hat, Inc.
Track: Keynote
Date: Friday, July 27
Time: 8:45am - 10:15am
Location: Grande Ballrooms ABC in the East Tower

Sponsored by: ActiveState

If you talk to CTOs, their biggest concerns aren’t whether to use commercial software or open source software but a set of large-scale problems that don’t yet have obvious solutions. Oracle may not have solutions for them, but neither does Open Source. Our panel of top CTOs will tell us about the enterprise-class problems that they are worried about solving in the future.


GNUCOMM and Bayonne
Rich Bodo, GNUCOMM, TOSI & Voxilla.org
Track: Open Source Speech
Date: Friday, July 27
Time: 10:45am - 11:15am
Location: Marina II in the East Tower

This session discusses a work in progress called GNUCOMM. GNUCOMM is a working subsystem of the GNU project that provides interoperable components that enable human communication. The system consists of several programmable servers and clients, supporting component libraries, and a library of applications. GNUCOMM applications, composed at a high enough level, allow users with little or no special knowledge of system internals or theory to design their own telecommunications systems.


SIP and the Vovida Open Communication Application Library (VOCAL)
Cullen Jennings, Cisco Systems
Track: Open Source Speech
Date: Friday, July 27
Time: 11:15am - 11:45am
Location: Marina II in the East Tower

The Session Initiation Protocol (SIP) is taking the telecom and data industry by storm. Everyone is talking about it. MCI is deploying services on top of SIP. We take a fundamental look at this protocol and its impact on the VoIP marketplace. The Vovida Open Communication Application Library (VOCAL) is open source, SIP-centric, toll-quality, carrier-grade communications software for next generation carriers and service providers. We examine its architecture, scalability, and reliability, and learn how to develop features on top of the platform.


The OpenVXI VoiceXML Interpreter
Jerry Carter, SpeechWorks International
Track: Open Source Speech
Date: Friday, July 27
Time: 11:45am - 12:15pm
Location: Marina II in the East Tower

VoiceXML is an emerging standard within the speech industry that holds great potential for widespread acceptance. The ability to allow callers to navigate web sites with their voice, listen to audio content on the web, and eventually combine speech with wireless applications, brings communication to another level. SpeechWorks supports these efforts by adopting a standards-based strategy including open source code to promote market expansion.

This presentation highlights VoiceXML and how developers can take advantage of open source code to build speech applications. Part of the presentation reviews how to build solutions with the OpenVXI VoiceXML interpreter. Other topics include VoiceXML integration with complementary protocols like SIP (Session Initiation Protocol) for call transfers correlated to a data stream using speech links open source code.


Internet-Accessible Speech Recognition Technology
Joe Picone
Track: Open Source Speech
Date: Friday, July 27
Time: 1:45pm - 2:15pm
Location: Marina II in the East Tower

Since the late 1980s, we have promoted a vision of free software for speech and signal processing research, following in the footsteps of other successful efforts such as the Free Software Foundation. Our goal has been to develop a comprehensive software environment for performing data-intensive science and engineering research that is flexible yet computationally efficient. Our premier application is large vocabulary speech recognition, though our software is designed to be much more general than this specific application.

Our approach to creating such a lasting infrastructure incorporates a diversity of technologies, including some novel Internet-based Java applets that let users interact with the technology without the need to install the software locally. We also host an extensive signal processing web site that includes on-line courses, tutorials, and toolkits. Support, training, and education are important ingredients to our success.

The goal of this presentation is twofold. First, we review the challenges associated with developing such a large-scale application. Second, we discuss the limitations of current programming paradigms. Though software tools and concepts have advanced significantly in the past decade, there are still some very fundamental obstacles that preclude the development of simple, easy-to-use, and extensible software tools for engineers.


Building Synthetic Voices
Alan W. Black, Kevin Lenzo, Cepstral, LLC
Track: Open Source Speech
Date: Friday, July 27
Time: 2:15pm - 2:45pm
Location: Marina II in the East Tower

The University of Edinburgh's Festival Speech Synthesis System is a free software, multi-lingual synthesis toolkit already used by many research and commercial groups in the field. The CMU FestVox project offers the ability to build new synthetic voices for supported and new languages. Although a research project, FestVox includes reliable methods for building customer quality voices for real applications as well as the latest research techniques in unit selection.


The CMU Sphinx Open Source Speech Initiative
Kevin Lenzo, Cepstral, LLC, Alan W. Black
Track: Open Source Speech
Date: Friday, July 27
Time: 2:45pm - 3:15pm
Location: Marina II in the East Tower

After a long history of leading the field in speech recognition, CMU decided to also lead in another direction by releasing its Sphinx recognizer as free software. Since the release of Sphinx II, a real-time, multi-lingual, multi-platform speech recognizer, we have continued to add to it with new acoustic models for narrow band (telephone) and wideband (desktop) speech. With the release of SphinxTrain, the scripts and programs to build new acoustic models, and the release of the Cambridge/CMU Language Model toolkit, important parts of the free software speech chain are easily accessible. Sphinx III is under development, offering more accurate recognition through the use of acoustic models with fully continuous observation densities.


Advanced Dialog Systems
Alexander I. Rudnicky
Track: Open Source Speech
Date: Friday, July 27
Time: 3:45pm - 4:15pm
Location: Marina II in the East Tower

The Carnegie Mellon Communicator project has developed a telephone-based spoken language dialog system that allows users to interactively create travel itineraries using live information from the Web. The system is built from existing open source components, including the Carnegie Mellon Sphinx recognizer, the Carnegie Mellon / Colorado Phoenix parser, and the Festival text-to-speech system. To these, the Communicator project has added the Agenda dialog manager, which supports flexible mixed-initiative dialog, along with a stochastic language generator and other components, yielding a complete set of components for dialog system implementation. Communicator serves as a platform for research in language and dialog and as a base for developing spoken language systems for new domains.


DARPA Communicator: The Development of Advanced Dialog
Bryan George
Track: Open Source Speech
Date: Friday, July 27
Time: 4:15pm - 4:45pm
Location: Marina II in the East Tower

Over the last five years, three technological advances have combined to push speech enabled dialogue systems back into the limelight: the availability of robust real time speech recognition tools, the explosion of Internet accessible information sources, and the proliferation of mobile information access devices such as cell phones. However, the systems being fielded, and the standards arising from these efforts, represent only a limited set of capabilities for robust voice enabled interaction with knowledge sources. These limitations arise partly from limitations of technology, but also significantly from dialog systems being based on closed, proprietary architectures and software.

The DARPA Communicator program is exploring how to engage human users in robust, mixed initiative speech dialogue interactions that reach beyond current capabilities in dialog systems. To support this exploration, the Communicator program has funded the development of a non-proprietary, distributed message-passing architecture for dialogue systems, as well as its implementation as an open source infrastructure. In this presentation, we describe the features of the Galaxy Communicator Software Infrastructure (GCSI) and how the Communicator program and the dialog system community benefit from open source software licensing.


The Colorado Communicator
Bryan Pellom
Track: Open Source Speech
Date: Friday, July 27
Time: 4:45pm - 5:15pm
Location: Marina II in the East Tower

The University of Colorado (CU) Communicator system combines continuous speech recognition, natural language understanding, and flexible dialogue control to let telephone callers converse naturally to access information from the Internet about airline flights, hotels, and rental cars. Specifically, users can describe a desired airline flight itinerary to the Communicator and use natural dialog to negotiate a flight plan. This talk will describe our spoken dialog speech technology components that have recently been made available to the community under an open source license for non-commercial use. Under this license, commercialization rights can be obtained through membership in the Center for Spoken Language Research (CSLR) industrial consortium. The talk will also describe the Phoenix natural language parser, its use, and potential applications. The Phoenix parser is released as open source for unrestricted commercial use by CSLR at the University of Colorado.