Engine interfaces

SpeechHub communicates with an engine using a component called a connector. Two types of interface are used:

stdio

This type of interface is used by most engines on both Windows and Linux. The engine driver is a program that receives simple text commands on its stdin and responds on its stdout. In most cases audio is returned on the driver's stdout and errors are sent to the driver's stderr.
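As an illustration only, the driver side of such an interface might be sketched as follows. The command names SPEAK and QUIT, and the helper fake_synthesize, are hypothetical placeholders chosen for this sketch; they are not the commands of any SpeechHub protocol.

```python
import sys


def fake_synthesize(text):
    """Placeholder for a real engine call: two zero bytes per character."""
    return b"\x00\x00" * len(text)


def handle_command(line, audio_out, err_out):
    """Process one command line; return False when the driver should stop."""
    stripped = line.strip()
    if not stripped:
        return True  # ignore blank lines
    command, _, argument = stripped.partition(" ")
    command = command.upper()
    if command == "SPEAK":  # hypothetical command name
        # Audio goes to the binary output stream (stdout in a real driver).
        audio_out.write(fake_synthesize(argument))
        audio_out.flush()
    elif command == "QUIT":  # hypothetical command name
        return False
    else:
        # Errors go to the text error stream (stderr in a real driver).
        err_out.write(f"unknown command: {command}\n")
        err_out.flush()
    return True


def run(cmd_in, audio_out, err_out):
    """Read commands line by line until QUIT or end of input."""
    for line in cmd_in:
        if not handle_command(line, audio_out, err_out):
            break
```

A real driver would start the loop with run(sys.stdin, sys.stdout.buffer, sys.stderr). Passing the streams as parameters keeps the command handling easy to exercise with in-memory streams.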

The advantage of this interface is complete isolation between SpeechHub and the engine driver, which can be written in any programming language. If the engine returns audio, there is no dependency on the OS audio system, which promotes compatibility across operating systems and Linux distributions.
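The isolation described above can be pictured from the calling side: the driver is just a child process reached through three pipes. The sketch below is not SpeechHub's connector; it starts a hypothetical stand-in driver (a few lines of Python passed with -c) and assumes the driver simply echoes each command back as bytes.

```python
import subprocess
import sys

# Hypothetical stand-in driver: echoes each command on stdout as bytes
# and writes a log line to stderr. A real driver could be any executable.
DRIVER_SOURCE = r'''
import sys
for line in sys.stdin:
    sys.stdout.buffer.write(line.strip().encode("utf-8"))
    sys.stdout.buffer.flush()
    sys.stderr.write("handled one command\n")
'''


def run_driver(commands):
    """Start the stand-in driver, send commands, return (stdout_bytes, stderr_text)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", DRIVER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    payload = "".join(command + "\n" for command in commands).encode("utf-8")
    # communicate() writes the payload, closes stdin, and collects both outputs.
    out, err = proc.communicate(payload, timeout=10)
    return out, err.decode("utf-8", errors="replace")
```

This batch style is chosen only to keep the sketch short. A long-running connector would more likely keep the process alive and exchange commands and responses interactively.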

With the stdio interface, SpeechHub mostly uses its own protocol, called stdio-native. This is a simple, lightweight and responsive protocol, which we recommend to engine driver developers; it is explained in detail on the page 'Writing_New_Engine_Drivers' later in this section. The protocol is implemented in the eSpeak and PicoTTS drivers on both Windows and Linux, and in the Voxin (IBMTTS) driver on Linux.

The MaryTTS driver uses another stdio-based protocol, called stdio.marytts, which is related to MaryTTS's own communications.

The FreeTTS driver uses an experimental stdio-based JSON protocol called stdio.connector.json. This driver also supports the same JSON protocol over a TCP/IP socket.
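One way a single JSON protocol can be offered over both stdio and a TCP/IP socket is to keep the message handling independent of the transport. The sketch below assumes one JSON object per line; the field names "command", "text", "ok" and "error" are invented for illustration and are not taken from stdio.connector.json.

```python
import json
import socketserver


def handle_message(line):
    """Turn one JSON request line into one JSON reply line (no trailing newline)."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError as exc:
        return json.dumps({"ok": False, "error": f"invalid JSON: {exc.msg}"})
    if not isinstance(request, dict):
        return json.dumps({"ok": False, "error": "expected a JSON object"})
    if request.get("command") == "speak":  # hypothetical command name
        text = str(request.get("text", ""))
        return json.dumps({"ok": True, "characters": len(text)})
    return json.dumps({"ok": False, "error": "unknown command"})


def serve_stdio(cmd_in, reply_out):
    """stdio transport: one request per input line, one reply per output line."""
    for line in cmd_in:
        reply_out.write(handle_message(line) + "\n")
        reply_out.flush()


class JsonLineHandler(socketserver.StreamRequestHandler):
    """TCP transport: the same handler applied to each line from the socket."""

    def handle(self):
        for raw in self.rfile:
            reply = handle_message(raw.decode("utf-8", errors="replace"))
            self.wfile.write(reply.encode("utf-8") + b"\n")
```

Under these assumptions, serve_stdio(sys.stdin, sys.stdout) would give the stdio variant, and socketserver.TCPServer(("127.0.0.1", port), JsonLineHandler).serve_forever() the socket variant, with the port chosen by the developer.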

COM (Component Object Model)

This type of interface is used by SpeechHub on Windows only, to communicate with Speech API version 5 (SAPI5) and Microsoft Speech Platform (MSP). The API for both SAPI5 and MSP is fully documented by Microsoft.
