[Accessibility] Audio framework requirements

Hynek Hanke hanke at brailcom.org
Fri Apr 1 06:07:54 PST 2005


Hello all,

an effort was recently started to gather the requirements on an audio
framework (audio architecture or sound system) that would be both accessible
itself and suitable for developing accessibility tools on top of it. Once the
requirements are agreed on, the final goal is to create a solution that
fulfills them, probably by taking an existing solution and adding the missing
parts.

Here is a draft of the requirements document that incorporates the ideas from
the previous, more general discussions. Its structure closely follows that of
the document about the TTS API.

Please send comments and suggestions.

Thank you,
Hynek Hanke



Accessibility requirements on audio frameworks
==============================================
Document version: 2005-04-01

The purpose of this document is to describe the requirements on an audio
(multimedia) framework that provides all the features necessary for a Free
Software or Open Source desktop to be accessible, especially to handicapped
people, and for further assistive technologies to be implemented on top of it
without having to care about the particular details of audio output. The
purpose of this document is not to define the particular API that will be used
for communication with a given audio framework.

By the term ``audio framework'' this document means a complete solution that
enables application developers to easily output their sound data through the
speakers in a way compatible with accessibility concerns, and that enables
users (possibly handicapped ones) to control the audio on their system in the
way they need. The term should not imply that library-based solutions are
preferred over server-based or even kernel-based ones.

A. Structure
============

The requirements are categorized in the following priority order: MUST HAVE,
SHOULD HAVE, and NICE TO HAVE.

The priorities have the following meanings:
          
     MUST HAVE: All conforming audio frameworks must have this capability.

     SHOULD HAVE: The audio framework will be usable without this feature, but
       it is expected that the feature will be implemented in all solutions
       intended for serious use.

     NICE TO HAVE: Optional features that make the audio framework more
       usable for a handicapped person.

Requirements outside the scope of this document will be labelled as OUTSIDE
SCOPE. They serve to avoid confusion or to define some points more precisely.


B. Requirements
===============

1. General requirements

   1.1 Documentation
       
       1.1.1 MUST HAVE: The API is well documented.

       1.1.2 MUST HAVE: Example application(s) showing how to implement/use the
	     features described in section (2) of the requirements is provided.


   1.2 Portability

       1.2.1 MUST HAVE: The audio framework API is independent of the system in
	     use with all its basic functionality and the features described in
	     section (2) of the requirements.

   1.3 Design

       1.3.1 OUTSIDE SCOPE: Whether the conformance with these requirements is
	     achieved by a single application or by several cooperating
	     applications or by a combination of applications and libraries is
	     outside scope of this document.

2. Audio requirements

   2.1 Network Transparency

       2.1.1 MUST HAVE: Full redirection of the audio output to a different
	     machine is possible. It must be possible to do this centrally
	     without having to reconfigure end-user applications.

	     Reason: It is an important accessibility feature for remote
	     access.

       2.1.2 OUTSIDE SCOPE: Whether audio redirection is achieved by the
	     multimedia framework itself or by the underlying audio
	     architecture is outside scope of this document. However, it must
	     be ensured that such an architecture exists on each platform and
	     that the audio framework conforms to all other requirements in
	     section (2) in such a configuration.

   2.2 Data transfer and handling of different formats

       2.2.1 MUST HAVE: The audio framework must be able to receive data both
	     by specifying a data file on the local computer to read *and by
	     sending the data directly from memory over a socket or by similar
	     means*.

       2.2.2 MUST HAVE: If the data is sent directly from memory without using
	     a file, it must be possible to send all the data at once at the
	     beginning of the playback. The client application must not have to
	     care about any buffering and timing issues.

       2.2.3 MUST HAVE: It's possible to send data as a raw stream and specify
	     all the related audio parameters manually.

       2.2.4 MUST HAVE: It's possible to send audio data at least in the PCM
	     WAV and OGG Vorbis formats without any need to perform previous
	     decoding or parsing of the data.
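
	     To illustrate the kind of parameters a client would specify
	     manually for a raw stream (2.2.3) and the same parameters carried
	     by a WAV container (2.2.4), here is a small sketch using only the
	     Python standard library. The figures (16 kHz, mono, 16-bit) are
	     arbitrary examples, not values mandated by this document:

```python
import io
import math
import struct
import wave

# Audio parameters a client would pass explicitly for a raw PCM stream.
RATE, CHANNELS, SAMPWIDTH = 16000, 1, 2   # 16 kHz, mono, 16-bit signed

# Write 10 ms of a 440 Hz sine tone into an in-memory WAV container.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(CHANNELS)
    w.setsampwidth(SAMPWIDTH)
    w.setframerate(RATE)
    n = RATE // 100                       # 10 ms worth of frames
    frames = b"".join(
        struct.pack("<h", int(32767 * math.sin(2 * math.pi * 440 * i / RATE)))
        for i in range(n)
    )
    w.writeframes(frames)

# A framework accepting WAV input recovers the same parameters by parsing
# the container header, so the client need not pass them separately.
buf.seek(0)
with wave.open(buf, "rb") as r:
    params = (r.getframerate(), r.getnchannels(),
              r.getsampwidth(), r.getnframes())
print(params)   # (16000, 1, 2, 160)
```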

   2.3 Real-time audio output

       2.3.0 For the purpose of this section, the word ``immediately'' means
	     ``no later than 20 milliseconds'' as a MUST HAVE and ``no later
	     than 10 milliseconds'' as a NICE TO HAVE.

	     Reason: Reading characters on the screen with assistive
             technologies, and cancelling the audio output so that a new
             character can be read, must be fast enough to keep up with a
             fast typist or even with autorepeat. Consider a typical
             autorepeat rate of 25 characters per second: within each 40 ms
             interval, synthesis should ideally begin, produce some audio
             output, and stop. Starting and stopping audio playback must fit
             into this time, including processing the request and decoding
             the incoming data.
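
	     The arithmetic behind these figures can be checked in a few
	     lines (all numbers taken from the text above):

```python
# Latency budget implied by requirement 2.3.0.
autorepeat_cps = 25                  # typical autorepeat rate, chars/second
interval_ms = 1000 / autorepeat_cps  # budget per character: 40.0 ms

# Starting and stopping playback must both fit into one interval.
must_have = interval_ms - 2 * 20     # 20 ms each: nothing left for synthesis
nice_to_have = interval_ms - 2 * 10  # 10 ms each: 20 ms left for synthesis

print(interval_ms, must_have, nice_to_have)   # 40.0 0.0 20.0
```

	     With the 20 ms MUST HAVE limit the whole interval is consumed by
	     starting and stopping playback alone, which is why the tighter
	     10 ms figure is listed as NICE TO HAVE.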

       2.3.1 MUST HAVE: The playback in the speakers starts immediately after
	     audio data are sent to the audio framework. The transport
	     mechanism used (both of the schemes from paragraph 2.2.1) must
	     allow enough time for this when both the data to play and the
	     device intended for playback are located on the same machine. When
	     the transport of the data is done over network, the requirement
	     only applies after the data is fully received by the target
	     machine.

       2.3.2 MUST HAVE: When the playback on the speakers terminates, the
             application that issued the request for playback must be informed.

       2.3.3 NICE TO HAVE: The application is notified when certain previously
       	     specified times in the sent audio data are reached.

   2.4 Simultaneous audio output and volume control

       2.4.1 MUST HAVE: The audio framework is able to play several audio
             streams at once and mix them together automatically *without any
             effort of the client application itself*.

	     Reason: It must be possible to run several applications using
	     audio output without the fear that one of them will block the
	     output for the rest. For accessibility this is essential since the
	     speech output must always pass through, but shouldn't block any
	     other media output.
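
	     A minimal sketch of what such automatic mixing amounts to for
	     16-bit PCM streams: sample-wise summation with clipping. This is
	     an illustration of the concept, not a prescribed algorithm:

```python
import array

def mix(a, b):
    """Mix two equal-length 16-bit PCM streams by summing samples and
    clipping to the signed 16-bit range, as a framework conforming to
    2.4.1 would do on behalf of its clients."""
    out = array.array("h")
    for x, y in zip(a, b):
        out.append(max(-32768, min(32767, x + y)))
    return out

# Hypothetical sample values for two simultaneous streams.
speech = array.array("h", [1000, -20000, 30000])
music  = array.array("h", [500, -20000, 10000])
print(list(mix(speech, music)))   # [1500, -32768, 32767]
```

	     Real frameworks would additionally resample streams to a common
	     rate and apply per-stream volume before summing, but the clients
	     never see any of this.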
       
       2.4.2 SHOULD HAVE: The user is able to separately control the maximum
	     and/or default volume of sounds originating from different
	     applications (according to their identification passed to the
	     audio framework) statically from a configuration file or by
	     similar means.
    
       2.4.3 NICE TO HAVE: The user is allowed to specify priorities for client
	     applications (according to their identification passed to the
	     audio framework) so that it's possible to *automatically* mute
	     some applications or decrease their volume when there is a more
	     important sound to play.

    2.5 Destination of audio flows

	2.5.1 NICE TO HAVE: The user is able to specify the desired
	      destination of the audio flow, or redirect it (e.g. to sound
	      card 1, sound card 2, or the network), separately for each
	      application using the audio framework (without the end-user
	      application having to care about it).

    2.6 Compatibility

       2.6.1 MUST HAVE: The whole multimedia framework must be able to run on
	     top of Advanced Linux Sound Architecture and Open Sound System on
	     the GNU/Linux system.

	     Reason: If we want to use it in the near future, we must ensure
	     that by doing so we don't deny accessibility to larger groups of
	     people. The necessity to support Open Sound System will probably
	     be dropped in the future.

	     Open Issue: What other architectures must be listed so that we
	     can all agree such a requirement is appropriate for all of us?
	     Gnome accessibility, KDE accessibility, others?

   
C. Copying This Document
========================

  Copyright (C) 2005 ...
  This specification is made available under a BSD-style license ...

