[Accessibility] Audio framework requirements
Hynek Hanke
hanke at brailcom.org
Fri Apr 1 06:07:54 PST 2005
Hello all,
just recently, an effort was started to gather the requirements for an audio
framework, audio architecture or sound system that would be both accessible
and suitable for developing accessibility tools. Once the requirements are
settled, the final goal of this effort is to create a solution that fulfills
them, probably by taking an existing solution and adding the missing parts.
Here is a draft of the requirements document that incorporates the ideas from
the previous, more general discussions. I've used a structure very similar to
that of the TTS API document.
Please send comments and suggestions.
Thank you,
Hynek Hanke
Accessibility requirements on audio frameworks
==============================================
Document version: 2005-04-01
The purpose of this document is to describe the requirements for an audio
(multimedia) framework that provides all the features necessary to make a Free
Software or Open Source desktop accessible, especially to handicapped people,
and to make it possible to implement further assistive technologies on top of
it without having to care about the particular details of audio output. The
purpose of this document is not to define the particular API that will be used
for communication with a given audio framework.
By the term ``audio framework'' this document means a full solution that
enables application developers to easily have their sound data output through
the speakers in a way compatible with accessibility concerns, and that enables
users (possibly handicapped ones) to control the audio on their system in the
way they need. The term should not imply that library-based solutions are
preferred over server-based or even kernel-based solutions.
A. Structure
============
The requirements are categorized in the following priority order: MUST HAVE,
SHOULD HAVE, and NICE TO HAVE.
The priorities have the following meanings:
MUST HAVE: All conforming audio frameworks must have this capability.
SHOULD HAVE: The audio framework will be usable without this feature, but
the feature is expected to be implemented in all solutions intended for
serious use.
NICE TO HAVE: Optional features that make the audio framework more
usable for a handicapped person.
Requirements outside the scope of this document will be labelled as OUTSIDE
SCOPE. They serve to avoid confusion or to define some points more precisely.
B. Requirements
===============
1. General requirements
1.1 Documentation
1.1.1 MUST HAVE: The API is well documented.
1.1.2 MUST HAVE: Example application(s) showing how to implement/use the
features described in section (2) of the requirements are provided.
1.2 Portability
1.2.1 MUST HAVE: The audio framework API, with all its basic functionality
and the features described in section (2) of the requirements, is
independent of the system in use.
1.3 Design
1.3.1 OUTSIDE SCOPE: Whether conformance with these requirements is
achieved by a single application, by several cooperating
applications, or by a combination of applications and libraries is
outside the scope of this document.
2. Audio requirements
2.1 Network Transparency
2.1.1 MUST HAVE: Full redirection of the audio output to a different
machine is possible. It must be possible to do this centrally
without having to reconfigure end-user applications.
Reason: It is an important accessibility feature for remote
access.
2.1.2 OUTSIDE SCOPE: Whether audio redirection is achieved by the
multimedia framework itself or by the underlying audio
architecture is outside the scope of this document. However, it must
be ensured that such an architecture exists on each platform and
that the audio framework conforms to all other requirements in
section (2) in such a configuration.
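As one concrete illustration (not a requirement on the mechanism): the
EsounD sound daemon achieves this kind of central redirection through a
single environment variable that all its clients honour, so no end-user
application needs to be reconfigured. The host name and port below are
examples only:

    # Point every EsounD client of this session at a remote server.
    ESPEAKER=remote-host:16001

Any client started with this setting then has its audio played on
remote-host.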
2.2 Data transfer and handling of different formats
2.2.1 MUST HAVE: The audio framework must be able to receive data both by
being given a data file on the local computer to read *and by having
the data sent directly from memory over a socket or by similar
means*.
2.2.2 MUST HAVE: If the data is sent directly from memory without using
a file, it must be possible to send all the data at once at the
beginning of the playback. The client application must not have to
care about any buffering and timing issues.
2.2.3 MUST HAVE: It's possible to send data as a raw stream and specify
all the related audio parameters manually.
2.2.4 MUST HAVE: It's possible to send audio data at least in the PCM
WAV and Ogg Vorbis formats without any need to decode or parse the
data first.
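To illustrate what requirements 2.2.1 through 2.2.4 could look like from the
client's point of view, here is a minimal C sketch of a hypothetical client
API. All of the names below are invented for this document; no existing
library is implied:

    /* Hypothetical client API -- invented names, for illustration only. */
    #include <stddef.h>

    typedef struct af_connection af_connection; /* opaque framework handle */

    /* 2.2.1, scheme 1: name a data file on the local computer and let
       the framework read, decode and play it. */
    int af_play_file(af_connection *conn, const char *path);

    /* 2.2.1, scheme 2 and 2.2.2: send all the data at once from memory;
       the framework handles all buffering and timing.  Per 2.2.4, "wav"
       and "ogg" data need no previous decoding by the client. */
    int af_play_memory(af_connection *conn, const void *data, size_t length,
                       const char *format);

    /* 2.2.3: raw stream with all audio parameters specified manually. */
    int af_play_raw(af_connection *conn, const void *samples, size_t length,
                    int rate,         /* e.g. 44100 Hz */
                    int channels,     /* e.g. 1 = mono */
                    int bits,         /* e.g. 16 bits per sample */
                    int big_endian);  /* sample byte order */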
2.3 Real-time audio output
2.3.0 For the purpose of this section, the word ``immediately'' means
``no later than in 20 milliseconds'' as a MUST HAVE and ``no later
than in 10 milliseconds'' as a NICE TO HAVE.
Reason: When assistive technologies read characters on the screen,
cancelling the current audio output and starting to read a new
character must be fast enough to keep up with a fast typist or
even with autorepeat. Consider a typical autorepeat rate of 25
characters per second. Ideally, within each of these 40 ms
intervals the synthesis should begin, produce some audio output,
and stop. The time to start and stop the audio playback,
including processing the request and decoding the incoming data,
must fit into this interval.
2.3.1 MUST HAVE: The playback in the speakers starts immediately after
audio data are sent to the audio framework. The transport
mechanism used (both of the schemes from paragraph 2.2.1) must
allow enough time for this when both the data to play and the
device intended for playback are located on the same machine. When
the transport of the data is done over the network, the requirement
applies only after the data has been fully received by the target
machine.
2.3.2 MUST HAVE: When the playback on the speakers terminates, the
application that issued the request for playback must be informed.
2.3.3 NICE TO HAVE: The application is notified when certain previously
specified times in the sent audio data are reached.
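Requirements 2.3.2 and 2.3.3 suggest a callback-style interface. Continuing
the hypothetical C sketch from section 2.2 (again, all names are invented
for illustration):

    /* 2.3.2: invoked by the framework when playback of a request has
       terminated, whether it finished normally or was cancelled. */
    typedef void (*af_done_callback)(int request_id, void *user_data);

    /* 2.3.3: invoked when a previously registered time position within
       the sent audio data is reached during playback. */
    typedef void (*af_marker_callback)(int request_id, unsigned int time_ms,
                                       void *user_data);

    int af_set_done_callback(af_connection *conn, af_done_callback cb,
                             void *user_data);
    int af_add_marker(af_connection *conn, int request_id,
                      unsigned int time_ms, af_marker_callback cb,
                      void *user_data);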
2.4 Simultaneous audio output and volume control
2.4.1 MUST HAVE: The audio framework is able to play several audio
streams at once and mix them together automatically *without any
effort on the part of the client application*.
Reason: It must be possible to run several applications using
audio output without the fear that one of them will block the
output for the rest. For accessibility this is essential since the
speech output must always pass through, but shouldn't block any
other media output.
2.4.2 SHOULD HAVE: The user is able to separately control the maximum
and/or default volume of sounds originating from different
applications (according to their identification passed to the
audio framework) statically from a configuration file or by
similar means.
2.4.3 NICE TO HAVE: The user is allowed to specify priorities for client
applications (according to their identification passed to the
audio framework) so that it's possible to *automatically* mute
some applications or decrease their volume when there is a more
important sound to play.
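A static configuration file along the following lines could satisfy 2.4.2
and 2.4.3. The syntax, keys and application names are all invented for
illustration:

    # Hypothetical audio framework configuration.  Section names are the
    # identifications the client applications pass to the framework.
    [screen-reader]
    max_volume = 100
    priority   = high    # more important: may mute or duck other sounds

    [music-player]
    default_volume = 60
    priority       = low # lowered automatically while speech is playing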
2.5 Destination of audio flows
2.5.1 NICE TO HAVE: The user is able to specify the desired destination
of the audio flow, or redirect it (e.g. sound card 1, sound card 2,
network), separately for the different applications that use this
audio framework (without the end-user application having to care
about it).
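The same kind of static configuration sketched under 2.4 could cover 2.5.1
as well, again with invented keys and values:

    [screen-reader]
    destination = soundcard0            # always on the local card

    [music-player]
    destination = network:remote-host   # redirected over the network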
2.6 Compatibility
2.6.1 MUST HAVE: The whole multimedia framework must be able to run on
top of the Advanced Linux Sound Architecture (ALSA) and the Open
Sound System (OSS) on GNU/Linux systems.
Reason: If we want to use it in the near future, we must ensure that
by doing so, we don't deny accessibility to larger groups of people.
The necessity to support Open Sound System will probably be dropped
in the future.
Open Issue: What other architectures must be listed so that we can
all agree such a requirement is appropriate for all of us? GNOME
accessibility, KDE accessibility, others?
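To make concrete what ``running on top of ALSA'' involves, here is a minimal
self-contained C sketch that plays one second of silence directly through
alsa-lib; a framework conforming to 2.6.1 would issue similar calls
internally on a GNU/Linux system (compile with -lasound):

    #include <alsa/asoundlib.h>

    int main(void)
    {
        snd_pcm_t *pcm;
        short buf[22050] = {0};   /* one second of 16-bit mono silence */

        /* Open the default ALSA playback device. */
        if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0)
            return 1;

        /* 16-bit little-endian, interleaved, mono, 22050 Hz, software
           resampling allowed, at most 0.5 s of internal latency. */
        if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                               SND_PCM_ACCESS_RW_INTERLEAVED,
                               1, 22050, 1, 500000) < 0) {
            snd_pcm_close(pcm);
            return 1;
        }

        snd_pcm_writei(pcm, buf, 22050);  /* count in frames, not bytes */
        snd_pcm_drain(pcm);               /* wait for playback to finish */
        snd_pcm_close(pcm);
        return 0;
    }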
C. Copying This Document
========================
Copyright (C) 2005 ...
This specification is made available under a BSD-style license ...