A PRACTICAL SYSTEM FOR THREE-DIMENSIONAL SOUND PROJECTION (Vennonen, Cont'd)
4. IMPLEMENTATION
The Ambisonic prototype system as implemented here consists of the following major components (see Fig. 4). It was adapted from the previous system based around the 1616 computer with channel cards.
The Macintosh calculates all B-format coefficients, either in real time using a mouse and foot pedal interface, or by calculating tables of coefficients for downloading to the 1616 for faster movements. In practice the latter is very similar to defining and manipulating samples, so this mode is called spatial samples. All spatial data is sent over MIDI at 31.25 kbaud, using system exclusives for table downloads. Program change and note on/off commands are also used to control the 1616 in sample mode, for instance to initiate sample playback or to alter speed and looping parameters. I developed the spatial software on the Macintosh and 1616, with the use of some aspects of Streamer, a Forth application designed by David Worrall [32].
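The MIDI data rate explains why tables are downloaded for faster movements. The following is a rough Python sketch of the arithmetic, not a description of the actual message format: it assumes, purely for illustration, that each coefficient travels as one three-byte channel message and that one spatial update carries four coefficients (W, X, Y and Z). The function names are hypothetical.

```python
MIDI_BAUD = 31250      # bits per second on the MIDI wire
BITS_PER_BYTE = 10     # 8 data bits plus one start bit and one stop bit


def bytes_per_second(baud=MIDI_BAUD):
    """Raw byte throughput of a MIDI link."""
    return baud / BITS_PER_BYTE


def updates_per_second(coeffs_per_update=4, bytes_per_message=3):
    """Upper bound on complete coefficient updates per second.

    Assumption: one 3-byte message per coefficient, no running status,
    and no other traffic sharing the link.
    """
    return bytes_per_second() / (coeffs_per_update * bytes_per_message)


if __name__ == "__main__":
    print(bytes_per_second())      # 3125.0 bytes per second
    print(updates_per_second())    # roughly 260 updates per second
```

Under these assumptions the link carries at most a few hundred position updates per second, and fewer once pedal and effects controllers share it, which is consistent with playing rapid movements back from a table held in the 1616.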
In both real time and sample modes, the horizontal path of the sound is drawn with the mouse, while elevation is controlled by a MIDI foot pedal. A JL Cooper Fadermaster converts the variable resistance of the pedal into MIDI control change values, updated up to 100 times per second if desired. This MIDI data goes into a buffer in the Mac, which is read as required. The combination of mouse and pedal interfaces allows the operator to completely specify the location of the source in an intuitively obvious way. Distance is simulated by scaling the B-format coefficients according to mouse position. The mouse position can also be made to send control changes to an effects machine, either to simulate reverb at greater distances or to change equalizer parameters that simulate the close-up proximity effect or the loss of high frequencies at large distances [25] [26]. Additional MIDI controllers corresponding to the location of the mouse or pedal can control other effect parameters in real time to simulate different sounding spaces. Doppler effect is optionally simulated by controlling a pitch shift parameter in the effects machine.
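As an illustration of the calculation described above, here is a minimal Python sketch using the conventional first-order B-format encoding equations. The azimuth convention (0 degrees straight ahead, increasing anticlockwise), the inverse-distance gain law and the linear pedal mapping are assumptions made for this example; the actual Forth software may differ on all three.

```python
import math


def pedal_to_elevation(cc_value):
    """Map a 7-bit MIDI controller value (0..127) to 0..90 degrees.

    Assumption: a linear mapping over the upper hemisphere.
    """
    return 90.0 * cc_value / 127.0


def b_format_coefficients(azimuth_deg, elevation_deg, distance=1.0):
    """Return (W, X, Y, Z) gains for a source at the given position.

    Assumption: gain falls off as 1/distance outside a unit radius and
    is held constant inside it.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gain = 1.0 / max(distance, 1.0)
    w = gain / math.sqrt(2.0)                 # omnidirectional reference
    x = gain * math.cos(az) * math.cos(el)    # front-back
    y = gain * math.sin(az) * math.cos(el)    # left-right
    z = gain * math.sin(el)                   # up-down
    return w, x, y, z


if __name__ == "__main__":
    # Source to the left, pedal fully depressed halfway, twice unit distance.
    elevation = pedal_to_elevation(64)
    print(b_format_coefficients(90.0, elevation, distance=2.0))
```

With the pedal restricted to 0..90 degrees, Z never goes negative, which matches the hemispherical dome.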
The effects machine currently used is the ART SGE Mach II. This is a sophisticated device capable of up to ten chained effects at once, with user-definable MIDI control of most parameters. Among other things, it offers reverbs, delays, pitch shifting, equalization and compression. It receives a MIDI input from the Mac and audio from the sound to be placed in the 3D soundfield. Internal settings can be dumped and retrieved via MIDI, and incoming MIDI data analysed on-screen.
The Applix 1616 computer is a 1987 Australian-made 4 MHz 680x0-based machine with four card slots, one of which is occupied by the internal floppy drive, leaving three for channel cards. A MIDI interface was added to make it compatible with the output from the Mac via the ART effects machine. The Forth software extracts incoming coefficients from the MIDI stream, either memory-mapping them directly to the channel cards (real time mode) or building arrays in memory for subsequent playback (spatial samples mode). In the latter mode, Controllers and Note On/Offs are interpreted to control sample playback in real time, for instance to alter the speed of a moving sound. For sample mode to control parameter changes in the effects machine, the MIDI output of the 1616 must be connected to the MIDI in of the effects machine via a MIDI merger.
Effected audio is applied to the channel cards from the previous version
of the system, which sit on the 680x0 bus and are written to by the 1616. Each
card (Fig. 5) has one input and six outputs, and consists of six MDACs, bus
interface components, an audio input stage and an input inverting stage. The
individual card outputs are added by a simple mixing stage in the 1616 case to
make a common B-format output. The reason for using six MDACs is that each chip
can only perform 2-quadrant multiplication, whereas B-format requires the gain
control element to invert phase when required, as well as scale the input
audio. Thus one chip is required for the non-inverting phase, and a second is
fed by a phase-inverted signal. The coefficients are arranged by the Mac such
that only one chip of the pair is passing the signal at any one time, so that
one can simply add the MDAC outputs to obtain a true four-quadrant effect. This
pairing was only thought to be necessary for the X and Y signals: W is never
inverted, since it is a reference for the decoder, and Z has always been
positive in our application, as the dome is a hemisphere. It was assumed that
any other playback environment will not have speakers below listener level.
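The paired-MDAC arrangement can be modelled in a few lines of Python. This is a sketch only: the 8-bit resolution and the function names are assumptions for illustration, since the MDAC word length is not stated here.

```python
def split_for_mdac_pair(coefficient, bits=8):
    """Split a signed gain in -1..+1 into two unsigned MDAC codes.

    Returns (code_for_normal_input, code_for_inverted_input). Only one
    of the two codes is ever non-zero. Assumption: 8-bit MDACs.
    """
    full_scale = (1 << bits) - 1
    magnitude = min(abs(coefficient), 1.0)
    code = round(magnitude * full_scale)
    if coefficient >= 0:
        return code, 0
    return 0, code


def card_output(sample, coefficient, bits=8):
    """Sum of the two MDAC outputs: a four-quadrant multiply."""
    full_scale = (1 << bits) - 1
    pos_code, neg_code = split_for_mdac_pair(coefficient, bits)
    normal = sample * pos_code / full_scale        # non-inverting chip
    inverted = (-sample) * neg_code / full_scale   # chip fed by inverter
    return normal + inverted


if __name__ == "__main__":
    print(split_for_mdac_pair(-1.0))   # (0, 255)
    print(card_output(0.5, -1.0))      # -0.5
```

Each chip only ever scales its input by a non-negative amount, yet the summed output follows the sign of the coefficient.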
At this point, the Ambisonic B-format output from the 1616 channel cards is compatible with that produced by the Soundfield microphone processor box. This means that either or both may be used in studio production and recorded on multitrack tape.
For the moment, let us ignore the studio in the signal chain and assume that it merely gives us a B-format output. The next block is the decoder [27] [28], which generates the psychoacoustically correct signals to suit a given loudspeaker layout (Fig. 6). In our case it actually produces two sets of outputs, one for the sixteen-speaker dome system and one for a conventional four-channel quad set, for use in a quad production studio. This means that the decoder can be used in a studio compositional environment (one can add a fifth speaker at the zenith driven from the corresponding dome output if desired) as well as in the dome, with roughly equivalent results. Switching is provided for distance compensation and selection of shelf filter gain to suit horizontal or 3D layouts. The W, X, Y and Z inputs have gain controls, and four bargraph displays are used for test tone calibration and level monitoring.
The decoder consists of input stages, gain controls and a buffer stage to drive the switchable shelf filters, designed in accordance with Table 1. After optional distance compensation, inverted versions of the X and Y signals are derived and the modified B-format signal is applied to a resistor matrix. The matrix is what actually decides what proportions of W, X, -X, Y, -Y and Z go to the decoder outputs, and resistances were calculated to suit the azimuths and elevations of the sixteen dome speakers plus of course the regular quad layout. Each decoder output is then a combination of the six signals, summed with a standard inverting op-amp stage.
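To make the role of the resistor matrix concrete, the following Python sketch derives one matrix row for a speaker at a given azimuth and elevation. It is a simplified illustration, not the actual design: the weights are a plain projection onto the speaker direction with unit W weight, the 10 kilohm feedback resistor is an arbitrary assumed value, and the shelf filters and Table 1 values are ignored. It relies only on the standard inverting-summer relation, gain = R_feedback / R_input.

```python
import math


def matrix_row(azimuth_deg, elevation_deg, r_feedback=10_000.0):
    """Return {signal_name: input_resistance_in_ohms} for one speaker.

    A resistor can only give a positive weight, so a negative X or Y
    weight is realised by feeding from the inverted signal (-X or -Y).
    Assumptions: simple projection weights, unit W weight, speakers at
    or above listener level so that Z is never negative.
    """
    if elevation_deg < 0:
        raise ValueError("no -Z signal is available for speakers below the listener")
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    weights = {
        "W": 1.0,
        "X": math.cos(az) * math.cos(el),
        "Y": math.sin(az) * math.cos(el),
        "Z": math.sin(el),
    }
    row = {}
    for name, weight in weights.items():
        if abs(weight) < 1e-9:
            continue                      # zero weight: no resistor fitted
        source = name if weight > 0 else "-" + name
        row[source] = r_feedback / abs(weight)
    return row


if __name__ == "__main__":
    # A rear speaker on the horizontal plane draws on W and -X only.
    print(matrix_row(180.0, 0.0))
```

The inverting summing stage flips the polarity of every output equally, so that inversion is left out of the sketch.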
Once the speaker signals have been derived, we have a D-format output customised to the given playback space. It is necessary to ensure that the amplitude and phase of these signals are strictly preserved all the way to the speakers. The other, and more difficult, requirement is that reverberation is minimised in the space. This presents some difficulties in the dome, as the canopy is quite reflective of high frequencies and its shape causes a focussing effect at the centre of the space. It is important to reduce reflections as much as possible by placing sound absorption material close to the canopy, and carpet on the wooden floor of the dome platform.
A possible further refinement is to augment the low frequency response of the fairly small speaker boxes in the dome. Directional hearing does extend below several hundred Hertz, so any subwoofer system must have multiple channels. For the dome this means that five channels of amplification and five large boxes are required, each situated below the corresponding speaker in the lowest ring. The subwoofer signals could then be derived from those five D-format outputs via low pass filters. At a pinch this can be achieved with a mixing desk having a post-equalizer direct output on each channel. The limitation of this scheme is that the extended bass will work best only with sources on the horizontal plane, which accords well with nature and our associations of low sounds with low elevations. However, this is no good for simulating aeroplanes overhead! In that case, the subwoofer outputs should at least be separately decoded to include a Z component as well. It must also be noted that, to keep in strict accordance with Ambisonic psychoacoustics, sixteen identical speakers with an extended bass capability should be used in the dome.
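The derivation of subwoofer feeds from the lowest-ring D-format outputs can be sketched digitally in Python, although the system described here is analogue. The one-pole filter, the 100 Hz cutoff and the 48 kHz sample rate are all assumptions chosen for illustration; no crossover frequency is specified above.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Filter a sequence with a first-order (6 dB/octave) low pass."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    state = 0.0
    out = []
    for sample in samples:
        state += alpha * (sample - state)   # move part-way to the input
        out.append(state)
    return out


def subwoofer_feeds(lowest_ring_outputs, cutoff_hz=100.0, sample_rate_hz=48_000.0):
    """Low-pass each lowest-ring speaker signal to get its subwoofer feed.

    lowest_ring_outputs: list of per-speaker sample lists (five for the
    dome). Each subwoofer follows only its own speaker's signal.
    """
    return [one_pole_lowpass(channel, cutoff_hz, sample_rate_hz)
            for channel in lowest_ring_outputs]


if __name__ == "__main__":
    ring = [[1.0] * 4800 for _ in range(5)]   # five channels of DC
    feeds = subwoofer_feeds(ring)
    print(len(feeds), round(feeds[0][-1], 3))
```

Because each feed is taken from a single speaker output, the directional cues in the bass follow those of the lowest ring, which is the limitation noted above for overhead sources.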