Audio Digitization for the “Oral Histories of the American South” Project

The description of the audio digitization process given below dates from 2007, but it remains a model of thorough description:

Cassettes are played on a Nakamichi MR-1 discrete head professional cassette deck. Tape heads are cleaned before each side of the cassette is played, and the azimuth (the angle between the tape heads and tape medium) is adjusted to create maximum contact between the playback head and the tape to ensure the widest frequency response. Playback equalization is set to 120 µs for IEC standard Type I cassettes, and 70 µs for Type II and Type IV cassettes.

XLR outputs of the Nakamichi transmit the balanced signal directly to the Apogee Rosetta 200, a 24-bit, 2-channel analog-to-digital and digital-to-analog converter. The signal is digitized at a sample rate of 96 kHz and a sample depth of 24 bits, and travels to the computer via an XLR cable from the digital outputs to the Lynx One sound card AES/EBU audio port.

Recently, we added the Apogee Big Ben Master Digital Clock, a master word clock that virtually eliminates any possible jitter [abrupt and unwanted variation of one or more signal characteristics] that can cause high frequency distortions in the signal. This process creates audio files with excellent clarity and a very large quantity of information. A typical file, representing one side of one cassette, comprises around 1 GB of data.
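The "around 1 GB" figure can be checked with simple arithmetic on uncompressed PCM audio. The sketch below is only an illustration: the 30-minute side length (one side of a C60 cassette) and the function name are assumptions, not details from the description.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# Assumption: one cassette side runs 30 minutes (a C60 tape).
side_seconds = 30 * 60

# Preservation master: 96 kHz, 24 bit, 2 channels.
master_bytes = pcm_size_bytes(96_000, 24, 2, side_seconds)

# The same side at CD audio parameters: 44.1 kHz, 16 bit, 2 channels.
cd_bytes = pcm_size_bytes(44_100, 16, 2, side_seconds)

print(f"master: {master_bytes / 1e9:.2f} GB")
print(f"CD-quality: {cd_bytes / 1e6:.0f} MB")
```

Under these assumptions the master comes to a little over 1 GB per side, roughly three times the size of the same audio at CD parameters.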

Files are then stored in designated digital deep storage on the library's archival servers, as they are too large to be stored on CD without converting the sample rate to 44.1 kHz and reducing the quality.

The signal can be monitored from each source separately on Genelec 8030A bi-amplified monitors routed through a Coleman Audio MS6A switcher with monitor controller. The switcher has balanced XLR inputs and outputs to preserve signal-to-noise ratio and features completely passive switching. Interviews are re-recorded using Wavelab, a non-linear digital audio software platform.

Each cassette side is recorded, assigned a number as a preservation master (PM), entered into a PM database including pertinent metadata, and saved as a single audio file into deep storage in a dedicated digital archive maintained by UNC. The interview audio file is then converted into a file for burning a CD listening copy for in-house library patron research. First the file is resampled to a 44.1 kHz sample rate at 24-bit depth for audio processing. The audio file is processed in Sound Forge version 8.0 with Waves X Restoration, VST, Direct X, and Sony audio plug-ins to improve the quality.
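Resampling from 96 kHz to 44.1 kHz is not a simple halving, which is one reason it is done in dedicated software. As a small illustration (the variable names are hypothetical, and this says nothing about how Sound Forge performs the conversion), the ratio between the two rates reduces as follows:

```python
from fractions import Fraction

source_rate_hz = 96_000   # preservation master
target_rate_hz = 44_100   # CD audio

# Fraction reduces the ratio to lowest terms.
ratio = Fraction(target_rate_hz, source_rate_hz)
print(ratio)

# Number of output samples produced from one second of the master.
samples_out = source_rate_hz * ratio
print(samples_out)
```

Because the reduced ratio is not a whole number, output samples generally fall between input samples and have to be interpolated and filtered rather than simply picked out.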

A typical file requires two processes: normalization to an average RMS (root mean square) level of -14 dB, applying dynamic compression in order to increase the volume, and noise reduction to remove as much background noise, tape hiss, and rumble as possible without affecting the source material. Some files require more specific equalization or a series of noise reduction passes to achieve audio of suitable quality and volume for researchers. The file is then converted to 16-bit samples and burned to a CD listening copy on a professional-grade Mitsui gold audio CD at 4x speed with a Plextor DVDR PX-716A 1.09 drive using Sony CD Architect software, version 5.2. CDs are tested to confirm that audio is present. Finally, all individual audio files that make up a complete interview are arranged in order and converted to a single 256 kbps, 44.1 kHz, 16-bit, stereo MP3 audio file for the Documenting the American South, Oral Histories of the American South collection interface.
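To make the RMS normalization step concrete, here is a minimal sketch of the level calculation only. It assumes the -14 dB target is measured relative to digital full scale (samples in the range -1.0 to 1.0), applies a single static gain, and leaves out the dynamic compression and noise reduction mentioned above. The function names and the test tone are hypothetical, and this is not the algorithm used by Sound Forge or its plug-ins.

```python
import math

def rms_dbfs(samples):
    """Average RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("cannot measure the level of digital silence")
    return 20 * math.log10(math.sqrt(mean_square))

def gain_for_target(samples, target_dbfs=-14.0):
    """Linear gain factor that moves the RMS level to target_dbfs."""
    gain_db = target_dbfs - rms_dbfs(samples)
    return 10 ** (gain_db / 20)

# Hypothetical input: one second of a quiet 440 Hz tone at 44.1 kHz.
sample_rate_hz = 44_100
tone = [0.05 * math.sin(2 * math.pi * 440 * n / sample_rate_hz)
        for n in range(sample_rate_hz)]

gain = gain_for_target(tone)
normalized = [s * gain for s in tone]

print(f"before: {rms_dbfs(tone):.2f} dBFS")
print(f"after:  {rms_dbfs(normalized):.2f} dBFS")
```

On real speech recordings a static gain large enough to reach the target can push peaks past full scale, which is presumably why the workflow pairs normalization with dynamic compression.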

The description above reflects the current digitization process, a set of practices resulting from regular evaluation of current digitization standards and our ability to meet and surpass them with the equipment and time we have available. When audio digitization of the interviews began on November 1, 2005, masters were recorded at a 44.1 kHz sample rate and 16-bit depth. Soon, we hope to replace the LynxOne sound card with a FireWire card, removing another gain structure from the signal chain to create digital preservation masters with as little information lost, or added, as possible.
