Home Reproduction of Motion Picture Soundtracks
Today’s laser disc [DVD] can, if made with great care, sound identical to the original film master from which it was made. On the other hand, listeners at home often perceive differences from the way they heard fine motion-picture presentations. They perceive the dialog to be less well defined and harder to understand. Foley sound effects like clothing rustle seem exaggerated and voices sound “hard”. There is a widespread perception that in re-mixing film for video release the level of dialog has been reduced to that of the sound effects and music, thus causing intelligibility problems. It is the purpose of these notes to point out how the differences between cinema and home reproduction account for these observations, and how the differences can be minimized for maximum enjoyment of the program material at home.
A number of elements might cause a lack of translation from cinema to home. We include, in no particular order:
- Lack of a dedicated center channel
- Mis-matched front loudspeakers
- Front loudspeaker frequency response problems
- Front loudspeakers lacking directional control
- Lack of headroom in the reproduction chain
- System mis-calibration
- Inappropriate system frequency response for film reproduction
- Lack of correct surround sound decoding parameters including DSP
- Improper surround level, method, and placement
- Room acoustics and equalization
All of these are potentially problem areas for playback. Let us look at each of them in turn.
Center channel
A dedicated center channel with a power amplifier and a loudspeaker improves dialog intelligibility by increasing “clarity” of sounds originating in the center of a stereo sound field. With two-channel stereo reproduction, correct imaging of centered audio sources at the center of the sound field generally occurs only for listening on the centerline - moving off the centerline makes the image of, say, a centered vocalist “snap” to the closer loudspeaker. So one undesired property of using only two front loudspeakers is that centered sound images will be placed correctly only for viewers seated on the centerline. This is a noticeable problem in conventional stereo listening, but it is an even bigger one in home theater listening, where there is a picture which we expect the sound to match. For centered sound to remain centered as we move about the room, and to remain anchored on the screen, a center channel loudspeaker is a requirement.
Even under perfect conditions, including matched loudspeakers, symmetrical rooms, and centered listening, the effect of improved clarity is heard for 3 vs. 2 front channels. What happens is that two loudspeakers, radiating exactly the same sound, produce a “phantom” image between them for centered listeners. Each ear receives signals from each loudspeaker. Sound from the left loudspeaker [for example] arrives at the left ear first, followed by sound from the right loudspeaker. Even if the two loudspeakers and their signals are identical and the room is symmetrical, there will be a frequency response difference caused by the differing angles of arrival of the two sound fields at the left ear. Also, the delay of the right channel signal compared to the left causes a mid-range frequency response dip when the two signals are summed in the ear canal. A dip in this frequency region reduces “presence” because this is the frequency range (around 2 kHz) which is most responsible for the impression of distance. A dip here will be heard as the source being farther away.
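To illustrate the mid-range dip described above, here is a minimal sketch in Python. It models the signal at one ear as the near loudspeaker's sound plus a delayed, slightly weaker copy from the far loudspeaker. The 0.25 ms delay and the 0.7 relative gain are assumptions, chosen only so that the first cancellation falls near the 2 kHz region mentioned in the text; real values depend on the listener's head and the loudspeaker angles. The function names are hypothetical.

```python
import cmath
import math


def first_null_hz(delay_s):
    """Frequency of the first cancellation for a copy delayed by delay_s."""
    return 1.0 / (2.0 * delay_s)


def summed_level_db(freq_hz, delay_s, far_gain=1.0):
    """Level of (near sound + delayed far sound), in dB relative to a
    perfectly in-phase sum of the same two signals."""
    total = 1.0 + far_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    # Guard against log10(0) at an exact cancellation.
    return 20.0 * math.log10(max(abs(total), 1e-12) / (1.0 + far_gain))


if __name__ == "__main__":
    delay_s = 0.00025   # ASSUMED extra travel time to the far ear, seconds
    far_gain = 0.7      # ASSUMED attenuation of the far loudspeaker's signal
    print(f"first null near {first_null_hz(delay_s):.0f} Hz")
    for freq_hz in (250, 500, 1000, 2000, 4000):
        level = summed_level_db(freq_hz, delay_s, far_gain)
        print(f"{freq_hz:5d} Hz: {level:6.1f} dB")
```

Under these assumptions the summed level falls to a minimum in the 2 kHz region and recovers above it, which is the pattern the text describes. A real center loudspeaker avoids this dip because each ear receives only one arrival of the centered sound.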
Even going so far as to equalize the dip for centered program material, comparing two vs. three front channels still shows an improvement in clarity for three channels. So for all but the most casual listening a dedicated center channel loudspeaker should be considered essential.
By the way, dialog is predominantly recorded in the center channel because it was found in making the first stereo films that it was disconcerting to hear dialog jump from one side to another as the picture changed: it was easier to accept inaccurate localization than it was to listen to dialog jumps. This is probably the source of the widespread misconception that the center channel is the dialog channel.
Mis-matched front loudspeakers
Quite often the capabilities of the center channel are given short shrift. Although the center is often mis-described as a “dialog” speaker, in fact full frequency and dynamic range information is mixed to the center channel in movies. For example, the original recordings of music for film are most often three-channel recordings, left, center, and right. So center is just as important as left and right and must play equally loudly without distortion, and have a matched frequency response so that sound which is panned does not change timbre in the center of the sound field.
A permissible compromise is to restrict the amount of low bass sent to the center, and re-direct it to [the subwoofer loudspeaker, by setting the switch usually labeled “Small/Large” on Dolby Digital decoders to the “Small” position]. With the exception of the low-bass region, the rest of the range of the loudspeaker should match left and right as closely as possible.
Sometimes small, horizontal format loudspeakers have been used for the center channel to minimize the visual impact of the speaker by matching the width of an ordinary video screen and minimizing height. Unfortunately, most of these designs employ two woofers, side-by-side, on either side of a center tweeter. While this might work all right if it matches the left and right loudspeakers in spectral balance for a centered listener, just one seat off the centerline the spectral balance suffers. This is because for an off-center listener, the sound from the closer woofer arrives first, then the sound from the farther one. The time delay between the two will cause a dip at a mid-range frequency which makes voices seem recessed compared to listening on axis. Very few of these smaller center loudspeakers avoid this problem by rolling off one woofer (it doesn’t matter which) as frequency goes up, or by using a passive radiator in place of one of the woofers. So there are great variations among these set-top models.
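The off-axis dip can be estimated with simple geometry. The sketch below assumes a distant listener (a far-field approximation), two identical woofers driven with the same signal, a speed of sound of 343 m/s, and a hypothetical 0.2 m center-to-center woofer spacing. None of these figures comes from a particular product, and the function names are illustrative only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # ASSUMED, roughly room temperature


def path_difference_m(woofer_spacing_m, off_axis_deg):
    """Extra distance from the farther woofer to a distant listener."""
    return woofer_spacing_m * math.sin(math.radians(abs(off_axis_deg)))


def first_dip_hz(woofer_spacing_m, off_axis_deg):
    """Lowest frequency at which the two woofers arrive half a cycle apart.

    Returns math.inf on axis, where both woofers are equidistant.
    """
    diff_m = path_difference_m(woofer_spacing_m, off_axis_deg)
    if diff_m == 0.0:
        return math.inf
    return SPEED_OF_SOUND_M_S / (2.0 * diff_m)


if __name__ == "__main__":
    spacing_m = 0.2  # HYPOTHETICAL center-to-center woofer spacing
    for angle_deg in (0, 10, 20, 30):
        dip = first_dip_hz(spacing_m, angle_deg)
        print(f"{angle_deg:2d} degrees off axis: first dip near {dip:.0f} Hz")
```

The further the listener sits from the centerline, the lower the dip moves. Whether it is audible depends on whether both woofers are still radiating at that frequency, which is why rolling one of them off early removes the problem.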
By far the best approach to take is to use completely matched left, center and right loudspeakers, matched for frequency response and directional characteristics. This maximizes the freedom of the film-maker to place sounds anywhere in the front sound field and have them match completely. An objection which may be raised to this is that if full frequency range systems were used for all three channels, the loudspeakers would have to be very large to accommodate wide range sound panned anywhere. Luckily, the human ability to perceive direction diminishes as frequency falls, so it is practical to combine the bass from the channels and supply it to a subwoofer “channel” without compromise, thus limiting the size needed in each of the front speakers, and permitting their placement in relation to the picture so that the sound imaging left-center-right makes sense.
Front loudspeaker frequency response problems
A problem seen in some loudspeakers designed for music reproduction only is that the frequency region of the mid-range centered around 2 kHz is depressed. The effect of a broad dip in this frequency region is to recess the sound stage; instruments seem further away, thus exaggerating the depth dimension of the source. While this may be a pleasant enough distortion of the program material on some music sources, it is undesirable on other music, and not good at all on film sound reproduction.
Film sound mixes include dialog, music, and sound effects, and thus the loudspeakers have to be designed to handle all of these elements with equal fidelity. If a film sound track is played over a loudspeaker having a mid-range dip, voices seem recessed and buried in the competition from sound effects and music.
As we said, while a dip in the 2 kHz region may be pleasant on some music, it actually harms other musical sources. For example, opera recordings share many of the same requirements as film sound: they too are a combination of dialog (singing), and music, and even occasionally, sound effects. So to discriminate against one frequency region in order to exaggerate depth is seen as a fundamental distortion of the program material. Thus loudspeakers having smooth, wide, and flat frequency response are best for all uses. For the occasional case where added “depth” is desired, the use of digital signal processing to achieve such depth through the introduction of delayed sound is indicated.
Front loudspeakers lacking directional control
Let us say that one part of a sound experience is localizing the various sources, and that a second part is correctly reproducing the depth of sources in the recording, and that a third is reproducing the enveloping part of stereophonic sound as a spacious surround field. Depth and envelopment are often confused. We separate them here into that component of the sound field which reproduces sound from the plane of the front loudspeakers and beyond, called depth, and a diffuse-field component, coming from no particular direction, called envelopment. Both might be called different elements of spaciousness. The first is available from conventional stereo systems, but to reproduce both, in proper relationship, absolutely requires more than two loudspeakers.
Now virtually all of the thinking in stereo loudspeaker design, even today, tries to achieve a balance among all of these competing requirements for reproduction over just two loudspeakers. Thus nearly every design on the market turns out to be a compromise among these, often competing, factors.
Today’s home theater sound system, on the other hand, uses more than two loudspeakers, so each of the loudspeakers can be optimized for a more specific job, with less compromise than heretofore has been necessary. That is, front channel loudspeakers can be built to optimize sound imaging and depth perception, and surround loudspeakers can be built to optimize sound envelopment.
By using front loudspeakers with directional control, particularly in the vertical plane, a number of advantages are seen:
- Seated and standing listening positions can be covered uniformly with direct sound
- The total direct-to-reflected sound ratio is increased, thus improving localization of screen sound and intelligibility
- Less sound is radiated in directions which can cause reflections having negative consequences, such as floor and especially ceiling reflections
Ceiling and floor reflections add to and subtract from the direct sound since they are delayed relative to the direct sound, causing audible frequency response aberrations. These reflections also tend to make the sound field more monaural, thus less spacious, since they come from the same position in the horizontal plane as the direct sound (whereas side wall reflections do not cause the same problem because their direction is different in the horizontal plane). These reflections may also interfere with the perception of depth in the recording, which has its own early reflections and may be heard with less “clutter” if the loudspeaker reduces ceiling and floor reflections; thus a vertically directional loudspeaker permits greater perception of recorded depth.
An old axiom of loudspeaker design was “wide dispersion is good”. But questions like “if wide is good, isn’t omnidirectional best?” were never adequately answered, for there were noticeable problems with very wide dispersion designs, such as an increased dependence on the acoustics of the listening room. Recent times have seen a refinement of the argument. Wide dispersion is good within a “listening window” that covers likely listening conditions with flat early energy, but “wider dispersion” than this promotes room reflections, particularly from the floor and ceiling, which may have deleterious effects. The best front loudspeakers for home theater, as for music listening, have controlled directivity, wide horizontally and narrower vertically.
Lack of headroom in the reproduction chain
Film sound is different from all other kinds of recorded sound. While most sound systems attempt to reproduce a live event, a home theater system is reproducing an experience which first occurred not live, but over loudspeakers. And the conditions of dubbing films are well known, with established frequency range, response, dynamic range, etc. So rather than needing an unknown amount of power, for example, to reproduce any source, what a home theater needs is the ability to play as loud as the original dubbing stage without distortion, and the job may be seen as being done. So there is an upper limit on what must be achieved.
On the other hand, think about the following progression: Mozart, Beethoven, Mahler, Terminator 2. Through this progression we see an ever greater use of wider frequency range, dynamic range, and spatialization. So film sound is stressful in the sense that it is program material which may use all of the available recording “space” of the source media.
Thus, on the one hand, we have a way to set a reasonable upper boundary on requirements, and on the other, we have film makers who will go to the limits of the envelope. For engineers, this is actually a desirable condition, since it allows for reasonable planning for a solution.
Another factor is that the perception of program material as being too loud is often a complaint not actually of its being too loud, but of its being too distorted, something difficult for people to separate perceptually. It is likely that in most systems the first limitation encountered is due to the power amplifier, which will clip before other parts of the system.
A common misconception is that smaller loudspeakers require less power, since after all, they are smaller. In fact, smaller loudspeakers which cover the same frequency range are usually less sensitive than larger ones, so smaller ones require more, not less, power. So the required amount of amplifier power is set by the combination of requirements due to the program material, sensitivity of the loudspeakers, and room size. The combination of these three ingredients makes a general solution to the problem difficult, but a rule of thumb can be stated by locking down some of the conditions. If the room is on the order of 3,000 cu. ft., and the loudspeaker sensitivity for the left, center, and right loudspeakers is about 88 dB at 1 m and 2.83 V (1 watt in 8 ohms), then the required amplifier power is 100 watts per channel in 8 ohms. If the speaker were 3 dB less sensitive, at 85 dB, then twice the amplifier power would be needed.
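The rule of thumb above can be written as a short calculation. This is only a sketch: it holds the room size fixed at the roughly 3,000 cu. ft. case from the text and scales the 100 watt, 88 dB reference point by loudspeaker sensitivity alone, using the standard relation that each 10 dB corresponds to a tenfold change in power. The function and constant names are hypothetical.

```python
REFERENCE_SENSITIVITY_DB = 88.0  # dB at 1 m for 2.83 V, from the rule of thumb
REFERENCE_POWER_W = 100.0        # watts per channel into 8 ohms, same rule


def required_power_w(sensitivity_db):
    """Amplifier power needed to reach the same level as the reference case.

    Assumes the same room size and program material as the rule of thumb.
    """
    shortfall_db = REFERENCE_SENSITIVITY_DB - sensitivity_db
    return REFERENCE_POWER_W * 10.0 ** (shortfall_db / 10.0)


if __name__ == "__main__":
    for sensitivity_db in (91.0, 88.0, 85.0, 82.0):
        power_w = required_power_w(sensitivity_db)
        print(f"{sensitivity_db:.0f} dB sensitivity: about {power_w:.0f} W per channel")
```

A 3 dB drop in sensitivity gives a factor of 10^0.3, which is very nearly 2, matching the statement that an 85 dB loudspeaker needs twice the power.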
System mis-calibration
Besides amplifier power, another source of trouble in system headroom is mis-calibration.
Following the manufacturer’s procedure for setup carefully should ensure optimum dynamic range in the surround sound decoder. Most include a mode for circulating test noise which can be used to set balance among the output channels. In the case of Home THX controllers, the noise is at an electrical level which should be reproduced at 75 dB sound pressure level from each channel in turn. A RadioShack sound level meter, costing around $30, can be used to produce absolute level calibration of these systems. Set it to C weighting and Slow reading and adjust each output level control in turn for 75 dB SPL.
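As a bookkeeping aid for this procedure, the sketch below turns a set of meter readings into the adjustment each channel needs to reach 75 dB SPL. The readings shown are invented for illustration, the channel names are arbitrary labels, and the function name is hypothetical; the only figure taken from the text is the 75 dB target.

```python
TARGET_SPL_DB = 75.0  # target level per channel, from the calibration procedure


def level_trims_db(measured_spl_db, target_db=TARGET_SPL_DB):
    """Return the change, in dB, to apply to each channel's level control.

    Positive means turn the channel up, negative means turn it down.
    """
    return {
        channel: round(target_db - spl, 1)
        for channel, spl in measured_spl_db.items()
    }


if __name__ == "__main__":
    # HYPOTHETICAL readings taken with C weighting and Slow response.
    readings = {"left": 73.5, "center": 75.0, "right": 74.0, "surround": 78.0}
    for channel, trim in level_trims_db(readings).items():
        print(f"{channel:9s} adjust by {trim:+.1f} dB")
```

After applying the trims it is worth repeating the measurement, since level controls are rarely marked accurately enough to land on the target in one step.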
Inappropriate system frequency response for film reproduction
Around 1985 it was found that laser discs could, if made very carefully, sound identical to the film masters from which they were made, when played back over the original sound system. But when played over a high-quality home stereo system it was noticed that high-frequency energy was exaggerated. This led to several audible effects:
- Excessive Foley sound effects like clothing rustle and small movement
- Hearing background noise shift from shot to shot that was not heard on the dubbing stage
- Hard sounding voices, particularly on vowels and “s’s”
These problems were traced to the fundamental differences between film dubbing and listening over home high-fidelity speakers. In Home THX this problem was solved through a process called re-equalization, taming the excess high-frequency energy. This is a circuit in Home THX controllers that is used in the THX Cinema mode. If you do not have a Home THX controller, you will probably notice that if music sources sound well balanced on your system, then video sources will sound too bright. You can partly overcome this by adjusting tone controls, although most [ordinary] surround sound [controllers/]receivers do not have common tone controls for all channels, usually limiting them to left and right.
Lack of correct surround sound decoding parameters including DSP
Earlier in the evolution of the home theater concept, surround sound decoding of the two channels present on laser discs and tapes into the four channels left, center, right and surround was accomplished in an extraordinary range of ways. Many of these implementations suffer from not matching what is heard in the dubbing stage environment, and thus cannot be said to be reproducing the program material as it was intended. Today Dolby [Digital, Pro Logic/Pro Logic II/Pro Logic IIx/DTS] decoders predominate; these do match what is heard in production of a movie. If you have an older Dolby Surround decoder without the Pro Logic [Digital, Pro Logic II/Pro Logic IIx/DTS] mark, or any other decoder, it is likely there will be audible differences from the way it was heard in production. Some older decoders implemented a theory which held that since the home is smaller than the cinema, we ought to make the home experience bigger in order to match, by taking left channel information and supplying it to the left surround, and so forth. This caused a great distortion of the sound field, so that “congruence,” the match between picture and sound, was lost.
DSP means Digital Signal Processing. This is a general technique of manipulating signals digitally rather than one specific application, although there is confusion about which application is meant when the term DSP is used. For example, there are Home THX controllers on the market which accomplish the various THX processes digitally using DSP technology (there are [were] also ones which are [were] fully analog, and even analog-digital hybrids). What is important in the end is performance, not technique.
But another meaning given to DSP is the recreation of a sound space by adding various reflections and reverberations to the sound of the recording. While there are legitimate reasons to do this with music recording, in particular among those implementations which extract the ambience present in the recording and reproduce it correctly spatialized for multi-channel playback, there is no good reason to do this to film program material.
The difference lies in the different acoustics of concert halls vs. motion-picture theaters. A concert hall is a very real part of the experience of a concert; it provides warmth and spaciousness through reflection and reverberation processes which may be interesting to duplicate at home. A movie, on the other hand, is meant to be experienced in an acoustically neutral environment, so that the sound track recording can create the proper ambience from scene to scene, by recording and reproducing it. The motion-picture theater itself is neutral in sound. The idea is to be able to switch from the back seat of a car to a gymnasium, from one shot to the next, and have the sound system of the theater reproduce the changing room acoustics of these spaces. If the theater added its own reverberation, then we could never get the intimacy of the back seat, and there would be less or even no change noticeable when we changed shots. There are, however, some bad older theaters built before modern reverberation control materials were available, which do add reverberation of their own. Mostly these are vaudeville houses, converted in the 1930s to motion-picture use. More suitable for concert halls, to which many have been converted, these theaters are considered today to be fully obsolete, with none being built along these lines for a great many years.
So if you have a DSP mode called something like Theater, 70 mm, Cinema Acoustics, or the like, know that the addition of reverberation or reflections to program material in which these effects have already been recorded carefully is a gross distortion of the program material.
Improper surround level, method, and placement
Mis-calibration of the surround level is a frequent problem. Often people think that because they bought and paid for a surround system they should hear surrounds all of the time, so they mis-calibrate the surround level by turning it up. This is the equivalent of the “ping-pong” stereo era circa 1958 - the idea that all stereo effects have to be exaggerated to be heard. Actually surrounds are at their best when they rarely attract attention to themselves, reproducing as they do the surround information from a film which largely consists of reverberation of on-screen sound, low-level ambiences, and the occasional spot effect. What is spectacular is to calibrate the system carefully and then abruptly switch the surround loudspeakers off. This is usually a more dramatic demonstration of how much the surround channel matters than hearing information coming from the surround speakers all of the time.
The surround component of the stereo sound field is generally meant to be enveloping, carrying as it does reverberation of sound sources which occur on the screen, low-level ambiences, and the like, most of which are meant not to be localized, but rather to wrap the listener in a diffuse field, similar to the reverberant field in a concert hall having no particular preferred direction. To this end it is useful to use the opposite of the screen channel approach. For the screen channels a directional loudspeaker was used to promote localization. Since we want the opposite here, a minimum localizing loudspeaker is best, that is, one with very wide dispersion, or even with dispersion arranged so that the listener gets little or no direct sound from the loudspeaker. The Home THX solution is to use dipole-type loudspeakers generally located to the sides of the listener which radiate their sound throughout the space and use the acoustics of the listening room to promote envelopment.
The older quadraphonic era arrangement of left and right front and back loudspeakers has been found to be inadequate to reproduce a diffuse field. Today placement of surround loudspeakers is more often to the sides.
Room acoustics and equalization
There are many problems in room acoustics in motion-picture theaters. The problems that are seen there are largely ameliorated at home though, simply because of the difference in room size. For example, echoes from the back wall of a theater can disturb listening in the front rows, as the long-delayed reflection off the back wall is returned to the front seats much later, a problem exceedingly unlikely at home.
On the other hand, a different set of problems presents itself. The most prominent of these is usually the effect of standing waves. Standing waves, or room modes, cause lack of uniformity of mid-bass response as the source of sound and the listener move about the space. Traditional approaches in home stereo to improving mid-bass response include advice to move the loudspeakers and the listening position around to optimize sound quality. This is more difficult in home theaters, where more loudspeakers and a picture are involved. Thus solutions relying on engineered approaches become more prominent. These include:
1. Choice of the ratio of room dimensions H:W:L to spread the effects of the room modes
2. Low frequency absorption, in particular in the corners of the room where it will be most effective, and
3. Use of room equalization
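To see why the ratio of room dimensions matters, the sketch below lists the simplest family of standing waves, the axial modes between each pair of opposite surfaces, using the standard relation f = n * c / (2 * L). It is a simplification: it ignores modes involving more than one pair of surfaces, assumes a rectangular room with rigid walls and a speed of sound of 343 m/s, and the room dimensions in the example are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # ASSUMED, roughly room temperature


def axial_modes_hz(dimension_m, max_hz):
    """Axial mode frequencies along one room dimension, up to max_hz."""
    modes = []
    order = 1
    while True:
        freq_hz = order * SPEED_OF_SOUND_M_S / (2.0 * dimension_m)
        if freq_hz > max_hz:
            return modes
        modes.append(freq_hz)
        order += 1


def room_axial_modes(height_m, width_m, length_m, max_hz=200.0):
    """Sorted (frequency, axis) pairs for all three room dimensions."""
    dimensions = (("H", height_m), ("W", width_m), ("L", length_m))
    return sorted(
        (freq_hz, axis)
        for axis, dimension_m in dimensions
        for freq_hz in axial_modes_hz(dimension_m, max_hz)
    )


if __name__ == "__main__":
    # HYPOTHETICAL rooms: a cube, then a room with unequal dimensions.
    for room in ((3.0, 3.0, 3.0), (2.7, 4.2, 6.0)):
        print(f"H x W x L = {room} m")
        for freq_hz, axis in room_axial_modes(*room):
            print(f"  {freq_hz:6.1f} Hz ({axis})")
```

In the cubic room every mode frequency appears three times, so the response is strongly uneven at a few frequencies. With unequal dimensions the modes are spread more evenly, which is the aim of choosing the H:W:L ratio with care.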
Improved methods for adjusting room equalizers by means of spatial and temporal (space and time) averaging have become practical in the last few years. Although equalizers have been on the market in the past, a precise way to tune them has not been. Today a room equalizer, such as the Rane THX 44 [Rane THX 22 and AudioControl Bijou], provides equalization for the mid-bass frequencies in particular that are troublesome in home theaters. If other methods are unavailable to you, for instance because it is impractical to rebuild your room to favorable dimensions or to add bass absorption, then equalization is the ingredient that will most help this problem.
Conclusions
Probably no one of these effects we have discussed is a cause for all of the observations which were stated in the introduction, but the accumulation of all of them certainly conspires to decrease the translation of the experience from film to home. Certainly we have found that if you account for practically all of the items we have discussed, then the observations of how different it sounds from cinema to home are greatly reduced. For example, even playing a film some 15 dB (a great deal!) softer than it was originally made over a system in which these parameters are optimized still produces clear sounding dialog, and there are no complaints that “they mixed the dialog down.”
All of the items we have been discussing are incorporated into the design of Home THX theater installations. Whether you have a complete system now, or contemplate one in the future, you should know that each of the developments of electronics and loudspeakers in Home THX stands alone. That is, they may be bought individually as space, time, and budget allow, and each of the Home THX marked products will make a step in improving your home theater. Even if you can’t afford it now, we hope these notes have made you more fully aware of the issues involved in home theater sound so that you can be a wiser user. Good luck with your home theater experience.