Capturing Perceptual Expertise:
A Sound Equalization Expert System
Dale Reed
University of Illinois at Chicago, EECS Dept.
851 S. Morgan St. (M/C 154)
Chicago, IL
60607-7053 USA
(312) 413-9478
reed @ uic.edu
ABSTRACT
This paper describes an intelligent interface to assist in the expert perceptual task of sound equalization. Inductive learning is used to acquire expert skill using nearest neighbor pattern recognition. This skill is then used in a sound equalization expert system, which proficiently adjusts the timbres (tonal qualities) of sound in a context-dependent fashion. The computer is used as a tool to sense, process, and act in helping the user perform a perceptual task. The developed system shows that the nearest-neighbor context-dependent equalization is rated 68% higher than the set linear average equalization and that it is preferred 81% of the time.
Keywords
Learning, Expert system, Equalization
Inductive learning can be used to perform an expert skill using nearest neighbor (NN) pattern recognition. This is demonstrated through a sound equalization expert system that learns to proficiently adjust the timbres (tonal qualities) of brightness, darkness, and smoothness in a context-dependent fashion, creating an intelligent computer interface. This is innovative in that it applies the established nearest-neighbor technique to the new application area of performing a skillful perceptual task. This combination has been made possible through advances in computer memory and processor technology, making previously intractable problems now feasible. This work also demonstrates a human-computer interaction (HCI) paradigm where the computer is used as a tool to sense, process, and act in helping the user perform a perceptual task.
The expert system developed here for doing sound equalization is an example of capturing the valuable commodity of human expertise using a computer. Computer learning is needed to help overcome the knowledge acquisition bottleneck for these systems.
Human expertise can be separated into expert knowledge and expert skill. Expert knowledge consists of what you know, such as knowing when medical symptoms indicate a heart attack or who composed a particular piece of music. Expert skill consists of what you are able to do, such as being able to perform heart bypass surgery or to play a piece of music. Skills as such do not constitute what we think, but rather what we are. An aging professional athlete may still know what to do, but his or her body may no longer be able to execute the action. In our expert system the skill consists of changing tonal qualities (timbres) of sounds through equalization.
In the sections to follow we first discuss the nature of the perceptual task of sound equalization and related work. We then discuss using the computer as a tool to capture expertise (section 2). In section 3 we look at the underlying computational approach used, that of Nearest Neighbor Inductive Inference. Then we discuss setting up two experiments using equalization to change timbres of sound (section 4), with conclusions presented in section 5.
We define a perceptual task as a task where sensory input is processed to appropriately perform some action, e.g. riding a bike or vocal harmonization. We differentiate between sensing and perceiving in that perceiving takes the additional step of incorporating the sensory input into some sort of usable representation. Perceiving is not just observing, but additionally apprehending. We also differentiate between an “ordinary” perceptual task and an expert, or skillful, perceptual task. Many people can drive a car, but few have the skill to drive in a race. Many people can tell which of two equalizations for a piece of music they prefer, but few have the skill to isolate which frequency bands cause the differences.
Though early Artificial Intelligence (AI) researchers felt sensory-rich mundane tasks such as vision or locomotion would be easier to solve than “expert” tasks such as medical diagnosis, the opposite has proven to be true. We show how a computer system can be used as a tool to aid the user in both perceiving and performing a sensory-rich task, giving a non-expert an expert level of performance.
Sound equalization is used in public address systems, recording studios, movie theatres, and stereo systems. At a very basic level it is encountered on home stereo systems as the treble and bass tone controls. These act as amplifiers and filters changing the amount of energy in different frequency bands. Equalization is used to make a sound be perceived as more natural sounding, since audio equipment and room acoustics change aspects of the original sound. Secondly equalization is used to give a sound a new property, such as making drums sound more resonant or removing a harsh “nasal” quality of a singer’s voice. Bartlett [1] has a description of terms commonly used to describe timbral qualities.
Typically when a sound engineer is setting up a sound system, the system as a whole is first equalized to compensate for the equipment and the listening environment. Next individual channels are equalized for the microphones on particular instruments or other sound sources. Expert sound engineers are those who have developed through experience the ability to hear a sound and isolate exactly which one or several frequencies (out of 31 possible bands) need to be changed to give a desired effect. This is complicated by the context-dependent nature of equalization.
Our Human-Computer Interaction (HCI) paradigm of using the computer as a perceptive tool is related to computer representations of sensory data used to create virtual environments. For instance, visual, aural and tactile feedback are used by biochemists in the pharmaceutical drug design process through a simulation representing the atomic interaction between molecules [5]. Users get tactile feedback as they manipulate the image of a molecule they are building. In this case the computer is used as a sensory tool in the virtual environment; however, it isn’t an intelligent tool in that it doesn’t learn. Other examples of virtual environments are three-dimensional computer games, micro-surgery, and remote robotic control. Processing of sensory data is also used for autonomous vehicle navigation [13] and speech recognition [10]. The virtual environments described above are used to present sensory data, though the interface is not used interactively to enhance a user’s skill level.
There are two notable examples where a computer does learn to perform skillfully. The first is Lee Spector’s GenBebop program [11], where a genetic algorithm is used to create improvisation based on a short underlying musical segment. The result is very interesting, though arguably not expert performance. The second is Harold Cohen’s AARON system [2][7], which automatically generates paintings through the use of an elaborate rule-based system with a flat-bed plotter. Our work differs in that the computer is used as a tool to aid the user in both perceiving and performing an expert task.
In order to use the computer as a perceptual tool, the user must be an integral part of the system. This exploits both the memory and processing power of the machine as well as the intuitive and synthesizing ability of the user. Both the machine and the user perceive and remember independently of each other, but productive synergy can arise when they are combined.
Consider the schema used to capture expertise shown in Figure 1, applied in this work to the expertise developed by a sound engineer. Our goal is to externalize a sound engineer's internal expertise, capturing it in a form which can be reused by a non-expert.

Figure 1: Schema used to capture expertise.
The engineer first recognizes the features or context of the present sound, then remembers similar sounds and equalization changes made in the past with respect to the desired outcome. This information is used to infer similar equalization changes to be made in the present case. Our intent is that this process be externalized to the point that a user can think only about the goals and need not have the expertise to match features or infer equalization changes. The same schematic would apply to perceptual tasks other than our example of sound engineering. By introducing a computer into the loop, the stimulus, context, goal, and resulting changes can all be remembered for later use, possibly by a non-expert.
Figure 2 illustrates how the expertise-gathering schematic from Figure 1 can be implemented in a computer system. The system must first be trained, accumulating the body of experience that constitutes the system’s expertise. The second phase, performance, applies the accumulated knowledge through inductive inference, as shown by the thick light-gray lines. To train the system, an expert user perceives the stimulus and is given a goal. The user manipulates the stimulus using the computer to achieve an aesthetically pleasing difference with respect to the goal. The context of the original stimulus (the auditory identifying signature), along with the new computer changes for the selected goal, is then stored in the database.
Figure 2: Capturing expertise with a computer in the loop
In our application to equalization the Stimulus is a sound. The Context Analysis yields a representative “signature” made up of a measurement of the average energy per frequency in each sound. The Goals are changes in the timbres of brightness, darkness, and smoothness. Each example’s Changes are the equalizer settings used to implement the goal, and the Modifications Control is an audio equalizer.
For the performance phase, we add the inferencing module. As before, a stimulus (e.g. a sound being played) enters the system, but this time the user selects a goal. The system does pattern matching on the stimulus’ signature, finding the n most similar previously recorded examples (signature-goal pairs) in the database, using nearest-neighbor pattern matching. The system then makes the same (or very similar) changes to the present stimulus as were made to the previously captured stimuli (nearest neighbors) for the same goal. The user can provide corrective feedback, with these changes added to the database as a new example. Note how the computer is used as a tool to help perceive the input (Context Analysis), induce the proper action to be taken (Inferencing), and also cause the resulting perceptual change (Modifications Control). The system also has the ability to change dynamically according to user preferences by remembering the users’ feedback in cases where the suggested change was inadequate.
Now let us take a look in more detail at the Inferencing module as implemented here using nearest neighbor pattern recognition.
Symbolic Artificial Intelligence processing is involved with “figuring out the rules,” or coming up with the underlying primitives and their relation to each other. Sometimes it is not possible to figure out the rules or in fact not necessary, particularly in cases where you are capturing a skill rather than knowledge (e.g. riding a bicycle.) In this research we are using knowledge without completely understanding its primitives and their relationships. Rather than “figuring out” or reasoning, we use pattern matching to inductively solve new problems in analogous ways to previously seen similar situations - an “expertise oracle,” as it were. Stated another way, “Intelligence is as intelligence does.” This is done algorithmically using Nearest Neighbor inductive inference.
Genetic algorithms [4] are one of the most popular inductive inference methods, though they suffer from lengthy training time. Modified classifier systems [3] reduce the training time, but are highly sensitive to the reward and punishment values used. Decision trees [9] provide efficient lookup, but suffer from a need for very large data sets and a lengthy setup time compared to the Nearest Neighbor (NN) [12] approach. Although knowledge of the relationships between examples is more opaque when using NN, it has the advantages of being very straightforward, sensitive to local populations, and adaptable to dynamic changes in the data.
Nearest neighbor is an example-based pattern recognition approach where all the data points are stored in an n-dimensional space (hypersphere). A new example is mapped into that space and its predicted outcome is computed from the outcomes of its neighbors, that is, the points close to it. These points share similar characteristics.
Consider applying NN to credit-risk analysis, where information from a credit card application is evaluated in order to assign a credit rating to an applicant. Applicants whose credit rating falls below some threshold will not be given a credit card due to the risk involved.

Figure 3: Placing a new example by its nearest neighbors. Outcome of credit-worthiness is determined by outcomes of its neighbors.
This is illustrated in Figure 3, where we are trying to predict the credit-worthiness of a loan applicant based on marital status, income, and age. The outcome of credit-worthiness is represented by the numbers inside the boxes, and the location of the boxes reflects the other fields’ values. We have chosen credit-worthiness to be scaled between 1 and 10 for this example, where 10 is most credit-worthy. Placing the new example into the hypersphere (of only 3 dimensions in this case), we find that it is closest to two other points whose outcomes are 4 and 6, respectively. The new example’s outcome is then some function of those values: either some sort of weighted average (“5” in this example) or, in cases where there are multiple “close” values, the value that occurs most frequently.
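The two ways of combining neighbor outcomes described above can be sketched as follows. This is only an illustration under stated assumptions, not the system's actual code: the feature encoding (marital status as 0 or 1, income and age rescaled to the range 0 to 1), the Euclidean distance, the inverse-distance weighting, and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter


def nearest_neighbors(examples, query, k):
    """Return the k stored (features, outcome) pairs closest to query.

    Distance is Euclidean (an assumption; the paper does not name a metric).
    """
    return sorted(examples, key=lambda ex: math.dist(ex[0], query))[:k]


def predict_weighted(examples, query, k=2):
    """Inverse-distance weighted average of the k nearest outcomes."""
    neighbors = nearest_neighbors(examples, query, k)
    # The small constant avoids division by zero on an exact match.
    weights = [1.0 / (math.dist(features, query) + 1e-9)
               for features, _ in neighbors]
    weighted_sum = sum(w * outcome
                       for w, (_, outcome) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


def predict_majority(examples, query, k=3):
    """Most frequent outcome among the k nearest neighbors."""
    neighbors = nearest_neighbors(examples, query, k)
    counts = Counter(outcome for _, outcome in neighbors)
    return counts.most_common(1)[0][0]


# Hypothetical applicants: (marital status, income, age) -> credit rating.
applicants = [
    ((0.0, 0.4, 0.3), 4),
    ((0.0, 0.6, 0.5), 6),
    ((1.0, 0.9, 0.9), 9),
    ((1.0, 0.1, 0.1), 2),
]
new_applicant = (0.0, 0.5, 0.4)

# The two closest points have outcomes 4 and 6 and are about equally
# distant, so the weighted average comes out close to 5.
rating = predict_weighted(applicants, new_applicant, k=2)
```

Because the fields are on different scales in practice, some rescaling such as the one assumed here is usually needed before distances are meaningful.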
In our implementation applied to sound equalization, each field (or dimension) is actually a measure of energy in one of the frequency bands averaged over the length of the sound. Experts’ use of the system serves to train it, populating the NN search space. When a non-expert uses the system, new sounds are compared to existing ones, and new changes made are similar to those done in the past for similar sounds the system has already heard.
The goal of the implementation was to create a trained system, built on NN inductive inference, that a non-expert could use as a tool to do expert sound equalization (eq), that is, to change the tonal quality, or timbre, of a sound. We discovered that context needs to be taken into account in affecting timbre through equalization. In other words, you can’t just always “do the same thing” to give a desired perceptual effect; it depends on what the underlying sound is.
We looked specifically at the timbres of brightness, darkness, and smoothness, as illustrated in Figure 4. Brightness can be thought of as high-frequency emphasis, with weaker low frequencies. Darkness can be thought of as the opposite of brightness, with lower frequency emphasis and a decrease in high frequency energy. A sound is smooth if it is easy on the ears, not harsh, with a flat frequency response, especially in the mid-range, with an absence of peaks and dips in the response. Loudness, which is an overall increase in the energy level, was used as a control.

Figure 4: Equalization changes for three timbres.
As mentioned previously, what makes this equalization task difficult is that the equalization changes are context dependent. What makes one sound brighter may not work for another. Making a cymbal brighter would involve increasing the energy in the highest frequencies available (the sliders furthest to the right on a graphic equalizer), but doing the same thing to an electric bass sound may not make any difference at all. This is because there is no energy present at those high frequencies to begin with. When adjusting sliders to make an equalization change, one must take into account the characteristics of the underlying sound. It isn’t possible to just always do the same thing to every sound for a desired effect. Equalizations are not only context-dependent, but they are non-linear as well. Moving certain sliders could make a sound increasingly smooth, but after a point continuing to move the same sliders in the same direction could give an unpleasant quality to the sound.
Figure 5: Energy per frequency band for three sounds.
For example, consider the goal of an increase in brightness applied to the three sounds (bass, acoustic guitar, and rainstick) whose energy graphs are shown in Figure 5. To make the bass sound brighter we would want to increase the energy in the bands 500, 1K, and 2K. To make the rainstick sound brighter, however, we would have to increase the energy in bands 4K and 8K, which is different.
The first experiment had two parts to it. First was the training phase, where subjects used the computer to make equalization changes with the computer remembering what they did. Second was the testing phase, where users gave feedback as to how good of a job the computer did in making equalization changes. The 11 subjects used in training and testing the system were sound reinforcement professionals as well as some music students.
The graphical user interface consisted of a 10-band on-screen equalizer (see footnote [1] at the end of the paper) with real-time measurement of energy per frequency band. The sounds used were taken from unprocessed studio master tracks of typical folk/rock music (e.g. vocals, guitars, basses, drums, etc.). For the training session, 42 stereo sound segments of approximately 15 seconds each were used. The testing session sounds were a distinct set of 10 more sounds. In order to be able to do pattern matching, a “signature” consisting of a measurement of energy in each of the nine frequency bands over all 15 seconds was taken for each sound, with a filter to exclude quiet spots in the sound segment from the averaging. For example, we did not want the measurement of average energy in a drum sound to include the silences between beats. The signature of energy in the nine bands was used to place each sound in a nine-dimensional space (nine-dimensional array) for searching using nearest neighbor.
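The signature computation described above might look roughly like the sketch below. It assumes the sound has already been split into short frames with one energy level per band in each frame; the frame representation, the silence threshold value, and the function name are all hypothetical, and the RMS over the remaining frames is one plausible reading of the averaging rather than the documented method.

```python
import math


def band_signature(frames, silence_threshold=0.01):
    """Per-band RMS level over all frames that are not quiet.

    frames: list of frames, each a list with one level per frequency band.
    A frame is treated as quiet (and skipped) when the sum of its band
    levels falls below silence_threshold. Both the framing and the
    threshold are assumptions made for this sketch.
    """
    loud = [frame for frame in frames if sum(frame) >= silence_threshold]
    if not loud:
        raise ValueError("no frames above the silence threshold")
    n_bands = len(loud[0])
    return [
        math.sqrt(sum(frame[band] ** 2 for frame in loud) / len(loud))
        for band in range(n_bands)
    ]


# A drum-like toy input with two bands: the silent middle frame is
# ignored, so the gaps between beats do not pull the levels down.
drum_frames = [[3.0, 4.0], [0.0, 0.0], [3.0, 4.0]]
drum_signature = band_signature(drum_frames)
```

In the system described here the signature would have nine entries, one per active equalizer band.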
Each subject spent between 2 and 4 hours on the training phase using the interface shown in Figure 6. To start the first sound playing users would select the “Play” button. The desired goal was highlighted in the goals window (i.e. more or less of Brightness, Smoothness, or Loudness) so users would adjust the equalization sliders to make a “just noticeable difference” (jnd) relative to that goal (Gescheider 1976). The “Flat Eq” and “Changed Eq” buttons allowed subjects to compare the changed sounds to the original sounds. These controls were used in real time, while the sound was being played. The sound could be replayed as many times as needed. Once subjects were satisfied that the goal had been met, selecting the “Next” button took them to the next training example. For each user for each sound-goal combination the computer created and stored an exemplar consisting of:
· Soundfile name
· Goal (one of 6 from the goals matrix)
· Final slider positions (scaled from 0..31 for each slider)
· Energy-per-band “signature” for that sound. (RMS)
The equalizations for all six goals for each sound were completed before advancing to the next sound.
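The stored exemplar listed above can be pictured as a small record type. The sketch below is illustrative only: the field names, the goal labels, and the added `subject` field (implied by "for each user" but not in the list) are assumptions, as are the file name and values in the usage lines.

```python
from dataclasses import dataclass
from typing import Tuple

# The six goals of the first experiment: more or less of each quality.
GOALS = (
    "more brightness", "less brightness",
    "more smoothness", "less smoothness",
    "more loudness", "less loudness",
)


@dataclass(frozen=True)
class Exemplar:
    subject: str                   # who made the change (assumed field)
    soundfile: str                 # name of the sound file
    goal: str                      # one of GOALS
    sliders: Tuple[int, ...]       # final slider positions, each 0..31
    signature: Tuple[float, ...]   # RMS energy per band for the sound

    def __post_init__(self):
        if self.goal not in GOALS:
            raise ValueError(f"unknown goal: {self.goal!r}")
        if any(not 0 <= position <= 31 for position in self.sliders):
            raise ValueError("slider positions must lie in 0..31")


# Hypothetical record for one subject, one sound, and one goal.
example = Exemplar(
    subject="subject01",
    soundfile="bass01.wav",
    goal="more brightness",
    sliders=(16, 16, 16, 18, 20, 20, 16, 16, 16),
    signature=(0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.0, 0.0, 0.0),
)
```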

Figure 6: Training Screen for first Experiment. Slider changes corresponding to the presented goal are recorded by the system.
The same subjects used to train the system continued with the testing phase. The subjects’ accumulated data became the dataset of 462 examples (11 users x 42 sounds) per each of the six goals, for a total of 2,772 exemplars.
Subjects were asked to give a rating to each of 4 equalizations relative to the highlighted goal. These four types of equalizations are shown in Figure 7, where each consisted of the mean equalization across some set of exemplars. The y axis shows whether all or simply the 4 nearest neighbors were chosen, and the x axis shows whether exemplars from all users or simply the subject’s exemplars were considered. This arrangement was to help answer two questions:
1. Are the equalizations context sensitive? This could be determined by comparing the top row (All Exemplars) with the bottom row (NN Exemplars). Better values in the bottom row would suggest the equalizations are context sensitive.
2. Is there general agreement as to the definition of the equalization timbral terms brightness, smoothness, and loudness? This could be determined by comparing the left column (All Users) with the right column (One User). Little difference between the two columns would suggest agreement on these definitions.

Figure 7: Four Possible Equalizations for first Experiment
The interface screen for the testing phase is shown in Figure 8. Each of the four equalizations mentioned above was represented by one of the eq selection buttons, with the rating given by moving the matching slider directly beneath it. The correspondence between the four equalizations and the eq selection button positions was randomized on each presentation.
To start the sound playing, subjects selected the “Play” button. While the sound was playing one of the four equalizations could be applied by selecting “Eq A”, “Eq B”, “Eq C” or “Eq D.” This selected equalization could be compared to the original un-equalized sound by clicking on the “Flat Eq” button. Users went back and forth between these two buttons (one of the buttons A, B, C or D and the “Flat Eq” button) until they were satisfied as to how good of a job this equalization (A, B, C or D) did relative to the goal highlighted in the goals window. A rating was then given to this equalization using the corresponding eq rating slider beneath the selected eq selection button. (Although the Eq Ratings sliders are labeled on the screen from 1 to 5, they actually mapped to values from 1 to 15.) These steps were repeated for the other three eq selection buttons (the three not yet selected from A, B, C or D). Once the user was satisfied with the ratings given to the four equalization options clicking on the “Next” button advanced to the next example.

Figure 8: Testing Screen for first Experiment. Equalizations are rated as to how well they do with respect to the highlighted goal.
The results showed that nearest-neighbor inferred changes were 7% better than the average equalization changes, while there was also agreement between users on the definition of timbral terms (within 0.6%). Unacceptably high standard deviations in the results led to some interesting observations.
When giving equalization ratings, subjects would sometimes leave a rating slider in the bottom-most “unchanged” position in cases where they could not hear any discernible difference between the original sound and the equalized sound. In the process of analyzing the data it became evident that there was a vast difference between subjects’ ability to hear some of the equalizations, with the number of unrated equalizations ranging from a low of 6% to a high of 50%. In other words, in the worst case the subject could not discern the equalization half of the time. This large number of unrated cases contributed to the large standard deviation.
The large number of unrated cases was addressed by only considering those exemplars rated by at least 8 of the 11 subjects. This revealed that nearest-neighbor did twice as well at yielding a discernibly different equalization for the smoothness goal as compared to the brightness or loudness goals. This indicates that subjects’ ability to hear equalization changes was most strongly context dependent for the smoothness goal. Interestingly, subjects had also indicated that smoothness was the most difficult timbre to implement.
One of the main reasons for the variance in the results was that the acoustic environment varied between subjects. The system (computer, amplifier, speakers) was carried from location to location, sometimes being set up in subjects’ workplaces, sometimes in their homes. Consequently the amount of ambient noise and the acoustics of the room varied from location to location, affecting subjects’ ability to hear. This was addressed by setting up a consistent listening chamber used in all sessions. This listening chamber was lined with SONEX sound baffling panels to eliminate early reflections, and used Genelec 1030A powered near-field monitors to give more accurate sound reproduction.
A second problem was the bimodal nature of the goals. Subjects were asked to rate the goodness (1 to 15) of equalization adjustments, where the best rating should have been given to equalization adjustments making a just noticeable difference (jnd) vis-a-vis the goal. This meant that both gross equalization changes and no change should have received a worse rating. Having two separate conditions which both were supposed to be rated as poor contributed to the variance in ratings. Some subjects apparently just rated sounds as to how “good” of a job the equalization did vis-a-vis the goal.
For the second experiment the original 11 subjects were augmented by an additional 6, and subjects were first given a hearing test to determine their ability to hear the difference between different equalizations. Each subject was presented with two sounds, where one of them sometimes had an equalization change applied to it, and then indicated whether the two sounded the same or different. 30 such judgements were gathered, giving an indication as to how well the subject could discern equalization changes. Results ranged from 60% to 90% correct judgements for the 17 subjects.
Subjects were asked to make equalizations and to give ratings of equalizations based on the aesthetic “goodness” of the equalization relative to the goal, rather than using the more problematic jnd. Fewer goals were presented in the second experiment. Rather than more and less of brightness, smoothness, and loudness (6 goals), only 3 goals were presented: more of brightness, darkness and smoothness. This cut the training time in half, helping prevent the auditory fatigue of which some subjects had complained. Sounds were edited to have a more uniform gain level, and the overall system loudness level was lowered.
As in the first experiment, there was a training and a testing phase. For the training phase, the interface shown in Figure 6 was again used in equalizing the 41 sound segments, except that there were now only three goals (more of brightness, darkness, and smoothness) rather than the original 6. All examples with “Brightness” as a goal were done first, then those for “Darkness” and finally “Smoothness.”
For the testing phase, the best 11 out of the total of 17 subjects were determined by analyzing the extent to which they moved the sliders. Users who had to move the sliders to an extreme in order to effect a perceptible change in the sound were eliminated. As expected, it turned out the better the subject’s hearing as measured by the hearing test, the less the subject tended to move the sliders. The accumulated data for the 11 selected subjects became the dataset of 451 examples per each of the three goals, for a total of 1353 examples, embodying the “knowledge” of the system which was evaluated during the testing phase.
Rather than present four equalization possibilities (Figure 7) for rating by the subjects using the interface in Figure 8, there were now only three equalization possibilities rated by the subjects. These three were: (1) The average equalization for all exemplars for all users, (2) The average equalization of the nearest neighbors for this subject only, and (3) No change, which was used as a control.
Each of these three equalizations was represented by one of the Eq Selection Buttons. The linear average was the mean slider change across all 11 users for all 41 sounds for the current goal. This average embodied the approach of “always do the same thing” for a desired goal, such as always increasing the rightmost sliders to make a sound more bright.
At the other extreme the NN average was the mean slider change of the 2 nearest neighbors from that subject’s training session only. The nearest neighbors were computed by comparing the example’s signature (energy per each of the 9 bands) with the signatures of the stored data. This was essentially placing the example point in a 9-dimensional space and finding the two closest points.
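The contrast between the linear average and the NN average can be made concrete with a short sketch. It is not the experiment's code: the dictionary keys, the Euclidean distance, the representation of a "slider change" as a per-band offset from the flat position, and the three-band toy data (the real signatures have nine bands) are all assumptions for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count
            for i in range(len(vectors[0]))]


def linear_average(exemplars, goal):
    """Mean slider change over every exemplar for the goal (all subjects,
    all sounds): the 'always do the same thing' equalization."""
    return mean_vector([e["change"] for e in exemplars if e["goal"] == goal])


def nn_average(exemplars, goal, subject, signature, k=2):
    """Mean slider change of the k exemplars, from this subject and goal
    only, whose signatures lie closest to the new sound's signature."""
    pool = [e for e in exemplars
            if e["goal"] == goal and e["subject"] == subject]
    nearest = sorted(
        pool, key=lambda e: math.dist(e["signature"], signature))[:k]
    return mean_vector([e["change"] for e in nearest])


# Toy training data: two bass-like and two cymbal-like sounds.
training = [
    {"subject": "s1", "goal": "brightness",
     "signature": (1.0, 0.2, 0.0), "change": (0, 4, 0)},
    {"subject": "s1", "goal": "brightness",
     "signature": (0.9, 0.3, 0.0), "change": (0, 6, 0)},
    {"subject": "s1", "goal": "brightness",
     "signature": (0.0, 0.2, 1.0), "change": (0, 0, 8)},
    {"subject": "s1", "goal": "brightness",
     "signature": (0.0, 0.3, 0.9), "change": (0, 0, 6)},
]
new_bass = (0.95, 0.25, 0.0)

context_free = linear_average(training, "brightness")
context_dependent = nn_average(training, "brightness", "s1", new_bass)
```

With this toy data the linear average boosts the top band, where the new bass sound has no energy, while the NN average boosts only the middle band, as the two stored bass exemplars did.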
The experiment validated the hypothesis that nearest neighbor pattern matching (context dependent) does a better job at equalizations than does a linear average. The mean rating of the “no change” equalization (the control) was 2, and the mean rating of the linear (non-context dependent) equalization was 6. The mean rating of the nearest neighbor (context dependent) equalization was 10.08, which is 68% better than the linear equalization.
Rank ordering of the results by goal showed that brightness was the easiest timbre for subjects to identify, followed by darkness, with smoothness being the most difficult. (10% of the time subjects rated the “no change” as the most aesthetically pleasing smoothness equalization.) Rank ordering of which of the three equalizations was preferred showed that NN equalizations were preferred 81% of the time.
The implementation of NN to do sound equalization was successful in learning to perform a perceptual skill. This illustrates that combining fast processors and large memories makes a pattern-recognition approach feasible for perceptual tasks, where the computer is used as a perceptual tool. The objection of memory limitations can be overcome through judicious pre-processing and, if necessary, through a first-in-first-out or least-recently-used replacement strategy of examples.
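A replacement strategy of the kind mentioned above could be sketched as a bounded example store. This is a hypothetical design, not part of the system described: the class and method names are invented for the example, and the paper does not say how or when an example would count as "used".

```python
from collections import OrderedDict


class BoundedExampleStore:
    """Keeps at most `capacity` examples, evicting from the old end.

    If touch() is never called, eviction is first-in-first-out. If
    touch() is called whenever an example is used (for instance, when
    it is returned as a nearest neighbor), eviction becomes
    least-recently-used.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def add(self, key, example):
        """Store an example, evicting the oldest one if over capacity."""
        self._items[key] = example
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def touch(self, key):
        """Mark a stored example as recently used."""
        self._items.move_to_end(key)

    def keys(self):
        """Keys from oldest (next to be evicted) to newest."""
        return list(self._items)


# Least-recently-used behavior: "a" is touched, so "b" is evicted.
store = BoundedExampleStore(capacity=2)
store.add("a", {"goal": "brightness"})
store.add("b", {"goal": "darkness"})
store.touch("a")
store.add("c", {"goal": "smoothness"})
remaining = store.keys()
```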
The trained system developed here has been implemented as an expert equalizer (Figure 9), where a sound is selected, and then simply by moving a slider under the desired goal, a context-dependent appropriate amount of equalization is done.
One way to look at the system is that it implements a many-to-one mapping, putting many complicated controls into a single control that appropriately affects the outcome. The paradigm presented here could be used to exploit the computer as a tool in extending users' perception in the modalities of sight or smell or other applications in hearing. Using nearest neighbor for a perceptive task could be used by airlines in interpreting video or x-ray data in explosives detection in luggage [8] or by the Navy in interpreting audio signals for submarine detection.
The most interesting avenue for exploration with the type of user interface presented here would be to include temporal information, with the desired effect being control over a sound quality such as reverberation. Complex relationships between many controls such as filters and delays could be given a single control to help localize sound in 3 dimensions [6], determine physical characteristics of the virtual listening environment, and set the perceiver’s location in the space.
As illustrated in this work’s application to sound equalization, a computer can be used to capture expertise, becoming a perceptive tool to give a user an expert level of skill.

Figure 9: Expert equalizer interface. Slider changes in the “Eq Effects” window for a particular goal automatically give a context-dependent equalization.
Thanks to Orion Poplawski, Timothy Mills, and Dave Angulo for assistance in programming. Thanks to Doug Jones for testing speakers and to Peter Langston for providing sound files. Thanks to the following for the many hours of work training and testing the system: Tom Miller, Rob Motsinger, Jeff York, Moses Ling, Helen Hudgens, Shaun Morrison, David Schuman, Mike and Lisa Danforth, Dick Cutler, Jeff Cline, Pablo Perez, Stan Sheft, Norman Kruger, John Bobenko, and John Lanphere.
[1] Bartlett, Bruce, and Bartlett, Jenny. 1995. Engineer’s Guide to Studio Jargon. EQ (February): 36-41.
[2] Cohen, Harold. The further exploits of AARON, painter. Stanford Humanities Review 4:2.
[3] Frey, Peter W., and Slate, David J. 1991 Letter Recognition Using Holland-Style Adaptive Classifiers. Machine Learning. The Netherlands: Kluwer Publishers, 6:2 (March).
[4] Holland, John. 1986. Escaping Brittleness: The Possibilities of General Purpose Learning Algorithms Applied to Parallel Rule-Based Systems. In R.S. Michalski, J.G. Carbonell, & T.M. Mitchell, eds., Machine Learning II. Los Altos, CA: Morgan Kaufmann.
[5] http://www.ncsa.uiuc.edu/Vis/Projects/Docker.
[6] Kendall, Gary S., and Martens, William L. 1984. Simulating the cues of Spatial Hearing in Natural Environments. Proceedings of the 1984 International Computer Music Conference, Paris.
[7] McCorduck, Pamela. 1991. AARON's code: meta-art, artificial intelligence, and the work of Harold Cohen. New York: W.H. Freeman.
[8] Murphy, Erin E. 1989. A Rising War on Terrorists. IEEE Spectrum, 26:11:33-36.
[9] Quinlan, J. R. 1986. Induction of Decision Trees. Machine Learning 1:81-106.
[10] Rudnicky, Alexander I., Hauptmann, Alexander G., Lee, Kai-Fu. Survey of Current Speech Technology. Communications of the ACM 37:3 (March): 52-57.
[11] Spector, Lee. 1995. International Joint Conference on Artificial Intelligence 95. Montreal, Canada, August 20-25. Workshop on AI & Music. In Press.
[12] Stanfill, C., and Waltz, D. 1986. Toward memory-based reasoning. Communications of the ACM, 29:1213-1228.
[13] Thorpe, C., Hebert, M., Kanade, T. and Shafer, S. 1987. Vision and navigation for the Carnegie-Mellon NAVLAB. In Annual Review of Computer Science. Vol. II. Annual Reviews Inc., Palo Alto, Calif.
[1] The sampling rate of 22.05 kHz limited the highest sampled frequency to 11 kHz, so the tenth band at 16 kHz was disabled.