Sunday, September 7, 2025

Psychology of Music (Tan et al., 2010): Chapter 3.2, The Anatomy of Audition

In this post, we'll cover the latter half of Chapter 3 on the (neuro)anatomy and physiology of the auditory system. Specifically, we'll discuss the 3 main components of the auditory system: the external ear, the middle ear, and the inner ear. The section on the inner ear will focus on the cochlea and how it may help us identify pitch. The last part of the chapter then briefly covers the auditory cortex.
Fig. 1 The anatomy of the ear (Plack, 2018)


Anatomy and physiology of the ear

The external ear

The external ear is composed of 2 parts, namely the pinna and the auditory canal (see Fig. 1). The pinna refers to the outer 'flappy' part of the ear that protrudes from the head. It functions something like a satellite dish, scooping up sound waves from the surroundings and directing them into the auditory canal. The irregular shape of the pinna also reflects incoming sound waves differently depending on their direction, which aids sound localisation (i.e., figuring out where the sound is coming from).

That being said, sound localisation isn't a function performed by the pinna alone. Our bodies can play a part in this, too! For instance, because we humans can't move our pinnae independently, we rely on moving our heads, necks, and bodies to better position the ears toward the sound source. Additionally, we can rely on binaural cues, that is, auditory information that arises due to the small distance between both ears. Because of this distance, sounds typically arrive at each ear at slightly different times, leading to an interaural timing difference. Since sound waves lose energy as they travel (and the head itself partly blocks sound on its way to the far ear), they also reach each ear at slightly different amplitudes, leading to an interaural intensity difference.
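To get a feel for how small the interaural timing difference is, here is a minimal sketch. It assumes a distant sound source, a simple straight-line path model, and round numbers for the speed of sound and the distance between the ears; none of these values come from the chapter, and the function name is made up for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SEPARATION_M = 0.18     # assumed: rough ear-to-ear distance for an adult


def interaural_time_difference(azimuth_deg,
                               ear_separation_m=EAR_SEPARATION_M,
                               speed_of_sound=SPEED_OF_SOUND_M_S):
    """Extra travel time (in seconds) to the far ear for a distant source.

    azimuth_deg: 0 means straight ahead, 90 means directly to one side.
    Uses a simplified model that ignores the sound bending around the head.
    """
    extra_path_m = ear_separation_m * math.sin(math.radians(azimuth_deg))
    return extra_path_m / speed_of_sound


for azimuth in (0, 30, 90):
    itd_microseconds = interaural_time_difference(azimuth) * 1e6
    print(f"{azimuth:>2} degrees: {itd_microseconds:.0f} microseconds")
```

Under these assumptions, even a sound coming from directly beside one ear arrives at the other ear only about half a millisecond later, which gives a sense of how fine the timing comparison has to be.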

The other part of the external ear is the auditory canal, which is an S-shaped tube connecting the outer ear (i.e., the pinna) to the eardrum. The auditory canal functions somewhat like the coupled resonator of a musical instrument. Just as coupled resonators resonate with and amplify sound waves produced by the vibrating device in an instrument, the auditory canal acts like a resonator that resonates with and amplifies sound waves directed in via the pinna. This property of the auditory canal means that incoming sound vibrations will have enough energy to sufficiently agitate the eardrum by the time they reach the end of the canal.
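As a rough illustration of the resonator idea, the canal can be treated as a straight tube that is open at the pinna and closed at the eardrum. This is a simplification that the chapter doesn't make (the real canal is S-shaped and the eardrum isn't rigid), and the 2.5 cm length below is an assumed round figure. A tube like this resonates most strongly when its length is a quarter of the sound's wavelength.

```python
def quarter_wave_resonance_hz(tube_length_m, speed_of_sound=343.0):
    """First resonance of an idealised tube open at one end, closed at the other.

    The tube is a quarter of a wavelength long at this frequency,
    so frequency = speed / (4 * length).
    """
    return speed_of_sound / (4.0 * tube_length_m)


ASSUMED_CANAL_LENGTH_M = 0.025  # assumed: about 2.5 cm

resonance = quarter_wave_resonance_hz(ASSUMED_CANAL_LENGTH_M)
print(f"Estimated canal resonance: {resonance:.0f} Hz")
```

With these numbers, the estimate lands at around 3.4 kHz, so this toy model suggests that the canal gives its biggest boost to sounds of a few kilohertz rather than to all frequencies equally.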

The middle ear

The middle ear (see Fig. 1) starts with the tympanic membrane (i.e., the eardrum), which is connected to the inner ear via tiny bones (i.e., the malleus, incus, and stapes) collectively referred to as ossicles. The middle ear has 2 important functions. Firstly, it helps preserve auditory information while it is being transferred from the outer to the inner ear. This function is better appreciated when we consider how sound waves are transmitted via the medium of air in the outer ear, and via the medium of liquid in the inner ear.

Because liquid is denser than air, for the same amount of energy, there will be more resistance to motion in liquid than in air. This is quantified using medium impedance, or the ratio of the incident pressure to the induced flow in the medium. Given the same numerator (i.e., same incident pressure), the impedance of liquid has a smaller denominator than that of air (i.e., liquid flows less than air / more resistance is met in liquid than in air). Consequently, the impedance of liquid will be much larger than that of air.
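To see why this mismatch matters, here is a small sketch that estimates how much sound power would cross directly from air into liquid if there were nothing in between. It uses the standard formula for a flat boundary hit head-on, and it approximates the cochlear fluid with water. The impedance values are assumed textbook figures, not numbers from the chapter.

```python
import math

# Assumed characteristic impedances in Pa*s/m.
# Cochlear fluid is approximated by water, which is an assumption of this sketch.
Z_AIR = 415.0
Z_WATER = 1.48e6


def transmitted_fraction(z1, z2):
    """Fraction of sound power that crosses a flat boundary at normal incidence."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


def loss_db(fraction):
    """Express a transmitted power fraction as a loss in decibels."""
    return -10.0 * math.log10(fraction)


fraction = transmitted_fraction(Z_AIR, Z_WATER)
print(f"Transmitted: {fraction:.2%} of the power, a loss of {loss_db(fraction):.1f} dB")
```

Under these assumptions, only about 0.1% of the power makes it across, a loss of roughly 30 dB. The rest is reflected back, which is the problem the middle ear has to solve.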

The middle ear, or more specifically the ossicular system, solves this problem of increased resistance through the process of impedance matching. Crudely, vibrations from the outer ear are transmitted to the tympanic membrane and then to the ossicles, which further amplify the vibrations so that auditory information is preserved rather than lost by the time they reach the fluid-filled inner ear.
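The chapter doesn't spell out where the amplification comes from, but a common textbook account points to two effects: the eardrum is much larger than the stapes footplate (so the same force is concentrated onto a smaller area), and the ossicles act as a small lever. The sketch below multiplies these two effects together. The areas and lever ratio are assumed approximate figures, and the function names are made up for illustration.

```python
import math

EARDRUM_AREA_MM2 = 55.0    # assumed: effective area of the tympanic membrane
STAPES_AREA_MM2 = 3.2      # assumed: area of the stapes footplate
OSSICLE_LEVER_RATIO = 1.3  # assumed: lever advantage of the ossicular chain


def pressure_gain(eardrum_area, stapes_area, lever_ratio):
    """Pressure at the stapes relative to pressure at the eardrum."""
    return (eardrum_area / stapes_area) * lever_ratio


def gain_db(pressure_ratio):
    """Express a pressure ratio in decibels."""
    return 20.0 * math.log10(pressure_ratio)


gain = pressure_gain(EARDRUM_AREA_MM2, STAPES_AREA_MM2, OSSICLE_LEVER_RATIO)
print(f"Pressure gain: about {gain:.0f}x, or {gain_db(gain):.0f} dB")
```

With these figures, the pressure is boosted by a factor of roughly 22, or about 27 dB, which is at least in the right range to offset much of the loss expected at an air-to-liquid boundary.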

Secondly, the middle ear also protects the inner ear from loud sounds. For example, muscles attached to the ossicles contract in response to very loud sounds, thereby reducing how much vibration is passed on to the inner ear. Still, this process is too slow to attenuate sudden bursts of high-amplitude vibrations, so it's not a foolproof mechanism against all kinds of loud noises.

The inner ear

The inner ear contains the cochlea (see Fig. 2), which is made up of 3 parallel tubes coiled in a snail-like fashion. These tubes are the vestibular, median, and tympanic canals, all of which are filled with incompressible fluids and separated by flexible membranes. Of interest is the basilar membrane that separates the median and tympanic canals. Vibrations from the ossicles first agitate the oval window of the vestibular canal, which then induces pressure waves in the cochlear fluid. These pressure waves then reach the basilar membrane, which supports the organ of Corti, a structure containing rows of inner and outer hair cells (see Fig. 3).
Fig. 2 The unrolled cochlea (Plack, 2018)

Fig. 3 The organ of Corti (Plack, 2018)

What happens next applies primarily to the inner hair cells. When the pressure waves reach these hair cells, they push the cells up against the positively charged cochlear fluid, causing a small opening at the base of the hair cells to open up. This allows positively charged ions from the cochlear fluid to rush in, thereby creating an action potential within the cell that subsequently starts the process of neural signal transmission.

This process of transforming energy from one form to another (e.g., from sound waves to neural signals) is known as transduction. This is the main function of the inner ear, that is, the transduction of auditory information from a physical pressure wave in the cochlea to an electrochemical signal in the brain. Importantly, while the form of energy transmission changes, auditory information is preserved.

Theories of pitch encoding

According to the authors, encoding is another term for this preservation of information despite transduction. In other words, while the cochlea transduces fluid vibration to neural signals, it also encodes in the neural signals the informational structure found in the vibrations. Concerning pitch encoding, there are 2 main theories, namely place theory and time theory, which are posited to account for the encoding of higher and lower frequencies, respectively.

Briefly, the informational structure of importance in place theory (see Fig. 4) is where the basilar membrane vibrates given a specific frequency. Simply put, a particular (fundamental) frequency specifically activates hair cells at distinctive areas of the membrane, and this produces neural signals specifying frequency and pitch. Meanwhile, the informational structure of interest in time theory is the timing between successive peaks in the vibrations. Here, vibrations of higher frequencies cause the auditory nerve to fire more rapidly, and this pattern of firing is what distinguishes one pitch from the next.
Fig. 4 Place theory (source: link)
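To make the two kinds of informational structure concrete, here is a small sketch. For place theory, it borrows Greenwood's frequency-position function, a published approximation that isn't mentioned in the chapter, to estimate where along the basilar membrane a given frequency peaks. The constants are the values commonly quoted for the human cochlea, with position measured as a proportion from the apex (0) to the base (1). For time theory, it simply computes the time between successive peaks of the vibration.

```python
import math

# Commonly quoted human constants for Greenwood's function (an assumption here).
GREENWOOD_A = 165.4
GREENWOOD_SLOPE = 2.1
GREENWOOD_K = 0.88


def place_to_frequency_hz(position):
    """Frequency that peaks at a position along the membrane (0 = apex, 1 = base)."""
    return GREENWOOD_A * (10 ** (GREENWOOD_SLOPE * position) - GREENWOOD_K)


def frequency_to_place(frequency_hz):
    """Inverse mapping: position (0 = apex, 1 = base) for a given frequency."""
    return math.log10(frequency_hz / GREENWOOD_A + GREENWOOD_K) / GREENWOOD_SLOPE


def period_ms(frequency_hz):
    """Time between successive peaks of the vibration, in milliseconds."""
    return 1000.0 / frequency_hz


for frequency in (100, 440, 4000):
    print(f"{frequency} Hz: place = {frequency_to_place(frequency):.2f}, "
          f"period = {period_ms(frequency):.2f} ms")
```

In this approximation, low frequencies peak near the apex and high frequencies near the base, while the period between peaks shrinks as frequency rises. The shrinking period is why timing becomes harder for the auditory nerve to follow at higher frequencies.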

Neither theory is without limitations. On the one hand, place theory doesn't explain how pitch can still be identified even when the fundamental frequency is missing (e.g., when singing low pitches over the phone). In such cases, the relevant 'place' on the basilar membrane will not be activated, so how can we still perceive pitch? On the other hand, for time theory, frequencies above 500 Hz result in continuous, nonstop firing of the auditory nerve. If nerve firing is continuous, no distinctive pattern is formed, so how can we perceive pitches of higher frequencies? Given these limitations, the general consensus is that both theories are required to explain pitch encoding. Specifically, place theory accounts for the encoding of higher pitches above 500 Hz, whereas time theory explains the encoding of lower pitches below 500 Hz.
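The missing fundamental is easier to picture with numbers. In the sketch below, the harmonic frequencies are made-up example values (whole numbers of Hz) for a tone whose lowest component has been filtered out. Because every remaining harmonic is a whole-number multiple of the same fundamental, the combined waveform still repeats at the fundamental's rate, and that repetition rate is simply the greatest common divisor of the harmonics. This is only an arithmetic illustration of the timing cue, not a model of what the auditory nerve does.

```python
from functools import reduce
from math import gcd


def implied_fundamental_hz(harmonics_hz):
    """Repetition rate shared by a set of whole-number harmonic frequencies."""
    return reduce(gcd, harmonics_hz)


# Hypothetical tone with a 200 Hz fundamental that has been filtered out.
remaining_harmonics = [400, 600, 800]

fundamental = implied_fundamental_hz(remaining_harmonics)
print(f"Harmonics {remaining_harmonics} still repeat {fundamental} times per second")
```

So even though nothing vibrates at 200 Hz itself, the timing of the peaks in the combined vibration still carries that rate, which is the kind of information time theory relies on.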


Auditory pathways in the brain

Now that we've covered the different parts of the ear and how the cochlea transduces vibrations into neural signals, we'll discuss how these signals travel from the ear to the brain. Generally, there are plenty of ipsilateral (i.e., same side) and contralateral (i.e., opposite side) connections between the cochlea and the brain. In simpler terms, there are rich pathways that connect the left ear to the left brain hemisphere, as well as those connecting the left ear to the right hemisphere (and vice versa for the right ear). This provides some initial neuroanatomical evidence of the importance of binaural information for hearing. Given the bilateral connections between ear and brain, it makes sense that we would capitalise on these pathways to compare interaural inputs.

Perhaps the most important brain structure for audition is the auditory cortex, which is situated in the temporal lobe on either side of the brain, and which consists of 3 layers of neurons. In the innermost part, we find the primary auditory cortex. Here, neurons are tonotopically arranged in columns. This might sound like jargon, but it essentially means that each neuron in a column is sensitive to a neural signal corresponding to a particular frequency. Using what we've learnt from place theory, we can conclude that tonotopic mapping implies a mapping between a location on the basilar membrane and a location in the primary auditory cortex. And there is some evidence for this, with higher frequencies typically leading to activation in the centre of the primary auditory cortex, while lower frequencies activate more peripheral regions.
Fig. 5 The auditory cortex (source: link)

As for the outermost secondary and tertiary layers of neurons, they seem to respond more strongly to the higher frequencies above the fundamental in a complex tone. Hence, while the primary auditory cortex might be responsible for pitch identification and localisation, the outer layers of the auditory cortex seem to be more concerned with the perception of timbre and sound quality.


Concluding remarks

With that, we've finally come to the end of chapter 3. In the next chapter, we'll explore how neuroscientific approaches and methodologies have been generally employed in the study of music psychology. This will set us up for deeper dives into the link between neuroscience and music in later chapters!


References

Plack, C. J. (2018). The sense of hearing. In Routledge eBooks. https://doi.org/10.4324/9781315208145

Tan, S., Pfordresher, P., & Harré, R. (2010). Psychology of Music: From Sound to Significance. http://ci.nii.ac.jp/ncid/BB01824497
