Neural Adaptation and Perception of Music

2022-01-19

This is a continuation of a previous topic on this blog.

Introduction

Have you ever listened to a piece of music on repeat, for days on end?

But have you ever tried doing this while sleeping, throughout the whole night?

Once upon a time, I did exactly this: I fell asleep to a song playing on repeat in my headphones, and it kept playing throughout the night. The next morning, however, I was shocked to find that the song I had just listened to did not sound the same anymore. It was clearly the same song, but now it sounded distorted in very unusual ways. It was as if the band had decided to switch instruments, hire a new producer and re-record the same song with different sounds.

However, throughout the day, my previous perception of the song slowly returned. In other cases, the previous perception would return on the second day.

Thankfully, I am not the only one who has experienced this phenomenon. After some Google searches, I could find very similar accounts from different discussion forums:

"I've found on many occasions that if I listen to a specific song on repeat whilst I sleep, it sounds very different when I play it again the next day. I usually have to check that it actually is the same version of the song. I can't pinpoint what specifically sounds different. It just sounds like a cover of the song. It's hard to describe."

"I'm a fan of Lady Gaga's song Aura, and recently slept while having the song play on loop. I played the track again a few hours after waking up, and I could not recognize anything in the song except for the lyrics, it's like all of the layers and the production were jumbled up to create an entirely different sound. I was wondering if anyone here had similar experiences with this "phenomena"?"

"The tone and the instruments seem really off, it almost sounds like someone tried doing a cover of the song, a really bad one."

Sadly, besides a couple of forum posts, not much is to be found using Google. There are no easily accessible resources on this topic at all, even after thorough searching.

This brings us to the topic of this project: to explore this phenomenon in depth and try to find the mechanisms behind it. The first part of the project is devoted to research and the design of an experiment. In the second part, I conduct a small-scale pilot study and process the outcome.

Background

With the help of Dr. Jaan Aru, my supervisor for this project, I discovered relevant topics which could be the cause of the phenomenon in question: neural adaptation (and aftereffects), auditory saliency, rhythm and pitch perception, and auditory remapping.

For example, Li, Wang and Chen (2021) found that "after adaptation to a decelerating rhythm, participants tended to perceive the subsequent isochronous rhythm as accelerating"¹. This is a type of short-term neural adaptation. Similarly, Walker and Irion (1979) have found the following in their study:

"If a low-pitched tone of long duration alternates with a high-pitched short tone during an inspection period of a few minutes, then a high test tone of intermediate length sounds longer than a low test tone of equal length. [...] A second contingent aftereffect occurs if pairs of tones of the same pitch, the first short and the second long, are presented repeatedly during an inspection period. If a pair of test tones of equal and intermediate length is then presented, the first tone sounds longer than the second"².

This is, once again, an example of short-term neural adaptation to repeating stimuli. However, the phenomenon we're interested in appears to be much more long-term: it builds up throughout the whole night, over multiple hours instead of minutes. The mechanisms of adaptation might therefore be different in our case.

Figure 1 shows the setup for the experiments made by Li et al. (2021). After listening to a rhythm in the "Initial adaptation" phase, neural adaptation would happen. In the "Top-up/test trials" phase, both the initial rhythm and a neutral rhythm (with equal intervals) would be played in succession, and the participants were asked to specify which kind of rhythm they heard during the neutral parts.

Figure 1. A schematic of accelerating and decelerating rhythms used by Li et al¹.

Besides neural adaptation, I explored the mechanisms of sound perception. I discovered that during REM sleep (but not NREM), the auditory cortex continues to process some low-level acoustic sounds. According to Sifuentes-Ortega et al. (2021), the reaction to musical rhythm lessened greatly during REM sleep, yet some activity still persisted³. This could explain why neural adaptation can happen at all during sleep. It also means that music is indeed processed differently during sleep, which could be related to this very unusual effect.

For all sounds we hear, the auditory cortex generates "feature maps", which directly shape our perception of sound. These feature maps contain additional information on the newly processed sounds to direct our attention to very specific sounds⁴. Figure 2 contains a diagram of the processes involved when generating feature maps and choosing attention cues, proposed by Delmotte VD. (2012)⁴.

Figure 2. A diagram showcasing the generation of feature and saliency maps⁴.

One would guess that songs already familiar to us result in very similar feature maps being generated on each listen. Perhaps when sleeping, the adaptation effect introduces changes to the generation of feature maps. As a result, we immediately pay attention to the new, different aspects of the song and find it peculiar. This would also explain why the new perception of the song persists for some time after the change.

Despite these findings and much more searching, I could not find any literature that would describe the phenomenon in question. The most appropriate studies I could find mention rhythm and pitch of sounds instead of timbre. Also, I could not find relevant studies for adaptation to more complex music during REM or NREM sleep, instead of simpler rhythms and isolated sounds.

Experiment Design

Because I could not find any suitable studies, I resorted to designing my own experiment for the phenomenon. Due to the highly subjective nature of the effect, it is unclear how to capture the entire effect objectively. So, we should start with some minor aspects of the effect.

One way of introducing distortions to sound without directly changing the pitch or adding additional sounds is by changing the frequency spectrum. Different instruments and sounds in a song occupy different parts of the frequency spectrum. By introducing slight changes to some frequency bands, we can get different minor changes in the overall sound. For example, by cutting the 80-300 Hz band, we can reduce the prominence of bass instruments. Or, by boosting the 500-900 Hz area, we can increase the prominence of vocals or piano. Of course, these changes introduce overall distortions to the song which are not present in the original phenomenon.
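As a rough illustration of what cutting or boosting a band means, here is a minimal sketch in Python with NumPy. It is not how the snippets in this project were made (that was done by hand in an audio editor). The function name `apply_band_gain`, the hard-edged FFT filter and the synthetic test tones are all assumptions made for the example; a real EQ plugin uses smooth filter slopes instead of a sharp cut-off.

```python
import numpy as np

def apply_band_gain(samples, sample_rate, low_hz, high_hz, gain_db):
    """Scale one frequency band of a mono signal by gain_db (hypothetical helper)."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    # Decibels to a linear amplitude factor: 10 ** (dB / 20).
    spectrum[band] *= 10.0 ** (gain_db / 20.0)
    return np.fft.irfft(spectrum, n=len(samples))

# Synthetic stand-in for a song: a 100 Hz "bass" tone plus a 700 Hz "vocal" tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
mix = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 700 * t)

# Cut the 80-300 Hz band by 6 dB: the bass tone gets quieter, the 700 Hz tone stays.
cut = apply_band_gain(mix, sample_rate, 80, 300, -6.0)
```

In this toy case the 100 Hz tone ends up at roughly half its original amplitude, while the 700 Hz tone is left untouched.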

The idea of changing the frequency spectrum is used in the experiment, which is described in depth below. The experiment consists of four parts: Preparation, Test 1, Sleep and Test 2.

Preparation

A suitable song is chosen for the experiment. The song must be "catchy" enough to allow for repeated listening throughout the whole night without becoming tiring or annoying.

Then, the chosen song is processed for the testing parts of the experiment. This is done manually. The goal is to create multiple short (< 20 seconds) snippets of the song, containing different parts of the original (such as the intro, verse, chorus, solo). The created snippets are then processed multiple times using an audio editor. As a result, multiple versions of the snippets are generated, each containing a slight distortion from the following list:

80-200 Hz Boost
80-200 Hz Cut
300-1200 Hz Boost
300-1200 Hz Cut
2500-6000 Hz Boost
2500-6000 Hz Cut
Speed-up 5%
Slow-down 5%

All of the generated snippets are then saved under a web page, which will be used for the testing parts of the experiment. Figure 3 contains an EQ filter plugin with a slight 300-1200 Hz boost applied to the audio. Using a stronger boost would result in the distortion being too harsh and immediately noticeable.

Figure 3. Slight frequency boost in the 300-1200 Hz band⁵.
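For readers who would rather script this step, the list of distortions could be generated along these lines. This is only a sketch under several assumptions: the snippet is a mono NumPy array, the 3 dB default gain is a placeholder, the EQ is a hard-edged FFT filter, and the speed change is done by plain resampling, which also shifts the pitch (a DAW may preserve pitch instead). The names `band_gain`, `change_speed` and `make_variants` are made up for this example.

```python
import numpy as np

BANDS_HZ = [(80, 200), (300, 1200), (2500, 6000)]

def band_gain(samples, sample_rate, low_hz, high_hz, gain_db):
    """Boost (positive dB) or cut (negative dB) one frequency band."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[band] *= 10.0 ** (gain_db / 20.0)
    return np.fft.irfft(spectrum, n=len(samples))

def change_speed(samples, factor):
    """Resample so playback is `factor` times faster (pitch changes as well)."""
    n_out = int(round(len(samples) / factor))
    old_positions = np.arange(len(samples))
    new_positions = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(new_positions, old_positions, samples)

def make_variants(samples, sample_rate, gain_db=3.0):
    """Return the original snippet plus the eight distorted versions."""
    variants = {"original": samples.copy()}
    for low, high in BANDS_HZ:
        variants[f"{low}-{high} Hz boost"] = band_gain(samples, sample_rate, low, high, gain_db)
        variants[f"{low}-{high} Hz cut"] = band_gain(samples, sample_rate, low, high, -gain_db)
    variants["speed-up 5%"] = change_speed(samples, 1.05)
    variants["slow-down 5%"] = change_speed(samples, 0.95)
    return variants

# One second of noise stands in for a real snippet here.
sample_rate = 16000
snippet = np.random.default_rng(0).standard_normal(sample_rate)
variants = make_variants(snippet, sample_rate)
```

The sample rate has to be above 12000 Hz for the 2500-6000 Hz band to exist in the spectrum at all, which is why the placeholder uses 16000 Hz.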

Test 1

When a suitable day is chosen, the participant is asked to listen to the chosen song throughout the day, paying attention to different parts of the composition.

After listening to the song throughout the day, the participant is asked to complete an online test. The test consists of several multiple-choice questions. For each question, multiple (3-4) of the previously generated snippets are placed in a random order, in such a way that each processed snippet appears at least once in the test. For most of the questions, the original snippet (without any distortions) is randomly included as well. The participant must listen to the snippets and decide which one is closest to their current perception of the song. Additionally, the participant must give a confidence score for their choice, on a range from "Not confident at all" to "Very confident".
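The question layout described above could be assembled roughly as follows. This is a sketch, not the code behind the actual test page: the function `build_questions`, the fixed four options per question and the 80% chance of including the original are all assumptions chosen for illustration.

```python
import random

DISTORTIONS = [
    "80-200 Hz boost", "80-200 Hz cut",
    "300-1200 Hz boost", "300-1200 Hz cut",
    "2500-6000 Hz boost", "2500-6000 Hz cut",
    "speed-up 5%", "slow-down 5%",
]

def build_questions(parts, distortions, n_options=4, p_original=0.8, seed=None):
    """Build questions until every distortion has been offered at least once."""
    rng = random.Random(seed)
    pool = list(distortions)
    rng.shuffle(pool)
    questions = []
    while pool:
        include_original = rng.random() < p_original
        n_distorted = n_options - 1 if include_original else n_options
        chosen, pool = pool[:n_distorted], pool[n_distorted:]
        # If the pool ran short, top up with distortions not yet in this question.
        spare = [d for d in distortions if d not in chosen]
        chosen += rng.sample(spare, n_distorted - len(chosen))
        options = chosen + (["original"] if include_original else [])
        rng.shuffle(options)
        part = parts[len(questions) % len(parts)]  # cycle through song parts
        questions.append({"part": part, "options": options})
    return questions

questions = build_questions(["intro", "verse", "chorus", "solo"], DISTORTIONS, seed=1)
for q in questions:
    print(q["part"], q["options"])
```

Dealing the distortions from a shuffled pool is one simple way to guarantee the "at least once" rule; a real test would probably add more questions so that each distortion is heard against several parts of the song.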

Participants would be expected to mostly select the original snippets with high confidence scores. Any distorted snippets that are selected would be expected to carry lower confidence scores.

Sleep

After completing the first test, the participant is asked to go to sleep for the night, with the chosen song playing repeatedly in a pair of headphones. It's important that both the headphones and the volume of the song are comfortable enough to allow for sleep.

The Apple EarPods, included with new iPhone devices, seem to be a safe choice, as the headphones are very light and do not sit deep in the ear canals.

Test 2

After sleeping through the night, the same test from the evening before is completed again. Now, the participant is also asked to rate their change of perception of the song on a range from "No change in perception" to "The song is hardly recognizable". The participant can also leave any additional subjective comments.

If the design of the experiment is successful in capturing the effect, we could expect to see more distorted snippets being chosen with higher confidence values. For the original snippets chosen, lower confidence values would be expected.
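The comparison between the two tests boils down to two numbers per test: how many original snippets were chosen, and how confident the choices were. A minimal sketch of that summary is shown below. The response lists are invented placeholder data, not results from the pilot study, and the 1-3 confidence scale and the `summarise` helper are assumptions for the example.

```python
from statistics import mean

def summarise(responses):
    """Summarise one test; `responses` is a list of (chosen_snippet, confidence)."""
    n_original = sum(1 for choice, _ in responses if choice == "original")
    return {
        "original_chosen": n_original,
        "distorted_chosen": len(responses) - n_original,
        "mean_confidence": mean(conf for _, conf in responses),
    }

# Placeholder answers for illustration only (confidence: 1 = low, 3 = high).
evening = [("original", 3), ("original", 2), ("original", 3)]
morning = [("300-1200 Hz boost", 2), ("original", 1), ("80-200 Hz cut", 3)]

print("Test 1:", summarise(evening))
print("Test 2:", summarise(morning))
```

If the effect is captured, the second summary should show more distorted choices than the first, with confidence on those choices that is not much lower.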

Pilot Study

In order to test the effectiveness of the experiment, multiple pilot experiments were carried out. Unfortunately, few participants were willing to take part in the study. One participant completed the first test, but failed to fall asleep afterwards. So, I resorted to completing the experiment multiple times myself.

It should be noted that in some cases, sedatives (mostly quetiapine) were used in order to fall asleep with the songs playing at a moderate volume, for better results. The sedatives can theoretically influence the effect of neural adaptation, but more experiments without the use of sedatives are required to measure the changes.

Finally, it should be noted that no automation for creating the distortions was implemented. This would require much more time and deeper knowledge of audio processing to do in code. Instead, a GarageBand (DAW application) template was created with all distortions in separate tracks. The chosen song is first split into parts, then imported into the template. Then, a separate sample is exported for each track. Finally, the samples are moved to a custom test web page for the experiment (created by me for this project). In total, this process takes around 40 to 50 minutes for each experiment.
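For reference, the export step of such an automation would not need much code. The sketch below writes a set of already processed snippets to WAV files using only NumPy and Python's standard `wave` module. It is not part of this project: the helper names, the mono 16-bit format and the demo tone are assumptions, and the distortion processing itself is left out here.

```python
import tempfile
import wave
from pathlib import Path

import numpy as np

def write_wav(path, samples, sample_rate):
    """Write a mono float signal in the range [-1, 1] as a 16-bit PCM WAV file."""
    clipped = np.clip(samples, -1.0, 1.0)
    pcm = (clipped * 32767).astype("<i2")  # little-endian 16-bit integers
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())

def export_variants(variants, sample_rate, out_dir):
    """Save every named snippet in `variants` as <name>.wav inside out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, samples in variants.items():
        path = out_dir / f"{name}.wav"
        write_wav(path, samples, sample_rate)
        paths.append(path)
    return paths

# Demo with a one-second 440 Hz tone standing in for real snippets.
demo_dir = tempfile.mkdtemp()
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
paths = export_variants({"intro_original": tone, "intro_quiet": 0.5 * tone}, 8000, demo_dir)
print(paths)
```

The harder part, which this sketch skips, is reading arbitrary song files and matching the sound of a proper EQ plugin.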

Experiment 1

The first experiment was carried out with the goal of learning to create the testing pages and distorted snippets. The participant was me (M, 21). The results showed that the created distortions were too severe and noticeable during both tests. Thus, the tests failed to accurately capture any differences in perception. Yet, a subjective opinion was noted (a change from 0 - "Not applicable/no change" to 2 - "Maybe sounds a little different"). Additionally, the following comment was given in the second test: Bass guitar more clear. Strings more pronounced, "in your face".

The first experiment showed that much more work and testing needed to be done in creating the distorted snippets.

Experiment 2

In the second experiment the participant was female, aged 21. Sadly, she could not fall asleep with the repeated song, so only the first test was completed. The test showed multiple weaknesses in the created distortions, as it was hard to distinguish between them and select the original samples.

In the test, only 2 original samples were selected. 4 of the samples were selected as EQ cuts and 2 samples as EQ boosts. The confidence scores were selected as follows: "1" once, "2" 5 times, "3" 2 times. Thus, the participant could not confidently identify her perception of the song before the test, and told me separately that she experienced much confusion when selecting the samples.

The second experiment showed that the distortions should be a little more severe (5-6 dB instead of 3-4 dB), and that fewer samples should be used.
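To put these numbers in perspective, decibels can be converted to a plain amplitude ratio. The snippet below assumes the dB values refer to EQ gain applied to signal amplitude, so the usual 10^(dB/20) rule applies; the function name is made up for the example.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)

for gain_db in (3, 4, 5, 6):
    ratio = db_to_amplitude_ratio(gain_db)
    print(f"{gain_db} dB boost: amplitude x{ratio:.2f}, same-size cut: x{1 / ratio:.2f}")
```

Under that assumption, a 3-4 dB boost raises the band's amplitude by a factor of about 1.4 to 1.6, while a 5-6 dB boost raises it by about 1.8 to 2.0.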

Link to test page for this experiment

Experiment 3

The third experiment featured a total of 9 questions and EQ distortions of 5 to 6 dB. Again, I participated in the experiment myself (M, 21). The experiment was a success, as it was easy enough to see a very slight difference between the created distortions, yet choosing the "correct" samples was also easy enough with high confidence. In total, all samples were selected as original in the first test. Three questions were marked with a confidence value of "3" and three marked with confidence of "2".

After a successful night of repeated listening, a very strong change of perception happened. The subjective score changed from 0 - "Not applicable/no change" to 3 - "Definitely has changed". Two samples were selected as original, but with confidence scores of "2" and "1". From the remaining samples, distorted versions were selected. One sample had a confidence value of "3" and all remaining confidence values were "2".

From this, we can conclude that the new perception was almost as strong as the previous perception, as the confidence scores were mostly medium, and one high. Additionally, the following comment was given for the new perception: Songs seems slower. Very distinct electric guitar parts in both ears (without reverb). Pre-chorus guitars very "sparkly". Bass rumbling more.

Link to test page for this experiment

Experiment 4

In the fourth experiment the participant was male, aged 21. For this experiment, a composition with orchestral parts was chosen. Unfortunately, similarly to the second experiment, the participant failed to fall asleep to the music playing in the background.

In the first test, only two (out of 6) original samples were selected. Most of the confidence scores were low (1-2), with only one score being high (3). This means that it was hard for the participant to decide on which of the samples were original, due to them sounding too similar to each other.

This experiment demonstrates again that using music with more elements and parts is preferred. Unfortunately, this means that the design of the experiment is not optimal for all genres and types of music. Thus, more methods should be created for capturing the effect for different types of music.

Link to test page for this experiment

Results

The conducted pilot experiments allowed changes and tweaks to be made to the experiment design. The experiments showed initial weaknesses in the choice of distortions. The number of samples was also decreased to make the tests shorter and not as exhausting to complete, sacrificing the amount of data collected.

Experiment 3 showed that the perception of the chosen song does indeed change, as marked by the subjective score and comments, as well as by the choice of selected samples shifting from all "original" (with mostly high confidence scores) to mostly distorted (with mostly medium to high confidence scores). Figure 4 contains comparisons of the confidence scores for each sample in experiments 1 and 3. We can see that in the second tests, most of the confidence values are lowered, yet still remain mostly medium.

Figure 4. Change of confidence values in experiments 1 and 3.

Figure 5 contains comparisons of the number of original samples selected in experiments 1 and 3. The first experiment was unsuccessful due to the created distortions being too severe. The third experiment was much more successful, as most of the selected samples were distorted.

Figure 5. Number of selected original samples in experiments 1 and 3.

The experiments show that the method does indeed capture the effect in more detail than simple descriptions. Additionally, the severity of the effect can also be measured by looking at average confidence scores and the overall subjective severity score.

Experiments 2 and 4 show that the severity of the created distortions depends on the song and should be chosen carefully. Songs with more instruments, which occupy a larger portion of the frequency spectrum on average, are more suitable for the created experiment design. Songs with fewer instruments and more dynamic parts proved to be harder to recognize from the created distortions, so they should be avoided if possible. Alternatively, more severe distortions can be created for such songs, at the risk of them being too noticeable.

Conclusions

After studying background information on neural adaptation and occurring aftereffects, we can still expect neural adaptation to be a possible cause for this effect. Previous research shows that listening to a repeating sound can and does cause aftereffects and distortions in our perception. Additionally, it has been shown that our auditory cortex processes and filters all incoming sounds, which can also possibly contribute to the effect, combined with the adaptation process.

During this project, a method for capturing the effect more objectively and in more detail was developed. The method does not definitively confirm that the effect is caused by neural adaptation, though. Still, it is a first step in exploring the effect objectively, and an improvement over giving descriptive comments.

The developed method can be improved further by trying more distortions and effects on samples. Also, it is worth experimenting with very different genres of music and even non-musical sounds. Additionally, it would be useful to develop an application for generating the distortions automatically to speed up the process.

One large problem with this project is the small number of participants and experiments. This project covered only the creation of the method and some pilot experiments to see if the method can work at all. Another crucial detail is that the experiments which were completed were done by a single person (me). To see more results with more accuracy, it is essential to test the method on multiple people of different sexes and ages, and possibly with different brain structures. Two other persons took part in the experiment, but failed to fall asleep after trying for some time. This is understandable, as the experiment requires the participants to have good quality sleep or to use moderately strong sedatives in case of problematic sleep. This also shows that moderately loud music disrupts normal sleep.

Separate research should be done to test this effect further using the developed method. Additionally, even more methods of measuring this effect should be developed to capture more aspects of this effect and to support more styles of music. For example, it might be useful to count the number of distinct melodies/riffs/parts of a song before and after the effect as another way of capturing the distortions and change in perception.