Obscure gestalts in short musical loops (draft)

Apr 10, 2024

I want to talk about some song segments, many of which I think are objectively special. But understanding their specialness requires focusing on stuff which music theory doesn't analyze.

(I'm not trying to say that music theory is bad or ineffective at what it is doing. It just doesn't analyze the aspects of music I'm interested in. And those aspects matter only for a ridiculously small subset of musical compositions.)

TL;DR

The point of the post, in short:

Any sequence of sounds contains elementary psychological ambiguities.
There are relatively simple ways to emphasize those ambiguities.
But musicians don't intentionally explore that direction. Doing so would result in a weird and very restricted type of minimalism.

The key idea of the post is pretty simple (putting sounds into arbitrary groups), but unfortunately introducing it requires a lot of context.

This is a bad post

I want to address some bad and moot aspects of the post:

I apologize for not having learned music theory yet, despite being pretty invested in a musical topic.
Instead of learning your concepts I'm asking you (experts) to learn my concepts. That's cringe.
I know that studying music psychology is very hard. But I want to focus on a relatively small aspect of music psychology. Which isn't even 100% psychological.
This post is very pretentious, giving weird names to the most trivial things. Like comparing sequences to fractals. Cringe.

You don't have to engage with this post. But if you do, please engage with the ideas from the post "head on" (you don't need to read the entire post for that). If you didn't understand something, just ask.

Degrees of objectivity

I think there are three degrees of objectivity:

Subjectivity. When we disagree and can't even know why exactly we disagree.
Semi-objectivity. When we disagree, but we know why we disagree. We can pinpoint a very specific thing we disagree about.
Objectivity. When it's impossible to disagree without denying reality.

In this post I intend to talk only about objective and semi-objective things. Unless I state otherwise.

Please make the distinction between subjectivity and semi-objectivity for this discussion. Usually, if a text contains at least a single thing you can disagree with, it makes the text "subjective". But here it doesn't matter how many things you can disagree with, as long as the reasons for the disagreement are known and the disagreement is tied to very specific things.

Vagueness

I think vagueness of a concept depends on its necessity (if a concept is necessary, it doesn't make sense to criticize it for vagueness) and domain of application (the same concept can have different degrees of vagueness in different domains).

So, to not see my concepts as "too vague", you need to keep in mind the motivation behind them (necessity) and that I'm discussing a very restricted kind of musical compositions (the domain).

Part 1

Common listener viewpoint

Any image consists of exact relative distances between its smallest parts. Be it a pixel image or a vector image. And you can clearly see those exact relative distances. If you couldn't, you would be blind, seeing any image as shapeless and unstable. But most of you can't notice those exact relative distances on an intellectual level. If you could, you would be able to make a good drawing of anything you see. But "drawing" and "seeing" are very different things for most humans.

With music there's a similar story. Music theory talks about a lot of things which a common listener can hear, but can't notice on an intellectual level. "Playing/writing music" and "listening to music" are very different things for most humans.

I'm interested in analyzing the simplest things which a common listener can notice on an intellectual level. Please, accept this "common listener viewpoint" for our discussion.

Decision viewpoint

Do you know ambiguous images? When you look at such an image, you can consciously "decide" what you see. But it doesn't mean that you make a conscious decision every single moment you look at the image. So, we can say the structure perceived in an ambiguous image depends on your half-conscious decisions.

I think the structure perceived in music is the same. It depends on your half-conscious decisions. If that's true, it means you can discover completely new structures in familiar music and hear familiar music completely differently.

Usually people don't talk about half-conscious decisions made when listening to music. But this post focuses specifically on analyzing them. Please, temporarily accept this "decision viewpoint" for our discussion.

Grouping sounds

If "decision viewpoint" is hard to understand, listen to some music and imagine arbitrary events synchronized with the music. Note that you can decide how exactly the music is synchronized with the events. Maybe every sound is synchronized with a separate event. Maybe multiple sounds are synchronized with parts of a single event. You decide. If you need an example of what I'm talking about, watch something from Disney's Fantasia 2000. Or watch some masterpiece by Kevin Caldwell.

Imagined something? Well, deciding the structure of music is similar. It just doesn't require explicit visualization. Only mental grouping and mental separation of sounds. I call a mental group of sounds an "event".

Let's try it on a specific example. Listen to the first 30 seconds of Electricity (Dr. Rockit's Dirty Kiss) by The Avalanches and Matthew Herbert. You can decide:

Do you treat the moment a note is hit and the period of long sustain as parts of the same event? Do you treat the sustained sounds as the background noise?
Do you treat the period the notes are played faster (0:08 - 0:15 or 0:22 - 0:24) and the period the notes are played slower (0:00 - 0:08 or 0:15 - 0:22) as parts of the same motif?

This post is all about analyzing the implications of such decisions.

Part 2

Terminology

Here are my definitions of the terms I'll use:

Event - a mental group of sounds. Sometimes an event can be smaller than a note. For example, if a note is sustained, it can be treated as multiple events. Identical events don't have to contain identical sounds.
Pattern - any sequence of sounds.
Motif - the most important sequence of events. I use the term "motif" very liberally. Sometimes even a single event or a single note counts as the motif.

This post is about drawing arbitrary boundaries between sounds. So those terms are necessary.

Speed - the number of events per second or per 5000 milliseconds or per 3 seconds or per any other timeframe. Sometimes changing pitch or changing loudness counts as changing "speed".
Duration - the duration of a pattern or motif.

I treat "speed" as a continuous variable which can be measured at arbitrary timeframes. Just take two equal periods of time and compare how many events happened in each of them. More events = greater speed. And we can discuss speed fluctuations in the most trivial note patterns. For example, take heart sounds. It's just a 2-note pattern. But we can already talk about speed fluctuations in that pattern. Because the notes are not spread out completely evenly. The closest music theory terms (to what I mean by "speed") are probably "note density" and "rhythmic density". My notion of "speed" doesn't care about the time signature or the beat.
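To make "speed" a bit more concrete, here is a minimal Python sketch. It assumes events are given as a list of onset times in seconds. The onset times are made up for illustration (not measured from any recording), and the helper name `events_per_window` is hypothetical. It only does what the paragraph above describes: take equal periods of time and count the events in each.

```python
def events_per_window(onsets, window, total):
    """Count how many event onsets fall into each consecutive window.

    onsets: event start times in seconds (illustrative values, an assumption)
    window: window length in seconds
    total:  total duration to cover, in seconds
    """
    n_windows = int(round(total / window))
    counts = [0] * n_windows
    for t in onsets:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts

# Made-up "lub dub" onsets: three heartbeats, one per second,
# with "dub" arriving 0.3 s after "lub".
heart = [0.0, 0.3, 1.0, 1.3, 2.0, 2.3]

coarse = events_per_window(heart, window=1.0, total=3.0)
fine = events_per_window(heart, window=0.5, total=3.0)
print(coarse)  # [2, 2, 2]
print(fine)    # [2, 0, 2, 0, 2, 0]
```

With one-second windows the count is the same everywhere, so the speed looks steady. With half-second windows the count jumps between 2 and 0, so the same pattern looks like it fluctuates. Which picture you get depends on the timeframe you pick.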

Call-and-response (redefined?) - may reference any consecutive events which feel like a conversation.

In the most general sense, call-and-response just means that a melody or motif has two parts which can be perceived as "call" and "response". And I use the term more or less in the same sense.

Elementary ambiguity

"Elementary ambiguity" is the key concept for our discussion. What is it? It's a case when two compelling, but contradictory mental groupings of sounds are possible. Let's look at a couple of specific elementary ambiguities.

Amo Bishop Roden by Boards Of Canada, e.g. the first 25 seconds (a piano cover, look at the right hand). Yume Nikki Ending by Kikiyama (a piano tutorial). Columbia by Quevedo, first 8 seconds (a piano tutorial).

In all three cases the main motif is played 1 note at a time, except for a brief moment where multiple notes are hit at the same time. You can treat those multiple notes as multiple events or as a single event. The first interpretation implies greater speed. The second interpretation implies lesser speed.
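The two interpretations can be sketched in a few lines of Python. This is only an illustration under assumptions: the notes are made up (they are not transcribed from any of the three tracks), each note is an `(onset_seconds, midi_pitch)` pair, and `count_events` and its `tolerance` parameter are hypothetical names.

```python
def count_events(notes, merge_simultaneous, tolerance=0.02):
    """Count events in a list of (onset_seconds, midi_pitch) notes.

    If merge_simultaneous is True, notes whose onsets lie within
    `tolerance` seconds of the start of the current group count as
    one event; otherwise every note is its own event.
    """
    onsets = sorted(onset for onset, _pitch in notes)
    if not merge_simultaneous:
        return len(onsets)
    events = 0
    group_start = None
    for onset in onsets:
        if group_start is None or onset - group_start > tolerance:
            events += 1
            group_start = onset
    return events

# Made-up two-second motif: single notes, except three notes hit together at 1.0 s.
motif = [(0.0, 60), (0.5, 62), (1.0, 64), (1.0, 67), (1.0, 72), (1.5, 62)]
duration = 2.0

speed_separate = count_events(motif, merge_simultaneous=False) / duration
speed_merged = count_events(motif, merge_simultaneous=True) / duration
print(speed_separate)  # 3.0 events per second
print(speed_merged)    # 2.0 events per second
```

Same sounds, two different speeds. The only thing that changed is the grouping decision.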

Heart sounds ("lub dub... lub dub... lub dub"). The Next Episode by Dr. Dre, first 5 seconds (a piano cover, look at the left hand, small notes).

You can treat "lub" and "dub" as two separate events or as parts of a single event. First interpretation implies a motif with a steady speed. Second interpretation implies a motif with fluctuating speed. Subjectively, I experience this ambiguity as a mix of confidence (steady speed) and anxiety (speed fluctuations). Now, what about The Next Episode? It has a heart-like 2-note motif.

The point of EAs

Any motif contains elementary ambiguities. Without context it's a completely useless and vacuous concept. But some motifs are especially simple, emphasizing their elementary ambiguities. And there are additional methods, beyond simplicity, of emphasizing an elementary ambiguity.

For example, take Amo Bishop Roden by BOC. The main motif has an ambiguity, but the notes are also sustained (?) or some effect is applied to them. As if the motif drowns in noise. It adds an additional ambiguity ("do I really hear notes or is it just fluctuations of indiscernible noise?") which combines with the first one.

Or take The Next Episode (first 10 seconds) by Dr. Dre. It has a 2-note motif with an ambiguity (about fluctuating speed), but then there appears an additional instrument (0:05 - 0:10) which fluctuates in speed itself. Second instrument highlights the ambiguity of the first.

So, by using audio effects and multiple instruments we can emphasize an ambiguity of a simple motif. And while interpreting a motif as having or lacking an emphasized ambiguity is sort of arbitrary, we can make arguments about those interpretations. Just like normal music theorists do when they want to say something non-trivial about a piece of music.

EAs and emotions

I believe that any experience, not only the experience of music, contains elementary ambiguities. When those ambiguities are emphasized, you get especially distinct experiences. Such as uncanny valleys and liminal spaces; the pleasure of watching water flowing or fire burning; even stuff like trypophobic imagery. And the more emphasized ambiguities an experience contains, the stronger and more special it becomes.

But that's just a bold speculative conjecture. Feel free to ignore this aspect of the post.

Part 3

If you care about ambiguities, you'll realize that counting every specific ambiguity in an audio clip doesn't make much sense. What's better is to count types of ambiguities in an audio clip.

So, we need to split ambiguities into types. I'll split them into 6 types.

Fractal-like ambiguity

"Fractal-like ambiguity" means we have a note pattern which can be interpreted as the motif or as multiple repetitions of the motif. This creates the feeling that the motif contains a smaller version of itself, hence the name.

For example, take a simple sequence. 4-note pattern repeated four times at different pitches. You can treat the 4-note pattern as the motif. Or you can treat the entire sequence as the motif.
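Here is a rough Python sketch of that sequence example. It is my own illustrative check, not an established music-theory algorithm: it assumes notes are MIDI pitch numbers, uses a made-up 4-note pattern and made-up transpositions, and the function name `candidate_motif_lengths` is hypothetical. It lists the block lengths at which the whole sequence splits into blocks of identical shape, ignoring transposition.

```python
def candidate_motif_lengths(pitches):
    """Return block lengths (in notes) at which the sequence splits into
    blocks that all have the same shape up to transposition.

    The full length always qualifies trivially, since it gives one block.
    """
    n = len(pitches)
    lengths = []
    for size in range(2, n + 1):
        if n % size != 0:
            continue
        blocks = [pitches[i:i + size] for i in range(0, n, size)]
        # Shape of a block = each pitch relative to the block's first pitch.
        shapes = {tuple(p - block[0] for p in block) for block in blocks}
        if len(shapes) == 1:
            lengths.append(size)
    return lengths

# Made-up 4-note pattern repeated four times at different pitches.
pattern = [60, 64, 67, 64]
shifts = [0, 2, 4, 5]
sequence = [p + s for s in shifts for p in pattern]

print(candidate_motif_lengths(sequence))  # [4, 16]
```

The output has two candidates: 4 notes (the small pattern as the motif) and 16 notes (the entire sequence as the motif). Those are the two readings the ambiguity is about.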

Another way to create the fractal-like ambiguity is to play a simple pattern which changes a little bit before repeating. For example, listen to the first instrument of Losing It (first 24 seconds) by Rush (see a cover on glockenspiel). You can treat the motif as lasting at least ~6 seconds or as repeating every second, while the real duration of the loop is ~12 seconds. For another example, take Panic Attack (first 7 seconds) by Dream Theater (see a MIDI cover). You can perceive the motif as lasting ~4 seconds or as repeating every ~2 seconds.

The fractal-like ambiguity may fail to be emphasized when the change is too subtle or too distinct:

Dire Dire Docks (first 22 seconds) by Koji Kondo (a piano cover). There's a ~3 second pattern (12 notes) with slight changes. But the changes are too subtle.
Pull Me Under (first 20 seconds) by Dream Theater (a piano cover). Has an interesting 3-note pattern with an echo (?). But the changes of that pattern are either too subtle or too distinct.

Local ambiguity

"Local ambiguity" is when it's unclear how many events the motif contains, but the duration of the motif or the number of times it repeats is clear.

Echoes (first 32 seconds) by Pink Floyd. There's a repeating "ping" with an echo. The echo makes it unclear how many events the ping should be modeled with. But the duration of the ping or the number of times it happens is clear.
Imagine I'm pressing and sustaining a note three times. "TUMMMMM... TUMMMMM... TUMMMMM". It's not clear if "MMMMM" should be ignored (treated as the background noise), but it's clear that the motif repeats three times.

Local ambiguity is emphasized when it relates to the entire motif, to the interpretation of the entire motif. Adding a random audio effect to a random motif may fail to create a local ambiguity.

Matching ambiguity

"Matching ambiguity" is when it's unclear if two patterns are the same motif or not.

To create this ambiguity you may need two instruments contrasting in timbre or speed (better both), yet having something in common. A matching ambiguity may fail to be emphasized if the instruments are dissonant or too synchronized or too independent from each other or too distracting from each other. The trick is finding the sweet spot between "too different" and "too similar".

Take The Next Episode (first 10 seconds) by Dr. Dre. It has two instruments. First appears at 0:00, second appears at 0:05. They both fluctuate in speed, so they can be perceived as playing the same motif. Or not, because they play different note patterns, have different timbre and speed.

Counterexamples:

Duvet remix / Nude version (up to 1:34) by ScummV and Grinch's Ultimatum cover by Pilotredsun & Latchezar Dimitrov. The instruments, including the wordless vocal, are too synchronized here.
Itinerant (0:46 - 2:33) by Rosetta, Language I: Intuition (0:41 - 1:10) by The Contortionist and Balkan (first 20 seconds) by Scale The Summit. Instruments sound dissonant to me.
Xenoflux (5:20 - 5:45 or 5:20 - 8:09). The second instrument distracts me from the first.

"Music which tries to continue, but can't" ambiguity

This ambiguity triggers when a short motif has two parts with uneven content. Or when a motif has very short duration. Or when a short motif ends abruptly. Such conditions make music sound as if it "stumbles" instead of continuing.

Examples of such ambiguity: Sooner or Later (up to 2:30) by King's X, There, There (first 32 seconds) by Radiohead and Island Door / Paranesian Circle (e.g. 1:26 - 1:50) by Susumu Hirasawa.

Counterexamples: I Want Wind To Blow (from 2:30 till the end) by The Microphones and Concrete Treehouse / Path to the Arcade from Uneven Dream. First feels too steady and even. Second is too long, too developed.

Call-and-response ambiguity

This ambiguity triggers when we have consecutive events with clearly contrasting pitches or note patterns.

Often "call-and-response" and "fractal-like" ambiguities go together. But in rare cases they are different. A pattern like "LUB-LUB... DUB-DUB... LUB-LUB... DUB-DUB" would have a call-and-response ambiguity without having fractal-like ambiguity. Because we have clearly contrasting (in pitch) consecutive events, but the duration of the motif is clear. Other examples:

Pull Me Under (first 20 seconds) by Dream Theater. Doesn't have fractal-like ambiguity, but has call-and-response ambiguity. Note pattern at 0:09 - 0:10 & 0:19 - 0:20 "responds" to the note patterns before.
Losing It (first 24 seconds) by Rush. The first instrument has a fractal-like ambiguity, but doesn't have call-and-response ambiguity. Because the change of the note pattern is not established clearly enough.

Merging ambiguity

"Merging ambiguity" is when it feels like multiple motifs (contrasting in speed and played alternately) are merging into one.

Part 4A

Here I share audio clips which can be interpreted as having four ambiguity types related to a single motif.

Example

Be Forever (first 30 seconds) by Piknik. There's a 3-note pattern which gets repeated 8 times with different pitches. You can see that by looking at how the music is played on a normal guitar. 3 notes repeated 8 times. Then there are weird effects applied to the guitar, probably echo and a flanger effect, which create the "wowowowowow" sound. I perceive four types of ambiguity in this music:

(Local ambiguities.) 1.1. Effects applied to the 3-note pattern can be treated as additional events or not. 1.2. You can model the 3-note pattern as 3 events or as only 2. Because the last note is special. Played after a slight pause, it's like an echo. Or like a response to the first 2 notes. "TUDUM... TUM, TUDUM... TUM, TUDUM... TUM"
You can treat the 3-note pattern as the motif. Or you can treat it as only a single part of a longer motif. Because the pattern is not repeated exactly the same, it changes pitch. That's a fractal-like ambiguity.
As the 3-note pattern changes pitch, it can be interpreted as doing call-and-response with itself.
The 3-note pattern can be perceived as music which tries to continue, but can't.

Because of the special third note (see 1.2.) and flanger/echo and a sequence (see the second bullet point) Be Forever is especially self-similar.

Seagull (The Realistic Beach) from Yume 2kki is similar to Be Forever, but is weaker in every aspect. Less self-similar, less minimalistic, less echo/flanger. Music On Sand (0:50 - 1:23) by Nautilus Pompilius is similar to Be Forever too, but the local ambiguity is not emphasized.

Example

Electricity (Dr. Rockit's Dirty Kiss) (first 29 seconds) by The Avalanches and Matthew Herbert. We hear sustained notes. The note density changes significantly, from just a single note per ~6 seconds to a couple of notes per second. I perceive four elementary ambiguities in this audio clip:

(Local ambiguities.) 1.1. You can treat the moment a note is pressed and the long sustain as the same event or not. 1.2. The speed of note presses fluctuates significantly, so there are multiple ways to group those notes into events. 1.3. When the notes are played faster, you can treat them as having no sustain or having a very brief sustain.
You can treat the entire audio clip as the motif. Alternatively, you can treat it as a smaller motif repeated two or more times.
The slower note patterns (0:00 - 0:08 or 0:15 - 0:22) and the faster note patterns (0:08 - 0:15 or 0:22 - 0:24) can be interpreted as the same motif or as different motifs merging into one. The slower pattern is steady, the faster pattern is sporadic. The slower pattern reminds me of the march and the faster pattern reminds me of the choir from the original intro to the song. That's a merging ambiguity.
The motif can be interpreted as doing call-and-response with itself. Or not.

As Fire Swept Clean the Earth (first 24 seconds) by Enslaved. Similar ambiguities as in Electricity, but emphasized less. The moment a note is played seems too quiet and there's barely any change in speed. Watcher Of The Skies (up to 1:00) by Genesis. To me it's a worse version of Electricity. Too much dissonance/tension (?) which distracts from the ambiguities.

Entangled by Genesis (from 4:24 till the end) has the same merging ambiguity. The main instrument changes speed multiple times, creating the effect of multiple motifs merging into one. But I think Electricity emphasizes the ambiguity better.

Example

Doubt (first 24 seconds) by Piknik. There's a piano and the wind sound. Some effects are applied to the piano. I perceive four elementary ambiguities in this audio clip:

You can treat the piano effects and the wind sound as the same event or not.
If I'm not mistaken, the piano never plays exactly the same thing. So you can treat the whole 24 seconds as the motif. Or you can treat 0:05 - 0:13 as the motif. Or you can treat 0:05 - 0:08 as the motif.
You can treat the piano as doing some call-and-response with itself. Or not.
You can treat the piano and the wind as parts of the same motif. Or not.

Part 4B

Here I continue sharing audio clips with four ambiguity types related to a single motif.

Example

King of my castle (0:06 - 0:48) by Tiger Hifi. There are two instruments (the second appears around 0:30). I perceive four elementary ambiguities in this audio clip:

1.1. First instrument slowly fades out before repeating. You can treat the fade-out as its own event or not. Second instrument slows down before repeating. You can treat the slow-down as its own event or not. 1.2. It's unclear with how many events you should model the pattern played by the first instrument. Because it sounds like multiple echoes.
You can treat the patterns the instruments are playing as the same motif or not.
You can interpret the instruments as doing call-and-response. Or not.
It's unclear where the motif of the second instrument ends, because what's played by the second instrument subtly changes. Even if it's just increase in loudness.

Example

Rock'n'Roll Is Dead (first 21 seconds) by Aquarium. Let's imagine it's played by two guitars, the first one playing just ~2 notes at first. I perceive four elementary ambiguities in this audio clip:

Both guitars end on a fading note. You can treat the fade-out as its own event or not.
Both guitars play something very short and end on a fading note. You can perceive them as playing the same motif or not.
You can interpret the instruments as playing music which tries to continue, but can't.
You can interpret the guitars as doing call-and-response. Or not.

The Great Marsh by Camel, Silent Sorrow In Empty Boats by Genesis and The Flower Store (1:15 - 1:34) by Game Audio Ltd. AKA Keith Leary's company. To me those are worse versions of King of my castle and Rock'n'Roll Is Dead. Camel is too dissonant/ominous. Genesis doesn't have emphasized call-and-response. And in Keith Leary's case, the instruments are too independent from each other (one of them repeats 4 times before the other repeats 1 time).

Example

The Next Episode (first 10 seconds) by Dr. Dre. There's a 2-note pattern. At 0:05 a string instrument appears. There are also sustained notes, serving as the background noise. I perceive four elementary ambiguities in this audio clip:

Local ambiguities.
Matching ambiguity.
"Music which tries to continue, but can't" ambiguity.
Merging ambiguity.

Example

Title Screen (first 14 seconds) from PS1 game Jinx by Game Audio Ltd. AKA Keith Leary's company. It starts with an instrument. At 0:07 a choir gets added. I perceive four ambiguity types:

It's not clear where the motif of the instrument repeats.
The instrument can be perceived as music which tries to continue, but can't.
The instrument and the choir can be interpreted as playing the same motif. Or not. Because both can be perceived as changing speed.
The choir can be interpreted as doing call-and-response with itself. Or not.

Hub World (0:33 - 0:48) is the same, if not better.

The Ghost of Liberace (e.g. first 17 seconds) by Sparks. Angel Eyes (e.g. first 20 seconds) by Dagda. Instead of a choir there's an abstract sound in the background. To me those audio clips are worse versions of Jinx.

Example

Duvet remix (up to 2:01) by ScummV. I perceive four ambiguity types: fractal-like, call-and-response, "music which tries to continue but can't", multiple motifs merging into one. But the track is too slow, takes too long to establish all four ambiguities.

Introspectre by Depeche Mode. I perceive four ambiguity types: local, matching, call-and-response, "music which tries to continue but can't". But the track is too slow, takes too long to establish all four ambiguities.

Incantations Part Four (13:59 - 15:02) by Mike Oldfield and Time (2:09 till the end) by Dennis Wilson. Those tracks at least come close to having 4 ambiguity types.

Part 4C

Here I share audio clips which have fewer than four ambiguity types related to a single motif.

Counterexample

Losing It (first 24 seconds) by Rush, Rock'n'Roll Is Dead remix (0:23 - 1:22) by X-Mode, Everything / Epilogue by Safri Duo, A Million Miles Away / I Wish I Had A Time Machine by Edison's Children, Funky Waters / Amiss Abyss (1:17 - 1:45) by David Wise and Starless (up to 1:34) by King Crimson. I perceive the same three ambiguity types in all audio clips:

Fractal-like.
"Music which tries to continue, but can't". (less so in X-Mode)
Call-and-response. (less so in Rush)

Amo Bishop Roden (e.g. first 21 seconds) by Boards Of Canada and Dance On A Volcano (first 24 seconds) by Genesis. I perceive the same three ambiguity types in both clips: local, fractal-like, call-and-response.

Counterexample

All I Need (0:20 - 0:35) and Idioteque (0:11 - 0:39) are the most stimulating Radiohead audio clips for me.

All I Need should have the same ambiguity types as Be Forever: local, fractal-like, "music which tries to continue but can't", call-and-response. But I don't perceive the latter as emphasized enough. Maybe the weird timbre of the sound distracts me.
In Idioteque I perceive three ambiguity types: local, fractal-like, call-and-response.

Counterexample

1:24 - 1:42 or 0:35 - 0:53 (Mining Melancholy), 0:00 - 1:11 (Stickerbush Symphony), 0:00 - 0:19 (Forest Interlude). All by David Wise. I perceive three ambiguity types in all audio clips:

An audio effect which makes it unclear with how many events to model the sounds.
Fractal-like and call-and-response ambiguities.

Counterexample

Forest Temple (1:05 - 1:47) by Koji Kondo. The sound of the key instrument is too unusual, too harsh. So I can't focus on ambiguities.

Looking for Mommy (e.g. first 30 seconds) by Akira Yamaoka. The piano sounds too irregular to me. And instruments sound too independent from each other. So I can't focus on any ambiguities.

Part 4D

Meta-ambiguities

We analyzed elementary ambiguities. We also may consider "elementary meta-ambiguities".

Meta-ambiguities directly reference multiple interpretations of a track. Or reference things "outside" the track (e.g. the quality of recording).

Here I share audio clips which have a meta-ambiguity.

Example

Storm (up to 6:11) by Godspeed You! Black Emperor. I perceive three ambiguity types:

The motif can be perceived as fluctuating in speed or being static.
The motif can be perceived as "music which tries to continue, but can't".
The build-up can be interpreted as making the motif seem faster and faster. Or not.

Third bullet point is a meta-ambiguity, because it directly references multiple interpretations of the audio clip.

Example

Untitled #3 (Samskeyti) by Sigur Rós. There's a piano and a weird underlying sound. I perceive three ambiguity types:

The motif can be perceived as fluctuating in speed or being static.
The instruments can be perceived as doing call-and-response. Or not.
The build-up can be interpreted as making the motif seem faster and faster. Or not.

Third bullet point is a meta-ambiguity, because it directly references multiple interpretations of the audio clip.

Part 4E

Here I continue sharing audio clips which have a meta-ambiguity.

Example

Your Friends Are Scary (e.g. 1:12 - 2:06) by Younger Brother. We can model it as "normal instruments" playing most of the notes plus some "background sound". Besides the normal ambiguity types, there's this:

The normal instruments can be perceived as just noise, a single texture (interpretation 1), or as the fluctuating in speed motif (interpretation 2). The background sound can be perceived as speeding up (because it changes pitch) or not.
In one way the background sound makes "interpretation 1" more appealing and in another way it makes "interpretation 2" more appealing.

Second bullet point is a meta-ambiguity, because it directly references multiple interpretations of the audio clip.

Cliffhanger from a PS1 game Hugo 2. To me it's like a much worse version of Your Friends Are Scary.

Example

Yume Nikki Ending remix (0:47 - 1:12) by Aphehad. There's an instrument playing the Yume Nikki Ending theme. At 0:24 the second instrument appears. At 0:47 the vocal appears. Besides the normal ambiguity types, there's this:

The second instrument is more similar to the first in terms of timbre. But the second instrument is more synchronized with the vocal. So the second instrument is "torn" between the first and the vocal.

This is a meta-ambiguity, because it directly references multiple interpretations of the audio clip.

Example

Soap World from Yume Tagai. There's a 2-note pattern and an underlying instrument (synth?). Besides the normal ambiguity types, there's this:

Speed changes of the underlying instrument affect the perception of the speed of the 2-note pattern.

Example

Can't See (Useless) by Oingo Boingo. I hear the motif as "tick-tock TUDUDUM, tick-tock TUDUDUM, tick-tock TUDUDUM" (e.g. 1:17 - 1:33), even if it's not always played. Then there's "everything else" which creates the build-up. Besides the normal ambiguity types, there's this:

The motif can be interpreted as fluctuating in speed or just having some auditory glitch.
The build-up can be interpreted as making the motif seem faster and faster. Or not.

Example

Parade by Susumu Hirasawa. o'er the flood by Goreshit and Max Richter. Both tracks contain effects which can be perceived as "within" the track or "outside" the track.

Floods Outro by Pantera. The distortion can be perceived as existing "outside" the track. As if we're playing a damaged recording or hearing the music through an obstacle.

Part 4F

Here I share audio clips which have two meta-ambiguities related to a single motif.

Example

Agent Orange (0:36 - 1:01) by Depeche Mode. The first instrument appears at 0:15 and plays a 3-note pattern. "Chick-chick WOW... chick-chick WOW... chick-chick WOW". The second instrument appears at 0:36 and plays a 4-note pattern. Not counting the normal ambiguity types, there's this:

(Local ambiguities.) 1.1 It's unclear with how many events to model the first instrument. 1.2 The 3rd note of the first instrument may sound like an echo of the 1st or the 2nd note. So it's not clear when the event of the 3rd note begins. 1.3 The first instrument can be perceived as just an echo of the second. Or not.
The two instruments can be perceived as a fractal of echoes. First instrument is an echo of the second. 3rd note of the first instrument is an echo of a previous note.

The second bullet point and 1.2 are meta-ambiguities.

Example

Psychic Gibbon (2:02 - 3:16) by Younger Brother (a guitar cover of that exact moment). Let's model it as two instruments, a guitar (playing the motif) and a synth (making weird background sounds). The motif is a ~11-second loop. Not counting the normal ambiguity types, there's this:

The motif can be split into two parts, both about 5 seconds long. The first part can be perceived as fluctuating in speed (the hand of the guitarist changes speed). The second part can be perceived as another fluctuation in speed (the hand of the guitarist moves with a stable speed).
The fluctuations in speed can be perceived as existing on different scales of the motif or not. Like, there's speed differences between the two parts, but there's also speed differences between sub-parts of the first part... It's like a fractal.
The weird synth sound in the background can be perceived as making the motif seem faster and faster. Or not.

Second and third bullet points are meta-ambiguities.

Example

Aquatic Ambience (first 45 seconds) by David Wise (a recreation of the track where you can see its structure). We can model it as the motif and sustained sounds in the background. Not counting the normal ambiguity types, there's this:

The motif can be split into two parts. Both parts can be perceived as fluctuating in speed (pitch). And the difference between them can be perceived as another fluctuation in speed (pitch). That's interpretation 1. Interpretation 2 is that the motif is "not really music", because it's too simple and repeats too fast and is too similar to an echo.
The fluctuations in speed can be perceived as existing on different scales of the motif. Or not.
The sustained sounds in the background can be perceived as making interpretation 1 more appealing. Or interpretation 2.

Second and third bullet points are meta-ambiguities.

The Lost Jockey
