Dynamical and topological tools for (modern) music analysis

Mattia G Bergomi

2015

Doctoral thesis under joint supervision (cotutelle) between the Università degli Studi di Milano and the Université Pierre et Marie Curie

Programme in Computer Science, Scuola di Dottorato in Informatica, Università degli Studi di Milano
Speciality: Mathematics, École doctorale Informatique, Télécommunications et Électronique (Paris)

Presented by Mattia Giuseppe Bergomi
for the degree of Dottorato di ricerca of the Università di Milano and the degree of Docteur of the Université Pierre et Marie Curie

Thesis title: Dynamical and Topological Tools for (Modern) Music Analysis

Defended on 10 December 2015 before the committee composed of:

Goffredo Haus, thesis supervisor
Moreno Andreatta, thesis co-supervisor
Davide Luigi Ferrario, referee
Elaine Chew, referee
Massimo Ferri, examiner
Jean-Louis Giavitto, examiner

I read in a book that the objectivity of human thought can be expressed by using the verb to think in impersonal form. Could we ever say “it plays” as we say “it rains”, or “today it is windy”? [...] And may we also say “it listens” as we say “it rains”?
Freely translated and adapted by the author from Se una notte d’inverno un viaggiatore, Italo Calvino.

Abstract

Is it possible to represent the horizontal motions of the melodic strands of a contrapuntal composition, or the main ideas of a jazz standard, as mathematical entities? In this work, we suggest a collection of novel models for the representation of music that are endowed with two main features.
First, they originate from a topological and geometrical inspiration; second, their low dimensionality allows simple and informative visualisations to be built. Here, we tackle the problem of music representation following three non-orthogonal directions. We suggest a formalisation of the concept of voice leading (the assignment of an instrument to each voice in a sequence of chords), which offers a horizontal viewpoint on music, constituted by the simultaneous motions of superposed melodies. This formalisation naturally leads to the interpretation of counterpoint as a multivariate time series of partial permutation matrices, whose observations are characterised by a degree of complexity. After providing both a static and a dynamic representation of counterpoint, voice leadings are reinterpreted as a special class of partial singular braids (paths in the Euclidean space), and their main features are visualised as geometric configurations of collections of 3-dimensional strands.

Thereafter, we neglect this time-related information in order to reduce the problem to the study of vertical musical entities. The model we propose is derived from a topological interpretation of the Tonnetz (a graph commonly used in computational musicology) and the deformation of its vertices induced by a harmonic and a consonance-oriented function, respectively. The 3-dimensional shapes derived from these deformations are classified using the formalism of persistent homology. This powerful topological technique makes it possible to compute a fingerprint of a shape that reflects its persistent geometrical and topological properties. Furthermore, it is possible to compute a distance between these fingerprints and hence study their hierarchical organisation. This particular feature allows us to tackle the problem of automatic classification of music in an innovative way. Thus, this novel representation of music is evaluated on a collection of heterogeneous musical datasets.
Finally, a combination of the two aforementioned approaches is proposed. A model at the crossroads between the signal and symbolic analysis of music uses multiple sequence alignment to provide an encompassing, novel viewpoint on the transfer of musical inspiration among compositions belonging to different artists, genres and times. To conclude, we shall represent music as a time series of topological fingerprints, whose metric nature makes it possible to compare pairs of time-varying shapes in both topological and musical terms. In particular, the dissimilarity scores computed by aligning such sequences shall be applied to both the analysis and the classification of music.

Contents

Abstract
Introduction

I Musical and mathematical preliminaries
1 Music theory preliminaries
  1.1 Monody, polyphony and modern notation
  1.2 Voice leading practice
2 Mathematical models: state of the art
  2.1 Simplicial complexes
  2.2 The geometrical approach: continuous models
  2.3 The Tonnetz

II The horizontal dynamics of music
3 Voice leadings, partial permutations and geodesics
  3.1 Defining the voice leading
  3.2 Partial permutations
  3.3 Voice leading and piecewise geodesic paths
  3.4 Complexity of a voice leading
  3.5 Complexity analysis of two Chartres Fragments
  3.6 Rhythmic independence and rests
  3.7 Concatenation of voice leadings and time series
  3.8 Discussion
4 Voice leading and braids
  4.1 Partial singular braids
  4.2 Modelling voice leading in PSBn
5 Discussion and future works

III The vertical dynamics of music
6 Music analysis through deformations of the Tonnetz
  6.1 An anisotropic Tonnetz for music analysis
  6.2 Towards a topological classification of music
7 Topological persistence
  7.1 Simplicial homology
  7.2 From homology to persistent homology
8 A topological fingerprint for music
  8.1 Persistent homology classification of deformed Tonnetze
  8.2 Musical interpretation and persistent clustering
9 Audio feature deformation of the Tonnetz
  9.1 Computing consonance values
  9.2 Persistent homology and audio feature deformed Tonnetze
  9.3 Tonnetz deformation through triads’ consonance
  9.4 Discussion
10 Discussion and future works

IV Harmonic sequences and persistence time series
11 Harmonic time series and pop music
  11.1 Symbolic sequence alignment
  11.2 Harmonic sequences
  11.3 Applications
  11.4 Discussion and perspectives
12 Musical Persistence Snapshots
  12.1 Persistence and time varying systems
  12.2 Dissimilarity of persistence time-series
  12.3 Applications
  12.4 Discussion and perspectives

V Conclusion and future works
13 Conclusion
14 Future works
  14.1 Voice-leading modelling
  14.2 Persistent music features
  14.3 Harmonic and persistence time series

VI Appendices
A Modes and Topology
  A.1 Standard modes as superposition of chords
  A.2 Modes through graphs
B Geometric characterisation of the chord space (proof)
C Code
  C.1 Persistence algorithm
  C.2 3d deformed Tonnetz
  C.3 Persistent homology computation
  C.4 Persistent time series
D Scores
E Modern Chord Notation

List of Figures
List of Tables
Bibliography

Introduction

Modelling a creative process is a daunting task, since it is not yet possible to define an operator capable of judging its aesthetics in an objective way. This is one of the main reasons why the realisation of formal models for the analysis and classification of music is such a challenging endeavour. It is necessary to investigate the compositional process in order to provide a coherent analysis and a robust classifier of music. Often, the core of a piece of music is made of a small collection of strong, recognisable musical concepts that are grasped by the majority of listeners (Dowling, 1972; Folgieri et al., 2014; Tulipano and Bergomi, 2015).
These musical concepts are shaped by varying levels of tension over time, drawing the attention of the listener to particular moments thanks to specific choices, frustrating his or her intuition through unexpected changes, or confirming his or her expectation with, for instance, a well-known cadence leading to resolution. Our approach to the analysis of music composition stems from the assumption that it is based on two main actions used by the composer to describe musical concepts and shape his or her piece. The philosopher and musicologist Ernst Kurth, in Grundlagen des Linearen Kontrapunkts (Kurth and Rothfarb, 1991), describes counterpoint as an equilibrium between streaming linear forces (kinetic energy) and congealing harmonic forces (potential energy). These terms, which are not meant to be interpreted as scientific definitions, suggest a twofold interpretation of music. On one side, there is the horizontal point of view, intended as the behaviour of superposed independent melodic strands of counterpoint; on the other side, there is a vertical perspective where music is compressed into a harmonic form, and chords summarise the information otherwise ordered in time. From a scientific viewpoint, the analysis of music has largely been approached on the symbolic side with algebraic tools (Andreatta, 2003; Zabka, 2009) and category theory (Mazzola and Andreatta, 2006; Mazzola et al., 2002; Popoff et al., 2015), while its audio signals have been largely explored in computer science, leading to the field of Music Information Retrieval (Casey et al., 2008). Recently, the mathematical community witnessed a surprising growth of the field of Topological Persistence (d’Amico et al., 2006; Edelsbrunner and Harer, 2008). This theory provides a rigorous approach to the problem of shape recognition, making it possible to compare complex forms while giving a simple and robust representation of their geometrical and topological properties.
As the models for the analysis of audio signals take advantage of the strategies developed for image analysis (Smaragdis and Brown, 2003; Wang et al., 2003; Li et al., 2010), it is possible to borrow some tools from the topological analysis of shapes and data to tackle the problem of music analysis.

The main aim of this work lies in the introduction of low-dimensional topological and geometrical models in order to describe, albeit in an extremely simplified form, the compositional process. This task has been split into two smaller problems, following the approach described by Kurth. On one side, there is the analysis of multidimensional time series as a concatenation of events in time, which finds its natural musical counterpart in voice-leading theory. On the other side, there is the representation of persistent features of static and time-varying shapes, encoding in their geometry the information carried by the symbolic and signal-based nature of music.

The structure of this work reflects these considerations. After an introductory part, aimed at defining some basic musical and mathematical concepts, it is developed in three parts. In Parts II and III, the horizontal and vertical approaches are described separately, although they are far from being orthogonal. Consequently, two strategies considering both these aspects are proposed in Part IV. In the following paragraphs, the main contributions of each part are described.

Musical and mathematical preliminaries

In this first part, we introduce the main musical and mathematical characters that shall intervene in this work. First, a quick historical presentation of the concepts of monody and polyphony is given, and the links between these classical compositional techniques and their modern counterparts are discussed. Second, we place our research at the crossroads between mathematics and music. Two state-of-the-art approaches that inspired our investigation are discussed.
On the one hand, we introduce the geometrical representation of harmonic objects provided by the chord space (Tymoczko, 2011; Callender et al., 2008), together with the interpretation of voice leadings as trajectories in this space, which inspired the research described in Part II. On the other hand, we introduce the notions of simplex and simplicial complex, two standard objects in algebraic topology, which will be used to provide a topological definition of the Tonnetz.

Algebraic and geometric models for the voice leading theory

Given a sequence of chords, the voice leading process corresponds to their transformation into a superposition of melodies, endowed with a certain degree of independence. The main contribution of this part is the introduction of a mathematical formalisation of the concept of voice leading and the representation of simultaneous motions of voices as partial permutations. An algorithm to uniquely compute the partial permutation matrix associated with a voice leading is proposed. In particular, we demonstrate how this simple representation suffices to describe the behaviour of the voices, with a focus on voice crossing. As a geometrical counterpart to this first algebraic interpretation, voice leadings are interpreted as piecewise geodesics in several spaces. The different types of voice motions are analysed in each space, pointing out how minimal geodesic paths represent non-crossing voice leadings between two chords. Then, consecutive simultaneous motions of voices are represented as concatenations of geodesics.

Once the essential role played by the juxtaposition of voice leadings as a concatenation of linear functions has been modelled, the concatenations of geodesics are substituted by concatenations of partial permutation matrices. By associating to each permutation matrix a 4-dimensional complexity vector describing the main features of the voice leading, we provide a representation of the counterpoint of the first species.
After generalising this model to the study of the concatenation of voice leadings containing rests, a naïve extension to other contrapuntal species is suggested. Finally, in order to provide a visualisation of voice motions, voice leadings are described as trajectories in 3-dimensional Euclidean space by using the mapping which is naturally defined between partial braids and partial permutations. This representation shall prove to be very efficient for the visualisation of voice leadings between n-note chords, when a particular class of trajectories is considered.

Persistent musical features

In this section, the ordering of melodic or harmonic states that represented the core of the previous part is neglected. Music is seen as composed of vertical, unordered entities, just as a pianist could interpret a scale as a cluster in order to grasp its intervallic properties at first sight. The main idea is to introduce a metric representation of the Tonnetz interpreted as a planar polyhedral surface, whose vertices are displaced along a third dimension through a specific function. In particular, we shall consider two deformation functions. The first one is defined in the symbolic domain and takes into account the pitch classes and durations of a series of notes. The second, based on the interaction between signal and symbol, is constructed on the consonance function as defined by Plomp and Levelt (1965). The shapes obtained via these deformations are then classified by computing their persistent homology. The novel and lively field of computational topology provides a series of tools that associate a fingerprint to a shape, describing its geometrical and topological features as a simple diagram. After a preliminary section introducing the basic definitions and theorems of persistent homology, this formalism is applied to the analysis of music. The results are interpreted in a musical context.
Moreover, the distance between persistence diagrams is used to classify several datasets of compositions, modal scales and triads.

Harmonic time series and persistence snapshots

The dynamic and time-dependent nature of music is one of the main ingredients of this last part. In the first chapter, we suggest a novel approach to the analysis of pop music. At the intersection of the symbolic and signal-analysis domains, this method consists in the interpretation of automatically transcribed harmonic progressions as symbolic sequences. Such sequences shall be analysed by computing their multiple alignment. Widely used in phylogenetics, this technique shall provide an encompassing representation of the harmonic features characterising a dataset of 138 pop songs. The analysis of statistically recurrent motifs of these aligned sequences makes it possible to quantify and analyse the shared inspiration and the contamination over time among compositions.

Thereafter, we propose an adaptation of the model introduced in the previous section, in order to include time in the geometrical and topological analysis of music. Static shapes are now considered as time-varying systems, whose evolution is describable as a sequence of observations in time. Thus, we shall consider the time series formed by topological fingerprints computed on a sampling of the Tonnetz’s deformation in time. A musical interpretation of the meaning of these topology-based time series is followed by an application of this technique to a music classification task on three different datasets.

Part I
Musical and mathematical preliminaries

Table of Contents

1 Music theory preliminaries
  1.1 Monody, polyphony and modern notation
    1.1.1 Monody and lead sheet
    1.1.2 Polyphony, modal harmony and melodic voicings
  1.2 Voice leading practice
2 Mathematical models: state of the art
  2.1 Simplicial complexes
    2.1.1 Simplices
    2.1.2 Simplicial complexes
  2.2 The geometrical approach: continuous models
    2.2.1 From pitch labels to continuous frequencies
    2.2.2 Geometrisation of the chord space
  2.3 The Tonnetz
    2.3.1 An overview on tone-networks
    2.3.2 The Tonnetz as a Simplicial Complex

Abstract

The aim of this first section is to introduce the ingredients of music theory that inspired our research, in order to provide a practical musical setting for the whole work and to give some important music-oriented bibliographic references. In Chapter 1, a brief history of monody, polyphony and counterpoint, and of its relationship to the Western common practice tradition, is discussed. In Section 2.1 we introduce the concept of simplicial complex, a core object in algebraic topology representing one of the main mathematical ingredients of this work. Then, we place our investigation in the mathematical/musical domain: we introduce the chord space, which inspired the model describing the complexity of voice leading presented in Part II, and the Tonnetz, which will be used in Part III.

1 Music theory preliminaries

Monody and polyphony allow us to introduce two apparently orthogonal approaches to music analysis. The former suggests the well-known interpretation of chords as unordered sets of notes (referred to hereafter as vertical analysis). The latter can be defined as the study of voices moving independently as a superposition of melodies (referred to hereafter as horizontal analysis).
Although both approaches encode relevant information, we shall observe that it is not possible, even on a historical basis, to order them chronologically, nor to define them as independent techniques. An intuitive representation of these viewpoints is depicted in Figure 1.1. Monody can be depicted as a set of horizontal lines in simultaneous motion, while polyphonic music can be represented as a series of lines that are independent in terms of height (pitch) and time (duration). The superposition of several melodies allows the composer to enrich and emphasise a main melody, which is preferred among the others. In what follows, we provide a quick historical overview of monophony and polyphony. This section aims at supplying the reader with the basic information concerning what shall be developed in the next parts of this work, and at providing the essential music theory bibliographic references and some examples linking the classical concepts of monody and polyphony with modern music.

Figure 1.1: Intuitive representation of monody and polyphony. (a) Monody is intended as a melodic line supported by a harmonic progression. (b) The polyphonic approach allows the creation of superpositions of independent melodic strands, which affect the listener both as a whole and as separate entities.

1.1 Monody, polyphony and modern notation

1.1.1 Monody and lead sheet

Figure 1.2: An example of lead sheet.

In the fourth century, when the first monastic communities were created, psalmodic practices arose as ancestors of the Gregorian chant (Apel, 1958; Chanan, 1994). Monophonic monastic psalmody was used as a metaphor of discipline, to create and enforce the bond among the members of the community, as described in (Basil, 1963).
It is important to note that, in the context described above, monophony represents a specific choice, namely the rejection of earlier (presumably polyphonic) practices, rather than an ancestor of polyphonic music. Indeed, polyphony has never supplanted monophony in the history of Western music. While the term monophony is used to describe music consisting of a single (generally vocal) melodic part, monody is a melody sustained by a harmonic progression. This term was introduced in (Galilei, 1569) in order to describe a single voice supported by the chords played by a lute. We refer the interested reader to (Taruskin, 2009, Chapter 1) for further details about the passage from monastic psalmody to monophony.

In a modern music context, the idea of monody and its notation are widely used. It is common practice to use lead sheet notation to represent music in a concise form, as depicted in Figure 1.2. The melody is written in standard notation, while chords appear above the staff as symbols (see Appendix E for an introduction to chord notation). This kind of harmonic notation provides no information concerning the voicings that should be used, and both the rhythmical and the dynamical aspects are also neglected. In lead sheet notation, chords are represented as vertical structures. In this case, it is natural to think of them as pitch-class sets, i.e. collections of notes in which neither the octave nor the order of the notes composing the chord is specified. The mathematical model describing this construction will be detailed in Chapter 2. However, it is clear that the style and the time in which a song has been composed, arranged or re-arranged lead the performer to certain musical choices that, at least partially, fill the gaps in the notation. This vertical approach to music inspired the model that we will describe in Part III.

1.1.2 Polyphony, modal harmony and melodic voicings

Polyphony has always been present in European music.
However, we can identify the 12th century as the moment in which polyphonic composition became the standard technique in Western music. As we claimed above, polyphony and monophony are not terms in opposition, but answers to different needs. The practice of polyphony was first described in the treatise Musica Enchiriadis and its contemporary commentary Schola Enchiriadis, see (Erickson and Palisca, 1995). These treatises depict two polyphonic techniques that can be used to enrich a given melody. It is interesting to note how these two techniques can be reinterpreted in a modern context.

Figure 1.3: Chord symbols are substituted by mode names. The Law of Diminishing Returns, Alan Pasqua. Solos, part B.

Figure 1.4: Example of polyphony from Musica Enchiriadis. Transcription from (Taruskin, 2009, Chapter 2).

The first one is ison chanting, in which the tonic note of the melody, explicitly notated, is supposed to be held while the main melody is sung. In a modern music context, this type of practice is analogous to modal harmony notation. For example, consider the notation used in the solo section of The Law of Diminishing Returns, depicted in Figure 1.3. In this case, the notation specifies a particular modal choice (lydian) and its ison, i.e. the root of the modal scale and the reference pitch that makes it possible to identify the lydian mode. See Appendix A.1 for an introduction to modal theory.
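As a small illustration of how a mode symbol of the kind discussed above (for instance "E lyd") determines a set of pitch classes from its root, the following Python sketch may help. It is not taken from the thesis: the function names, the ASCII spelling of accidentals ("b" and "#") and the encoding of pitch classes as integers modulo 12 (C = 0) are assumptions made for the example. The only musical fact used is the interval pattern of the lydian mode.

```python
from typing import List

# Semitone offsets of the lydian mode above its root (whole, whole, whole,
# half, whole, whole, half).
LYDIAN_STEPS = (0, 2, 4, 6, 7, 9, 11)

# Pitch classes of the natural notes, with C = 0 (assumed convention).
NATURAL_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_class(name: str) -> int:
    """Convert a note name such as 'E', 'Ab' or 'F#' to a pitch class in 0..11."""
    pc = NATURAL_PC[name[0]]
    for accidental in name[1:]:
        pc += {"#": 1, "b": -1}[accidental]
    return pc % 12


def lydian(root: str) -> List[int]:
    """Pitch classes of the lydian scale built on the given root (the ison)."""
    r = pitch_class(root)
    return [(r + step) % 12 for step in LYDIAN_STEPS]


if __name__ == "__main__":
    for symbol in ("C", "E", "Ab"):
        print(symbol, "lyd ->", lydian(symbol))
```

The first element of each list is the root itself, which plays the role of the held reference pitch, while the remaining elements are the pitch classes available to the soloist.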
The second technique describes the harmonisation of a given melody through parallel doubling, i.e. the accompaniment of a melody with another one consisting of its transposition by a fixed consonant interval. The modern analogue of this technique is the arrangement process known as the voicing of a melody, or block voicing. This practice lies at the intersection between monody and polyphony. Given a lead sheet, the melody can be voiced using its harmonic structure (see Figure 1.5 for an example). We refer to (Wei, 2008) for a detailed explanation of this technique in its modern version and to (Taruskin, 2009, Chapter 5) for details on its classical use. It is possible to find something similar in the two excerpts of Figure 1.4. They are not examples of polyphonic composition, but a reinforcement of the vox principalis through a lower melody, the organum, producing an intuitive contrapuntal harmony. As the enrichment of a melody using voicings is strictly related to a chord, the two techniques described in Musica Enchiriadis are far from the compositional independence that characterises a polyphonic composition.

Figure 1.5: Melody voicing.

Figure 1.6: Two different harmonisations of Jerusalem. Guido d’Arezzo, Micrologus. (a) Solution 1. (b) Solution 2.

Two important innovations are described in the Micrologus (d’Arezzo et al., 1993). First, as depicted in Figure 1.6, more than one contrapuntal solution can be given as a harmonisation of the same melody. In particular, in Figure 1.6b the final is reached with a passing note and the major second is used as a secondary consonance, giving rise to a smoother passage to the final than the direct leap used in Figure 1.6a.
Second, although some intervals like the perfect fifth are still judged as hard-sounding, in the Micrologus the contrapuntal technique is based on the pleasantness of a certain harmonic choice, rather than on a natural law. Thus, not only has the process of voicing been brought to a more human level, but the concept of parsimony of voice leading is also introduced, as one may notice from the movement of the organum in both examples of Figure 1.6. At the same time, the examples given in the Micrologus stress a preference for contrary motion at cadences, while parallel doubling becomes a sporadic choice, thanks to the new degrees of freedom the voices are endowed with. See also (Rankin, 1993) for pre-Guidonian evidence of the use of parsimonious voice leadings and contrary motion. Figure 1.7 shows an example of this relative independence of voices. This independence has been inherited by modern music, where it represents the typical behaviour of the melody against a bass line: the former moves more or less freely depending on the context, while the latter is linked to the harmonic choices made by the composer. Figure 1.8 depicts bars 1-4 of Interplay by Bill Evans. The harmonic progression is stressed by the movement of the bass line and enriched with a higher voice, in a harmonisation of major and minor twelfths. This choice states both a tonal (B♭ minor) and a modal (F phrygian) choice. The melody moves with a high degree of independence, often in contrary motion and crossing the tenor voice.

Figure 1.7: Independent voice leading and contrary motion. A fragment of Alleluia: Angelus Domini, Chartres 109, fol. 75.

Figure 1.8: Polyphonic jazz standard. Segregation between the melody and the bass voices. Interplay, Bill Evans, bars 1-4.

Voice leading techniques shall be detailed later in Section 1.2. However, the degree of independence of the voices, in either rhythmical or intervallic terms, allows counterpoint to be classified into five species, depicted in Figure 1.9. To conclude this comparison between classical and modern monodic and polyphonic techniques, we show in Figures 1.10 and 1.11 how the first and the fifth contrapuntal species, respectively, have been used in two jazz compositions. The necessity of a representation of the simultaneous motion of voices, and of its visualisation, inspired the work we describe in Part II. The time-dependent nature of music suggested the time-series-oriented representation of music that will be described in Part IV.

Figure 1.9: Five different degrees of independence among voices (panels: First, Second, Third, Fourth, Fifth). From note against note in the first species of counterpoint, to the complete degree of independence of the fifth species.

Figure 1.10: A reduced orchestration (alto sax, baritone sax, trumpet, horn) of Boplicity, bars 1-4. Birth of the Cool, by Miles Davis.

Figure 1.11: Alto sax, baritone sax, trumpet and horn voices in Move, bars 1-11, by Miles Davis.

1.2 Voice leading practice

Harmony and the study of counterpoint provide some theoretical axioms to guarantee the smoothness of a composition (where smoothness is intended in this context as understandability). We refer to (Aldwell et al., 2010, Chapter 6) for a list of phenomena occurring in the voice leading process in four-part writing. The following list aims at describing some compositional strategies that shall be used in the remainder of this work.

Vocal range. Each voice has to be settled in a range that can be sung without excessive effort. Thus the construction of the melody associated with each voice has to take into account this particular feature:

i) Soprano: C4 → G6,
ii) Alto: G3 → C5,
iii) Tenor: C3 → G4,
iv) Bass: E2 → C4.

Doubling. We assume that the only absolute rule to augment the complexity of a voice leading is the doubling of a tension of a chord. The fifth can be omitted in root-position chords, since it does not add information concerning the genre of the chord. Thus, in seventh chords the doubled root could take the place of the fifth, while the doubling of a seventh has to be considered as dissonant. The third can only be omitted to achieve special effects.

Figure 1.12: Motion classes for two voices. Similar: same direction but different intervals; parallel: same direction and same intervals; oblique: only one voice is moving; contrary: opposite directions.

Microscopic spacing rules.
A wide spacing among the upper voices can create an effect of thinness, especially if it is continued for two or three chords. Normally, adjacent upper voices should not be more than an octave apart, while even a separation of two octaves is acceptable between the tenor and the bass voices. Furthermore, a soprano voice segregated from the other voices, or an excessive proximity of the tenor and the alto voices, can create confusion.

Voice crossing. Crossing occurs when two voices exchange positions. It is less problematic when it involves inner voices and lasts for a short time.

Overlap. Overlap occurs when a voice moves above or below the former state of an adjacent voice. Thus the difference between voice crossing and overlap is that in the latter the relative positions of the voices are maintained, but their ranges intersect in two consecutive moments. This practice can lead to confusing voice leadings.

Leap. The degree of complexity of a leap depends on its intervallic size and its consequent consonant or dissonant nature. Here follows a simple classification:
• Minor and major third: consonant leaps.
• Sixth or seventh: dissonant leaps, usually followed by a change of motion.
• Larger than an octave: not permitted, rarely used to create interest.
• Perfect fourth and perfect fifth: consonant and often followed by a change of motion.
Two consecutive leaps in the same direction are usually avoided, with the exception of two consecutive leaps of a third.

Melodic motion. Generally, the soprano line tends to move by conjunct motion, avoiding leaps. The bass line is normally in charge of supporting the other voices and clarifying the harmony of the piece, thus it can move by disjunct motion. Inner voices have to complete the tones of the chord framed by the bass and soprano lines. In conclusion, leaps of the soprano voice increase the complexity of a voice leading, but their complete absence would create repetitive and static melodies and would make the harmonic structure vague.

Simultaneous motion. It is possible to classify the simultaneous motion of two voices as follows (refer to Figure 1.12 for an intuitive representation):
• Similar motion: same direction, different spacing.
• Parallel motion: same direction and same spacing.
• Oblique motion: only one voice is moving.
• Contrary motion: opposite directions.
Contrary motion provides contrast and independence to the voices, creating an interesting soundscape for the listener. Parallel motion in thirds, sixths and tenths can be considered among the most powerful voice leading techniques. In some cases, parallel motion bounds the possible configurations of the voices, thus it is forbidden for unisons, octaves and fifths. Consecutive fifths and octaves by contrary motion are normally avoided. Hidden fifths and octaves are to be avoided in contexts with few voices (they are forbidden in two-part writing). A complex texture or a dissonant context mitigates the effect of parallel fifths and octaves. As a general rule, hidden octaves have to be avoided in the outer voices.

Chapter 2. Mathematical models: state of the art

In this chapter we present two important music representation models. The first is the chord space, which has the interesting mathematical structure of an orbifold. This space has been recently introduced in (Tymoczko, 2006) and it is characterised by a metric, continuous structure. The second, in a sort of mathematical opposition to the first, is the Tonnetz. It was represented, at its origin, as a table (Euler, 1739b), aiming at stressing the acoustic relationships among pitches. It has been described as an abstract graph in (Zabka, 2009). We shall suggest a topological representation of the Tonnetz. In order to safely define these music representation spaces, we shall introduce two basic concepts of algebraic topology: simplices and simplicial complexes.
2.1 Simplicial complexes

2.1.1 Simplices

A standard object in topology is the gluing diagram: a collection of topological polygons whose edges are labelled and oriented. Such a diagram represents the space obtained by gluing the sides labelled with the same letter, matching their orientations. Geometrical entities like the torus T^2, the Möbius strip M, the projective plane RP^2 and the Klein bottle K can be obtained by attaching two triangles, as depicted in Figure 2.1. Simplices generalise this idea to higher dimensions: one can think of the n-dimensional simplex as the n-dimensional analogue of the triangle.

Figure 2.1: Gluing diagrams of the torus T^2, the Möbius strip M, the projective plane RP^2 and the Klein bottle K.

Let V = {v_0, v_1, ..., v_n} be a set of points in R^m. The points in V are affinely independent if and only if the vectors v_i − v_0, for i ∈ {1, ..., n}, are linearly independent. An affine combination of the points v_i is given by

\[ x = \sum_{i=0}^{n} \alpha_i v_i \quad \text{with} \quad \sum_{i=0}^{n} \alpha_i = 1 . \]

A convex combination of the v_i is an affine combination such that α_i ≥ 0 for all i.

Definition 2.1.1. The convex hull of a set of points V = {v_0, ..., v_n} ⊂ R^m is the set of all convex combinations of points in V:

\[ C = \Big\{ \sum_{i=0}^{n} \alpha_i v_i \;\Big|\; \sum_{i=0}^{n} \alpha_i = 1,\ \alpha_i \geq 0 \Big\} . \]

Definition 2.1.2. Let V = {v_0, ..., v_n} ⊂ R^m be a set of n + 1 affinely independent points. The convex hull conv(V) is said to be a simplex of dimension n, denoted by σ = [v_0, ..., v_n].

Figure 2.2: Representation of low-dimensional simplices.

The 0-, 1- and 2-dimensional simplices are called vertices, edges and triangles. The 3-simplex, a tetrahedron, corresponds to the 3-dimensional extension of the triangle. These simplices are depicted in Figure 2.2. Let σ be a simplex generated by the set of affinely independent points V. A face τ of σ is the convex hull of a non-empty subset S ⊆ V.
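As an illustration of the definitions above, the following Python sketch checks affine independence and enumerates the faces of a simplex from its vertex labels. It is only a sketch under stated assumptions: points are given as tuples of integers or exactly representable numbers, and the helper names `rank`, `is_affinely_independent` and `faces` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_affinely_independent(points):
    """True if the vectors v_i - v_0 (i >= 1) are linearly independent."""
    v0 = points[0]
    diffs = [[a - b for a, b in zip(v, v0)] for v in points[1:]]
    return rank(diffs) == len(diffs)


def faces(vertices):
    """All faces of the simplex [v_0, ..., v_n], as tuples of vertex labels."""
    return [c for k in range(1, len(vertices) + 1)
            for c in combinations(vertices, k)]


triangle = [(0, 0), (1, 0), (0, 1)]   # spans a 2-simplex
collinear = [(0, 0), (1, 1), (2, 2)]  # does not span a 2-simplex
```

With these inputs, `is_affinely_independent(triangle)` holds while `is_affinely_independent(collinear)` does not, and `faces("abc")` lists the three vertices, the three edges and the triangle itself.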
In particular, a face is said to be proper if S is a proper subset of V. We write τ < σ if τ is a proper face of σ, and τ ≤ σ if τ is any face of σ, proper or not. The boundary of σ, denoted by bd σ, is the union of all its proper faces, and its interior is int σ = σ − bd σ.

2.1.2 Simplicial complexes

Simplicial complexes are particular collections of simplices that are closed under the operation of taking faces and in which improper intersections of simplices are forbidden. Formally, we have

Definition 2.1.3. A simplicial complex K is a finite collection of simplices such that, for σ, σ′ ∈ K:
(i) if τ < σ then τ ∈ K;
(ii) σ ∩ σ′ is a face of both simplices or empty.

The dimension of a simplicial complex K is the maximum dimension of its simplices. A subcomplex of K is a simplicial complex L ⊆ K. Particular subcomplexes of K are its skeleta: the k-skeleton is defined as the set containing all simplices of K of dimension at most k. The underlying space of K, denoted by |K|, is the polyhedron given by the union of its simplices, with the topology it inherits from R^m. A topological space X is said to be triangulable if it admits a triangulation, i.e. a homeomorphism Φ : X → |K|, where K is a simplicial complex.

The star of a simplex τ ∈ K is the set of its cofaces, i.e. St τ = {σ ∈ K | τ ≤ σ}, which is generally not a subcomplex of K. Hence, we can consider its closure, the closed star of τ, which is the smallest subcomplex of K containing St τ. The link of τ is the collection of all simplices in its closed star that do not intersect τ. See Figure 2.3 for a representation of the star and the link of a vertex of a simplicial complex.

Figure 2.3: Star and link of a vertex of a simplicial complex.
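The star, closed star and link can be computed directly on an abstract simplicial complex stored as a set of vertex sets. The following Python sketch assumes this combinatorial encoding (each simplex is a `frozenset` of vertex labels); the function names and the toy complex `K`, two triangles glued along an edge, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def closure(simplices):
    """Smallest simplicial complex containing the given simplices."""
    out = set()
    for s in simplices:
        s = frozenset(s)
        for k in range(1, len(s) + 1):
            out.update(frozenset(c) for c in combinations(s, k))
    return out


def star(K, tau):
    """Cofaces of tau: all simplices of K having tau as a face."""
    tau = frozenset(tau)
    return {s for s in K if tau <= s}


def closed_star(K, tau):
    """Smallest subcomplex of K containing the star of tau."""
    return closure(star(K, tau))


def link(K, tau):
    """Simplices of the closed star of tau that share no vertex with tau."""
    tau = frozenset(tau)
    return {s for s in closed_star(K, tau) if not (s & tau)}


# Two triangles [a, b, c] and [b, c, d] sharing the edge [b, c].
K = closure([("a", "b", "c"), ("b", "c", "d")])
```

In this toy complex the star of the vertex a consists of a, the edges ab and ac and the triangle abc, which is not closed under taking faces; its link is the opposite edge bc together with its two end-points.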
A simplicial complex of dimension 2 can be described as a purely combinatorial object: one starts with a set of vertices, then attaches edges to obtain a graph and, finally, adds triangles to the graph's structure. In the case of higher dimensional simplicial complexes, according to (Hatcher, 2002, Sec. 2.1), since the simplices of a simplicial complex K are uniquely determined by their vertices, it is possible to give a combinatorial interpretation of K as a set K_0 of vertices, together with sets K_n of n-simplices, i.e. (n + 1)-element subsets of K_0. In addition, every (k + 1)-element subset of the vertices of a simplex in K_n has to be a k-simplex, in K_k.

2.2 The geometrical approach: continuous models

2.2.1 From pitch labels to continuous frequencies

When considering equal temperament, given the fundamental frequency ν of a note it is possible to represent its pitch as a real number through the function p : (0, +∞) → R defined by

\[ p(\nu) := 69 + 12 \log_2 \frac{\nu}{440} . \tag{2.2.1} \]

Figure 2.4: The linear space of pitches and the space of pitch classes. (a) R. (b) S^1 = R/12Z.

Most listeners, whether trained or not, are sensitive not to absolute note frequencies but rather to their ratios. This suggests that a notion of distance in the mathematical space of notes should be defined in terms of ratios of their fundamental frequencies. The advantage brought by Equation (2.2.1) is that we can deal with differences, which are handier than ratios. It is thus reasonable to interpret the pitch space as the metric space (R, d), where d is the distance induced by the absolute value: d(p, q) := |p − q|. Observe that this model implies the existence of infinitely many notes between any two pitches p and q. A way to visualise this concept is to imagine a continuous glissando of an instrument such as the violin or the trombone, or even a human voice.
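Equation (2.2.1) and the distance d can be evaluated numerically as in the following Python sketch. The function names are hypothetical, and the inverse map `frequency` is an addition obtained by solving (2.2.1) for ν; frequencies are assumed to be given in hertz.

```python
import math


def pitch(freq_hz):
    """Pitch p(nu) = 69 + 12 * log2(nu / 440), as in Equation (2.2.1)."""
    if freq_hz <= 0:
        raise ValueError("the fundamental frequency must be positive")
    return 69 + 12 * math.log2(freq_hz / 440.0)


def frequency(p):
    """Inverse of pitch(): frequency in hertz of the real-valued pitch p."""
    return 440.0 * 2 ** ((p - 69) / 12)


def distance(p, q):
    """Distance d(p, q) = |p - q| on the pitch space (R, d)."""
    return abs(p - q)


a4 = pitch(440.0)    # the reference note is sent to 69
a5 = pitch(880.0)    # doubling the frequency adds 12 (one octave)
e5_just = pitch(660.0)  # a frequency ratio of 3/2 above the reference
```

The ratio 2 between 440 Hz and 880 Hz becomes the difference 12, while the ratio 3/2 becomes a difference of about 7.02, slightly wider than the 7 half-steps of the equal-tempered fifth.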
However, the values corresponding to the notes actually played in music (by a piano or a clarinet, for instance) are in fact specific integers. This is due to the choice of working in the equal temperament framework, where the octave is subdivided into 12 equally spaced subintervals, so that the frequency ratio of two consecutive semitones is 2^{1/12}. In this work, we assume continuous trajectories among notes (represented as points of a space) to be paths from one discrete state of the space to another, as they are defined by equal tuning.

In order to carry out a more qualitative and deeper analysis, and hence to reach a visualisation of the harmonic essence of a piece, we must consider pitch classes, obtained by identifying pitches modulo octave:

\[ [p] := \{\, p + 12k \mid k \in \mathbb{Z} \,\} . \tag{2.2.2} \]

This amounts to taking the quotient space R/12Z ≅ S^1 =: T^1, which we endow with the distance

\[ \bar{d}\big([p], [q]\big) := \min \{\, |p' - q'| \;\mid\; p' \in [p],\ q' \in [q] \,\} ; \]

we call (T^1, d̄) the pitch class space.

Thanks to the definitions given above, it is possible to start modelling objects belonging to the domain of harmony. Several studies aiming at a geometric description of the chord space have already been developed, in particular by D. Tymoczko and others in (Tymoczko, 2006, 2008; Callender et al., 2008; Tymoczko, 2011). In music, a chord is the simultaneous execution of two or more notes (say n, in general) modulo octave, which translates into mathematical language as an n-tuple of real numbers (i.e. a point of R^n). Since in harmony one is not sensitive to octaves when studying the relations between chords or notes, we actually think in terms of pitch classes. Hence, an n-tuple of pitch classes is, in principle, a point of the n-dimensional torus T^n. However, chords where notes (or pitch classes) are permuted are considered equivalent from the harmonic point of view.
Therefore, if we ignore the order in which the notes are arranged, we have to quotient T^n by the symmetric group S_n, which leads to the mathematical definition of chord. In what follows we shall always assume n ≥ 2.

Definition 2.2.1 (n-dimensional pitch space). A tuple of n notes (p_1, ..., p_n), where P = {p_i}_{i=1}^{n} ⊆ Z_12, is a point in the space T^n = (S^1)^n.

The idea is to neglect the order in which notes are listed in P, thus

Definition 2.2.2 (Chord space). A chord is a point in the space A_n = T^n / S_n, where S_n is the symmetric group, which acts by permutation of the coordinates:

\[ \sigma (x_1, \dots, x_n) = \big( x_{\sigma(1)}, \dots, x_{\sigma(n)} \big) . \]

2.2.2 Geometrisation of the chord space

The n-dimensional torus can be viewed as a quotient space with respect to integer translations: T^n ≅ R^n / (12Z)^n. Since the action of (12Z)^n on R^n has no fixed points, the projection π : R^n → T^n is a covering map and therefore it preserves the local topology. Furthermore, the symmetric group S_n acts on the n-torus via diffeomorphisms (isometries) by permuting the coordinates of each point. Thus A_n inherits from T^n the structure of a metric space. Moreover, since it has been obtained from a differentiable manifold through the action of a finite group, it is also an orbifold. We refer to (Thurston, 2002) for details on this topic. However, A_n is not a differentiable manifold, because the points fixed by the action of S_n are singular.¹

The following result was proven in (Slavich, 2010) and provides a geometric characterisation of the chord spaces. The proof has been rewritten in Appendix B, since the original document is written in Italian.

Theorem 2.2.1. The space of chords A_n is a metric space, obtained by gluing the (n − 1)-dimensional tetrahedral bases of a right n-dimensional prism via the equivalence relation induced by a cyclic permutation of the vertices.
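The two quotients used in this section can be made concrete in a few lines of Python. The sketch below is illustrative only: it assumes pitches are given as numbers on the scale of Equation (2.2.1), picks the sorted tuple of pitch classes as a representative of a point of A_n (one convenient choice, not the construction of the theorem above), and uses hypothetical function names.

```python
def pc_distance(p, q):
    """Distance between the pitch classes [p] and [q] on R/12Z.

    Minimising |p' - q'| over all representatives amounts to taking
    the shorter of the two arcs of the circle joining the classes.
    """
    d = (p - q) % 12
    return min(d, 12 - d)


def chord(pitches):
    """A representative of the point of A_n = T^n / S_n.

    Reducing modulo 12 forgets octaves (T^n); sorting forgets the
    order in which the notes are listed (quotient by S_n).
    """
    return tuple(sorted(p % 12 for p in pitches))


def is_singular(pitches):
    """True if at least two coordinates coincide as pitch classes."""
    c = chord(pitches)
    return len(set(c)) < len(c)


c_major_close = [60, 64, 67]  # C4, E4, G4
c_major_open = [67, 48, 76]   # G4, C3, E5: same chord, other voicing
```

Both voicings are sent to the same point `(0, 4, 7)`, whereas a voicing doubling a pitch class, such as `[60, 72, 64]`, is flagged as singular.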
¹ A point in A_n is singular if at least 2 of its coordinates have the same value: in this case the action of the permutation group admits fixed points.

Figure 2.5: The space A_3.

It is possible to characterise the points of the chord space by considering the number of repeated pitch classes they contain. For instance, the points of the space A_3 depicted in Figure 2.5 are structured as follows:
(a) the points representing chords with no repeated pitch classes lie in the interior of the prism;
(b) the chords whose representatives are tuples of the form (x, x, y) lie on the 2-dimensional faces of the prism;
(c) the edges of the prism are constituted by unisons (modulo octave).

The voice leading between two n-note chords can be represented as a trajectory in the chord space. The singular boundaries of the prism act as mirrors on the trajectory (this particular feature of the chord space will be discussed in more detail in Section 3.3). To help the reader's intuition, it is possible to think about these reflections in the simplified representation of the billiard table orbifold in Figure 2.6. The action of the group of isometries of the plane on the four sides of the table generates infinitely many collections of balls in R^2, and the edges of the rectangle R act as mirrors with respect to the trajectory of the ball.

Figure 2.6: The billiard table orbifold is generated by the group of isometries of R^2 reflecting a rectangle along its four sides. The borders of the rectangle R act as mirrors on the dashed trajectory.

2.3 The Tonnetz

The Tonnetz has been largely studied in computational musicology. Its structure mirrors the acoustical properties of pitch classes and the connections between its vertices highlight relevant tonal and harmonic objects, such as major and minor triads. In the following sections, we sketch its history and define it both as an abstract graph and as a simplicial complex.

Figure 2.7: The Euler Tonnetz. Two pitch classes are connected by an edge if they form a consonant interval. The horizontal arrow (PV) links two pitch classes a perfect fifth apart, while the two pitch classes connected by the vertical arrow (MIII) form a major third interval.

2.3.1 An overview on tone-networks

Leonhard Euler was the first to describe a Tonnetz in (Euler, 1774). Although this structure has been largely generalised, see for instance (Douthett and Steinbach, 1998; Tymoczko, 2012), the original idea was to create a diagram mirroring the acoustical proximity of the pitch classes of the chromatic scale in just intonation. This representation of the Tonnetz is depicted in Figure 2.7. Two consecutive notes on the horizontal axis, equipped with the orientations of the arrows shown in the figure, form a perfect fifth interval (PV). On the vertical axis, a pair of consecutive notes forms a major third (MIII) from top to bottom.²

The Tonnetz has inspired important modern musical models. For instance, the spiral array (Chew, 2002) (in equal temperament) can be described as a spiralisation generalising the Euler 3 × 4 diagram. It is defined as a 3-dimensional helix where the position of the i-th pitch class has Cartesian coordinates

\[ p(i) = \big( \sin(i\pi/2),\ \cos(i\pi/2),\ ih \big), \]

where h is a fixed height parameter and i ∈ Z. Hence, consecutive pitches on the helix are arranged to form perfect fifth intervals. Moreover, the periodicity of the trigonometric functions implies that π_{x,y}(p(i)) = π_{x,y}(p(i + 4)), where π_{x,y} : R^3 → R^2 is the canonical projection. Thus, two such points differ only in their last coordinate, and represent a major third interval. See Figure 2.8 for a representation of the spiral array and an example of the two configurations of pitch classes described above.
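The two properties of the helix stated above can be checked numerically. The following Python sketch uses a unit radius, as in the formula for p(i); the default value of `h`, the parameter `start` (the pitch class placed at index 0) and the function names are assumptions made only for this illustration.

```python
import math


def spiral_position(i, h=1.0):
    """Position p(i) = (sin(i*pi/2), cos(i*pi/2), i*h) of the i-th pitch class."""
    return (math.sin(i * math.pi / 2), math.cos(i * math.pi / 2), i * h)


def pitch_class_on_helix(i, start=0):
    """Pitch class at index i, consecutive indices being a perfect fifth
    (7 half-steps) apart; `start` is the pitch class at index 0."""
    return (start + 7 * i) % 12


below = spiral_position(3, h=0.5)
above = spiral_position(7, h=0.5)  # four steps further along the helix
```

The points `below` and `above` share their first two coordinates up to rounding and differ by 4h in height, and four steps of a fifth amount to 28 half-steps, i.e. 4 modulo 12, a major third.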
While the aim of Euler's Tonnetz was to represent the acoustical nearness among the 12 notes of the chromatic scale, the first infinite Tonnetz was introduced by von Oettingen in 1866. A new direction on the graph can be considered as relevant: the notes on the bottom-left/top-right diagonals of Euler's matrix form minor third intervals. Thus it is possible to extend the diagram of Euler to an infinite triangular planar lattice. To safely define the Tonnetz in the formalism of graph theory, we introduce the following definitions.

Definition 2.3.1 (Abstract graph). An abstract unoriented graph is a pair (V, E) where V is a finite non-empty set and E is a non-empty set of unordered pairs of different elements of V. Thus, an element of E is of the form {v, w}, where v and w belong to V and v ≠ w. We call vertices the elements of V and edges the elements {v, w} of E connecting v and w.

Pitches can be associated to the vertices of the Tonnetz by defining a labelling function l_V : V → L. It is clear how it is possible to associate to Euler's diagram a set of vertices, which in terms of pitch classes correspond to the chromatic scale, and to associate an edge to every pair of pitches with intervallic distance equal to 7, 3 or 4 half-steps.³ A formal definition of the Tonnetz as an abstract graph is given in (Zabka, 2009).

² A change of the orientation of the axis will reverse the intervals. A perfect fifth's inversion is a perfect fourth, while the inversion of a major third is a minor sixth.
³ We shall always consider an octave to be split into 12 half-steps.

Definition 2.3.2 (Realization of a graph). Let (V, E) be an abstract graph. A realization of (V, E) is a set of points in R^N, whose elements are associated to vertices in V, and whose edges are realized as segments joining the pairs e ∈ E. Such a realization is termed a graph. We require that the following two intersection conditions hold:
1. two edges meet either in a common end-point or not at all;
2. no vertex lies on an edge except at one of its ends.

Figure 2.8: The spiral array. Two consecutive pitch classes lying on the helix are a perfect fifth apart (considering the orientation of the curved arrow), while the vertical arrow connects two pitch classes a major third apart.

It is possible to represent the Tonnetz as a geometric realisation of an abstract graph corresponding to a 2-dimensional triangular lattice, whose edges are determined by three translation functions of the form

\[ \tau_i : \mathbb{Z}/12\mathbb{Z} \to \mathbb{Z}/12\mathbb{Z}, \qquad p \mapsto p + i \bmod 12 , \]

where i ∈ {3, 4, 5} and p belongs to L_V, the set of labels equipped with the labelling function l_V. See Figure 2.9 for a visualization of the Tonnetz. The cardinality of the set of unique vertices of the Tonnetz T(τ_1, τ_2, τ_3) is determined by the order of the translations involved in its construction. In particular, it is the maximum of the orders of the translation maps involved, and corresponds to the whole chromatic scale if and only if τ_i generates Z/12Z for some i ∈ {1, 2, 3}. For instance, T(3, 4, 5) contains the whole chromatic scale, since 5 is a generator of Z/12Z.

Figure 2.9: Realization of the Tonnetz as a tiling of the plane.

2.3.2 The Tonnetz as a Simplicial Complex

Thanks to the theory introduced in Section 2.1, it is possible to give a simplicial complex interpretation of the Tonnetz, as originally suggested in (Bigo et al., 2013). The vertices of the graph in Figure 2.9 correspond to 0-simplices and its edges to 1-simplices, while the 2-simplices are attached to the structure defined by the 1-skeleton we just provided. In particular, considering the labels inherited from the graph, the 0-simplices correspond to pitch classes, the 1-simplices to perfect fifth, major third and minor third intervals⁴ and the 2-simplices to major and minor triads. In Figure 2.10 the 2-simplices are labelled as triads.
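The combinatorial content of this construction can be explored with the following Python sketch. It is an illustration under stated assumptions: pitch classes are the integers 0 to 11 with C = 0, the function names are hypothetical, and the simplices are recorded only as sets of labels (so the result is the labelled complex on the 12 pitch classes, not the infinite planar tiling).

```python
from itertools import combinations


def tonnetz_vertices(intervals=(3, 4, 5), start=0):
    """Pitch classes reachable from `start` by repeated translations."""
    seen, frontier = {start}, [start]
    while frontier:
        p = frontier.pop()
        for i in intervals:
            q = (p + i) % 12
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen


def tonnetz_triangles(minor_third=3, major_third=4):
    """2-simplices: the minor and the major triad built on each pitch class."""
    fifth = minor_third + major_third
    tris = set()
    for p in range(12):
        tris.add(frozenset({p, (p + minor_third) % 12, (p + fifth) % 12}))
        tris.add(frozenset({p, (p + major_third) % 12, (p + fifth) % 12}))
    return tris


vertices = tonnetz_vertices()
triangles = tonnetz_triangles()
# 1-simplices are the edges of the triangles: thirds and fifths (or inversions).
edges = {frozenset(e) for t in triangles for e in combinations(sorted(t), 2)}
euler_characteristic = len(vertices) - len(edges) + len(triangles)
```

For T(3, 4, 5) the sketch finds 12 vertices, 36 edges and 24 triangles, among them C major {0, 4, 7} and C minor {0, 3, 7}. The resulting Euler characteristic is 0, which is at least consistent with the toroidal structure discussed in this section, while degenerate choices such as (4, 4, 4) reach only 3 pitch classes.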
The label corresponds to the triad generated by the superposition of the notes on the triangle's vertices. For instance, the triad of C major corresponds to [C, E, G], while C minor is associated to [C, E♭, G]. In the remainder of this work, we will refer to the Tonnetz as a simplicial complex, denoting it by T and its underlying space by |T|. In particular, we define an extended shape E of the Tonnetz as a subcomplex E ⊂ T.

⁴ Or their inversions, depending on the orientation of the edges.

Given a topological space X and a discrete group G acting on it, a fundamental domain of the action of G on X is an open set S ⊂ X such that the projection π : X → X/G is injective when restricted to S and surjective on its closure S̄. Observe that a fundamental domain of the Tonnetz corresponds to a region which is the torus generated by the major and minor third intervals. Geometrically, it is realised by identifying the horizontal and vertical edges of the square represented in Figure 2.10, according to the labels of the vertices. In the remainder of this work, we shall denote such a region by F.

Figure 2.10: The gluing diagram of the Tonnetz torus. Pitch classes correspond to 0-simplices. Each triangle represents either a major or a minor triad denoted by a bold label, with major triads indicated by capital letters.

Figure 2.11: Simple shapes and four-note chords.

Extended Shapes on the Tonnetz

The extended shape generated by a trace of the pitch classes played in a musical phrase on the Tonnetz depends on the intervals among the notes involved in the phrase. However, it would not be possible to distinguish geometrically the subcomplexes associated to a C∆ and a Cm7 (modern chord notations and the definitions of triad, seventh chord and altered chord are detailed in Appendix E), both corresponding to the subcomplex generated by attaching two adjacent triangles of T sharing an edge. In particular, more exotic chords correspond to the same shape. In Figure 2.11 some of the possible subcomplexes given by the attachment of two triangles on T are depicted. It is possible to observe in the figure that altered chords appear next to the standard ones.

The same phenomenon occurs for modes, when analysing extended shapes generated by different modal scales. In this context, we refer to a mode as a scale supported by a fundamental note or a chord defining the set of resolutions and tensions in the scale (see Appendix A for details on modern modal theory). In Figures 2.12a and 2.12b we show how the same extended shape is associated to two different modes. Figures 2.12a and 2.12b have been realised with the software Hexachord⁵ from MIDI files corresponding to the scores of Figures 2.12c and 2.12d.

Figure 2.12: Extended shapes on the Tonnetz. Two different modes are represented by the same extended shape. (a) Ionian extended shape. (b) Locrian extended shape. (c) The Ionian mode. (d) The Locrian mode.

The idea that led to the model we shall present in Part III is to define a preferred subcomplex of the fundamental domain of the Tonnetz, generated by considering the pitch classes and the durations of musical phrases.

⁵ Developed by Louis Bigo during his Ph.D. thesis and available at http://www.lacl.fr/~lbigo/recherche.

Part II
The horizontal dynamics of music: an algebraic and topological viewpoint on voice leading theory

Table of Contents

3 Voice leadings, partial permutations and geodesics
3.1 Defining the voice leading
3.2 Partial permutations
3.3 Voice leading and piecewise geodesic paths
3.4 Complexity of a voice leading
3.5 Complexity analysis of two Chartres Fragments
3.6 Rhythmic independence and rests
3.6.1 Example: the Retrograde Canon by J. S. Bach
3.7 Concatenation of voice leadings and time series
3.7.1 Dynamic Time Warping analysis
3.8 Discussion
4 Voice leading and braids
4.1 Partial singular braids
4.1.1 The braid group
4.1.2 Partial braids and partial permutations
4.1.3 Singular braids
4.1.4 Partial singular braids
4.2 Modelling voice leading in PSBn
4.2.1 Leaps
4.2.2 Partial singular braid diagrams on pitch classes
4.2.3 Concatenation of voice leadings in PSBn
5 Discussion and future works

Abstract

This part focuses on the analysis of voice leadings, i.e. the transformation of a sequence of chords into a collection of superposed melodies in simultaneous motion. In Chapter 3, the musical idea of voice leading is formalised from a mathematical viewpoint as a multiset: an unordered collection of pitches where repetitions of the same element are allowed. Thereafter, a representation of voice leadings as partial permutations is described. This algebraic approach is re-interpreted geometrically in Section 3.3. Voice leadings become geodesics and their concatenation a piecewise geodesic path in the space of pitches, in the space of pitch classes and in the chord space. The different kinds of simultaneous voice motions are analysed in each space, pointing out how minimal geodesic paths represent non-crossing voice leadings between two chords.
In Section 3.4 we show how partial permutation matrices encode the information concerning the simultaneous motions of n voices, including possible crossings among pairs of voices. We propose a method to represent the different kinds of voice leadings used in a piece as a multiset of points endowed with a multiplicity. Then, we suggest a simple extension of this model to contrapuntal species other than the first one. Sequences of voice leadings, described as 5-dimensional points, are seen as multi-dimensional time series and compared using dynamic time warping. Finally, in Chapter 4, partial singular braids are introduced as a tool for the visualisation of partial permutations and, hence, of voice leadings. Indeed, by selecting a particular class of braids it is possible to visualise voice leadings among chords of n notes in the 3-dimensional Euclidean space. Then, the first model is extended to take into account the intervallic leap of each voice. To conclude the chapter, we analyse the behaviour of the model in the space of pitch classes, on four examples previously discussed in (Tymoczko, 2011). This part represents joint work with Alessandro Portaluri and Riccardo Jadanza.

Chapter 3. Voice leadings, partial permutations and geodesics

Counterpoint represents the melodic point of view of composition, reflecting a horizontal way of thinking. In the particular case of simultaneous motion of voices, the attention is centred on the composition of multiple, independent melodies that end up forming a sequence of chords. This choice allows one to compose melodies affecting the listener both as a whole (chords) and as different autonomous fluxes of notes (parts).
In the following sections, we shall focus on the formalisation of the voice leading process, also called part writing, that is the evolution and the interaction of parts or voices in a sequence of chords.¹ Intuitively, we can think of it as the assignment of a melody to a certain instrument, when more than one melody is played by more than one instrument at the same time.²

¹ Here the term "chord" is used in the musical sense, not necessarily as a point of the space A_n.
² It is possible to think in terms of voice leading even in non-compositional contexts: for instance, a guitarist reading a score makes a part-writing choice when deciding to play a note on a certain string. Thus we can imagine the six strings as a choir composed of six singers performing together.

3.1 Defining the voice leading

In general, it is possible to describe a melody as a finite sequence of ordered pairs of pitches or pitch classes (p_i, p_{i+1})_{i∈I}, where I is a finite set of indices. See Section 2.2 for the definition of pitch and pitch class. In order to model the voice leading in a mathematical way it is necessary to introduce first the concept of multiset, a generalisation of the idea of set. This approach was already considered in (Tymoczko, 2006).

Formally, a multiset M is a pair (X, µ) composed of an underlying set X and a map µ : X → N, called the multiplicity of M, such that for every x ∈ X the value µ(x) is the number of times that x appears in M. In layman's terms, we can think of a multiset as a list in which an object can appear more than once, whilst the elements of a set are necessarily unique. As an example consider L = [a, a, a, b, b, c]. The underlying set of L is X = {a, b, c} and the multiplicity function µ takes the values µ(a) = 3, µ(b) = 2, µ(c) = 1. We define the cardinality |M| of M to be the sum of the multiplicities of the elements of its underlying set X. Observe, however, that a multiset is in fact completely defined by its multiplicity function: it suffices to set M := (dom(µ), µ).

Definition 3.1.1. Let L and M be two finite multisets such that |L| = |M|. A bijection between L and M is the multiset Φ ⊂ L × M such that Φ = {(l_1, m_1), ..., (l_n, m_n)}, where L = {l_1, ..., l_n} and M = {m_1, ..., m_n}.

If we interpret a set of n singing voices (or parts played by n instruments, or both) as a multiset of pitches of cardinality n, then a voice leading can be mathematically described as follows.

Definition 3.1.2. Let M := (X_M, µ_M) and L := (X_L, µ_L) be two multisets of pitches with the same cardinality n, and arrange their elements into n-tuples (x_1, ..., x_n) and (y_1, ..., y_n) respectively.³ A voice leading of n voices between M and L, denoted by (x_1, ..., x_n) → (y_1, ..., y_n), is the multiset Z := {(x_1, y_1), ..., (x_n, y_n)}, whose underlying set is X_Z := X_M × X_L and whose multiplicity function µ_Z is defined accordingly, by counting the occurrences of each ordered pair.

Remark 1. Observe that the definition just given is not linked to the particular type of object (pitches): it is possible to describe voice leadings also between pitch classes, for instance.

Note that it is also possible to describe a voice leading as a bijective map from the multiset M to the multiset L, i.e. as a partial permutation of the union multiset M ∪ L := (X_M ∪ X_L, µ_{M∪L}), where µ_{M∪L} := max{µ_M χ_M, µ_L χ_L} and χ_M and χ_L are the characteristic functions of X_M and X_L, respectively.⁴

3.2 Partial permutations

A partial permutation of a finite (multi)set S is a bijection between two fixed sub(multi)sets of S. For instance, such a function can be written as a string of n symbols, in which we admit ⋄ as a special character denoting the empty character. In this definition the domain of the partial permutation is constituted by the position indices of the non-empty elements of the string. For instance the string "1 1 ⋄ 2" represents the partial permutation of domain {1, 2, 4}.
The symbol 1 is fixed, 2 is mapped to 1 and 4 to 2. The corresponding cycle notation is

\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 1 & \diamond & 2 \end{pmatrix},
\]

where the two sub(multi)sets correspond to the rows of the matrix, the mapping to the columns, and ⋄ is associated with unmapped elements.

[3] These are in fact the images of two bijective maps ψ_M : {1, …, n} → M and ψ_L : {1, …, n} → L, where M and L are understood as "sets" with (possibly) repeated elements.
[4] For a multiset S we assume that µ_S(x) = 0 if x ∉ X_S. Under this assumption, the function µ_{M∪L} is defined on the entire set X_M ∪ X_L.

Remark 2. In order to be able to do computations with partial permutations, it is fundamental to fix an ordering among the elements of the union multiset M ∪ L. We give M ∪ L the natural ordering ≤ of real numbers, its elements being pitches. Indeed, in classical music with equal temperament, one defines the pitch p of a note as a function of the fundamental frequency using Equation (2.2.1). This can be done also in the case where the elements of the union multiset are pitch classes: the ordering is induced by the ordering of their representatives belonging to the same octave.

Example 3.2.1. The voice leading

\[
(G_2, G_3, B_3, D_4, F_4) \to (C_3, G_3, C_4, C_4, E_4) \tag{3.2.1}
\]

is described by the partial permutation of the ordered union multiset (G_2, C_3, G_3, B_3, C_4, C_4, D_4, E_4, F_4) defined by

\[
\begin{pmatrix}
G_2 & C_3 & G_3 & B_3 & C_4 & C_4 & D_4 & E_4 & F_4 \\
C_3 & \diamond & G_3 & C_4 & \diamond & \diamond & C_4 & \diamond & E_4
\end{pmatrix}. \tag{3.2.2}
\]

Thus, a voice leading between two multisets of n voices can be seen as a partial permutation of a multiset whose cardinality is less than or equal to 2n.

The next step is to associate a representation matrix with the partial permutation. Let V be an n-dimensional vector space over a field F and let E := {e_1, …, e_n} be a basis for V. The symmetric group S_n acts on E by permuting its elements: the corresponding map S_n × E → E assigns (σ, e_i) ↦ e_{σ(i)} for every i ∈ {1, …, n}.
We consider the well-known linear representation ρ : S_n → GL(n, F) of the group S_n given by

\[
\rho(1\;i) :=
\begin{pmatrix}
0 &   &        & 1 &        &   \\
  & 1 &        &   &        &   \\
  &   & \ddots &   &        &   \\
1 &   &        & 0 &        &   \\
  &   &        &   & \ddots &   \\
  &   &        &   &        & 1
\end{pmatrix},
\]

where the 1's in the first row and in the first column occupy the positions (1, i) and (i, 1) respectively. The map ρ sends each 2-cycle of the form (1 i) to the corresponding permutation matrix, which swaps the first element of the basis E with the i-th one. Note that each row and each column of a permutation matrix contains exactly one 1 and all its other entries are 0.

Following this idea and (Horn and Johnson, 1991, Definition 3.2.5, p. 165), we say that a matrix P ∈ Mat(m, R) is a partial permutation matrix if in every row and every column there is at most one non-zero element (equal to 1). When dealing with a voice leading M → L, the dimension m of the matrix P is equal to the cardinality of the multiset M ∪ L.

Remark 3. In general, the partial permutation matrix associated with a given voice leading is not unique. This is due to the fact that we are dealing with multisets: if M → L is a voice leading, it is possible that some components of L have the same value, i.e. that different voices are playing or singing the same note. For this reason we introduce the following convention.

Convention 1. Let M := (x_1, …, x_n) → L := (y_1, …, y_n) be a voice leading and suppose that more than one voice is associated with the same note of L. Let (x_{i_1}, …, x_{i_k}) be the pitches of M (with i_1 < ⋯ < i_k) that are mapped to the pitches (y_{j_1}, …, y_{j_k}) of L, with y_{j_1} = ⋯ = y_{j_k} and j_1 < ⋯ < j_k. In order to uniquely associate a partial permutation matrix P := (a_ij) with the above voice leading, we assign the value 1 to the corresponding entries of P by following the order of the indices, that is, by setting a_{i_1 j_1} = 1, …, a_{i_k j_k} = 1.
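The construction of the ordered union multiset and of the matrix under Convention 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used for the thesis: the note-name parser and its MIDI-style numbering are assumptions, and so is the tie-breaking rule that assigns a source pitch to the first unused copy of that pitch in the union.

```python
import re
from collections import Counter

# Assumption: note names look like "G2" or "C#5"; the numbering is MIDI-style.
_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_number(name):
    """Map a note name such as 'G2' or 'C#5' to an integer pitch."""
    match = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised pitch name: {name!r}")
    letter, accidental, octave = match.groups()
    shift = {"": 0, "#": 1, "b": -1}[accidental]
    return 12 * (int(octave) + 1) + _SEMITONES[letter] + shift


def ordered_union(source, target):
    """Ordered union multiset: each pitch occurs max(mu_M, mu_L) times."""
    mu_m, mu_l = Counter(source), Counter(target)
    union = []
    for pitch in sorted(set(mu_m) | set(mu_l)):
        union.extend([pitch] * max(mu_m[pitch], mu_l[pitch]))
    return union


def partial_permutation_matrix(source_names, target_names):
    """Matrix with a 1 at (source position, target position) in the union."""
    source = [pitch_number(s) for s in source_names]
    target = [pitch_number(t) for t in target_names]
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of voices")
    union = ordered_union(source, target)
    first = {}  # first position of each pitch in the ordered union
    for index, pitch in enumerate(union):
        first.setdefault(pitch, index)
    matrix = [[0] * len(union) for _ in union]
    used_rows, used_cols = Counter(), Counter()
    # Visit voices by increasing source pitch, so that voices ending on the
    # same note receive increasing column indices (Convention 1).
    for v in sorted(range(len(source)), key=lambda v: (source[v], v)):
        row = first[source[v]] + used_rows[source[v]]
        col = first[target[v]] + used_cols[target[v]]
        used_rows[source[v]] += 1
        used_cols[target[v]] += 1
        matrix[row][col] = 1
    return matrix


# Voice leading (3.2.1): the non-zero entries are (1,2), (3,3), (4,5), (7,6), (9,8)
# in 1-based indexing.
example = partial_permutation_matrix(
    ["G2", "G3", "B3", "D4", "F4"], ["C3", "G3", "C4", "C4", "E4"]
)
```

Working with plain nested lists keeps the sketch free of dependencies; a numerical array type would be a natural replacement when matrix products are needed.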
Thus, we shall henceforth speak of the partial permutation matrix associated with a given voice leading.

Example 3.2.2. The partial permutation matrix associated with the cycle representation of Equation (3.2.2) of the voice leading represented in Equation (3.2.1) is

\[
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

Therefore, if M → L is a voice leading, if both M and L are thought of as ordered tuples and if P is its partial permutation matrix, we have that PM = L; in addition, the "reversed" voice leading L → M is described by the transpose P^T of P: P^T L = M. This representation has the advantage of providing objects that are, in computational terms, much handier than a multiset of pairs. Algorithm 3.1 presents the pseudocode for the computation of the partial permutation matrix of a voice leading.

Algorithm 3.1 Computing the partial permutation matrix.
Input: M → L    ⊲ Source (M) and target (L) multisets describing the voice leading
Output: P    ⊲ Partial permutation matrix associated with the voice leading
    Evaluate multiplicities of all x ∈ M and all y ∈ L;
    Generate the ordered multiset U := M ∪ L;
    Initialise P ∈ Mat(|U|, R) by setting P(i, j) = 0 for all i, j;
    for i, j ∈ {1, …, |U|} do
        if U(i) → U(j) then P(i, j) = 1 end if
    end for

3.3 Voice leading and piecewise geodesic paths

We can imagine a voice leading of n voices as a sequence of n-dimensional vectors (points in R^n), whose components are the pitches associated with each note played by each voice. An important feature of this visualisation is that the melody of a certain voice is always represented by the same coordinate (say the i-th one) in every vector of the sequence: we can thus read it very simply by looking at the projections

\[
\pi_i : \mathbb{R}^n \to \mathbb{R}, \qquad (v_1, \dots, v_n) \mapsto v_i
\]

for i = 1, …, n. A useful way to represent this idea is to take an oriented segment joining two consecutive points u and v of R^n, that is, a path γ : [0, 1] → R^n given by

\[
\gamma(s) := u + s(v - u). \tag{3.3.1}
\]

Note that this is just a convenient graphical tool and does not mean at all that every point constituting the path is effectively "played": the only ones that are involved in the melody are the endpoints γ(0) = u and γ(1) = v.

The main characteristic of the path presented just above is that it is a geodesic between the points u and v, since n-dimensional Euclidean space is flat. There are infinitely many ways to connect two points in R^n, and we are not interested in the particular way they are joined. It makes sense to set the convention that they are linked in the simplest way possible; this choice will bring advantages also in the following, as the reader will see.

If we iterate this process for each note and for each voice we obtain a polygonal chain in R^n, which is not a geodesic but rather a piecewise geodesic. This is not surprising and in fact quite desirable, because if we considered a melody of more than 2 notes (per voice) and joined the endpoints with a segment, then we would lose all the information between the two, that is, we would erase the melody itself! For this reason it is meaningful to consider a concatenation of geodesics, which allows us to reproduce every step of the music.

This is the geometric representation of what has been presented above in algebraic form through partial permutation matrices. Indeed, if we consider a melody as a finite sequence of points (say k) in R^n, with n the number of voices, then we can describe it geometrically through a piecewise linear path and algebraically as the product P_k ⋯ P_1, where P_i is the partial permutation matrix of the i-th voice leading.
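The segment of Equation (3.3.1), its concatenation into a polygonal chain and the projections π_i translate directly into code. The following Python sketch is only illustrative: the function names and the numeric pitches of the two sample triads are hypothetical and are not taken from Figure 3.1.

```python
def segment(u, v, s):
    """Point gamma(s) = u + s (v - u) on the segment from u to v, 0 <= s <= 1."""
    return tuple(a + s * (b - a) for a, b in zip(u, v))


def piecewise_geodesic(chords, t):
    """Evaluate the polygonal chain through `chords` at t in [0, len(chords) - 1].

    The integer part of t selects the voice leading, the fractional part
    selects the position along the corresponding segment.
    """
    if len(chords) < 2:
        raise ValueError("at least two chords are needed")
    if not 0 <= t <= len(chords) - 1:
        raise ValueError("parameter outside the path")
    k = min(int(t), len(chords) - 2)
    return segment(chords[k], chords[k + 1], t - k)


def voice(chords, i):
    """Projection pi_i: the melody of the i-th voice (0-based index)."""
    return [chord[i] for chord in chords]


# Hypothetical progression of two triads given as numeric pitches.
chords = [(62, 65, 69), (57, 61, 64)]
midpoint = piecewise_geodesic(chords, 0.5)  # not "played", only drawn
lowest_voice = voice(chords, 0)
```

Only the values at integer parameters correspond to chords that are actually played; intermediate points such as `midpoint` exist purely for visualisation, as remarked above.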
As an example, consider the progression of triads in Figure 3.1a: each of them is represented as a triple (p_1, p_2, p_3) in R^3, with p_1 < p_2 < p_3. In general it is possible to build a voice leading by associating each note of a given chord with a note of the following one, respecting the order induced by <. This rule has been used to draw the path in Figure 3.1b.

Figure 3.1: Voice leading and corresponding piecewise geodesic path. (a) Voice leading among triads in root position (chord labels Dm, A, F♯, Bm, Am, F). (b) Graphic representation in R^3 of the above voice leading.

Let us now consider the four voice leadings

(B_3, F_5) → (C_4, E_4) and (F_5, B_3) → (E_4, C_4),
(B_3, F_5) → (E_4, C_4) and (F_5, B_3) → (C_4, E_4),

depicted in Figure 3.2: from the musical and perceptual viewpoint they are completely equivalent in pairs (each row describes the same voice leading). Generalising this fact to n voices, it is natural to identify the paths in R^n that are symmetric with respect to the diagonal of this space: if we apply the same permutation to both endpoints of the path representing a voice leading, then we obtain de facto the same voice leading.

The above discussion about symmetry shows that it makes sense to represent a voice leading of n voices as a geodesic on the Riemannian manifold with boundary R^n/S_n. In the special case n = 2 the space R^2/S_2 is isomorphic to the half-plane H := {(x, y) ∈ R^2 | x ≤ y}. Figure 3.2 shows the voice leading between two dyads in R^2.

It is possible to represent and to analyse voice leadings as paths on more harmony-oriented spaces, such as the pitch class space T^n (whose points are tuples of pitch classes) or the chord space A^n, introduced in Section 2.2.
From the harmonic point of view it is indeed admissible to ignore the octave to which a certain note of a chord belongs, and to identify each chord with the whole set of its possible voicings. We are therefore interested in geodesics on these spaces, as they will be the representation of voice leading also in this fairly general setting. The paths that we are seeking are easily constructed once we note that T^n and A^n are obtained as identification spaces from R^n. Therefore it suffices to draw the segments connecting the endpoints of the voice leading in R^n, just like before, and then project them via the covering map that gives rise to the desired space. Here are some illustrated examples.

Figure 3.2: The four possible voice leadings between the notes of the score depicted above. Observe the symmetric nature of the paths with respect to the dashed line (y = x).

Example 3.3.1 (Voice leading on T^2). In Figure 3.3a, the torus is described as a gluing space (see also Section 2.1). Thus, a trajectory crossing the upper border of the square at a point (x, u) re-enters the square at the point (x, l), where y = u and y = l are the lines on which the upper and lower horizontal edges lie, respectively. Symmetrically, the same argument holds for the vertical edges. By counting how many times a trajectory crosses the opposite edges of the square,[5] it is possible to retrieve the number of octave leaps made by one or more voices during a voice leading (which a priori is lost, since we are considering pitch classes). Let us consider for this purpose the following four voice leadings in R^2:

i) (D_0, F_0) → (E_0, G_0),
ii) (D_0, F_0) → (E_1, G_0),
iii) (D_0, F_0) → (E_0, G_1),
iv) (D_0, F_0) → (E_1, G_1).

They all represent the same voice leading (D, F) → (E, G) in the pitch class space, but their path realisations are different.
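The relation between the four lifts and their common projection can be checked numerically. The sketch below is a hypothetical illustration: it assumes the encoding pitch = 12 · octave + semitone (so D_0 = 2, F_0 = 5, E_1 = 16, and so on), and it measures the octave leap of each voice as the difference between the octave indices of its two endpoints, which is the signed number of octave boundaries lying between them.

```python
PERIOD = 12  # semitones per octave


def on_torus(chord, period=PERIOD):
    """Project a tuple of pitches to the corresponding tuple of pitch classes."""
    return tuple(p % period for p in chord)


def octave_leaps(source, target, period=PERIOD):
    """Signed number of octave boundaries between the endpoints of each voice."""
    return tuple(y // period - x // period for x, y in zip(source, target))


# Assumed encoding: pitch = 12 * octave + semitone.
source = (2, 5)  # (D0, F0)
targets = {
    "i": (4, 7),      # (E0, G0)
    "ii": (16, 7),    # (E1, G0)
    "iii": (4, 19),   # (E0, G1)
    "iv": (16, 19),   # (E1, G1)
}

projections = {name: on_torus(t) for name, t in targets.items()}
leaps = {name: octave_leaps(source, t) for name, t in targets.items()}
```

All four entries of `projections` coincide, while `leaps` records which voice leaves the 0-th octave in each case.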
Figure 3.3a displays these four paths on T^2:

Path i) is drawn as the shortest red arrow, since the jump between the dyads does not exceed the 0-th octave;
Path ii) is represented by the green arrow, exiting the square from its right side and coming back in from its left side: this reflects the fact that the first voice makes a leap of one octave;
Path iii) is associated with the blue arrow, pointing to the top and re-entering the figure from the bottom: in this case it is the second voice that jumps to the next octave;
Path iv) is rendered by the dashed red arrow: it jumps from the top to the bottom and then from the right to the left of the square, because both voices exceed the 0-th octave.

[5] It would be equivalent to consider the generators of the fundamental group of the torus π_1(T^2) = Z × Z. See (Hatcher, 2002, Ch. 1).

Figure 3.3: Voice leading paths in the pitch class space T^2 and in its relative chord space A^2. (a) Geodesics and octave leaps in T^2: the torus T^2 is represented as a square with the usual identification rule on its sides expressed by the symbols > and △. (b) Geodesics in A^2: the identification rule on A^2 is represented by the arrows on the vertical edges of the square.

Example 3.3.2 (Voice leading on A^2). In the 2-dimensional case of dyads, A^2 = T^2/S_2 is the Möbius strip. See (Bergomi et al., 2014b) for details about the construction and the positioning of the chords on the lattice. Figure 3.3b shows three different geodesic paths corresponding to the voice leading {B, F} → {C, E}, where the curly braces mean that we are identifying all possible assignments of parts to each voice. Observe the identification of the left and right sides of the square with inverted orientation and note the singular boundary of the unisons, constituted by the upper and lower sides of the square.
As we intuitively described in Section 2.2, the paths bounce back when touching the singular boundary because of the quotient by the symmetric group. Here, it is clear that harmony is favoured over melody, as neglecting both octaves and ordering leads us to focus on the ensemble of voices.

These examples share and show one important feature: the shortest paths joining two pairs of pitch classes in T^2 or two dyads in A^2 (the minimal geodesics between those two points) represent voice leadings with neither crossings nor octave leaps, whilst the paths that touch the singular boundary correspond to part writings where at least one of these phenomena occurs. Although voice crossing is not advised as a standard practice in harmony manuals, it is a useful technique to avoid repeated notes, parallel fifths and hidden octaves and to ensure a high degree of independence for each voice. For further details on orchestration and the use of voice crossing, see (Prout, 2012; Boland and Link, 2012; Russo, 1997; Sussman and Abene, 2012; Notley, 2007).

3.4 Simultaneous motions of the voices and complexity of a voice leading

We have seen in the previous section how the partial permutation matrix associated with a voice leading contains the information describing the path leading from one note to the next for each voice. Here, we are going to illustrate that, in fact, the tool that we have built also encodes the direction of motion of the different voices, including their crossings. On the one hand, in music one distinguishes between three main behaviours (cf. Figure 1.12; here, we omit parallel motion because it is not involved in our analysis):

(i) Similar motion, when the voices move in the same direction;
(ii) Contrary motion, when the voices move in opposite directions;
(iii) Oblique motion, when only one voice is moving.
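For two voices, the three behaviours listed above can be told apart from the sign of each voice's displacement. The following Python sketch is a minimal illustration under stated assumptions: pitches are given as numbers (MIDI-style values are used in the examples), the function name is hypothetical, and the extra label "no motion" for two stationary voices is our own addition.

```python
def sign(d):
    """Return -1, 0 or 1 according to the sign of d."""
    return (d > 0) - (d < 0)


def classify_motion(source, target):
    """Classify the relative motion of exactly two voices.

    Parallel motion is not distinguished from similar motion,
    in line with the three behaviours considered in the text.
    """
    if len(source) != 2 or len(target) != 2:
        raise ValueError("this sketch handles exactly two voices")
    d1, d2 = (sign(y - x) for x, y in zip(source, target))
    if d1 == 0 and d2 == 0:
        return "no motion"
    if d1 == 0 or d2 == 0:
        return "oblique"
    return "similar" if d1 == d2 else "contrary"


# (F4, C4) -> (G4, D4), (A4, E4) -> (G4, F4), (D4, D4) -> (D4, F4)
examples = [
    classify_motion((65, 60), (67, 62)),
    classify_motion((69, 64), (67, 65)),
    classify_motion((62, 62), (62, 65)),
]
```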
On the other hand, with reference to a partial permutation matrix (a_ij), it is possible to describe the motion of a voice by noting three conditions, which are immediate consequences of the ordering of the union multiset:

1) If there exists an element a_ij = 1 for i < j then the i-th voice is moving "upwards";
2) If there exists an element a_ij = 1 for i > j then the i-th voice is moving "downwards";
3) If there exists an element a_ii = 1 then the i-th voice is constant.

The connection between the two worlds is the following:

• If either Condition 1) or Condition 2) is satisfied by two distinct elements then we have similar motion;
• If both Condition 1) and Condition 2) hold for two distinct elements then we are facing contrary motion;
• The case of oblique motion involves Conditions 1) and 3) or Conditions 2) and 3), for at least two distinct elements.

As we mentioned in Section 1.2, voice crossing is a particular case of these motions where the voices swap their relative positions. This phenomenon can be described in terms of multisets as follows.

Definition 3.4.1. Let (x_1, …, x_n) → (y_1, …, y_n) be a voice leading (n ∈ N). If there exist two pairs (x_i, y_i) and (x_j, y_j) such that x_i < x_j and y_i > y_j, or such that x_i > x_j and y_i < y_j, then we say that a (voice) crossing occurs between voice i and voice j.

The partial permutation matrix retrieves this information, as the following proposition shows.

Proposition 3.4.1. Consider a voice leading of n voices and let P := (a_ij) be its associated partial permutation matrix. Choose indices i, j, k, l ∈ {1, …, n} such that a_ij = 1 and a_kl = 1. Then there is a crossing between these two voices if and only if one of the following conditions holds:

i) i < k and j > l;
ii) i > k and j < l.
Furthermore, the total number of voices that cross the one represented by a_ij is equal to the number of 1's in the submatrices (a_rs) and (a_tu) of P determined by the following restrictions on the indices: r > i, s < j and t < i, u > j.

Proof. In a partial permutation matrix the row index of a non-zero entry denotes the initial position of a certain voice in the ordered union multiset, whereas the column index of the same entry represents its final position after the transition. It is then straightforward from Definition 3.4.1 that for a voice crossing to exist either condition i) or condition ii) must be satisfied. Every entry a_kl satisfying one of those conditions refers to a voice that crosses the one represented by a_ij. Hence, the number of crossings for a_ij equals the number of 1's in positions (r, s) such that r > i and s < j, added to the number of 1's in positions (t, u) such that t < i and u > j.

Remark 4. The fact that the number of crossings with a given voice equals the number of 1's in the submatrices determined by the entry corresponding to that voice (as explained in the previous proposition) holds true only because we assumed Convention 1. Indeed, if we did not make such an assumption, the submatrices could contain positive entries referring to voices that end on the same note but do not produce crossings.

From what we have shown thus far it emerges that it is possible to give a qualitative description of a voice leading by counting the voices that are moving upwards, those that are moving downwards, those that remain constant and the number of crossings. We summarise these features in a 4-dimensional complexity vector c defined by

\[
c := (\#\text{upward voices},\ \#\text{downward voices},\ \#\text{constant voices},\ \#\text{crossings}), \tag{3.4.1}
\]

so that we are now able to classify and distinguish voice leadings by simply looking at these four aspects.

Remark 5.
The notion of complexity defined above is neither equivalent nor related to the standard definitions of complexity.

Example 3.4.1.
Similar motion. The voice leading (C_1, E_1, G_1) → (D_1, F_1, A_1) is represented by

\[
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

and its complexity vector is (3, 0, 0, 0).

Oblique motion. The voice leading (G_2, G_2, C_3) → (C_3, C_3, C_3) is associated with

\[
\begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

and its complexity vector is (2, 0, 1, 0).

Voice crossing. The voice leading (C_1, E_1, G_1) → (G_1, C_1, E_1) is represented by

\[
\begin{pmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{pmatrix}
\]

and its complexity vector is (1, 2, 0, 2).

By virtue of these tools it is straightforward to analyse an entire first species counterpoint by considering the concatenation of its voice leadings, from which we retrieve a sequence of complexity vectors. This last piece of information can be visualised as a set of points in a 4-dimensional space, or rather as one or more of its 3-dimensional projections (see Section 3.5). In fact, if one wants to represent the complexity of the whole composition as a point cloud, one should take into account that different matrices can produce the same complexity vector. Therefore, we have a multiset of points in R^4 (with non-negative integer components).

Figure 3.4: Alleluia, Angelus Domini, Chartres fragment n. 109, fol. 75.

3.5 Complexity analysis of two Chartres Fragments

We are going to analyse two pieces that are parts of the Chartres Fragments, an ensemble of compositions dating back to the Middle Ages: Alleluia, Angelus Domini and Dicant nunc Judei. Both of them are counterpoints of the first species and involve only two voices.
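The complexity vector of Formula (3.4.1) can also be computed directly from the pitches, without building the matrix first. The Python sketch below is an illustration under stated assumptions rather than the procedure used in the thesis: pitches are encoded as 12 · octave + semitone, the function names are hypothetical, and crossings are counted with the strict inequalities of Definition 3.4.1, so two voices that arrive on the same note are never counted as crossing (a matrix-based computation may treat such ties differently).

```python
from collections import Counter
from itertools import combinations


def complexity_vector(source, target):
    """Return (#upward, #downward, #constant, #crossings) for a voice leading."""
    pairs = list(zip(source, target))
    up = sum(y > x for x, y in pairs)
    down = sum(y < x for x, y in pairs)
    constant = sum(y == x for x, y in pairs)
    crossings = sum(
        1
        for (x1, y1), (x2, y2) in combinations(pairs, 2)
        if (x1 < x2 and y1 > y2) or (x1 > x2 and y1 < y2)
    )
    return (up, down, constant, crossings)


def complexity_multiset(chords):
    """Multiset of complexity vectors of consecutive chords, as a Counter."""
    return Counter(complexity_vector(a, b) for a, b in zip(chords, chords[1:]))


# The three cases of Example 3.4.1 (assumed encoding: C1 = 12, E1 = 16, ...).
similar = complexity_vector((12, 16, 19), (14, 17, 21))
oblique = complexity_vector((31, 31, 36), (36, 36, 36))
crossing = complexity_vector((12, 16, 19), (19, 12, 16))
```

Applying `complexity_multiset` to a whole first species counterpoint gives the multiset of points in R^4 discussed above, with the Counter values playing the role of the multiplicities.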
The musical interest in these compositions consists in the introduction of a certain degree of independence between the voices and the use of a parsimonious voice leading, i.e. an attempt to make the passage from a melodic state to the next as smooth as possible. Note how the independence of the voices is reflected by the presence of contrary motions and crossings, which can then be interpreted as a rough measure of this feature. For a complete treatise on polyphony and a historical overview we refer the reader to Taruskin (2009).

In what follows, we represent the multiplicity of each complexity vector c as a circle of centre c ∈ R^4 and radius equal to the normalised multiplicity µ(c)/n of c, where µ(c) is the number of occurrences of c in the analysed piece and n is the total number of notes played or sung by each voice in the whole piece.

Alleluia, Angelus Domini. The fragment under examination is depicted in Figure 3.4; here is the list of its first four voice leadings, as they are generated by the pseudocode described in Algorithm 3.1:

    Voice Leading: ['F4', 'C4'] ['G4', 'D4']
    [2, 0, 0, 0] - similar motion up
    Voice Leading: ['G4', 'D4'] ['A4', 'E4']
    [2, 0, 0, 0] - similar motion up
    Voice Leading: ['A4', 'E4'] ['G4', 'F4']
    [1, 1, 0, 0] - contrary motion
    Voice Leading: ['G4', 'F4'] ['F4', 'G4']
    [1, 1, 0, 1] - contrary motion - 1 crossing

Table 3.1 contains the complexity vectors and their occurrences in the piece. The point cloud associated with this multiset is represented in Figure 3.5.

Figure 3.5: Three-dimensional projections of the complexity cloud of the paradigmatic voice leading Alleluia, Angelus Domini. The radius of each circle represents the normalised multiplicity of the corresponding complexity vector. (a) Projection neglecting the crossing component of the complexity vectors. (b) Projection neglecting the constant voices component of the complexity vectors.
Observe how the projection that neglects the component of c corresponding to the number of constant voices (Figure 3.5b) gives an immediate insight into the relevance of voice crossing in the piece.

Figure 3.6: Dicant nunc Judei, Chartres fragment.

Dicant nunc Judei. The first part of the output of Algorithm 3.1 produces the following analysis:

    Voice Leading: ['F4', 'C4'] ['G4', 'E4']
    [2, 0, 0, 0] - similar motion up
    Voice Leading: ['G4', 'E4'] ['F4', 'D4']
    [0, 2, 0, 0] - similar motion down
    Voice Leading: ['F4', 'D4'] ['E4', 'C4']
    [0, 2, 0, 0] - similar motion down
    Voice Leading: ['E4', 'C4'] ['D4', 'D4']
    [1, 1, 0, 1] - contrary motion - 1 crossing

The complexity vectors arising in the whole piece and their multiplicities are again collected in Table 3.1; see Figure 3.7 for a visualisation of the point cloud describing the piece. Note how voice crossing is more prominent than in the point cloud describing Alleluia, Angelus Domini. In addition, the point (0, 0, 0) in Figure 3.7b corresponds to the point (0, 0, 2, 0) ∈ R^4, which represents trivial voice leadings where neither part varies.

Figure 3.7: Three-dimensional projections of the complexity cloud of the paradigmatic voice leading Dicant nunc Judei. The radius of each circle represents the normalised multiplicity of the corresponding complexity vector. (a) Projection neglecting the crossing component of the complexity vectors. (b) Projection neglecting the constant voices component of the complexity vectors.
Table 3.1: Complexity vectors of the analysed fragments and their occurrences.

    Alleluia, Angelus Domini        Dicant nunc Judei
    c               µ(c)            c               µ(c)
    (0, 1, 1, 0)    2               (0, 0, 2, 0)    1
    (0, 1, 1, 1)    2               (0, 2, 0, 0)    7
    (0, 2, 0, 0)    4               (1, 0, 1, 0)    5
    (1, 0, 1, 1)    2               (1, 0, 1, 1)    1
    (1, 1, 0, 0)    6               (1, 1, 0, 0)    9
    (1, 1, 0, 1)    4               (1, 1, 0, 1)    15
    (2, 0, 0, 0)    4               (2, 0, 0, 0)    4

3.6 Rhythmic independence and rests

The examples analysed in Section 3.5 are counterpoints of the first species, the simplest case, in which the voices follow a note-against-note flow. It is, however, possible to study more complex scenarios by introducing rhythmic independence between voices and rests in the melody, reducing non-simultaneous voices to the simplest case.

If the voices play at different rhythms or follow rhythmically irregular themes, we consider the minimal rhythmic unit u appearing in the phrase and homogenise the composition based on that unit: if a note has duration ku, with k ∈ N, we represent it as k repeated notes of duration u (see Figure 7.4 for an example). This transformation of the original counterpoint introduces only oblique motions and does not alter the number of the other types of motion.

Figure 3.8: Reduction of rhythmically independent voices to a counterpoint of the first species. (a) Counterpoint of the fifth species. (b) Reduction to the first species.

In musical terms, if a voice is silent it is neither moving nor constant and it cannot cross other voices. Therefore, in order to include rests in our model it is necessary to slightly modify Algorithm 3.1 by introducing a new symbol (p) in the dictionary of pitches. We also choose to indicate a rest in the matrices associated with a voice leading by the entry −1. We adopt the following convention concerning the ordered union multiset.

Convention 2.
We choose rests to be the last elements in the ordered union multiset associated with a voice leading. In other words, we declare p to be strictly greater than any other pitch symbol.

Example 3.6.1. The voice leading (p, D_4, D_5) → (D_4, C_3, C_3) corresponds to the matrix

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0
\end{pmatrix}.
\]

Remark 6. Note that when introducing the −1's in the matrix associated with a voice leading, we are no longer dealing with partial permutation matrices. However, to study voice leadings with rhythmic independence of the voices as before (thus ignoring rests) it is enough to consider the submatrix obtained by deleting all rows and columns containing −1 (which is again a partial permutation matrix).

We extend the complexity vector defined previously in Formula (3.4.1) by adding a fifth component that counts the number of voices that are silent at least once in the voice leading, i.e. it counts the number of negative (−1) entries of the associated matrix. Furthermore, we slightly modify the notion of normalised multiplicity of a complexity vector c, needed for the representation of the complexity of a piece in the form of a point cloud, by dividing the number µ(c) of occurrences of c in the piece by the total number of notes per voice after the homogenisation.

3.6.1 Example: the Retrograde Canon by J. S. Bach

We consider the Retrograde Canon (also known as Crab Canon), a palindromic canon with two voices belonging to the Musikalisches Opfer by J. S. Bach, the beginning of which is reproduced in Figure 3.9.

Figure 3.9: The Retrograde Canon (bars 1–4), a palindromic canon belonging to the Musikalisches Opfer by J. S. Bach, and its reduction to first species counterpoint (unisons have been omitted).

Table 3.2: Complexity vectors of the Retrograde Canon and their occurrences.

    Retrograde Canon
    c                  µ(c)         c                  µ(c)
    (0, 0, 1, 0, 1)    2            (1, 0, 0, 0, 1)    2
    (0, 0, 2, 0, 0)    8            (1, 0, 1, 0, 0)    43
    (0, 1, 0, 0, 1)    2            (1, 0, 1, 1, 0)    1
    (0, 1, 1, 0, 0)    43           (1, 1, 0, 0, 0)    14
    (0, 1, 1, 1, 0)    1            (1, 1, 0, 1, 0)    3
    (0, 2, 0, 0, 0)    11           (2, 0, 0, 0, 0)    11

We homogenise the rhythm by expressing each note in eighths and we apply Algorithm 3.1. Here is the output for the first four meaningful voice leadings:

    Voice Leading: ['D4', 'D4'] ['D4', 'F4']
    c = [1, 0, 1, 0, 0] - oblique motion
    Voice Leading: ['D4', 'F4'] ['F4', 'A4']
    c = [2, 0, 0, 0, 0] - similar motion up
    Voice Leading: ['F4', 'A4'] ['F4', 'D5']
    c = [1, 0, 1, 0, 0] - oblique motion
    Voice Leading: ['F4', 'D5'] ['A4', 'C#5']
    c = [1, 1, 0, 0, 0] - contrary motion

Table 3.2 collects the complexity vectors and their multiplicities; they are displayed in the form of point clouds in Figure 3.10.

Figure 3.10: Three-dimensional projections of the 5-dimensional point cloud representing the complexity of the Retrograde Canon. The radius of each circle represents the normalised multiplicity of each complexity vector. (a) Projection on the first three components of the complexity vector. (b) Projection on the upward, downward and crossing components of c. (c) Projection on the upward, downward and rest components of c.

3.7 Concatenation of voice leadings and time series

The paradigmatic point cloud associated with a voice leading gives a useful 3-dimensional representation of the piece; however, this analysis is just structural, as it does not take into account the way in which voice leadings have been concatenated by the composer. It is possible to introduce this temporal dimension by looking at the sequence of complexity vectors from a different viewpoint.
The concatenation of observations in time can be seen as a time series, that is, a sequence of observations ordered according to time. In our case each piece of music can be described as a 5-dimensional time series, whose observations are the complexity vectors associated with each voice leading. To compare such series we use the so-called Dynamic Time Warping (DTW), a method for comparing time-dependent sequences of different lengths: it returns a measure of similarity between two given sequences by "warping" them non-linearly along the temporal axis (see Figure 3.11 for an intuitive representation). We invite the reader to consult Senin (2008) for a detailed review of DTW algorithms.

Figure 3.11: Dynamic Time Warping between two series of observations.

3.7.1 Dynamic Time Warping analysis

Let F be a set, called the feature space, and take two finite sequences $X := (x_1, \dots, x_n)$ and $Y := (y_1, \dots, y_m)$ of elements of F, called features (here n and m are natural numbers). In order to compare them, we need to introduce a notion of distance between features, that is, a map $C : F \times F \to \mathbb{R}$, also called a cost function, that meets at least the following requirements:

i. $C(x, y) \geq 0$ for all $x, y \in F$;
ii. $C(x, y) = 0$ if and only if $x = y$;
iii. $C(x, y) = C(y, x)$ for all $x, y \in F$.

Now, if we apply C to the elements of X and Y, we can arrange the values in an n × m real matrix $\mathbf{C} := \big(C(x_i, y_j)\big)$, where i ranges in {1, ..., n} and j in {1, ..., m}. An (n, m)-warping path in $\mathbf{C}$ is a finite sequence $\gamma := (\gamma_1, \dots, \gamma_l)$, with $l \in \mathbb{N}$, such that:

1. $\gamma_k := (\gamma_k^x, \gamma_k^y) \in \{1, \dots, n\} \times \{1, \dots, m\}$ for all $k \in \{1, \dots, l\}$;
2. $\gamma_1 := (1, 1)$ and $\gamma_l := (n, m)$;
3. $\gamma_k^x \leq \gamma_{k+1}^x$ and $\gamma_k^y \leq \gamma_{k+1}^y$ for all $k \in \{1, \dots, l - 1\}$;
4. $\gamma_{k+1} - \gamma_k \in \{(1, 0), (0, 1), (1, 1)\}$ for all $k \in \{1, \dots, l - 1\}$.
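Conditions 1 to 4 can be checked mechanically, and the minimum over all warping paths of the sum of the costs $C(x_i, y_j)$ collected along the path can be computed with the usual dynamic-programming recursion. The following Python sketch is only an illustration under stated assumptions: the function names `is_warping_path` and `dtw` are hypothetical, the cost function defaults to the Euclidean distance, and no normalisation is applied, so it is not claimed to reproduce the values reported for the three pieces analysed in this chapter.

```python
import math


def is_warping_path(path, n, m):
    """Check conditions 1-4 for a list of 1-based index pairs."""
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False  # condition 2 (boundary)
    if any(not (1 <= i <= n and 1 <= j <= m) for i, j in path):
        return False  # condition 1 (indices in range)
    steps = {(1, 0), (0, 1), (1, 1)}
    # condition 4 (admissible steps), which also implies condition 3
    return all((i2 - i1, j2 - j1) in steps
               for (i1, j1), (i2, j2) in zip(path, path[1:]))


def dtw(X, Y, cost=math.dist):
    """Return (minimum accumulated cost, one optimal warping path)."""
    n, m = len(X), len(Y)
    # D[i][j]: minimum accumulated cost of a path from (1, 1) to (i, j)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = cost(X[i - 1], Y[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from (n, m) to (1, 1) through the cheapest predecessors.
    path, (i, j) = [(n, m)], (n, m)
    while (i, j) != (1, 1):
        preds = [(a, b) for a, b in ((i - 1, j - 1), (i - 1, j), (i, j - 1))
                 if a >= 1 and b >= 1]
        i, j = min(preds, key=lambda p: D[p[0]][p[1]])
        path.append((i, j))
    return D[n][m], path[::-1]


# Two toy series of 5-dimensional complexity vectors (illustrative data only).
X = [(1, 0, 1, 0, 0), (2, 0, 0, 0, 0)]
Y = [(1, 0, 1, 0, 0), (1, 0, 1, 0, 0), (2, 0, 0, 0, 0)]
distance, path = dtw(X, Y)
```

In this toy example the second series merely repeats the first observation, so the warping absorbs the repetition and the accumulated cost vanishes.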
The total cost of an (n, m)-warping path γ over the sequences X and Y is defined as

$$C_\gamma(X, Y) := \sum_{k=1}^{l} C\big(x_{\gamma_k^x}, y_{\gamma_k^y}\big).$$

An optimal warping path on X and Y is a warping path realising the minimum total cost (see Figure 3.12). We are now ready to define the DTW distance between X and Y:

$$DTW(X, Y) := \min \{\, C_\gamma(X, Y) \mid \gamma \text{ is an } (n, m)\text{-warping path} \,\}.$$

Remark 7. Note that the minimum always exists because the set of (n, m)-warping paths is finite.

We computed the DTW distance between each pair of the three examples that we analysed in Sections 3.5 and 3.6.1, choosing as cost function the Euclidean distance in $\mathbb{R}^5$. We embedded the 4-dimensional complexity vectors in $\mathbb{R}^5$ by adding a fifth component and setting it to 0. The results of the comparison are shown in Table 3.3. Although we analysed only three compositions, the DTW distance already separates the two pieces belonging to the Chartres fragments (Alleluia and Dicant, at distance 0.62 from each other) from the Canon (at distances 1.34 and 1.16, respectively).

Figure 3.12: Optimal warping path on Alleluia, Angelus Domini and Dicant nunc Judei.

Table 3.3: DTW distance matrix for the three time series of complexity vectors.

                Alleluia    Dicant    Canon
    Alleluia      0.00       0.62      1.34
    Dicant        0.62       0.00      1.16
    Canon         1.34       1.16      0.00

3.8 Discussion

Our analysis showed that our definition of complexity in terms of the relative movements of the voices, and especially of crossings, is suitable for characterising a musical piece. The point-cloud representation yields a "photograph" of complexity, a sort of fingerprint that lets the main features of the examined composition emerge clearly, noticeable even at first glance. Although the extension of this method to the whole set of contrapuntal species forces a naïve simplification of the compositions, it provides a measure of their dissimilarity.
The time series of complexity vectors associated to a composition captures the organisation of voice leadings in time, encoding the information concerning the motion of each voice, the overall configuration of the voices (relative motions and crossings) and the distribution of rests in time and among the parts. The DTW provides a direct measure of the dissimilarity between the complexity time series of two pieces. The optimal warping path points out the regions of the compositions that can be considered comparable with respect to the set of properties listed above.

Four
A braid-oriented visualisation of voice leadings

Partial permutations and their representation as low-dimensional sparse matrices are a handy computational tool to describe voice leadings; however, they cannot provide an intuitive visualisation of the horizontal motions of the voices. When describing voice leadings, musicologists often refer to the superposed melodies as strands (Kurth and Rothfarb, 1991; Lester, 1994; Larson, 2012). Following this natural approach, we shall shortly define simultaneous melodic strands as a set of 3-dimensional paths that is representative of a partial permutation. Furthermore, by allowing these strands to intersect either at their starting or ending point, it is possible to visualise unisons. This convenient representation of repeated pitches shall allow us to represent additional information such as leaps among voices. Finally, we take advantage of this representation in order to tackle the problem of the interpretation of voice leadings between pitch-class sets.

4.1 The braid group and the partial singular braid monoid

In this section the basic background concerning braids and partial singular braids is recalled. Our main references are (Hansen, 1989) and (East, 2007, 2010).

4.1.1 The braid group

Definition 4.1.1. A braid β on n strands is a collection of embeddings $B := \{\beta^\alpha : [0, 1] \to \mathbb{R}^3,\ \alpha \in \{1, \dots, n\}\}$, with disjoint images, such that:

• $\beta^\alpha(0) = (0, \alpha, 0)$;
• $\beta^\alpha(1) = (1, \tau(\alpha), 0)$ for some permutation τ;
• the image of each $\beta^\alpha$ is transverse to all planes {x = const}.

Definition 4.1.2. Two braids are said to lie in the same topological braid class if they are homotopic relative to the endpoints in the sense of braids: one can deform a braid into the other without any intersection among the strands.

Figure 4.1: Concatenation of braids. (a) First braid. (b) Second braid. (c) Their concatenation.

Figure 4.2: Graphical representation of the braid properties in Equations (4.1.1) and (4.1.2). (a) Action of $\sigma_i$ before $\sigma_j$. (b) $\sigma_j$ before $\sigma_i$. (c) $\sigma_i \sigma_{i+1} \sigma_i$. (d) $\sigma_{i+1} \sigma_i \sigma_{i+1}$.

There is a natural group structure on the space $B_n$ of topological braids with n strands, given by concatenation (Figure 4.1). Using the generators $\sigma_i$, each of which interchanges the i-th and (i + 1)-th strands with a positive crossing, yields a presentation for $B_n$. Let

$$p_1 : \sigma_i \sigma_j = \sigma_j \sigma_i, \qquad |i - j| > 1; \qquad (4.1.1)$$
$$p_2 : \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}, \qquad i < n - 1. \qquad (4.1.2)$$

These two properties are depicted in Figure 4.2. We are ready to give a presentation of the group $B_n$ as

$$B_n := \langle \sigma_1, \dots, \sigma_{n-1} : p_1 \text{ and } p_2 \text{ hold} \rangle.$$

Let $s_i$ be the transposition (i i+1). The symmetric group $S_n$ can be presented as

$$\langle s_1, \dots, s_{n-1} \mid s_i s_j = s_j s_i \text{ for } |i - j| > 1,\ s_i^2 = 1,\ s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1} \text{ for } 1 \leq i < n \rangle,$$

thus the projection of the group of braids on n strands onto the symmetric group is given by

$$\pi : B_n \to S_n, \qquad \sigma_i \mapsto (i \ \ i + 1).$$

Figure 4.3: A partial braid $\beta \in IB_5$.

4.1.2 Partial braids and partial permutations

The braid group is too structured to represent generic voice leadings, where voices can collide in unisons or rest. Our solution is to consider a weaker structure in which strands are endowed with higher degrees of freedom.
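The projection $\pi : B_n \to S_n$ recalled above forgets the sign of each crossing and only remembers where every strand ends. A minimal Python sketch follows; the function name `braid_word_to_permutation` and the encoding of a braid word as a list of signed integers (+i for $\sigma_i$, −i for its inverse) are assumptions made for this illustration and are not part of the text.

```python
def braid_word_to_permutation(word, n):
    """Image under pi of a braid word on n strands.

    `word` is a list of nonzero integers: +i stands for sigma_i and -i for
    its inverse (hypothetical encoding). The result maps the starting
    position alpha of each strand to its final position tau(alpha).
    """
    # arrangement[p - 1] is the strand currently sitting at position p
    arrangement = list(range(1, n + 1))
    for g in word:
        i = abs(g)
        if not 1 <= i <= n - 1:
            raise ValueError(f"generator index {i} out of range for n={n}")
        # sigma_i and its inverse both swap positions i and i + 1
        arrangement[i - 1], arrangement[i] = arrangement[i], arrangement[i - 1]
    return {strand: pos for pos, strand in enumerate(arrangement, start=1)}


# Relation (4.1.2) is respected by the projection.
lhs = braid_word_to_permutation([1, 2, 1], 3)
rhs = braid_word_to_permutation([2, 1, 2], 3)
```

Tracking strands rather than multiplying permutations sidesteps the choice of a composition convention: the returned dictionary is directly the map τ of Definition 4.1.1.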
Here, we introduce the inverse monoid of partial permutations and the partial singular braid monoid.

Definition 4.1.3 (Monoid). A monoid is a pair (S, +), where S is a set and + is an associative binary operation on S for which there exists an identity element e ∈ S.

We associate a monoid to a given set in the following way.

Definition 4.1.4 (Inverse monoid). Given a set S, the inverse monoid $I_S$ is the set of all the partial bijections of S, endowed with the composition of partial maps.

The braid inverse monoid $IB_n$ introduced in (Easdown and Lavers, 2004) is the braid analogue of the symmetric inverse monoid $I_n$ of partial permutations on n symbols. The monoid $IB_n$ is the monoid of all homotopy classes of partial braids on n strands. A partial braid can be thought of as a full braid $b \in B_n$ with some strands removed. A representative of a partial braid in $IB_5$ is depicted in Figure 4.3. There exists an epimorphism

$$\pi^* : IB_n \to I_n, \qquad \beta \mapsto \bar{\beta},$$

extending the projection $\pi : B_n \to S_n$ defined above; it is defined by construction. The partial braid depicted in Figure 4.3 is naturally associated to the partial permutation in $I_5$ given by

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ \diamond & 1 & 4 & \diamond & 2 \end{pmatrix}.$$

Figure 4.4: Concatenation of partial braids in $IB_5$. (a) Two elements β and $\beta_1$ of $IB_5$. (b) Concatenation $\beta * \beta_1$. (c) Removal of the unlinked strands.

The operation defined on $IB_n$ is the multiplication of partial braids. Given two elements $\beta, \beta_1 \in IB_n$ (see Figure 4.4a), their product is obtained as depicted in Figures 4.4b and 4.4c. Observe that the strand fragments which do not connect the upper plane to the lower plane are removed. See (East, 2007) for further details.

4.1.3 Singular braids

Singular braids are a generalisation of the standard braids, in which strands can intersect generating at most a finite number of singularities.
Singular points have no inverses (Fenn and Keyman, 2000), hence the set $SB_n$ of singular braids on n strands is not endowed with a group structure. $SB_n$ is the monoid defined as follows.

Definition 4.1.5. $SB_n$ is generated by $s_1, \dots, s_{n-1}, s_1^{-1}, \dots, s_{n-1}^{-1}, t_1, \dots, t_{n-1}$, subject to the relations

1. for all $i < n$, $s_i s_i^{-1} = e$;
2. for $|i - j| > 1$, $s_i s_j = s_j s_i$, $t_i t_j = t_j t_i$ and $s_i t_j = t_j s_i$;
3. for all $i < n - 1$,
$$s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1}, \qquad t_i s_{i+1} s_i = s_{i+1} s_i t_{i+1}, \qquad t_{i+1} s_i s_{i+1} = s_i s_{i+1} t_i.$$

Figure 4.5: Singular generator of $SB_n$.

Geometrically, the singular generator has to be included, see Figure 4.5. Musically, we shall only admit singularities representing unisons (multiple strands starting or ending at the same point).

4.1.4 Partial singular braids

To model all of the possible simultaneous motions of voice leadings, including rests and unisons, we need to consider the monoid of partial singular braids $PSB_n$, containing both the partial and the singular braid monoids defined above. The theory of $PSB_n$ is only sketched in this section; we refer to (East, 2010) for a detailed analysis.

Consider the set {1, ..., n} with $n \in \mathbb{N}$ and a singular braid $b \in SB_n$. A partial singular braid β is obtained by removing some strands of b, and β is then said to be a sub-braid of b, written β ⊂ b. The removal of the whole set of strands is allowed in $PSB_n$. A partial singular braid β induces a partial permutation $\bar{\beta} \in I_n$ exactly as a partial braid does. Two partial singular braids $\beta_1, \beta_2 \in PSB_n$ are said to be equivalent if the partial permutations they induce are the same and if $\beta_1 \subset \gamma_1$ and $\beta_2 \subset \gamma_2$, with $\gamma_1$ and $\gamma_2$ singular braids equivalent in the sense of rigid-vertex isotopy (Birman, 1993). The multiplication of two partial singular braids follows the same rule we introduced for partial braids (see Figure 4.4).
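At the level of the induced partial permutations, removing the strand fragments that do not connect through amounts to composing partial maps: a symbol survives only if its image under the first map lies in the domain of the second. The sketch below illustrates this in Python; representing a partial permutation as a dictionary and the names `compose` and `is_partial_permutation` are assumptions of this example. The data reproduce the partial permutation of $I_5$ associated with Figure 4.3 (1 and 4 have no image, 2 ↦ 1, 3 ↦ 4, 5 ↦ 2).

```python
def is_partial_permutation(f, n):
    """True if the dict f is an injective partial map on {1, ..., n}."""
    values = list(f.values())
    return (all(1 <= k <= n for k in f)
            and all(1 <= v <= n for v in values)
            and len(set(values)) == len(values))


def compose(first, second):
    """Partial permutation obtained by following `first`, then `second`.

    Symbols whose image under `first` is not in the domain of `second`
    are dropped, mirroring the removal of unlinked strand fragments.
    """
    return {a: second[b] for a, b in first.items() if b in second}


# Partial permutation induced by the partial braid of Figure 4.3.
beta_bar = {2: 1, 3: 4, 5: 2}

# Following beta_bar twice: only the strand starting at 5 connects through.
squared = compose(beta_bar, beta_bar)
```

Here `squared` contains the single assignment 5 ↦ 1, since 1 and 4 lie outside the domain of the second factor.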
In particular, denoting by |β| the number of strands of β and by N(β) the number of its singular points, the two submonoids of partial braids and singular braids of $PSB_n$ can be described as $IB_n = \{\beta \in PSB_n \mid N(\beta) = 0\}$ and $SB_n = \{\beta \in PSB_n \mid |\beta| = n\}$.

4.2 Modelling voice leading in PSBn

The first approach we describe is a mere translation into the braid formalism of the model described in Section 3.2. We define a voice leading of at most n voices as a partial singular braid $\beta \in PSB_m$, where m is the cardinality of the underlying set of the multiset M ∪ L, as introduced in Section 3.1.

These braids are best visualised as piecewise linear braid diagrams, i.e. 2-dimensional projections of the partial singular braids. In these diagrams strands are depicted as line segments in $\mathbb{R}^2$. See Figure 4.6 for an example. This particular visualisation is suitable for representing simultaneous motions of voices, since the slope of each (projected) strand encodes the information concerning the movement (upward, downward) of each voice.

Figure 4.6: Partial singular braid representation of voice leadings. (a) Unison. (b) Unison and crossing.

The introduction of singularities at the starting and ending points of the braid diagram makes it possible to simplify the model described in Section 3.2. Given a voice leading v : M → L, where M and L are two multisets, we can represent the elements of M ∪ L labelled with the same symbol as a single point of the domain of the braid. In this case, the multiplicity of repeated symbols is encoded as in the example of Figure 4.6a, where the voice leading (C4, E4, G4) → (D4, F4, F4) is represented as a singular braid. A crossing of the voices in PSBn is represented as a crossing of the strands of the partial singular braid diagram.
The voice leading (C4, E4, G4) → (F4, D4, F4) represented in Figure 4.6b has a surprisingly clear musical interpretation: it contains a voice crossing, corresponding to the crossing of the projected strands, and it is singular, since two voices collapse on their target chord in a unison.

Let $s(\beta^\alpha)$ be the slope of the line segment resulting from the projection of the strand $\beta^\alpha$, and let $b_+$, $b_-$, $b_0$ be the numbers of strands with positive, negative and zero slope, respectively; cr the number of crossings; $r = n - |\beta|$ the number of rested voices, where n is the number of voices involved in the voice leading including rested ones; and finally $s = N(\beta)$ the number of singularities. We can now rewrite the complexity vector associated to voice leadings as

$$c = (b_+, b_-, b_0, cr, r, s).$$

4.2.1 Leaps

In Section 1.2, leaps were pointed out as an important feature to determine the quality of a voice leading. The partial permutation matrix model described in Section 3.2 does not take them into account. To do so, we should define a partial permutation whose domain has minimal cardinality equal to n × m, where n is the number of voices involved in the voice leading and m is the number of half-steps from the lowest to the highest pitch involved in the voice leading. For example, the representation of a voice leading between two 4-note chords ranging over 2 octaves requires a partial permutation matrix of dimension 4 × 24 = 96. This is why, in the partial permutation representation of voice leadings, we established a convention in order to manage repeated pitches and we built the model aiming at minimising the dimension of the matrix representing the voice leading.¹

Figure 4.7: Partial singular braid representation of voices' leaps. (a) Voices' leaps. (b) Voices' leaps and crossing.

Partial singular braid diagrams allow us to represent leaps by defining a domain of cardinality m, consisting of the pitches ranging from the minimum to the maximum pitch involved in the voice leading.
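As an illustration of the braid-based complexity vector $(b_+, b_-, b_0, cr, r, s)$, with r the number of rested voices and s the number of singular points, the following Python sketch computes it from a piecewise linear diagram whose strands join the source pitch to the target pitch of each voice. Several choices here are assumptions made for the example and are not prescribed by the text: pitches are encoded as MIDI numbers, a rest is encoded as `None`, a voice with a rest at either end contributes no strand, and a singular point is counted once for every starting or ending pitch shared by more than one strand.

```python
from collections import Counter
from itertools import combinations


def braid_complexity(source, target):
    """Return (b_plus, b_minus, b_zero, cr, r, s) for a voice leading."""
    if len(source) != len(target):
        raise ValueError("source and target must list the same voices")
    # One strand per voice that sounds in both chords.
    strands = [(p, q) for p, q in zip(source, target)
               if p is not None and q is not None]
    b_plus = sum(q > p for p, q in strands)    # ascending strands
    b_minus = sum(q < p for p, q in strands)   # descending strands
    b_zero = sum(q == p for p, q in strands)   # fixed voices
    # Two segments cross in their interior iff their vertical order flips.
    cr = sum((p1 - p2) * (q1 - q2) < 0
             for (p1, q1), (p2, q2) in combinations(strands, 2))
    r = len(source) - len(strands)             # rested voices
    starts = Counter(p for p, _ in strands)
    ends = Counter(q for _, q in strands)
    s = (sum(c > 1 for c in starts.values())
         + sum(c > 1 for c in ends.values()))  # shared endpoints (unisons)
    return (b_plus, b_minus, b_zero, cr, r, s)


C4, D4, E4, F4, G4 = 60, 62, 64, 65, 67
unison = braid_complexity((C4, E4, G4), (D4, F4, F4))        # Figure 4.6a
unison_crossing = braid_complexity((C4, E4, G4), (F4, D4, F4))  # Figure 4.6b
```

For the two voice leadings of Figure 4.6 the sketch yields one singular point in both cases and a single crossing only in the second, in agreement with the description given in the text.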
Singularities can be used to represent repeated voices in a chord, and the slope of each line segment corresponds univocally to a musical interval. See Figure 4.7. It is possible to store this information in the complexity vector either by writing explicitly the slope of each strand or, for instance, by splitting the slopes into two classes of consonant and dissonant intervals (referring to the definitions given in Section 1.2).

¹ Considering a concatenation of voice leadings, to represent the whole counterpoint with matrices and to take into account intervallic leaps, it would be necessary to use matrices of maximal dimension, generating in the case of an orchestra a high-dimensional sparse representation.

Figure 4.8: Partial braids inducing the same partial permutation. (a) $\beta_1$. (b) $\beta_2$.

Voice leading, partial singular braids and partial permutations

As we stated in the mathematical introduction of this section, there exists a natural projection $\pi^* : PSB_n \to I_n$. Thus, a class of partial singular braids β, in the sense of braid homotopy (Definition 4.1.2), describes a particular partial permutation $\bar{\beta}$ on the elements of the domain of β in an obvious sense. As shown in Figure 4.8, the braids $\beta_1$ and $\beta_2$ induce the same partial permutation, written in two-line notation as

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & \diamond & \diamond & 5 & \diamond & 4 \end{pmatrix}.$$

A crossing of two strands, both oriented from left to right, is said to be positive if it corresponds to a positive braid generator. The particular choice of dealing with piecewise linear, positive, partial singular braid diagrams allows us to associate a partial permutation to this particular class of braid diagrams and vice versa.

4.2.2 Partial singular braid diagrams on pitch classes

It is possible to model voice leadings modulo octave by considering the pitch-class space $(\mathbb{T}^1, \bar{d})$ introduced in Section 2.2. In this case, the domain of the partial singular braids is given by the chromatic set of pitch classes

$$\{[C], [C\sharp], \dots, [B]\} \cong \{[0], [1], \dots, [11]\}$$

and the braid diagram is wrapped around the cylinder $\mathcal{C} = \mathbb{R}/12\mathbb{Z} \times [0, 2\pi]$. In this case, strands correspond to geodesics on a cylinder, i.e. to helix segments parametrised as

$$\gamma : [0, 2\pi] \to \mathbb{R}^3, \qquad \gamma(t) = (\cos(at), \sin(at), t).$$

Consider the voice leading $C_1 = (C, E, E) \to C_2 = (F\sharp, C\sharp, E)$, depicted in Figure 4.9. Although the information concerning the octave is neglected, we can read in the image the measure of the leap relative to each voice. The path connecting C to F♯ makes a complete turn around the cylinder, meaning that the two notes are more than one octave apart. The singularity at E in the top face of the cylinder represents the unison, or the doubling, of the second and third voices of the chord $C_1$. These doubled voices lead to C♯ and E respectively, without octave leaps.

Figure 4.9: The partial singular braid representation of a voice leading defined in $\mathbb{R}/12\mathbb{Z}$.

Simultaneous voice motions have an interesting representation in this context: they are encoded by the orientation of the wrapping (clockwise or counterclockwise, i.e. right-handed or left-handed) of the helix segments on the cylinder. In our example we can deduce that C and one of the E's move downward to F♯ and C♯ respectively, while the last E is fixed, since its trajectory is a straight line. Topologically, this information is encoded in the fundamental group of the cylinder $\mathcal{C}$,

$$\pi_1(\mathcal{C}) = \mathbb{Z},$$

meaning that m positive or negative turns around the cylinder encode the information on octave leaps.

Remark 8. Considering the class of partial singular braids with geodesic strands and positive crossings on pitch classes, we cannot associate a unique braid to a partial permutation, since in this context geodesics are not unique. However, it is always possible to use minimal geodesics to represent strands among pitch classes, respecting the direction of the voice leading, if it is known a priori.
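The helix strands and the minimal geodesics of Remark 8 can be sketched numerically. The Python fragment below is an illustration under explicit assumptions: pitch classes are integers modulo 12, a strand is described by its starting pitch class and a signed displacement in half-steps, the angular rate is taken to be a = displacement / 12 with an added phase placing the strand at its starting pitch class, and the names `minimal_interval`, `winding` and `helix_strand` are hypothetical. The displacement of −18 half-steps used for the strand from C to F♯ is one possible reading of the example of Figure 4.9 (downward, more than one octave), not a value given in the text.

```python
import math


def minimal_interval(pc_from, pc_to):
    """Signed displacement of least absolute value, in the range (-6, 6]."""
    d = (pc_to - pc_from) % 12
    return d - 12 if d > 6 else d


def winding(displacement):
    """Number of complete turns around the cylinder (sign gives direction)."""
    return int(displacement / 12)  # truncates towards zero


def helix_strand(pc_from, displacement, samples=48):
    """Sample gamma(t) = (cos(phase + a t), sin(phase + a t), t), t in [0, 2 pi]."""
    a = displacement / 12                  # total angle swept: 2 * pi * a
    phase = 2 * math.pi * pc_from / 12     # angular position of pc_from
    ts = (2 * math.pi * k / samples for k in range(samples + 1))
    return [(math.cos(phase + a * t), math.sin(phase + a * t), t) for t in ts]


C, C_SHARP, E, F_SHARP = 0, 1, 4, 6
strand_c = helix_strand(C, -18)                            # assumed reading
strand_e = helix_strand(E, minimal_interval(E, C_SHARP))   # minimal geodesic
strand_fixed = helix_strand(E, 0)                          # straight line
```

With these conventions the first strand ends above the angular position of F♯ after one complete negative turn, the second descends by three half-steps, and the third is a vertical segment.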
True and false crossings

The pitch-class representation does not allow us to distinguish between true and false voice crossings, unless the distance among the voices of the first chord is known a priori: a trajectory representing a leap of more than an octave and less than two makes one complete turn around the cylinder. Hence it crosses all the other strands involved in the voice leading, even if the voices do not truly cross in a musical sense. In Figure 4.10 four different voice leadings among the 2-pitch-class chords $C_1 = (C, E)$ and $C_2 = (D, F)$ are depicted.² As we stated a few lines above, the lack of information caused by the identification of the octaves does not allow us to distinguish between true and false voice crossings, as can be shown by analysing the four voice leadings represented in Figure 4.10:

a) In Figure 4.10a the two strands do not make a complete tour of the cylinder and do not cross, meaning that the target pitch classes lie in the same octave as the pitch classes of the first chord and that there is neither a topological nor a musical crossing among the voices.

b) Figure 4.10b shows the crossed alternative of the previous voice leading, $C_1 \to \sigma_{12} C_2$, where $\sigma_{12} \in S_2$. Since the helix segments are left-handed and right-handed respectively and the strands of the braid do not complete the tour of the cylinder, what we can deduce from this configuration is that F lies in the same octave as C and, symmetrically, D lies in the same octave as E. To establish whether the voices cross in a musical sense, mirroring the strands' crossing, it is necessary to know the distance among the voices of the first chord: assuming C and E to belong to the same octave, the voices actually cross. However, if the two notes belong to two different octaves, for instance C4 and E5, no crossing occurs among them.

c) In Figure 4.10c, C and E move downward to reach F and D respectively, and both voices move in similar motion by less than one octave.
No crossing can occur among these voices, as is mirrored by the trajectories of the braid.

d) In the last figure, the voices move in contrary motion, downward and upward respectively, always targeting pitch classes less than one octave distant from them. If the pitch classes of the first chord lie in the same octave, then no musical crossing occurs, despite the topological configuration of the braid's strands. However, it suffices to choose the representatives of C and E to be C4 and E3 to have an actual crossing corresponding to the one depicted in the figure.

In conclusion, the analysis of voice leadings between pitch-class sets represents the no-crossing voice leading as the collection of shortest paths among multisets of pitch classes³ and maximises the number of crossings in the other cases. Thus, the pitch-class braid-oriented visualisation of voice leadings collects the information concerning both octave leaps and voice crossings as they are described in Hughes (2015), where a model of voice leading built on the fundamental groupoid of the chord space $A_n$ is discussed.

² See (Tymoczko, 2011, p. 76) for a representation of the same voice leadings in $A_2$.

³ Hence, neglecting the order in which voices are associated, we can always retrieve a no-crossing voice leading connecting the voices of the first chord to the ones of the second through minimal geodesics on the cylinder, as in Figure 4.10a.

Figure 4.10: Simultaneous motions of two voices. The movement of each voice is written above the arrow in half-steps; the sign distinguishes between upward and downward movements. (a) $(C, E) \xrightarrow{2,\,1} (D, F)$. (b) $(C, E) \xrightarrow{5,\,-2} (F, D)$. (c) $(C, E) \xrightarrow{-7,\,-2} (F, D)$. (d) $(C, E) \xrightarrow{-7,\,10} (F, D)$.

4.2.3 Concatenation of voice leadings in PSBn

As we pointed out in Section 3.3, when representing several ordered voice leadings it is not desirable to compose the braids representing each of them, but rather to concatenate them one after the other: the composition that $PSB_n$ inherits from $IB_n$ requires deleting the strand fragments that do not connect the first braid to the second, see Figure 4.4. Thus, using the multiplication defined on $PSB_n$ to compose braids, one would delete the strands representing any melody containing a rest, losing the information concerning the whole piece of music.

The idea is to represent a succession of voice leadings as a time series $\{\beta_i\}_{i \in \{1, \dots, n\}}$, such that $\beta_i \in PSB_n$ for each i, corresponding to a concatenation of braid diagrams as shown in Figure 4.11, where both the pitch and the pitch-class braids for the first seven voice leadings (corresponding to eight melodic states) of Alleluia: Angelus Domini are depicted. The fragment we analyse is given by the superposition of the two voices

$$v_1 = (F_4, G_4, A_4, G_4, F_4, G_4, B\flat_4, A_4),$$
$$v_2 = (C_4, D_4, E_4, F_4, G_4, G_4, F_4, E_4),$$

represented by the blue and the red trajectory, respectively. In this case, since the pitches involved in the segment of the composition we represented are contained in one octave, the pitch and the pitch-class diagrams are equivalent. It is possible to observe how this kind of representation gives friendly access to the information describing the simultaneous motion of voices. It retrieves the special case of parallel motion, while voice crossings and unisons are represented by strand crossings and singularities, respectively. Observe that in this case the concatenation of partial singular braid diagrams corresponds to the multiplication defined in $PSB_n$, since no rest is included in the passage we examined.

Given a sequence of voice leadings $\{\beta_1, \dots, \beta_n\}$, a motif is a subsequence $\{\beta_p, \dots, \beta_q\}$ with $1 \leq p < q \leq n$. This last representation allows us to compare voice leading motifs at first sight (consider, for instance, the crossing pattern in Figure 4.11).
It provides a possible solution for the evaluation of their features, according to both their geometry and their concatenation in time. The advantage of this braid-based representation is the possibility of encoding the whole information concerning the voice leading in a 3-dimensional braid, and hence in a 2-dimensional diagram, regardless of the number of voices composing it.

Figure 4.11: Concatenation of pitch and pitch-class partial singular braids. The observation of a single strand, or of the whole voice leading (regions 1, ..., 7), provides an intuitive representation of both the motions of pairs of voices (similar, parallel, oblique, contrary) and of the behaviour of each voice (downward, upward and fixed). The length of a crossing is simply measurable, as are more complicated phenomena such as the overlap (see Section 1.2).
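The time series of voice leadings for the two-voice fragment of Alleluia: Angelus Domini discussed in Section 4.2.3 can be sketched in a few lines of Python. The sketch rests on assumptions that are not part of the text: pitches are converted to MIDI numbers, "parallel" is approximated as an equal displacement in half-steps, a crossing is detected when the vertical order of the two voices flips between consecutive melodic states, and the names `motion` and `describe` are hypothetical.

```python
PITCH = {"C4": 60, "D4": 62, "E4": 64, "F4": 65, "G4": 67, "A4": 69, "Bb4": 70}

v1 = ["F4", "G4", "A4", "G4", "F4", "G4", "Bb4", "A4"]
v2 = ["C4", "D4", "E4", "F4", "G4", "G4", "F4", "E4"]


def motion(d1, d2):
    """Label the relative motion of two voices from their displacements."""
    if d1 == 0 and d2 == 0:
        return "fixed"
    if d1 == 0 or d2 == 0:
        return "oblique"
    if d1 * d2 < 0:
        return "contrary"
    return "parallel" if d1 == d2 else "similar"


def describe(voice_a, voice_b):
    """One (motion, crossing, unison) triple per voice leading."""
    a = [PITCH[name] for name in voice_a]
    b = [PITCH[name] for name in voice_b]
    steps = []
    for k in range(len(a) - 1):
        d1, d2 = a[k + 1] - a[k], b[k + 1] - b[k]
        # The strands cross when the order of the voices is reversed.
        crossing = (a[k] - b[k]) * (a[k + 1] - b[k + 1]) < 0
        # A unison in the target chord is a singular point of the diagram.
        unison = a[k + 1] == b[k + 1]
        steps.append((motion(d1, d2), crossing, unison))
    return steps


series = describe(v1, v2)
```

Under these conventions the seven voice leadings contain one crossing (the fourth) followed by one unison (reached by oblique motion in the fifth), framed by parallel motion at the beginning and at the end of the fragment.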
Five
Discussion and future works

Thanks to the mathematical formalisation of the concept of voice leading, we deduced a model to describe the motions of the voices as low-dimensional partial permutation matrices. The geodesics-oriented interpretation of voice leadings and the analysis of their concatenation provided a relationship between the algebraic and the geometric context. Thereafter, the information carried by the partial permutation matrix associated to a voice leading has been rewritten in the form of a complexity vector. Sequences of such vectors have been used to characterise first species counterpoint as a multiset of 4-dimensional points.

We proposed a generalisation of the model to the other contrapuntal species, and the sequence of complexity vectors has been interpreted as a multi-dimensional time series, describing how different kinds of voice leadings have been concatenated in time. Dynamic time warping provides a measure of the distance between the complexities of pairs of compositions, and gives a quantitative description of the dissimilarity of their time-series representations.

In order to visualise voice leadings in a 3-dimensional space, a braid-like representation has been introduced, which extends the first model by measuring the intervallic leap of each voice in the passage from a chord to the next one. In addition, a pitch-class version of the braid diagram representation provides an environment to visualise voice leadings among n-chords, mirroring the properties of trajectories in the space of chords.

A straightforward development offered by the model we describe is the possibility of classifying the collection of possible voice leadings between two chords in terms of the length of the geodesic strands of the braid representing it. Connecting the notes of two chords with minimal paths corresponds to a crossings-free voice leading.
The variations of this configuration could be classified considering the length of each strand of the braid associated to the voice leading.

In addition, the model we introduced as a visualisation tool has topological properties that could be investigated, for instance in terms of knot theory (Alexander, 1923, 1928). To do that, one should weaken our assumption on the crossings among the strands. A possible definition could involve, for piecewise linear braids, the slope of the line segment describing the voices, forcing a strand associated to a bigger leap to pass above the others.

Part III
The vertical dynamics of music: persistent musical features

Table of Contents

6 Music analysis through deformations of the Tonnetz
    6.1 An anisotropic Tonnetz for music analysis
        6.1.1 A variable geometry, 3-dimensional Tonnetz
        6.1.2 Preferred directions in music: a naïve approach
    6.2 Towards a topological classification of music
7 Topological persistence
    7.1 Simplicial homology
        7.1.1 n-chains
        7.1.2 Boundary homomorphisms and homology groups
        7.1.3 An algorithm for computing homology
    7.2 From homology to persistent homology
        7.2.1 An intuition
        7.2.2 Persistent homology for topological spaces
        7.2.3 An algorithm for computing persistence
8 A topological fingerprint for music
    8.1 Persistent homology classification of deformed Tonnetze
        8.1.1 The lower star filtration
        8.1.2 A filtration of the deformed Tonnetz
    8.2 Musical interpretation and persistent clustering
        8.2.1 Musical Interpretation
        8.2.2 Hierarchical persistent music clustering
        8.2.3 1-dimensional persistence
9 Audio feature deformation of the Tonnetz
    9.1 Computing consonance values
    9.2 Persistent homology and audio feature deformed Tonnetze
        9.2.1 Persistence for point clouds
        9.2.2 Deformed Tonnetze for modern modes classification
        9.2.3 Applications
        9.2.4 Discussion
    9.3 Tonnetz deformation through triads' consonance
        9.3.1 The consonance function for triads
        9.3.2 Analysis of block voicings on the consonance-deformed Tonnetze
        9.3.3 Gaussian curvature: a geometric music feature
        9.3.4 Musical interpretation
        9.3.5 Classification of the consonance-deformed Tonnetze
    9.4 Discussion
10 Discussion and future works

Abstract

In contrast to the analysis of voice leadings as a superposition of horizontal melodies, this part is devoted to the study of music as composed of vertical structures. Although triads, seventh chords and their altered forms represent the basic vertical objects in Western music, every melodic interval (arpeggio) can be notated and thought of vertically, as the harmonic superposition of its notes. In Figure 5.1 both the melodic and harmonic representations of a major third interval, a major seventh chord, an altered chord and a whole scale are depicted in consecutive bars.
This double vision of scales and chords as arpeggi and clusters inspired several techniques, such as broken chords (Pass, 1987, p. 3), that enrich the harmonic and melodic playing of the guitar with uncommon phrases built by breaking a chord into smaller clusters. Neglecting the notion of voices and taking into account only the pitches (or pitch classes) and durations of the notes, it is possible to produce an efficient and simple musical representation, able to grasp a compositional idea that is repeated in a composition and hence represents its core.

In Chapter 6 an early approach aiming at describing music (concepts) through the deformation of a topological space is described. The vertices of the Tonnetz are endowed with a variable height, with respect to particular choices of pitch classes and durations of a sequence of notes. This representation inspired a second approach, whose mathematical bases are introduced in Chapter 7. The topological theory described in that chapter provides the tools we shall employ to interpret the musical information, once it is represented in the geometrical and topological structure of the deformed Tonnetz. The applications to music analysis and classification and the results obtained through these strategies are described in Chapters 8 and 9. The former takes place in the symbolic domain and the latter is positioned at the crossroads between signal and symbols (this last part has been the subject of the talk (Bergomi, 2015)).

Figure 5.1: Melodic and harmonic intervals. Pairs of consecutive bars represent different musical entities from a melodic and a harmonic viewpoint, respectively.

Chapter 6. Music analysis through deformations of the Tonnetz

One of the most common features of geometric spaces for music analysis is their isotropic nature.
Indeed, the pitch-class space $(\mathbb{R}/12\mathbb{Z}, d)$ is usually represented as a cyclic graph where the 12 vertices representing the pitch classes are evenly spaced (we depicted this representation in Figure 2.4b). This feature reflects the equality of pitches in equal tuning: in the representation of the pitch classes as an abstract graph, a weight equal to 1 is naturally associated to each of the graph's vertices. Here, we seek a strategy to associate to these vertices a collection of musically relevant weights, in order to produce an intuitive and analysable geometrical representation of a piece of music. When dealing with Western music, and in particular with modern music, it is natural to reduce the possible temperaments to the equal tuning and hence to develop models based on these evenly spaced representations and unweighted graphs. This homogeneity is unavoidably inherited by spaces generated through the identification of notes modulo octave. For instance, the Tonnetz interpreted as a simplicial complex whose triangles are equilateral does not allow one to distinguish between two extremely different sonorities, as we mentioned in Section 2.3.2.

Here, the main idea is to develop a strategy to introduce preferred directions in these spaces. By preferred directions we mean a change in the geometry of the space, encoding relevant musical information. For instance, we shall represent a relevant pitch-class set on the realisation of the Tonnetz as a mountain ridge, highlighting the relevance of that particular choice of pitch classes with respect to the others. In musical terms, these preferred directions can be thought of as the core concepts representing the main musical ideas of a composition. Often, a concept sprouts in the mind of a composer as a small cell generally based on a precise rhythmical idea, a sequence of pitches, a harmonic pattern, or their combination. The original idea is then varied, merged with new ones, stretched or condensed and maybe left for a completely new one.
Often, these musical variations are conceived to accompany the listener and are well codified, as shown by the vast literature concerning the study of melodic variations and the analysis of motifs. See for instance (Piston, 1947; Dudeque, 2005; Johnson, 2009) for a music-theoretical point of view on this subject, and (Lewin, 2007; Buteau and Mazzola, 2000) and many others for a mathematically oriented viewpoint. In (Dowling, 1972) the perception of the inversion and the retrogradation of a melody has been investigated, showing that it is grasped and understood by the listener. Since such variations are relevant to the human perception of music, our claim is that their representation as deformations of a metric space would capture this fundamental information. Once constructed, such a space shall give immediate visual feedback and represent this information in geometrical and topological terms.

Figure 6.1: The musical concept evolution in Time by Hans Zimmer. The first bar represents the musical idea that opens the composition. The following bars depict consecutive evolutions of the first concept.

Figure 6.1 shows how the musical concept described in the first bar of Time by Hans Zimmer evolves during the piece. Each bar describes one of its variations (see Appendix D for the whole score, as it has been arranged for piano by Sebastian Wolff). The A minor triad, suggested by a minor third interval in the first bar, appears as a whole in the second one. On the rhythmical side, still in the second bar, the bass clave changes, doubling the rhythmical figure. Then a new melodic idea is introduced: the second of the chord is added to enlarge the sonority of the triad.
In the fourth variation, an indication concerning the dynamics of the phrase is introduced. Finally, a new melodic concept is added in the fifth bar. Observe how these variations of the first idea allow the composer to declare an initial context, describe it and later enrich it with details and dynamics. The musical concept that is implied in these variations shall represent our preferred directions in music.

Figure 6.2: A displacement of the vertices of $|\mathbb{Z}/12\mathbb{Z}|$ according to the occurrences of their labels in the second movement of Schönberg Op. 19.

Figure 6.3: Deformed geometries generated from the Tonnetz. A portion of the planar Tonnetz is represented on the plane $z = 0$.

6.1 An anisotropic Tonnetz for music analysis

Consider the graph describing pitch classes and weight its vertices by counting the occurrences of each pitch class in a piece of music. Equivalently, it is possible to define a function on the set $V$ of vertices of the space $(\mathbb{R}/12\mathbb{Z}, \bar{d})$, say $h : V \to \mathbb{R}$, associating to each vertex a height corresponding to the number of occurrences of its label in the piece. Hence, the function $h$ is defined as the composition
\[ h : V \xrightarrow{\; l \;} L \xrightarrow{\; c \;} \mathbb{R}, \]
where $l$ associates vertices to pitch classes and $c : L \to \mathbb{R}$ is the function counting the occurrences of the pitch classes. Consider the cylinder $\mathbb{R}/12\mathbb{Z} \times [0, 1]$. It suffices to redefine the position of the vertices as $(x_v, y_v, h(v)) \in \mathbb{R}^3$ to obtain the space represented in Figure 6.2. Such a space does not differ from a common pitch-class histogram (Six and Cornelis, 2012): the only additional information retrieved by this representation is due to the structure of the graph, in which pitch classes a half-step apart are connected by an edge. This consideration suggests the possibility to take advantage of the structure of the graphs that have already proved their efficacy in music analysis.
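The height function $h = c \circ l$ can be illustrated with a minimal Python sketch. It assumes that pitches are given as MIDI numbers and that the twelve vertices are placed evenly on a unit circle; the function names and the example input are hypothetical and are not taken from the web application mentioned in this chapter.

```python
import math
from collections import Counter

def pitch_class_heights(pitches):
    """Return h(v) for the 12 vertices: occurrences of each pitch class."""
    counts = Counter(p % 12 for p in pitches)
    return [counts.get(pc, 0) for pc in range(12)]

def displaced_vertices(pitches, radius=1.0):
    """Place the vertex labelled pc at (x, y, h), with (x, y) on a circle."""
    vertices = []
    for pc, height in enumerate(pitch_class_heights(pitches)):
        angle = 2.0 * math.pi * pc / 12.0
        x = radius * math.cos(angle)
        y = radius * math.sin(angle)
        vertices.append((x, y, float(height)))
    return vertices

# Hypothetical input: a C major arpeggio (MIDI pitches) played twice.
example_pitches = [60, 64, 67, 72, 60, 64, 67, 72]
heights = pitch_class_heights(example_pitches)
vertices = displaced_vertices(example_pitches)
```

Only the third coordinate carries musical information here, which is why the result is equivalent to a pitch-class histogram drawn on a circle.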
Dealing with a more structured graph such as the Tonnetz (see Section 2.3), one can take advantage of the symbolic and acoustical properties it is endowed with. In particular, we recall that, interpreting the Tonnetz as a simplicial complex, its edges are associated to precise intervals and its triangles represent triads. Moreover, it was conceived to represent the acoustical relationships among pitch classes. This whole structure is preserved when updating the height of the vertices of its geometric realisation.

The Tonnetz has already been used to classify genres in (Bigo et al., 2013), by analysing the compactness of the simplicial structures representing the trace of a piece of music on different planar Tonnetze. As shown in Section 2.3.2, the subcomplexes generated as a planar trace on the Tonnetz do not distinguish scales or sonorities in a geometrical and topological sense. In order to capture the temporal and harmonic information, the vertices shall be displaced depending on the pitch which is played and on its duration. The reason why we use only three dimensions to encode these two features is to be able to visualise the surface generated through these displacements and provide direct visual feedback, as depicted in Figure 6.3.

Figure 6.4: The Tonnetz deformed with a major triad and its 1-skeleton. (a) The deformed Tonnetz $\mathcal{T}$. (b) Its 1-skeleton. The triad appears as a maximal triangle with respect to the height function.

6.1.1 A variable geometry, 3-dimensional Tonnetz

Let $T$ be the infinite planar simplicial Tonnetz and $|T| \subset \mathbb{R}^3$ its geometrical realisation (see Sections 2.1 and 2.3). Given a note $n$, let $p$ be its pitch and $d$ its duration. The vertices of $T$ labelled with the pitch class $[p]$ are translated in $\mathbb{R}^3$ along the $z$-axis by a distance $d$. In symbols, let $V$ be the 0-skeleton of $|T|$ and $l : V \to L$ the function associating vertices to labels.
Let $V_{[p]} = \{ v \in V \mid l(v) = [p] \}$ be the set of vertices of $|T|$ labelled with $[p]$. The updating of the height of the vertices corresponding to a certain pitch class is provided by a family of functions $\{ h_{[p]} \}_{[p] \in L}$ defined as
\[ h_{[p]} : V_{[p]} \to \mathbb{R}^3, \qquad (x_v, y_v, z_v) \mapsto (x_v, y_v, z_v + d) \tag{6.1.1} \]
for every $[p] \in \mathbb{Z}/12\mathbb{Z}$. Considering a collection of notes $\{ n_1, \dots, n_m \} = \{ (p_1, d_1), \dots, (p_m, d_m) \}$, the vertices of the Tonnetz labelled with the pitch class $[p]$ will be translated vertically by the sum of the durations $d_i$ of the notes $n_i$ such that $[p_i] = [p]$, for $1 \leq i \leq m$. We refer to the geometric realisation of the Tonnetz in a deformed state, i.e. when at least one of its vertices does not lie in the plane $z = 0$, as $\mathcal{T}$. In Figure 6.4 the deformation $\mathcal{T}$ induced by a major triad played for 8 seconds is depicted.

A 3-dimensional interactive animation, showing how the Tonnetz is deformed by a musical phrase and allowing users to play with their own keyboard to generate specific deformations, is available at http://nami-lab.com/tonnetz/examples/deformed_tonnetz_int_sound_pers.html. The JavaScript and HTML code are also available on the web. See Appendix C for a commented version of the code generating the animation and a brief tutorial concerning its functions. The translation of the vertices is rendered as a continuous displacement in time, at a constant speed. The resulting shape after the deformation is equivalent to the one generated through the collection of functions defined in Equation (6.1.1).

Figure 6.5: A vertex map from the fundamental domain of the Tonnetz to the Tonnetz torus. The red and blue lines correspond to the two generators of the torus, given by the translations (transpositions) of 3 and 4 half-steps, respectively.
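As an illustration of Equation (6.1.1), the following Python sketch deforms a finite patch of a planar Tonnetz. It is not the code of Appendix C: the lattice layout (one axis of perfect fifths, one of major thirds, equilateral triangles of side 1), the function names `tonnetz_patch` and `deform`, and the use of MIDI pitch numbers are all assumptions made for the example.

```python
import math
from collections import defaultdict

def tonnetz_patch(cols, rows):
    """Build a finite patch of a planar Tonnetz (hypothetical layout).

    Lattice point (i, j) moves i perfect fifths and j major thirds from C,
    so its pitch-class label is (7*i + 4*j) mod 12.
    """
    patch = {}
    for i in range(cols):
        for j in range(rows):
            position = [i + 0.5 * j, j * math.sqrt(3) / 2, 0.0]
            patch[(i, j)] = {"label": (7 * i + 4 * j) % 12, "pos": position}
    return patch

def deform(patch, notes):
    """Raise every vertex labelled [p] by the total duration of the notes in [p]."""
    total = defaultdict(float)
    for pitch, duration in notes:
        total[pitch % 12] += duration
    for vertex in patch.values():
        vertex["pos"][2] += total[vertex["label"]]
    return patch

# C major triad (MIDI pitches 60, 64, 67), each note held for 8 seconds.
patch = deform(tonnetz_patch(4, 4), [(60, 8.0), (64, 8.0), (67, 8.0)])
heights_by_label = {v["label"]: v["pos"][2] for v in patch.values()}
```

All vertices sharing a label receive the same height, so the deformation is periodic exactly as the labelling is.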
6.1.2 Preferred directions in music: a naïve approach

Before using persistent homology to classify different configurations of the Tonnetz, we describe the first approach we followed and, consequently, a first simple music representation strategy. This attempt is based on the definition of a preferred subcomplex of the deformed Tonnetz $\mathcal{T}$, induced by the values of the height function on the vertices of its deformed geometrical realisation.

Consider the simplicial complex $\mathcal{T}$ obtained by deforming the heights of the vertices of $|T|$ with a sequence of notes. Let
\[ f_V : V \subset \mathcal{T} \to \mathbb{R}, \qquad (x, y, z) \mapsto z \]
be the height function, $V$ the 0-skeleton of $\mathcal{T}$ and $m = \max\{ f_V(v) \mid v \in V \}$. The label of the vertex of maximal height is given by $\{ l(v) \mid f_V(v) = m \}$ for $v \in V$. Let $V_t \subset V$ be
\[ V_t = \{ v \in V \mid f_V(v) \geq m/t \}, \]
the collection of vertices whose height is at least the threshold $m/t \in \mathbb{R}$. The preferred pitch class set $P$ is constituted by the pitch classes labelling the vertices of $V_t$. The set $V_t$ has infinite cardinality, since any label is associated with infinitely many vertices of $\mathcal{T}$. The labelling of the Tonnetz is doubly periodic with respect to the translations by major and minor third intervals. We can therefore restrict our analysis to the fundamental domain $F \subset \mathcal{T}$ generated by the translations corresponding to these intervals (see Figure 6.5).

To map the square $F$ to the Tonnetz torus $\mathbb{T}$ depicted in Figure 6.5, it is necessary to introduce some definitions. Let $K$ be a simplicial complex.

Definition 6.1.1. A point $x \in |K|$ belongs to the interior of exactly one simplex $\sigma \in K$. Assume $\sigma = [v_0, \dots, v_n]$. Then
\[ x = \sum_{i=0}^{n} \lambda_i v_i, \]
where $\lambda_i > 0$ for each $i$ and $\sum_{i=0}^{n} \lambda_i = 1$. Let $v$ be an arbitrary vertex of $K$; the barycentric coordinates $b_v(x)$ of $x$ with respect to $v$ are defined by setting
\[ b_v(x) = \begin{cases} \lambda_i & \text{if } v = v_i \text{ for } 0 \leq i \leq n, \\ 0 & \text{otherwise.} \end{cases} \]

The following lemma (Munkres, 1984, Ch. 2, Lemma 2.7) allows one to extend a map defined on the 0-skeletons of two simplicial complexes to their entire structures.

Lemma 6.1.1. Let $\varphi : K_V \to L_V$ be a map between the 0-skeletons of the complexes $K$ and $L$. Suppose that whenever the vertices $v_0, \dots, v_n$ of $K$ span a simplex of $K$, their images $\varphi(v_0), \dots, \varphi(v_n)$ are vertices of a simplex of $L$. Then $\varphi$ can be extended to a continuous map $\Phi : |K| \to |L|$, such that
\[ x = \sum_{i=0}^{n} \lambda_i v_i \;\Longrightarrow\; \Phi(x) = \sum_{i=0}^{n} \lambda_i \varphi(v_i). \]
The map $\Phi$ is called the linear simplicial map induced by the vertex map $\varphi$.

In our case, the enharmonic labelling of the Tonnetz allows one to define a vertex map and its extension to a simplicial map $\Phi : F \to \mathbb{T}$. Consider the subcomplex $S \subset F$ given by the simplices of $F$ whose vertices are elements of $V_P$. We obtain a subcomplex $\Phi(S) \subset \mathbb{T}$ by identifying the simplices of $S$ lying on opposite edges of $F$. In Figure 6.6 we show how this subcomplex can be computed in the case of Ravel's Jeux d'Eau, setting $t = 2$, in three steps:

1. Figure 6.6a represents the projection of the preferred set of vertices on $|T|$. The darker parallelogram is the fundamental domain $F$ visualised as a region of $|T|$.
2. The same region of the space is depicted in Figure 6.6b, where different colours highlight edges and points to be identified.
3. The last figure represents the subcomplex $S \subset \mathbb{T}$ generated by the vertices labelled with preferred pitch classes.

In geometrical terms, considering the threshold $t$ corresponds to slicing $\mathcal{T}$ in order to retrieve the vertices belonging to the set $f_V^{-1}([m/t, m])$, that is, $f_V^{-1}([m/2, m])$ for $t = 2$. In musical terms, this operation allows one to segregate the pitch classes that were heard for at least $m/t$ seconds during the piece. In addition, relevant pitch-class sets endowed with the structures of triads and consonant intervals (perfect fifths, major and minor thirds and their inversions) correspond to simplices of $S$. It is also possible to retrieve the absence of preferred musical entities, if every pitch class is a preferred one and hence $S = \mathbb{T}$.
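The thresholding step can be sketched in Python as follows. This is a minimal illustration working directly on the twelve vertices of the Tonnetz torus, where edges join pitch classes a minor third, a major third or a perfect fifth apart and triangles are the major and minor triads. The function names and the duration values are hypothetical; in particular the durations are not measured from any of the pieces discussed here.

```python
def preferred_pitch_classes(durations, t):
    """Pitch classes whose total duration is at least m/t, m being the maximum."""
    m = max(durations.values())
    return {pc for pc, height in durations.items() if height >= m / t}

def preferred_simplices(preferred):
    """Edges and triangles of the Tonnetz torus spanned by preferred pitch classes."""
    edges, triangles = set(), set()
    for pc in preferred:
        # Edges: minor third (3), major third (4) and perfect fifth (7).
        for step in (3, 4, 7):
            other = (pc + step) % 12
            if other in preferred:
                edges.add(frozenset((pc, other)))
        # Triangles: the major and the minor triad built on pc.
        major = frozenset((pc, (pc + 4) % 12, (pc + 7) % 12))
        minor = frozenset((pc, (pc + 3) % 12, (pc + 7) % 12))
        for triad in (major, minor):
            if triad <= preferred:
                triangles.add(triad)
    return edges, triangles

# Hypothetical total durations in seconds, indexed by pitch class (C = 0).
durations = {0: 10.0, 2: 1.0, 4: 6.0, 7: 8.0, 9: 5.0}
preferred = preferred_pitch_classes(durations, t=2)
edges, triangles = preferred_simplices(preferred)
```

With these values the preferred subcomplex consists of the C major and A minor triads sharing the edge C, E, which is the kind of configuration along the fifth axis described for tonal pieces.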
Figure 6.6: Deriving a preferred pitch class set for Jeux d'Eau. (a) Fundamental domain. (b) Preferred pitch class set. (c) Preferred pitch classes subgraph.

Sometimes we shall refer to the subcomplex $S$, or its equivalent that will be defined in the remainder of this work, as the subcomplex of preferred directions associated to a musical phrase or to a composition. This expression refers to the musical preferred directions (pitch-class sets corresponding to intervals, triads, or the whole chromatic scale) highlighted by the height function.

Remark 9. In our analyses we consider only the Euler Tonnetz. However, this approach, and in general the whole set of strategies we shall describe in this part, are suitable for the analysis of other types of Tonnetze (Cohn, 2011; Bigo et al., 2013).

Application

The structure of the subcomplex of preferred directions associated to a composition can be used to classify compositions. It is important to notice that no labelling is needed in order to perform this analysis. In Figure 6.7 the shapes associated to several pieces of music are depicted along with their preferred subcomplexes. The threshold we considered for the examples presented in this paragraph is $t = 2$. It is possible to see how tonal pieces are associated to subcomplexes representing a selection of major and minor triads along the fifth axis of the Tonnetz (horizontal edges). Jeux d'Eau is described by a suite of three pitch classes a perfect fifth apart, or equivalently three triads composed of six distinct notes corresponding to a diatonic scale minus its fourth degree (see Figure 6.6c). The first movement of Mozart's Sonata no. 8 is represented by two consecutive triads a perfect fifth apart. The subcomplex associated to the first movement of Beethoven's Sonata no. 13 is also developed on the fifth axis of the Tonnetz, where a minor triad, represented by the only 2-simplex in Figure 6.7d, is enriched with a perfect fourth, evoking a pentatonic sonority. Figures 6.7h and 6.7j represent Klavierstück I and a sequence of random pitches, respectively. Note that the first subcomplex includes 11 of the 12 vertices (modulo identifications) of $F$, while the collection of randomised pitches includes the whole set of pitch classes. The analysis of the third piece of Schönberg's work reveals exactly the same preferred pitch-class subcomplex that we identified for the first one. On the contrary, the second piece has a different structure. The subcomplex depicted in Figure 6.7e differs both from the representations we found for random sequences of pitches and from the tonal ones. In this case, a minor triad is preferred together with its major sixth and minor second, which could be considered a modal, rather than a tonal or a chromatic, choice.

Remark 10. The results presented in this section can be replicated on the web application. The set of preferred vertices can be visualised on $|T|$ by clicking on the button dips_pref, after the Tonnetz has been deformed. The information concerning the labels and the heights of the vertices is displayed in the JavaScript console.

Figure 6.7: Deformed Tonnetze and their preferred subcomplexes. Identifications are omitted for clarity. (a) Sonata n. 8, mov. 1, Mozart. (b) Preferred pitch class subcomplex. (c) Sonata in C major, mov. 1, Beethoven. (d) Preferred pitch class subcomplex. (e) Klavierstück I, Schönberg. (f) Subcomplex. (g) Klavierstück II, Schönberg. (h) Preferred pitch class subcomplex. (i) Random pitches. (j) Preferred pitch class subcomplex.

Figure 6.8: Weighted preferred subcomplexes of $\mathcal{T}$. (a) Jeux d'Eau. (b) Clair de Lune. (c) Arabesque. (d) Klavierstück II.

6.2 Towards a topological classification of music

Although the coincidence of the subcomplexes associated to the first and last piece of Drei Klavierstücke is encouraging given their common atonal nature, we retrieve identical preferred subcomplexes even by considering Debussy's Clair de Lune and Arabesque and Ravel's Jeux d'Eau. Again, this result is interesting, since Ravel and Debussy are the two most prominent exponents of Impressionist music. However, it points out one of the limitations of this approach: forgetting the weights associated to the vertices, it is not possible to distinguish between identical preferred subcomplexes of the Tonnetz.

Let $\bar{f} : F \to \mathbb{R}$ be the restriction of the height function to the fundamental domain of the deformed Tonnetz. Such a function allows one to distinguish the three isomorphic subcomplexes associated to Jeux d'Eau, Arabesque and Clair de Lune. Moreover, it induces an ordering on the vertices of $F$ given by $\bar{f} \circ h : |F| \to \mathbb{R}$, where $h : |F| \to F$ is a simplicial homeomorphism. The vertices of the preferred subcomplexes in Figure 6.8 are labelled with the values induced by the height function. The subcomplex associated to Jeux d'Eau is based on a minor triad and includes its perfect fourth; that is to say, this representation suggests that a minor pentatonic sonority is a characteristic of the piece. In the case of Clair de Lune, a major triad enriched with its major second and its perfect fourth is highlighted. By ordering the notes according to their heights and assuming for simplicity the root of the major triad to be C, we obtain the sequence of notes (G, C, D, E, F, A). This sequence reveals the diatonic inspiration of this piece: the height of the notes points out a mixolydian[1] structure.
The subcomplex associated to the third movement of Drei Klavierstücke suggests either a sonority based on one of the modes deduced from the melodic minor scale, such as the dorian ♭2, or a chromatic construction of the composition. The height function retrieves relevant information even neglecting the temporal ordering in which notes are presented.

[1] The mode built on the fifth degree of the diatonic scale and associated to dominant chords and pentatonic sonorities.

The natural next step is to avoid the definition of an arbitrary threshold and to extend the function defined on the vertices to the other simplices (edges and triangles) of the complex. Morse theory (Milnor, 1963) revealed the close relationship between the topology of a manifold and the critical points of a smooth function defined on it. In particular, discrete Morse theory (Forman, 1998, 2002) is an adaptation of this formalism to simplicial complexes. Unfortunately, in order to produce a discrete Morse function from a real function $f$ defined on a point cloud (King et al., 2005), $f$ has to be injective on the set of points where it is defined. Even after the restriction to the fundamental domain of the Tonnetz, where the labelling function is bijective, this hypothesis is too strong for music analysis: two or more pitch classes can be played for exactly the same amount of time in a composition, or never be played, and this information cannot be neglected. However, it is possible to describe the topology of a simplicial complex by taking into account an ordering induced on its simplices by a continuous real-valued function under milder assumptions. The solution is provided by the theory of persistent homology, which we introduce in the next chapter.
Chapter 7. Topological persistence

Topological persistence was introduced by Patrizio Frosini and collaborators under the name of Size Theory (Frosini, 1992), addressing the problem of shape recognition from a rigorous mathematical point of view. This theory is comparable (although more general) to the 0-dimensional persistent homology described in (Edelsbrunner et al., 2002). As we shall see in the remainder of this chapter, the hypotheses required by the formalism of persistent homology are weak enough to make it suitable for a plethora of applications, including the analysis of music. It has been applied to the classification of shapes (Chazal et al., 2009; Di Fabio and Landi, 2011) and of hepatic and melanocytic lesions (Adcock et al., 2014; d'Amico et al., 2004), the analysis of cortical data (Chung et al., 2009), the covering of sensor networks (De Silva and Ghrist, 2007; Munch et al., 2012), group behaviour analysis (Topaz et al., 2015) and many other fields. In order to safely introduce persistent homology, the homology of a topological space has to be defined.

7.1 Simplicial homology

Homology theory is a standard subject in Algebraic Topology. It is extensively described in a general setting in (Hatcher, 2002) and (Munkres, 1984). In this context we will describe homology as it is defined for simplicial complexes.

7.1.1 n-chains

Let $K$ be a simplicial complex and $n \in \mathbb{Z}$. A simplicial $n$-chain is a formal sum $\sum_i \alpha_i \sigma_i$, where $\alpha_i \in \mathbb{Z}/2\mathbb{Z}$ and the $\sigma_i$ are $n$-simplices of $K$. Let $c = \sum_i \alpha_i \sigma_i$ and $d = \sum_i \beta_i \sigma_i$ be two $n$-chains and define their sum as
\[ c + d = \sum_i (\alpha_i + \beta_i) \sigma_i. \tag{7.1.1} \]
The set of $n$-chains, equipped with the operation defined in Equation (7.1.1), forms the group of $n$-chains $(C_n, +)$, or simply $C_n$ if the context allows us to simplify the notation. The neutral element is 0 and the associativity of $+$ is inherited from the sum in $\mathbb{Z}/2\mathbb{Z}$. The inverse of an element $c \in C_n$ is $-c = c$, since $c + c = 0$. Moreover, $C_n$ is abelian since the addition modulo 2 is commutative.
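With coefficients in $\mathbb{Z}/2\mathbb{Z}$, an $n$-chain is determined by the set of simplices with coefficient 1, and the sum of Equation (7.1.1) becomes a symmetric difference of sets. The following Python sketch illustrates this; the representation of simplices as frozensets of vertex indices and the helper names are assumptions made for the example.

```python
def simplex(*vertices):
    """A simplex, represented by the (unordered) set of its vertices."""
    return frozenset(vertices)

def add_chains(c, d):
    """Sum of two chains over Z/2Z: simplices appearing in exactly one of them."""
    return c ^ d  # symmetric difference of sets

# Two 1-chains sharing the edge [1, 2].
c = {simplex(0, 1), simplex(1, 2)}
d = {simplex(1, 2), simplex(2, 3)}

total = add_chains(c, d)  # the shared edge cancels, since 1 + 1 = 0 mod 2
zero = add_chains(c, c)   # every chain is its own inverse
```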
The group of $n$-chains is defined for every $n \in \mathbb{Z}$. In particular, if $n$ is less than 0 or greater than the dimension of $K$, $C_n$ is trivial. It is possible to link chain groups of different dimensions by defining the boundary homomorphisms.

Figure 7.1: A 3-simplex and a 2-simplex and their respective boundaries.

7.1.2 Boundary homomorphisms and homology groups

The boundary of the $n$-simplex $\sigma = [v_0, \dots, v_n]$, denoted by $\partial_n(\sigma)$, is the sum of its $(n-1)$-faces. In symbols we have
\[ \partial_n(\sigma) = \sum_{i=0}^{n} [v_0, \dots, \hat{v}_i, \dots, v_n], \]
where the hat indicates the vertex to be dropped in order to obtain the $i$th face of $\sigma$. In Figure 7.1 a tetrahedron and a triangle are depicted along with their respective boundaries. When considering coefficients in fields different from $\mathbb{Z}/2\mathbb{Z}$, it is necessary to orient each simplex by taking into account the indices of its vertices. Thus, it is necessary to define the boundary homomorphism as an alternating sum, rather than a simple one (Hatcher, 2002, Ch. 2).

The boundary of an $n$-chain is the sum of the boundaries of its simplices. Let $c$ and $d$ be two $n$-chains, then $\partial_n(c + d) = \partial_n(c) + \partial_n(d)$. Hence, $\partial_n : C_n \to C_{n-1}$ is a group homomorphism. The sequence
\[ \cdots \xrightarrow{\partial_{n+2}} C_{n+1} \xrightarrow{\partial_{n+1}} C_n \xrightarrow{\partial_n} \cdots \xrightarrow{\partial_2} C_1 \xrightarrow{\partial_1} C_0 \xrightarrow{\partial_0} 0 \]
of the abelian groups $\{ C_n \}_n$ equipped with the boundary homomorphisms, where $\partial_0 = 0$, is called a chain complex.

An $n$-chain $c$ with zero boundary, i.e. such that $\partial_n(c) = 0$, is called an $n$-cycle. The collection of $n$-cycles, denoted by $Z_n$, is the kernel of the boundary homomorphism $\partial_n$ and consequently a subgroup of $C_n$. An $n$-boundary is an $n$-chain $c$ such that $\partial_{n+1}(d) = c$ for some $d \in C_{n+1}$. In other words, an $n$-boundary is an $n$-chain which is the boundary of an $(n+1)$-chain. In particular, the collection of $n$-boundaries is $B_n = \operatorname{Im} \partial_{n+1}$, and hence $B_n \subseteq C_n$ is also a subgroup.

Lemma 7.1.1. $\partial_n \circ \partial_{n+1} = 0$ for every $n \in \mathbb{Z}$.

Proof. Let $\sigma$ be an $(n+1)$-simplex. Its boundary consists of the sum of its $n$-faces, and each $(n-1)$-face belongs to exactly two $n$-faces, hence $\partial_n \partial_{n+1}(\sigma) = 0$ modulo 2.

Figure 7.2: Representation of a chain complex associated to a 3-dimensional simplicial complex.

It is simple to verify this property in low dimension. For instance, consider the boundary of the boundary of a 2-simplex:
\[ \partial_1 \partial_2([v_0, v_1, v_2]) = \partial_1([v_0, v_1] + [v_1, v_2] + [v_2, v_0]). \]
As we said above, the boundary homomorphisms commute with the sum, hence
\[ \partial_1([v_0, v_1] + [v_1, v_2] + [v_2, v_0]) = 2(v_0 + v_1 + v_2) = 0 \mod 2. \]

From Lemma 7.1.1 it follows that every $n$-boundary is an $n$-cycle. Hence, $B_n$ is a subgroup of $Z_n$. We can consider the quotient $Z_n / B_n$, whose elements are the cosets of the form $c + B_n$, where $c \in Z_n$. This quotient is an abelian group, since $Z_n$ is. In Figure 7.2 a chain complex associated to a simplicial complex of dimension 3 is depicted. Observe how the chain groups $C_4$ and $C_{-1}$ are both trivial. In particular, the triviality of the latter implies $C_0 = Z_0$. The dashed lines between consecutive chain groups denote the effect of the boundary homomorphism. The $n$-cycles vanish when mapped to the lower chain group; the $n$-boundaries are represented as a subset of the subgroup of $n$-cycles and correspond to the image of $\partial_{n+1}$.

Definition 7.1.1. The $n$th homology group of a chain complex is the quotient $H_n = Z_n / B_n$, for $n \in \mathbb{Z}$. Two cycles are said to be homologous if they are in the same coset.

Each chain group $C_n$ is also a vector space over $\mathbb{Z}/2\mathbb{Z}$, for every $n \in \mathbb{Z}$. Furthermore, $C_n \simeq (\mathbb{Z}/2\mathbb{Z})^{s_n}$, where $s_n \in \mathbb{N} \cup \{ 0 \}$ is the number of $n$-simplices in $K$. Hence, $C_n$ is generated by $s_n$ elements. These elements can be thought of as the vectors having only one non-zero component, corresponding to the $i$th $n$-simplex of $K$, for $i \in \{ 1, \dots, s_n \}$. The same structure is inherited by the cycles and the boundaries of $C_n$.
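The identity stated in Lemma 7.1.1 can be checked mechanically on small examples. The sketch below is a minimal Python illustration over $\mathbb{Z}/2\mathbb{Z}$; it assumes that simplices are stored as sorted tuples of vertex indices and chains as sets of simplices, and the function names are hypothetical.

```python
from itertools import combinations

def boundary_of_simplex(simplex):
    """Faces obtained by dropping one vertex from a sorted tuple of vertices."""
    if len(simplex) == 1:
        return set()  # the boundary of a vertex is zero
    return set(combinations(simplex, len(simplex) - 1))

def boundary(chain):
    """Boundary of a chain over Z/2Z, extended linearly to sums of simplices."""
    result = set()
    for simplex in chain:
        # Adding modulo 2 cancels the faces that appear an even number of times.
        result ^= boundary_of_simplex(simplex)
    return result

triangle = {(0, 1, 2)}
tetrahedron = {(0, 1, 2, 3)}

triangle_boundary = boundary(triangle)
double_boundary_triangle = boundary(triangle_boundary)
double_boundary_tetrahedron = boundary(boundary(tetrahedron))
```

Both double boundaries are empty: each vertex of the triangle lies on two of its edges, and each edge of the tetrahedron lies on two of its triangular faces.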
We define the $n$th Betti number of $K$ as $\beta_n = \dim H_n = \dim Z_n - \dim B_n$.

7.1.3 An algorithm for computing homology

To compute the Betti numbers of a simplicial complex $K$, it is necessary to introduce its matrix representation. The information provided by the boundary homomorphisms is stored in a collection of matrices called boundary matrices. The $n$th boundary matrix $B_n$ is defined by $B_n(i, j) = 1$ if the $i$th $(n-1)$-simplex is a face of the $j$th $n$-simplex, and $B_n(i, j) = 0$ otherwise. The ordering used to store the simplices in the boundary matrix is the one induced by the vertices of the simplicial complex.

Figure 7.3: 2-dimensional simplicial complex.

Example 7.1.1 (Boundary Matrices). Consider the simplicial complex $K$ of Figure 7.3. Its 0-boundary matrix is
\[ B_0 = \begin{pmatrix} 0 & 0 & 0 & 0 \end{pmatrix}, \]
since the 0-simplices (columns of the matrix) have no faces. Ordering the 1-simplices of $K$ as $[v_0, v_1], [v_0, v_2], [v_1, v_2], [v_1, v_3], [v_2, v_3]$ on the columns, and the vertices following their subscript indices on the rows, we have
\[ B_1 = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}. \]
The last boundary matrix has one column, corresponding to $[v_0, v_1, v_2]$, and five rows, corresponding to the 1-simplices of $K$ ordered as before:
\[ B_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \]

Let $v$ be the column vector of the coefficients of an $n$-chain $c$. Its boundary is computed as $B_n \cdot v$, where $\cdot$ is the standard matrix product. The vector $v$ is an $n$-cycle if and only if $B_n \cdot v = 0$, and it is an $n$-boundary if and only if there exists a vector $u \in C_{n+1}$ such that $B_{n+1} \cdot u = v$. The rank of the $n$-chain group $C_n$ is the number of $n$-simplices in the simplicial complex $K$; let us denote it by $s_n$, hence the $n$th boundary matrix $B_n \in \operatorname{Mat}(s_{n-1}, s_n)$. To represent the sizes of $B_n$ and $Z_n$, and consequently $H_n$, the matrix $B_n$ is reduced to normal form, as shown in Figure 7.4. The operations required to achieve the normal form reduction of the matrix are equivalent to the ones used in Gaussian reduction to solve linear systems of equations.
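The computation of the Betti numbers from the boundary matrices of Example 7.1.1 can be sketched in Python. The sketch below uses ordinary Gaussian elimination over $\mathbb{Z}/2\mathbb{Z}$ to obtain the ranks, which is a simplification: it yields the same ranks as a reduction to normal form, but it does not produce the normal form itself. The function names are hypothetical.

```python
def rank_gf2(matrix):
    """Rank over Z/2Z by Gaussian elimination; matrix is a list of 0/1 rows."""
    rows = [row[:] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] == 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] == 1:
                rows[r] = [(a + b) % 2 for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def betti_numbers(simplex_counts, boundary_matrices):
    """simplex_counts[n] is s_n; boundary_matrices[n - 1] is the n-th matrix, n >= 1."""
    # Rank of the 0-th boundary map and of the map above the top dimension are 0.
    ranks = [0] + [rank_gf2(b) for b in boundary_matrices] + [0]
    # beta_n = dim Z_n - dim B_n = (s_n - rank of d_n) - rank of d_{n+1}
    return [s - ranks[n] - ranks[n + 1] for n, s in enumerate(simplex_counts)]

# Boundary matrices of Example 7.1.1 (rows: faces, columns: simplices).
B1 = [[1, 1, 0, 0, 0],
      [1, 0, 1, 1, 0],
      [0, 1, 1, 0, 1],
      [0, 0, 0, 1, 1]]
B2 = [[1], [1], [1], [0], [0]]

betti = betti_numbers([4, 5, 1], [B1, B2])
```

The result agrees with the geometry of the complex: one connected component, one 1-dimensional hole bounded by $[v_1, v_2]$, $[v_1, v_3]$ and $[v_2, v_3]$, and no 2-dimensional void.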
See Algorithm 7.1 for the pseudocode.

Figure 7.4: Reduced $n$th boundary matrix.

Algorithm 7.1 Boundary Matrix Reduction.
Input: $B_n$, a boundary matrix with $s_{n-1}$ rows and $s_n$ columns
Output: $R$, the reduced boundary matrix
  for m ∈ { 1, ..., s_n } do
    if there exist i ≥ m and j ≥ m such that B_n(i, j) == 1 then
      exchange the rows m and i and the columns m and j;
      for k ∈ { m + 1, ..., s_{n-1} } do
        if B_n(k, m) == 1 then
          add row m to row k;
        end if
      end for
      for l ∈ { m + 1, ..., s_n } do
        if B_n(m, l) == 1 then
          add column m to column l;
        end if
      end for
    end if
  end for
  R ← B_n

On every iteration on $m$ at most a linear number of row and column operations is performed. Hence the total running time is at most $O(N^3)$, where $N$ is the number of simplices of $K$.

Example 7.1.2 (Boundary matrix reduction). Consider the boundary matrices of Example 7.1.1. In their normal form they are
\[ R_0 = B_0 = \begin{pmatrix} 0 & 0 & 0 & 0 \end{pmatrix}, \qquad R_1 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \qquad R_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \]
Setting $z_n = \operatorname{rank} Z_n$ and $b_n = \operatorname{rank} B_n$, we have $z_0 = 4$ from $R_0$ and $b_0 = 3$ from $R_1$, hence the 0th Betti number is $\beta_0 = 1$, which is exactly the expected value, since the simplicial complex of Figure 7.3 has one connected component. In dimension 1 we have $z_1 = 2$ and $b_1 = 1$, thus $\beta_1 = 1$, corresponding to the 1-dimensional hole of the simplicial complex. Finally, $z_2 = 0$ and hence $\beta_2 = 0$.

7.2 From homology to persistent homology

7.2.1 An intuition

Observing a shape for the first time, one tries to identify its persistent properties by neglecting details that can be easily lost by changing the position of the shape or hidden by small occlusions. The idea behind persistent homology is to measure these properties by rebuilding a shape as a monotonic sequence of nested spaces called a filtration.
In Figure 7.5, we consider a point cloud subsampled from the image of a manuscript note, and we associate to each point a disc of radius $r \in \mathbb{R}$. As the radius increases, the blob formed by the union of the discs assumes the shape of a musical note. Disconnected regions of the blob, as well as the small holes generated by partial intersections of the discs, do not impede the perception of the whole shape. Moreover, the radius has to be increased considerably before the persistent shape of the note is hidden.

As an example, consider the classification of manuscript notes, in order to recognise their author. The filtration produced above is sensitive to variations of the relative positions of the points, but invariant under uniform translations or rotations of the whole point cloud. Hence, this particular choice is suitable for discriminating the author who wrote the notes: they can be rotated according to their position on the score, but they are generally characterised by similar shapes of the head, or thickness of the stem.

Figure 7.5: Three topological spaces corresponding to a part of the filtration associated to a cloud of points derived from an image representing a manuscript note.

Shortly, we shall develop the theory needed to define a filtration, either through a continuous function on a topological space or as a sequence of nested subcomplexes of a given simplicial complex. Furthermore, the noisy and persistent properties of the shape will be represented by a multiset of 2-dimensional points. This representation provides a surprisingly suitable framework for the comparison of shapes and, in our case, of musical compositions.
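The growing-disc construction can be imitated on a toy point cloud. The sketch below is only an illustration under stated assumptions: the four points are invented, the function name `components_at_radius` is hypothetical, and only connected components are tracked (holes are ignored). Two closed discs of radius r intersect exactly when their centres are at distance at most 2r, so components can be counted with a union-find structure.

```python
from math import dist


def components_at_radius(points, r):
    """Number of connected components of the union of closed discs of radius r."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= 2 * r:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


# Hypothetical point cloud: three nearby points and one far away.
cloud = [(0, 0), (1, 0), (0, 1), (5, 5)]
shifted = [(x + 10, y - 3) for x, y in cloud]  # uniform translation of the cloud

radii = (0.4, 0.6, 4.0)
counts = [components_at_radius(cloud, r) for r in radii]
counts_shifted = [components_at_radius(shifted, r) for r in radii]
print(counts, counts_shifted)
```

The counts depend only on pairwise distances, which is the invariance under translations and rotations mentioned above; the far point keeps its own component until the radius becomes large.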
7.2.2 Persistent homology for topological spaces

Here and for the remainder of the dissertation, we assume $X$ to be a triangulable topological space and $f : X \to \mathbb{R}$ a continuous function. We recall that homology is computed with coefficients in $\mathbb{Z}/2\mathbb{Z}$.

Homological critical values and tame functions

Let $f : X \to \mathbb{R}$ be a continuous function on $X$. We denote by $X_u = f^{-1}((-\infty, u])$ the sub-level set of the function $f$, for every $u \in \mathbb{R}$. Consider the topological sphere $TS$ depicted in Figure 7.6. The real numbers $a_1 \le a_2 \le \dots \le a_7$ generate seven nested sub-level sets $TS_{a_i} = f^{-1}((-\infty, a_i])$ of the height function $f : TS \to \mathbb{R}$, for $i \in \{1, \dots, 7\}$.

Figure 7.6: Sub-level sets of the height function on a topological sphere. The six critical points of the height function are depicted as red dots.

To safely introduce persistent homology, two fundamental ingredients have to be defined: homological critical values (Govc, 2013) and tame functions.

Definition 7.2.1. Let $X$ be a topological space and $f : X \to \mathbb{R}$ a continuous function. A real number $a$ is called a homological regular value of $f$ if there exists $\varepsilon > 0$ such that, for every pair of real numbers $x < y$ in the interval $(a - \varepsilon, a + \varepsilon)$, the inclusion $f^{-1}((-\infty, x]) \hookrightarrow f^{-1}((-\infty, y])$ induces isomorphisms on all homology groups. Otherwise, $a$ is called a homological critical value of $f$.

Definition 7.2.2. Let $X$ be a triangulable topological space. A continuous function $f : X \to \mathbb{R}$ is tame if it has a finite number of homological critical values and the homology groups $H_k(X_u)$ are finite-dimensional for every $u \in \mathbb{R}$ and $k \in \mathbb{Z}$.

Note that, in general, the definition of a tame function requires the homology groups of every sub-level set to have finite dimension. The fact that $f$ has a finite number of homological critical values ensures that changes in the homology groups occur a finite number of times along the filtration, and only at these critical values.
Examples of tame functions are Morse functions on manifolds and piecewise linear functions on triangulable topological spaces.

Persistence modules and persistent Betti numbers

Definition 7.2.3. Let $X$ be a triangulable topological space, $f : X \to \mathbb{R}$ a tame function and $u, v \in \mathbb{R}$ such that $u < v$. The $k$th persistence module $H_k^{u,v}$ is the image of the homomorphism $\iota_k^{u,v} : H_k(X_u) \to H_k(X_v)$ induced by the inclusion $X_u \hookrightarrow X_v$.

The $k$-persistent Betti number is defined as $\beta_{f,k}^{u,v} = \dim \operatorname{Im}(\iota_k^{u,v})$, also written $\beta_{f,k}(u, v)$, for every $k \in \mathbb{Z}$. It counts the homology classes of dimension $k$ surviving in the passage from $X_u$ to $X_v$. It makes sense to speak of the dimension of the image of the map induced by the inclusion $X_u \hookrightarrow X_v$ because, with coefficients in a field, $H_k(X_u)$ and $H_k(X_v)$ are vector spaces and $\iota_k^{u,v}$ is a linear map.

Figure 7.7: The sequence $H_k(X_{i-1}) \to H_k(X_i) \to \dots \to H_k(X_{j-1}) \to H_k(X_j)$. The class $\alpha$ is born in $X_i$, since it is not in the image of $\iota_k^{i-1,i}$, depicted in green. It dies in $X_j$, since it merges into $\operatorname{Im} \iota_k^{i-1,j}$.

It is now possible to define the filtration of the topological space $X$ induced by the sub-level sets of a tame function $f : X \to \mathbb{R}$. By Definition 7.2.2, $f$ has a finite number of homological critical values, say $\{c_1, \dots, c_n\}$, where $n \in \mathbb{N}$. Let $\{r_0, \dots, r_n\}$ be homological regular values of $f$ such that $r_{i-1} < c_i < r_i$ for every $i \in \{1, \dots, n\}$. It is possible to define a filtration of $X$ as the collection of nested subspaces $\{X_{r_i}\}_{i \in \{0, \dots, n\}}$, such that
\[ X_{r_0} \subseteq X_{r_1} \subseteq \dots \subseteq X_{r_n}. \]
In addition, set $r_{-1} = c_0 = -\infty$ and $r_{n+1} = c_{n+1} = +\infty$, in order to include the empty set and the whole space in the filtration. If the context is clear, we denote the subspaces of this filtration simply by $X_i$. Finally, by traversing the filtration, we witness a finite number of changes (since $f$ is tame) of the homology groups associated to each subspace $X_i$, for $i \in \{0, \dots, n + 1\}$.
In Figure 7.7, we say that the homology class $\alpha$ is born entering $X_i$, since it does not come from a class in $X_{i-1}$. Symmetrically, we say that $\alpha$ dies entering $X_j$ if the image of the map induced by the inclusion $X_{i-1} \subseteq X_j$ contains the image of $\alpha$, whereas the image of the map induced by $X_{i-1} \subseteq X_{j-1}$ does not.

Persistence barcodes and persistence diagrams

The information retrieved by analysing the lifespan of homology classes along the filtration can be represented as a diagram called persistence diagram. In such a diagram the birth and death-levels of each homology class are represented as points lying in the open half-plane above the diagonal and endowed with a multiplicity (Frosini and Landi, 2001; Ferri et al., 2011).

To describe how the persistence diagram is built, it is necessary to introduce another fundamental ingredient of persistent homology: the pairing. Under the hypotheses of triangulability of $X$ and tameness of $f$, let $c_i$ be a homological critical value of $f$. If $c_i$ is responsible for the birth of a homology class $\alpha$, it is paired with the homological critical value $c_j$ responsible for its death (if it exists). The lifespan of $\alpha$ corresponds to the half-open interval $[c_i, c_j)$, with $i < j$. Homology classes with infinite lifespan are paired with $\infty$. The collection of intervals retrieved by running through the whole filtration is called a persistence barcode (Carlsson et al., 2005; Ghrist, 2008).

The persistence diagram is built by considering each pair $(c_i, c_j)$ as a point in $\mathbb{R}^2$ or, to be more precise, in the closure of the Euclidean plane including the points at infinity. Let $\Delta = \{ (u, v) \in \mathbb{R}^2 \mid u = v \}$ be the diagonal of the Euclidean plane, $\Delta^+ = \{ (u, v) \in \mathbb{R}^2 \mid u < v \}$ the open half-plane above the diagonal, and $\Delta^* = \Delta^+ \cup \{ (u, \infty) \mid u \in \mathbb{R} \}$ the extension of $\Delta^+$ including the points at infinity. Observe that the definition of $\Delta^*$ is necessary in order to describe cycles with infinite lifespan.

Definition 7.2.4. Let $(u, v)$ be a point of $\Delta^+$.
The number $\mu(u, v)$ realising the minimum, over the real numbers $\varepsilon > 0$ with $u + \varepsilon < v - \varepsilon$, of
\[ \beta_{f,k}(u + \varepsilon, v - \varepsilon) - \beta_{f,k}(u - \varepsilon, v - \varepsilon) - \beta_{f,k}(u + \varepsilon, v + \varepsilon) + \beta_{f,k}(u - \varepsilon, v + \varepsilon) \]
is called the multiplicity of $(u, v)$ for the persistent Betti number $\beta_{f,k}$, with $k \in \mathbb{Z}$. A point $(u, v)$ is called a proper cornerpoint for $\beta_{f,k}$ if its multiplicity is strictly positive.

Definition 7.2.5. Let $r : u = \bar{u}$ be a vertical line in $\mathbb{R}^2$. We identify it with its point at infinity $(\bar{u}, \infty) \in \Delta^*$. The multiplicity $\mu(\bar{u}, \infty)$ is the minimum, over the positive real numbers $\varepsilon$ with $\bar{u} + \varepsilon < 1/\varepsilon$, of
\[ \beta_{f,k}\left(\bar{u} + \varepsilon, \tfrac{1}{\varepsilon}\right) - \beta_{f,k}\left(\bar{u} - \varepsilon, \tfrac{1}{\varepsilon}\right). \]
A point at infinity endowed with a strictly positive multiplicity is called a cornerpoint at infinity for $\beta_{f,k}$.

Finally, we can introduce the definition of persistence diagram.

Definition 7.2.6. The $k$-persistence diagram $D_k(f)$ is the multiset of all cornerpoints for $\beta_{f,k}$ (each cornerpoint being equipped with its multiplicity), together with the points of the diagonal $\Delta$ counted with infinite multiplicity.

The persistence barcode and the persistence diagram associated to the filtration of a topological sphere are depicted in Figure 7.8. Green intervals describe the information concerning the 0-dimensional persistence of the shape, while the blue ones describe the contribution of the 2-persistence module. Finite intervals correspond to proper cornerpoints, while the cornerpoints at infinity are represented as vertical half-lines, also called cornerlines.

Figure 7.8: An example of persistence barcode and persistence diagram. Noisy classes are represented as short bars in the barcode and as points near the diagonal in the diagram. The critical points of the height function are denoted by red circles. According to their labels, the pairing is given by $(c_1, \infty)$, $(c_2, c_4)$, $(c_3, c_5)$ and $(c_6, \infty)$.

In conclusion, persistence diagrams describe the topological and geometrical properties of a shape $X$. These properties are retrieved by analysing the birth and death-levels of the homology classes of the nested spaces determined by the filtration induced by the sub-level sets of a tame function $f$. Moreover, the lifespan of the homology classes represented by a cornerpoint is proportional to its distance from the diagonal. Thus, noisy and persistent homology classes are represented by cornerpoints lying near to or far from the diagonal, respectively.

Bottleneck distance

Persistence diagrams are simpler than the shape they represent, and they describe its topological and geometrical properties as highlighted by the homological critical values of the function used to build the filtration. The bottleneck distance allows us to compare such diagrams.

Definition 7.2.7. Let $X$ be a triangulable topological space and $f, g : X \to \mathbb{R}$ two tame functions. The bottleneck distance between $D_k(f)$ and $D_k(g)$ is
\[ d_B(D_k(f), D_k(g)) = \inf_{\gamma} \sup_{p \in D_k(f)} \| p - \gamma(p) \|_\infty, \]
where $\gamma : D_k(f) \to D_k(g)$ ranges over the bijections between the two diagrams and, for $p = (u, v)$ and $\gamma(p) = (u', v')$, $\| p - \gamma(p) \|_\infty = \max \{ |u - u'|, |v - v'| \}$.

In Figure 7.9 a bijection between two $k$-persistence diagrams is depicted. Cornerpoints belonging to the two diagrams are depicted in orange and yellow, respectively. Observe how the inclusion of the points of $\Delta$ allows the comparison of multisets of points whose underlying sets have different cardinality (see Section 3.1 for a definition of multiset), by associating one of the purple points to one of the points lying on the diagonal.

Figure 7.9: A matching between two k-persistence diagrams. The bijection between elements of the diagrams is denoted using left-right arrows.

An important property of persistence diagrams is their stability: a small perturbation of the tame function $f$ produces small variations in the persistence diagram with respect to the bottleneck distance.

Theorem 7.2.1. Let $X$ be a triangulable topological space and $f, g : X \to \mathbb{R}$ two tame functions.
For every integer $k$, the inequality
\[ d_B(D_k(f), D_k(g)) \le \| f - g \|_\infty, \]
where $\| f - g \|_\infty = \sup_x |f(x) - g(x)|$, holds.

7.2.3 An algorithm for computing persistence

Persistence is computed through an algorithm mirroring the one we described in Algorithm 7.1. Let $K$ be a triangulation of $X$, and $\tilde{f} : K \to \mathbb{R}$ a monotone function, that is, such that $\tilde{f}(\tau) \le \tilde{f}(\sigma)$ if $\tau$ is a face of $\sigma$. Consider an ordering $\sigma_1, \dots, \sigma_m$ of the simplices of $K$ such that each simplex is preceded by its faces and $\tilde{f}$ is non-decreasing. This ordering allows us to store the simplicial complex in a boundary matrix $B$, whose entries are defined as
\[ B(i, j) = \begin{cases} 1 & \text{if } \sigma_i < \sigma_j, \\ 0 & \text{otherwise,} \end{cases} \tag{7.2.1} \]
where $\sigma_i < \sigma_j$ means that $\sigma_i$ is a face of codimension 1 of $\sigma_j$.

The algorithm receives as input a boundary matrix $B$ and reduces it to a new 0-1 matrix $R$ via elementary column operations. Let $J = \{1, \dots, m\}$ be the set of column indices of $B$ and
\[ \mathrm{low}_R : J \to \mathbb{N}, \qquad j \mapsto l, \]
where $l$ is the row index of the lowest 1 entry of the $j$th column. If a column has only 0 entries, $\mathrm{low}_R(j)$ is undefined. A matrix $R$ is reduced if, for every pair of non-zero columns of indices $j \neq j'$, $\mathrm{low}_R(j) \neq \mathrm{low}_R(j')$. The reduction process is described recursively in Algorithm 7.2. The corresponding Python code is available in Appendix C.

Algorithm 7.2 Persistence Algorithm.
Input: B ⊲ a boundary matrix
Output: R ⊲ the reduced boundary matrix
function R = reduce(B)
    R = B
    for j ∈ { 1, . . . , m } do ⊲ m is the number of simplices in K
        for j0 ∈ { 1, . . . , j − 1 } do
            if low_R(j) and low_R(j0) are defined and low_R(j) == low_R(j0) then
                R_j = R_j + R_j0 mod 2 ⊲ add the j0-th column to the j-th column
                R = reduce(R)
                return R
            end if
        end for
    end for
    return R ⊲ if B is already reduced, return B
end function

Thus, the algorithm computes an upper-triangular invertible matrix $V = (v(i, j))$, whose entries are elements of $\mathbb{Z}_2$, such that $R = BV$. From the reduced matrix $R$ it is possible to deduce the pairing of critical simplices, and thus the $k$-persistence diagram for every $k \in \mathbb{Z}$.
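The column reduction can also be written iteratively. The following Python sketch is not the code of Appendix C: it is a standard left-to-right variant of the reduction, given here under the assumption that each column is stored as the set of row indices of its 1 entries, and the names `reduce_boundary` and `pairing` are illustrative. It is applied to the complex of Example 7.2.1 (four vertices, five edges and one triangle, numbered 1 to 10 in filtration order).

```python
def reduce_boundary(boundary):
    """Column reduction over Z/2Z.

    `boundary` maps each simplex index j to the set of indices of its
    codimension-1 faces, i.e. the rows holding a 1 in column j.
    """
    reduced = {j: set(faces) for j, faces in boundary.items()}
    owner = {}  # lowest row index -> column that already has this lowest 1
    for j in sorted(reduced):
        column = reduced[j]
        # While an earlier column shares the same lowest 1, add it (mod 2).
        while column and max(column) in owner:
            column ^= reduced[owner[max(column)]]
        if column:
            owner[max(column)] = j
    return reduced


def pairing(reduced):
    """Pairs (i, j) with i = low(j), plus the simplices left unpaired."""
    pairs = sorted((max(col), j) for j, col in reduced.items() if col)
    paired = {index for pair in pairs for index in pair}
    unpaired = sorted(j for j in reduced if j not in paired)
    return pairs, unpaired


# Order: v0, v1, v2, v3, e01, e12, e20, e13, e23, t012 (indices 1 to 10).
boundary = {
    1: set(), 2: set(), 3: set(), 4: set(),
    5: {1, 2}, 6: {2, 3}, 7: {1, 3}, 8: {2, 4}, 9: {3, 4},
    10: {5, 6, 7},
}

R = reduce_boundary(boundary)
pairs, unpaired = pairing(R)
print(pairs, unpaired)
```

Zero columns whose index never appears as a lowest 1 correspond to classes that survive the whole filtration; in this complex they are the first vertex (the connected component) and one edge closing an unfilled cycle.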
Consider a pair of simplices $(\sigma_i, \sigma_j)$ such that $i = \mathrm{low}_R(j)$. We call $\sigma_i$ positive and $\sigma_j$ negative, since the homology class created by $\sigma_i$ dies when $\sigma_j$ is introduced. By construction of $B$ in Equation (7.2.1), $\dim \sigma_i = \dim \sigma_j - 1 = n$, thus the point $(\tilde{f}(\sigma_i), \tilde{f}(\sigma_j))$ has to be added to the $n$-dimensional persistence diagram. Observe that the reduced matrix is not unique; for instance, it can be computed as the complete Smith normal form of $B$. However, the points $(\tilde{f}(\sigma_i), \tilde{f}(\sigma_j))$ do not depend on the choice of $R$. Given a $k \times l$ matrix $M$, denote by $M_i^j$ the submatrix of $M$ obtained by deleting the first $i - 1$ rows and the last $l - j$ columns, and define
\[ r_B(i, j) = \operatorname{rank} B_i^j - \operatorname{rank} B_{i+1}^j + \operatorname{rank} B_{i+1}^{j-1} - \operatorname{rank} B_i^{j-1}. \]

Lemma 7.2.2. Let $B = RU$, with $U = V^{-1}$, be a decomposition of $B$. Then $\mathrm{low}_R(j) = i$ if and only if $r_B(i, j) = 1$. In particular, the pairing function does not depend on the choice of $R$.

Proof. A proof of the Pairing Uniqueness Lemma can be found in (Cohen-Steiner et al., 2006, Sec. 3).

If the number of simplices in the complex is $m$, then the algorithm runs in $O(m^3)$ time in the worst case. For instance, the Vietoris-Rips complex of an $n$-point cloud has at most $\binom{n}{k+1}$ simplices in each dimension $k$, which makes the computation time prohibitive. It is possible to mitigate this issue by computing persistence only in low dimensions, or by bounding the radius associated to each point. We give an explicit example of the construction of the persistent boundary matrix and of the deduction of the pairing in the following example.

Figure 7.10: 2-dimensional simplicial complex.

Example 7.2.1. Consider the simplicial complex $K$ depicted in Figure 7.10. It consists of 10 simplices: 4 vertices, 5 edges and 1 triangle. To get a filtration we add them following the order induced by their dimension.
Since each simplex has to be associated to a column and a row of the boundary matrix, we number them from 1 to 10 following the ordering defined by the filtration. Hence, the simplices are placed in the order $(v_0, v_1, v_2, v_3, e_{01}, e_{12}, e_{20}, e_{13}, e_{23}, t_{012})$ on the rows and columns of the matrix, both indexed from 1 to 10:
\[ B = \begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}. \]
Following the algorithm, we look for columns whose lowest 1 entries have the same row index, i.e. columns $B(j)$ and $B(j_0)$ with $\mathrm{low}_B(j) = \mathrm{low}_B(j_0)$, $j \in \{1, \dots, 10\}$ and $j_0 < j$. In this case we have $\mathrm{low}_B(7) = \mathrm{low}_B(6)$ and $\mathrm{low}_B(9) = \mathrm{low}_B(8)$. We sum the columns of each pair (column 6 is added to column 7, and column 8 to column 9), obtaining
\[ R_1 = \begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}. \]
Iterating this process once more, we obtain the reduced boundary matrix $R_2 = R$, depicted in its decomposition $R = BV$ in Figure 7.11. The iteration of the algorithm stops at its second step, since there are no more pairs with $\mathrm{low}_R(j) = \mathrm{low}_R(j_0)$ and $j_0 < j$, as $j$ ranges over the column indices of the matrix.

Figure 7.11: Reduction of the persistent boundary matrix to normal form.

Consider the reduced matrix $R$ of Figure 7.11. The red 1 of the 5th column lies in row 2; that is to say, the vertex 2 creates a 0-cycle, which is killed by the edge 5. The same argument holds for the 1 entries in positions $(3, 6)$ and $(4, 8)$. When the edges 7 and 9 appear nothing changes in terms of death of cycles, since the columns $R(7)$ and $R(9)$ have only 0 entries.
These two zero columns correspond to two 1-cycles, generated by the edges $(5, 6, 7)$ and $(6, 8, 9)$, as shown by the 7th and 9th columns of the matrix $V$ in Figure 7.11. The triangle kills the first 1-cycle, the one created when the edge 7 is added to the complex, as recorded by the entry in position $(7, 10)$, while the other one survives along the whole filtration.

Chapter 8. A topological fingerprint for music analysis

A persistence diagram is a fingerprint of a shape that represents its geometrical and topological properties as a multiset of 2-dimensional points. The deformations of the Tonnetz we discussed in Chapter 6 can be analysed using the height function defined on T to induce a filtration on the fundamental domain F. Moreover, the fundamental domain of the Tonnetz is completely rebuilt by this filtration. This makes the threshold we defined in Section 6.1.1 unnecessary and allows us to take into account pitch classes that are less used in the piece, but that could reveal interesting properties of musical phrases, or of whole compositions.

In the first part of this chapter we set up the machinery needed to safely compute persistent homology on the deformation of the Tonnetz, and we analyse the persistence diagrams associated to several pieces of music. In the second part we use the bottleneck distance (see Definition 7.2.7) to provide a distance between musical pieces and to classify them according to their persistent properties.

Remark 11. In the following applications, the persistence diagrams and the bottleneck distance are computed using Dionysus, a C++ library for persistent homology available at http://www.mrzv.org/software/dionysus/.

8.1 Persistent homology classification of deformed Tonnetze

The main aim of this section is to compute the persistent homology of the deformed Tonnetz we described in Section 6.1. In the previous chapter, we showed how a filtration can be defined by considering a tame function $f$ on a topological space $X$. A filtration of a finite simplicial complex $K$ can be provided as a sequence of nested subcomplexes $\{K_0, \dots, K_n\}$ containing as its first and last elements (considering the ordering induced by the index of the subcomplexes of the filtration) the empty set and $K$, respectively. In our case, it is necessary to define a filtration on a simplicial complex $K$ equipped with a function $f : V \to \mathbb{R}$ defined on its 0-skeleton.

8.1.1 The lower star filtration

Let $f_V : V \to \mathbb{R}$ be a real-valued function defined on the vertices of $K$. Linearly extending $f_V$ we obtain a piecewise linear function $f : |K| \to \mathbb{R}$ such that $f(x) = \sum_i b_i(x) f_V(v_i)$, where the $b_i(x)$ are the barycentric coordinates of $x$. This continuous, piecewise linear extension of $f_V$ allows us to build a filtration of $K$. Moreover, since $K$ is finite, $f$ is tame.

Figure 8.1: Lower star filtration of a simplicial complex.

Remark 12. It is possible to show that, under some hypotheses, every filtration of a simplicial complex $K$ is induced by a continuous function $f : K \to \mathbb{R}$ (Di Fabio and Frosini, 2013).

Assume that $f$ is injective on the vertices of $K$. Then it is possible to order them as
\[ f(v_0) < f(v_1) < \dots < f(v_n). \tag{8.1.1} \]
For every $0 \le i \le n$ define $K_i$ as the subcomplex spanned by the vertices $v_0, \dots, v_i$, i.e. a simplex $\sigma$ belongs to $K_i$ if and only if all its vertices are smaller than or equal to $v_i$ with respect to the ordering of Equation (8.1.1). The lower star of $v_i$ is defined as
\[ \underline{\mathrm{St}}\, v_i = \{ \sigma \in \mathrm{St}\, v_i \mid x \in \sigma \Rightarrow f(x) \le f(v_i) \}, \]
where $\mathrm{St}\, v_i$ is the star of $v_i$, as defined in Section 2.1.2 (see also Figure 2.3 for an intuition). Since we assumed $f$ to be injective on $V$, each simplex $\sigma$ has a unique maximum vertex, so that $\sigma$ belongs to a unique lower star. Moreover, $K_i = \bigcup_{j \le i} \underline{\mathrm{St}}\, v_j$, and $K_n = K$. The lower star filtration of $K$ is given by
\[ \emptyset = K_{-1} \subset K_0 \subset K_1 \subset \dots \subset K_n = K. \]
Each element of the filtration corresponds to a sub-level set of the function $f$.
Furthermore, for $t \in [f(v_i), f(v_{i+1})) \subset \mathbb{R}$, the sub-level set $|K|_t = f^{-1}((-\infty, t])$ has the same homotopy type as $K_i$. Let $\sigma = [v_0, \dots, v_k] \in K$ be a simplex that is cut by the plane defined by $z = t$. Then there exists at least one pair $(v_l, v_m)$ of distinct vertices of $\sigma$ such that $v_l \in K_i$ and $v_m \in K - K_i$. Consider $\sigma$ as a union of line segments connecting the points of its maximal face in $K_i$ to the points of its maximal face in $K - K_i$. By construction, each of these line segments lies only partly in $f^{-1}((-\infty, t])$. Refer to Figure 8.1 for an intuitive representation of this construction. Parametrise the portion of each line segment contained in $|K|_t$ as
\[ s : [0, 1] \to \sigma, \qquad s(\lambda) = \lambda x + (1 - \lambda) y, \]
where $s(1) = x$ is the endpoint of the line segment lying in $K_i$ and $y = s(0)$ satisfies $f(y) = t$. The deformation retraction obtained by going from time $\lambda = 0$ to $\lambda = 1$ shows that $|K|_t$ and $K_i$ have the same homotopy type.

Figure 8.2: Critical points of the height function defined on the vertices of a portion of the deformed Tonnetz. (a) Maximum. (b) Minimum. (c) Monkey saddle.

8.1.2 A filtration of the deformed Tonnetz

The deformed geometrical realisation of the infinite planar Tonnetz is not a convenient setting for the computation of persistent homology (the height function would have infinitely many minima and maxima). Mirroring the approach we used in Section 6.1.2, we consider the fundamental domain F of the Tonnetz and its warping T. The idea is to use the values of the height function to induce a filtration on T. However, even after this restriction, the height function can assume the same value on several vertices of T. Indeed, more than one pitch class could be absent from a phrase, and hence have height 0, or two or more pitch classes could be played for exactly the same amount of time.
Maintaining the notation introduced in the previous paragraph, assume that the function $f_V : V \to \mathbb{R}$ has the same value on some (or even all) of the vertices of $K$. Consider the set of the distinct values of $f_V$, ordered as $a_1 < \dots < a_n$, with $n \in \mathbb{N}$. Define $V_i = \{ v \in V \mid f_V(v) = a_i \}$, the collection of vertices on which $f_V$ takes the value $a_i$. We define $K_i = \bigcup_{l=1}^{i} \underline{\mathrm{St}}\, V_l$. The sequence
\[ \emptyset = K_0 \subseteq K_1 \subseteq \dots \subseteq K_n = K \]
is a filtration of the complex $K$. Hence, it is possible to induce a filtration on the finite simplicial complex T by considering the sub-level sets of the linear extension of the height function defined on the vertices of F.

Remark 13. It is possible to approximate the linear extension of $f$ with the piecewise constant function $\bar{f}(\sigma) = \max_{x \in \sigma} f(x)$, in order to obtain a function that is monotone in the sense of simplicial complexes (if $\sigma \subset \tau$, then $\bar{f}(\sigma) \le \bar{f}(\tau)$).

The homological critical values, whose pairing determines the lifespan of the homology classes of T, correspond to the critical points of the height function on the deformed Tonnetz. In Figure 8.2 the simplest configurations of a maximum, a minimum and a saddle on the geometrical realisation of a portion of the deformed Tonnetz are depicted. Observe that, in our case, a maximum or a minimum can be a whole subcomplex of connected pitch classes whose vertices share the same height. As an example, the configuration depicted in Figure 6.4 is obtained by playing the pitch classes corresponding to a major triad for the same amount of time.

Remark 14. For an intuition concerning the sub-level sets of the height function, it is possible to visualise them using the web application. The button disp_filtr generates the plane $z = 0$. The slider filt_height changes the height of the plane and allows one to visualise the sub-level sets of the height function.
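The filtration of Section 8.1.2, together with the monotone function of Remark 13, can be sketched in a few lines of Python. Everything specific below is an assumption made for illustration: the complex consists of two triangles sharing an edge, the vertex labels and heights are invented and are not taken from any analysed piece, and the function names are hypothetical.

```python
from itertools import combinations


def simplex_value(simplex, vertex_value):
    """Value of a simplex as in Remark 13: the maximum over its vertices."""
    return max(vertex_value[v] for v in simplex)


def lower_star_filtration(simplices, vertex_value):
    """Sort simplices by value, breaking ties by dimension so faces come first."""
    def key(simplex):
        return (simplex_value(simplex, vertex_value), len(simplex), sorted(simplex))
    return sorted(simplices, key=key)


def sublevel_complexes(simplices, vertex_value):
    """Map each distinct value a_i to the list of simplices of K_i."""
    levels = sorted(set(vertex_value.values()))
    return {a: [s for s in simplices if simplex_value(s, vertex_value) <= a]
            for a in levels}


def faces_precede_cofaces(order):
    """Check that every proper face appears before the simplex it belongs to."""
    position = {s: i for i, s in enumerate(order)}
    return all(position[face] < position[s]
               for s in order
               for k in range(1, len(s))
               for face in combinations(s, k))


# Hypothetical heights; E and G are tied, as allowed in Section 8.1.2.
height = {"C": 0.0, "E": 0.5, "G": 0.5, "B": 1.0}
simplices = [
    ("C",), ("E",), ("G",), ("B",),
    ("C", "E"), ("C", "G"), ("E", "G"), ("E", "B"), ("G", "B"),
    ("C", "E", "G"), ("E", "G", "B"),
]

order = lower_star_filtration(simplices, height)
levels = sublevel_complexes(simplices, height)
print([len(levels[a]) for a in sorted(levels)], faces_precede_cofaces(order))
```

Sorting by the pair (value, dimension) yields an ordering in which each simplex is preceded by its faces, which is the kind of ordering required to assemble the boundary matrix of Equation (7.2.1).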
8.2 Musical interpretation and persistent clustering

When introducing topological persistence, we claimed that it is capable of describing the persistent features of a shape, mirroring the process we use to identify them. In a music analysis context, it is necessary to endow our filter with a relative pitch rather than an absolute one. This is why we chose the height function. On the one hand, it ensures invariance under uniform transposition of a phrase; on the other hand, it takes into account the structure of the Tonnetz when the function is extended to the whole simplicial complex. In Figure 8.3 a particular sub-level set of the height function of several deformed Tonnetze is depicted. The differences between the tonal and atonal approaches are highlighted by the geometry of the sub-level set associated to each composition.

By considering a single sub-level set at a fixed height $t$, we obtain an approach that mirrors the one we considered in Section 6.1.2. The filtration induced by the height function rebuilds T entirely, hence it is not necessary to fix a threshold. The information retrieved by evaluating the birth and death-levels of the 0- and 1-dimensional homology classes while traversing the filtration is encoded in two persistence diagrams.

8.2.1 Musical Interpretation

The 0- and 1-persistence diagrams associated to three compositions are represented in the first and second row of Figure 8.4, respectively. The two diagrams give two representations of each shape. In our case, the different configurations of cornerpoints and cornerlines can be interpreted as descriptors of the compositional styles characterising the compositions we analysed.

0-persistent homology

Consider the first row of Figure 8.4. Since F is connected, it is not surprising to observe the presence of one cornerpoint at infinity in each diagram. This cornerline retrieves the connectedness of F.
Moreover, its birth-level gives an insight into the use of the pitch classes in the composition, by representing the height of the minimal subcomplex of the deformed Tonnetz. More information is retrieved by considering the proper cornerpoints, which describe the lifespan of the connected components along the filtration. The three examples of Figure 8.4 present as many different configurations. In particular, Arabesque and Jeux d'Eau, which were topologically equivalent for our first naive classifier, are now neatly distinguished by their persistence diagrams.

Figure 8.3: Sub-level sets of the height function (in blue) on several deformed Tonnetze. Different compositional styles are characterised by particular choices of pitch classes and durations. (a) KV 311, mov. 1 - Mozart. (b) KV 311, mov. 2 - Mozart. (c) KV 311, mov. 3 - Mozart. (d) Klavierstück, I - Schönberg. (e) Klavierstück, II - Schönberg. (f) Klavierstück, III - Schönberg.

It is possible to give a musical interpretation of the 0-persistence diagram by considering the birth-level of the cornerline, say $x = b$, and the proper cornerpoints $p \in C$. In our representation, $b$ corresponds to the height of the minimal subcomplex of the deformed Tonnetz. If $b \approx 0$, there exists a pitch-class set that does not play a relevant role in the composition, suggesting that the piece is based on a stable tonal or modal choice. On the contrary, if $b \gg 0$, every pitch class has been used in the composition for a relevant amount of time. This configuration corresponds to a more atonal or chromatic style.

Figure 8.4: The 0 and 1-persistence diagrams representing the topological fingerprints associated to three different compositions. First row: (a) Arabesque, Debussy. (b) Jeux d'Eau, Ravel. (c) Klavierstück I, Schönberg. Second row: (d) Arabesque, Debussy. (e) Jeux d'Eau, Ravel. (f) Klavierstück I, Schönberg.
The presence of more than one connected component has to be interpreted as the presence of more than one minimal subcomplex with respect to the height function. Hence, these subcomplexes are not connected by an edge in the 1-skeleton of F. Furthermore, the structure of the fundamental domain generated by the major and minor thirds allows us to retrieve at most three connected components. To create this particular configuration it is necessary to play a chromatic cluster, for instance C, C♯, D. The same argument holds for the maxima of the height function, which will be discussed in the next paragraph.

Coming back to the three examples of Figure 8.4, we can interpret the low birth-level and the absence of proper cornerpoints in the diagram associated to Arabesque as evidence of its pentatonic and diatonic modal inspiration (Trezise, 2003). We also retrieve the fact that the whole chromatic scale has been used during the composition. The cornerline in the persistence diagram associated to Jeux d'Eau is characterised by a higher birth-level compared to the one we analysed previously. Moreover, it is not surprising in this case to retrieve a proper cornerpoint, whose presence is justified by the ante litteram use of the Petrushka chord, a superposition of a major triad and its tritone substitute, for instance G = (G, B, D) together with C♯ = (C♯, E♯, G♯). Finally, the diagram associated to Klavierstück I has a cornerline with a birth-level comparable with the one associated to Jeux d'Eau. In this case, two proper cornerpoints (corresponding to two minima which, together with the one retrieved by the cornerline, form a chromatic cluster) point out the atonal nature of the composition.

1-persistent homology

Now, consider the second row of Figure 8.4. The common denominator of the three diagrams is the presence of two cornerpoints at infinity.
These infinitely persistent homology classes retrieve the two generators of the torus $F \cong S^1 \times S^1$, see Figure 6.5. In musical terms, the birth-levels of the cornerlines and their distance are relevant, and they give a first characterisation of the style of the composition. We discuss four configurations of the cornerlines, in order to provide an intuition concerning the stylistic information they retrieve. Let $b_1 < b_2$ be the birth-levels of the two generators; their distance is given by $d = |b_1 - b_2|$.

(a) $b_1 \approx 0$ and $d \gg 0$ (Figure 8.4d). This configuration points out a tonal choice. A cycle representing one of the generators is born early in the filtration, which means that there exists a pitch-class set that has not been used in the composition. Hence, this feature suggests a precise choice in terms of tonality (or modulations among near tonal centres), or modality. The large distance between the cornerlines points out that a pitch-class set is less used than the others in the composition, generating two maxima, as depicted in the representative surface of Figure 8.5a.

(b) $b_1 \gg 0$ and $d \gg 0$ (Figure 8.4e). An extensive use of the whole chromatic scale (both in terms of pitches and durations) is retrieved by the high birth-level of the first cycle. However, a particular modal or tonal choice is highlighted by the presence of two distinguishable maxima (see Figure 8.5b).

(c) $b_1 \gg 0$ and $d \approx 0$ (Figure 8.4f). This configuration represents an atonal compositional choice. The whole chromatic scale has been equally relevant during the composition, and the average height of the vertices of the Tonnetz does not allow us to distinguish any preferred direction, generating a compressed surface such as the one of Figure 8.5c.
In the applications we present in the next section, we shall see how a tonal piece modulating on several tonal centres, and hence using extensively the whole set of available pitch classes, is equipped with a structure that allows us to distinguish it from a serial or an ultrachromatic composition.

(d) $b_1 \approx 0$ and $d \approx 0$. This case is not represented in the persistence diagrams we chose. Here, there exists a pitch-class set that is less relevant for the composition, and there are two distinct maxima with a low minimum. In this case the composition can be classified as modal or tonal and based on a small set of precise musical ideas. See Figure 8.5d for a representative surface of this configuration.

More information is retrieved by considering the proper cornerpoints of the persistence diagrams. These points retrieve the lifespan of other maxima, arising in different configurations when considering chromatic, dodecaphonic or serial compositions.

[Figure 8.5: Smooth surfaces representative of four different configurations of the deformed Tonnetz, panels (a) to (d).]

8.2.2 Hierarchical persistent music clustering

Let $(D_\infty, d_B)$ be the space of the persistence diagrams equipped with the bottleneck distance. We recall that piecewise linear functions defined on finite simplicial complexes are tame. Thus the bottleneck distance is stable under small variations of $f$. A collection of k-persistence diagrams represents a point cloud $P \subset D_\infty$. In particular, considering the collection of k-persistence diagrams associated to a set of pieces of music, it is possible to compute their pairwise bottleneck distances. Then, we shall describe the configuration of such a point cloud through a hierarchical clustering analysis (Ott, 2009). This analysis gives a simple representation of all the possible clusterings between points, which can be visualised as a dendrogram.
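As an illustration of the pairwise comparison just described, the following is a minimal sketch of a brute-force bottleneck distance between small persistence diagrams. It is not the implementation used in this work. It assumes that a diagram is given as a list of (birth, death) pairs, that cornerlines are encoded with an infinite death value, and that the diagrams are small enough for an exhaustive search over matchings; the function names are hypothetical. Proper cornerpoints may be matched to the diagonal at cost (death − birth)/2, while cornerlines are matched among themselves by sorted birth-level.

```python
import math
from itertools import permutations


def _cost(p, q):
    """L-infinity matching cost; None stands for 'matched to the diagonal'."""
    if p is None and q is None:
        return 0.0
    if p is None:
        return (q[1] - q[0]) / 2.0
    if q is None:
        return (p[1] - p[0]) / 2.0
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def bottleneck_distance(X, Y):
    """Brute-force bottleneck distance between two small diagrams.

    X, Y: lists of (birth, death); death may be math.inf (a cornerline).
    The search is factorial in the number of proper cornerpoints.
    """
    # Cornerlines can only be matched with cornerlines.
    x_inf = sorted(b for b, d in X if math.isinf(d))
    y_inf = sorted(b for b, d in Y if math.isinf(d))
    if len(x_inf) != len(y_inf):
        return math.inf
    # On a line, the sorted matching minimises the largest displacement.
    inf_cost = max((abs(a - b) for a, b in zip(x_inf, y_inf)), default=0.0)

    # Proper cornerpoints: pad each side with diagonal slots for the other.
    A = [p for p in X if not math.isinf(p[1])]
    B = [q for q in Y if not math.isinf(q[1])]
    A_aug = A + [None] * len(B)
    B_aug = B + [None] * len(A)
    best = math.inf
    for perm in permutations(range(len(B_aug))):
        worst = max((_cost(A_aug[i], B_aug[j]) for i, j in enumerate(perm)),
                    default=0.0)
        best = min(best, worst)
    return max(inf_cost, best)


def pairwise_bottleneck(diagrams):
    """Symmetric matrix of bottleneck distances between all diagrams."""
    n = len(diagrams)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = bottleneck_distance(diagrams[i], diagrams[j])
    return D
```

The resulting matrix is the kind of input expected by a hierarchical clustering procedure. For diagrams with more than a handful of cornerpoints, an efficient matching algorithm from a dedicated library would be preferable to this exhaustive search.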
Representation of data through dendrograms

Dendrograms provide an intuitive representation of the hierarchical clustering of data. We refer to (Langfelder et al., 2008; Martinez et al., 2010) for a complete description of these subjects. Consider the 2-dimensional data represented as points in $\mathbb{R}^2$ in Figure 8.6a. The data form two clusters and have two singletons labelled as I and J. The horizontal axis of the dendrogram represents the distance or the dissimilarity between clusters, while each object is represented by its label on the vertical axis. The information carried by the dendrogram concerns similarity and clustering of data. Each joining is represented by the splitting of a horizontal line into two horizontal lines. The position of the split allows us to retrieve the distance between two clusters. Observing the dendrogram in Figure 8.6b, one can see how the two main clusters are represented as branches occurring at about the same distance. The outliers are fused at much higher distances.

[Figure 8.6: Dendrogram representation of data dissimilarity. (a) 2-dimensional data collection, with points labelled A to J. (b) Dendrogram. The structure of the 2-dimensional point cloud consists of two distinct groups and two outliers. The dendrogram reflects such a structure, representing the two groups as separate clusters and joining the outliers to the clusters according to their relative position with respect to the configuration of the point cloud.]

Computation. Consider a collection of n objects and let $D = (d_{ij})$ be the matrix representing the distance between the clusters $i$ and $j$, composed of $n_i$ and $n_j$ objects respectively. The dendrogram is computed as follows:

i) Find the clusters $\bar{\imath}$ and $\bar{\jmath}$ such that $d_{\bar{\imath}\bar{\jmath}}$ is minimum in $D$.
ii) Merge $\bar{\imath}$ and $\bar{\jmath}$ in a new cluster $k$ with $n_k = n_{\bar{\imath}} + n_{\bar{\jmath}}$ objects.
iii) Compute a new cluster distance matrix as $d_{kl} = a_{\bar{\imath}}\, d_{\bar{\imath} l} + a_{\bar{\jmath}}\, d_{\bar{\jmath} l} + b\, d_{\bar{\imath}\bar{\jmath}} + c\, |d_{\bar{\imath} l} - d_{\bar{\jmath} l}|$. Particular choices of the parameters distinguish among different algorithms. We shall use the complete linkage, where $a_{\bar{\imath}} = a_{\bar{\jmath}} = 1/2$, $b = 0$ and $c = 1/2$.
iv) Iterate the previous steps.

[Figure 8.7: Persistence-based clustering of nine classical and contemporary pieces. Dendrogram leaves: Schoenberg - 2, Ravel - Jeux, Schoenberg - 1, Mozart - 3, Mozart - 2, Beethoven - 1, Beethoven - 2, Debussy - Arab, Beethoven - 3.]

0-dimensional persistence

In the following examples 0-dimensional persistence has been used to classify different collections of music pieces. The examples are organised in order to show some possible applications of the topological characterisation of music: classification of tonal and atonal pieces, evaluation of different versions of the same jazz standard played by different improvisers, and the discrimination of different styles in a pop music context.

Remark 15. To safely compare different pieces, the height function has been normalised. The collection of MIDI files used to generate the examples described in the following section is available at http://nami-lab/experiments/midi-collection. com.

Example 1. Tonal and Atonal Music. The first example we present is the hierarchical clustering of some of the pieces included in the web application for the visualisation of deformed Tonnetze. The nine pieces we analyse have been selected among the compositions by Beethoven, Debussy, Mozart, Ravel and Schönberg, in order to provide a heterogeneous dataset in terms of compositional style. The similarity among these pieces, computed considering the pairwise bottleneck distances of the persistence diagrams associated to the selected pieces, is depicted in Figure 8.7 as a dendrogram.
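The computation steps i) to iv) given earlier in this section can be sketched in a few lines. The code below is a minimal illustration, not the software used to produce the figures: the function name, the data layout and the toy distance matrix are assumptions. The default parameters are those of the complete linkage stated in step iii), for which the update reduces to the maximum of the two merged distances.

```python
from itertools import combinations


def lance_williams(dist, a_i=0.5, a_j=0.5, b=0.0, c=0.5):
    """Agglomerative clustering driven by the update rule of step iii).

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns the merges as (members_i, members_j, merge_distance).
    """
    n = len(dist)
    clusters = {i: (i,) for i in range(n)}  # cluster id -> original objects
    d = {frozenset((i, j)): dist[i][j] for i, j in combinations(range(n), 2)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Step i: find the closest pair of active clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        d_ij = d[frozenset((i, j))]
        merges.append((clusters[i], clusters[j], d_ij))
        # Step iii: distances from the new cluster to every other cluster.
        for l in clusters:
            if l in (i, j):
                continue
            d_il = d[frozenset((i, l))]
            d_jl = d[frozenset((j, l))]
            d[frozenset((next_id, l))] = (a_i * d_il + a_j * d_jl
                                          + b * d_ij + c * abs(d_il - d_jl))
        # Step ii: replace the two clusters by their union.
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1  # Step iv: iterate until one cluster remains.
    return merges


# Toy example (hypothetical data): four points on a line at 0, 1, 5 and 7.
D_example = [[0, 1, 5, 7],
             [1, 0, 4, 6],
             [5, 4, 0, 2],
             [7, 6, 2, 0]]
merges_example = lance_williams(D_example)
```

Each entry of `merges_example` records one joining of the dendrogram together with the distance at which it occurs, which is the position of the corresponding split on the distance axis.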
It is possible to observe how the data are organised in two main clusters, segregating the first two pieces of Schönberg’s Drei Klavierstücke and Ravel’s Jeux d’Eau from the ones by Mozart, Beethoven and Debussy. The association between the second piece of the Drei Klavierstücke and Jeux d’Eau agrees with what we found examining the weighted subcomplex of the Tonnetz in Section 6.1.2. Some tonal traces are hidden in this piece, although they are not evident to a human interpretation, as shown by the disparate tonal interpretations of these three pieces, see for instance (Brinkmann, 1969; William, 1984; Ogdon, 1981). Concerning Ravel’s composition, the ante litteram use of the Petrushka chord highlights the atonal nature of the piece. The two movements from Mozart’s KV311 immediately form a cluster, which is joined at an increasing distance by the first two movements of the Sonata in C major by Beethoven, while the third movement is grouped with Arabesque, which is characterised by a generous use of the pentatonic scale, before joining the others.

[Figure 8.8: Comparing three different versions of All the Things You Are. Dendrogram leaves: [2]complete_MIII, [2]complete, [1]complete_MIII, [1]complete, [3]theme, [3]solo_4_coda, [1]theme2, [1]theme, [3]solo_2, [1]intro, [3]intro, [1]interlude, [2]random, [1]random, [3]random.]

Example 2. Comparing three versions of All the Things You Are. The aim of this test is firstly to investigate the distance between the 0-dimensional topological fingerprints of three versions of the same jazz standard; secondly to show the invariance of such a fingerprint under (musical) transposition; and finally to show the relationship between the fingerprint extracted from the whole piece of music and those of its segments.
In addition, a randomized-pitch version of each song has been introduced in the dataset, to test the ability of persistent homology to distinguish between a piece modulating in several distinct tonalities² and enriched with chromatic solos, and a sequence of random pitches without any apparent structure.

² A tonal harmonic analysis of the standard reveals it modulates in five different keys: A♭ major, C major, E♭ major, G major and E major.

[Figure 8.9: Pop clustering.]

The three versions are labelled in the dendrogram of Figure 8.8 as [1], [2] and [3]. The number is followed by an attribute: [i]complete means that the persistence diagram has been computed on the whole length of version i, [i]complete_interval that it has been computed on its transposition by the given interval, and a label such as [i]intro that it has been computed on a segment. The three versions we considered are structured as follows:

a) Version [1] is played by four instruments in a fairly standard way, with doubled harmony. The main sections are a 3/4 introduction, a first exposition of the theme [1]theme, and a 12-bar interlude labelled as [1]interlude, introducing the last theme, which contains short improvisations and embellishments.
b) Version [2] is performed by a solo piano, and it is characterized by a rich chromatic playing style of both hands, in which the main theme is executed twice.
c) The last version we examined is performed by a trio (piano, bass, percussion). Its structure consists of an introduction, an exposition of the theme, and a piano solo. We included two improvisations, dividing them according to the structure of the standard, the introduction and the main theme.

It is not surprising to see that the transposed versions of the pieces have distance zero from the original ones. What is interesting to observe is that the randomized versions of the songs are well segregated from the rest of the dataset, as shown by the green cluster at the bottom of the figure.
Proceeding from bottom to top, we find a small cluster containing the interlude of the first and the introduction of the third version, which share very similar structures in terms of leaps and rhythm; see the scores of [1]interlude and [3]intro in Appendix D. Finally, it is possible to observe how in the top cluster the two complete songs are linked to the fragment of the third version containing the theme. Hence the 0-persistent homology retrieves the fragments containing the whole structure of the standard. This feature is surprising, taking into account the several modulations of the piece and the fact that we are considering only the complex created by the whole segment. The first and second themes of the first version are all clustered together with the last improvisation of the third version of the standard, which is the one that is most respectful of the original theme.

Example 3: Pop clustering. The dendrogram obtained considering two songs by Christina Aguilera, and three songs each by Sting and Paul McCartney, is depicted in Figure 8.9. Sting’s Fortress Around Your Heart is represented as an outlier. One cluster contains the two songs by Christina Aguilera, which are well separated from the other cluster, grouping Sting’s and Paul McCartney’s tunes. The position of the outlier is due to the abrupt modulations of the piece. For its harmonic transcription refer to http://yalp.io/app/sting-fortress-around-your-heart-701.

8.2.3 1-dimensional persistence

It is possible to explore dendrograms built using the pairwise distances of persistence diagrams representing the behaviour of the first homology group. In layman’s terms, this means measuring the persistence of the 1-dimensional holes generated by the filtration, just as its 0-dimensional counterpart discussed above measures the lifespan of the connected components.

Example 1: Tonal and Atonal Music. We propose a new clustering of the nine pieces we analysed above.
As we expected, the persistence of 1-cycles gives different results from its 0-dimensional counterpart. Figure 8.10 shows a dendrogram composed of two green clusters grouping the two movements of Mozart’s Sonata no. 9 and Debussy’s Arabesque and the first and third movements of Beethoven’s Sonata no. 13, with Klavierstück II as an outlier. A red cluster is formed by the second movement of the Sonata no. 13 and Jeux d’Eau. Even in this case, the first of Schönberg’s three piano pieces is represented as an outlier of this last cluster. We still retrieve a classification of the different compositional styles. In this analysis, Beethoven and Mozart are represented by two different clusters and the atonality of the Drei Klavierstücke is expressed by the high dissimilarity of its two pieces. Such a dissimilarity is nuanced for the two compositions: as in the 0-persistence analysis, the first piece is the farthest from the rest of the dataset.

[Figure 8.10: H1 persistence-based clustering of nine classical and contemporary pieces.]

Example 2: All the Things You Are. In Figure 8.11 the hierarchical clustering of the three versions of All the Things You Are we analysed above is depicted in its 1-dimensional version. The invariance under transposition still holds, since the transposed versions of [1] and [2] are at distance 0 from the original ones. The random-pitch versions are still grouped together. The introductions of the first and third versions are represented as outliers. A homogeneous cluster collects the first version and its segments, denoting a sort of robustness of the musical fingerprint to occlusions. The similarity between the first and the second version is retrieved in this cluster. Finally, a solo of version [3] is grouped with its theme.

Example 3: Big Pop Clustering. Figure 8.12 shows a simplification of the clustering resulting from the comparison of 58 pop songs performed by 28 artists, ranging from Ray Charles to Lady Gaga.
In order to give a simplified representation of this clustering we considered only the three big groups the algorithm found, and listed on the left of each cluster the artists whose songs belong to that group. In particular, names written in black bold characters are artists whose songs are entirely grouped in the cluster at their right, while the names written in red bold characters identify the three artists whose songs are spread across the three groups. It is interesting to observe how the whole collections of songs by Ringo Starr, Paul McCartney and Simon and Garfunkel are grouped together in the blue cluster, with Ray Charles, Stevie Wonder and George Benson. At the same time, it is plausible that the diversity characterizing Sting’s compositions is mirrored by the positions of his songs in the dendrogram. The second and third clusters are less homogeneous, but promising, taking into account that so far each song is identified by a single persistence diagram.

[Figure 8.11: Comparing three different versions of All the Things You Are using 1-dimensional persistence. Dendrogram leaves: [3]theme, [3]solo2, [1]complete_MIII, [1]complete, [1]theme2, [1]theme1, [2]complete_MIII, [2]complete, [3]solo_4_coda, [1]interlude, [2]random, [1]random, [3]random, [1]intro, [3]intro.]

Discussion

In this chapter we suggested a model describing music that takes into account the contribution of each pairing (pitch class, duration) associated to the notes of a composition. A filtration has been defined on the fundamental domain of the Tonnetz. Such a filtration is induced by the height function defined on the vertices of T. The k-persistence diagrams associated to different music pieces have been considered as points of a space equipped with the bottleneck distance. The possible clusterings of the points belonging to such a dataset have been discussed and represented as dendrograms, showing that 0- and 1-persistence can be used to analyse and classify music.
In particular, the stability of the bottleneck distance allows us to generalise this construction from MIDI files to audio, as we shall discuss in the conclusions of the whole part.

[Figure 8.12: A simplified version of the clustering of 58 pop songs generated from their 1-persistence diagrams. The artists listed beside each of the three groups are given below, in the order in which they appear.]

First group: Ringo Starr (3/3), Paul McCartney (3/3), Simon and Garfunkel (3/3), Ray Charles (1/1), Prince (1/1), Phil Collins (1/1), Bobby McFerrin (1/1), Stevie Wonder (1/1), George Benson (1/1), Aretha Franklin (1/1), All Saints (1/1), Enya (2/3), Jamiroquai (2/3), Whitney Houston (2/3), Michael Jackson (1/2), ABBA (1/2), George Michael (1/2), Oasis (1/2), Britney Spears (1/3), Cranberries (1/3), Sting (1/3), The Corrs (1/3).

Second group: Jennifer Lopez (3/3), Britney Spears (2/3), Natalie Imbruglia (1/2), Marvin Gaye (1/2), George Michael (1/2), ABBA (1/2), Cranberries (1/3), Sting (1/3), The Corrs (1/3).

Third group: Christina Aguilera (2/2), Backstreet Boys (2/2), Lady Gaga (1/1), Natalie Imbruglia (1/2), Oasis (1/2), Michael Jackson (1/2), Marvin Gaye (1/2), Jamiroquai (1/3), Enya (1/3), Whitney Houston (1/3), Sting (1/3), Cranberries (1/3), The Corrs (1/3).

Chapter 9. Audio feature deformation of the Tonnetz

The analysis of music can be considered from two different sides: a horizontal one, naturally suggested by counterpoint and voice leading theory, and a vertical one, given by the superposition of notes. Nevertheless, a significant piece of information is carried by the signal. For instance, the timbre of the instrument we are listening to affects the perception of a whole piece: the same phrase played on an acoustic piano or on a Rhodes will surely evolve in different ways in a composition, although keys and hammers are used in the same way by both instruments (Barona, 2014; Lee et al., 2009). In this chapter, we suggest two applications based on the deformed Tonnetz, in which the function used to displace its vertices is derived from the signal domain.
Hence, the height of each vertex is computed by considering an audio feature. In particular, we shall use the consonance function as introduced by Plomp and Levelt. In the first part, it will be used to compute the displacement of the vertices of the Tonnetz, labelled with the pitches belonging to a single octave and compared to a fixed pitch. This space shall be used to classify the 21 modal scales derived from the diatonic, melodic minor and harmonic minor scales. The octave dependency of the consonance function is highlighted as a fundamental feature that musicians exploit in their compositions. As a second application, we will show how a trivial extension of the consonance function to chords can give interesting results, once interpreted in a geometrical metric context and in the formalism of topological persistence.

[Figure 9.1: A Tonnetz deformed through a signal-based height function.]

9.1 The calculation of consonance values according to Plomp and Levelt

The notion of consonance has a long history, dating back to the time of the Greek philosophers. The notion itself is complex and has been given multiple meanings and explanations throughout the history of music theory and acoustics. For a detailed review of the history of consonance theory, the reader is invited to refer to (Sethares, 2004) and (Tenney, 1988). The original idea of the Greeks, and of various philosophers and scientists such as Galileo (Galilei, 1638), Euler (Euler, 1766, 1739a), or Diderot, was that the consonant intervals are those based on small frequency ratios, an idea which originated with Pythagoras. In the nineteenth century, departing from consonance theories based only on the musical objects at hand, Helmholtz introduced in his book On the Sensations of Tone (Helmholtz, 1877) a theory of sensory dissonance based on the processes at work in the auditory system.
It is a well-known fact that two pure tones of close frequencies produce beats, the frequency of which is equal to the difference in frequency between the original pure tones. When the frequencies of the original signals are close, the beat frequency is low, and the slowly evolving resulting signal is not perceived as dissonant. Helmholtz observed that when the beat frequency increases, the roughness of the resulting signal increases, peaking at a reported beat frequency of 32 Hz. Thus, the dissonance of a tone is directly linked to the presence or absence of beats. By considering the interaction of all the partials of two harmonic sounds, and by assuming that the total dissonance is obtained additively from the dissonance between pairs of partials, Helmholtz was able to calculate the dissonance value of any interval.

In the mid-1960s, Plomp and Levelt (Plomp and Steeneken, 1968; Plomp and Levelt, 1965) published an influential experimental work on the sensation of consonance and dissonance for pure tones. A number of listeners were asked to rate the consonance of various pairs of pure tones sounded at different frequencies. This resulted in the determination of a dissonance function, which gives a dissonance value as a function of the frequency ratio of the two pure tones, expressed in units of the critical bandwidth. The typical plot of this function is presented in Figure 9.2.

The notion of critical bandwidth derives from the mechanism of the auditory system itself (Fletcher, 1940). In the cochlea, pure sinusoidal tones excite different places of the basilar membrane. The place theory of pitch perception links the excitation place with the perceived pitch of the tone. When two tones of similar frequencies are sounded together, they excite similar places of the basilar membrane: in other terms, they occupy the same critical band.
The width of this critical band is therefore linked to the ability to perceive two simultaneous tones of different frequencies as a unique tone or not. The critical bandwidth is roughly constant and equal to 100 Hz in the range 100-1000 Hz, and then increases proportionally with frequency (Zwicker, 1961; Zwicker and Terhardt, 1980).

[Figure 9.2: Plot of the consonance function between two sinusoidal tones, whose frequency ratio is expressed as a ratio of the corresponding critical bandwidth. Vertical axis: dissonance; horizontal axis: frequency ratio / critical bandwidth.]

The results of Plomp and Levelt provide an experimental justification of the work of Helmholtz, with the addition that the maximum of dissonance occurs at roughly one quarter of a critical bandwidth. It is therefore dependent on the frequency of the tones and not fixed to 32 Hz, as was the case for Helmholtz (which is incidentally the value of one quarter of a critical bandwidth for a frequency of roughly 600 Hz).

Multiple parametrizations of the Plomp and Levelt curve have been given by various authors. We use here the parametrization used by Sethares in (Sethares, 2004), wherein the consonance function between two pure tones of frequencies $f_1$ and $f_2 > f_1$ is given by

\[ d(f_1, f_2) = \exp(-3.5 \cdot s \cdot (f_2 - f_1)) - \exp(-5.75 \cdot s \cdot (f_2 - f_1)), \]

where $s$ is defined as

\[ s = \frac{0.24}{0.021 \cdot f_1 + 19}, \]

and is introduced to account for the variation of the critical bandwidth with frequency.

Based on their work on pure tones, Plomp and Levelt then studied the consonance of complex tones. Since a complex tone has a spectrum consisting of multiple partials, they assumed that the total consonance results from the addition of the consonance values between all pairs of distinct partials (under the hypothesis that all partials have the same intensity). In other terms, given a complex tone whose spectrum is a set $\{f_1, f_2, \dots\}$ of partials at frequencies $f_i$, the total consonance is given by

\[ D = \sum_{f_i} \sum_{f_j > f_i} d(f_i, f_j). \]

Using this definition, they calculated the consonance value of a complex tone consisting of the superposition of two tones as a function of their frequency ratio, the first tone being fixed. Each tone had a spectrum consisting of six harmonic partials of identical intensities. The plot of this consonance value is given in Figure 9.3. As can be seen in this graph, minima of the dissonance function are obtained for pure intervals such as the unison (1:1), the octave (2:1), the fifth (3:2), the fourth (4:3), the major third (5:4), the minor third (6:5), and the major sixth (5:3). Of course, the calculation of the consonance value of a complex tone is not limited to harmonic sounds. Sethares (Sethares, 2004) has investigated complex tones whose spectrum is inharmonic, and has deduced the corresponding consonant intervals between such sounds. In the following sections, we explore the use of consonance calculations to determine the hierarchical organization of various musical entities.

[Figure 9.3: Plot of the consonance value of a complex tone consisting of the superposition of two tones with six harmonic partials of identical intensities, as a function of the frequency ratio of the two tones, the first one having a fixed frequency. Vertical axis: dissonance; horizontal axis: frequency ratio; labelled ratios: 6/5, 5/4, 4/3, 3/2, 5/3.]

A consonance-based height function

Maintaining the notation of the previous section, let $p \in \mathbb{R}$ be a pitch. Define $h_p : V \to \{p_i\} \times L_i \to \mathbb{R}$ as $h_p(v) = d(p, l(v))$, where $d : \mathbb{R}^2 \to \mathbb{R}$ is the consonance function¹ and $l : V \to L_i$ is the labelling function associating to the vertices of the Tonnetz the chromatic scale built on the $i$th octave of the piano, with root $r$, such that $[r] = [p]$.
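The formulas of this section translate directly into code. The following is a minimal sketch, not the implementation used in this work: it evaluates the pure-tone function d with the parametrization quoted above, the pairwise sum D over a set of partials, and the height h_p(v) = d(p, l(v)) for a vertex whose label has a given fundamental frequency. The function names are hypothetical, and the conversion from MIDI note numbers to frequencies (equal temperament, A4 = 440 Hz) is an assumption standing in for Equation (2.2.1). The renormalisation of the frequencies to the reference tone is not included.

```python
import math


def pure_tone_dissonance(f1, f2):
    """d(f1, f2) for two pure tones (frequencies in Hz, any order)."""
    lo, hi = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * lo + 19.0)  # accounts for the critical bandwidth
    x = s * (hi - lo)
    return math.exp(-3.5 * x) - math.exp(-5.75 * x)


def total_dissonance(partials):
    """D: sum of d over all pairs of partials (equal intensities assumed)."""
    ps = sorted(partials)
    return sum(pure_tone_dissonance(ps[i], ps[j])
               for i in range(len(ps)) for j in range(i + 1, len(ps)))


def harmonic_partials(f0, n=6):
    """The first n harmonic partials of a tone with fundamental f0."""
    return [k * f0 for k in range(1, n + 1)]


def dyad_dissonance(f0, ratio, n=6):
    """Dissonance of two superposed harmonic tones, as in Figure 9.3."""
    return total_dissonance(harmonic_partials(f0, n)
                            + harmonic_partials(f0 * ratio, n))


def midi_to_hz(m):
    """Assumed equal-tempered tuning with A4 (MIDI note 69) at 440 Hz."""
    return 440.0 * 2.0 ** ((m - 69) / 12.0)


def height(p_hz, label_hz):
    """h_p(v) = d(p, l(v)) for a vertex labelled with frequency label_hz."""
    return pure_tone_dissonance(p_hz, label_hz)


# Heights of the twelve chromatic labels of one octave, relative to its root
# (MIDI notes 48 to 59 are used here purely as an illustration).
reference = midi_to_hz(48)
heights = [height(reference, midi_to_hz(m)) for m in range(48, 60)]
```

Evaluating `dyad_dissonance` over a range of ratios for a fixed `f0` gives a curve of the kind shown in Figure 9.3; the exact values depend on the number of partials and on the chosen parametrization.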
In the remainder of this section we shall refer to $p$ as the reference pitch used to compute the displacement of the vertices of the Tonnetz.

¹ Here we identify a pitch with its fundamental frequency; see Equation (2.2.1) for the formula associating fundamental frequencies to pitches.

[Figure 9.4: Deformation of a portion of the Tonnetz. (a) $T_3^4$. (b) $T_3^5$. The reference note used to displace its vertices is C3. The labels associated to the Tonnetz’s vertices correspond to the chromatic scale built on the fourth and the fifth octave of the piano.]

Let

\[ W : V \subseteq \mathbb{R}^3 \to \mathbb{R}^3, \qquad v \mapsto (x_v, y_v, h_p(v)), \]

be the function that defines the height of every $v \in V$. Such a height corresponds to the consonance value of the interval $(p, l(v))$. It should be noted that, for a transposition of the reference pitch, the consonance value decreases when the frequency of the reference pitch increases. In order to compensate for this effect, we renormalise the frequencies to the reference tone.

Variable geometry

The space we defined above is endowed with two interesting properties. First, the evenness of the equal temperament ensures that the computation of the consonance is robust modulo uniform transposition of the reference pitch and the chromatic scale. That is to say, the intervals (C3, C♯3) and (D3, D♯3) share the same consonance value. Second, the consonance function is not invariant modulo octave. In agreement with common sensory experience, an interval of a minor second (C4, D♭4) is less consonant than a minor ninth (C4, D♭5)². Hence, the geometry of the Tonnetz varies depending on the choice of both the chromatic set of pitches associated to its vertices and the reference pitch $p$. The surfaces resulting from the deformation of the planar Tonnetz computed with C3 as reference pitch and the pitches of the chromatic scale of the third and fourth octave of the piano are depicted in Figures 9.4a and 9.4b respectively.
We will denote these surfaces as $T_3^3$, $T_3^4$. In the figure, a height function highlights maxima and minima of each surface. As in the application examined in the previous chapter, what we gain is the metric nature of this representation and the information given by the interaction between the structure of the Tonnetz and the deformation. In Figure 9.5 three different states of the geometry of T are depicted.

² Behaviour of the dissonance function with respect to octave changes: d(C2, C♯2) = 1.7, d(C2, C♯3) = 0.9 and d(C2, C♯4) = 0.4.

[Figure 9.5: Variations of the Tonnetz’s geometry on three octaves.]

9.2 Persistent homology and audio feature deformed Tonnetze

While listening to a bass line or a whole harmonic sequence, it is a common experience to imagine a melody matching the progression of notes or chords we are listening to. This melodic choice can be represented by a set of tensions and a set of resolutions having the bass (or the chord) as a reference. One possibility is to build from such a choice a 7-note, octaviant scale³ that, superposed on the reference bass, creates a recognisable sonority, called a mode. The tensions and the resolutions of a mode give a trained listener all the information needed to recognise it. Thus, the space we propose is particularly suitable for the representation of modes, since tensions and resolutions are nuanced by the height function defined on the vertices and the whole scale is described as an extended shape of the Tonnetz. Exploiting this characteristic, we classify the 21 modes derived from the diatonic, melodic and harmonic minor scales (see Table A.1) by considering the point cloud of vertices of the consonance-Tonnetz and computing their persistent homology. In the next section we introduce the construction that we will use to associate a filtration to these point clouds.

9.2.1 Persistence for point clouds

Let $P \subset \mathbb{R}^n$ be a point cloud.
There are two main constructions used to associate a simplicial complex to $P$, called the Čech and the Rips complex. The former is defined as an abstract simplicial complex denoted by $C_P(r)$, such that its 0-simplices are exactly the points of $P$ and a simplex is generated whenever the balls of radius $r$ centred on its vertices have non-empty intersection. In symbols, we have

\[ C_P(r) = \left\{ \sigma = [p_0, \dots, p_k] \;\middle|\; \bigcap_{i=0}^{k} B_r(p_i) \neq \emptyset \right\}, \]

where $p_i \in P$ for every $i \in \{0, \dots, k\}$.

³ The pitches composing the scale are contained in a single octave.

Given a point cloud $P$, we have that $C_P(r) \subseteq C_P(q)$ if $r < q$. Hence, it is possible to build a filtration of the Čech complex by choosing a sequence $\{r_1, \dots, r_n\}$ of increasing radii and setting $X_i = C_P(r_i)$. An important feature of this construction is that the homology of the Čech complex is exactly the one given by the union of the $r$-balls centred on the points of $P$. This is a straightforward consequence of the nerve lemma (the interested reader is referred to (Kozlov, 2007)).

Definition 9.2.1. Let $X$ be a topological space and $U = \{X_i\}_{i \in I}$ be a covering. The nerve of $U$ is the (abstract) simplicial complex $N(U)$, whose set of vertices is given by $I$ and such that a finite subset $S \subset I$ is a simplex of $N(U)$ if and only if $\bigcap_{i \in S} X_i$ is nonempty.

Lemma 9.2.1. Let $F = \{C_1, \dots, C_n\}$ be a finite family of closed sets, such that every intersection between its members is either contractible or empty. Then, the nerve of $F$ and the union of the sets in $F$ have the same homotopy type.

Balls in $\mathbb{R}^n$ satisfy the hypothesis of the nerve lemma, and as an immediate consequence we have that for every point cloud $P \subset \mathbb{R}^n$,

\[ H_k(C_P(r)) \cong H_k\Big( \bigcup_{p \in P} B_r(p) \Big), \quad \text{for } k \in \mathbb{Z}. \]

The Čech complex is difficult to compute, thus it is often substituted by the Vietoris-Rips (or simply Rips) complex $R_P(r)$.
A simplex is added to the Rips complex when all pairs of points representing its vertices are at distance at most $2r$:

\[ R_P(r) = \{ \sigma = [p_0, \dots, p_k] \mid \|p_i - p_j\| \leq 2r, \ \forall\, i, j \}. \]

Although the Vietoris-Rips complex of a point cloud does not share the homotopy type of the union of the balls built on its vertices, it is widely used for its computational ease. The error in the approximation of $C_P(r)$ with $R_P(r)$ is bounded by

\[ R_P(r) \subseteq C_P(\sqrt{2}\, r) \subseteq R_P(\sqrt{2}\, r). \]

9.2.2 Deformed Tonnetze for modern modes classification

Let the mode $M = (p, \{r, m_2, \dots, m_7\})$ be the pair composed of the reference pitch $p$ and the set of pitches corresponding to a modal scale. The aim of this section is to provide a characterisation of the modal scales, considering the 3-dimensional point cloud generated by the vertices labelled as $\{r, m_2, \dots, m_7\}$ on the fundamental domain $F \subset T$.

Methodology

The procedure we use to classify modes is similar to the one that allowed us to deal with musical compositions in the previous chapter:

(i) The Tonnetz is deformed according to the consonance values induced by the reference pitch $p$ and the chromatic scale built on the same octave as $p$.
(ii) The point cloud $M_0$ is extracted from the 0-skeleton of $F$.
(iii) According to the definition given in Section 9.2.1, we compute the 0-persistent homology of the point cloud, considering the filtration induced by the Rips complex. Such a filtration is sensitive to the relative distance between the points composing the cloud, whose configuration depends both on the dissonance function and on the structure of the Tonnetz.
(iv) Finally, the point clouds are represented by their persistence diagrams, and the similarity between them is visualised in a dendrogram.

9.2.3 Applications

Scale-wise classification of modes

As we stated above, 21 modes can be deduced considering the degrees of the major, melodic minor and harmonic minor scales.
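Step (iii) of the methodology above can be illustrated with a short sketch. For 0-dimensional homology, the Rips filtration of a point cloud only depends on its edges: every point is born at radius 0, and a connected component dies when an edge first joins it to another component. With the convention used here (an edge enters at the radius r for which the distance between its endpoints equals 2r), the death values are half the lengths of the merging edges. The code below is a minimal sketch under these assumptions, not the software used in this work; the function name and the sample points are hypothetical.

```python
import math
from itertools import combinations


def rips_h0_persistence(points):
    """0-persistence diagram of the Rips filtration of a point cloud.

    points: list of coordinate tuples, e.g. (x_v, y_v, h_p(v)).
    Returns a list of (birth, death) pairs; one class never dies.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Edges sorted by length: this is the order in which they enter R_P(r).
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    diagram = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge, one of them dies
            diagram.append((0.0, length / 2.0))
    diagram.append((0.0, math.inf))  # the component that survives forever
    return diagram


# Hypothetical 3-dimensional point cloud with three vertices.
example_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
example_diagram = rips_h0_persistence(example_cloud)
```

In this sketch the finite death values are exactly the half-lengths of the edges of a minimum spanning tree of the cloud, which is why the 0-persistence of a mode is sensitive to the relative distances between its vertices on the deformed Tonnetz.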
Figure 9.6 depicts, scale-wise, the dendrograms representing the hierarchical organisation of modes induced by the distances between their 0-persistence diagrams. As a first remark, observe that no two sonorities have distance 0. Thus, this representation grasps the different tension/resolution sets of each mode.

In Figure 9.6a the modes deduced from the major scale are considered. The only mode associated to a half-diminished chord (its root note forms a half-diminished chord with the third, fifth and seventh degree of the modal scale) is segregated from the others. The two most similar point clouds are the ones associated to the ionian and the mixolydian modes. The lydian scale, characterised by the presence of an augmented fourth and a major seventh, is separated from the ionian and mixolydian modes. However, these three major modes are represented as a homogeneous cluster. Further from these three modes, we find the dorian and the eolian sonorities. The phrygian and the locrian modes are a minor and a diminished mode respectively and are represented as outliers. Both contain a minor second and are the most tense and recognisable sonorities of this dataset.

Figures 9.6b and 9.6c show the clusterings of the modes derived from the melodic and harmonic minor scales, respectively. In both cases the shapes are divided into two main clusters, one of them containing at least one mode built on a diminished triad. The bigger cluster of Figure 9.6b is formed by two pairs of modes grouped together: mixolydian♭6 - locrian♯2 and hypoionian - lydian augmented, respectively. The mixolydian♯4, which is considered a blues mode (containing an augmented fourth on a dominant chord), is a well characterised sonority, and it is segregated from the other modes. Mirroring the structure of the clustering associated to the modes of the major scale, the last mode of the bigger cluster in Figure 9.6c is the mixolydian♭2♭6, also known as Spanish phrygian and the most identifiable mode of this dataset.
[Figure 9.6: Hierarchical clustering of modes interpreted as point clouds of the consonance-deformed Tonnetz. Panels: (a) modes deduced from the major scale; (b) modes deduced from the melodic minor scale; (c) modes deduced from the harmonic minor scale.]

An overview on the organisation of modes

Here we consider the grouping induced by the 0-persistence representation of the whole collection of modes we considered. The dendrogram representing this clustering is depicted in Figure 9.7. From top to bottom we find the ultralocrian and superlocrian modes grouped together. Both are diminished modes and are generally considered the two most tense sonorities among the ones we analysed. The second cluster groups three modes characterised by the presence of the minor second. The modes composing the third cluster are also equipped with a minor second but, among the modes we considered, they are the ones associated to Spanish sonorities. The other 12 modes are clustered together.

[Figure 9.7: Hierarchical clustering of the 21 modes of Table A.1.]

Taking a closer look at this bigger cluster we observe an interesting class of groupings: (dorian♯4, mixolydian♯4), (eolian, mixolydian♭6) and (dorian, mixolydian).
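A quick check on interval patterns shows what the two members of each pair have in common. The sketch below assumes the usual spellings of these six modes as semitone offsets from the root (Table A.1 is not reproduced here); the names `MODES`, `PAIRS` and `differing_degrees` are hypothetical.

```python
# Semitone offsets above the root, assuming the usual spellings of the modes.
MODES = {
    "dorian": (0, 2, 3, 5, 7, 9, 10),
    "mixolydian": (0, 2, 4, 5, 7, 9, 10),
    "eolian": (0, 2, 3, 5, 7, 8, 10),
    "mixolydian b6": (0, 2, 4, 5, 7, 8, 10),
    "dorian #4": (0, 2, 3, 6, 7, 9, 10),
    "mixolydian #4": (0, 2, 4, 6, 7, 9, 10),
}

PAIRS = [
    ("dorian #4", "mixolydian #4"),
    ("eolian", "mixolydian b6"),
    ("dorian", "mixolydian"),
]


def differing_degrees(first, second):
    """List (degree, offset_in_first, offset_in_second) where two modes differ."""
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(MODES[first], MODES[second]))
        if a != b
    ]


DIFFERENCES = {pair: differing_degrees(*pair) for pair in PAIRS}
```

Under these spellings, every pair differs only at the third degree, which is minor (3 semitones) in the first mode and major (4 semitones) in the second.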
The modes composing these pairs (that are built on a minor and on a major triad, respectively) are commonly interchanged in jazz and fusion melodic phrasing on dominant chords: the minor third of the dorian and eolian sonorities provides the blue note typically used in such contexts. Thus, this representation is coherent from a harmonic and melodic viewpoint. It is also interesting to notice how, despite the augmented fourth, the lydian and ionian modes are represented as similar sonorities. These clusters represent well the superposition of scales commonly used in jazz composition and improvisation.

Octave dependency of the consonance function

It is well known in musical practice that an (altered) chord sounds more consonant in open than in root position, or that a bass player prefers to play the 12th rather than the 3rd of a chord while accompanying. What we expect when analysing the modal point clouds played an octave higher than the accompanying bass is a smaller distance between their persistence diagrams, and hence a fusion of the clusters in the dendrograms. The clusterings representing the distance between modes built with pitches belonging to two different octaves are depicted in Figure 9.8. In this case we divided the modes into three groups, consisting of minor seven, dominant and diminished modes, respectively. This subdivision has been obtained considering the chord built on the root of the modal scales.

[Figure 9.8: Octave dependency of the harmonic-oriented modes clustering. Panels: (a) minor seven modes; (b) dominant modes; (c) diminished modes. On the right the organisation of modes represented as point clouds of T33, on the left their counterparts in T34.]

First, we remark that the maximal distance between the shapes decreases when considering point clouds derived from T34. Consider the two clusterings associated to the minor seven modes in the first row of the figure. The dorian♭2 is an outlier: the tensions of the mode on the simplicial structure of the Tonnetz make the point cloud distinguishable from the others even when the scale is played an octave higher. On the contrary, as we expected, the phrygian is not an outlier in the diagram on the right. The minor ninth is less tense than a minor second, and it is the only note that differs between the eolian and the phrygian scales. The dendrograms representing the distances between dominant modes segregate the mixolydian♭2♭6, which is the most recognisable sonority among them. It is interesting to notice how this is the only diagram which does not change its shape passing from one octave to the other. Containing the tritone (the interval between the third and the seventh of a dominant chord measures three whole steps), dominant chords are structurally tense. This feature makes their configuration invariant modulo octave. The same phenomenon occurs for the locrian♯2 in the last row of the figure.

9.2.4 Discussion

We used an audio feature to generate a deformation of the structure of the Tonnetz and then considered the point cloud generated by a vertical displacement of its vertices. The analysis of the Rips complex built on the point cloud representing the modal scale allowed us to compare different melodic patterns generated by an accompaniment and a scale.
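The 0-persistence of a Rips filtration, on which the comparisons of this section rely, admits a compact description: every point is a connected component born at radius 0, and two components merge as soon as an edge of length at most 2r joins them. The following Python sketch works under these assumptions; the function name `rips_h0_deaths` is hypothetical and this is not the implementation used to produce the figures.

```python
import math
from itertools import combinations


def rips_h0_deaths(points):
    """Death radii of the 0-dimensional classes of a Rips filtration.

    All components are born at r = 0. An edge between two points at
    distance d enters R_P(r) at r = d / 2, so each merge of two distinct
    components (detected with a union-find structure) kills one class at
    that radius. One class never dies and is not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the representative of i, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length / 2.0)
    return deaths
```

For instance, the three collinear points (0, 0, 0), (1, 0, 0) and (3, 0, 0) give the death radii 0.5 and 1.0, which are the merge heights of a single-linkage clustering rescaled by one half.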
This paradigm can be extended to the analysis of more than one octave by superposing different deformed fundamental domains of the Tonnetz, as has been shown in Figure 9.5. Persistent homology has been used to provide a quantitative analysis of different sonorities, showing that our representation is suitable for the classification of different tension/resolution patterns. Finally, although the model we proposed is limited by the choice of a particular harmonic spectrum for the computation of the consonance, the stability of persistence diagrams ensures that small variations of the consonance function correspond to small variations of the diagrams.

9.3 Tonnetz deformation through triads’ consonance

It is possible to generalise the dissonance function defined on intervals to chords. In this section we present a comparison between the consonance function evaluated on six different families of triads. As a first application, we study the clustering of triads of different types, varying their root on the whole chromatic scale and considering two different harmonic spectra. Thereafter, for a fixed triad, we will consider the dissonance generated by the block voicings composed by the superposition of the triad and each pitch of the chromatic scale an octave higher than the root of the triad. In particular, we will use a harmonic spectrum composed of six equal partials, in order to highlight the pathological behaviour of the consonance function. Again, this computation will be used to define the displacement of the vertices of the Tonnetz. The surfaces created by varying the class of the triad shall be analysed through their metric properties and classified by computing their persistent homology, for several filtrations.

9.3.1 The consonance function for triads

We consider here six different classes of chords, namely: the major, minor, augmented, diminished, suspended fourth, and suspended second triads.
These chords are built using fifths, fourths, major thirds, and minor thirds. The list of these chords, along with their usual notation and a representative pitch-class set, is presented in Table 9.1. Each triad is considered in root position as composed by pitches belonging to the third octave of the piano. For each type of triad, we consider the twelve different triads obtained on the twelve different roots in the set of pitch-classes S = {C4, C♯4, D4, D♯4, E4, F4, F♯4, G4, G♯4, A4, A♯4, B4}. The consonance of the various triads is calculated using the theory of Plomp and Levelt as exposed in Section 9.1. For each tone with a given frequency, a harmonic spectrum consisting of six partials is generated. The final consonance value of a chord is calculated by evaluating the individual consonance for each pair of frequencies between the partials of all tones. It should be noted that, for a given chord type, the consonance value decreases when the frequency of its root increases. In order to compensate for this effect, we have renormalised the frequencies to the C4 reference tone.

Triad type name      Notation   Representative pitch-class set
Major                M          {C, E, G}
Minor                m          {C, E♭, G}
Diminished           ◦          {C, E♭, G♭}
Augmented            aug        {C, E, G♯}
Suspended second     sus2       {C, D, G}
Suspended fourth     sus4       {C, F, G}

Table 9.1: Names of the studied triads and their corresponding representative pitch-class set.

The triads’ consonance-hierarchical organisation

Once the consonance value of each chord has been calculated, a distance matrix between chords is obtained, wherein the value of each entry (i, j) is equal to the difference of the consonance values associated to the chord j and the chord i. Figure 9.9 shows two distance matrices representing the consonance relationships among triads, computed using the harmonic spectra $h_1 = (1, 1/2, 1/3, \dots, 1/6)$ and $h_2 = (1/3, 1/5, 1, 1/6, 1/3, 1/6)$, respectively.
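The computation described above can be sketched in Python as follows. The details are assumptions rather than the formulas of Section 9.1, which is not reproduced here: the pairwise curve uses Sethares' parametrisation of the Plomp and Levelt model, partial amplitudes are combined by their product, C4 is taken to be 261.63 Hz, and the sketch returns a dissonance value (higher means less consonant). All function names are hypothetical.

```python
import math
from itertools import combinations

C4_HZ = 261.63  # assumed frequency of the C4 reference tone

# Representative triads of Table 9.1, as semitone offsets from the root.
TRIADS = {
    "M": (0, 4, 7), "m": (0, 3, 7), "dim": (0, 3, 6),
    "aug": (0, 4, 8), "sus2": (0, 2, 7), "sus4": (0, 5, 7),
}

H1 = tuple(1.0 / k for k in range(1, 7))   # spectrum h1 of the text
H2 = (1 / 3, 1 / 5, 1.0, 1 / 6, 1 / 3, 1 / 6)  # spectrum h2 of the text


def pair_dissonance(f1, a1, f2, a2):
    """Roughness of two partials (Sethares-style fit, assumed constants)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    scale = 0.24 / (0.0207 * f_low + 18.96)
    x = scale * (f_high - f_low)
    return a1 * a2 * (math.exp(-3.5 * x) - math.exp(-5.75 * x))


def chord_dissonance(semitones, spectrum=H1, root_hz=C4_HZ):
    """Sum the roughness over every pair of partials of every tone."""
    partials = []
    for offset in semitones:
        f0 = root_hz * 2 ** (offset / 12)
        partials += [(f0 * (k + 1), amp) for k, amp in enumerate(spectrum)]
    return sum(
        pair_dissonance(f1, a1, f2, a2)
        for (f1, a1), (f2, a2) in combinations(partials, 2)
    )


def difference_matrix(values):
    """Entry (i, j) is the value of chord j minus the value of chord i."""
    return [[v_j - v_i for v_j in values] for v_i in values]


TRIAD_VALUES = [chord_dissonance(shape) for shape in TRIADS.values()]
TRIAD_MATRIX = difference_matrix(TRIAD_VALUES)
```

Because every chord is evaluated on the same reference root, two triads of the same type always receive the same value in this sketch, so the corresponding entries of the difference matrix vanish; passing `spectrum=H2` shows how the values depend on the harmonic spectrum.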
Notice that in both distance matrices all the blocks on the diagonal have zero values, which is a direct result of the nature of equal temperament: since all intervals have an equal size, the consonance value of a triad of a given type is independent of its root. Moreover, a comparison of the two distance matrices shows how the consonance function depends on the harmonic spectrum. The colours associated to each block of the matrices describe the gain and loss of consonance passing from one class to another. Consider the first column of each matrix. The one associated with a decreasing spectrum tells us that passing from a major triad to another class we always lose consonance (the block matrices associated to these classes are red). The same occurs for the first column of the second matrix. However, the harmonic spectrum in which the third harmonic (the fifth of each note composing the chord) is more powerful than the others visibly alters the perception of the triads. For example, minor and suspended fourth triads share the same consonance value.

Each distance matrix allows us to calculate a corresponding dendrogram, which illustrates the hierarchical clustering of the different triads. In Figure 9.10 we show how a distance-based clustering represents the triad classes as six different clusters. In addition, the dendrograms in the second row of the figure show how different inversions of a major chord (and of every other class in equal tuning) can be characterised in terms of consonance.
[Figure 9.9: Distance matrices between triads in equal temperament. The value of each cell (i, j) is equal to the difference in calculated consonance between the chord j and the chord i. The matrices have been computed using h1 and h2 as harmonic spectra, respectively. Rows and columns are indexed by the 72 triads (six types on twelve roots); the colour scale ranges from -0.06 to 0.06.]
9.3.2 Analysis of block voicings on the consonance-deformed Tonnetze

The Tonnetz labelled with the pitches of the chromatic scale built on the 5th octave of the piano and deformed by the six classes of triads we considered is depicted in Figure 9.11. We start our analysis by describing the dissonance values associated to each vertex of the six configurations of the Tonnetz induced by the triads. We shall see how the generalisation of the consonance function reflects our perception. Note that, at this stage, the deformed Tonnetze provide only a convenient visualisation of chords.

[Figure 9.10: Hierarchical structure of triads’ consonance. In the first row it is possible to observe how the consonance classifies triads according to their classes, by using two different harmonic spectra. In the second row the inversions of the major triads are classified according to their consonance value, computed with h1 and h2, respectively.]

Major triad deformation: $T_{\mathrm{maj}}$. The major triad (C4, E4, G4) is depicted in Figure 9.11a. The height of each vertex of the deformed configuration corresponds to a block voicing whose highest voice is the label of the vertex. Thus, it is not surprising to observe that the chord (C4, E4, G4, C♯5) is the most dissonant point on the polyhedral surface, followed by the vertices labelled by G♯ and E♭. We also retrieve that the two smallest dissonance values correspond to the vertices labelled with G5 and C5; in particular, we have d(C4, E4, G4, G5) < d(C4, E4, G4, C5). This behaviour is explained by the fast decay that characterises the consonance model we considered and by the interaction between the two sequences of overtones generated by the pitches of each chord. Ordering the vertices by increasing height, the next block voicing to be considered is (C4, E4, G4, A5), or a C add13 in standard jazz notation. This kind of harmonic solution is widely used in modern and classical music⁴. The same holds for the configuration corresponding to C add9.
From a modal point of view, the set of notes belonging to the ionian scale {C, D, E, F, G, A, B} represents the least dissonant seven-note set in this particular configuration, while the substitution of the perfect fourth with the augmented one of the lydian scale (F♯) introduces a local maximum in terms of musical tension. The same argument can be applied considering the tension climax provided by the mixolydian scale {C, D, E, F, G, A, B♭}, the mixolydian♭6 {C, D, E, F, G, A♭, B♭} and the phrygian dominant scale {C, D♭, E, F, G, A♭, B♭}.

Minor triad deformation: $T_{\mathrm{min}}$. As expected, in the Tonnetz deformed by the minor triad Cm the vertex corresponding to the minor third is a minimum. Comparing this geometrical state to the one associated to the major triad, we can observe that the major second D results in a less consonant choice, being the leading-tone of the third E♭. The minor seventh (B♭) has a low-dissonance configuration compared to the major seventh (B). Furthermore, the vertex associated to the chord (C4, E♭4, G4, F5) is a minimum of $T_{\mathrm{min}}$. From the modal point of view, we retrieve the results obtained by persistent homology in the previous application: for instance, the vertices labelled with the pitches corresponding to the eolian scale {C, D, E♭, F, G, A♭, B♭} and the dorian scale {C, D, E♭, F, G, A, B♭} share similar configurations, as do the ones corresponding to the hypoionian {C, D, E♭, F, G, A, B} and the hypoionian♭6 scales.

Augmented triad deformation: $T_{\mathrm{aug}}$. The state generated by the augmented triad is characterised by a low-dissonance configuration for the augmented fifth interval, which is part of the underlying chord.
The major seventh has a low-dissonance configuration, and that should not surprise the reader, both for the same frequency-spacing argument used before and for the standard use of ∆♯5 chords, which are naturally generated, for instance, on the third degree of the seventh-chord harmonisation of the harmonic minor scale. The augmented fourth loses its role of leading-tone, since G is not part of the chord; thus the dissonance configuration of F♯ is lower than that of F.

Diminished triad deformation: $T_{\circ}$. Among the cases we analysed, the diminished state is the only one where the vertex associated to the minor second (C♯) is not the absolute maximum. On the modal side, this configuration appears to be reasonable when considering the five standard possible modes associated to a diminished triad, as detailed in Table A.4. From the tonal harmonic point of view, the modal argument we just introduced can be translated in terms of chord or non-chord tones. Generally, diminished chords (occasionally equipped either with a minor or a diminished seventh) bear tension given either by the perfect fourth, represented as a local minimum in Figure 9.11d, or by the minor sixth⁵.

⁴ It suffices to think about A Foggy Day by Gershwin, or the chorus of Man In The Mirror by M. Jackson.
⁵ Examples of the use of diminished chords in modern music can be found in Georgia on My Mind by H. Carmichael, or Make a Mistake with Me by Brad Paisley.

[Figure 9.11: Tonnetze deformed with the dissonance computed from the interaction of a triad in root position and the chromatic scale an octave higher than the chord. Panels: (a) Major; (b) Minor; (c) Augmented; (d) Diminished; (e) Suspended 2; (f) Suspended 4.]

Suspended triads deformation: $T_{\mathrm{sus2}}$ and $T_{\mathrm{sus4}}$. In a tonal composition suspended triads allow one to circumvent or delay a precise tonal choice, since the lack of the third makes their univocal association to a key impossible⁶. This feature is well mirrored in the two simplicial complexes associated to the sus2 and sus4 triads, depicted in Figures 9.11e and 9.11f, respectively. For instance, observe how in Figure 9.11e the consonant vertices correspond to the set of pitches {G, A, B♭, B, D, C, F}, which is the union of a G7 and a Gm7 arpeggio with the addition of the perfect fourth C, which loses the central role it has in the other configurations of the Tonnetz.

What about the geometry?

The structure of the Tonnetz is based on the harmonic relations among the pitches that label its vertices. The displacements of the vertices induced by the consonance of the triads generate maximal and minimal subcomplexes on the Tonnetz that characterise the triads and allow one to recognise them even after removing the labels from the vertices of the surface. The heat-map used to highlight the consonant and dissonant regions of the six deformed Tonnetze represented in Figure 9.11 makes them distinguishable at first sight. The geometries generated by the major triads are characterised by tensions lying on the perfect fifth axis. The configuration induced by a minor triad is characterised by a relevant minor triad of tensions, (C♯, E, G♯) in the example. The same holds for the two classes of suspended 2 and suspended 4 triads, which can be assimilated to the geometries corresponding to a major and a minor triad, respectively. Moreover, the deformed Tonnetze associated to the augmented and diminished triads present completely different configurations in terms of the consonance of block voicings. While in our first analysis of the deformed Tonnetz the study of the sub-level sets of the height function was a natural consequence of our construction, here we can explore different geometric properties of these simplicial complexes. These properties will be used at the end of this chapter to induce several filtrations on these shapes, in order to classify them through persistent homology.
9.3.3 Gaussian curvature: a geometric music feature

In this section we use the discrete Gaussian curvature to analyse the different geometric states of the Tonnetz. In the next paragraph we provide an intuition concerning this geometrical property. Thereafter, we will give its musical interpretation in the case of the consonance-based deformed Tonnetz.

Intuition

Consider a planar, unit speed curve $\gamma : [0, 1] \subset \mathbb{R} \to \mathbb{R}^2$. Its curvature is defined as the length of its acceleration, $\kappa(t) = \|\ddot{\gamma}(t)\|$. Geometrically, the curvature at the point $p = \gamma(t)$ is described by the circle tangent to γ at p whose acceleration vector at p is equal to the one of γ. This circle is called the osculating circle; see Figure 9.12a for its representation. Thus, the curvature is $\kappa(t) = 1/R$, where R is the radius of the osculating circle. By this definition, the curvature is non-negative for every $t \in [0, 1]$. By choosing a normal vector field N along γ, the curvature can take both positive and negative values. The resulting function $\kappa_N$ is called a signed curvature.

The curvature of a surface S is described by a couple of numbers at each point p. Let N be the normal vector to the surface at p. Let π be a plane containing N. Its intersection with the surface generates a curve γ, as depicted in Figure 9.12b. We call the principal curvatures of S at p the numbers $\kappa_1$ and $\kappa_2$ realising the minimum and the maximum of the signed curvature $\kappa_N$ when considering all the normal planes π. The Gaussian curvature of a point $p \in S$ is defined as $K = \kappa_1 \cdot \kappa_2$.

⁶ Normally, it is possible to associate a precise tonality to a suspended triad by analysing the harmonic context in which it is used.

[Figure 9.12: Visualisation of the curvature for planar curves and surfaces. Panels: (a) the osculating circle; (b) computing the principal curvature.]

This definition allows one to classify the points of a surface as follows.
1. K > 0: elliptic points, $\kappa_1 \cdot \kappa_2 > 0$; the quadratic approximation of the surface in a neighbourhood of p is an elliptic paraboloid.
2. K = 0: parabolic points, one of the principal curvatures is equal to zero; the quadratic approximation of the surface near p is a parabolic cylinder.
3. K < 0: hyperbolic points, $\kappa_1 \cdot \kappa_2 < 0$; the quadratic approximation of the surface at p is a hyperbolic paraboloid.
4. Umbilical points: $\kappa_1 = \kappa_2 \neq 0$ (elliptic point), or $\kappa_1 = \kappa_2 = 0$ (planar point), in which case it is not possible to determine the shape of the surface near p by examining the second order derivatives.

Intuitively, the discrete Gaussian curvature measures the bending of a polyhedral surface at each vertex. Let $v \in K$ be a vertex; the discrete Gaussian curvature (or angular defect) at v is defined as

\[ K_v = 2\pi - \sum_{i=1}^{n} \theta_i, \]

where the $\theta_i$ are the interior angles at v of the triangles included in the star St(v). Thus, for instance, a positive discrete Gaussian curvature is associated to vertices giving rise to maxima and minima, a negative curvature to saddle points and zero curvature to planar points. The algorithm we used to compute the discrete Gaussian curvature of the consonance-deformed Tonnetze is described in (Cohen-Steiner and Morvan, 2003).

[Figure 9.13: The elliptic paraboloid (a) and the hyperbolic paraboloid (b).]

Label   CM   Cm   C aug   C dim   C sus2   C sus4
C5      +    −    −       +       +        +
C♯5     +    +    +       +       +        −
D5      −    +    −       +       −        −
E♭5     +    +    +       −       +        −
E5      −    +    −       +       −        +
F5      −    −    +       −       −        +
F♯5     +    +    −       +       −        +
G5      +    +    +       +       +        +
A♭5     −    +    −       +       −        +
A5      +    +    −       −       +        +
B♭5     +    +    +       +       +        −
B5      +    −    −       +       −        −

Table 9.2: The sign of the discrete Gaussian curvature characterises each vertex of the Tonnetz by considering its interaction with its star. Here it is possible to compare the curvature values associated to each pitch, in the six classes of triads that we analysed.
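The angular defect is simple enough to be computed directly from its definition. The Python sketch below is a plain illustration of the formula $K_v = 2\pi - \sum_i \theta_i$ on a closed fan of triangles, assuming a vertex with six neighbours; it is not the algorithm of (Cohen-Steiner and Morvan, 2003), and the helper names `angular_defect` and `hexagon` are hypothetical.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero 3-dimensional vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


def angular_defect(center, ring):
    """Discrete Gaussian curvature 2*pi minus the sum of the angles at `center`.

    `ring` lists the neighbours of `center` in cyclic order, so that
    consecutive neighbours span the triangles of the star of `center`.
    """
    spokes = [tuple(q - c for q, c in zip(point, center)) for point in ring]
    total = sum(
        angle_between(spokes[i], spokes[(i + 1) % len(spokes)])
        for i in range(len(spokes))
    )
    return 2 * math.pi - total


def hexagon(heights):
    """Six neighbours on the unit circle, lifted to the given heights."""
    return [
        (math.cos(i * math.pi / 3), math.sin(i * math.pi / 3), h)
        for i, h in enumerate(heights)
    ]


FLAT = angular_defect((0.0, 0.0, 0.0), hexagon([0.0] * 6))
PEAK = angular_defect((0.0, 0.0, 1.0), hexagon([0.0] * 6))
PIT = angular_defect((0.0, 0.0, -1.0), hexagon([0.0] * 6))
SADDLE = angular_defect((0.0, 0.0, 0.0), hexagon([1.0, -1.0] * 3))
```

In this toy configuration the flat vertex has zero defect, the peak and the pit both have positive defect, and the vertex whose neighbours alternate above and below it has negative defect, in agreement with the classification recalled above.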
9.3.4 Musical interpretation

Consider Figure 9.14a: the curvature is positive at the vertices labelled by the pitches C5, G5, A5, B♭5, B5, E♭5, F♯5, C♯5. These vertices correspond either to particularly high maxima or to particularly low minima; hence, they are endowed with a strong characterisation in terms of consonance/dissonance. On the contrary, the vertices labelled with D5, E5, F5, A♭5 are neither maxima nor minima with respect to the directions defined by the triangles included in their star. Symmetrical arguments hold for the other configurations of the Tonnetz depicted in Figure 9.14. $T_{\mathrm{min}}$ and $T_{\mathrm{aug}}$ are the only deformations in which the vertex labelled with C5 has negative curvature (see Table 9.2). The vertex corresponding to G5 has positive curvature in every configuration. It generates highly consonant block voicings when superposed to the major, minor and the suspended triads. On the contrary, it is extremely dissonant when coupled with either the augmented or the diminished triad. Both the vertices associated to C♯5 and B♭5 have negative curvature only in $T_{\mathrm{sus4}}$. The suspended4 triad is meant to be perceived as a centre of gravity between two consecutive tonalities on the circle of fifths, in this case C and F major. Hence, B♭5 loses its strength both in terms of tension (in C major) and of resolution to F. Furthermore, C♯5, labelling the most dissonant chord in the other configurations, is here outclassed by E and F♯ (see Figure 9.11).

[Figure 9.14: Discrete Gaussian curvature on the deformed Tonnetz in different states. Panels: (a) Tmaj; (b) Tmin; (c) Taug; (d) T◦; (e) Tsus2; (f) Tsus4. The colormap shows the smoothed value of the curvature in every point of the surface. We recall that the labelling of the Tonnetze is given by the pitches of the chromatic scale whose root is one octave apart from the root of the triad used to produce the deformation of the Tonnetz.]

[Figure 9.15: Values of the discrete Gaussian curvature on the vertices of the deformed fundamental domain of the Tonnetz. Panels: (a) M vs aug vs sus2; (b) m vs sus4; (c) m vs dim. Horizontal axis: vertex labels E♭, C, A, F♯, B, A♭, F, D, E, C♯, B♭, G.]

The discrete Gaussian curvature allows us to classify the four-note chords whose consonance determines the height of the vertices of the Tonnetz for a fixed triad. We saw how the value of the curvature associated to certain pitches recurs in configurations induced by different triads. A comparison of the columns of Table 9.2 reveals that the sign of the curvature is almost identical for the configurations induced by major, augmented and suspended2 triads. The values of the discrete curvature associated to the vertices of different deformed Tonnetze are represented in Figure 9.15. Although the columns of Table 9.2 associated to the suspended4 and the minor triads are different, it is interesting to notice how five of their vertices share almost the same curvature value. Moreover, notice how the union of the sets of pitches of the major and suspended2 triads gives four or five notes of the major pentatonic scale built on the root of the triads, respectively. The same holds for the minor and the suspended4 triads, with respect to the minor pentatonic scale.

9.3.5 Classification of the consonance-deformed Tonnetze

To conclude the analysis of these consonance-deformed shapes, we compute their persistent homology by considering three different filtrations, induced by the Rips complex (proximity of the vertices as a point cloud), the height function (block voicing consonance) and the more exotic discrete Gaussian curvature.
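The height and curvature filtrations are both sub-level set filtrations of a function defined on the vertices, and their 0-persistence can be sketched with a union-find structure and the elder rule. The Python code below is a minimal illustration on the 1-skeleton of a complex, under the assumption that an edge enters the filtration at the larger of the values of its endpoints; the function name `sublevel_h0` is hypothetical and this is not the implementation used for Figure 9.16.

```python
def sublevel_h0(values, edges):
    """0-dimensional persistence pairs of a sub-level set filtration.

    `values[v]` is the function value (for example height or curvature)
    at vertex v, and `edges` is a list of vertex pairs. Vertices are added
    by increasing value; when an edge joins two components, the younger
    one (larger birth value) dies. Pairs of zero persistence are skipped.
    """
    neighbours = {v: [] for v in range(len(values))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    parent, birth, pairs = {}, {}, []

    def find(v):
        # Representative of the component of v, with path compression.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v in sorted(range(len(values)), key=lambda u: values[u]):
        parent[v] = v
        birth[v] = values[v]
        for w in neighbours[v]:
            if w not in parent:
                continue  # w has not entered the filtration yet
            root_v, root_w = find(v), find(w)
            if root_v == root_w:
                continue
            if birth[root_v] > birth[root_w]:
                young, old = root_v, root_w
            else:
                young, old = root_w, root_v
            if birth[young] < values[v]:
                pairs.append((birth[young], values[v]))
            parent[young] = old

    # Components that never merge give classes of infinite persistence.
    roots = {find(v) for v in parent}
    pairs.extend((birth[r], float("inf")) for r in roots)
    return sorted(pairs)
```

On a path with three vertices of values 0, 2 and 1, the component born at the local minimum of value 1 dies when the middle vertex of value 2 appears, while the component born at 0 persists forever.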
The dendrograms computed by considering the distance between the 0-persistence diagrams of the shapes for each filtration are depicted in the left column of Figure 9.16. The three filtrations highlight different musical properties of the chords:
(a) Rips complex. The six classes of triads are grouped into two almost specular clusters. The first is formed by the two suspended triads and the major one. The second is composed of the diminished, augmented and minor triads. The characterisation of the shape given by the Rips complex classifies the suspended triads near to their resolution, while the minor triad is an outlier with respect to the two most dissonant triads, which have distance 0 in this representation.
(b) Consonance. The filtration induced by the sub-level sets of the height function orders the simplices of T with respect to the consonance of the voicings built on their vertices. As expected, this filtration classifies the triads according to their own consonance value.
(c) Curvature. The grouping of the triads obtained by studying the sub-level sets of the discrete Gaussian curvature reflects our observations. We retrieve the pentatonic link between the major-suspended2 and minor-suspended4 triads, respectively, as well as the segregation of the diminished mode and the similarity of the triplet formed by the suspended2, augmented and minor triads.

Once the musical interpretation of the geometric properties of the shapes we generated is clear, persistent homology allows us to explore different parameters of the consonance function. The right column of Figure 9.16 has been obtained by changing the harmonic spectrum used in the consonance function to h = (1, 1/2, 1/3, 1/4, 1/5, 1/6). The classification of the shapes provided by the Rips complex is almost unchanged; the filtration induced by the height function points out the similarity between the shapes generated by the major and minor triads, while the others are segregated.
The consonance function, in this second and more realistic configuration, groups the shapes associated with the suspended2 and major triads, while another cluster contains the augmented and diminished triads. The minor and suspended4 triads are outliers.

9.4 Discussion

In Sections 9.2 and 9.3 we suggested two applications of consonance calculation for the hierarchical organisation of musical entities. The filtration induced by the Rips complex, built on a point cloud representing each mode, allowed us to compute a dissimilarity among modal scales. The octave dependency of the consonance function has been discussed. It has been shown how the mixolydian modes maintain the same relationship among them, when considering a fixed reference pitch and scales built on two consecutive octaves. In general, the filtration defined by the Rips complex segregates recognisable sonorities, i.e. modal scales that, together with a bass note used as accompaniment, give rise to what we could refer to as a Spanish sonority, or a bluesy one, and so forth.

In Section 9.3, we have studied the organisation of common triads. A simple generalisation of the Plomp and Levelt consonance model allows us to associate a unique consonance value with each triad class. It has been shown how the consonance is invariant under transposition of the same triad and how different classes of triads are recognised, despite changes of the harmonic spectrum used to compute their consonance. Later on, this was used to study the relations between different pitch classes in a given chordal context, by the usual update of the height of the vertices of the Tonnetz. This led to a variable geometry space which is suitable for the analysis of different chord classes, and provides a link between the harmonic and the melodic level, intended as the block voicings of a triad on the chromatic scale.
The surfaces obtained in this way have been discussed, considering the height function as well as their discrete Gaussian curvature. In conclusion, three different filtrations have been used to classify these six classes of triads, according to the geometric properties of their representations as deformed Tonnetze. Once the meaning of the function used to induce the filtration is clear, the classification provided by persistent homology allows us to study the effect of the harmonic spectrum on the geometry of the deformed Tonnetze, and hence on the tension/resolution patterns of each pairing of a triad class and chromatic scale.

Figure 9.16: Hierarchical clustering of consonance-deformed Tonnetze generated by triads and two harmonic spectra: (1, 1, 1, 1, 1, 1) on the left column and (1, 1/2, 1/3, 1/4, 1/5, 1/6) on the right. (a), (b) Rips complex. (c), (d) Height function. (e), (f) Discrete Gaussian curvature.

10 Discussion and future works

The three applications described in this part represent a first formalisation of topological and geometrical music analysis. Music features have been represented as polyhedral surfaces and point clouds. After the analysis and discussion of several datasets, persistent homology proved to be an efficient tool both in a purely symbolic and in a hybrid signal/symbolic context.

Figure 10.1: Chromagrams.

The first model we suggested for the analysis and classification of music, based on pitch classes and note durations, has been realised considering different datasets of MIDI files. The stability of the height function allows us to extend this analysis to audio files.
It is possible to describe the pitch classes and durations of the notes in an audio file by computing a chromagram (Harte and Sandler, 2005). Such a representation is obtained by passing from the time domain to the frequency domain through a fast Fourier transform. Pitch classes are then obtained by wrapping frequencies onto a single octave. Each pitch class is represented taking into account its magnitude. The first two chromagrams (1) in Figure 10.1 describe the pitch-class contribution during a perfect cadence Dm7 - G7 - Cmaj9 - Cmaj7 played using two virtual instruments emulating a Fender Rhodes and a Bösendorfer, respectively. The third one represents a small fragment (≈ 10 seconds) of a jazz composition involving two guitars, a basic drum set and a bass. From such a representation it is possible to deduce which notes are played, considering their magnitude (colour), and how long they last (horizontal axis). In this case, the height function would surely be affected by the noisy data coming from the signal. However, the stability of the persistence diagrams for tame functions ensures that a small perturbation of the function that induces the filtration corresponds to small variations of the persistence diagrams.

(1) The figures were produced with the Librosa Python library, available at https://bmcfee.github.io/librosa/.

In order to generalise the consonance-based applications to the analysis of complex signals, it is necessary to retrieve more information than is represented in a chromagram. One possibility is the Snail Analyzer-Tuner (http://medias.ircam.fr/x1b825e at minute 20), a piece of software developed at IRCAM that displays the same information represented in a chromagram over the whole audible frequency spectrum.
This information, coupled with a chord detection algorithm (Mauch et al., 2009; Ellis and Weller, 2010), would make it possible to compute the consonance values and hence to update the height of each vertex of the Tonnetz in time, directly from an audio signal.

The model itself can be refined in several ways. It is possible to increase its dimensionality, which sacrifices easy visualisation but allows more information to be encoded. For instance, one could associate with each pitch class of the Tonnetz its velocity, or merge the pitch-class/duration and consonance approaches. Moreover, topological persistence offers further tools to improve the strategies we suggested. A natural development is the study of the multidimensional persistent homology (Cagliari et al., 2010; Cerri et al., 2013; Carlsson and Zomorodian, 2009) of musical spaces.

Part IV
Harmonic sequences and persistence time series

Table of Contents

11 Harmonic time series and pop music
  11.1 Symbolic sequence alignment
    11.1.1 Pairwise sequence alignment
    11.1.2 Multiple sequence alignment
  11.2 Harmonic sequences
    11.2.1 From harmonic progressions to symbolic sequences
    11.2.2 Weighting matrices
  11.3 Applications
    11.3.1 Database, visualisation and notation
    11.3.2 Cover recognition
    11.3.3 Metadata and clustering
    11.3.4 Semiotic clustering
    11.3.5 Towards semantic clustering
    11.3.6 Motif mining and molecular clock
  11.4 Discussion and perspectives
12 Musical Persistence Snapshots
  12.1 Persistence and time varying systems
    12.1.1 State of the art
  12.2 Dissimilarity of persistence time-series
    12.2.1 Dynamic Time Warping algorithm for persistence time series
  12.3 Applications
    12.3.1 Musical interpretation
    12.3.2 Optimal persistence warping path
    12.3.3 Dissimilarity of persistence time series
  12.4 Discussion and perspectives

Abstract

This part is devoted to the analysis of the time-dependent nature of music. In Chapter 11 we suggest a novel approach to the analysis of popular music, based on the multiple (read simultaneous) alignment of symbolic sequences. We consider the sequences derived from several harmonic-oriented analyses of the harmonic progressions of a dataset of 138 compositions. Given a harmonic progression, music theory provides the tools to retrieve tonal centres, cadences and modulations. We take advantage of these high-level features to define three families of symbolic sequences associated with our dataset. Their global pairwise alignment is used to tackle several problems, such as the detection of cover tracks and the retrieval of genres and artists. Finally, multiple sequence alignment is computed to produce an encompassing analysis of the transfer of musical patterns among the heterogeneous collections of songs, artists and genres of the dataset we analysed. This chapter represents joint work with Philippe Esling (Esling and Bergomi, 2015).
The main contribution of Chapter 12 is the adaptation of the model proposed in Chapter 8 (whose aim was the analysis of persistent properties of music) to the study of the time-varying geometry of the Tonnetz, when its vertices are deformed according to the pitch classes and durations of the notes of a composition. In particular, we consider the time series arising from the topological analysis of the natural time framing of music, provided by its subdivision into bars. The state of the art concerning the representation of time-varying systems in the formalism of topological persistence is discussed, and a method to align these time series composed of topological subfingerprints is provided. We investigate the musical relevance of the information carried by the time series of persistence diagrams, as well as the analysis of their dissimilarity. In particular, we focus on three datasets collecting classical, pop and jazz compositions, respectively. The implementation of this last application is joint work with Adriano Baraté.

11 Harmonic time series and pop music analysis

The past decade has witnessed a growing interest in content-based retrieval for multimedia databases (Yoshitaka and Ichikawa, 1999). A large amount of work has been devoted to performing similarity queries and genre recognition over databases of songs, leading to the field of Music Information Retrieval (MIR) (Casey et al., 2008). However, in this field, the signal-based and symbolic-based approaches to musical analysis are often considered antipodal strategies. Could these viewpoints coexist as complementary? Would it be possible to improve the results provided by signal analysis by raising its abstraction level through a symbolic framework?

The limitations of genre recognition tasks have recently been exposed.
Indeed, since the first work on musical genre recognition in 1995 (Matityaho and Furst, 1995), it seems that most research still revolves around a signal-based classification of a ground truth annotation of music genre provided by human experts and a train/test paradigm (Sturm, 2014). However, the reference databases for the evaluation of these systems suffer from multiple flaws, such as duplicates, corrupted files, genres made of single artists and wrong (or too subjective) genre labelling (Sturm, 2013a,b). Furthermore, this simplifying task is far from accounting for the fact that musical inspiration and the transfer of different musical patterns go way beyond the notion of musical genre.

We introduce an innovative way to analyse pieces of popular music (termed here pop music), by performing a high-level symbolic analysis of their content. Our main goal is to provide an encompassing view of the cadential patterns and modulation motifs in pop music (Everett, 2000; von Appen et al., 2015) and of how these can reveal artistic influences across various musical genres. In order to achieve this aim, the harmonic progressions corresponding to a dataset of pop songs are derived from their audio signals and a harmonic analysis is performed. As a result, we retrieve three different classes of symbolic sequences describing high-level tonal features of each song. We further analyse the similarity among pairs of sequences belonging to the same symbol class by relying on several state-of-the-art global sequence alignment algorithms typically used in time series (Esling and Agon, 2012) and genetic analyses.

Some interesting properties of this approach are discussed. First, it is shown how the similarities of these sequences can provide a valuable instrument to refine hand-made semiotic segmentations of the songs. Second, the accuracy of the harmonic transcription and the harmonic analysis is evaluated by performing a cover track recognition task. In addition, the clusterings generated by
In addition, the clusterings generated by 159 160 CHAPTER 11. HARMONIC TIME SERIES AND POP MUSIC each possible combination of weight matrix and alignment algorithm are evaluated by measuring their cluster-wise accuracies with respect to the metadata corresponding to the considered collection of songs. Although our analyses provide higher-level assessments of melodic and harmonic similarities across musical pieces, we show that it still bears some coherence in the traditional evaluation paradigm of genre and artist recognition. We further introduce the application of Multiple Sequence Alignment (MSA) (Thompson et al., 2011) in the analysis of music at a symbolic level. This analysis is performed by considering several state-of-art MSA algorithms, which are evaluated through a variety of quality metrics. As a result of this multiple alignment procedure, it is possible to compute both the consensus sequences associated to each cluster and perform an analysis of motifs over the whole dataset. The former represents a paradigmatic sequence of modulations, which exhibits the typical path followed in the songs belonging to a certain cluster, while the latter provides an evident representation of the harmonic contamination and artistic inﬂuences among pop songs. Hence, we perform an encompassing analysis across pop music, based on the MSA structure organised across diﬀerent clusters. By analogy to the well-known molecular clock hypothesis in genetics (Martin and Palumbi, 1993) which allows to evaluate the similarity between species based on their shared amount of genetic similarity, we show that these types of analyses provide higher-level views on musical similarity across genres, with various artists inﬂuencing each other over time. 
11.1 Symbolic sequence alignment

11.1.1 Pairwise sequence alignment

Given a pair of symbolic sequences $S_1$ and $S_2$ of potentially different lengths $n, m \in \mathbb{N}$, composed of symbols from a given alphabet $\Sigma$, and a scoring function $\delta(x, y)$ that defines the similarity between two symbols $(x, y) \in \Sigma^2$, the goal of pairwise alignment is to find the sequences $S^*_1$ and $S^*_2$ of equal length $k$, obtained by inserting gaps in the two original sequences, such that the sum of similarity scores is maximised.

Definition 11.1.1. Given two sequences $S_1 \in \Sigma^n$ and $S_2 \in \Sigma^m$, pairwise alignment seeks the sequences $S'_1 \in \Sigma^k$ and $S'_2 \in \Sigma^k$ ($k \geq \max(m, n)$) such that $\|S'_1 - S'_2\|_\delta$ is minimal.

Based on a set of symbolic sequences, we can evaluate their pairwise similarities by aligning two sequences at a time (Sankoff, 1972). This alignment can be based on any type of symbolic information, provided that the two sequences are composed of symbols with the same underlying meaning. Pairwise alignment gives information about the similarity between two sequences, but also about their inner structure. Hence, it can be used to find common patterns, or to assemble sets of sequences (fragment assembly).

GSAQVKGHGKKVADALTNAVAHV---D--DMPNALSALSDLHAHKL
:: ::::|: || : :| :: | ||||:|: |:::|: |
NNPELQAHAGKVFKLVYEAAIQLQVTDVVDMPNTLKNLGSVHVSKG

Figure 11.1: Example of global alignment between two (apparently) weakly related sequences. Exact matches are identified by (|) and related matches are identified by (:). Even though the symbols in both sequences are quite different, most of them are actually closely related in their functions, which implies that the sequences share a high amount of similarity.

The different issues related to pairwise alignment are that
• Most of the sequences we are comparing will differ in length
• There may be only relatively small matching regions in the sequences
• We want to allow variable matches between the symbols

It should be noted that three types of alignment can be performed. A global alignment seeks the best match between both sequences in their entirety, and it is the only type of alignment used in this work. A local alignment finds the best subsequence match, even in very small portions of the sequences. Finally, a semi-global alignment seeks the best global match without penalising gaps at the ends of the alignment. An example of global pairwise alignment is displayed in Figure 11.1.

Levenshtein (edit) distance

The first way to obtain the alignment between two symbolic sequences is through the Levenshtein distance (also called edit distance) (Levenshtein, 1966), which considers that three types of differences can arise when comparing two symbolic sequences:

Substitutions: ACGA ⇒ AGGA
Insertions: ACGA ⇒ ACCGA
Deletions: ACGA ⇒ ACA

The Levenshtein distance is defined as the minimal number of applications of these operations required to transform one sequence into the other. The main problems of this distance are that all operations are considered equivalent (the same score is assigned to any change) and that only the binary match/mismatch relationship is taken into account (symbols cannot be more or less related).

Dynamic Programming

Dynamic Programming (DP) provides an optimal solution to the global alignment problem. Its basic assumption is that the optimal solution can be found by aggregating several optimal solutions computed on smaller parts (subsequences) of the problem (Berndt and Clifford, 1994).

Scoring scheme

This approach relies on a substitution (or weight) matrix $\delta(x, y)$, which indicates the score of aligning any characters $x$ and $y$ from our alphabet $\Sigma$. Moreover, the scoring uses a gap penalty function $w(k)$, which indicates the cost of a gap of length $k$, usually through a linear cost $w(k) = g \cdot k$ where $g \in \mathbb{R}$ is a constant.
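As a small illustration of the scoring scheme, the sketch below scores an alignment that has already been gapped. The function names, the match/mismatch values of `delta` and the value of `g` are hypothetical choices made for the example; they are not the weight matrices used later in this chapter.

```python
GAP = "-"


def delta(x, y):
    """Hypothetical substitution score: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1


def alignment_score(a, b, delta, g):
    """Score two gapped sequences of equal length.

    Every gap symbol contributes g, so a run of k consecutive gaps
    contributes g * k, which is the linear penalty w(k) = g * k.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == GAP and y == GAP:
            raise ValueError("a column cannot contain two gaps")
        if x == GAP or y == GAP:
            score += g  # one more position in a gap
        else:
            score += delta(x, y)  # substitution (or match) score
    return score


# Insertion example from the text: ACGA => ACCGA, with g = -2 (assumed value).
print(alignment_score("AC-GA", "ACCGA", delta, -2))
```

With these assumed values, four matches and one gap position are summed; a different `delta` or `g` changes which of two candidate alignments scores higher, which is the point of the scoring scheme.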
The idea of dynamic programming is that, given a sequence $S$ of length $n$ and a sequence $T$ of length $m$, we can construct an $(n + 1) \times (m + 1)$ matrix $F$ such that $F_{i,j}$ is the score of the best alignment of $S[1 \ldots i]$ with $T[1 \ldots j]$. This means that the score of any cell can be deduced from the scores of its three previous neighbouring cells (up, left and diagonal). Therefore, when extending an alignment in the cell $F_{i,j}$, three choices can be made:
• align $S[1 \ldots (i-1)]$ with $T[1 \ldots (j-1)]$ and match $S[i]$ with $T[j]$.
• align $S[1 \ldots i]$ with $T[1 \ldots (j-1)]$ and match a gap with $T[j]$.
• align $S[1 \ldots (i-1)]$ with $T[1 \ldots j]$ and match a gap with $S[i]$.

Hence, one way to specify the DP problem is in terms of its recurrence relation

\[
F(i, j) = \max
\begin{cases}
F(i-1, j-1) + \delta(S[i], T[j]) \\
F(i-1, j) + g \\
F(i, j-1) + g
\end{cases}
\]

Several algorithms have been developed based on this idea, such as the Dynamic Time Warping (DTW) algorithm (Berndt and Clifford, 1994), which is the first to use these principles. However, it is highly sensitive to the presence of outliers and noisy regions. These problems can be alleviated by allowing gaps when matching two sequences, with algorithms such as the Longest Common Subsequence (LCSS) (Das et al., 1997). Finally, the Edit Distance with Real Penalty (ERP) (Chen and Ng, 2004) attempts to combine the merits of DTW and edit distance by using a constant reference point. We will use these three algorithms in our subsequent analyses.

Needleman-Wunsch algorithm

The linear gap penalty function ($w(k) = g \cdot k$) of the DP approach implies that a long gap of $n$ positions has the same impact on the alignment as $n$ gaps disseminated along both sequences. However, it seems obvious that we should favour a single long gap between highly matching sub-sequences (putting emphasis on the similarity of local structures shared between sequences).
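The recurrence with a linear gap penalty can be sketched as follows. This is a minimal illustration, not the implementation used for the experiments: the function names are hypothetical, and the `unit` scoring function (0 for a match, -1 otherwise) together with `g = -1` is an assumption chosen so that the negated optimal score coincides with the Levenshtein distance.

```python
GAP = "-"


def global_align(s, t, delta, g):
    """Global alignment with a linear gap penalty.

    Returns (score, gapped_s, gapped_t). Since the recurrence takes a
    maximum, delta is a similarity and g is expected to be negative.
    """
    n, m = len(s), len(t)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    # Borders: a prefix aligned only against gaps.
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + g
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + g
    # Fill: each cell depends on its diagonal, upper and left neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + delta(s[i - 1], t[j - 1]),
                F[i - 1][j] + g,
                F[i][j - 1] + g,
            )
    # Traceback from the last cell to the origin.
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + delta(s[i - 1], t[j - 1]):
            out_s.append(s[i - 1])
            out_t.append(t[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + g:
            out_s.append(s[i - 1])
            out_t.append(GAP)
            i -= 1
        else:
            out_s.append(GAP)
            out_t.append(t[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_s)), "".join(reversed(out_t))


def unit(x, y):
    """Assumed unit scoring: 0 for a match, -1 for a mismatch."""
    return 0 if x == y else -1


score, a, b = global_align("ACGA", "ACCGA", unit, -1)
print(-score, a, b)  # -score is the edit distance under these unit costs
```

Replacing `unit` with a graded weight matrix is what lets symbols be "more or less related", which the plain edit distance cannot express.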
The Needleman-Wunsch (NW) algorithm (Needleman and Wunsch, 1970) was introduced to handle this mechanism by providing an affine gap penalty function

\[
w(k) =
\begin{cases}
\alpha + \beta k & k \geq 1 \\
0 & k = 0
\end{cases}
\]

where $\alpha$ defines the cost of opening the gap and $\beta$ defines the cost of extending it. By choosing $\alpha > \beta$, we implicitly penalise small sporadic gaps, as it costs more to open a gap than to extend an existing one. In order to keep the computational complexity of this refined alignment in $O(n^2)$ time, the NW algorithm relies on three different scoring matrices instead of a single one. First, the matrix $M(i, j)$ defines the best score given that $S[i]$ is aligned to $T[j]$. Second, $I_S(i, j)$ defines the best score given that $S[i]$ is aligned to a gap, and $I_T(i, j)$ defines the best score given that $T[j]$ is aligned to a gap. Hence, the NW algorithm redefines the previous recurrence relations as

\[
M(i, j) = \max
\begin{cases}
M(i-1, j-1) + \delta(S_i, T_j) \\
I_S(i-1, j-1) + \delta(S_i, T_j) \\
I_T(i-1, j-1) + \delta(S_i, T_j)
\end{cases}
\]

\[
I_S(i, j) = \max
\begin{cases}
M(i-1, j) + \alpha + \beta \\
I_S(i-1, j) + \beta
\end{cases}
\]

\[
I_T(i, j) = \max
\begin{cases}
M(i, j-1) + \alpha + \beta \\
I_T(i, j-1) + \beta
\end{cases}
\]

The overall NW algorithm can be drafted in three steps:

1. Initialisation
   a) $M(0, 0) = 0$
   b) $I_S(i, 0) = \alpha + \beta i$
   c) $I_T(0, j) = \alpha + \beta j$
2. Fill the three matrices ($M$, $I_S$ and $I_T$) together iteratively
3. Traceback
   a) Start at the largest value among $M(n, m)$, $I_S(n, m)$ and $I_T(n, m)$
   b) Stop at any of $M(0, 0)$, $I_S(0, 0)$ and $I_T(0, 0)$

On the influence of the scoring matrix

One of the core concepts shared by most variants of the DP and NW algorithms is that they rely on a scoring function $\delta(x, y)$, which provides a mechanism to define variable symbolic matching. Hence, one of the key factors in the success of alignment algorithms lies in this prior knowledge of the (dis)similarities between symbols. This can be defined as the dissimilarity measure $\delta(x, y)$ between symbols $x$ and $y$ (usually summarised in a weight matrix).

Remark 16.
The scoring function is not necessarily a metric. We recall that a function $\delta$ is called a metric if it is non-negative, vanishes exactly when $x = y$, is symmetric, $\delta(x, y) = \delta(y, x)$, and is subadditive, $\delta(x, z) \leq \delta(x, y) + \delta(y, z)$.

The definition of this scoring matrix highly influences the resulting alignment. Furthermore, we can evaluate the score of an alignment by using the sum of all distances

\[
D(S_1, S_2) = \sum_k \delta\left(s_1^k, s_2^k\right)
\]

or by minimising the entropy of each column, given by

\[
D(s_i) = -\sum_a c_{ia} \log_2(p_{ia})
\]

where $s_i$ is the $i$th column of an alignment $s$, $c_{ia}$ is the number of occurrences of character $a$ in column $i$ and $p_{ia}$ is the probability of character $a$ in column $i$. The effect of devising different scoring matrices is displayed in Figure 11.2.

Applications to music

Over the past years, several studies have relied on pairwise sequence alignment for musical data analysis. Most of these works are devoted to content-based querying such as Query By Humming (QBH), which retrieves a song from a database based on a query hummed by the user (Pardo and Sanghi, 2005). Other works have targeted the use of sequence alignment to improve optical music recognition from multiple recognisers (Bugge et al., 2011), score following in order to distinguish between aligned and non-aligned audio frames (İzmirli and Dannenberg, 2010), cover detection using local alignment algorithms (Martin et al., 2012) and folk music analysis (Bergomi and Andreatta, 2015; Bergomi et al., 2015).

11.1.2 Multiple sequence alignment

Pairwise alignment defines a similarity, but also finds common local structures between sequences. However, if an entire set of sequences is analysed, pairwise alignment fails to provide a more encompassing level of reasoning, as it is unable to align multiple sequences at the same time.

Definition 11.1.2. The problem of Multiple Sequence Alignment (MSA) can be defined as finding, from a set of $k$ sequences of various lengths $S = \{S_1, S_2, \ldots$
$, S_k\}$, the aligned set of $k$ equal-length sequences $S^* = \{S^*_1, S^*_2, \ldots, S^*_k\}$, where $S^*_i$ is obtained by inserting gaps into $S_i$, $\forall i \in [1, k]$, while minimising the overall dissimilarities between the symbols.

Compared to pairwise alignment, MSA requires a global objective to minimise across the whole set of sequences. The most straightforward way to define an error function to minimise in an MSA problem is to rely on the Sum-of-Pairs (SP) score

\[
SP_{score}(a_1, \ldots, a_k) = \sum_{1 \leq i < j \leq k} \delta(a_i, a_j)
\]

where $(a_1, \ldots, a_k)$ is a column of the alignment, composed of symbols from our dictionary (or gaps), and $\delta(a_i, a_j)$ is the distance defined in our weight matrix. Then, the overall score of an alignment $S^*$ can be defined as

\[
SP_{score}(S^*) = \sum_x SP_{score}\left(S^*_1[x], \ldots, S^*_k[x]\right)
\]

In other words, we are trying to minimise the position-wise differences in symbols simultaneously for all sequences in the alignment.

As opposed to pairwise alignment, there has been, to the best of our knowledge, no application of this approach to musical data. The only work relying on MSA was aimed at lyrics alignment (Knees et al., 2005), where lyrics extracted from the internet were used to perform faster retrieval of songs. We provide in this work the first assessment of MSA for harmonic and motivic analyses.

Figure 11.2: Using different grammars (symbolic information) and different weighting matrices can lead to dramatically different final alignments and similarities between the sets of sequences.

Figure 11.3: Multiple sequence alignment of 3 sequences through dynamic programming.
(a) Given a set of 3 sequences to align, (b) we can construct a 3-dimensional matrix in which (c) each cell defines 7 different paths. (d) Following the same procedure as pairwise alignment, we can find the optimal (e) multiple sequence alignment. (f) An interesting property is that we can project the multidimensional path onto bi-dimensional planes to obtain pairwise alignments between any sequences of the set.

Dynamic programming

As seen previously, dynamic programming is an excellent tool to perform the alignment of two sequences, as it provides the global optimum to this problem. This technique can be extended to perform the alignment of a set of $k$ sequences and provides the optimal solution for this set. We can rewrite the original dynamic programming equation as

\[
V(i_1, i_2) = \max_{(b_1, b_2) \in \{0,1\}^2 - \{(0,0)\}} \left\{ V(i_1 - b_1, i_2 - b_2) + \delta\left(S_1[i_1 b_1], S_2[i_2 b_2]\right) \right\}
\]

where $S_n[i_n b_n]$ denotes the symbol $S_n[i_n]$ when $b_n = 1$ and a gap when $b_n = 0$. This equation simply states that the best path from one cell depends on its 3-neighbourhood of previous cells in the scoring matrix. As this form is closely related to that of the SP-score, we can extend it by considering

\[
V(i_1, \ldots, i_k) = SP_{score}\left\{\mathrm{align}\left(S_1[1 \ldots i_1], \ldots, S_k[1 \ldots i_k]\right)\right\}
\]

Hence, the score of the last column would be the SP-score of the optimal alignment between the $k$ sequences. Therefore, for each cell of the $k$-dimensional matrix

\[
V(i_1, \ldots, i_k) = \max_{(b_1, \ldots, b_k) \in \{0,1\}^k - \{(0, \ldots, 0)\}} \left\{ V(i_1 - b_1, \ldots, i_k - b_k) + SP_{score}\left(S_1[i_1 b_1], \ldots, S_k[i_k b_k]\right) \right\}
\]

Therefore, the SP-score of the optimal multiple alignment of $S = \{S_1, S_2, \ldots, S_k\}$ is $V(n_1, \ldots, n_k)$, where $n_i$ is the length of $S_i$ (i.e. the “last” cell of the scoring matrix). Overall, we fill the $k$-dimensional scoring matrix as in the two-sequence case in order to compute $V(n_1, \ldots, n_k)$. This process is detailed in Figure 11.3. However, it should be noted that the complexity is exponential in the number of sequences to align.
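To make the exponential growth concrete, the sketch below fills the $k$-dimensional table by memoised recursion. It is an illustration under stated assumptions rather than the procedure used in this work: the function names are hypothetical, and the SP objective is treated as a cost to minimise with an assumed unit distance (0 for equal symbols, including two gaps, and 1 otherwise).

```python
from functools import lru_cache
from itertools import combinations, product

GAP = "-"


def delta(x, y):
    """Assumed unit distance: 0 if equal (two gaps included), 1 otherwise."""
    return 0 if x == y else 1


def sp_column(column):
    """Sum-of-pairs cost of one alignment column."""
    return sum(delta(x, y) for x, y in combinations(column, 2))


def moves(k):
    """All advance vectors (b_1, ..., b_k) in {0,1}^k except the zero vector."""
    return [b for b in product((0, 1), repeat=k) if any(b)]


def exact_msa_cost(seqs):
    """Minimal SP cost of a multiple alignment, by exhaustive DP."""
    k = len(seqs)
    steps = moves(k)

    @lru_cache(maxsize=None)
    def V(idx):
        if not any(idx):
            return 0  # origin of the k-dimensional table
        best = None
        for b in steps:
            prev = tuple(i - s for i, s in zip(idx, b))
            if min(prev) < 0:
                continue  # this move would leave the table
            # b_n = 1 consumes a symbol of sequence n, b_n = 0 inserts a gap.
            column = tuple(
                seqs[n][idx[n] - 1] if b[n] else GAP for n in range(k)
            )
            cand = V(prev) + sp_column(column)
            if best is None or cand < best:
                best = cand
        return best

    return V(tuple(len(s) for s in seqs))


print(len(moves(3)))                      # paths entering each cell for 3 sequences
print(exact_msa_cost(("AB", "AB", "A")))  # minimal SP cost of a tiny example
```

Each cell is reached through $2^k - 1$ moves and the table has one dimension per sequence, so both time and memory grow exponentially with $k$; the sketch is only practical for a handful of short sequences.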
Therefore, this algorithm rapidly becomes impossible to apply, both in terms of computation time and memory requirements.

Center-star method (approximation)

It would be preferable to obtain a good approximation of the optimal alignment in polynomial time. The centre star method was one of the first methods proposed to minimise the SP-score in an efficient way. The main idea behind this method is to find a reference (centre) sequence inside the set of sequences to align, and then to align all other sequences with this reference. In order to find the reference sequence, we compute the pairwise alignments of all pairs of sequences and select the sequence that minimises the sum of distances (which represents the centroid of the set). Then, based on the pairwise alignments, we can iteratively find the multiple alignment by simply adding gaps to the current alignment. The overall workflow of the centre star method is presented in Figure 11.4.

Figure 11.4: Summary of the centre star algorithm.

Center_Star
Input: A set $S$ of sequences.
Output: A multiple alignment $M$ with an SP-score at most twice that of the optimal alignment of $S$.
1. Compute $D(S_i, S_j)$ for all $S_i, S_j \in S$.
2. Find the centre sequence $S_c$ which minimises $\sum_{i=1}^{k} D(S_c, S_i)$.
3. For every $S_i \in S - \{S_c\}$, choose the optimal pairwise alignment between $S_c$ and $S_i$.
4. Introduce gaps into $S_c$ so that the multiple alignment $M$ satisfies the alignments found in Step 3.

Heuristics methods

The star method suffers from several flaws in terms of time and space requirements, but also in the quality of the final alignment.
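The four steps of Center_Star can be sketched as follows. This is a minimal illustration: the helper names are hypothetical, the pairwise distance $D$ is assumed to be the unit-cost edit distance, and the symbol "-" is reserved for gaps, so it must not appear in the input sequences.

```python
GAP = "-"


def pairwise(a, b):
    """Unit-cost global alignment; returns (cost, gapped_a, gapped_b)."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i
    for j in range(1, m + 1):
        F[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = min(F[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                          F[i - 1][j] + 1, F[i][j - 1] + 1)
    ga, gb, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            ga.append(a[i - 1]); gb.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + 1:
            ga.append(a[i - 1]); gb.append(GAP); i -= 1
        else:
            ga.append(GAP); gb.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(ga)), "".join(reversed(gb))


def center_star(seqs):
    """Approximate multiple alignment; rows are returned in input order."""
    # Steps 1-2: the centre minimises the sum of pairwise distances.
    costs = [[pairwise(s, t)[0] for t in seqs] for s in seqs]
    c = min(range(len(seqs)), key=lambda i: sum(costs[i]))
    rows = {c: seqs[c]}  # index -> gapped row of the growing alignment
    for idx, s in enumerate(seqs):
        if idx == c:
            continue
        # Step 3: optimal pairwise alignment of the centre with s.
        _, ca, sa = pairwise(seqs[c], s)
        # Step 4: merge, keeping every gap already opened in the centre.
        cur, merged, new_row = rows[c], {k: [] for k in rows}, []
        i = j = 0
        while i < len(cur) or j < len(ca):
            if i < len(cur) and j < len(ca) and cur[i] == ca[j]:
                for k in rows:
                    merged[k].append(rows[k][i])
                new_row.append(sa[j]); i += 1; j += 1
            elif i < len(cur) and cur[i] == GAP:
                for k in rows:  # old gap in the centre: pad the new row
                    merged[k].append(rows[k][i])
                new_row.append(GAP); i += 1
            else:
                for k in rows:  # new gap in the centre: pad the old rows
                    merged[k].append(GAP)
                new_row.append(sa[j]); j += 1
        rows = {k: "".join(v) for k, v in merged.items()}
        rows[idx] = "".join(new_row)
    return [rows[i] for i in range(len(seqs))]


for row in center_star(["ABBCDDABB", "ABCCDDABB", "ABDDAABBBB", "ABDBBDBB"]):
    print(row)
```

The merge never removes a gap once it has been placed in the centre row, which is why every earlier pairwise alignment stays valid as new sequences are added.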
Furthermore, the centre star method is highly sensitive to the choice of the reference sequence. Hence, several heuristics have been devised to alleviate this problem.

Progressive alignment

Progressive alignment is based on the idea of iteratively aligning the most closely related sequences until all sequences are aligned. Several algorithms have been developed based on this idea, such as ClustalW (Thompson et al., 2002), T-Coffee (Notredame et al., 2000) and ProbCons (Do et al., 2005). All these algorithms follow the same three main steps:

1. Computing pairwise distance scores for all pairs of sequences, through the (triangular) distance matrix containing $D(S_i, S_j)$ for all $S_i, S_j \in S$ (often expressed as the percentage of mismatches). The choice of alignment algorithm and weight matrix both highly influence the final multiple alignment.
2. Generating the guide tree based on the sequence similarities in the distance matrix, to obtain a hierarchical clustering. Any linkage function and clustering algorithm can be used to obtain this iterative grouping (dendrogram of sequence similarities).
3. Aligning the sequences iteratively along the guide tree, starting from the leaves and moving up the tree. Each internal node connecting several sequences represents an alignment of the corresponding sequences. This process is repeated until the root node is reached.

The process of iterative alignment implies several multiple alignments between subsets of the sequences (some of which were already aligned in a previous step), which can be done through the principle of Profile-Profile alignment.

Profile-Profile Alignment

Given two aligned sets of sequences $A_1$ and $A_2$, the profile-profile alignment introduces gaps into $A_1$ and $A_2$ so that both of them have the same length.
In order to determine this alignment, we need a scoring function such as
\[ \mathrm{PSP}(A_1[i], A_2[j]) = \sum_{x,y} g_x^i \, g_y^j \, \delta(x, y), \]
where $g_x^i$ is the observed frequency of symbol $x$ in column $i$ and $\delta(x, y)$ is the distance between symbols $x$ and $y$ (as defined in our weight matrix). Hence, our aim is to find an alignment between the two sets that maximises the PSP score. The overall workflow of the progressive alignment methods is summarised in Figure 11.5.

Figure 11.5: Summary of the progressive alignment algorithm. (a) The similarity matrix is computed based on pairwise alignments. (b) The guide tree is obtained from this matrix. (c) By going up the tree, each node generates a specific alignment between subsets of sequences. (d) When the root of the tree is reached, we obtain the set of multiple alignments.

Iterative methods

The main limitation of progressive alignment is that it does not try to realign the sequences once the MSA is found. Hence, the final alignment is highly sensitive to the quality of the initial alignments, and is not guaranteed to converge to the global optimum. In order to alleviate these flaws, iterative methods introduce heuristics that start from a progressive alignment and then iteratively improve it. Examples of iterative methods are MAFFT (Katoh et al., 2002) and MUSCLE (Edgar, 2004). These algorithms are based on two ideas:

1. Generating a draft multiple alignment as fast as possible, usually through a slightly modified progressive alignment method. First, the distance matrix is computed faster by discriminating sequences based on symbol frequencies. Second, the Unweighted Pair-Group Method using Arithmetic mean (UPGMA) is used to perform the clustering. Finally, the PSP score may favour gaps, as it relies on the direct sum of weighted symbol distances (and gaps are considered as symbols). This can be alleviated by using the log-expectation score in the profile-profile alignment,
\[ \mathrm{LE}(A_1[i], A_2[j]) = \left(1 - f_i^G\right)\left(1 - f_j^G\right) \log\left( \sum_{x,y} f_i^x f_j^y \, \frac{p_{xy}}{p_x p_y} \right), \]
where $f_i^G$ is the proportion of gaps in column $i$ of $A_1$, $f_i^x$ is the proportion of symbol $x$ in that column (and analogously $f_j^G$ and $f_j^y$ for column $j$ of $A_2$), $p_x$ is the overall proportion of symbol $x$ and $p_{xy}$ is the probability that $x$ aligns with $y$. It should be noted that $\frac{p_{xy}}{p_x p_y} = e^{\delta(x,y)}$.

2. In the second stage, the distance matrix is computed by first finding the fraction $D$ of identical symbols shared by two aligned sequences, and computing
\[ -\log\left(1 - D - \frac{D^2}{5}\right). \]
The progressive alignment is iterated, but the sequences are re-aligned only when there are changes relative to the original tree.

The MSA algorithms that will be used here are ClustalW (Thompson et al., 2002), MUSCLE (Edgar, 2004), MAFFT (Katoh et al., 2002), ProbCons (Do et al., 2005) and T-Coffee (Notredame et al., 2000). We will compare different algorithms with various types of symbolic information and different weighting matrices to assess different structural properties of music.

Evaluating MSA results

As MSA can produce widely varying results, we need objective measures of alignment quality. In genetics, this is usually done by comparing the alignment to a known reference. However, in our case, as we do not have a specific reference, we rely on reference-free evaluation methods. The simplest quality metrics are the Total Columns (TC) aligned, the Q-score (percentage of aligned pairs over the total number of pairs) and the Sum of Pairs (SP) score as defined previously (Section 11.1.2).
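To make the simplest of these measures concrete, the following sketch computes a sum-of-pairs score and counts fully conserved columns on the final alignment of panel (d) in Figure 11.5. It is an illustration under stated assumptions, not the evaluation code used in this work: the pair scores (+1 for a match, 0 for two gaps, -1 otherwise) are hypothetical values standing in for a weighting matrix, and counting gap-free identical columns is only one possible reference-free reading of a column-based score.

```python
from itertools import combinations
from typing import Sequence

GAP = "-"


def pair_score(x: str, y: str) -> int:
    """Toy pair score (assumed values): +1 match, 0 gap-gap, -1 otherwise."""
    if x == GAP and y == GAP:
        return 0
    return 1 if x == y else -1


def sum_of_pairs(alignment: Sequence[str]) -> int:
    """Sum the pair scores over every column and every pair of rows."""
    return sum(
        pair_score(x, y)
        for column in zip(*alignment)
        for x, y in combinations(column, 2)
    )


def conserved_columns(alignment: Sequence[str]) -> int:
    """Number of gap-free columns in which all rows carry the same symbol."""
    return sum(
        1
        for column in zip(*alignment)
        if GAP not in column and len(set(column)) == 1
    )


# Final alignment of the five sequences shown in panel (d) of Figure 11.5.
figure_alignment = [
    "B-BGEFCDCAC",
    "BADGEF-DCAC",
    "BBDG-FCD--C",
    "GADG-F-DCCC",
    "GADG-F-DCAC",
]
print(sum_of_pairs(figure_alignment), conserved_columns(figure_alignment))
```

Both functions assume that all rows of the alignment have the same length, which holds for any valid multiple alignment.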
However, more advanced reference-free metrics have also been developed. The Z-score (Ahola et al., 2006) relies on importance sampling and statistical profile analysis to count the number of significantly conserved positions in the alignment. The Multiple Overlap Score (MOS) (Lassmann and Sonnhammer, 2005) assesses alignment quality by expressing the overlap among groups of aligned sequences. The Information Content (IC) (Hertz and Stormo, 1999) provides a log-likelihood scoring scheme based on a priori probabilities of symbol occurrence. The APDB distance (O'Sullivan et al., 2003), the Root Mean Square Deviation (iRMSD) and the Normalized iRMSD (NiRMSD) (Armougom et al., 2006) are based on the idea that if an aligned pair is correct, then the neighbourhood of this pair should also be aligned. Therefore, they are derived by computing the percentage of aligned neighbours across all positions in the alignment. Finally, the Mean Distance (MD) and the Normalized MD (NorMD) (Thompson et al., 2001) combine column scoring and similarity scores by taking a ratio of the number, length and similarity of aligned subsequences.

Motif mining

Once the multiple alignment is obtained, it is straightforward to perform a motif mining analysis. Indeed, as the sequences are now all globally aligned, the search for motifs can be performed by looking for highly conserved "blocks" of symbols. A motif is a particular subsequence that occurs a significant number of times across the set of aligned sequences, possibly with small variations. Here, we rely on the MEME (Multiple EM for Motif Elicitation) algorithm (Bailey et al., 2006) to perform motif discovery from the results of the multiple alignment. MEME works by searching for repeated, ungapped sequence patterns that occur in an aligned set of sequences.
Using a process akin to local sequence alignment based on a set of selected seeds, MEME searches for statistically significant motifs in the input sequence set by sliding these seeds over the multiple alignment.

Computing the consensus sequence

Once the multiple alignment and the motif analysis have been computed, we can obtain the consensus sequences of the different motifs, each representing the "mean" sequence of a particular motif. These consensus sequences allow us to study, at a single glance, different properties of a group considered as a motif (Bailey et al., 2006). Even though several statistical methods have been developed for constructing consensus sequences, we rely here only on the frequency-based method provided by the MEME Suite. An example representation of the consensus sequence is displayed in Figure 11.6.

Figure 11.6: Possible representations of the consensus sequences.

11.2 Harmonic sequences

In this section, we define different types of symbolic sequences obtained by analysing the harmonic features of each song. Moreover, we describe the construction of ad hoc weighting matrices for the alignment of harmonic-based sequences.

11.2.1 From harmonic progressions to symbolic sequences

The lead sheet notation depicted in Figure 11.7a provides a natural interpretation of a harmonic progression as a symbolic sequence. The idea here is to produce a more abstract description of the harmonic content of a song, based on the sequence of symbols describing its chords. To this end, we interpret chords as the degrees of a major tonality (see (Piston et al., 1978, Ch. 2)). In music, a tonality is defined as the collection of triads constructed from a certain scale, as depicted in Figure 11.7b. Assuming that a whole song is written in a single tonality would be simplistic.
However, it is possible to segment it into tonal regions, defined as subsequences of consecutive chords belonging to the same tonality. The algorithm associating a tonality with each chord is based on the spiral array (Chew, 2002). In this model, pitch classes are represented as points on a helix. Thus, a chord (a collection of several pitch classes) is represented by the convex hull of the points describing its pitches on the spiral. The centre of gravity of the convex hull is the representative point of the whole chord. This construction allows several musical entities to be described. Even a whole tonality can be represented in the spiral array, when considered as a collection of pitch classes. Given a sequence of chords, computing the 3 tonalities nearest to each chord in the spiral array allows us to define stable tonal regions over the whole harmonic structure and to avoid sudden tonality changes for small modulations.¹

¹ A harmonic sequence like Dm − C − Am is interpreted as a collection of chords belonging to the tonality of C major (see the chords labelled 2, 1 and 6 in Figure 11.7b). However, it is desirable that the sequence Dm − C − Am − B♭ be interpreted as a harmonic sequence in F major rather than as a C major sequence modulating to F major or B♭. It is common practice to substitute the third degree of a tonality with a major triad (from Em to E in C major). In this case, if an E major triad appears in the middle of a sequence of chords belonging to the tonality of C major, the algorithm hides this brief modulation, maintaining the same tonality.

We shall consider four different kinds of symbolic sequences. Three of them are deduced directly from the tonal analysis of the harmonic progression; the fourth is based on a semiotic analysis of music. As a paradigmatic example, we refer to the harmonic structure of Tune Up, as transcribed in Figure 11.7a.

Figure 11.7: From chords to symbols. (a) Miles Davis, Tune Up: in a lead sheet, the standard chord notation is substituted by symbols. (b) The triad harmonisation of the diatonic scale of C and its seven degrees: C (1), Dm (2), Em (3), F (4), G (5), Am (6), B° (7).

Degree

Neglecting the information regarding the tonality of each song, we consider only the degree associated with each chord. For instance, considering the harmonic structure of Tune Up we obtain 251251251425125, Em being the second degree of the tonality of D major, A the fifth degree, and so forth. The repetition of the same pattern of degrees points out the extensive use of perfect cadences across the whole piece (see (Piston et al., 1978, Ch. 12)).

Spike

We consider the sequence built by taking the differences between the tonalities associated with two consecutive chords. This difference is defined as the cardinality of the set given by the union of the altered notes of each tonality. This is equivalent to counting the number of counterclockwise steps separating the two tonalities on the circle of fifths (see Figure 11.8). In the case of Tune Up, this sequence is 00020020040440. Geometrically speaking, this sequence corresponds to a sequence of spikes whose height depends on the modulation occurring in the harmonic structure.

Tonality

In this case, each chord is substituted by its major tonality.
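The relation between the tonality and spike classes can be sketched in a few lines. The example below starts from the tonalities of Tune Up (D, D, D, C, C, C, B♭, B♭, B♭, D, D, B♭, D, D) and recovers its spike sequence. Two choices are assumptions made for this illustration: each major tonality is placed on a line of fifths by its number of sharps (positive) or flats (negative), and a leading 0 is prepended for the first chord. Both were chosen because they reproduce the sequence 00020020040440 quoted for this piece; the names `FIFTHS_POSITION` and `spike_sequence` are hypothetical.

```python
# Position of each of the 15 major tonalities on the line of fifths:
# number of sharps (positive) or flats (negative) in the key signature.
FIFTHS_POSITION = {
    "Cb": -7, "Gb": -6, "Db": -5, "Ab": -4, "Eb": -3, "Bb": -2, "F": -1,
    "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5, "F#": 6, "C#": 7,
}


def spike_sequence(tonalities):
    """Distance in fifths between consecutive tonalities, with a leading 0."""
    positions = [FIFTHS_POSITION[t] for t in tonalities]
    steps = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return [0] + steps if positions else []


# Tonality sequence of Tune Up, one symbol per chord.
tune_up = ["D", "D", "D", "C", "C", "C", "Bb", "Bb", "Bb",
           "D", "D", "Bb", "D", "D"]
spikes = "".join(str(s) for s in spike_sequence(tune_up))
```

Since only differences of positions are used, transposing every tonality by the same interval leaves the result unchanged, as long as the transposed tonalities stay within the 15 symbols of the dictionary.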
In our example, we have DDDCCCB♭B♭B♭DDB♭DD. Whereas the previous sequence could be visualized as a succession of spikes, this one can be thought of as a step function. The dictionary used to describe tonalities is shown in Figure 11.8.

Figure 11.8: In the circle of fifths, major (and relative minor) tonalities are organized according to the altered notes they contain. Two tonalities one step apart differ by a single note. The only exception is represented by the tonalities of C♯ and C♭, which are separated by a thick line. The bold letters surrounding the circle correspond to the alphabet used to build the tonality class of sequences.

Semiotic annotation

It is natural for a (trained) listener to intuitively segment music while listening to it. The automatic segmentation of a piece of music into meaningful parts (such as introduction, choruses and verses) is a difficult task that has been tackled in several ways. For example, in (Foote and Uchihashi, 2001) a subdivision is derived from the analysis of the rhythmic changes occurring in the song. Another strategy, described in (Aucouturier et al., 2005), is based on evaluating the evolution of the timbre. In (Jensen, 2007) several musical features are interpreted to provide a segmentation of the song into choruses, verses and so forth. It is also possible to define formal techniques to obtain a segmentation of a song into semiotic blocks (Bimbot et al., 2012). The semiotic characterization of a song consists of the definition of a labelling function, defined on a set of symbols and taking its values in the set of semiotic blocks identified during the segmentation. The association between blocks and labels highlights the similarity between different parts of the song. For instance, it is possible to associate more than one block with the same label, or to define a variation of a pre-existing symbol.
This procedure leads to a straightforward definition of a degree of similarity among the blocks.

Summary

The degree class describes the harmonic cadences used in a composition. Its dictionary is composed of 9 symbols. Seven of these symbols correspond to the degrees of the tonality depicted in Figure 11.7b, the eighth denotes the absence of a chord (silences and percussive breaks), and a last symbol is used to label chords involved in small modulations or harmonic substitutions. The two classes of sequences named spike and tonality represent the modulations (Piston et al., 1978, Ch. 8) occurring during the piece. The former class is invariant with respect to musical transposition, while the latter is sensitive to it. Their dictionaries have 15 symbols, corresponding to the 15 major (and relative minor) tonalities. Finally, the sequences belonging to the semiotic class are the only hand-made ones we consider, and they reflect a trained listener's perception of musical blocks.

Figure 11.9: Two weighting matrices expressing the similarity between degrees of a tonality (left) and semiotic labelling (right). The former is computed considering the distances of chords in the spiral array; the latter is deduced from the similarity of the blocks retrieved by the semiotic segmentation of music. (a) Degrees' distance matrix. (b) Semiotic weighting matrix:

        C     D     A     B     X     M     AB    AX    M*
  C     1     0.7   -1    -1    -1    0.5   -1    -1    -1
  D     0.7   1     -1    -1    -1    0.5   -1    -1    -1
  A     -1    -1    1     0.4   -1    -1    0.5   0.8   -1
  B     -1    -1    0.4   1     -1    -1    0.5   0.1   -1
  X     -1    -1    -1    -1    1     -1    -1    0.5   -1
  M     -1    -1    -1    -1    -1    1     -1    -1    0.9
  AB    -1    -1    0.5   0.5   -1    -1    1     -1    -1
  AX    -1    -1    0.8   0.1   0.5   -1    -1    1     -1
  M*    0.2   0.2   -1    -1    -1    0.9   -1    -1    1

11.2.2 Weighting matrices

One of the main ingredients of sequence alignment is the weighting matrix. Here, we consider five different weighting matrices in order to deal with the specific dictionaries of the classes of sequences described above. Let $n \in \mathbb{N}$ be the number of symbols considered.
• The binary matrix $B = (b_{ij})$ is built like an identity matrix, associating a positive score with exact matches (the diagonal entries $b_{ii}$) and a uniform negative score with the entries $b_{ij}$, $i \neq j$, corresponding to mismatches.

• The linear matrix $L$ associates a score with a given pair of symbols by taking into account their distance from the diagonal, and thus from the maximum score (exact match). Let $1 \le i, j \le n$ be two natural numbers. The entries of the matrix $L$ are defined as
\[ l_{ij} = \begin{cases} n & \text{if } i = j, \\ n - |i - j| & \text{otherwise.} \end{cases} \]

• The constructed matrix $C = (c_{ij})$ is built to deal with the seven degrees of a tonality. It is a symmetric matrix and its entries are elements of the set $\{-1, 0.4, 0.6, 1\}$:
\[ c_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0.6 & \text{if } n - |i - j| = 2, \\ 0.4 & \text{if } n - |i - j| = 5, \\ -1 & \text{otherwise.} \end{cases} \]
This particular choice stresses the natural relationships among degrees. Let $d \in \{1, \dots, 7\}$ be a degree of a tonality. By construction, it shares two pitch classes with the degrees $d + 2$ and $d + 5$, modulo 7.

• The alternate matrix is a refinement of the constructed one. It is built by computing the distance between pitch-class triads in the spiral array. There are several ways to interpret a chord of $n$ notes as a point of a metric space (Bergomi et al., 2014a), for instance considering its pitches as the coordinates of a point in $\mathbb{R}^n$, or its pitch classes as a point of the $n$-dimensional torus $T^n = (\mathbb{Z}/12\mathbb{Z})^n$. In Figure 11.9a, the distance between the triads belonging to the harmonization of a tonality is computed considering the centres of gravity of the triangles representing the triads in the spiral array. This particular choice reflects the perceptual relation among the degrees of the tonality. For instance, the first, sixth and third degrees are near, while the second is the farthest, followed by the seventh, fourth and fifth degrees.

• The semiotic similarity matrix is obtained considering the similarity defined by the semiotic labelling function.
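The three matrices that are defined in closed form above (binary, linear and constructed) can be written down directly. The following is a minimal sketch using plain nested lists; the function names are hypothetical, and the scores +1 and -1 of the binary matrix are assumed values, since only their signs are prescribed. Indices run from 0 in the code, which does not affect the entries because they depend only on $|i - j|$.

```python
def binary_matrix(n, match=1.0, mismatch=-1.0):
    """Positive score on the diagonal, uniform negative score elsewhere."""
    return [[match if i == j else mismatch for j in range(n)] for i in range(n)]


def linear_matrix(n):
    """Score n on the diagonal, decreasing linearly with the distance from it."""
    return [[n - abs(i - j) for j in range(n)] for i in range(n)]


def constructed_matrix(n=7):
    """Matrix for the seven degrees, with entries in {-1, 0.4, 0.6, 1}."""
    def entry(i, j):
        if i == j:
            return 1.0
        if n - abs(i - j) == 2:
            return 0.6
        if n - abs(i - j) == 5:
            return 0.4
        return -1.0

    return [[entry(i, j) for j in range(n)] for i in range(n)]


# Degrees 1 and 3 (indices 0 and 2) share two pitch classes, as do 1 and 6.
C = constructed_matrix()
related_to_first_degree = [j + 1 for j, value in enumerate(C[0]) if 0 < value < 1]
```

The alternate and semiotic matrices are not generated by a formula and would have to be entered from Figure 11.9.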
The semiotic matrix is depicted in Figure 11.9b. The matches on the diagonal of the matrix have value 1, while mismatches between unrelated symbols correspond to the −1 entries of the matrix. The distance between similar labels is nuanced according to their definition given in (Bimbot et al., 2012).

11.3 Applications

11.3.1 Database, visualisation and notation

We considered a collection of 138 songs belonging to the Quaero database. These compositions are performed by 72 different artists and cover a timespan of 50 years, from 1962 to 2012. In order to show how even heterogeneous music styles can be tagged as popular, here follows a list of the artists we considered in our analyses.

Quaero's artists: 50 Cents, ACDC, Aerosmith, Ali Farka Toure, Amy Winehouse, Bjork, Bobby McFerrin, Britney Spears, Buckcherry, Buenavista Soc..., Carl Douglas, Cher, CoCo Lee, Cranberries, D'Angelo, Daughtry, Destiny's Child, Dillinger, Dolly Parton, Eminem, Eric Clapton, Faith Hill, Finger Eleven, Flo Rida, Franck Zappa, Georges Michael, Goran Bregovic, Gwen Stefani, Jedi Mind Tricks, Jim Jones, Joan Baez, Judas Priest, Justin Timberlake, Kiss, Lil Wayne, Ludacris, Madcon, Madonna, Mariah Carey, Massive Attack, Michael Jackson, Moby, Neil Young, Obituary, Patrick Hernandez, Pink Floyd, Platiskman, Pucho and his latin soul brothers, Puff Daddy Faith Evans, Radiohead, Ray Charles, Run DMC, Scorpions, Shack, Sweet, The Beatles, The Cure, The Fall A Sides.

The harmonic transcription of each song has been computed using the algorithm presented in (Mauch, 2010). The association between audio and chord symbols is not injective. Consider a chord composed of the pitch classes C, E, G, A. It can be interpreted as a minor seventh chord (Am7) or as a major chord with an added sixth (C6). In order to deal with a small dictionary of chord symbols and to reduce the ambiguity in their retrieval, we transcribed each song using only major, minor and diminished triads.
From this computation we construct three classes of symbolic sequences, and we consider a fourth class given by the semiotic sequences. Let $S = \{\text{Degree}, \text{Spike}, \text{Tonality}, \text{Semiotic}\}$ be the set of these classes. Let $M = \{\text{binary}, \text{linear}, \text{constructed}, \text{alternate}, \text{semiotic}\}$ be the set of weighting matrices and $A = \{\text{DTW}, \text{ERP}, \text{LCSS}, \text{NW}\}$ the collection of alignment algorithms we shall consider. We denote a clustering of our dataset as an element of the set $C \subset S \times M \times A$.

We visualize the information retrieved by the pairwise global alignment of symbolic sequences as a polar dendrogram. The information carried by a dendrogram concerns the similarity and the configuration (clustering) of data. Each joining of sequences (or clusters of sequences) is represented by the splitting of a circular segment into two smaller ones. The position of the split with respect to the centre of the circle allows us to retrieve the similarity between two clusters. Outliers are fused to pre-existing clusters near the centre of the circle. In Figure 11.10, the two sequences grouped in the red cluster are very similar, while the object labelled 50 Cents is an outlier of the big grey cluster on its left. Finally, the multiple sequence alignment of a particular clustering $c \in C$ shall be computed comparing the performances of five algorithms. The analysis of the motifs highlighted by the alignment shall be computed using MEME.

Figure 11.10: Dendrogram obtained by evaluating the dissimilarity among 19 songs of Quaero and 3 Beatles' covers contained in the original set.

11.3.2 Cover recognition

In this first application, we consider a dataset composed of 19 Quaero songs belonging to different genres and 3 cover tracks of songs by The Beatles that are part of the original set. This collection of songs is processed to obtain their sequences of degrees, which are aligned using the NW algorithm weighted with the alternate matrix.
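As a reminder of how a weighting matrix enters the NW (Needleman-Wunsch) global alignment, the following sketch computes the optimal alignment score of two symbolic sequences with a pluggable weight function. It is a sketch under stated assumptions: the linear gap penalty of -1 and the binary weight are illustrative values, the entries of the alternate matrix are not reproduced here, and the function names are hypothetical.

```python
def nw_score(a, b, weight, gap=-1.0):
    """Optimal global alignment score of a and b (Needleman-Wunsch).

    `weight(x, y)` plays the role of the weighting matrix entry for the
    symbols x and y; `gap` is an assumed linear gap penalty.
    """
    previous = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        current = [i * gap]
        for j, y in enumerate(b, start=1):
            current.append(max(
                previous[j - 1] + weight(x, y),  # align x with y
                previous[j] + gap,               # align x with a gap
                current[j - 1] + gap,            # align y with a gap
            ))
        previous = current
    return previous[-1]


def binary_weight(x, y):
    """Binary weighting: +1 for a match, -1 for a mismatch (assumed values)."""
    return 1.0 if x == y else -1.0


# Two degree sequences differing by one trailing chord.
score = nw_score("251251", "25125", binary_weight)
```

Only two rows of the dynamic programming table are kept, so the memory cost is linear in the length of the second sequence; recovering the alignment itself, rather than its score, would require keeping the full table.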
This test aims at exhibiting the coherence of the harmonic information and of the detection of tonal regions. The resulting dendrogram is displayed in Figure 11.10, where the positions of the original songs and their cover tracks are highlighted. As we can see, the original Beatles' songs are always coupled with their respective covers, albeit at a non-negligible distance. This points out the structural changes characterizing the alternative versions of these songs.

11.3.3 Metadata and clustering

Consider a clustering $c \in C$ and denote by $c_i$ its clusters, for $i \in \{1, \dots, n\} \subset \mathbb{N}$. In order to compare these groupings with the traditional genre and artist classification paradigm, we rely on the set of metadata provided with the analyzed songs. The 1-NN accuracy of $c$ with respect to the metadata is computed. The cluster precision and the cluster recall, in terms of retrieval of genres and artists, have been computed for every cluster $c_i \in c$. This information is encoded in the 5-dimensional visualizations depicted in Figure 11.11.

Figure 11.11: Evaluation of several harmonic-oriented clusterings in relation to a genre recognition task. (a) Degrees. (b) Spikes. (c) Tonalities. Different clusterings are represented as colored spheres of variable radius in space. The colour represents the alignment algorithm used to obtain the clustering. The size of the spheres corresponds to the 1-NN accuracy of the clustering, while the height of the spheres depends on the weighting matrix used to generate the clustering. On the cluster precision/cluster recall plane ($z = 0$), the projection of each sphere is depicted as a cross.

As we can see, the best results are obtained with the pairings (alternate, ERP) for degrees, (linear, NW) for spike sequences and (binary, NW) for the class of tonality sequences.
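The 1-NN accuracy used in this evaluation can be illustrated on a toy dissimilarity matrix. The sketch below adopts the common leave-one-out reading, which is an assumption on our part: each song is assigned the label (genre or artist) of its nearest other song, and the accuracy is the fraction of songs for which this label is correct. The function name and the toy data are hypothetical.

```python
def one_nn_accuracy(dissimilarity, labels):
    """Leave-one-out 1-NN accuracy from a square dissimilarity matrix."""
    n = len(labels)
    if n < 2:
        raise ValueError("at least two items are needed")
    hits = 0
    for i in range(n):
        # Nearest neighbour of item i among all the other items.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: dissimilarity[i][j],
        )
        hits += labels[nearest] == labels[i]
    return hits / n


# Toy example: two tight pairs of songs, far apart from each other.
toy_dissimilarity = [
    [0, 1, 5, 6],
    [1, 0, 4, 5],
    [5, 4, 0, 1],
    [6, 5, 1, 0],
]
accuracy = one_nn_accuracy(toy_dissimilarity, ["rock", "rock", "hip hop", "hip hop"])
```

With the same matrix and alternating labels, no song finds a neighbour of its own class, so the measure drops to its minimum.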
The idea is to focus on the information provided by the evolution of the composition in terms of modulations. The resulting dendrograms are depicted in Figure 11.12. It is important to notice that well recognisable artists and genres tend to be mostly grouped together. For instance, in both clusterings several songs by The Beatles are grouped together and the Hip Hop songs are segregated from the others. At the same time, it is possible to notice a certain degree of contamination among songs and artists from different genres. Even though this observation appears obvious to most music listeners, it is difficult to exhibit it using local signal-based features.

11.3.4 Semiotic clustering

Computing the distance between semiotic symbolic sequences produces the clustering of Figure 11.13a. This dissimilarity has been computed using the NW algorithm and the weighting matrix deduced from the semiotic labelling depicted in Figure 11.9b. The clusters are surprisingly well shaped; however, some aberrations appear. For instance, a really homogeneous rock/pop group of pieces by Eric Clapton and The Beatles, labelled in the figure as Pop Rock, is followed by another cluster (Hip Hop) where hip hop and rock songs are mixed together.

11.3.5 Towards semantic clustering

The previously discussed hand-made semiotic segmentation can be refined by considering the information carried by the analysis of the sequences of degrees.
If we consider the Pop Rock and Hip Hop clusters of Figure 11.13a, we can clearly see (1 ( 992 ( 199 ) P (1290087) Puink Flo ) f ( y ( 20 74) Lil f Da d ( 19 05 S W dd 04 ( 19 80 ) B we ayn y Fea Time ( 1 6 ) u et e (1 198992 9) TThe ckch - Fo - Loll t Fait 99 2 ) E he C err x o ipo hE van 2) ) M ric Be ure y - n th p sPi ick Cl at - 0 Sor e ru I ll nk a ap les 4 t ry n Be Mis Fl el J ton - 0 he h oy ac - 5 sin a gY d ks He Oc ngi ou - 0 on y top ng 5 - He us ga G W y s G rde re an ar n at n de G ab n ig e in st th art e i sk ng s y om et hi ng in fM sO bia Eye y a am n lls ck e d a C ami Be Bla ng ad re ls in stra 7 N se D ody eek l e ck a 0 o b W - H a 6 s - Th my ys A e me s C B - 0 ck d h D C - re Tri e An ouc ht Da accu t u C T d D A C C in - M - Eig ou ffe c 0) ) A The di M elo arey - 08 fore y erm e 8 e 9 0 ) e g C s tt (1 198 80 3) J DAn riah eatle n - B shor d them ( 19 0 ) a B pto 2 a an ( 2 0 95 ) M he la - 0 Us ( 19 08 T ic C re 7 0 ( 20 64) Er Cu yd ( 19 2) he Flo ( 199 0) T ink ( 198 ) P ( 992 (1 11.3. 
APPLICATIONS n e s ne en ge (1 (1 980 ( 99 ) Th ( 199 0) S e C ( 198 7) co ure ( 19 6) Bu rpio - 0 ( 20 97 M ena ns 5 th ( 20 08 ) R ad vis - Win e figur d ( 19 0 ) ad on ta (2 199 92 3) SUsh iohena - socia s of C ehead 0 2 ) h e l h a L a r 0 0) ) E Eric ack - Lo d - K Isla club ange Fr ric Cl - 0 ve arm Bo - De an C ap 2 C In cam a nit c la to o Th Po a in p k n o a la Za ton - A med is C lice lub ver p - lb y p eda a Ru erta - S nn on ing of on M fa r G ith re om Rock Hard Rock ACDC The Cure JMT Eric Clapton Faith Hill Aerosmith Scorpions Radiohead The Cure Usher 0.8 0.6 0.3 0.1 Rap Flo Rida Jim Jones 50 Cents Run DMC Dillinger World / Blues Eric Clapton Buenavista The Cure Madcon s re Fi od e o Th ly g l To rea e M ts e ha ont r a ak W D yc m - T 15 ey a rs s - on th m P ers he k H wi hene umb ou l ot ric 10 e t dy ll Br T - lov oly n S Wa love fin P o ind les in lde lues The b ac M at m 12 Go Let W di Be - I s - 14 lking rick In ers a rB 7) ) Je he en atle s oth l br 99 3 T ue Be tle - W the (1 200 64) ) Q he Bea pton Ano in sou t ( 19 75 ) T he Cla d ( 19 69 ) T ic loy his la r F ( (1199692) EPink o and ( 199 9) uch ( 197 2) P ( 99 (1 ( Pop/Rock ( 198 ( 197 4) Ru n ( 19 6) Easy Hip Hop (12900938) ADillin DMC Spears (2 9 ) li FBritney g r - - It s ( 0 9 J arkeWinehouse 0 (2 200 06) ) Br edi Amy (1 00 5 A itn Eminem Mi a T 1 Co like 99 3 ) E m ey nd our ca Tri e - ine 2) ) J mi y W Mixed S Pi edi ne ine pea cks Bak In M yB nk M m AlihFarka rs - 1 oy Fl ind -Daughtry Cl ouse - Ba 6 Th terey rain oy TQueen e e b e d ric ani - Y y O He o - 1Judas ar k nPries u n 0 s - Ou kn e Mo t of JMT Da Ec 08 t m ow re r lip A y C i m Time knes se SPop Country to los no s et go Dolly Parton rm od of CoCo Lee W Moby or ds The Beatles The Beatles & Rock/Pop (1 ( 964 ( 197 ) T ( 198 5) Q he B (1290070) Ju ueen eatle s ( ) ( 19 97 F das - D - 0 ( 19 74 ) Wainger Prie eath 8 Eigh ( 1 9 ) (1 199989 2) PCar co B Elev st - B on tw t Days 99 8 ) O in l D ro en rea o le gs A Wee 8) ) A 
bi k F oug the - P kin k Al li F tua loy las rs - ara g the i F ar ry d - Ta lyze law K k k ar a ti 02 un e r ka T l d b g Me To To our eat rea Fu ur e - h the figh Th eF e B tin - B ak g ire s ak oy oy ter e ey e e m ith ove L y n M ow W t 1 an ed er ur -1 uW e g o an en ny ks o s he ic Y lip str av Tr o Ec ly he uw yo d - D 0 ne in s in ee - 1 Lo ars ow ing i M L yd n - Te Kn s me ometh ed Co Flo pto n - yin dy win se s ) J Co ink Cla pto - Cr Nobo ese t accu arting 3 a u ) t l es 00 0 ) P ric C ith n - iam yo (2 200 92 2) EEric rosmlapto 03 s efore nna b ( 19 9 ) e C e - - B Wa ( 19 92 ) A ric ur on ( 19 93 E e C pt kson ( 19 92) Th Cla Jac ( 19 0) ric ael ( 198 2) E ick ( 199 ) M ( 982 (1 an ici ag 16 kM ac oV n Bl ctr io p n pt ta Ra -Ele ce ibe al usiz on r T m i M M 2 n - K 06 - 0 6 A - 02 an es - cks - 0 n 3 km tl ri ks io r a e m is ea T ric ct at B ind T lle t ov bar Pl he i M ind Co no Tim 4) ) T ed i M rica - its re u iye 99 4 J d e y e r ) aw (1 196 03 3) J s Amught ka To ( 20 0 as a ar 39 re - N ( 20 B ) D li F n ( 0) 06 A ee a Tou death ) u rk ( il (2109985) Q li Fa ary - t ( 197 8) A bitu ( 99 ) O (1 989 (1 (1 ( 96 ( 197 9) T ( 198 4) he ( 19 7) Car Bea ( 19 64 Ge l D tles ( 19 64 ) T or ou 0 1 ( 19 6 ) T he ge gla Co m ( 1 6 4) h B s M s (1 196995 4) TThe e Be eatle icha Kung e To ge 96 4 ) D he B at s - el F 4) ) T A B eat les 07 - 01 u figh ther ting Th he ng eat les - 02 Ca Fai n t e l h e B t e I l Be ea o - s - 1 03 B Sho Buy tl B M u a tle es row 4 Ev aby ld H e L o s - 1 n er s In av e K ve - 0 1 Su yb g o Bl no 4 E w R ver ar dy s ack nB oc y Tr ett k Lit yin A tl er e g n d T To R hin Be ol lM g M The Beatles (11) Pink Floyd Queen George Michael Waco Brothers Carl Douglas Finger Eleven D’Angelo Pucho JMT & World JMT (5) Ali Farka Toure D’Angelo ic us (2 ( 00 ( 196 3) J ( 196 9) edi ( 19 4) Th Min ( 19 64 Th e B d ( 19 64 ) T e ea Tricks ( 19 6 ) he Be tle (1 200 95 4) TThe Be atles s - 12 - 03 B 99 3 ) D he Be atle - 1 Po loo ) 0 ) Je An 
[Figure 11.12: two polar dendrograms with genre-labelled clusters. Panel (a): Spikes, linear, NW. Panel (b): Tonalities, binary, NW.]

Figure 11.12: Two possible clusterings. Each cluster has been labeled coherently with the genre represented by its objects. Clusters whose objects do not share a similar genre are labelled as Mixed. Big clusters have been labelled according to their subgroups. Finally, the cluster named as Beatles for Sale in (b) owes its name to the presence of a neat group of songs belonging to this album.

CHAPTER 11. HARMONIC TIME SERIES AND POP MUSIC

[Figure 11.13: two polar dendrograms, panels (a) and (b), with genre-labelled clusters (among them Pop rock, Hip hop, Soft pop, Techno, Trip hop and Mixed); scale ticks 0.0, 0.3, 0.5, 0.8.]

Figure 11.13: Interaction between the semiotic segmentation and the harmonic-based sequences. (a) The polar dendrogram representing the hierarchical organisation of the semiotic sequences aligned with the NW algorithm and the semiotic weighting matrix. Clusters are genre-wise labeled.
Mixed clusters correspond to incoherent groupings in terms of genre. (b) Re-organisation of the Pop Rock and Hip Hop clusters of (a) through the alignment given by the combination (Degrees, alternate, NW). The new dissimilarity measure has been computed cluster-wise, enhancing the genre retrieval obtained by the semiotic approach.

11.3. APPLICATIONS

[Figure 11.14: (a) MSA comparison of MAFFT, ClustalW, MUSCLE, ProbCons and T-Coffee. (b) Reference-free MSA evaluation of the same algorithms, in three barplots: SP, TC, APDB, Q-score; IC, MD, iRMSD, NorMD, NiRMSD; APDB-r, MOS.]

Figure 11.14: Reference-free methods are represented in three different barplots according to their order of magnitude.

that the homogeneity of the former and the heterogeneity of the latter both in terms of genres and artists retrieval. In order to reshape these two clusters while maintaining the knowledge provided by the prior semiotic alignment, we compute a further cluster-wise alignment. We considered the sequences of degrees and computed their dissimilarity through the NW algorithm and the alternate weighting matrix. Figure 11.13b shows the new clustering of the two groups of songs: the songs belonging to the Pop Rock cluster occupy the upper part of the dendrogram, while the songs of the Hip Hop cluster are entirely represented in its lower half. As we can see, the songs belonging to the first cluster are joined in a single group at a dissimilarity value of 0.24, while it is necessary to climb up the whole dendrogram to obtain a cluster containing all the songs of the Hip Hop cluster. In the upper half of the polar dendrogram, the songs by The Beatles are clustered with those by Eric Clapton. The small cluster composed of four blues songs by Clapton is clustered with Tell Me Why and Baby's in Black, by Neil Young and The Beatles respectively. Layla is an outlier of this cluster and is the only ballad that has been considered.
By observing the reorganisation of the Hip Hop cluster, we can notice how the hip hop songs are grouped together on the bottom-left of the figure. The only exception is the ACDC song, which is considered an outlier of this cluster. The lower right part of the dendrogram is occupied by rock songs.

11.3.6 Motif mining and molecular clock

The analysis of the clusterings generated by different types of musical information, weighting matrices and alignment algorithms allows us to grasp the contamination affecting artists and genres. However, pairwise alignment cannot provide a broad overview of the structure of the whole sequences. The multiple sequence alignment of the clustering (Tonality, binary, NW) has been computed using five different algorithms. Each result has been evaluated on 11 reference-free quality metrics. In Figure 11.14a the whole set of evaluations is summarised in a single diagram, which highlights MAFFT as the algorithm giving the best results. A more informative representation of the MSA algorithms is given in Figure 11.14b. The best results are obtained using MAFFT, which scores lower than the other algorithms only on the multiple overlap score. Finally, a motif analysis has been performed using MEME, in order to identify significant modulation patterns in the overall structure of the multiple aligned sequences. Figure 11.15 shows the results of these analyses. By construction, sequences belonging to the tonality class represent the modulations occurring during a song and are sensitive to musical transposition. Hence we retrieve the particular tonal and modulation choices recurring in each cluster. At first sight, it is possible to notice that, although in equal temperament all major (and relative minor) tonalities are equivalent, they are not equally distributed among the clusters.
Moreover, it is not surprising that the most recurrent modulations are given by small displacements on the circle of fifths. Popular music is often composed for a voice, and a singer would not be at ease singing a sequence of incoherent altered notes without the link given by a clear modulation, generally starting on a meaningful pivot chord (Piston et al., 1978, Ch. 8). The Rock and Blues cluster contains four recurring motifs, based mainly on the tonalities of A and D, which are one step apart on the circle of fifths (depicted in Figure 11.8). This respects the typical blues structure, which tends to be stable on a single tonality (considering only triads). It is interesting to note how the blues paradigm influenced other compositions: among the artists listed in the Rock and Blues cluster of Figure 11.12b, we can find names like Aerosmith and Pink Floyd but also D'Angelo and CoCo Lee, whose genre is closer to classic pop than that of the other artists. The richest motif of this cluster is the third one, which involves three tonalities, D, A and E, again visualisable as three consecutive points on the circle of fifths. If we look more closely at the following cluster (named Soft Pop), we can see a motif shared more than once by the whole set of songs, involving only one tonality (D). The second highlighted motif of this cluster is more complex, suggesting several possible modulations between four consecutive tonalities on the circle of fifths. Such variability often characterises bridges and modulations which follow the climax of compositions endowed with a more complex structure. It is surprising how this feature is shared by artists like Faith Hill, Mariah Carey and Buckcherry, and finds its origin in songs by The Beatles. Clusters of heterogeneous nature are points of interest of our analyses. The motifs characterising the World-Soft Hip Hop cluster are shared among songs by Ali Farka Toure, Madonna and Jedi Mind Tricks.
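The circle-of-fifths relations invoked in this discussion can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper names `fifths_position` and `fifths_distance` are hypothetical, tonalities are identified with the pitch class of their tonic, and positions are counted clockwise from C.

```python
# Pitch classes of the tonics (C = 0); flats are written with a trailing "b".
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def fifths_position(tonality):
    """Position on the circle of fifths, clockwise from C (C=0, G=1, D=2, ...)."""
    # 7 is its own inverse modulo 12 (7 * 7 = 49 = 1 mod 12), so multiplying
    # a pitch class by 7 converts semitones into steps of a fifth.
    return (PITCH_CLASS[tonality] * 7) % 12


def fifths_distance(first, second):
    """Smallest number of steps separating two tonalities on the circle."""
    gap = (fifths_position(first) - fifths_position(second)) % 12
    return min(gap, 12 - gap)


print(fifths_distance("A", "D"))                       # one step apart
print([fifths_position(t) for t in ["D", "A", "E"]])   # consecutive points
```

Under this convention A and D are indeed one step apart, and D, A, E occupy the consecutive positions 2, 3, 4.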
[Figure 11.15: polar dendrogram surrounded by the radial multiple sequence alignment, with the consensus motifs of each cluster written as letter sequences.]

Figure 11.15: The polar dendrogram constituting the centre of the figure is the clustering obtained considering sequences of the class Tonality, aligned with the NW algorithm and the binary weighting matrix. The radial segments represent the result of the multiple sequence alignment. Recurrent modulation patterns have been highlighted as coloured segments. Finally, the consensus of the most relevant motifs has been depicted for each cluster. For the sake of simplicity, the consensus sequences are composed only of capital and lowercase letters, representing natural and flat tonalities, respectively (the symbol C denotes the tonality of C major, while c denotes the major tonality of C♭).

It is interesting to observe how the cluster labelled as Rock Pop & Folk represents a counterclockwise shift on the circle of fifths with respect to the previous one. The tonalities involved in its common patterns are B♭, F, C, G, D, A. The third motif of this cluster coincides with the first of the previous one.
In addition, a particular modulation motif concerning the tonalities of A and C suggests the use of parsimonious voice leadings (Pearsall, 2012, Ch. 1, p. 10) or secondary dominants (Piston et al., 1978, Ch. 14). These techniques are subtler than a modulation by pivot chord. This approach is homogeneously shared among the songs forming this cluster. The Easy Rock cluster is characterised by a strong tonal stability. It is not surprising that G major/E minor is the most used tonality in a cluster grouping authors such as Eric Clapton and the britpop band Shack. However, even compositions by Justin Timberlake and Radiohead are part of this cluster, which means that they modulate using the same modulation motifs. Considering the Mixed cluster, three main motifs are shared by compositions of Ray Charles, Buenavista Social Club and The Cure. The Dance cluster is of particular interest considering how three long motifs are shared among artists like Gwen Stefani and Cher, but also Jedi Mind Tricks, Eric Clapton and two compositions by The Beatles. Finally, we observe that the second consensus sequence associated with the World/Soft Rock cluster is a half-step transposition of one of the Rock Pop & Folk grouping.

11.4 Discussion and perspectives

In this chapter we considered a collection of symbolic sequences derived from the harmonic analysis of the chord progressions of 138 pop songs. These sequences describe the structure of each song in terms of cadential patterns, modulations and semiotic blocks. The global pairwise alignment of each type of sequence has been computed testing different algorithms and weighting matrices. In particular, we suggested a specific collection of these matrices in order to evaluate the harmonic data retrieved from the audio. The pairings of weighting matrix and distance algorithm have been tested on several applications.
First, the detection of cover tracks, where the global pairwise alignment retrieves the relationship between the original and the cover track and provides a measurement of their dissimilarity. Second, following the classical tasks of MIR, for each clustering of the dataset obtained by choosing a particular type of symbolic sequence, a distance algorithm and a weighting matrix, we computed its 1-NN accuracy, cluster precision and cluster recall in terms of retrieval of genres and artists. As a third application, we showed how the results of the alignment computed using the semiotic sequences can be improved in terms of coherence by adding the harmonic information concerning the cadential patterns of each song. Finally, taking advantage of the analyses conducted in the applications described above, the multiple alignment of the sequences representing the changes of the tonal centres of each song has been computed using five different algorithms. A collection of reference-free methods allowed us to evaluate the results obtained by these algorithms and to choose a particular multiple alignment of the sequences of the dataset. The analysis of recurrent, coherent motifs of the multiple aligned sequences highlights the transfer of musical inspiration among several artists whose compositions do not necessarily belong to the same genre or time. To conclude, we computed the consensus sequences of the most relevant motifs shared cluster-wise by the compositions we analysed. These paradigmatic modulation choices represent a common harmonic strategy used in the compositions contaminated by the same motif. Standard analyses based only on the signal content of the audio neglect this broad high-level viewpoint on the common strategies employed in songs that would appear different if compared using standard descriptors.
Multiple sequence alignment of harmonic-based sequences provides tangible evidence of the transfer of musical inspiration over time.

Chapter 12. Musical Persistence Snapshots

In Part III we used persistent homology to compute a music descriptor based on the deformation of the geometry of the Tonnetz. This construction neglects one of the most important features of music: a composition evolves in time. This evolution allows the composer to introduce a musical idea, then describe it and finally proceed to a new scenario. Would it be possible to refine our analysis by considering several configurations of the Tonnetz in time? What follows is a first attempt to include this time-dependency in the vertical topological analysis of the deformed Tonnetz, in order to compare the evolution in time of two compositions.

12.1 Persistence and time varying systems

Given a continuous function on a topological space, we expressed its geometrical and topological properties in terms of the homological critical values of the function. In our static model a piece of music was represented by a single persistence diagram. Here, the idea is to study how the Tonnetz evolves when its vertices are updated by successive notes.

12.1.1 State of the art

The theory of persistent homology has been generalised to the study of time-varying systems in (Cohen-Steiner et al., 2006). Intuitively, given a time-dependent continuous function f : X × [0, 1] → R, it is possible to represent its evolution in time as a multiset of continuous paths. In Sections 7.2 and 8.1 we described the algorithm to build the filtration of a simplicial complex given a function defined on its vertices. Once the filtration is built and the pairing between critical simplices defined, it is possible to compute the boundary matrix B (see Equation (7.2.1)), and its decomposition B = RU. If the function varies in time, this variation translates into a change of the ordering of the simplices in the matrix B.
The swap of the ith and the (i + 1)th simplices is expressed by the product PBP, where P is the permutation matrix swapping the ith and (i + 1)th rows and columns of B. To update the pairing of critical simplices and the persistence diagram, it suffices to recompute the RU-decomposition of the matrix after the swap. In particular, this update can be achieved in linear time in the number of simplices. This procedure can be interpreted in terms of a continuous function defined on a topological space, by considering the evolution of its homological critical values in time.

[Figure 12.1: successive stages H(x̄, 0) = f(x̄), H(x̄, t1), H(x̄, t2) and H(x̄, 1) = g(x̄) of a deformation.]

Figure 12.1: Homotopy between the functions f, g : X → Y. The values t ∈ [0, 1] can be interpreted as time, thus H(x, t) describes the continuous deformation transforming f into g.

Definition 12.1.1. Let X and Y be two topological spaces and I = [0, 1] ⊂ R. Consider two continuous functions f, g : X → Y. The maps f and g are said to be homotopic, f ≃ g, if there exists a continuous function H : X × I → Y, called a homotopy, such that H(x, 0) = f(x) and H(x, 1) = g(x), for every x ∈ X.

Now, consider two tame functions f, g : X → R. Let H : X × I → R be a homotopy between f and g. Assuming that H(x, t) is tame for every t ∈ [0, 1], we have a persistence diagram of dimension k for every pair (t, k) ∈ I × Z. Let R̄3 be the extended Euclidean space including the points at infinity. The time-dependent collection of k-persistence diagrams is called a k-dimensional vineyard. The 3-dimensional trajectory associated to each cornerpoint is called a vine. The different kinds of paths described by the vines can be divided into three classes. A vine is said to be

1. open if it is represented by a path composed of proper cornerpoints for every t ∈ I;
2. closed if it starts and ends on the diagonal plane;
3. half-open or half-closed if it starts (ends) on the diagonal plane and it ends (starts) at an off-diagonal point.
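The transposition update PBP described at the beginning of this section can be illustrated on a toy filtration. The sketch below is only an illustration under stated assumptions: matrices are plain lists of lists with coefficients modulo 2, indices are 0-based, the helper names are hypothetical, and the RU-decomposition update itself is not implemented.

```python
def transposition(size, i):
    """Permutation matrix exchanging positions i and i + 1 (0-based)."""
    matrix = [[1 if r == c else 0 for c in range(size)] for r in range(size)]
    matrix[i], matrix[i + 1] = matrix[i + 1], matrix[i]
    return matrix


def matmul_mod2(a, b):
    """Product of two square matrices with coefficients modulo 2."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) % 2 for c in range(n)]
            for r in range(n)]


def swap_adjacent(boundary, i):
    """Boundary matrix P B P after swapping the i-th and (i + 1)-th simplices."""
    p = transposition(len(boundary), i)
    return matmul_mod2(matmul_mod2(p, boundary), p)


# Toy filtration ordered as a, b, c, ab, bc: entry (r, c) is 1 when simplex r
# is a face of simplex c.
example_boundary = [
    [0, 0, 0, 1, 0],  # a is a face of ab
    [0, 0, 0, 1, 1],  # b is a face of ab and bc
    [0, 0, 0, 0, 1],  # c is a face of bc
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Swapping the two edges yields the matrix of the ordering a, b, c, bc, ab.
swapped = swap_adjacent(example_boundary, 3)
for row in swapped:
    print(row)
```

Since P is a transposition, it is its own inverse, so applying the same swap twice returns the original matrix.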
In Figure 12.2 two open vines are represented as solid paths, a half-open and a half-closed vine are depicted as dashed trajectories, while the dotted path corresponds to a closed vine. If the homotopy is smooth, the vine is also smooth; the only exceptions are the points at which the pairing of homological critical values changes. These points are called knee points and come in pairs. We refer to (Cohen-Steiner et al., 2006) for the discussion of the possible configurations of vines, which shall not be used in this chapter.

[Figure 12.2: three-dimensional plot with axes t, b and d.]

Figure 12.2: An example of vineyard. The axes represent the time t, the birth level b and the death level d of the k-homology classes of a system evolving in time. Vines are represented as continuous paths.

In general, a vineyard has a complicated geometrical structure. Statistical tools have been introduced in this theory in order to provide a unique mean diagram associated to the vineyard (Mileyko et al., 2011; Munch et al., 2012; Munch, 2013).

12.2 Dissimilarity of persistence time-series

Although vineyards represent a powerful tool for the description of time-varying systems, their interpretation is not intuitive. In addition, there is no general technique to compare vineyards deduced from the evolution of two different topological spaces. Furthermore, the comparison of two mean diagrams cannot describe local changes in the evolution of the space, which can be relevant.

Let $X$ be a topological space, $f : X \times [0, T] \to \mathbb{R}$ a piecewise linear function and $t = \{ t_0, \dots, t_n \}$ a partition of $[0, T] \subset \mathbb{R}$ given by $(n + 1)$ evenly spaced points. A k-persistence diagram $D_{f_i,k}$ is associated to each instant $t_i$. The collection of these k-persistence snapshots is a time series $D_{f,n} = \{ D_{f_i,k} \}_{i=0}^{n} \subset D_\infty$. We name $D_{f,n}$ a k-persistence time series.

12.2.1 Dynamic Time Warping algorithm for persistence time series

Let $t_n = \{ t_1, \dots, t_n \}$ and $t_m = \{ t_1, \dots, t_m \}$ be two evenly spaced partitions of $[0, T_f]$ and $[0, T_g]$, respectively, where $T_f, T_g \in \mathbb{R}$ and $m, n \in \mathbb{N}$. Let

$f : T \times [0, T_f] \to \mathbb{R}$ and $g : T \times [0, T_g] \to \mathbb{R}$

be two functions such that $f_{t_i}$ and $g_{t_j}$ are tame for every $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, m\}$, respectively. Consider the two persistence time series $D_{f,n} = \{ D_{f_i,k} \}_{i=1}^{n}$ and $D_{g,m} = \{ D_{g_j,k} \}_{j=1}^{m}$ associated to the evolution of the two functions.

There exist several methods to evaluate the dissimilarity of two time series in a metric space. The Dynamic Time Warping algorithm (DTW) has already been used in this work in Section 3.7.1 to obtain the dissimilarity scores between pairs of multivariate time series and in Section 11.1.1 to compute the pairwise alignment of symbolic sequences. By definition, the bottleneck distance between k-persistence diagrams $d_{B,k} : D_k \times D_k \to \mathbb{R}$ satisfies the three properties that characterise a cost function for every $k \in \mathbb{Z}$. Let $x, y \in D_k$, then

• $d_{B,k}(x, y) \geq 0$ for every $x, y$ and for every $k \in \mathbb{Z}$;
• $d_{B,k}(x, y) = 0$ if and only if $x = y$;
• $d_{B,k}(x, y) = d_{B,k}(y, x)$ for every $x, y \in D_k$ and for every $k \in \mathbb{Z}$.

Let $D_f$ and $D_g$ be two time series of k-persistence diagrams of length $n, m \in \mathbb{N}$, respectively, and $\alpha$ and $\beta$ two natural numbers such that $1 \leq \alpha \leq n$ and $1 \leq \beta \leq m$. Following the notation introduced in Section 3.7.1, the DTW between two sequences of k-persistence diagrams is given by the computation of the optimal warping path $\gamma^*$:

$DTW(D_f, D_g) = d_{B_{\gamma^*}}(D_f, D_g) = \min \{ d_{B_\gamma}(D_f, D_g) \mid \gamma \text{ is an } (n, m)\text{-warping path} \}.$

In particular, the DTW inherits its symmetry from the bottleneck distance, and the tameness of $f$ and $g$ ensures the bottleneck stability of every diagram $D_{f_i}$ and $D_{g_j}$, with $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, m\}$. Consider the prefix sequences $D_{f,\alpha} = \{ D_{f_i,k} \}_{i=1}^{\alpha}$ and, symmetrically, $D_{g,\beta} = \{ D_{g_j,k} \}_{j=1}^{\beta}$.
For simplicity, the index $k$ specifying the dimension of the persistence diagram shall be omitted when the context is clear. Let

$A(\alpha, \beta) := DTW(D_{f,\alpha}, D_{g,\beta})$

be the entry of the accumulated cost matrix; then $A(n, m) = DTW(D_f, D_g)$.

Theorem 12.2.1. Let $A$ be the accumulated cost matrix. The identities

1. $A(\alpha, 1) = \sum_{l=1}^{\alpha} d_B(D_{f_l}, D_{g_1})$, for $1 \leq \alpha \leq n$;
2. $A(1, \beta) = \sum_{l=1}^{\beta} d_B(D_{f_1}, D_{g_l})$, for $1 \leq \beta \leq m$;
3. $A(\alpha, \beta) = \min \{ A(\alpha - 1, \beta - 1), A(\alpha - 1, \beta), A(\alpha, \beta - 1) \} + d_B(D_{f_\alpha}, D_{g_\beta})$, for $1 < \alpha \leq n$ and $1 < \beta \leq m$

hold. The number of operations required for the computation of $DTW(D_{f,n}, D_{g,m})$ is $O(nm)$.

The theorem holds for general cost functions and is proved in (Senin, 2008). The optimal warping path with respect to the accumulated cost matrix is computed through Algorithm 12.1.

Algorithm 12.1 Optimal warping path.
Input: $A$, the accumulated cost matrix.
Output: $\gamma^* = \{ \gamma^*_1, \dots, \gamma^*_l \}$, the optimal warping path.
  $\gamma^*_l = (n, m)$ and $\gamma^*_1 = (1, 1)$;
  while $l > 1$ do
    writing $(\alpha, \beta) = \gamma^*_l$, set
      $\gamma^*_{l-1} = (1, \beta - 1)$, if $\alpha = 1$;
      $\gamma^*_{l-1} = (\alpha - 1, 1)$, if $\beta = 1$;
      $\gamma^*_{l-1} = \operatorname{argmin} \{ A(\alpha - 1, \beta - 1), A(\alpha - 1, \beta), A(\alpha, \beta - 1) \}$, otherwise;
    and replace $l$ by $l - 1$;
  end while

12.3 Applications

In the following applications we use DTW to compute the dissimilarity between 0- and 1-persistence time series associated to three datasets composed of classical, pop and jazz compositions, respectively. Often, musical phrases are organised according to the metre of the piece: in a jazz context modulations occur every four or eight bars, and in pop music the melodic line of the voice is arranged in a question-and-answer paradigm consisting of cycles of 2 or 4 bars. Thus, reflecting the approach followed in signal analysis, it is reasonable to space observations in time by taking into account the subdivision of each piece into bars. It is then also reasonable to study the properties of these features when the windowing is varied.
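The recursion of Theorem 12.2.1 and the backtracking of Algorithm 12.1 can be sketched in a few lines. This is a minimal illustration rather than the implementation used for the experiments: indices are 0-based, the function names are hypothetical, and the cost function is passed as a parameter. In the toy example the observations are plain numbers compared by absolute difference, which only stands in for the bottleneck distance between persistence diagrams.

```python
def accumulated_cost(series_f, series_g, cost):
    """Accumulated cost matrix A of Theorem 12.2.1 (0-based indices)."""
    n, m = len(series_f), len(series_g)
    acc = [[0.0] * m for _ in range(n)]
    for a in range(n):
        for b in range(m):
            local = cost(series_f[a], series_g[b])
            if a == 0 and b == 0:
                acc[a][b] = local
            elif b == 0:        # first column: cumulative sum (identity 1)
                acc[a][b] = acc[a - 1][b] + local
            elif a == 0:        # first row: cumulative sum (identity 2)
                acc[a][b] = acc[a][b - 1] + local
            else:               # general step (identity 3)
                acc[a][b] = min(acc[a - 1][b - 1], acc[a - 1][b],
                                acc[a][b - 1]) + local
    return acc


def optimal_warping_path(acc):
    """Backtrack from the last entry to (0, 0), as in Algorithm 12.1."""
    a, b = len(acc) - 1, len(acc[0]) - 1
    path = [(a, b)]
    while (a, b) != (0, 0):
        if a == 0:
            b -= 1
        elif b == 0:
            a -= 1
        else:
            # Predecessor with minimal accumulated cost; ties favour the diagonal.
            _, a, b = min((acc[a - 1][b - 1], a - 1, b - 1),
                          (acc[a - 1][b], a - 1, b),
                          (acc[a][b - 1], a, b - 1))
        path.append((a, b))
    path.reverse()
    return path


# Toy observations: numbers stand in for persistence diagrams (assumption).
toy_f = [1, 1, 1, 5]
toy_g = [1, 5]
toy_acc = accumulated_cost(toy_f, toy_g, lambda x, y: abs(x - y))
toy_path = optimal_warping_path(toy_acc)
print(toy_acc[-1][-1], toy_path)
```

In this toy case the first three observations of `toy_f` are all matched with the first observation of `toy_g`, so the path moves along one index before taking a single diagonal step.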
12.3.1 Musical interpretation

Figure 12.3: The first six observations of the 0-persistence time series. Klavierstück I - Schönberg. Persistence snapshots are taken every 8 bars.

First of all, it is necessary to provide an interpretation of the musical features represented by the persistence time series. In Figure 12.3, a sequence of six 0-persistence diagrams computed considering an 8-bar windowing of Klavierstück I is depicted. We recall that the 0th persistence module describes the connectedness of the torus F, when it is rebuilt via the filtration induced by the height function on T. Note that the axes of the persistence diagrams in Figure 12.3 have different limits. Consider the top-left persistence diagram: the two cornerpoints represent the lifespan of two connected components. The first one is a cornerpoint at infinity. It reveals the connected nature of F and represents the subcomplex of minimal height retrieved by the height function. The proper cornerpoint points out the presence of a minimum of the height function, which is disconnected from the first one (recall the musical interpretation provided in Section 8.2). The remainder of the observations describes the changes in terms of death and birth levels of such minima. These cornerpoints correspond to disconnected subcomplexes of the fundamental domain of the Tonnetz. The chromatic and atonal nature of the piece is suggested by the persistence of these minima. Moreover, the steady growth of the birth levels of the points of the whole multiset captures the homogeneous gain of height of the entire simplicial complex T. This means that the whole chromatic scale is uniformly used in the composition, both in terms of pitches and duration of the notes. The relative distances among cornerpoints represent their disparity when represented as subcomplexes of the deformed Tonnetz T.
The variability of their configuration highlights the different preferred directions followed by the piece: disconnected regions of the Tonnetz at different heights represent pitch-class sets that we have listened to in inverse proportion to their birth level in the filtration. In Figure 12.4 the persistence time series associated to the same composition, but composed of 1-persistence diagrams, is shown. We recall that the two cornerpoints at infinity correspond to the two generators of F consisting of major and minor third intervals, respectively. In the first observation the two cornerpoints at infinity are well separated and a third maximum, disconnected from the others, gives rise to a proper cornerpoint. This homology class vanishes at the second observation, suggesting a progressive compression of the whole set of pitch classes. The same idea is highlighted by the progressive reduction of the distance between the two cornerpoints at infinity, which are fused in a single point of multiplicity 2 in the third observation. We are now ready to align two k-persistence time series and compute their dissimilarity and their optimal warping path, using the DTW algorithm.

Figure 12.4: Consecutive observations of a 1-persistence time series. Klavierstück I - Schönberg. Persistence snapshots taken at constant relative time intervals of 8 bars.

12.3.2 Optimal persistence warping path

Given a set of k-persistence time series, the calculation of the pairwise bottleneck distance, which normally represents a computationally hard task, can be performed in a reasonable amount of time, due to the low dimensionality and simple structure of F. The DTW allows us to compute the pairwise optimal warping path between two compositions. Two examples are depicted in Figure 12.5. The images in the first row of the figure describe the optimal warping path between the third movement of the Sonata n. 8 by Mozart and Jeux d'Eau by Ravel, for an 8-bar and a 4-bar windowing, respectively. The second row represents the alignment of two pop songs, namely Genie in a Bottle by Christina Aguilera and Fortress Around Your Heart by Sting. Both pieces have been aligned using an 8-bar and a 4-bar windowing. According to Theorem 12.2.1, the first point of the warping path is assumed to be the (1, 1) entry of the accumulated cost matrix. This assumption forces the alignment of the first w bars of the two pieces, where w ∈ N is the size of the windows we consider. Horizontal and vertical segments of the piecewise linear path drawn along the matrix correspond to the insertion of gaps while aligning two symbolic sequences. Thus, in musical terms, the optimal warping path represents comparable regions of the two compositions represented by similar (near with respect to the bottleneck distance) persistence diagrams. In Figure 12.6 two persistence time series associated to the compositions A and B are represented by piecewise linear paths and their observations are labelled according to a 4-bar windowing. The dashed lines represent the alignment of the compositions described by the optimal warping path. The first twelve measures of A are associated to the first four bars of B. Considering an accumulated cost matrix whose columns represent the observations of A and whose rows represent the observations of B, the warping path would connect A(1, 1) to A(1, 3) and would be a horizontal line segment. Symmetrically, the last eight bars of A are associated to the last four measures of B. In this region the warping path would be represented as a vertical line segment. Following the musical interpretation of the 0-persistence diagrams we gave in Section 12.3.1 and the results discussed in Section 8.2, the optimal warping path suggests local regions of the two pieces in which the minima of the height function evaluated on the two deformed Tonnetze are organised in a similar way. Symmetrically, the optimal warping path between 1-persistence time series highlights regions in which the 1-dimensional holes defined by the sublevel sets of the height function have similar configurations in terms of height (birth level) and connectedness (number and relevance of the cornerpoints), with respect to the structure of the Tonnetz.

[Figure 12.5, panels: (a) Sonata n. 8 - 3, Mozart vs. Jeux d'Eau - Ravel (8 bars windowing). (b) Sonata n. 8 - 3, Mozart vs. Jeux d'Eau - Ravel (4 bars windowing). (c) Genie in a Bottle, Aguilera vs. Fortress Around Your Heart, Sting (8 bars windowing). (d) Genie in a Bottle, Aguilera vs. Fortress Around Your Heart, Sting (4 bars windowing).]

Figure 12.5: Accumulated cost matrices and optimal warping paths between 0-persistence time series.

[Figure 12.6: two sequences of observations, A and B, each labelled 1-4, 5-8, 9-12, 13-16, 17-20, 21-24, 25-28, 28-32.]

Figure 12.6: Dynamic time warping between persistence time-series associated to two compositions A and B. Observations are labelled according to a 4-bars windowing.

12.3.3 Dissimilarity of persistence time series

Let $D_f = \{ D_{f_i,k} \}_{i=1}^{n}$ and $D_g = \{ D_{g_j,k} \}_{j=1}^{m}$ be two persistence time series of k-persistence diagrams, for $k \in \mathbb{Z}$. The computation of $DTW(D_f, D_g) = A(n, m)$ allows us to retrieve the dissimilarity between the two time series. This value provides a measure of the effort needed to produce the elastic transformation of minimal cost described by the optimal warping path, in order to warp one composition into the other. In Figure 12.8 the dissimilarity scores computed by aligning the compositions belonging to the three datasets are depicted.

Table 12.1: Summary of the compositions of the classical music dataset.
Composition (Author): Sonata n. 27 (Beethoven); Arabesque (Debussy); Sonata n. 8 (Mozart); Jeux d'Eau (Ravel); Klavierstück (Schönberg).
Movements (as listed): 1,2,3; 1,2; I,II.

Classical alignments. Observe the first row of Figure 12.8.
The pieces of the dataset are listed in Table 12.1. Proceed by reading the matrix from top to bottom. Both Schönberg's compositions (we will denote them by DK11-1 and DK11-2) gave high dissimilarity scores when aligned with the tonal pieces. The first row of the matrix represents the dissimilarity scores computed by comparing DK11-2 with the other compositions of the dataset. The two minimal scores we retrieved are obtained by comparing DK11-2 with the compositions by Debussy and Ravel. In general, when compared to the rest of the dataset DK11-1 obtains smaller dissimilarity scores; however, they are sufficient to segregate it from the tonal pieces. The corresponding results depicted in the distance matrix on the left do not differ greatly from the ones we just discussed. However, the tonal traces left in DK11-1 are highlighted by the finer windowing we considered.

Table 12.2: Summary of the compositions of the jazz dataset.
Label | Ensemble | Style
Caravan-js | 4 Gtrs, Org, Kora, Bgtr | B.B. arr., no solo
Caravan-md | Tpt, Pf, Bgtr | Rich solos and Tensions
Fly-bz | Bgtr, Vib, Kora, Pf | B.B. arr., chromatic solos
Fly-dc | Flt, Tnr & Bar Sax, F Hn, Org, Gtr | B.B. arr., Manouche guitar
Fly-gw | Big Band | B.B. arr.
How-gr | 2 Obss, 2 Gtrs, Bgtr | Manouche
How-jh | Pf, Bgtr | Chromatic solo
How-mw | Tpt, Gtr, Bgtr, Str, Pf | Embellishments, chromatic

The same consideration holds focusing on the scores realised by Jeux d'Eau. The surprisingly low score generated by its alignment with the second movement of Beethoven's sonata changes when considering a 4-bar windowing. In this case the composition by Ravel is segregated from the others, while the second movement of the Sonata n. 27 obtains a surprisingly low dissimilarity score when it is aligned with DK11-1. The tonal and pentatonic compositions are highlighted as similar in both representations.
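The dissimilarity scores discussed above rest on pairwise bottleneck distances between snapshots. For very small diagrams this distance can be computed by exhaustive search, as in the following sketch. Several assumptions are made here: diagrams are lists of proper cornerpoints (birth, death), cornerpoints at infinity are not handled, a point may be matched to the diagonal at cost (death - birth) / 2, and the names are hypothetical. The search is factorial in the number of points, so it is meant only as an illustration.

```python
from itertools import permutations

DIAGONAL = None  # marker for a slot lying on the diagonal


def pair_cost(p, q):
    """Max-norm cost of matching two cornerpoints, or a point with the diagonal."""
    if p is DIAGONAL and q is DIAGONAL:
        return 0.0
    if p is DIAGONAL:
        return (q[1] - q[0]) / 2.0
    if q is DIAGONAL:
        return (p[1] - p[0]) / 2.0
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def bottleneck_distance(diagram_x, diagram_y):
    """Brute-force bottleneck distance between two small persistence diagrams."""
    # Pad each diagram with enough diagonal slots for the points of the other.
    xs = list(diagram_x) + [DIAGONAL] * len(diagram_y)
    ys = list(diagram_y) + [DIAGONAL] * len(diagram_x)
    best = float("inf")
    for perm in permutations(range(len(ys))):
        # Cost of a matching is its most expensive pair.
        worst = max((pair_cost(xs[i], ys[j]) for i, j in enumerate(perm)),
                    default=0.0)
        best = min(best, worst)
    return best


snapshot_x = [(0, 4)]
snapshot_y = [(1, 4), (2, 2.5)]
print(bottleneck_distance(snapshot_x, snapshot_y))
```

Here the point (0, 4) is matched with (1, 4) at cost 1, while the short-lived point (2, 2.5) is sent to the diagonal at cost 0.25, so the distance is 1. A function of this kind could be passed as the cost of a DTW routine in place of a numerical stand-in.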
Pop alignments
The dissimilarity scores computed on the dataset composed of 2 songs by Christina Aguilera and 3 pieces each by Paul McCartney and Sting confirm the results we obtained from the analysis of the same dataset in Section 8.2. In both diagrams Aguilera's two pieces are well separated from the others. Sting's Fields of Gold and If You Love Somebody Set Them Free turn out to be similar to the pieces by McCartney. This is not the case for Fortress Around Your Heart, which collects high dissimilarity scores when aligned with the other songs of the dataset. It is interesting to note that the two distance matrices are almost invariant with respect to the change of windowing.

Jazz alignments
The classification of jazz standards is a difficult task due to the improvisational nature of this genre. We considered a dataset composed of two versions of Caravan and three versions each of Fly Me to the Moon and How High the Moon. Each interpretation is characterised by different choices in terms of ensemble and arrangement. We summarised these features in Table 12.2 by denoting a big band arrangement (breaks, horn fills, et cetera) as B.B. arr., and by pointing out the presence of solo parts, their main features, and particular stylistic choices. The dissimilarity scores resulting from the global pairwise alignment of the persistence time series associated with these compositions are depicted in the third row of Figure 12.8. In this example, the information retrieved by the alignment is twofold: on one hand, it stresses the mere melodic and harmonic similarity; on the other hand, it retrieves common stylistic choices. In both distance matrices the scores associated with versions of the same composition are reasonably low, highlighting their similarity in the case of Fly Me to the Moon and How High the Moon. An exception is represented by the two versions of Caravan. The presence of rich solos in Caravan-md distinguishes it neatly from the other interpretation of the standard. Note how the optimal warping path between these two pieces, depicted in Figure 12.7, tries to align them on the themes, skipping the solo parts. Hence, the evolution in time of the persistence diagrams grasps the difference between an organised thematic flow and a freer improvisational context. Moreover, we notice that the three versions of Fly Me to the Moon are well separated from the three versions of How High the Moon only when a 4-bar windowing is used. This behaviour is opposite to the one characterising the analysis of the Pop dataset.

Figure 12.7: Optimal warping path between two versions of Caravan. The positions of the gaps correspond to the solo parts of the longer version (frames 25-50 and 51-65 respectively).

12.4 Discussion and perspectives
We presented a method to adapt persistent homology to the time-dependent nature of music. The definition of k-persistence time series has been derived from the model proposed in Part III in its symbolic application. The observations of these time series provide a topological characterisation of music, obtained by considering its natural subdivision in bars (see footnote 1). We gave a musical interpretation of the evolution in time of the persistence diagrams associated with a composition and we used DTW to provide an alignment of persistence time series. Finally, we analysed both the optimal warping path and the alignment score of collections of classical (tonal, modal and atonal) compositions, pop songs endowed with different harmonic complexity and a collection of jazz standards played by different ensembles, with different arrangements and solo parts.

Footnote 1: In a context in which bars are not relevant, it would be possible to provide a segmentation according to different criteria, for instance, by selecting relevant musical events according either to signal descriptors or to symbolic notations.
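Each observation of a 0-persistence time series is the 0-persistence diagram of a height function defined on the vertices of a graph-like structure. The following Python sketch shows one standard way to compute such a diagram, using a union-find structure and the elder rule on a sublevel-set (lower-star) filtration. It is a generic illustration under stated assumptions, not the code used for the experiments: the graph is an arbitrary toy graph rather than the fundamental domain of the Tonnetz, the function name `zero_persistence` is hypothetical, and pairs with zero persistence are discarded.

```python
import math
from typing import Dict, Hashable, List, Tuple


def zero_persistence(heights: Dict[Hashable, float],
                     edges: List[Tuple[Hashable, Hashable]]
                     ) -> List[Tuple[float, float]]:
    """0-dimensional persistence pairs (birth, death) of a height function.

    A vertex enters the filtration at its height, an edge at the larger
    height of its endpoints. When two components merge, the younger one
    (larger birth value) dies: this is the elder rule.
    """
    adjacency = {v: [] for v in heights}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    parent: Dict[Hashable, Hashable] = {}
    birth: Dict[Hashable, float] = {}

    def find(v):
        # Union-find lookup with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    pairs = []
    for v in sorted(heights, key=heights.get):
        parent[v] = v
        birth[v] = heights[v]
        for w in adjacency[v]:
            if w not in parent:      # neighbour not yet in the sublevel set
                continue
            rv, rw = find(v), find(w)
            if rv == rw:
                continue
            young, old = (rv, rw) if birth[rv] > birth[rw] else (rw, rv)
            if birth[young] < heights[v]:
                pairs.append((birth[young], heights[v]))
            parent[young] = old
    # Components that never merge are essential classes (infinite death)
    roots = {find(v) for v in heights}
    pairs.extend((birth[r], math.inf) for r in roots)
    return sorted(pairs)
```

On a path graph a-b-c with heights 0, 2 and 1, the two minima a and c give rise to two components that merge when b enters at height 2; the younger component (born at 1) dies there, and the older one persists indefinitely.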
The computation of the pairwise alignment scores for each dataset revealed that, in a classical music context, the persistence time series classify tonal, modal and atonal compositions. For the pop collection, we highlighted its stability with respect to a change of windowing, and hence the possibility of studying it with a coarser distribution of observations, as well as the retrieval of the peculiar harmonic choices made in Fortress Around Your Heart. The analysis of jazz standards is more complex due to their variability in terms of improvisational styles, the generous use of harmonic substitutions and the highly variable composition of the ensemble. Nevertheless, the tool we proposed is able to retrieve the similarity of different versions of the same standard, as well as to distinguish between the ordered structure of the theme and the more entropic solo parts.

The natural development of this work is to extend it to the analysis of audio. The stability of the persistence diagrams ensures that small variations of the function are represented as small variations of the persistence diagrams forming the persistence time series. A chromagram can also be used to produce dynamic deformations of the Tonnetz, just as the consonance function could be used to describe the variations in terms of tension/resolution roles of the degrees of the chromatic scale in relation to a variable harmonic choice.

The variation of the Tonnetz in time can be used to generate music. The study of free or constrained trajectories of a mass on the time-dependent deformed surface induced by a composition can be used to generate a melody, according to the preferred directions defined by the deformation. Different melodies, computed by considering the pitch class or pitch-class set nearest to the mass, can be classified in terms of periodicity and symmetry. The same ideas can be extended to a system of n masses moving on the surface.
Some interesting starting points can be borrowed from the theory of configuration and reconfiguration spaces (Abrams and Ghrist, 2002; Ghrist and Peterson, 2007).

Persistence time series could be substituted by continuous vineyards, which are suitable for representing the variations induced by piecewise constant and piecewise linear functions. It would be interesting to study the alignment between vineyards by considering the minimal homotopy leading from one to the other. Given the simple structure of the persistence diagrams derived from the Tonnetz and the possibility of providing their musical interpretation, the musical framework we introduced is particularly suitable for this task.

The persistence time series we introduced in this last chapter have full memory with respect to the notes that are played in time. It would be possible to introduce in the model a gravity function, in order to represent the plasticity of the listener's perception. In a pitch-duration-based model this assumption would enhance the representation of repeated musical ideas, while in a consonance-oriented model this function would reflect the decrease of the tensional content of a long-lasting or repeated harmonic/melodic choice.

Figure 12.8: Alignment score of 0-persistence time series for different datasets and variable windowing. Both the colour and the size of the circles associated with each pair of pieces depend on their alignment score. Panels: (a) Classic Music (8 bars windowing); (b) Classic Music (4 bars windowing); (c) Pop Music (4 bars windowing); (d) Pop Music (2 bars windowing); (e) Jazz Music (8 bars windowing); (f) Jazz Music (4 bars windowing). Rows and columns are labelled by the pieces of each dataset. Classical: Beethoven-27m1, Beethoven-27m2, Beethoven-27m3, Debussy-arabesque, Mozart-311-2, Mozart-311-3, Ravel-jeuxdeau, Schoenberg-DK11-1, Schoenberg-DK11-2. Pop: Aguilera-Genie, Aguilera-ITurn, McCartney-Another, McCartney-Band, McCartney-Hi, Sting-Fields, Sting-Fortress, Sting-IfYouLove. Jazz: Caravan-js, Caravan-md, Fly-bz, Fly-dc, Fly-gw, How-gr, How-jh, How-mw.

Part V
Conclusion and future works

Thirteen
Conclusion

The question that inspired the investigations portrayed in this work is simple: why do some melodies create interesting musical soundscapes even when they are simply whistled, while others need an orchestra to be easy to understand? To answer this question, we represented music by using topological and geometrical models, following the twofold interpretation suggested by Kurth and Rothfarb (1991). We tried to keep the dimensionality of our representation as low as possible, in order to guarantee a visual intuition of the entities that we described mathematically. Furthermore, the metric that is naturally defined on the spaces we considered makes it possible to define a distance between musical objects.
We used collections of partial permutation matrices and three-dimensional paths to describe voice leadings and counterpoint. Our model surely needs improvements in order to reflect the vast information contained in a whole contrapuntal composition, and also to describe the numerous techniques that are used by composers to generate interest in the listener. As we stated in the introduction, although some attempts have been made (Birkhoff, 1933) and this remains an open line of research (Juslin and Västfjäll, 2008; Tulipano and Bergomi, 2015; Brattico and Pearce, 2013), it is not yet possible to evaluate the aesthetics of music objectively. Thus, it is not possible to speak of a truly mathematical complexity, unless one considers a sort of understandability/originality dilemma: the action of the composer on a piece, shaping its resolution and tension structure in time in order to accompany or frustrate the expectation of the listener. From this consideration follows the necessity of producing time-dependent models and studying their patterns. We hope that, beyond the limitations of the model we suggest, its formal core and novel outline will give a new perspective on the study and representation of counterpoint.

Second, we encoded part of the information contained in a musical phrase by displacing the vertices of the well-known Euler Tonnetz, in its simplicial complex interpretation. The assumption, in this case, does not concern the concatenation in time of motifs endowed with nuanced complexities, but the hypothesis that a core musical idea should be repeated, although slightly transformed, along a composition. In this model, a piece of music is represented as a three-dimensional shape. Persistent homology, computed on the deformed surfaces derived from the Tonnetz, is able to grasp the main ideas used in a composition and to encode them in a simple representation. The main advantages of this approach are twofold.
On one side, the topological tools we used can deal with any finite number of dimensions and have proved to be effective in the retrieval of different musical properties. On the other side, persistence diagrams are points of a metric space. Hence, it is possible to compare them, even though computing this distance requires a large amount of time (novel algorithms for the computation of this distance are currently being investigated (Di Fabio and Ferri, 2015)).

When questioning the possible representations of music, it is also natural to wonder whether it is describable through absolute models. In this work, we do not provide an ultimate answer to this question. However, we assumed that the perception of music depends on the culture and the background of the listener. According to this observation, we introduced models whose main ingredients are the Tonnetz and the consonance function (Plomp and Levelt, 1965). This coupling of a musicological model and a function deduced from the fitting of experimental data allows us to consider both the symbolic structure of music and the information carried by the signal. Our consonance-based shapes are limited compared to the abstraction level provided by the standard Tonnetz. Nevertheless, they reflect the perceptive nature of music, providing interesting and coherent results.

Finally, the two last models we suggested are a consequence of the exploration of both the horizontal and vertical approaches. They are not to be considered as improvements of the previously investigated strategies. On the contrary, they are endowed with their own independence, and offer a novel viewpoint on the complementarity of low-level and high-level features of music. Moreover, they take into account the dynamical nature of real-life musical applications.
Music proved to be a rich source of inspiration for the development of mathematical tools, providing a suitable framework for the novel time-series approach to the topological characterisation of time-varying systems. If the primary goal of this research was to provide a complete formalisation of the compositional process, we are surely far from giving a broad representation of all its features. However, we hope that this work will represent a new starting point for future research in music analysis, music information retrieval and computational algebraic topology.

Fourteen
Future works

The results portrayed in this work can be divided into three main groups, according to the three directions we explored: the modelling and visualisation of voice leadings, the topological description of music features and, finally, the time-oriented analysis of musical entities. Following this structure, we give a brief summary of the results discussed at the end of each chapter. Then each subsection provides a more detailed overview of their future developments.

14.1 Voice-leading modelling

14.1.1 Voice leadings as partial permutations and geodesics
The formalisation of simultaneous motions of voices provides a handy representation of a voice leading as a partial permutation matrix. The representation of voice leadings as geodesic paths in several spaces makes it simple to understand their representation as a concatenation of geodesics and the different information retrieved by the standard spaces of music analysis. The information carried by each partial permutation matrix has been rewritten as a five-dimensional vector, in order to represent rested voices. Thus, given a contrapuntal composition, it is possible to compute its paradigmatic voice leadings and to represent them as a multiset of 5-dimensional points. Furthermore, the interpretation of the sequence of vectors as observations of a time series allows us to describe the evolution of the voices' motions in time.
Thus, to define a dissimilarity measure between two compositions, it is possible to apply the powerful techniques used for the computation of the distance between time series. Finally, a particular class of partial singular braids is used to visualise the voice leadings between chords represented as pitches and pitch classes.

Test and evaluation
The algorithm for the comparison of time series of complexity vectors has been tested on a small set of compositions. Its accuracy in tasks such as artist retrieval and both stylistic and temporal classification should be evaluated on large datasets. The same holds for the visualisation of the paradigmatic voice leading choices as a multiset of points. This representation should be evaluated on a large collection of compositions by a musicologist, who could provide a meaningful interpretation of the 3-dimensional projections of the point cloud.

Figure 14.1: The partial permutation matrices give a low-dimensional representation of the features of each voice leading. Here, they are used to feed a harmonic conditional restricted Boltzmann machine. The lateral connections in the visible layer are used to retrieve the harmonic structure of chords. Past events are taken into account thanks to the autoregressive connections between the current and past units. [Diagram labels: Visible, Hidden 1, Hidden 2, Past hidden layer, Past frames, Current frame; pitch axis from A3 to B5.]

Higher order phenomena
The complexity vector we introduced takes into account only the features of a single voice leading. This analysis is extremely localised, and it is blind with respect to phenomena occurring in the concatenation of several voice leadings.
For instance, the overlap of two voices (described in Section 1.2) is visible only by considering more than one voice leading at a time. In addition, the behaviour of each voice can be tracked by analysing its evolution in the concatenation of partial permutation matrices. This approach makes it possible to measure the length of the crossings between voices and hence to compute more accurately the overall complexity of a polyphonic composition.

Figure 14.2: Trefoil knot. Identifying the domain and co-domain of a braid $b \in B_n$ produces a closed braid. In particular, any knot can be represented as a closed braid (Alexander, 1923).

A deep learning model for orchestration
The analysis of the voice leading complexity can be used to implement the structure of generative models for automatic orchestration (Crestel, 2015). The deep neural network depicted in Figure 14.1, based on the models described in (Taylor and Hinton, 2009; Osindero and Hinton, 2008) and implemented by Léopold Crestel, aims at the generation of real-time orchestrations of symbolic data (onset, pitch, duration, velocity). The network is trained by comparing piano and orchestral arrangements of a set of compositions. The number of instruments composing an orchestra (15 at the current state of development of the model) gives rise to high-dimensional, sparse representations, which can be simplified by representing voice leadings through partial permutations. Furthermore, we suggested a simplistic extension of the partial permutation model in order to classify contrapuntal compositions of species other than the first. In this context, the discretisation used to compute the complexity of the voice leadings will be improved by the (noteon/noteoff) information provided by the MIDI files.
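As a concrete reminder of the object summarised in this subsection, the following Python sketch builds a partial permutation matrix from two chords given voice by voice, and counts voice crossings as inversions. It is a simplified illustration and not the construction of the thesis verbatim: here rows and columns are indexed by the rank of each voice in the pitch ordering of the source and target chord, rests are encoded as `None` and produce zero rows or columns, and ties (unisons) are broken by voice index. The function names are hypothetical.

```python
from typing import List, Optional

Chord = List[Optional[int]]  # one MIDI pitch per voice, None for a rest


def partial_permutation(source: Chord, target: Chord) -> List[List[int]]:
    """n x n 0/1 matrix with at most one 1 per row and per column.

    Entry (i, j) is 1 when the voice ranked i-th (by pitch) in the source
    chord moves to the j-th ranked position in the target chord.
    """
    if len(source) != len(target):
        raise ValueError("both chords must list the same number of voices")
    n = len(source)

    def ranks(chord: Chord) -> dict:
        # Sounding voices sorted by pitch, ties by voice index, rests last
        order = sorted(
            range(n),
            key=lambda v: (chord[v] is None,
                           chord[v] if chord[v] is not None else 0,
                           v),
        )
        return {voice: rank for rank, voice in enumerate(order)}

    row, col = ranks(source), ranks(target)
    matrix = [[0] * n for _ in range(n)]
    for v in range(n):
        if source[v] is not None and target[v] is not None:
            matrix[row[v]][col[v]] = 1
    return matrix


def count_crossings(matrix: List[List[int]]) -> int:
    """Number of pairs of voices whose pitch order is reversed."""
    ones = [(i, j) for i, r in enumerate(matrix)
            for j, x in enumerate(r) if x == 1]
    return sum(1 for a in range(len(ones))
               for b in range(a + 1, len(ones))
               if ones[a][1] > ones[b][1])
```

For example, with three voices moving from (60, 64, 67) to (65, 62, rest), the two lower voices exchange their relative order and the upper voice is silenced, so the matrix has a zero third row and a single crossing is counted.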
14.1.2 Voice leadings and braids
A concatenation of voice leadings between n-note chords can be represented as a piecewise geodesic path in $\mathbb{R}^n$ (ordered pitches), $T^n$ (ordered pitch classes), or $A^n$ (pitch-class multisets). When n > 3 this path cannot be easily visualised. Braids allow us to depict a voice leading between n-note chords as a collection of n paths in $\mathbb{R}^3$. Furthermore, unisons and rests can be represented by intersections (singularity) and deletions (partiality) of the strands, respectively. In order to univocally associate a voice leading with a collection of three-dimensional paths, we considered the class of piecewise, positive, partial, singular braids.

Voice leading's topological qualities
This visualisation strategy can be transformed into a representation space by taking into account the topological complexity of the braids we defined. We showed how both the motion of each voice and the intervallic leap it covers during the voice leading can be represented as the slope of the strand associated with the voice. The musical relevance of the topological invariants (Birman, 1974) of these voice-leading-oriented braids should be investigated. For instance, this problem can be tackled by considering the link generated by the closure of a braid (Alexander, 1928) (see Figure 14.2 for an example). When it is defined, it is possible to consider the closure of the multiplication (concatenation and removal of discontinuous strands) of the partial braids representing a sequence of voice leadings. In our model, we assumed that all crossings between strands are positive. However, it is possible to define a rule determining the sign of the crossings, for instance by taking into account the slope of the strands. In this way, the information concerning the intervallic leaps of the voices would be inherited by the structure of the link associated with a braid.

A distance between pitch and pitch-class sets
We visualised voice leadings between ordered tuples of pitches as a collection of geodesic paths in $\mathbb{R}^3$. The idea is to consider the length of the strands in order to define a distance between two particular ordered representatives of a set of pitches. In particular, minimal geodesics represent a non-crossing voice leading. The introduction of crossings implies the use of longer line segments to connect the pitches. Note that this kind of distance would be different from a mere count of the crossings, since the length of the segment (or equivalently its slope) describes the relative intervallic motion of each voice. A symmetric argument can be applied to pitch-class sets, considering the helices connecting two copies of $\mathbb{R}/12\mathbb{Z}$.

14.2 Persistent music features
We interpret the Tonnetz as a 2-dimensional simplicial complex. Its embedding in a metric space and its structure, in which pitch classes, consonant intervals and triads are represented as 0-, 1- and 2-simplices respectively, have been used in order to take advantage of a third dimension to add relevant information to its structure.

14.2.1 Pitch classes and durations
As a first application, we use the pitches and durations of a sequence of notes and chords in order to define the height of the pitch classes labelling the vertices of the Tonnetz. After introducing the theory of persistent homology, we used the height function defined on the vertices to induce a filtration of the fundamental domain of the Tonnetz generated by the major and minor third intervals.

Styles classification
In spite of its simplicity, the first approach we introduced, which considers a slice of the deformed Tonnetz in order to determine a preferred extended shape associated with a composition, revealed an interesting behaviour when applied to the description of different styles of classical music.
For instance, the shapes retrieved from the analyses of impressionistic compositions are isomorphic. This result and the simplicity of the model are strong points that can lead to an intuitive representation of different compositional styles. This representation will be tested on large datasets and can be refined by introducing a variable threshold, in order to take into account the circulation of pitch classes and pitch-class sets (Tymoczko, 2011; Bigo, 2013).

14.2.2 A topological music fingerprint
Given a musical phrase, the computation of its persistent homology by considering the homological critical values of the height function generates two relevant persistence diagrams. The 0-persistence diagram describes the lifespan of the connected components of the shape along the filtration. The 1-persistence diagram collects the same information concerning the 1-dimensional holes of the shape. Both these representations have a neat musical interpretation. Furthermore, persistence diagrams are points of a metric space, equipped with the bottleneck distance. This notion allowed us to use the 0- and 1-persistent homology representation of music to provide a quantitative analysis of the distance between compositions. Thus, it has been possible to organise three different datasets (of classical, jazz and pop compositions respectively) according to their topological representation.

Figure 14.3: Visualisation of different compositional styles as sub-level sets of the height function (light grey area). The displacement of vertices is given by the duration and pitch classes of notes and chords. Panels: (a) Time, Hans Zimmer; (b) Sonata n. 8, mov. 1, Mozart; (c) Klavierstück, I, Schönberg.

Didactic: composition
The deformed Tonnetze and the contours depicted by the sub-level sets of the height function highlight the differences between compositional styles. This property can be used to ease the complicated task of teaching (and learning) how to compose a piece of music.
The 3-dimensional deformation of the Tonnetz can be used to visualise common tonal and atonal patterns, as depicted in Figure 14.3. Moreover, it can be a valuable tool for composers and composition students, giving them immediate visual feedback on their musical ideas and letting them compare these ideas with common patterns or educational examples.

Extension to audio analysis
It is possible to extract information concerning both pitch classes and duration of a sequence of notes directly from audio files (Harte and Sandler, 2005). In a chromagram, a magnitude is associated with each pitch class, representing its importance over time. The interest of this extension is twofold: on one side, the stability of the persistence diagrams (computed considering tame functions) makes this representation robust to the small errors introduced by the analysis of the signal. On the other side, the information carried by the signal is richer than the information contained in a MIDI file. The chromagram is influenced not only by the harmonic spectra of melodic and harmonic instruments, but also by the harmonic contribution of percussive instruments. Moreover, auditory masking phenomena (Wegel and Lane, 1924) are non-negligible. Thus, this extension would better approximate our perception of the musical information.

Validation
The results obtained by considering both the 0- and 1-dimensional persistence diagrams associated with shapes generated by deforming the Tonnetz are promising. In a classical music context, tonal, modal and atonal pieces have been represented as separate clusters. Three different versions of a jazz standard have been grouped coherently with respect to their arrangements, and pop songs belonging to different artists have been grouped according both to their genre and compositional style. The 1-persistence diagrams seemed to capture the similarity between melodies, but also to distinguish between the ordered structure of a theme and the more entropic solo parts.
However, it is necessary to test all these features on large datasets. The extension of the model to the analysis of audio will be fundamental to test our model on a heterogeneous and rich collection of compositions.

Application to other music-oriented simplicial complexes
In this work, we considered the Tonnetz for its acoustical and musicological relevance. However, our constructions can be applied to every simplicial complex, graph or point cloud having a musical meaning. Clearly, one difficulty is the choice of the filtration function, as it should reflect the musical persistence properties one wants to consider.

Higher dimensions
In our model, we considered only three dimensions, in order to visualise the deformations of the Tonnetz. However, persistent homology is suitable for the analysis of high-dimensional data, which makes it possible to study features such as the velocity, the position of a note in a bar with respect to a certain metric, and the dynamics of a sequence of notes.

Multidimensional persistence
A further development consists in the extension of the deformed model to multidimensional persistence (Carlsson and Zomorodian, 2009; Cagliari et al., 2010). This theory makes it possible to explore several filtrations of a topological space. In Figure 14.4, a bi-filtration of a triangulation of an ant has been defined considering the position of the simplices with respect to the y-axis and their discrete Gaussian curvature κ. Fixing one of the parameters, we obtain a filtration of the shape and hence its representation in terms of persistence diagrams. Although this multidimensional model allows us to build an accurate fingerprint of the shape, it is computationally hard. We conjecture that it would be possible to decrease the complexity of this problem by considering the information obtained by the analysis of random filtrations. Similar diagrams obtained by varying a single parameter identify the topological and geometrical properties of the shape that are robust to this parameter.
It would be possible to restate the evaluation of the parameters governing the multidimensional filtration as an exploration/exploitation problem: on one side, the (random) exploration of the space of filtrations; on the other side, the exploitation of the knowledge acquired by exploring the space randomly. In the future, it would be interesting to consider this approach, inspired by the theory of reinforcement learning (Watkins and Dayan, 1992).

Figure 14.4: Multidimensional persistence. A 2-dimensional filtration, whose parameters are the discrete Gaussian curvature κ and the height function y. Persistent homology can be applied on each filtration obtained by fixing one of the two parameters.

14.2.3 Audio feature deformed Tonnetz
In a second application, we modified both the labelling function associated with the vertices of the Tonnetz and the function used to deform its geometry. The former is restricted to the pitches corresponding to the chromatic scale built on a single octave; the latter computes the consonance (Plomp and Levelt, 1965) between a reference pitch and the pitches of the chromatic scale used to label the vertices. The invariance of the consonance function in terms of uniform transposition of the chromatic scale and the reference notes is discussed, as well as its octave dependence. The space generated by deforming the Tonnetz with the consonance-based function is used to classify the 21 modal scales derived from the diatonic, melodic and harmonic minor scales. After fixing a reference pitch and a labelling of the vertices, we represented these scales as subsets of the 0-skeleton of the deformed Tonnetz. Thereafter, we consider the 0-persistent homology of each point cloud generated by its filtered Vietoris-Rips complex.
The resulting consonance-based persistent fingerprint of modes recognises different sonorities, organising them in coherent clusters according to the disposition of their tension and resolution pitches on the Tonnetz. Moreover, the changes in the clustering given by the uniform transposition of the chromatic scale by an octave with respect to the reference pitch reflect the properties of the chords (first, third, fifth and seventh degrees) associated with the modal scales.

Figure 14.5: Configuration of the tensions (circles) and resolutions (squares) on the consonance-deformed Tonnetz obtained by considering block voicings of a major chord and the chromatic scale built an octave higher than the root note of the triad.

Melodic phrasing analysis
The space that we proposed is limited, since its geometry varies under uniform octave transposition of the chromatic scale used to label the vertices of the Tonnetz. Nevertheless, its topological properties are sensitive to the different sonorities we examined. Thus, a natural development of this work is the analysis of point clouds generated by other scales and, then, by more complex musical phrases. As we suggested for the model discussed above, it could be possible to describe additional features (for instance the occurrences of pitches) by embedding the Tonnetz in a higher-dimensional space.

Learning the melodic phrasing
Every music teacher has experienced how the concepts of mode and sonority, or the use of the right tension at the right moment, can be awkward for a beginner or even an experienced musician. We are surely far from explaining these concepts in a clear and unifying way. The geometric representation of consonance we suggested inherits the cultural and temporal dependence of the curve of Plomp and Levelt.
Nevertheless, the visualisation of the consonance-deformed Tonnetz and the configurations of its vertices in terms of tension and resolution patterns could be used in a didactic context to explain the aforementioned ideas. In Figure 14.5, the pitches (of the chromatic scale built an octave higher than the root of the chord) are highlighted to represent typical tensions and resolutions on a major triad that produce a bluesy sonority. Looking at this representation, one can build new patterns that are not evident when the scales are visualised directly on an instrument. In addition, pitches connected by an edge on the Tonnetz, or forming the vertices of a triangle, are acoustically related. For instance, in the figure the tritone substitution (G♭ major for C major) is represented as a triangle of tensions, juxtaposed to a triangle of resolutions (A minor triad). This representation gives an insight into harmonic substitutions, which are often regarded as complex rules rather than as valuable compositional or improvisational techniques.

14.2.4 Harmonic variable geometry

An extension of the consonance function to chords has been suggested and used to compute the deformation of the vertices of the Tonnetz labelled with the pitches of a chromatic scale. The height of each vertex is computed by considering the overall consonance of the superposition of its label and a fixed triad. In this setting, we compared the surfaces generated by six classes of triads. In particular, the values of the discrete Gaussian curvature have an interesting musical interpretation in terms of harmonic classification. We classified the shapes obtained by deforming the Tonnetz with the six classes of triads by computing their persistent homology. We compared the three different filtrations induced by the sub-level sets of the consonance function, the Vietoris-Rips complex and the discrete Gaussian curvature.
The clustering of the shapes produced by the distance between their 0-persistence diagrams gave interesting preliminary results that reflect different harmonic properties of the triads we considered.

Chords and voicing classification
The model should be tested on seventh chords. This analysis would provide an interesting view of the block voicing technique, commonly used in jazz arrangements. Furthermore, we proved that in equal temperament the inversions of a chord are classified by their consonance value, modulo changes of the harmonic spectrum used to compute the consonance of the voicings. The study of several families of chords and their inversions, in terms of geometrical configurations (shapes) of the deformed Tonnetz, can lead to a classification of chords that is parametrisable in terms of harmonic spectra and voicing classes.

Configuration and generation
The variable geometry characterising both the modal and the harmonic-oriented deformed Tonnetz can be used for the automatic generation of melodic lines. Figure 14.6 depicts the trajectories of several particles moving under gravity on the surface of a deformed Tonnetz. Each trajectory depends on the mass and the initial condition (height, position with respect to the plane z = 0 and initial speed) of each particle. Retrieving the nearest pitch along the trajectory of each particle in time makes it possible to define a melody (in terms of pitches and durations). Furthermore, it could be interesting to study the configurations of more than one particle (Abrams and Ghrist, 2002; Ghrist and Peterson, 2007) by modelling the superpositions (and collisions) of several melodies on a given harmonic structure. It is natural to extend this model to a dynamical framework: we can consider these trajectories while the geometry of the space updates according to a progression of bass or harmonic changes.
On a given accompaniment, it would also be interesting to study the periodicity of the trajectories and to introduce symmetries in order to influence the paths of the particles.

Figure 14.6: Gravity on the deformed Tonnetz. Masses move following the deformation of the surface. The pitches or pitch classes lying in a neighbourhood of the trajectories can be used to generate melodic lines.

14.3 Harmonic and persistence time series

The horizontal representation did not take into account the intervallic relationships between the notes constituting the domain and codomain of the voice leading. The vertical analysis neglected the ordering of notes and chords in time. In the last two applications, we recovered this ordering. On one hand, we considered time series whose observations are computed from a harmonic analysis of a composition. On the other hand, persistence time series represent the geometrical properties of the deformation of the Tonnetz at evenly spaced instants.

14.3.1 Multiple (harmonic) sequences alignment

We suggested a novel method, based on multiple sequences alignment, for the analysis of pop music. Although global pairwise alignment has already been applied to music (Pardo and Sanghi, 2005; İzmirli and Dannenberg, 2010; Martin et al., 2012), our approach consists in exploiting both the signal and the symbolic information, and in defining and testing a set of ad hoc tools for the alignment of harmonic-oriented symbolic sequences. Furthermore, multiple sequences alignment allowed us to give an encompassing analysis of the propagation of musical inspiration among different artists, genres and periods.

Abstract descriptors
Historically, the music analysis community has been divided between the symbolic (musical notation, discrete nature) and the signal-oriented (continuous flux of information) research paradigms.
In this signal/symbol application, the analysis of the audio allowed us to extract musical descriptors, which the symbolic approach endowed with a higher level of abstraction. In our analyses, these high-level music features (such as modulations or cadential patterns) reveal surprising properties of music that are normally neglected by the standard notion of musical genre. We suggested in the introduction of this work that a composition is built on a collection of core concepts: strong musical ideas that are clearly grasped by the listeners and that allow musicians to build their own vocabulary. The high-level descriptors make it possible to represent the propagation of these musical concepts, retrieving them in compositions that do not belong to the same genre.

Figure 14.7: Dendrogram chasing. For each branch of the dendrogram it is possible to build a consensus sequence that describes the similarity between the sequences of the cluster once they have been aligned.

Music molecular clock
The Molecular Clock Hypothesis (MCH) consists in the reconstruction of the evolutionary history of species by measuring the variations of a particular structure (for instance haemoglobin) that occur at an almost constant rate in time. It would be pretentious to translate this assumption literally to music. However, the construction we proposed makes it possible to follow the evolution of a particular motif by chasing it through the dendrogram depicted in Figure 14.7. The cluster in this figure has been obtained by computing the pairwise alignment of sequences of degrees (cadential patterns) among the 138 Quaero songs we analysed, utilising the weighting matrix deduced from the spiral array and the Needleman-Wunsch (NW) algorithm. Red points correspond to the consensus sequences or, in MCH terms, to the common ancestors of the songs represented in the cluster.
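The global pairwise alignment of degree sequences referred to above can be illustrated with a compact version of the NW dynamic programme. This is only a sketch under stated assumptions: the scoring function `toy_score` (match +1, mismatch -1) and the gap penalty are hypothetical placeholders, and they do not reproduce the weighting matrix deduced from the spiral array that was used in our experiments.

```python
def needleman_wunsch(a, b, score, gap=-1.0):
    """Global alignment of two sequences.

    Returns (alignment score, aligned copy of a, aligned copy of b),
    where gaps are written as "-".
    """
    n, m = len(a), len(b)
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Borders: aligning a prefix against the empty sequence costs only gaps.
    for i in range(1, n + 1):
        table[i][0] = table[i - 1][0] + gap
    for j in range(1, m + 1):
        table[0][j] = table[0][j - 1] + gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # (mis)match
                table[i - 1][j] + gap,                            # gap in b
                table[i][j - 1] + gap,                            # gap in a
            )

    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and table[i][j] == table[i - 1][j - 1] + score(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or table[i][j] == table[i - 1][j] + gap):
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return table[n][m], out_a[::-1], out_b[::-1]


def toy_score(x, y):
    # Hypothetical scoring: +1 for identical degrees, -1 otherwise.
    return 1.0 if x == y else -1.0


total, al_a, al_b = needleman_wunsch(["I", "IV", "V", "I"], ["I", "V", "I"], toy_score)
print(total, al_a, al_b)
```

Replacing `toy_score` by a lookup in a musically informed weighting matrix changes which cadential patterns are considered close, while the dynamic programme itself stays the same.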
Considering these points from the right to the left of the figure, the first consensus sequence corresponds to the synthesis of cadential solutions between the very similar Baby's in Black and Polythene Pam. Climbing towards the centre of the dendrogram, the consensus sequences correspond to broader syntheses of the cadential patterns used in the whole set of multiply aligned songs. The information carried by these common ancestors can be relevant both in musicological studies of particular artists, genres or periods and for music recommendation.

Computer-assisted composition
The results in terms of pairwise and multiple alignment, motif mining and the computation of the consensus sequences can be used for computer-assisted composition. Indeed, pairwise alignment provides a local description of the similarity regions of two compositions and induces a grouping of a set of songs. Multiple alignment makes it possible to chase similarity patterns in the diagram generated by the simultaneous alignment of a set of songs and highlighted by the analysis of motifs. Finally, the consensus sequences provide a candidate capable of summarising in its structure the common features of a set of previously aligned sequences. In musical terms and in our application, consensus sequences depict a possible synthesis of the coherent harmonic solutions provided by different artists at different times.

Figure 14.8: Static classification of three of Chet Baker's themes and improvisations and a version of Blue Bossa. Two solos by the same author are grouped together, while the bass solo of Blue Bossa is linked to the theme of Summertime at a high distance.

14.3.2 Persistence time series

The last model we proposed consists in the representation of the deformation of the Tonnetz in time, as a time series of persistence diagrams.
We provided a musical interpretation of the sequences of diagrams and utilised the dynamic time warping algorithm, with the bottleneck distance as cost function, to compute the optimal warping path between two compositions. This computation provides both an alignment between the two compositions and a dissimilarity score. We analysed three datasets collecting classical, pop and jazz compositions, by considering time series of evenly spaced observations at 8, 4 and 2 bars, respectively. The dissimilarity scores, and the analysis of their variations with respect to the windowing, provide a good stylistic descriptor of music, able to distinguish among tonal, atonal and modal classical compositions and sensitive to the use of different tensional paradigms in pop music.

Improviser retrieval
The jazz dataset represents a point of interest due to the improvisational nature of the compositions. In this case, the information provided by the optimal warping path makes it possible to distinguish between themes and solo parts in two versions of the same standard. This method can be used to measure the distance between improvisational styles. This particular development is also suggested by the preliminary result we obtained by considering the static deformed Tonnetz, depicted in Figure 14.8.

Musical concepts: granularity and propagation
In order to interpret the evolution of a system in time as a collection of observations, we defined a windowing, consisting in an even partition of the composition according to its subdivision in bars. We saw how the three datasets we considered respond differently to changes of this windowing. In musical terms, this corresponds to the pace at which musical concepts evolve in the composition. It would be interesting to compute the optimal time granularity necessary to describe the evolution of compositions belonging to different genres or artists.
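The comparison of two persistence time series by dynamic time warping, described at the beginning of this section, can be sketched as follows. The sketch is generic in the cost function: in our setting the observations would be persistence diagrams and `cost` would be a bottleneck-distance routine, which is not implemented here. For illustration only, the usage example replaces diagrams with plain numbers and uses the absolute difference as a stand-in cost; the names `dtw` and `abs_cost` are hypothetical.

```python
def dtw(series_a, series_b, cost):
    """Dynamic time warping between two non-empty sequences of observations.

    `cost(x, y)` is any non-negative dissimilarity between observations.
    Returns (total cost of the optimal warping path, the path as index pairs).
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0

    # Accumulated cost: each cell extends the cheapest of its three predecessors.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(series_a[i - 1], series_b[j - 1])
            acc[i][j] = step + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the last pair of observations to the first one.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    return acc[n][m], path[::-1]


def abs_cost(x, y):
    # Stand-in for the bottleneck distance between two persistence diagrams.
    return abs(x - y)


total, path = dtw([0, 1, 2], [0, 1, 1, 2], abs_cost)
print(total, path)
```

Horizontal or vertical runs in the returned path indicate that one window of a composition is matched to several windows of the other, which is the information used above to separate themes from solo parts.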
Mean persistence diagrams
The computation of the pairwise bottleneck distances between the observations of two persistence time series has a high computational cost. A possible development of our model consists in evaluating the strategy introduced in (Munch, 2013), in which a unique average persistence diagram is associated with the vineyard of a time-varying system. In terms of classification, it would be interesting to compare the results obtained through our primary static analysis with the corresponding mean persistence diagrams. In addition, their differences could be relevant in order to compute the optimal granularity mentioned above, or at least to bound its value.

Persistence time series analysis
The strategy we proposed in order to compare the evolution of the persistence properties of two time-varying spaces is a new approach. Its efficiency can be tested in many applications, such as animal tracking (Pérez-Escudero et al., 2014) and group behaviour (Topaz et al., 2015).

Memory
The dynamic Tonnetz has full memory of the composition: the heights of its vertices increase monotonically. This feature does not reflect our perception of music, since we cannot remember every note of a whole composition. The definition of a gravity function acting in opposition to the one generating the deformation of the vertices can be used to endow this space with a type of short-term memory. The same argument can be applied to the study of dynamical consonance-based deformed Tonnetze. In addition, we can define a variable gravitational field (or equip the vertices with variable masses) in order to diminish the effect of gravity at vertices representing relevant elements of a musical phrase (for example its highest, lowest, first, ending and syncopated notes). We refer interested readers to (Perricone, 2000).

Part VI
Appendices

A Modes in Modern Music and a Topological viewpoint

The aim of this chapter is twofold.
On one hand, it provides both the basic references concerning modal theory and the definition of mode in a modern music context (Appendix A.1). On the other hand, it gives an intuitive, topological-oriented point of view on the use of modes in composition. A mode is interpreted as a superposition of a four-note chord and a triad. This construction follows naturally when modes are deduced from the harmonisation of a scale: the four-note chord (named the base-chord in the remainder of this chapter) represents the set of resolutions of the modal scale it defines, while a triad of tensions is formed by the second, fourth and sixth degrees of the modal scale. Finally, we associate with each class of base-chord an oriented planar connected graph. This association allows us both to define a new family of modes and to provide a topological, qualitative description of seven classes of four-note chords.

Remark 17. What follows is a shortened, modified version¹ of (Bergomi and Portaluri, 2013).

A.1 Standard modes as superposition of chords

From a music theory viewpoint, modes are defined as seven-note scales created by starting on any of the seven notes of a major or a melodic minor scale (Levine, 2011). In the following paragraphs, we apply this definition to deduce a family of 21 modal scales and discuss their hidden harmonic nature.

A.1.1 Deducing the standard modes

Following the definition given in (Levine, 2011), we assume that a mode is a heptatonic scale whose structure is inherited from a tonal scale. In Table A.1 we list the 21 modes built by considering the degrees of the diatonic, melodic minor and harmonic minor scales. We include the harmonic minor scale, since its modes are widely used in several musical contexts. See (Bergomi and Geravini, 2012) for details and examples.
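The derivation of the 21 modes, and the decomposition of each mode into a base-chord and a triad of tensions described at the beginning of this appendix, can be reproduced with a few lines of code. The sketch below is purely illustrative: it works on note names, takes the C major, C melodic minor and C harmonic minor scales as parent scales, and uses the hypothetical helper names `modes` and `split_mode`.

```python
# Parent scales, built on C.
SCALES = {
    "major": ["C", "D", "E", "F", "G", "A", "B"],
    "melodic minor": ["C", "D", "E♭", "F", "G", "A", "B"],
    "harmonic minor": ["C", "D", "E♭", "F", "G", "A♭", "B"],
}


def modes(scale):
    """All rotations of a heptatonic scale: the k-th mode starts on degree k+1."""
    return [scale[k:] + scale[:k] for k in range(len(scale))]


def split_mode(mode):
    """Split a modal scale into its base-chord and its triad of tensions."""
    base_chord = mode[0::2]     # degrees I, III, V, VII
    tension_triad = mode[1::2]  # degrees II, IV, VI
    return base_chord, tension_triad


all_modes = [mode for scale in SCALES.values() for mode in modes(scale)]
print(len(all_modes))  # 7 modes for each of the 3 parent scales

f_lydian = modes(SCALES["major"])[3]  # mode built on the fourth degree
print(f_lydian, split_mode(f_lydian))
```

For the mode built on the fourth degree of C major, the split returns the notes of an F major seventh chord and of a G major triad.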
A comparison between Tables A.1 and A.2 reveals that the idea of mode is deeply related to a harmonic choice: a modal scale is recognisable if it is played together with either a reference pitch (its root) or a chord.

¹ The full text is available at http://arxiv.org/abs/1309.0687.

Scale | Degree | Mode | Example
Major | I | Ionian | C - D - E - F - G - A - B
Major | II | Dorian | D - E - F - G - A - B - C
Major | III | Phrygian | E - F - G - A - B - C - D
Major | IV | Lydian | F - G - A - B - C - D - E
Major | V | Mixolydian | G - A - B - C - D - E - F
Major | VI | Eolian | A - B - C - D - E - F - G
Major | VII | Locrian | B - C - D - E - F - G - A
Melodic minor | I | Hypoionian | C - D - E♭ - F - G - A - B
Melodic minor | II | Dorian ♭2 | D - E♭ - F - G - A - B - C
Melodic minor | III | Lydian augmented | E♭ - F - G - A - B - C - D
Melodic minor | IV | Lydian dominant | F - G - A - B - C - D - E♭
Melodic minor | V | Mixolydian ♭13 | G - A - B - C - D - E♭ - F
Melodic minor | VI | Locrian ♯2 | A - B - C - D - E♭ - F - G
Melodic minor | VII | Super locrian | B - C - D - E♭ - F - G - A
Harmonic minor | I | Hypoionian ♭6 | C - D - E♭ - F - G - A♭ - B
Harmonic minor | II | Locrian ♯6 | D - E♭ - F - G - A♭ - B - C
Harmonic minor | III | Ionian augmented | E♭ - F - G - A♭ - B - C - D
Harmonic minor | IV | Dorian ♯4 | F - G - A♭ - B - C - D - E♭
Harmonic minor | V | Phrygian dominant | G - A♭ - B - C - D - E♭ - F
Harmonic minor | VI | Lydian ♯2 | A♭ - B - C - D - E♭ - F - G
Harmonic minor | VII | Ultra locrian | B - C - D - E♭ - F - G - A♭

Table A.1: The 21 modes derived from the major, melodic minor and harmonic minor scales. Examples have been built on the C major, C melodic minor and C harmonic minor scales, respectively.

A.1.2 Modes as superposition of chords

In this paragraph we want to stress the importance of the harmonic choice lying behind the modal scale. Consider both the seventh chord and the modal scale built on the same degree of a tonal scale. It is easy to see that the pitch classes composing the chord are the first, third, fifth and seventh degrees of the modal scale. In a modern music context, it is possible to refer to these pitch classes as resolutions.

Example A.1.1. Consider F lydian.
The chord associated with this mode is Fmaj7, so we have

F lydian scale | F - G - A - B - C - D - E
Fmaj7 arpeggio | F - A - C - E

Given a mode, we refer to the seventh chord built on its root as the base-chord of the mode. Accordingly, we call tension-triad the triad composed of the second, fourth and sixth degrees of the modal scale, obtained by deleting the pitch classes of the base-chord.

F lydian scale | F - G - A - B - C - D - E
Base-chord arpeggio | F - A - C - E
Tension-triad arpeggio | G - B - D

Hence, the F lydian mode is given by the superposition of an Fmaj7 chord and a G major triad.

Scale | Degree | 7th chord | Arpeggio (example)
Major | I | maj7 | C - E - G - B
Major | II | −7 | D - F - A - C
Major | III | −7 | E - G - B - D
Major | IV | maj7 | F - A - C - E
Major | V | 7 | G - B - D - F
Major | VI | −7 | A - C - E - G
Major | VII | −7♭5 | B - D - F - A
Melodic minor | I | −maj7 | C - E♭ - G - B
Melodic minor | II | −7 | D - F - A - C
Melodic minor | III | maj7♯5 | E♭ - G - B - D
Melodic minor | IV | 7 | F - A - C - E♭
Melodic minor | V | 7 | G - B - D - F
Melodic minor | VI | −7♭5 | A - C - E♭ - G
Melodic minor | VII | −7♭5 | B - D - F - A
Harmonic minor | I | −maj7 | C - E♭ - G - B
Harmonic minor | II | −7♭5 | D - F - A♭ - C
Harmonic minor | III | maj7♯5 | E♭ - G - B - D
Harmonic minor | IV | −7 | F - A♭ - C - E♭
Harmonic minor | V | 7 | G - B - D - F
Harmonic minor | VI | maj7 | A♭ - C - E♭ - G
Harmonic minor | VII | ◦7 | B - D - F - A♭

Table A.2: Seventh chord harmonisation of the major, melodic minor and harmonic minor scales. Refer to Appendix E for details on modern chord notation.

Every modal scale can be decomposed uniquely into a seventh chord built on its root and a triad built on its second degree. Often, a modal scale is played on its base-chord, so one can consider the base-chord as the set of stable notes and the tension-triad as the collection of tension notes of the mode². See Table A.3 for a complete description of modes in terms of base-chords and tension-triads. This decomposition arises naturally from the musical-oriented definition of modes of (Levine, 2011). It is possible to associate a family of modes with each class of base-chord [B] by varying the tension-triad [T]. In Table A.4, modes are organised according to the class of their base-chord.
For instance, let [B] be a major seven chord; then it is possible to associate with [B] three different classes of tension-triads $[T_i]$. If we choose C as the root of a maj7 chord, the three modes associated with Cmaj7 are

1. Ionian: i := (Cmaj7, D−);
2. Lydian: l := (Cmaj7, D);
3. Lydian ♯2: l♯2 := (Cmaj7, D♯−♭5).

² The term mode is here intended as a not necessarily ordered modal scale, played on its base-chord, or at least with its root as accompaniment.

Mode | Scale | Base-chord | Tension-triad
C Ionian | C - D - E - F - G - A - B | Cmaj7 | D−
D Dorian | D - E - F - G - A - B - C | D−7 | E−
E Phrygian | E - F - G - A - B - C - D | E−7 | F
F Lydian | F - G - A - B - C - D - E | Fmaj7 | G
G Mixolydian | G - A - B - C - D - E - F | G7 | A−
A Eolian | A - B - C - D - E - F - G | A−7 | B−♭5
B Locrian | B - C - D - E - F - G - A | B−7♭5 | C
C Hypoionian | C - D - E♭ - F - G - A - B | C−maj7 | D−
D Dorian ♭2 | D - E♭ - F - G - A - B - C | D−7 | E♭♯5
E♭ Lydian augmented | E♭ - F - G - A - B - C - D | E♭maj7♯5 | F
F Lydian dominant | F - G - A - B - C - D - E♭ | F7 | G
G Mixolydian ♭13 | G - A - B - C - D - E♭ - F | G7 | A−♭5
A Locrian ♯2 | A - B - C - D - E♭ - F - G | A−7♭5 | B−♭5
B Super locrian | B - C - D - E♭ - F - G - A | B−7♭5 | C−
C Hypoionian ♭6 | C - D - E♭ - F - G - A♭ - B | C−maj7 | D−♭5
D Locrian ♯6 | D - E♭ - F - G - A♭ - B - C | D−7♭5 | E♭♯5
E♭ Ionian augmented | E♭ - F - G - A♭ - B - C - D | E♭maj7♯5 | F−
F Dorian ♯4 | F - G - A♭ - B - C - D - E♭ | F−7 | G
G Phrygian dominant | G - A♭ - B - C - D - E♭ - F | G7 | A♭
A♭ Lydian ♯2 | A♭ - B - C - D - E♭ - F - G | A♭maj7 | B−♭5
B Ultra locrian | B - C - D - E♭ - F - G - A♭ | B◦7 | C−

Table A.3: Modes as a superposition of two chords.

Every couple ([B], [T]) can be associated uniquely with a modal scale. The scale is given by the set of notes $\{b_1, b_2, b_3, b_4, t_1, t_2, t_3\}$, where $B = \{b_1, \dots, b_4\}$ and $T = \{t_1, t_2, t_3\}$. Following (Piston et al., 1978, Chapter 10) for the analysis of non-harmonic tones in classical harmony, we are entitled to give the following definition.

Definition A.1.1. Let $B = \{b_1, \dots, b_4\}$ and $T = \{t_1, t_2, t_3\}$.
Non-chord tones are pitch classes of the modal scale that do not belong to the base-chord; i.e. $t_i \in T$ such that $t_i \notin B$.

Let $m = ([B], [T])$, then
1. $m$ identifies a unique mode;
2. chord tones and non-chord tones are split into two components, [B] and [T] respectively;
3. considering the notes belonging to [B] and [T], we deduce the modal scale associated with $m$, which can be re-ordered as a 7-tuple in which the degrees of the scale are displayed from the root to the seventh note.

Thus, it is possible to associate with a fixed base-chord an ordered modal scale for every available choice of tension-triad.

Base-chord maj7 (T - M III - P V - M VII)
Mode | Example (root C)
Ionian | C - D - E - F - G - A - B
Lydian | C - D - E - F♯ - G - A - B
Lydian ♯2 | C - D♯ - E - F♯ - G - A - B

Base-chord maj7♯5 (T - M III - aug V - M VII)
Mode | Example (root C)
Lydian augmented | C - D - E - F♯ - G♯ - A - B
Ionian augmented | C - D - E - F - G♯ - A - B

Base-chord 7 (T - M III - P V - m VII)
Mode | Example (root C)
Mixolydian | C - D - E - F - G - A - B♭
Mixolydian ♭13 | C - D - E - F - G - A♭ - B♭
Phrygian dominant | C - D♭ - E - F - G - A♭ - B♭
Lydian dominant | C - D - E - F♯ - G - A - B♭

Base-chord −7 (T - m III - P V - m VII)
Mode | Example (root C)
Dorian | C - D - E♭ - F - G - A - B♭
Phrygian | C - D♭ - E♭ - F - G - A♭ - B♭
Eolian | C - D - E♭ - F - G - A♭ - B♭
Dorian ♭2 | C - D♭ - E♭ - F - G - A - B♭
Dorian ♯4 | C - D - E♭ - F♯ - G - A - B♭

Base-chord −7♭5 (T - m III - dim V - m VII)
Mode | Example (root C)
Locrian | C - D♭ - E♭ - F - G♭ - A♭ - B♭
Locrian ♯2 | C - D - E♭ - F - G♭ - A♭ - B♭
Superlocrian | C - D♭ - E♭ - F♭ - G♭ - A♭ - B♭
Locrian ♯6 | C - D♭ - E♭ - F - G♭ - A - B♭

Base-chord −maj7 (T - m III - P V - M VII)
Mode | Example (root C)
Hypoionian | C - D - E♭ - F - G - A - B
Hypoionian ♭6 | C - D - E♭ - F - G - A♭ - B

Base-chord ◦7 (T - m III - dim V - dim VII)
Mode | Example (root C)
Ultralocrian | C - D♭ - E♭ - F♭ - G♭ - A♭ - B♭♭

Table A.4: Modal scales associated with a fixed base-chord.

Example A.1.2.
Fix a seventh chord, for instance a Cmaj7. The idea is to split tensions and resolutions according to the ordering induced by the modal scale as follows:

C → □ → E → □ → G → □ → B

White squares are placeholders for the notes of a suitable triad. As we showed in the previous paragraph, choosing a D minor triad one finds the C ionian scale, choosing a D major triad one obtains C lydian, and with a D♯ diminished triad one obtains the C lydian ♯2 scale:

C → D → E → F → G → A → B
C → D → E → F♯ → G → A → B
C → D♯ → E → F♯ → G → A → B

Remark 18. The following section lies beyond the scope of this appendix. However, it represents the first effort towards a topological interpretation of music that I made with Alessandro Portaluri, thus I decided to add it to this thesis.

A.2 A geometrical representation of modes through graphs

In this section we suggest an elementary topological-oriented analysis of modes and modal scales. Once a seventh chord has been fixed, a particular graph is used both to provide an intuitive visualisation of the possible modal choices associated with the seventh chord and to give a qualitative description of the seventh chord classes listed in Table A.4.

A.2.1 Some mathematical preliminaries

Definition A.2.1. Two abstract (unoriented) graphs $(V, E)$ and $(V', E')$ are isomorphic if there exists a bijective map $f : V \to V'$ such that $\{v, w\} \in E \iff \{f(v), f(w)\} \in E'$.

Remark 19. Analogous definitions for oriented graphs are obtained by replacing unordered pairs $\{\cdot, \cdot\}$ by ordered pairs $(\cdot, \cdot)$.

Definition A.2.2. For $n > 1$, a path on a graph $G$ from $v^1$ to $v^{n+1}$ is a sequence of vertices and edges $v^1 e_1 v^2 e_2 \dots v^n e_n v^{n+1}$, where $e_1 = (v^1 v^2)$, $e_2 = (v^2 v^3)$, $\dots$, $e_n = (v^n v^{n+1})$. If $G$ is oriented, we only require that $e_i = (v^i v^{i+1})$ or $e_i = (v^{i+1} v^i)$ for $i = 1, \dots, n$; that is, an edge along the path may be oriented in the opposite way.

Definition A.2.3. The path is simple if $e_1, \dots, e_n$ are all distinct and $v^1, \dots, v^{n+1}$ are all distinct, except that possibly $v^1 = v^{n+1}$. If the simple path has $v^1 = v^{n+1}$ and $n > 0$, it is called a loop. A graph $G$ is said to be connected if, given any two vertices $v$ and $w$ of $G$, there is a path on $G$ from $v$ to $w$. A graph which is connected and without loops is called a tree.

Definition A.2.4. Given a graph $G$, a graph $H$ is called a subgraph of $G$ if the vertices of $H$ are vertices of $G$ and the edges of $H$ are edges of $G$. Also, $H$ is called a proper subgraph of $G$ if $H \neq G$.

Figure: (a) An example of a planar graph. (b) A maximal tree.

The following definition is central in the remainder.

Definition A.2.5. Let $G$ be a graph, $H$ be any maximal tree in $G$, $S$ be a subset of the vertex set and $k$ be an integer. An admissible path $\gamma$ in $G$ with respect to $S$ of length $k$ is any proper subgraph of $H$ satisfying the two conditions:
1. each vertex $v \in S$ lies in $\gamma$;
2. the total number of vertices in $\gamma$ is $k$.

We also observe that any graph $G$ has a subgraph which is a tree (e.g. the empty subgraph is a tree), so that the set $\mathcal{T}$ of subgraphs of $G$ which are trees has maximal elements. That is, there exists at least one $T \in \mathcal{T}$ such that $T$ is not a proper subgraph of any $T' \in \mathcal{T}$.

Lemma A.2.1. Let $G$ be a connected graph. A subgraph $T$ of $G$ is a maximal tree for $G$ if and only if $T$ is a tree containing all the vertices of $G$.

Proof. Cf. (Giblin, 2010, Proposition 1.11, p. 18).

For a connected graph $G$ there is a standard way to compute the fundamental group. In fact, the following result holds.

Proposition A.2.2. For a connected graph $G$ with maximal tree $T$, $\pi_1(G)$ is a free group with basis the classes $[f_\alpha]$ corresponding to the edges $e_\alpha$ of $G \setminus T$.

Proof. Cf. (Hatcher, 2002, Proposition 1A.2, p. 84).

A.2.2 Graphs and base-chords

Definition A.2.6.
Given a base-chord [B], the associated graph $G([B])$ is the realisation of the abstract graph whose vertex set is given by the set of all notes forming [B] and every compatible tension-triad. The oriented edge set is represented by all possible oriented connections between the vertices, according to the order of the degrees of the scale, i.e. from the root to the seventh (see Figure A.1).

When a graph is associated with a modal scale, it is possible to arrange its degrees in the plane in infinitely many ways. However, considering the orientation induced by the degrees of the scale, all these oriented graphs are homeomorphic. Since homeomorphisms induce isomorphisms in homotopy, the homotopic classification is not affected by the convention given in Definition A.2.6. We also observe that on the second, fourth and sixth degrees we have at most two choices. This is a straightforward consequence of the construction of the modes from the major, melodic minor and harmonic minor scales.

Figure A.1: A graph built assuming that the modal choices on a base-chord $B = \{b_1, b_2, b_3, b_4\}$ are given by two tension-triads $T = \{t_1, t_2, t_3\}$ and $\bar{T} = \{\bar{t}_1, \bar{t}_2, \bar{t}_3\}$.

Figure A.2: The graph associated with diminished seventh chords, $\Gamma_{\circ 7}$. Vertices: I, mII, mIII, dIV, dV, mVI, dVII.

Figure A.3: The graph associated with major seven ♯5 chords, $\Gamma_{maj7\sharp 5}$. Vertices: I, MII, MIII, PIV or aIV, aV, MVI, MVII.

Following Definition A.2.6, it is possible to associate a graph with each base-chord class: ◦7, maj7♯5, −maj7, maj7, 7, −7, −7♭5.

1. Diminished seven: $\Gamma_{\circ 7}$. This type of chord appears only in the harmonisation of the seventh degree of the harmonic minor scale, therefore the only available mode is the ultralocrian. Its graph (and a fortiori its maximal tree) is represented in Figure A.2.

2. Major seven ♯5: $\Gamma_{maj7\sharp 5}$.
Fixing a maj7♯5 chord as the base of the mode, we have two different possibilities: either the ionian sharp five or the lydian sharp five modal scale (Figure A.3).

3. Minor major seven: $\Gamma_{-maj7}$. In this case we can choose between two different modes: hypoionian and hypoionian ♭6. The graph is represented in Figure A.4.

4. Major seven: $\Gamma_{maj7}$. This is certainly a more common chord than the previous ones. We expect to have more possibilities: a well-known and simple base-chord will surely bear more tension-triads than a naturally dissonant one, as depicted in Figure A.5.

5. Dominant: $\Gamma_{7}$. Dominant chords are largely used in blues and traditional jazz thanks to their capability of bearing tensions. The graph associated with this chord class is depicted in Figure A.6.

6. Minor seven: $\Gamma_{-7}$. For a minor seventh chord the only forbidden notes are the augmented second and the diminished fourth. So, we obtain a graph isomorphic to $\Gamma_{7}$ (Figure A.7).

7. Minor seven ♭5: $\Gamma_{-7\flat 5}$. In this case, the root note and the diminished fifth form a tritone interval, which gives a stable sense of dissonance to the half-diminished seventh chord. This dissonance is emphasised by the minor second, which is natural in three of the four modal solutions we find by considering this chord class, i.e. the locrian, superlocrian and locrian ♯6 scales. See Figure A.8.

Figure A.4: The graph associated with minor major seventh chords, $\Gamma_{-maj7}$. Vertices: I, MII, mIII, PIV, PV, MVI or mVI, MVII.

Figure A.5: The graph associated with major seven chords, $\Gamma_{maj7}$. Vertices: I, MII or aII, MIII, PIV or aIV, PV, MVI, MVII.

Figure A.6: The graph associated with dominant chords, $\Gamma_{7}$. Vertices: I, MII or mII, MIII, PIV or aIV, PV, MVI or mVI, mVII.

Figure A.7: The graph associated with minor seven chords, $\Gamma_{-7}$. Vertices: I, MII or mII, mIII, PIV or aIV, PV, MVI or mVI, mVII.

Figure A.8: The graph associated with minor seven flat five chords, $\Gamma_{-7\flat 5}$. Vertices: I, MII or mII, mIII, PIV or dIV, dV, MVI or mVI, mVII.

Remark 20.
It is clear from the previous discussion that, even if the graphs $\Gamma_{7}$, $\Gamma_{-7}$ and $\Gamma_{-7\flat 5}$ are isomorphic, they are built on different notes and hence are essentially different; homotopy is therefore not suitable to distinguish among them.

All of these graphs show clearly how to construct new modes from the existing ones. Given a graph $G$, let us consider any proper tree $H$ (not maximal, in general!) contained in $G$ and having 7 vertices. By taking into account Definition A.2.5, we give the following definition.

Definition A.2.7. Let [B] be a base-chord and $G([B])$ be the associated base-chord graph. An admissible mode is any admissible connected subgraph (or path in the graph) $\gamma([B])$ in $G$ with respect to [B] of length 7. If $\gamma([B])$ is not one of the modes constructed above, we refer to it as an admissible special mode.

Proposition A.2.3. Given the base-chord class [B], the following modes are the only admissible special modes:

1. if $[\Gamma_B] = [\Gamma_{maj7}]$, then $\gamma([B])$ is the path
$\gamma_{Ion\sharp 2}$ := {I, aII, MIII, PIV, PV, MVI, MVII};

2. if $[\Gamma_B] = [\Gamma_{7}]$, then $\gamma([B])$ is one of the paths
$\gamma_{Mix\flat 2}$ := {I, mII, MIII, PIV, PV, MVI, mVII}
$\gamma_{Mix\flat 2\sharp 4}$ := {I, mII, MIII, aIV, PV, MVI, mVII}
$\gamma_{Mix\sharp 4\flat 6}$ := {I, MII, MIII, aIV, PV, mVI, mVII}
$\gamma_{Mix\flat 2\sharp 4\flat 6}$ := {I, mII, MIII, aIV, PV, mVI, mVII};

3. if $[\Gamma_B] = [\Gamma_{-7}]$, then $\gamma([B])$ is one of the paths
$\gamma_{Eol\flat 2}$ := {I, mII, mIII, PIV, PV, mVI, mVII}
$\gamma_{Eol\sharp 4}$ := {I, MII, mIII, aIV, PV, mVI, mVII}
$\gamma_{Phr\sharp 4}$ := {I, mII, mIII, aIV, PV, mVI, mVII};

4. if $[\Gamma_B] = [\Gamma_{-7\flat 5}]$, then $\gamma([B])$ is one of the paths
$\gamma_{Loc\sharp 2\sharp 6}$ := {I, MII, mIII, PIV, dV, MVI, mVII}
$\gamma_{Sup\sharp 2}$ := {I, MII, mIII, dIV, dV, mVI, mVII}
$\gamma_{Sup\sharp 6}$ := {I, mII, mIII, dIV, dV, MVI, mVII}
$\gamma_{Sup\sharp 2\sharp 6}$ := {I, MII, mIII, dIV, dV, MVI, mVII}.

Proof. The result readily follows from the previous graph classification. Let us consider every class of chord to prove the existence of special modes.

• ◦7.
No path associated with a special mode exists on the graph $\Gamma_{\circ 7}$ (Figure A.2), since there is only one path available, which represents the ultra locrian mode.

• maj7♯5 and −maj7. In both $\Gamma_{\mathrm{maj7}\sharp 5}$ (Figure A.3) and $\Gamma_{-\mathrm{maj7}}$ (Figure A.4) the only available choice is on the fourth and the sixth degree of the modal scale, respectively. This implies that only two admissible modes can be built on each graph, and they differ by exactly one note. So we can choose between two paths on the graph, which are exactly the two admissible modes we used to build it.

• maj7. In $\Gamma_{\mathrm{maj7}}$ (Figure A.5) there are $2^2$ available choices. Three modes generate this graph, so there is one special mode, represented by the admissible path on $\Gamma_{\mathrm{maj7}}$ that differs from the paths representing the Ionian, Lydian and Lydian ♯2 scales. The only such admissible path is
   $\{I, aII, MIII, PIV, PV, MVI, MVII\}$.

• 7. Four admissible non-special modes generate $\Gamma_{7}$ (see Table A.4 and Figure A.6). The total number of admissible modes in this graph is $2^3 = 8$, so we expect to find 4 special modes:
   $\{I, mII, MIII, PIV, PV, MVI, mVII\}$
   $\{I, mII, MIII, aIV, PV, MVI, mVII\}$
   $\{I, MII, MIII, aIV, PV, mVI, mVII\}$
   $\{I, mII, MIII, aIV, PV, mVI, mVII\}$

• −7 and −7♭5. These cases are similar to the previous one. $\Gamma_{-7}$ (Figure A.7) is generated by 5 admissible non-special modes (Table A.4), so we have 3 special modes, which are
   $\{I, mII, mIII, PIV, PV, mVI, mVII\}$
   $\{I, MII, mIII, aIV, PV, mVI, mVII\}$
   $\{I, mII, mIII, aIV, PV, mVI, mVII\}$.
$\Gamma_{-7\flat 5}$ (Figure A.8) is generated by four admissible non-special modes (Table A.4), so we have the following four special modes:
   $\{I, MII, mIII, PIV, dV, MVI, mVII\}$
   $\{I, MII, mIII, dIV, dV, mVI, mVII\}$
   $\{I, mII, mIII, dIV, dV, MVI, mVII\}$
   $\{I, MII, mIII, dIV, dV, MVI, mVII\}$

A.2.3 A qualitative description of the base-chord classes

The aim of this section is to associate a value to each base-chord, with respect to the connections of its graph as defined in the previous section.

Definition A.2.8. Given a base-chord $[B]$, let $G([B])$ be the associated base-chord graph. We call topological quality of $[B]$, denoted $\tau([B])$, the number of generators of the fundamental group of $G([B])$.

Lemma A.2.4. Let $[B]$ be a base-chord. The integer $\tau([B])$ is well-defined.

Proof. It is enough to observe that, given any base-chord $[B]$, the associated integer $\tau([B])$ is uniquely defined. In fact, by the classification given in Section A.2, to each base-chord $[B]$ we can uniquely associate a planar connected graph $G([B])$. As a direct consequence of Proposition A.2.2, the fundamental group $\pi_1(G([B]))$ is a free group with $\tau([B])$ generators.

Proposition A.2.5. Let $[B]$ be a base-chord and $G([B])$ the associated planar graph. Then the fundamental group $\pi_1(G([B]))$ and the topological measure of complexity $\tau([B])$ are given in the table below.

Base-chord [B]    $\pi_1(G([B]))$      $\tau([B])$
◦7                $\{1\}$              0
maj7♯5            $\mathbb{Z}$         1
−maj7             $\mathbb{Z}$         1
maj7              $\mathbb{Z}^{*2}$    2
7                 $\mathbb{Z}^{*3}$    3
−7                $\mathbb{Z}^{*3}$    3
−7♭5              $\mathbb{Z}^{*3}$    3

Proof. The proof follows from the classification given in Section A.2 and Proposition A.2.2.

In conclusion, the topology of the graphs we constructed reflects the degrees of freedom offered by a standard seventh chord, from a modal viewpoint. Moreover, dominant, minor seventh and half-diminished chords are commonly substituted in a jazz context.
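The values of $\tau$ can be cross-checked numerically, since the fundamental group of a connected graph is free of rank $|E| - |V| + 1$. The following Python sketch is only illustrative: it assumes that each base-chord graph is layered by scale degree, with every vertex of one degree joined to every vertex of the next degree (an assumption about the construction, which is not restated in this section), and the names `layered_graph`, `cycle_rank` and `admissible_paths` are hypothetical.

```python
from itertools import product


def layered_graph(layers):
    """Return (vertices, edges) of a graph whose consecutive layers are fully joined.

    ASSUMPTION: one layer per scale degree; this layering is our reading of the
    figures, not a definition taken from the text.
    """
    vertices = [v for layer in layers for v in layer]
    edges = [(a, b)
             for cur, nxt in zip(layers, layers[1:])
             for a in cur
             for b in nxt]
    return vertices, edges


def cycle_rank(vertices, edges):
    """Rank of the (free) fundamental group of a connected graph: |E| - |V| + 1."""
    return len(edges) - len(vertices) + 1


def admissible_paths(layers):
    """All 7-note paths obtained by picking one vertex per scale degree."""
    return [list(path) for path in product(*layers)]


# Vertex sets read off Figures A.4, A.5 and A.6, grouped by degree.
GAMMA_MINMAJ7 = [["I"], ["MII"], ["mIII"], ["PIV"], ["PV"], ["mVI", "MVI"], ["MVII"]]
GAMMA_MAJ7 = [["I"], ["MII", "aII"], ["MIII"], ["PIV", "aIV"], ["PV"], ["MVI"], ["MVII"]]
GAMMA_7 = [["I"], ["mII", "MII"], ["MIII"], ["PIV", "aIV"], ["PV"], ["mVI", "MVI"], ["mVII"]]

if __name__ == "__main__":
    for name, layers in [("-maj7", GAMMA_MINMAJ7), ("maj7", GAMMA_MAJ7), ("7", GAMMA_7)]:
        v, e = layered_graph(layers)
        print(name, cycle_rank(v, e), len(admissible_paths(layers)))
```

Under this layering assumption the sketch gives rank 1 for −maj7, 2 for maj7 and 3 for the dominant chord, in agreement with the table, and it counts $2^2$ and $2^3$ paths for the last two graphs, as used in the proof of Proposition A.2.3.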
Think, for instance, of the equivalence between dominant and minor seven chords in a blues improvisation, or of the strong relation between the pitch-class sets of a dominant chord, the half-diminished chord built on its major third and the minor seven chord built on its perfect fifth (for instance {G, B, D, F}, {B, D, F, A} and {D, F, A, C}, respectively).

B Geometric characterisation of the chord space (proof)

Theorem B.0.1. The space of chords $A_n$ is a metric space, obtained by gluing the $(n-1)$-dimensional tetrahedral bases of a right $n$-dimensional prism via the equivalence relation induced by a cyclic permutation of the vertices.

Proof. Let $G = \langle T^n, S_n \rangle$ be the group of isometries such that $A_n = \mathbb{R}^n / G$, and let $x = (x_1, \dots, x_n)$ be a point in $\mathbb{R}^n$. Let $\tau_i \in T^n$ be the translation
\[
\tau_i(x_1, \dots, x_i, \dots, x_n) = (x_1, \dots, x_i + 1, \dots, x_n)
\]
and $\sigma_{ij} \in S_n$ the permutation swapping the $i$th and $j$th coordinates of $x$. The group $G$ is isomorphic to $T^n \rtimes S_n$, since $T^n \cap S_n = \{e\}$ and $T^n$ is normal in $G$. The proof is structured as follows:

1. Study the subgroup of isometries $F \subset G$ that fixes the hyperplanes in $\mathbb{R}^n$, then observe that the elements of this group are reflections.
2. Define the fundamental domain for the action of $G$ and show that it is the prism endowed with the structure described in the statement of the theorem.

1. Let $C_t$ be the hyperplane
\[
C_t = \Bigl\{ (x_1, \dots, x_n) \in \mathbb{R}^n \;\Big|\; \sum_{i=1}^{n} x_i = t \Bigr\}
\]
for $t \in \mathbb{R}$, and $F \subset G$ the subgroup of isometries fixing $C_t$. An element of $F$ can be written as $\sigma\tau$, where $\sigma \in S_n$ and $\tau = \tau_1^{e_1} \cdots \tau_n^{e_n}$, with $\sum_{i=1}^{n} e_i = 0$. We prove by induction on $m = |\{ e_i \neq 0, \text{ for } i = 1, \dots, n \}|$ that $\tau$ is an element of the group generated by conjugates of elements of $S_n$. Observe that $m$ has to be at least 2: since $\tau = \tau_1^{e_1} \cdots \tau_n^{e_n}$, the sum of the coordinates $\sum_i x_i = t$ is invariant under $\tau$ only if the sum of its exponents is $\sum_i e_i = 0$, hence $m > 1$.

• $m = 2$: $\tau = \tau_i^{k} \tau_j^{-k} = (\tau_i \tau_j^{-1})^{k}$, and $\tau_i \tau_j^{-1} = \sigma_{ij} \tau_i^{-1} \sigma_{ij} \tau_i$.

• We assume the statement true for every integer up to $m$.
For $(m + 1)$ we have
\[
\tau = \tau_{i_1}^{e_1} \cdots \tau_{i_{m+1}}^{e_{m+1}} = \tau_{i_1}^{e_1} \cdots \tau_{i_m}^{e_m + e_{m+1}} \, \tau_{i_m}^{-e_{m+1}} \tau_{i_{m+1}}^{e_{m+1}} .
\]
By the induction hypothesis both $\tau_{i_1}^{e_1} \cdots \tau_{i_m}^{e_m + e_{m+1}}$ and $\tau_{i_m}^{-e_{m+1}} \tau_{i_{m+1}}^{e_{m+1}}$ can be written as products of conjugates of elements of $S_n$, and hence so can $\tau$.

The subgroup $F$ is generated by elements of the form $\tau \sigma_{ij} \tau^{-1}$. Thus, it is necessary to study the transformations of this form. We distinguish two cases:

(i) $\tau = \tau_k^{m}$ and $k \notin \{i, j\}$; then $\tau_k^{m} \sigma_{ij} \tau_k^{-m} = \sigma_{ij}$.

(ii) $\tau = \tau_k^{m}$ and either $k = i$ or $k = j$. Assume $k = j$; then
\[
\begin{aligned}
\tau_j^{m} \sigma_{ij} \tau_j^{-m} (x_1, \dots, x_i, \dots, x_j, \dots, x_n)
&= \tau_j^{m} \sigma_{ij} (x_1, \dots, x_i, \dots, x_j - m, \dots, x_n) \\
&= \tau_j^{m} (x_1, \dots, x_j - m, \dots, x_i, \dots, x_n) \\
&= (x_1, \dots, x_j - m, \dots, x_i + m, \dots, x_n) .
\end{aligned}
\]
Hence, $\tau_j^{m} \sigma_{ij} \tau_j^{-m}$ corresponds to the reflection with respect to the hyperplane $x_j - x_i = m$.

2. The fundamental domain for the action of $G$ is given by $P = C \cap D$, where
\[
C = \Bigl\{ (x_1, \dots, x_n) \in \mathbb{R}^n \;\Big|\; \sum_{i=1}^{n} x_i \in [0, 1] \Bigr\}
\]
and
\[
D = \{ (x_1, \dots, x_n) \in \mathbb{R}^n \mid x_i \geq x_{i+1} \ \forall i \in \{1, \dots, n-1\}, \ x_1 \leq x_n + 1 \} .
\]
Indeed, every point $x \in \mathbb{R}^n$ can be translated to some $C_t$ with $t \in [0, 1]$; hence every $x \in \mathbb{R}^n$ is in relationship with a point of $P$ through the action of elements of $G$. The elements of $F$ act as mirrors with respect to the hyperplanes $x_i = x_j + c$, with $c \in \mathbb{Z}$. These constitute $\binom{n}{2}$ families of parallel hyperplanes, which decompose $C_t$ into a union of $(n-1)$-dimensional simplices. Through these reflections, each point of $C_t$ can be associated to one and only one point of the simplex $D \cap C_t$.

The points lying in the interior of $P$ are not in relationship. Observe that the elements of $G$ modify the sum $\sum_i x_i$ by an integer value. If two points $x \in C_p$ and $y \in C_q$, with $p, q \in [0, 1]$, were in relationship, then $p - q \in \mathbb{Z}$, so there are only two possible cases:

(i) $p = 0$ and $q = 1$ (or vice versa).
In this case the points $x$ and $y$ belong to the bases of the prism and not to its interior.

(ii) $p = q$. Both points lie in the simplex $D \cap C_p$, hence they cannot be in relationship. Assume the contrary; then there exists $g \in F$ such that $g(x) = y$, i.e. $y$ can be obtained by reflecting $x$ with respect to a hyperplane of the form $x_i = x_j + c$, which is impossible by the argument above.

Finally, we study the possible relationships between the two bases $D \cap C_0$ and $D \cap C_1$. Let $g$ be an element of $G$, $V = \{v_0, \dots, v_k\}$ a set of points in $\mathbb{R}^n$ and $x = \sum_{i=0}^{k} \lambda_i v_i$, where $\sum_i \lambda_i = 1$; then
\[
g(x) = g\Bigl( \sum_{i=0}^{k} \lambda_i v_i \Bigr) = \sum_{i=0}^{k} \lambda_i \, g(v_i) .
\]
These equalities show that we can study the possible relationships between the vertices of the two bases, and then extend them to their convex hulls. The vertices of $D \cap C_0$ are the origin $v_0$ and the points
\[
v_k = \Bigl( \underbrace{\tfrac{k}{n}, \dots, \tfrac{k}{n}}_{(n-k) \text{ times}}, \underbrace{\tfrac{k-n}{n}, \dots, \tfrac{k-n}{n}}_{k \text{ times}} \Bigr), \quad \text{where } k \in \{1, \dots, n-1\} .
\]
The vertices of the basis $D \cap C_1$ have the form $u_k = v_k + \bigl( \tfrac{1}{n}, \dots, \tfrac{1}{n} \bigr)$ for $k \in \{0, \dots, n-1\}$, i.e. they are obtained by a translation of the vertices $v_k$ along the height axis of the prism $P$. Note that every $v_k$ is in relationship with the point $\bigl( \tfrac{k}{n}, \dots, \tfrac{k}{n} \bigr)$ (it suffices to apply $\tau_j$ for $j > n - k$). By the same argument, every $u_k$ is in relation with $\bigl( \tfrac{k+1}{n}, \dots, \tfrac{k+1}{n} \bigr)$. Hence, for $k \in \{0, \dots, n-2\}$, $u_k$ is in relation with $v_{k+1}$, and $u_{n-1}$ is in relation with $v_0$.

C Code

C.1 Boundary matrix reduction in persistence algorithm

import numpy as np


def low(column):
    # Index of the lowest (last) non-zero entry of a column, or None if the column is zero
    ones = np.nonzero(column)
    # print ones[0]
    if ones[0].size == 0:
        # test emptiness explicitly: a single non-zero entry at index 0 is still a valid low
        low = None
    else:
        low = int(max(ones[0]))
    return low


def persistence(boundary):
    # Initialize the reduced matrix as a copy of boundary
    # (boundary is expected to be a numpy.matrix with entries in {0, 1})
    R = boundary
    # Check if the boundary matrix has non-admissible entries
    nonz = R[np.nonzero(R > 1)]
    if nonz.shape[1] > 0:
        raise ValueError("check your boundary matrix!")
    else:
        columns = R[:R.shape[1]].T
        for i in range(len(columns)):
            for j in range(i):
                ci = np.array(columns[i])[0].tolist()
                cj = np.array(columns[j])[0].tolist()
                low_i = low(ci)
                low_j = low(cj)
                if low_i is not None and low_j is not None and low_i == low_j:
                    # add column j to column i modulo 2, then restart the reduction
                    new_col = np.mod(np.add(ci, cj), 2)
                    columns[i, :] = new_col
                    persistence(R)
        return columns.transpose()

C.2 Three Dimensional Visualization of the Deformed Tonnetz

The following code adds some comments to the JavaScript generating the web application hosted at http://nami-lab.com/tonnetz/examples/deformed_tonnetz_int_sound_pers.html. This implementation depends on the Three.js and MIDI.js libraries, which are freely downloadable. We refer to the online version of the code for the library dependencies and to see how the script is embedded in an HTML page.

C.2.1 Tutorial

The web application allows the user to deform a 15 × 15 Tonnetz in two different ways: by playing chords and melodies on the computer keyboard, using the keys of the first and second rows as a piano keyboard according to Table C.1, or by playing a piece of music chosen among those listed in the song menu of the graphical user interface (GUI). It is possible to orbit in the 3D scene, using the mouse to zoom and rotate the Tonnetz and to move the whole mesh (right click and drag). The skeletons of the simplicial complex can be hidden using the show commands of the GUI, to better understand the positions of the vertices as a point cloud, or the geometry of the edges, which could be hidden by the triangles in some configurations. Once the Tonnetz has been deformed, the function disp_pref computes a preferred set of pitch classes by filtering the vertices of the Tonnetz on their relative height: it keeps those higher than a certain threshold, which depends on the height of the maximal peak among the vertices.
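The thresholding step just described can be sketched in a few lines. The Python function below is an illustrative assumption and not the code of the web application: it takes one height per pitch class and keeps the pitch classes whose height is strictly greater than the maximal height divided by a threshold (the JavaScript listing uses the value 2). The name `preferred_pitch_classes` is hypothetical.

```python
def preferred_pitch_classes(heights, threshold=2.0):
    """Return the pitch classes whose height exceeds max(heights) / threshold.

    ASSUMPTION: `heights` holds one non-negative height per pitch class,
    indexed by pitch class; the result is ordered from lowest to highest peak.
    """
    if not heights:
        return []
    cutoff = max(heights) / threshold
    kept = [(h, pc) for pc, h in enumerate(heights) if h > cutoff]
    return [pc for _, pc in sorted(kept)]


if __name__ == "__main__":
    # Hypothetical heights after playing mostly C, E and G.
    heights = [0.9, 0.0, 0.1, 0.0, 0.6, 0.2, 0.0, 0.8, 0.0, 0.3, 0.0, 0.1]
    print(preferred_pitch_classes(heights))
```

With a flat (undeformed) Tonnetz all heights are zero, so no pitch class passes the strict comparison and the preferred set is empty.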
The preferred pitch class configuration is then shown as a point cloud on the planar Tonnetz in z = 0.

Table C.1: Pitches - Key association

Key   Pitch      Key   Pitch
a     C          y     G♯
w     C♯         h     A
s     D          u     A♯
e     D♯         j     B
d     E          k     C
f     F          o     C♯
t     F♯         l     D
g     G          p     D♯

<!-- Soundfont settings -->
<script>
MIDI.loadPlugin({
    soundfontUrl: "./soundfont/",
    instrument: "acoustic_grand_piano"
});
</script>
<script>
// Three js standard objects
var group;
var container, stats;
var particlesData = [];
var camera, scene, renderer;

// Tonnetz variables
var positions, colors;
var pointCloud;
var particlePositions, vertices;
var triangles, edges_helper, edges, mesh;
var num_of_points_per_line = 10;
var num_of_lines = 10;
var num_of_triangles = 2 * (num_of_lines - 1) * (num_of_points_per_line - 1);
var num_of_edges = 3 * num_of_triangles / 2 + (num_of_lines - 1) + num_of_points_per_line - 1;
var faces = [];
var num_of_particles = num_of_points_per_line * num_of_lines;
var particleCount = num_of_particles;
var shadow, mesh_s;

// keyboard listener
var keyboard = new THREEx.KeyboardState();
var indices_array = [], pitches_array = [];

// GUI
var effectController = {
    showVertices: true,
    showEdges: true,
    showTriangles: true,
    showHelper: true,
    play_pause: true,
    stop: false,
    song: "1 all_the_things_you_are.mid",
    reset_ton: false
}

// MIDI playing settings
var delay = 0; // play one note every quarter second
var velocity = 127; // how hard the note hits
var note, cur_playing = [];
var player = MIDI.Player;
var pitch = [], message;

// Songs
var songsToFiles = {
    "All the Things You Are": "3 all_the_things_you_are_piano_solo_2_tema.mid",
    ...
};

init();
animate();

function initGUI() {
    var gui = new dat.GUI();
    gui.add(effectController, "showVertices").onChange(function (value) {
        pointCloud.
            visible = value;
    });
    gui.add(effectController, "showEdges").onChange(function (value) {
        edges.visible = value;
    });
    gui.add(effectController, "showTriangles").onChange(function (value) {
        mesh.visible = value;
    });
    gui.add(effectController, "showHelper").onChange(function (value) {
        edges_helper.visible = value;
    });
    gui.add(effectController, "disp_pref").onChange(function (value) {
        if (value == true) {
            disp_pref();
            mesh.visible = false;
        } else {
            mesh.visible = true;
        }
    });
    gui.add(effectController, "play_pause").onChange(function (value) {
        if (value == true) {
            player.start();
        } else {
            player.pause();
        }
    });
    gui.add(effectController, "stop").onChange(function (value) {
        player.stop();
    });
    gui.add(effectController, 'song', songsToFiles).onChange(function (value) {
        player.stop();
        player.loadFile("midi/" + value, player.start);
    });
    gui.add(effectController, "reset_ton").onChange(function (value) {
        undeform();
    });
}

function init() {
    initGUI();
    container = document.getElementById('container');

    // camera settings
    camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 1, 4000);
    camera.position.z = 10;
    camera.position.x = 0;
    camera.position.y = -20;

    // controls
    controls = new THREE.OrbitControls(camera, container);

    // scene and lights
    scene = new THREE.Scene();
    scene.fog = new THREE.Fog(0x050505, 2000, 3500);
    var light1 = new THREE.DirectionalLight(0xffffff, 0.5);
    light1.position.set(100, 100, 100);
    scene.add(light1);
    var light2 = new THREE.DirectionalLight(0xffffff, 1.5);
    light2.
        position.set(0, -1, 0);
    scene.add(light2);

    // Tonnetz vertices
    group = new THREE.Group();
    scene.add(group);
    positions = new Float32Array(num_of_particles * 3);
    colors = new Float32Array(num_of_particles * 3);

    // vertices material
    var pMaterial = new THREE.PointCloudMaterial({
        color: 0xffffff,
        size: 3,
        blending: THREE.AdditiveBlending,
        transparent: true,
        sizeAttenuation: false
    });
    particles = new THREE.BufferGeometry();
    particlePositions = new Float32Array(num_of_particles * 3);

    // defining the triangle of the tonnetz as equilateral
    var k = 0;
    for (var i = 0; i < num_of_lines; i++) {
        for (var j = 0; j < num_of_points_per_line; j++) {
            if (i % 2 == 0) {
                var x = j;
                var y = Math.sqrt(3) / 2 * i;
                var z = 0;
                pitches_array[k] = ((j * 7) + i / 2) % 12;
            } else {
                var x = j + 1 / 2;
                var y = Math.sqrt(3) / 2 * i;
                var z = 0;
                pitches_array[k] = ((j * 7 + 4) + (i - 1) / 2) % 12;
            }
            particlePositions[k * 3] = x;
            particlePositions[k * 3 + 1] = y;
            particlePositions[k * 3 + 2] = z;
            indices_array[k] = k;
            k++;
        }
    }

    // add position and index attribute to vertices
    particles.addAttribute('position', new THREE.DynamicBufferAttribute(particlePositions, 3));
    particles.addAttribute('index', new THREE.BufferAttribute(new Uint16Array(indices_array), 1));

    // add vertices to the scene
    pointCloud = new THREE.PointCloud(particles, pMaterial);
    group.add(pointCloud);

    // create the 2-skeleton (edges will be added later)
    triangles = new THREE.Geometry();
    for (var i = 0; i < num_of_particles; i++) {
        var v = new THREE.Vector3(particlePositions[i * 3],
                                  particlePositions[i * 3 + 1],
                                  particlePositions[i * 3 + 2]);
        triangles.
            vertices.push(v);
    }
    var ind = 0;
    for (var j = 0; j < num_of_lines - 1; j++) {
        var k = j * num_of_points_per_line;
        for (var i = 0; i < num_of_points_per_line - 1; i++) {
            if (j % 2 == 0) {
                triangles.faces.push(new THREE.Face3(k + i + 0, k + i + 1, k + i + num_of_points_per_line));
                var v1 = [k + i + 0, k + i + 1, k + i + num_of_points_per_line];
                triangles.faces.push(new THREE.Face3(k + i + num_of_points_per_line, k + i + num_of_points_per_line + 1, k + i + 1));
                var v2 = [k + i + num_of_points_per_line, k + i + num_of_points_per_line + 1, k + i + 1];
            } else {
                triangles.faces.push(new THREE.Face3(k + i + 0, k + i + 1, k + i + num_of_points_per_line + 1));
                var v1 = [k + i + 0, k + i + 1, k + i + num_of_points_per_line + 1];
                triangles.faces.push(new THREE.Face3(k + i + num_of_points_per_line, k + i + num_of_points_per_line + 1, k + i));
                var v2 = [k + i + num_of_points_per_line, k + i + num_of_points_per_line + 1, k + i];
            }
            faces[ind] = v1;
            faces[ind + 1] = v2;
            ind += 2;
        }
    }
    for (var i = 0; i < triangles.faces.length; i++) {
        triangles.faces[i].color.setHex(1 * 0xffffff);
    }

    // Set the faces' material
    var material = new THREE.MeshPhongMaterial({
        color: 0xaaaaaa,
        specular: 0x00ffff,
        shininess: 250,
        side: THREE.DoubleSide,
        vertexColors: THREE.FaceColors
    });

    // Compute their normals
    triangles.computeFaceNormals();
    mesh = new THREE.Mesh(triangles, material);

    // State the geometry will be dynamically updated
    mesh.geometry.dynamic = true;

    // Add triangles
    group.add(mesh);

    // generate the planar Tonnetz behind the deformed one
    edges_helper = new THREE.EdgesHelper(mesh, 0x00ff00);
    group.add(edges_helper);

    // edge material: the attribute wireframe allows to generate the edges
    // directly from the triangles
    var ematerial = new THREE.
        MeshBasicMaterial({
            color: 0x00ff00,
            specular: 0x00ffff,
            shininess: 250,
            side: THREE.DoubleSide,
            wireframe: true
        });

    // Add edges
    edges = new THREE.Mesh(triangles, ematerial);
    group.add(edges);

    // Association among keyboard character and sounds
    var playing_char = ["a", "w", "s", "e", "d", "f", "t", "g",
                        "y", "h", "u", "j", "k", "o", "l", "p", "Ãš"];
    for (var i = 0; i < playing_char.length; i++) {
        cur_playing[playing_char[i]] = 0;
    }

    // MIDI player settings
    window.onload = function () {
        MIDI.loadPlugin(function () {
            player.timeWarp = 1.0;
            player.addListener(function (data) {
                message = data.message;
                // If a pitch plays return it
                if (message === 144) {
                    pitch.push(data.note);
                } else {
                    pitch = [];
                }
            });
        });
    }

    // renderer
    renderer = new THREE.WebGLRenderer({ antialias: true, alpha: true });
    renderer.setPixelRatio(window.devicePixelRatio);
    renderer.setSize(window.innerWidth, window.innerHeight);
    renderer.gammaInput = true;
    renderer.gammaOutput = true;
    container.appendChild(renderer.domElement);

    // fps stats
    stats = new Stats();
    stats.domElement.style.position = 'absolute';
    stats.domElement.style.top = '0px';
    container.appendChild(stats.domElement);
    window.addEventListener('resize', onWindowResize, false);
}

function onWindowResize() {
    camera.aspect = window.innerWidth / window.innerHeight;
    camera.updateProjectionMatrix();
    renderer.setSize(window.innerWidth, window.innerHeight);
}

// MIDI noteOn and noteOff
function play(note) {
    MIDI.noteOn(0, note, velocity, delay);
}

function stop(note) {
    MIDI.noteOff(0, note, delay);
}

function animate() {
    requestAnimationFrame(animate);
    stats.
        update();
    render();
}

// Tonnetz deformation through keyboard: play a pitch and update the
// geometry of the simplicial complex
function key(character, note) {
    var c = 0.002;
    if (keyboard.pressed(character)) {
        for (var i = 0; i < pitches_array.length; i++) {
            if (pitches_array[i] == note % 12) {
                particlePositions[i * 3 + 2] += c;
                mesh.geometry.vertices[i].z += c;
            }
        }
        if (cur_playing[character] == 0) {
            cur_playing[character] = 1;
            play(note);
        }
    } else if (cur_playing[character] == 1) {
        cur_playing[character] = 0;
        stop(note);
    }
}

// Bring the tonnetz to its planar shape
function undeform() {
    var c = 0.002;
    for (var i = 0; i < pitches_array.length; i++) {
        particlePositions[i * 3 + 2] = 0;
        mesh.geometry.vertices[i].z = 0;
    }
}

// Deformation induced by the MIDI player
function playerdef(pitch) {
    c = 0.002;
    if (message === 144) {
        for (var i = 0; i < pitches_array.length; i++) {
            if (pitches_array[i] == pitch % 12) {
                particlePositions[i * 3 + 2] += c;
                mesh.geometry.vertices[i].z += c;
            }
        }
    }
}

function disp_pref() {
    // get the values of the vertices on the mesh
    var sk_0 = mesh.geometry.
        vertices;
    // get the faces of the mesh
    var sk_2 = faces;
    var height = [];
    var h_val = [];
    var sort_indices = [];
    var pref_pitches = [];

    // read heights of pitches and create an indexed array
    for (var i = 0; i < 12; i++) {
        height[i] = [i, sk_0[i].z];
        h_val[i] = sk_0[i].z;
    }
    var fifths = [[0, 'C'], [7, 'G'], [2, 'D'], [9, 'A'], [4, 'E'], [11, 'B'],
                  [6, 'F#'], [1, 'C#'], [8, 'Ab'], [3, 'Eb'], [10, 'Bb'], [5, 'F']];
    var aver = eval(h_val.join('+')) / 12;

    // sort the vertices using their height and save the
    // corresponding permutation of the indices in the array
    // sort_indices
    height.sort(function (x, y) { return x[1] - y[1] })
    for (var i = 0; i < height.length; i++) {
        sort_indices[i] = height[i][0];
        pref_pitches[i] = fifths[height[i][0]];
    }

    //
    var soil_pref = [];
    var s = 0;
    var threshold = 2;
    var max_h = height[height.length - 1][1];
    soil_pref[0] = height[height.length - 1];

    // select preferred pitches (depends on threshold)
    for (var i = height.length - 1; i > 0; i--) {
        if (height[i][1] > max_h / threshold) {
            soil_pref[s] = height[i];
            s++;
        }
    }

    // sort preferred pitches remembering their indices
    soil_pref.sort(function (x, y) { return x[1] - y[1] })
    console.log(soil_pref)
    var sort_soil_indices = [];
    var pref_soil_pitches = [];
    for (var i = 0; i < soil_pref.length; i++) {
        sort_soil_indices[i] = fifths[soil_pref[i][0]][0];
        pref_soil_pitches[i] = fifths[soil_pref[i][0]][1];
    }

    // preferred pitches and their value in Z/12Z are displayed in console
    console.log(sort_soil_indices, pref_soil_pitches)
    var material = new THREE.
        PointCloudMaterial({
            color: 0x000000,
            size: 5,
            transparent: false,
            sizeAttenuation: false
        });

    // generate the geometry associated to preferred pitches
    shadow = new THREE.Geometry();

    // draw it
    for (var i = 0; i < sort_soil_indices.length; i++) {
        var pref_pitch = sort_soil_indices[i];
        for (var j = 0; j < pitches_array.length; j++) {
            if (pitches_array[j] == pref_pitch) {
                var v = new THREE.Vector3(particlePositions[j * 3],
                                          particlePositions[j * 3 + 1], 0);
                shadow.vertices.push(v);
            }
        }
    }

    // generate the point cloud of preferred pitches and add it to the group
    mesh_s = new THREE.PointCloud(shadow, material);
    mesh_s.geometry.dynamic = true;
    group.add(mesh_s)
}

function render() {
    key("a", 60);
    ...
    // Receive pitches from the player and deform the tonnetz
    if (message === 144) {
        for (var i = 0; i < pitch.length; i++) {
            playerdef(pitch[i])
        }
    }

    // Need Update declaration
    pointCloud.geometry.attributes.position.needsUpdate = true;
    mesh.geometry.verticesNeedUpdate = true;
    renderer.render(scene, camera);
}
</script>

C.3 Persistent homology computation

The following code (written for Python 2) describes the computation of the persistence diagrams of the torus Tonnetz through the filtration induced by the height function defined on its planar covering.

import os
import numpy as np
from music21 import *
from dionysus import *
from os import listdir
from os.path import isfile, join
import matplotlib.pyplot as plt
import matplotlib as mpl
import csv
from mpl_toolkits.mplot3d import axes3d
import mpl_toolkits.mplot3d.axes3d as p3


class Tonnetz:

    def __init__(self, left, lowRight, upRight):
        self.left = left
        self.lowRight = lowRight
        self.upRight = upRight
        self.weights = {}
        self.triangles = []

    notes = ['C', 'C#', 'D', 'Eb', 'E', 'F', 'F#', 'G', 'G#', 'A', 'Bb', 'B']

    def computeDataFrom(self, piece, considerDurations = True):

        def extractFromChord(c):
            values = []
            value = None
            try:
                value = c.pitchClass
            except AttributeError:
                pass  # do not try others
            if value is not None:
                values.append(value)
            if values == []:
                for p in c.pitches:
                    # try to get values from pitch first, then chord
                    value = None
                    try:
                        value = p.pitchClass
                    except AttributeError:
                        break  # do not try others
                    if value is not None:
                        values.append(value)
            return values

        self.weights = {}
        for i in range(12):
            self.weights[i] = 0
        flat = piece.flat.getElementsByClass([note.Note, chord.Chord])
        for obj in flat:
            if 'Chord' in obj.classes:
                values = extractFromChord(obj)
            else:  # simulate a list
                values = [obj.pitch.pitchClass]
            for i, value in enumerate(values):
                if considerDurations:
                    self.weights[value] += obj.duration.quarterLength
                else:
                    self.weights[value] += 1

    def loadDataFromTxt(self, filename):
        with open(filename, "r") as text_file:
            lines = text_file.readlines()
        self.weights = {}
        for i in range(12):
            self.weights[i] = float(lines[i].replace("\n", ""))

    def draw2D(self, title = '', showValues = True):
        nCols = 4
        nRows = 4
        x = []
        y = []
        z = []
        for i in range(nRows):
            for j in range(nCols):
                x.append(1 + 2 * j + i)
                y.append(nRows + 1 - 2 * i)
        self.triangles = [[0,4,1],[1,5,2],[2,6,3],
                          [1,4,5],[2,5,6],[3,6,7],
                          [4,8,5],[5,9,6],[6,10,7],
                          [5,8,9],[6,9,10],[7,10,11]]
        triangles = np.asarray(self.triangles)
        print triangles
        plt.figure()
        plt.gca().set_aspect('equal')
        plt.triplot(x, y, triangles, "o--", color='blue')
        plt.title(title)
        plt.axis([0,10,0,6])  # margins
        plt.axis('off')
        matrix = [[0 for i in range(10)] for j in range(10)]
        noteIndex = self.left % 12  # first "previous" point
        for i in range(3):  # this sequence depends on x and y creation order
            for j in range(4):
                noteIndex = (noteIndex - self.left) % 12
                xx = (1 + 2 * j) + i
                yy = 5 - 2 * i
                try:
                    matrix[yy][xx] = self.weights[noteIndex]
                except KeyError:
                    print 'Error: Please call computeDataFrom before calling draw2D'
                    return
                label = self.notes[noteIndex]
                if showValues:
                    label += ' (' + str(self.weights[noteIndex]) + ') ' + "[" + str(4 * i + j) + "]"
                plt.annotate(label, xy=(xx, yy), xytext=(8, 3),
                             textcoords='offset points', color='blue')
            noteIndex = (noteIndex + self.lowRight + 4 * self.left) % 12
        cmap = mpl.colors.LinearSegmentedColormap.from_list('my_cmap', ['white','red'], 256)
        cmap._init()
        alphas = np.linspace(0, 0.8, cmap.N+3)
        cmap._lut[:, -1] = alphas
        plt.imshow(matrix, alpha=1, cmap=cmap, interpolation='blackman')
        plt.colorbar()
        plt.show()

    def draw3D(self, title = '', showValues = True, saveFileName = ""):
        nCols = 4
        nRows = 4
        x = []
        y = []
        z = []
        for i in range(nRows):
            for j in range(nCols):
                x.append(1 + 2 * j + i)
                y.append(nRows + 1 - 2 * i)
        self.triangles = [[0,4,1],[1,5,2],[2,6,3],
                          [1,4,5],[2,5,6],[3,6,7],
                          [4,8,5],[5,9,6],[6,10,7],
                          [5,8,9],[6,9,10],[7,10,11]]
        triangles = np.asarray(self.triangles)
        fig = plt.figure()
        ax = fig.gca(projection='3d')
        ax.view_init(elev=40., azim=268)
        noteIndices = []
        noteIndex = self.left % 12  # first "previous" point
        for i in range(3):  # this sequence depends on x and y creation order
            for j in range(4):
                noteIndex = (noteIndex - self.left) % 12
                noteIndices.append(noteIndex)
                z.append(self.weights[noteIndex])
            noteIndex = (noteIndex + self.lowRight + 4 * self.left) % 12
        if (showValues):
            [ax.text(x[i], y[i], z[i],
                     self.notes[noteIndices[i]] + ":" + str("{0:.2f}".format(z[i])))
             for i in range(12)]
        ax.plot_trisurf(x, y, z, triangles=triangles, cmap='Blues', linewidth=0.1, shade=True)
        if saveFileName != "":
            fig.savefig(saveFileName + '.pdf')
        else:
            plt.show()

    def getTonnetz3D(self):
        nCols = 4
        nRows = 4
        x = []
        y = []
        z = []
        for i in range(nRows):
            for j in range(nCols):
                x.append(1 + 2 * j + i)
                y.append(nRows - i)
        self.triangles = [[0,4,1],[1,5,2],[2,6,3],[3,7,0],
                          [1,4,5],[2,5,6],[3,6,7],[0,7,4],
                          [4,8,5],[5,9,6],[6,10,7],[7,11,4],
                          [5,8,9],[6,9,10],[7,10,11],[4,11,8],
                          [8,0,9],[9,1,10],[10,2,11],[11,3,8],
                          [9,0,1],[10,1,2],[11,2,3],[8,3,0]]
        triangles = np.asarray(self.triangles)
        # print triangles
        noteIndex = self.left % 12  # first "previous" point
        for i in range(3):  # this sequence depends on x and y creation order
            for j in range(4):
                noteIndex = (noteIndex - self.left) % 12
                z.append(self.weights[noteIndex])
            noteIndex = (noteIndex + self.lowRight + 4 * self.left) % 12
        return [x,y,z]


def max_vertex(s, vertices):
    values = [vertices[v] for v in s.vertices]
    if len(values) > 0:
        return max(values)
    else:
        return 0


def max_vertex_cmp(s1, s2, vertices):
    m1 = max_vertex(s1, vertices)
    m2 = max_vertex(s2, vertices)
    return cmp(m1, m2) or cmp(s1.dimension(), s2.dimension())


#--------------------------------------------#
#------------------- MAIN -------------------#
#--------------------------------------------#

midi_dir = "sibeliusnodrums"
distance = 1
holes_under = 1  # 0 to disable, 1 to remove z below 1/10, 2 to remove z below 2/10...
# Name of CSV files
weights_csv_name = "weights_" + midi_dir + ".csv"
distances_csv_name = "distances_" + midi_dir + "_z.csv"

# Outputting weights to CSV
header = ['Name', 'Coords']
with open(os.path.join(weights_csv_name), 'wb') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=';')
    writer.writeheader()
myfile = open(weights_csv_name, 'ab')
wr = csv.writer(myfile, delimiter=";")

# --- MIDI ---
dir_path = "./" + midi_dir + "/"
paths_list = [dir_path + f for f in listdir(dir_path) if isfile(join(dir_path, f))]
# --- end MIDI ---

# --- Music21 Corpus ---
#coreCorpus = corpus.CoreCorpus()
#paths_list = [path for path in coreCorpus.getPaths()]
# --- end Music21 Corpus ---

dgms = []
names = []
n = 1
print "Writing weights ..."
for path in paths_list:  # [0:5]:
    # --- txt ---
    name = path.replace(dir_path, "")
    if not name.endswith(".txt"):
        continue
    print "iteration %d on element %s" % (n, name)
    path = path.replace(".txt", "")
    name = name.replace(".txt", "")
    n += 1
    # --- end txt ---
    print
    names.append(name)
    n += 1
    tonn = Tonnetz(3, 4, 5)
    # tonn.computeDataFrom(piece, True)
    considerDurations = tonn.loadDataFromTxt(path + ".txt")
    tonn.draw3D(saveFileName="")  # path)
    row = name, tonn.getTonnetz3D()
    wr.writerow(row)
    result = row[1]
    maxZ = max(result[2])
    points = []
    vertices = []
    simplices = Filtration()
    for i in range(12):
        vertices.append(10 * float(result[2][i]) / maxZ)
    for i in range(24):
        if (vertices[tonn.triangles[i][0]] > holes_under and
                vertices[tonn.triangles[i][1]] > holes_under and
                vertices[tonn.triangles[i][2]] > holes_under):
            simplices.append(Simplex([tonn.triangles[i][0]]))
            simplices.append(Simplex([tonn.triangles[i][1]]))
            simplices.append(Simplex([tonn.triangles[i][2]]))
            simplices.append(Simplex([tonn.triangles[i][0], tonn.triangles[i][1]]))
            simplices.append(Simplex([tonn.triangles[i][1], tonn.triangles[i][2]]))
            simplices.append(Simplex([tonn.triangles[i][2], tonn.triangles[i][0]]))
            simplices.append(Simplex(tonn.triangles[i]))
    simplices.sort(lambda x, y: max_vertex_cmp(x, y, vertices))
    print "Complex in the filtration order: ", ', '.join((str(s) for s in simplices))
    print
    p = StaticPersistence(simplices)
    p.pair_simplices()
    # Output the persistence diagram
    smap = p.make_simplex_map(simplices)
    dgm = init_diagrams(p, simplices, lambda s: max(vertices[v] for v in s.vertices))
    print "Diagram:"
    print dgm
    # drawPersistenceDiagram(dgm)
    print
    dgms.append(dgm)

print "Writing distances ..."
header = ['Name1', 'Name2', 'Distance']
with open(os.path.join(distances_csv_name), 'wb') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=';')
    writer.writeheader()
with open(distances_csv_name, 'ab') as csvfile:
    wr = csv.writer(csvfile, delimiter=";")
    for i in range(len(dgms)):
        for j in range(i+1, len(dgms)):
            if i % 100 == 0 and j == i+1:
                print i, "/", len(dgms)
            try:
                bott_dist = bottleneck_distance(dgms[i][distance], dgms[j][distance])
            except:
                bott_dist = 'undefined'
            result = names[i], names[j], bott_dist
            print result
            wr.writerow(result)
print "Finished."

C.4 Persistent time series - pairwise bottleneck distance

import os
import sys
import numpy as np
from music21 import *
from dionysus import *
from os import listdir
from os.path import isfile, join
import matplotlib.pyplot as plt
import matplotlib as mpl
import csv
from mpl_toolkits.mplot3d import axes3d
import mpl_toolkits.mplot3d.axes3d as p3


class Tonnetz:
    def __init__(self, left, lowRight, upRight):
        self.left = left
        self.lowRight = lowRight
        self.upRight = upRight
        self.weights = {}
        self.triangles = []
        self.numberOfFragments = 0

    notes = ['C', 'C#', 'D', 'Eb', 'E', 'F', 'F#', 'G', 'G#', 'A', 'Bb', 'B']

    def computeDataFrom(self, piece, considerDurations=True, tsNumberOfMeasures=1):
        def extractFromChord(c):
            values = []
            value = None
            try:
                value = c.pitchClass
            except AttributeError:
                pass  # do not try others
            if value is not None:
                values.append(value)
            if values == []:
                for p in c.pitches:  # try to get values from pitch first, then chord
                    value = None
                    try:
                        value = p.pitchClass
                    except AttributeError:
                        break  # do not try others
                    if value is not None:
                        values.append(value)
            return values

        self.weights = {}
        flat = piece.flat.getElementsByClass([note.Note, chord.Chord])
        fragmentNum = 0;
        self.weights[0] = {}
        for i in range(12):
            self.weights[0][i] = 0
        for obj in flat:
            fragmentNum = int(obj.offset / (4 * tsNumberOfMeasures))
            if (len(self.weights) <= fragmentNum):
                for i in range(len(self.weights), fragmentNum + 1):
                    self.weights[i] = {}
                    for j in range(12):
                        self.weights[i][j] = self.weights[i - 1][j]
            if 'Chord' in obj.classes:
                values = extractFromChord(obj)
            else:  # simulate a list
                values = [obj.pitch.pitchClass]
            for i, value in enumerate(values):
                if considerDurations:
                    self.weights[fragmentNum][value] += obj.duration.quarterLength
                else:
                    self.weights[fragmentNum][value] += 1
        self.numberOfFragments = fragmentNum + 1

    def getTonnetz3D(self, fragmentNum=0):
        nCols = 4
        nRows = 3
        x = []
        y = []
        z = []
        for i in range(nRows):
            for j in range(nCols):
                x.append(1 + 2 * j + i)
                y.append(nRows - i)
        self.triangles = [[0,4,1], [1,5,2], [2,6,3], [3,7,0],
                          [1,4,5], [2,5,6], [3,6,7], [0,7,4],
                          [4,8,5], [5,9,6], [6,10,7], [7,11,4],
                          [5,8,9], [6,9,10], [7,10,11], [4,11,8],
                          [8,0,9], [9,1,10], [10,2,11], [11,3,8],
                          [9,0,1], [10,1,2], [11,2,3], [8,3,0]]
        triangles = np.asarray(self.triangles)
        # print triangles
        noteIndex = self.left % 12  # first "previous" point
        for i in range(3):  # this sequence depends on x and y creation order
            for j in range(4):
                noteIndex = (noteIndex - self.left) % 12
                z.append(self.weights[fragmentNum][noteIndex])
            noteIndex = (noteIndex + self.lowRight + 4 * self.left) % 12
        return [x, y, z]


def drawPersistenceDiagram(dgm, name):
    for dim in range(2):
        plt.figure()
        plt.gca().set_aspect('equal')
        plt.title("Persistence Diagram")
        maximum = 0
        try:
            points = [i for i in dgm[dim]]
        except:
            continue
        x = []
        y = []
        for point in points:
            if (np.isinf(point[1])):
                x.append(point[0])
                y.append(point[0])
                plt.plot([point[0], point[0]], [point[0], 100], 'r')
                maximum = max(maximum, point[0])
            else:
                x.append(point[0])
                y.append(point[1])
                maximum = max(maximum, point[0], point[1])
        plt.axis([0, maximum+1, 0, maximum+1])  # margins
        plt.plot(x, y, 'ro')
        # plt.show()
        plt.savefig(name + '_diagram' + str(dim) + '.pdf')
        plt.close()


def max_vertex(s, vertices):
    values = [vertices[v] for v in s.vertices]
    if len(values) > 0:
        return max(values)
    else:
        return 0


def max_vertex_cmp(s1, s2, vertices):
    m1 = max_vertex(s1, vertices)
    m2 = max_vertex(s2, vertices)
    return cmp(m1, m2) or cmp(s1.dimension(), s2.dimension())


#--------------------------------------------#
#------------------- MAIN -------------------#
#--------------------------------------------#

midi_dirs = ["midi_new_classic_mini", "midi_new_jazz_mini",
             "midi_new_pop_mini", "midi_new_pop_minimini"]
tsMeasures = 2  # Number of measures (in time series case)
# ----------------------------------------------------------------
holes_under = 0  # 0 to disable, 1 to remove z below 1/10, 2 to remove z below 2/10...

for midi_dir in midi_dirs:
    print "Processing dir: ", midi_dir

    # --- MIDI ---
    dir_path = "./" + midi_dir + "/"
    paths_list = [dir_path + f for f in listdir(dir_path) if isfile(join(dir_path, f))]
    # --- end MIDI ---

    # --- Music21 Corpus ---
    #coreCorpus = corpus.CoreCorpus()
    #paths_list = [path for path in coreCorpus.getPaths()]
    # --- end Music21 Corpus ---

    dgms = {}
    names = []
    n = 1
    for path in paths_list:  # [0:5]:
        # #--- MIDI ---
        name = path.replace(dir_path, "")
        try:
            piece = converter.parse(path)
        except:
            continue
        # #--- end MIDI ---
        names.append(name)
        dgms[len(dgms)] = []
        n += 1
        tonn = Tonnetz(3, 4, 5)
        tonn.computeDataFrom(piece, considerDurations=True,
                             tsNumberOfMeasures=tsMeasures)
        maxZ = max(tonn.getTonnetz3D(fragmentNum=tonn.numberOfFragments - 1)[2])
        print name, "Number of fragments: ", tonn.numberOfFragments
        for j in range(tonn.numberOfFragments):
            row = name, tonn.getTonnetz3D(fragmentNum=j)
            result = row[1]
            points = []
            vertices = []
            simplices = Filtration()
            for i in range(12):
                vertices.append(10 * float(result[2][i]) / maxZ)
            for i in range(24):
                if (vertices[tonn.triangles[i][0]] > holes_under and
                        vertices[tonn.triangles[i][1]] > holes_under and
                        vertices[tonn.triangles[i][2]] > holes_under):
                    simplices.append(Simplex([tonn.triangles[i][0]]))
                    simplices.append(Simplex([tonn.triangles[i][1]]))
                    simplices.append(Simplex([tonn.triangles[i][2]]))
                    simplices.append(Simplex([tonn.triangles[i][0], tonn.triangles[i][1]]))
                    simplices.append(Simplex([tonn.triangles[i][1], tonn.triangles[i][2]]))
                    simplices.append(Simplex([tonn.triangles[i][2], tonn.triangles[i][0]]))
                    simplices.append(Simplex(tonn.triangles[i]))
            simplices.sort(lambda x, y: max_vertex_cmp(x, y, vertices))
            p = StaticPersistence(simplices)
            p.pair_simplices()
            smap = p.make_simplex_map(simplices)
            dgm = init_diagrams(p, simplices,
                                lambda s: max(vertices[v] for v in s.vertices))
            drawPersistenceDiagram(dgm, "./" + midi_dir + "/" + name +
                                   str(tsMeasures) + "_no_holes_frag" + str(j))
            dgms[len(dgms) - 1].append(dgm)

    distance = 0
    # Name of CSV files
    distances_csv_name = ("./distances_no_holes_dim0_torus_ts_" + str(tsMeasures) +
                          "meas/distances_" + midi_dir + "_z.csv")
    print "Writing distances ..."
    header = ['Name1', 'Name2', 'Distance']
    with open(os.path.join(distances_csv_name), 'wb') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=';')
        writer.writeheader()
    with open(distances_csv_name, 'ab') as csvfile:
        wr = csv.writer(csvfile, delimiter=";")
        for i in range(len(dgms)):
            print names[i], len(dgms[i])
            for j in range(i+1, len(dgms)):
                # Pairwise distances
                distances = []
                for ii in range(len(dgms[i])):
                    for jj in range(len(dgms[j])):
                        try:
                            bott_dist = bottleneck_distance(dgms[i][ii][distance],
                                                            dgms[j][jj][distance])
                        except:
                            bott_dist = 'undefined'
                        distances.append(bott_dist)
                result = names[i], names[j], distances
                wr.writerow(result)

    distance = 1
    # Name of CSV files
    distances_csv_name = ("./distances_no_holes_dim1_torus_ts_" + str(tsMeasures) +
                          "meas/distances_" + midi_dir + "_z.csv")
    print "Writing distances ..."
    header = ['Name1', 'Name2', 'Distance']
    with open(os.path.join(distances_csv_name), 'wb') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=';')
        writer.writeheader()
    with open(distances_csv_name, 'ab') as csvfile:
        wr = csv.writer(csvfile, delimiter=";")
        for i in range(len(dgms)):
            for j in range(i+1, len(dgms)):
                # Pairwise distances
                distances = []
                for ii in range(len(dgms[i])):
                    for jj in range(len(dgms[j])):
                        try:
                            bott_dist = bottleneck_distance(dgms[i][ii][distance],
                                                            dgms[j][jj][distance])
                        except:
                            bott_dist = 'undefined'
                        distances.append(bott_dist)
                result = names[i], names[j], distances
                wr.writerow(result)

print "Finished."

D Scores

Here we report the minimal collection of scores that we believe essential for understanding the musical applications contained in this work. The author transcribed some of the pieces contained in this chapter, while both versions of Caravan and the fragments ([1]interlude and [3]intro) of All the Things You Are correspond to the MIDI files freely downloadable at http://www.midiworld.com/ and http://midkar.com/.
[Score: All the Things You Are - Jerome Kern (lead sheet)]

[Score: All the Things You Are, fragment [1]interlude]

[Score: All the Things You Are, fragment [3]intro]

[Score: Interplay - Bill Evans]

[Score: Time - Hans Zimmer]

[Score: Caravan_md - bars 1 - 50 (caravan-duke-milesdavis_oz). Parts: Vibraphone, Tenor Saxophone, Acoustic Bass, Synth Brass, Solo]

[Score: Caravan_js - bars 1 - 50. Parts: Acoustic Guitar, Electric Guitar, Pedal Steel Guitar, Kora, Acoustic Bass, Rock Organ]
œ œ œ œ r œœ œ œ œ œ œ œ ## œœ œœ œœœ œœ œj œœ œ ™ œœ œ œ ## œœœ œœ œ œœ œ œ ≈ ≈ ≈ ≈ ≈‰ ≈ œ œ ≈ œR œ œ ≈ œ œ œ ≈ ‰ œ #œ œ œ œ œ œ œ™ œ œ #œ ≈ œR œ‰ œJ œ Rœ™ œ œ œ œ™ œœ œ ≈ œ ™ #œ œ œ œ™ œ œ œ ≈ œ ™ #œ œ œ œ™ Organ {& w ˙™ Œ Organ w {& ˙™ Œ Organ { & nw ˙™ Œ Organ {& w ˙™ Œ ‰ œœœœ œœ œ œ œ‰ J œ œœ ‰ œ œ œ. œ b œ œ J R ≈#œJ œ ≈ R œ ‰ bœ #œ ≈ œœœ. œ. ≈ œR œ œ J R J bw ‰ #œ bœ ≈ j œ. œ. ≈ œ œ J R œœ R œ w ‰ œ œ ≈ j œ œ ≈ bœR œ œ J R #œ œ. w #w E Modern Chord Notation As it has been discussed in chapter 1 the lead sheet notation is widely used in modern music. We recall it consists of a melody written in classical notation supported by chords denoted as symbols. In tables E.1 and E.2 we list the most common symbols using C as root to build examples for triads and seventh chords, both including their chord tones extensions. A chord is said to be altered if a pitch not belonging diatonically to the chord is added, such as C7♭9 = (C, E, G, B♭, D♭). Where used, these chords will be described explicitly. Table E.1: Modern triad notation. Name Major Minor Augmented Diminished Suspended 2 Suspended 4 Major Added 9 Minor Added 9 Major Added 6 Notation C Cm Caug Cdim Csus2 Csus4 Cadd9 Cmadd9 Cadd6 Arpeggio (C, E, G) (C, E♭, G) (C, E, G♯) (C, E♭, G♭) (C, D, G) (C, F, G) (C, E, G, D) (C, E, G, D) (C, E, G, A) Table E.2: Modern seventh chords notation. Name Major 7 Minor 7 Dominant Minor Major 7 Major 7♯5 Augmented Dominant Half-Diminished Diminished Six Nine Six-Nine Minor Nine Notation C∆ Cm7 C7 Cm∆ C+∆ C+7 C∅ C◦ C6 C9 C6/9 Cm9 291 Arpeggio (C, E, G, B) (C, E♭, G, B♭) (C, E, G, B♭) (C, E♭, G, B) (C, E, G♯, B) (C, E, G♯, B♭) (C, E♭, G♭, B♭) (C, E♭, G♭, B♭♭) (C, E, G, B♭, A) (C, E, G, B♭, D) (C, E, G, B♭, A, D) (C, E♭, G, B♭, D) List of Figures 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11 1.12 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 Intuitive representation of monody and polyphony. (a) Monody is intended as a melodic line supported by a harmonic progression. 
(b) The polyphonic approach allows the creation of superpositions of independent melodic strands that affect the listener both as a whole and as separate entities.
1.2 An example of lead sheet.
1.3 Chord symbols are substituted by mode names. The Law of Diminishing Returns - Alan Pasqua. Solos part B.
1.4 Example of polyphony from Musica Enchiriadis. Transcription from (Taruskin, 2009, Chapter 2).
1.5 Melody voicing.
1.6 Two different harmonizations of Jerusalem. Guido d’Arezzo, Micrologus.
1.7 Independent voice leading and contrary motion. A fragment of Alleluia: Angelus Domini - Chartres 109, fol. 75.
1.8 Polyphonic Jazz standard.
1.9 Voices’ independence.
1.10 A reduced orchestration of Boplicity, bars 1-4. Birth of the Cool, by Miles Davis.
1.11 Alto sax, baritone sax, trumpet and horn voices in Move, bars 1-11, by Miles Davis.
1.12 Simultaneous motions of voices.
2.1 Gluing diagrams.
2.2 Simplices.
2.3 Star and link.
2.4 The linear space of pitches and the space of pitch classes.
2.5 The space of three notes chords.
2.6 The billiard table orbifold.
2.7 The Euler Tonnetz.
Two pitch classes are connected by an edge if they form a consonant interval. The horizontal arrow (PV) links two pitch classes a perfect fifth apart, while the two pitch classes connected by the vertical arrow (MIII) form a major third interval.
2.8 The spiral array.
2.9 A planar infinite Tonnetz.
2.10 Gluing diagram of the Tonnetz torus.
2.11 Simple shapes and four notes chords.
2.12 Extended shapes on the Tonnetz. Two different modes are represented by the same extended shape.
3.1 Voice leading and corresponding piecewise geodesic path.
3.2 Voice leadings representation in R2.
3.3 Voice leadings visualization in T2 and A2.
3.4 Alleluia, Angelus Domini, Chartres fragment n. 109, fol. 75.
3.5 Voice leadings’ complexity as a point cloud 1.
3.6 Dicant nunc Judei, Chartres fragment.
3.7 Voice leadings’ complexity as a point cloud 2.
3.8 Reduction of rhythmically independent voices to a counterpoint of the first species.
3.9 The Retrograde Canon.
3.10 Voice leadings’ complexity as a point cloud 3.
3.11 Dynamic Time Warping between two series of observations.
3.12 Optimal warping path on Alleluia, Angelus Domini and Dicant nunc Judei.
4.1 Concatenation of braids.
4.2 Graphical representation of the braid properties in Equations (4.1.1) and (4.1.2).
4.3 A partial braid β ∈ IB5.
4.4 Concatenation of partial braids in IB5.
4.5 Singular generator of SBn.
4.6 Partial singular braid representation of voice leadings.
4.7 Partial singular braid representation of voices leaps.
4.8 Partial braids inducing the same partial permutation.
4.9 The partial singular braid representation of a voice leading defined in R/12Z.
4.10 Voice leadings as braids on the cylinder.
4.11 Concatenation of pitch and pitch-class partial singular braids. The observation of a single strand, or of the whole voice leading (regions 1, ..., 7), provides an intuitive representation of both the motions of pairs of voices (similar, parallel, oblique, contrary) and of the behaviour of each voice (downward, upward and fixed). The length of a crossing is easily measurable, as are more complicated phenomena such as the overlap (see Section 1.2).
5.1 Melodic and harmonic intervals. Pairs of consecutive bars represent different musical entities from a melodic and a harmonic viewpoint, respectively.
6.1 The musical concept evolution in Time by Hans Zimmer. The first bar represents the musical idea that opens the composition. The following bars depict consecutive evolutions of the first concept.
6.2 Displacement of the pitch-class space’s vertices.
6.3 Deformed geometries generated from the Tonnetz.
A portion of the planar Tonnetz is represented on the plane z = 0.
6.4 Visualization of the Tonnetz simplicial structure.
6.5 A vertex map from the fundamental domain of the Tonnetz to the Tonnetz torus. The red and blue lines correspond to the two generators of the torus, given by the translation (transposition) of 3 and 4 half-steps, respectively.
6.6 Preferred pitch-class set.
6.7 Preferred subcomplexes.
6.8 Weighted preferred subcomplexes of T.
7.1 The boundary of a 3 and a 2-simplex.
7.2 Representation of a chain complex associated to a 3-dimensional simplicial complex.
7.3 2-dimensional simplicial complex.
7.4 Reduced n-th boundary matrix.
7.5 Filtration and persistence diagram of a manuscript note.
7.6 Sub-level sets.
7.7 Persistence of a homological class.
7.8 Persistence barcodes and persistence diagrams.
7.9 Corner points matching.
7.10 2-dimensional simplicial complex.
7.11 Reduction of the persistent boundary matrix to normal form.
8.1 Lower star filtration of a simplicial complex.
8.2 Critical points on a simplicial complex.
8.3 Sub-levels on the Tonnetz.
8.4 Musical interpretation of the Tonnetz topological persistence.
8.5 Smooth representation of critical points on the deformed Tonnetz.
8.6 Dendrogram representation of data dissimilarity. The structure of the 2-dimensional point cloud consists of two distinct groups and two outliers. The dendrogram reflects this structure by representing the two groups as separate clusters and joining the outliers to the clusters according to their position relative to the configuration of the point cloud.
8.7 Persistence-based clustering of nine classical and contemporary pieces.
8.8 Comparing three different versions of All the Things You Are.
8.9 Pop clustering.
8.10 H1 persistence-based clustering of nine classical and contemporary pieces.
8.11 Comparing three different versions of All the Things You Are using 1-dimensional persistence.
8.12 A simplified version of the clustering of 58 pop songs generated from their 1-persistence diagrams.
9.1 A Tonnetz deformed through a signal-based height function.
9.2 Consonance function.
9.3 Consonance function on an octave.
9.4 Deformation of a portion of the Tonnetz. The reference note used to displace its vertices is C3. The labels associated to the Tonnetz’s vertices correspond to the chromatic scale built on the fourth and the fifth octave of the piano.
9.5 Variations of the Tonnetz’s geometry on three octaves.
9.6 Modes clustering.
9.7 Hierarchical clustering of the 21 modes of Table A.1.
9.8 Octave dependency of the harmonic-oriented modes clustering.
9.9 Consonance-based distance matrices for triads.
9.10 Hierarchical structure of triads’ consonance. In the first row it is possible to observe how the consonance classifies triads according to their classes, by using two different harmonic spectra. In the second row the inversions of the major triads are classified according to their consonance value, computed with h1 and h2, respectively.
9.11 Consonance height function.
9.12 Visualisation of the curvature for planar curves and surfaces.
9.13 The elliptic paraboloid (a) and the hyperbolic paraboloid (b).
9.14 Discrete Gaussian curvature on deformed Tonnetze.
9.15 Gaussian curvature trends.
9.16 Hierarchical clustering of consonance-deformed Tonnetze generated by triads and two harmonic spectra: (1, 1, 1, 1, 1, 1) on the left column and (1, 1/2, 1/3, 1/4, 1/5, 1/6) on the right.
10.1 Chromagrams.
11.1 Example of global alignment between two (apparently) lowly-related sequences. Exact matches are identified by (|) and related matches are identified by (:). Even though the symbols in both sequences are quite different, most of these are actually closely related in their functions, which implies that the sequences share a high amount of similarity.
11.2 The use of different grammars (symbolic information) and different weighting matrices can lead to dramatically different results in the final alignments and similarities between the sets of sequences.
11.3 Multiple sequence alignment of 3 sequences through dynamic programming. (a) Given a set of 3 sequences to align, (b) we can construct a 3-dimensional matrix in which (c) each cell defines 7 different paths. (d) Following the same procedure as pairwise alignment, we can find the optimal (e) multiple sequence alignment. (f) An interesting property is that we can project the multidimensional path on bi-dimensional planes to obtain pairwise alignments between any sequence of the set.
11.4 Summary of the centre star algorithm.
11.5 Summary of the progressive alignment algorithm. (a) The similarity matrix is computed based on pairwise alignments. (b) The guide tree is obtained from this matrix. (c) By going up the tree, each node generates a specific alignment, between subsets of sequences. (d) When the root of the tree is reached, we obtain the set of multiple alignments.
11.6 Possible representations of the consensus sequences.
11.7 From chords to symbols. (a) In a lead sheet, the standard chord notation is substituted by symbols. (b) The triad harmonisation of the diatonic scale of C and its seven degrees.
11.8 In the circle of fifths major (and relative minor) tonalities are organized in relationship to the altered notes they contain. Two tonalities a step apart differ by a single note. The only exception is represented by the tonalities of C♯ and C♭, which are separated by a thick line. The bold letters surrounding the circle correspond to the alphabet used to build the tonality class of sequences.
11.9 Two weighting matrices expressing the similarity between degrees of a tonality (left) and semiotic labelling (right). The former is computed considering the distances of chords in the spiral array, the latter is deduced from the similarity of the block retrieved by the semiotic segmentation of music.
11.10 Dendrogram obtained by evaluating the dissimilarity among 19 songs of Quaero and 3 Beatles’ covers contained in the original set.
11.11 Evaluation of several harmonic-oriented clusterings in relation to a genre recognition task. Different clusterings are represented as colored spheres of variable radius in the space. The colour represents the alignment algorithm used to obtain the clustering. The size of the spheres corresponds to the 1-NN accuracy of the clustering, while the height of the spheres depends on the weighting matrix used to generate the clustering. On the cluster precision/cluster recall plane (z = 0), the projection of each sphere is depicted as a cross.
11.12 Two possible clusterings. Each cluster has been labeled coherently with the genre represented by its objects. Clusters whose objects do not share a similar genre are labelled as Mixed. Big clusters have been labelled according to their subgroups. Finally, the cluster named Beatles for Sale in (b) owes its name to the presence of a neat group of songs belonging to this album.
11.13 Interaction between the semiotic segmentation and the harmonic-based sequences. (a) The polar dendrogram representing the hierarchical organisation of the semiotic sequences aligned with the NW algorithm and the semiotic weighting matrix. Clusters are genre-wise labeled. Mixed clusters correspond to incoherent groupings in terms of genre.
(b) Reorganisation of the Pop Rock and Hip Hop clusters of (a) through the alignment given by the combination (Degrees, alternate, NW). The new dissimilarity measure has been computed cluster-wise, enhancing the genre retrieval obtained by the semiotic approach.
11.14 Reference-free methods are represented in three different barplots according to their order of magnitude.
11.15 The polar dendrogram constituting the centre of the figure is the clustering obtained considering sequences of the class Tonality, aligned with the NW algorithm and the binary weighting matrix. The radial segments represent the result of the multiple sequence alignment. Recurrent modulation patterns have been highlighted as coloured segments. Finally, the consensus of the most relevant motifs has been depicted for each cluster. For the sake of simplicity, the consensus sequences are composed only of capital and lowercase letters, representing natural and flat tonalities, respectively (the symbol C denotes the tonality of C major, while c denotes the tonality of C♭ major).
12.1 Homotopy between the functions f, g : X → Y. The values t ∈ [0, 1] can be interpreted as time, thus H(x, t) describes the continuous deformation that transforms f into g.
12.2 An example of vineyard.
12.3 The first six observations of the 0-persistence time series. Klavierstück I - Schönberg. Persistence snapshots are taken every 8 bars.
12.4 Consecutive observations of a 1-persistence time series. Klavierstück I - Schönberg. Persistence snapshots taken at constant relative time intervals of 8 bars.
12.5 Accumulated cost matrices and optimal warping paths between 0-persistence time series.
12.6 Dynamic time warping between persistence time-series associated to two compositions A and B. Observations are labelled according to a 4-bar windowing.
12.7 Optimal warping path between two versions of Caravan. The positions of the gaps correspond to the solo parts of the longer version (frames 25-50 and 51-65, respectively).
12.8 Alignment score of 0-persistence time series for different datasets and variable windowing. Both the colour and the size of the circles associated to each pair of pieces depend on their alignment score.
14.1 The partial permutation matrices give a low-dimensional representation of the features of each voice leading. Here, they are used to feed a harmonic conditional restricted Boltzmann machine. The lateral connections in the visible layer are used to retrieve the harmonic structure of chords. Past events are taken into account thanks to the autoregressive connections between the current and past units.
14.2 Trefoil knot. Identifying the domain and co-domain of a braid b ∈ Bn produces a closed braid. In particular, any knot can be represented as a closed braid (Alexander, 1923).
14.3 Visualisation of different compositional styles as sub-level sets of the height function (light grey area). The displacement of vertices is given by the duration and pitch classes of notes and chords.
14.4 Multidimensional persistence. A 2-dimensional filtration, whose parameters are the discrete Gaussian curvature κ and the height function y.
Persistent homology can be applied on each filtration obtained by fixing one of the two parameters.
14.5 Configuration of the tensions (circles) and resolutions (squares) on the consonance-deformed Tonnetz obtained by considering block voicings of a major chord and the chromatic scale built an octave higher than the root note of the triad.
14.6 Gravity on the deformed Tonnetz. Masses move following the deformation of the surface. The pitches or pitch classes lying in a neighbourhood of the trajectories can be used to generate melodic lines.
14.7 Dendrogram chasing. For each branch of the dendrogram it is possible to build a consensus sequence that describes the similarity between the sequences of the cluster once they have been aligned.
14.8 Static classification between three Chet Baker themes and improvisations and a version of Blue Bossa. Two solos by the same author are grouped together, while the bass solo of Blue Bossa is linked to the theme of Summertime at a high distance.
A.1 A graph built assuming that the modal choices on a base-chord B = {b1, b2, b3, b4} are given by two tension-triads T = {t1, t2, t3} and T = {t1, t2, t3}.
A.2 The graph associated to the diminished seventh chords, Γ◦7.
A.3 The graph associated to major seven sharp five chords, Γmaj7♯5.
A.4 The graph associated to minor major seventh chords, Γ−maj7.
A.5 The graph associated to major seven chords, Γmaj7.
A.6 The graph associated to dominant chords, Γ7.
A.7 The graph associated to minor seven chords, Γ−7.
A.8 The graph associated to minor seven flat five chords, Γ−7♭5.

List of Tables

3.1 Complexity vectors of the analysed fragments and their occurrences.
3.2 Complexity vectors of the Retrograde Canon and their occurrences.
3.3 DTW distance matrix for the three time series of complexity vectors.
9.1 Names of the studied triads and their corresponding representative pitch-class set.
9.2 The sign of the discrete Gaussian curvature characterises each vertex by considering its interaction with its star. Here it is possible to compare the curvature values associated to each pitch, in the six classes of triads that we analysed.
12.1 Summary of the compositions of the classical music dataset.
12.2 Summary of the compositions of the jazz dataset.
A.1 The 21 modes derived from the major, melodic minor and harmonic minor scale. Examples have been built on the C major, melodic minor and harmonic minor scale, respectively.
A.2 Seventh chord harmonisations.
A.3 Modes as a superposition of two chords.
A.4 Modal scales associated to a fixed base-chord.
C.1 Pitches - Key association.
E.1 Modern triad notation.
E.2 Modern seventh chords notation.

List of algorithms

3.1 Computing the partial permutation matrix.
7.1 Boundary Matrix Reduction.
7.2 Persistence Algorithm.
12.1 Optimal warping path.
App2/code/python/persistence_code.py
App2/code/javascript/deformed_tonnetz_int_sound_pers.html
App2/code/python/tonnetz_z_torus.py
App2/code/python/Persistent_TS.py

Bibliography

Abrams, A. and Ghrist, R. (2002). Finding topology in a factory: configuration spaces. American Mathematical Monthly, pages 140–150.
Adcock, A., Rubin, D., and Carlsson, G. (2014). Classification of hepatic lesions using the matching metric. Computer vision and image understanding, 121:36–42.
Ahola, V., Aittokallio, T., Vihinen, M., and Uusipaikka, E. (2006). A statistical score for assessing the quality of multiple sequence alignments. BMC bioinformatics, 7(1):484.
Aldwell, E., Schachter, C., and Cadwallader, A. (2010). Harmony and voice leading. Cengage Learning.
Alexander, J. W. (1923). A lemma on systems of knotted curves. Proceedings of the National Academy of Sciences of the United States of America, 9(3):93.
Alexander, J. W. (1928). Topological invariants of knots and links. Transactions of the American Mathematical Society, 30(2):275–306.
Andreatta, M. (2003). Méthodes algébriques en musique et musicologie du XXe siecle: aspects théoriques, analytiques et compositionnels. PhD thesis, École des Hautes Etudes en Sciences Sociales.
Apel, W. (1958). Gregorian chant, volume 601. Indiana University Press.
Armougom, F., Moretti, S., Keduas, V., and Notredame, C. (2006). The iRMSD: a local measure of sequence alignment accuracy using structural information. Bioinformatics, 22(14):e35–e39.
Aucouturier, J.-J., Pachet, F., and Sandler, M. (2005). "The way it Sounds": timbre models for analysis and retrieval of music signals. Multimedia, IEEE Transactions on, 7(6):1028–1035.
Bailey, T. L., Williams, N., Misleh, C., and Li, W. W. (2006). MEME: discovering and analyzing DNA and protein sequence motifs.
Nucleic acids research, 34(suppl 2):W369–W373.
Barona, M. E. A. (2014). The fender rhodes.
Basil, S. (1963). Exegetic homilies, volume 46. Catholic Univ of Amer Pr.
Bergomi, M. G. (2015). (Talk). Dynamics in Modern Music Analysis. XXIst Oporto Meeting on Geometry, Topology and Physics. Applications of Topology.
Bergomi, M. G. and Andreatta, M. (2015). Math’n pop versus math’n folk? a computational (ethno) musicological approach. Folk Music Analysis.
Bergomi, M. G., Andreatta, M., and Fabbri, F. (2015). Hey Maths! Modèles formels et computationnels au service des Beatles. Volume! (preprint).
Bergomi, M. G. and Geravini, S. (2012). I Modi delle Scale. Casa Musicale Eco.
Bergomi, M. G., Jadanza, R. D., and Portaluri, A. (2014a). Modelli geometrici e dinamici per spazi musicali. In Ferrara, F., Giacardi, L. M., and Mosca, M., editors, Conferenze e Seminari dell’Associazione Subalpina Mathesis 2013–2014, chapter Le Conferenze, pages 179–196. Kim Williams Books, Torino, Italy.
Bergomi, M. G., Jadanza, R. D., and Portaluri, A. (2014b). Una geometrizzazione dello spazio degli accordi. Ithaca, (3):33–46.
Bergomi, M. G. and Portaluri, A. (2013). Modes in modern music from a topological viewpoint. arXiv preprint arXiv:1309.0687.
Berndt, D. and Clifford, J. (1994). Using dynamic time warping to find patterns in time series. In AAAI-94 workshop on knowledge discovery in databases, pages 229–248.
Bigo, L. (2013). Spatial Computing for Symbolic Musical Representations. PhD thesis.
Bigo, L., Andreatta, M., Giavitto, J.-L., Michel, O., and Spicher, A. (2013). Computation and visualization of musical structures in chord-based simplicial complexes. In Mathematics and Computation in Music, pages 38–51. Springer.
Bimbot, F., Deruty, E., Sargent, G., and Vincent, E. (2012). Semiotic structure labeling of music pieces: concepts, methods and annotation conventions. In 13th International Society for Music Information Retrieval Conference (ISMIR).
Birkhoff, G. D. (1933).
Aesthetic measure. Cambridge, Mass.
Birman, J. S. (1974). Braids, links, and mapping class groups. Number 82. Princeton University Press.
Birman, J. S. (1993). New points of view in knot theory. Bulletin of the American Mathematical Society, 28(2):253–287.
Boland, M. and Link, J. (2012). Elliott Carter Studies. Cambridge University Press.
Brattico, E. and Pearce, M. (2013). The neuroaesthetics of music. Psychology of Aesthetics, Creativity, and the Arts, 7(1):48.
Brinkmann, R. (1969). Arnold Schönberg, drei Klavierstücke Op. 11: Studien zur frühen Atonalität bei Schönberg. Franz Steiner Verlag.
Bugge, E. P., Juncher, K. L., Mathiesen, B. S., and Simonsen, J. G. (2011). Using sequence alignment and voting to improve optical music recognition from multiple recognizers. In 12th International Society for Music Information Retrieval Conference, pages 405–410.
Buteau, C. and Mazzola, G. (2000). From contour similarity to motivic topologies. Musicae Scientiae, 4(2):125–149.
Cagliari, F., Di Fabio, B., and Ferri, M. (2010). One-dimensional reduction of multidimensional persistent homology. Proceedings of the American Mathematical Society, 138(8):3003–3017.
Callender, C., Quinn, I., and Tymoczko, D. (2008). Generalized Voice-Leading Spaces. Science, 320:346–348.
Carlsson, G. and Zomorodian, A. (2009). The theory of multidimensional persistence. Discrete & Computational Geometry, 42(1):71–93.
Carlsson, G., Zomorodian, A., Collins, A., and Guibas, L. J. (2005). Persistence barcodes for shapes. International Journal of Shape Modeling, 11(02):149–187.
Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C., and Slaney, M. (2008). Content-based music information retrieval: current directions and future challenges. Proceedings of the IEEE, 96(4):668–696.
Cerri, A., Fabio, B. D., Ferri, M., Frosini, P., and Landi, C. (2013). Betti numbers in multidimensional persistent homology are stable functions. Mathematical Methods in the Applied Sciences, 36(12):1543–1557.
Chanan, M. (1994). Musica practica: The social practice of Western music from Gregorian chant to postmodernism. Verso.
Chazal, F., Cohen-Steiner, D., Guibas, L. J., Mémoli, F., and Oudot, S. Y. (2009). Gromov-Hausdorff Stable Signatures for Shapes using Persistence. In Computer Graphics Forum, volume 28, pages 1393–1403. Wiley Online Library.
Chen, L. and Ng, R. (2004). On the marriage of Lp-norms and edit distance. In Proceedings of the Thirtieth international conference on Very large data bases - Volume 30, pages 792–803. VLDB Endowment.
Chew, E. (2002). The spiral array: An algorithm for determining key boundaries. In Music and artificial intelligence, pages 18–31. Springer.
Chung, M. K., Bubenik, P., and Kim, P. T. (2009). Persistence diagrams of cortical surface data. In Information Processing in Medical Imaging, pages 386–397. Springer.
Cohen-Steiner, D., Edelsbrunner, H., and Morozov, D. (2006). Vines and vineyards by updating persistence in linear time. In Proceedings of the twenty-second annual symposium on Computational geometry, pages 119–126. ACM.
Cohen-Steiner, D. and Morvan, J.-M. (2003). Restricted delaunay triangulations and normal cycle. In Proceedings of the nineteenth annual symposium on Computational geometry, pages 312–321. ACM.
Cohn, R. (2011). Audacious Euphony: Chromatic Harmony and the Triad’s Second Nature. Oxford University Press.
Crestel, L. (2015). Deep symbolic learning of multiple temporal granularities for musical orchestration.
d’Amico, M., Ferri, M., and Stanganelli, I. (2004). Qualitative Asymmetry Measure for Melanoma Detection. In ISBI, pages 1155–1158.
d’Amico, M., Frosini, P., and Landi, C. (2006). Using matching distance in size theory: A survey. International Journal of Imaging Systems and Technology, 16(5):154–161.
d’Arezzo, G., Colette, M.-N., and Jolivet, J.-C. (1993). Micrologus. Éd. IPMC.
Das, G., Gunopulos, D., and Mannila, H. (1997). Finding Similar Time Series.
In Principles of data mining and knowledge discovery: First European Symposium, PKDD’97, June 24-27, volume 1263, pages 88–100, Trondheim, Norway. Springer Verlag.
De Silva, V. and Ghrist, R. (2007). Coverage in sensor networks via persistent homology. Algebraic & Geometric Topology, 7(1):339–358.
Di Fabio, B. and Ferri, M. (2015). Comparing persistence diagrams through complex vectors. arXiv preprint arXiv:1505.01335.
Di Fabio, B. and Frosini, P. (2013). Filtrations induced by continuous functions. Topology and its Applications, 160(12):1413–1422.
Di Fabio, B. and Landi, C. (2011). A Mayer–Vietoris formula for persistent homology with an application to shape recognition in the presence of occlusions. Foundations of Computational Mathematics, 11(5):499–527.
Do, C. B., Mahabhashyam, M. S., Brudno, M., and Batzoglou, S. (2005). ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome research, 15(2):330–340.
Douthett, J. and Steinbach, P. (1998). Parsimonious graphs: A study in parsimony, contextual transformations, and modes of limited transposition. Journal of Music Theory, pages 241–263.
Dowling, W. J. (1972). Recognition of melodic transformations: Inversion, retrograde, and retrograde inversion. Perception & Psychophysics, 12(5):417–421.
Dudeque, N. (2005). Music theory and analysis in the writings of Arnold Schoenberg (1874-1951). Ashgate Publishing, Ltd.
Easdown, D. and Lavers, T. (2004). The inverse braid monoid. Advances in Mathematics, 186(2):438–455.
East, J. (2007). Braids and partial permutations. Advances in Mathematics, 213(1):440–461.
East, J. (2010). Singular braids and partial permutations. preprint.
Edelsbrunner, H. and Harer, J. (2008). Persistent homology-a survey. Contemporary mathematics, 453:257–282.
Edelsbrunner, H., Letscher, D., and Zomorodian, A. (2002). Topological persistence and simplification. Discrete and Computational Geometry, 28(4):511–533.
Edgar, R. C. (2004).
MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research, 32(5):1792–1797. Ellis, D. P. and Weller, A. V. (2010). The 2010 LABROSA chord recognition system. MIREX 2010. Erickson, R. and Palisca, C. V. (1995). Musica Enchiriadis And Scolica Enchiriadis. Yale University Press. Esling, P. and Agon, C. (2012). Time series data mining. ACM Computing Surveys, 45(1). Esling, P. and Bergomi, M. G. (2015). Multiple sequence alignment and the musical molecular clock hypothesis. ACM Trans Intell Syst Technol (submitted). Euler, L. (1739a). Tentamen novae theoriae musicae ex certissimis harmoniae principiis dilucide expositae. ex typographia Academiae scientiarum. Euler, L. (1739b). Tentamen novae theoriae musicae ex certissismis harmoniae principiis dilucide expositae. Saint Petersburg Academy. p. 147. Euler, L. (1774). De harmoniae veris principiis per speculum musicum repraesentatis. Opera Omnia, 3(1):568–586. Euler, M. (1766). Conjecture sur la raison de quelques dissonances generalement recues dans la musique. Everett, W. (2000). Expression in pop-rock music: a collection of critical and analytical essays, volume 2. Taylor & Francis. Fenn, R. and Keyman, E. (2000). Extended braids and links. Knots in Hellas, 98:229–251. Ferri, M., Frosini, P., and Landi, C. (2011). Stable Shape Comparison by Persistent Homology. Fletcher, H. (1940). Auditory patterns. Reviews of modern physics, 12(1):47. Folgieri, R., Bergomi, M. G., and Castellani, S. (2014). EEG-Based Brain-Computer Interface for Emotional Involvement in Games Through Music. In Digital Da Vinci, pages 205–236. Springer. 308 BIBLIOGRAPHY Foote, J. and Uchihashi, S. (2001). The beat spectrum: A new approach to rhythm analysis. In null, page 224. IEEE. Forman, R. (1998). Witten–Morse theory for cell complexes. Topology, 37(5):945–979. Forman, R. (2002). A user’s guide to discrete Morse theory. Sém. Lothar. Combin, 48:35pp. Frosini, P. (1992). Measuring shapes by size functions. 
In Intelligent Robots and Computer Vision X: Algorithms and Techniques, pages 122–133. International Society for Optics and Photonics. Frosini, P. and Landi, C. (2001). Size functions and formal series. Applicable Algebra in Engineering, Communication and Computing, 12(4):327–349. Galilei, G. (1638). Discorsi e dimostrazioni matematiche, intorno à due nuove scienze. Galilei, V. (1569). Il fronimo. Forni. Ghrist, R. (2008). Barcodes: the persistent topology of data. Bulletin of the American Mathematical Society, 45(1):61–75. Ghrist, R. and Peterson, V. (2007). The geometry and topology of reconﬁguration. Advances in applied mathematics, 38(3):302–323. Giblin, P. (2010). Graphs, surfaces and homology. Cambridge University Press. Govc, D. (2013). On the deﬁnition of homological critical value. arXiv:1301.6817. Hansen, V. L. (1989). Braids and coverings: selected topics, volume 18. Cambridge University Press. Harte, C. and Sandler, M. (2005). Automatic chord identifcation using a quantised chromagram. In Audio Engineering Society Convention 118. Audio Engineering Society. Hatcher, A. (2002). Algebraic topology. Cambridge University Press. Helmholtz, H. v. (1877). Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik. Vieweg, Braunschweig. Hertz, G. Z. and Stormo, G. D. (1999). Identifying DNA and protein patterns with statistically signiﬁcant alignments of multiple sequences. Bioinformatics, 15(7):563–577. Horn, R. A. and Johnson, C. R. (1991). Topics in matrix analysis. Cambridge University Press, Cambridge. Hughes, J. R. (2015). Using Fundamental Groups and Groupoids of Chord Spaces to Model Voice Leading. In Mathematics and Computation in Music, pages 267–278. Springer. BIBLIOGRAPHY 309 İzmirli, Ö. and Dannenberg, R. B. (2010). Understanding Features and Distance Functions for Music Sequence Alignment. In ISMIR, pages 411–416. Jensen, K. (2007). Multiple scale music segmentation using rhythm, timbre, and harmony. 
EURASIP Journal on Applied Signal Processing, 2007(1):159–159. Johnson, M. (2009). Pop Music Theory. Lulu. com. Juslin, P. N. and Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and brain sciences, 31(05):559–575. Katoh, K., Misawa, K., Kuma, K.-i., and Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic acids research, 30(14):3059–3066. King, H., Knudson, K., and Mramor, N. (2005). Generating discrete Morse functions from point data. Experimental Mathematics, 14(4):435–444. Knees, P., Schedl, M., and Widmer, G. (2005). Multiple Lyrics Alignment: Automatic Retrieval of Song Lyrics. In ISMIR, pages 564–569. Kozlov, D. (2007). Combinatorial algebraic topology, volume 21. Springer Science & Business Media. Kurth, E. and Rothfarb, L. A. (1991). Ernst Kurth: selected writings. Number 2. Cambridge University Press. Langfelder, P., Zhang, B., and Horvath, S. (2008). Deﬁning clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics, 24(5):719–720. Larson, S. (2012). Musical Forces: Motion, Metaphor, and Meaning in Music. Indiana University Press. Lassmann, T. and Sonnhammer, E. L. (2005). Automatic assessment of alignment quality. Nucleic acids research, 33(22):7120–7128. Lee, K. M., Skoe, E., Kraus, N., and Ashley, R. (2009). Selective subcortical enhancement of musical intervals in musicians. The Journal of Neuroscience, 29(18):5832–5840. Lester, J. (1994). Compositional theory in the eighteenth century. Harvard University Press. Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, volume 10, pages 707–710. Levine, M. (2011). The jazz theory book. " O’Reilly Media, Inc.". Lewin, D. (2007). Generalized musical intervals and transformations. Oxford University Press. 310 BIBLIOGRAPHY Li, T. L., Chan, A. B., and Chun, A. (2010). 
Automatic musical pattern feature extraction using convolutional neural network. In Proc. Int. Conf. Data Mining and Applications. Martin, A. P. and Palumbi, S. R. (1993). Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences, 90(9):4087–4091. Martin, B., Brown, D. G., Hanna, P., and Ferraro, P. (2012). Blast for Audio Sequences Alignment: a Fast Scalable Cover Identiﬁcation. In 13th International Society for Music Information Retrieval Conference, pages pages–529. Martinez, W. L., Martinez, A., and Solka, J. (2010). Exploratory data analysis with MATLAB. CRC Press. Matityaho, B. and Furst, M. (1995). Neural network based model for classiﬁcation of music type. In Electrical and Electronics Engineers in Israel, 1995., Eighteenth Convention of, pages 4–3. IEEE. Mauch, M. (2010). Automatic chord transcription from audio using computational models of musical context. PhD thesis, School of Electronic Engineering and Computer Science Queen Mary, University of London. Mauch, M., Noland, K., and Dixon, S. (2009). Using Musical Structure to Enhance Automatic Chord Transcription. In ISMIR, pages 231–236. Mazzola, G. and Andreatta, M. (2006). From a categorical point of view: K-nets as limit denotators. Perspectives of New Music, pages 88–113. Mazzola, G. et al. (2002). The topos of music. Birkhäuser, Basel. Mileyko, Y., Mukherjee, S., and Harer, J. (2011). Probability measures on the space of persistence diagrams. Inverse Problems, 27(12):124007. Milnor, J. W. (1963). Morse theory. Number 51. Princeton university press. Munch, E. (2013). Applications of persistent homology to time varying systems. PhD thesis, Duke University. Munch, E., Shapiro, M., and Harer, J. (2012). Failure ﬁltrations for fenced sensor networks. The International Journal of Robotics Research, 31(9):1044–1056. Munkres, J. R. (1984). Elements of algebraic topology, volume 2. Addison-Wesley Reading. Needleman, S. B. and Wunsch, C. D. (1970). 
A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of molecular biology, 48(3):443–453. Notley, M. A. (2007). Lateness and Brahms: music and culture in the twilight of Viennese liberalism. Oxford University Press, USA. BIBLIOGRAPHY 311 Notredame, C., Higgins, D. G., and Heringa, J. (2000). T-Coﬀee: A novel method for fast and accurate multiple sequence alignment. Journal of molecular biology, 302(1):205–217. Ogdon, W. (1981). HOW TONALITY FUNCTIONS IN SCHOENBERG OPUS-11, NUMBER-1. Journal of the Arnold Schoenberg Institute, 5(2):169–181. Osindero, S. and Hinton, G. E. (2008). Modeling image patches with a directed hierarchy of Markov random ﬁelds. In Advances in neural information processing systems, pages 1121–1128. OSullivan, O., Zehnder, M., Higgins, D., Bucher, P., Grosdidier, A., and Notredame, C. (2003). APDB: a novel measure for benchmarking sequence alignment methods without reference alignments. Bioinformatics, 19(suppl 1):i215–i221. Ott, N. (2009). Visualization of Hierarchical Clustering: Graph Types and Software Tools. GRIN Verlag. Pardo, B. and Sanghi, M. (2005). Polyphonic Musical Sequence Alignment for Database Search. Citeseer. Pass, J. (1987). Joe Pass guitar chords. Alfred Musicr. Pearsall, E. (2012). Twentieth-century music theory and practice. Routledge. Pérez-Escudero, A., Vicente-Page, J., Hinz, R. C., Arganda, S., and de Polavieja, G. G. (2014). idTracker: tracking individuals in a group by automatic identiﬁcation of unmarked animals. Nature methods, 11(7):743–748. Perricone, J. (2000). Melody in songwriting: tools and techniques for writing hit songs. Hal Leonard Corporation. Piston, W. (1947). Counterpoint. WW Norton & Company. Piston, W., De Voto, M., and Jannery, A. (1978). Harmony. Gollancz, London. Plomp, R. and Levelt, W. J. (1965). Tonal consonance and critical bandwidth. Journal of the Acoustical Society of America, 38(4):548–560. Plomp, R. and Steeneken, H. (1968). 
Interference between two simple tones. Journal of the Acoustical Society of America, 43(4):883–884. Popoﬀ, A., Andreatta, M., and Ehresmann, A. (2015). A Categorical Generalization of Klumpenhouwer Networks. In Mathematics and Computation in Music, pages 303–314. Springer. Prout, E. (2012). The orchestra: orchestral techniques and combinations. Courier Dover Publications. Rankin, S. (1993). Winchester Polyphony. The Early Theory and Practice of Organum. Music in the Medieval English Liturgy, pages 59–100. Russo, W. (1997). Jazz composition and orchestration. University of Chicago Press. 312 BIBLIOGRAPHY Sankoﬀ, D. (1972). Matching sequences under deletion/insertion constraints. Proceedings of the National Academy of Sciences, 69(1):4–6. Senin, P. (2008). Dynamic time warping algorithm review. University of Hawaii. Sethares, W. (2004). Tuning, Timbre, Spectrum Scale. Springer, New York. Six, J. and Cornelis, O. (2012). A Robust Audio Fingerprinter Based on Pitch Class Histograms Applications for Ethnic Music Archives. In Proceedings of the Folk Music Analysis conference (FMA 2012). Slavich, L. (2009–2010). Strutture algebriche e topologiche nella musica del ventesimo secolo. Master’s thesis, University of Pisa. Smaragdis, P. and Brown, J. C. (2003). Non-negative matrix factorization for polyphonic music transcription. In Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on., pages 177–180. IEEE. Sturm, B. L. (2013a). Classiﬁcation accuracy is not enough. Journal of Intelligent Information Systems, 41(3):371–406. Sturm, B. L. (2013b). The GTZAN dataset: Its contents, its faults, their eﬀects on evaluation, and its future use. arXiv preprint arXiv:1306.1461. Sturm, B. L. (2014). A survey of evaluation in music genre recognition. In Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation, pages 29–66. Springer. Sussman, R. and Abene, M. (2012). Jazz composition and arranging in the digital age. Oxford University Press. Taruskin, R. 
(2009). Music in the Nineteenth Century: The Oxford History of Western Music. Oxford University Press. Taylor, G. W. and Hinton, G. E. (2009). Factored conditional restricted Boltzmann machines for modeling motion style. In Proceedings of the 26th annual international conference on machine learning, pages 1025–1032. ACM. Tenney, J. (1988). A History of ’Consonance’ and ’Dissonance’. Excelsior Music Publishing Company, New York. Thompson, J. D., Gibson, T., Higgins, D. G., et al. (2002). Multiple sequence alignment using ClustalW and ClustalX. Current protocols in bioinformatics, pages 2–3. Thompson, J. D., Linard, B., Lecompte, O., and Poch, O. (2011). A comprehensive benchmark study of multiple sequence alignment methods: current challenges and future perspectives. PloS one, 6(3):e18093. Thompson, J. D., Plewniak, F., Ripp, R., Thierry, J.-C., and Poch, O. (2001). Towards a reliable objective function for multiple sequence alignments. Journal of molecular biology, 314(4):937–951. BIBLIOGRAPHY 313 Thurston, W. P. (2002). The Geometry and Topology of Three-Manifolds. Electronic version 1.1, website: http://library.msri.org/nonmsri/gt3m. Topaz, C. M., Ziegelmeier, L., and Halverson, T. (2015). Topological data analysis of biological aggregation models. Trezise, S. (2003). The Cambridge Companion to Debussy. Cambridge University Press. Tulipano, L. and Bergomi, M. G. (2015). Meaning, music and emotions: a neural activity analysis. In NEA Science, pages 105–108. Tymoczko, D. (2006). The geometry of musical chords. Science, (313):72–74. Tymoczko, D. (2008). Scale theory, serial theory and voice leading. Music Analysis, 27(1):1–49. Tymoczko, D. (2011). A geometry of music: harmony and counterpoint in the extended common practice. Oxford University Press. Tymoczko, D. (2012). The Generalized Tonnetz. Journal of Music Theory, 56(1):1– 52. von Appen, R., Doehring, A., Helms, D., and Moore, A. F. (2015). Song Interpretation in 21st-Century Pop Music. Ashgate Publishing, Ltd. 
Wang, A. et al. (2003). An Industrial Strength Audio Search Algorithm. In ISMIR, pages 7–13. Watkins, C. J. and Dayan, P. (1992). Q-learning. Machine learning, 8(3-4):279–292. Wegel, R. and Lane, C. (1924). The auditory masking of one pure tone by another and its probable relation to the dynamics of the inner ear. Physical review, 23(2):266. Wei, M. (2008). Jazz Piano Handbook: Essential Jazz Piano Skills for All Musicians. Alfred Music Publishing. William, B. (1984). Harmony in Radical European Music. Society of Music Theory. Yoshitaka, A. and Ichikawa, T. (1999). A survey on content-based retrieval for multimedia databases. IEEE Transactions on Knowledge and Data Engineering, 11(1):81–93. Zabka, M. (2009). Generalized Tonnetz and Well-Formed GTS: A Scale Theory Inspired by the Neo-Riemannians? Mathematics and Computation in Music, page 286. Zwicker, E. (1961). Subdivision of the audible frequency range into critical bands (Frequenzgruppen). The Journal of the Acoustical Society of America, 33(2):248– 248. 314 BIBLIOGRAPHY Zwicker, E. and Terhardt, E. (1980). Analytical expressions for critical-band rate and critical bandwidth as a function of frequency. The Journal of the Acoustical Society of America, 68(5):1523–1525.
