About the Author(s)

Winfried A. Lüdemann
Department of Music, Faculty of Arts and Social Sciences, Stellenbosch University, Stellenbosch, South Africa


Lüdemann, W.A., 2022, ‘Perspectives on music and evolution’, HTS Teologiese Studies/Theological Studies 78(2), a7747. https://doi.org/10.4102/hts.v78i2.7747

Note: Special Collection: Challenging Building Blocks of Our Present, Past and Future, sub-edited by Chris Jones (Stellenbosch University) and Juri van den Heever (Stellenbosch University).

Original Research

Perspectives on music and evolution

Winfried A. Lüdemann

Received: 15 May 2022; Accepted: 18 July 2022; Published: 12 Oct. 2022

Copyright: © 2022. The Author(s). Licensee: AOSIS.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Many scholars of philosophy, aesthetics, religion, history or social science have ventured to offer a comprehensive explanation of music, one of the most intangible and elusive phenomena in the world. A palaeoanthropological approach, which places music into an evolutionary paradigm, can add important perspectives to our understanding of this phenomenon. To begin with, the question whether music is an adaptation that has survival value in the classical Darwinian sense is contemplated. Views on the origin of music in conjunction with the emergence of language and as a domain for the expression of emotion, linked to music’s benefits for social coherence, are discussed. More recent views on the emergence of consciousness, on semiosis and on music as a manifestation of biocultural co-evolution, especially in conjunction with ritual, are then presented. Finally, the merit of exploiting the concept of play to help account for the systematicity of music’s semiosis is examined.

Contribution: In line with the intent of this special collection of articles, the above considerations are placed into the context of materialist versus nonmaterialist perspectives on the emergence of the human mind. The overarching argument is that music contributes crucially to what it means to be human.

Keywords: origin of music; biocultural co-evolution; emergence; language; social cohesion; consciousness; ritual; play.


Music has to rank as one of the most astounding phenomena in the world. Over the ages, many great thinkers have tried to capture the essence and origin of this ubiquitous and all too familiar, yet strangely intangible and elusive, form of human behaviour.1 Such accounts have ranged from viewing music as a divine gift, as a manifestation of the harmony of the spheres, as an imitation of the sounds in nature, as the language of the affections, as ‘tönend bewegte Form’,2 right through to music as a reflection of the social forces present in society, to name but a few examples. A list of these thinkers will have to include names like Pythagoras, Plato, Boethius, Luther, Hegel, Schopenhauer, Hanslick and, in the last century, Theodor W. Adorno ([1948] 1978), Peter Kivy (1980, 1993, 2002), Stephen Davies (1994) and Roger Scruton (1999). Since then, the speculative or philosophical reflections of these authors have made way for a more anthropological approach. Leading the way were John Blacking (1976) and Wolfgang Suppan (1984). Subsequently, the scope of this approach was expanded into the fields of neuroscience, music psychology, biomusicology and, of special interest for the present article, palaeoanthropology.3 Following on Steven Mithen’s acclaimed book on what he calls ‘the singing Neanderthals’ (2005), and the work of scholars like Nils Wallin (1991), Wallin, Björn Merker and Steven Brown (2001), Brown (2001) and especially the numerous publications by Ian Cross (e.g. 2003a, 2003b), the most ground-breaking contribution has come from Gary Tomlinson (2015). South African scholars who have contributed to the topic include Barry Ross (2008) and Sarah Wurz (2009).

A palaeoanthropological perspective on music opens up vistas onto a deep history that goes back in time for several hundred thousand years and touches not only on the biological and cultural evolution of Homo sapiens but that of other hominins as well. It must trace the various physiological, cognitive and cultural traits that had to evolve over vast stretches of time before music, or rather the capacity for music and musical behaviour, came to be one of the defining characteristics of human modernity. Tomlinson (2015:23) uses the collective term ‘musicking’ for the joint presence of these characteristics, a term that I will adopt as well.

This deep history reveals the evolution of a phenomenon of the greatest complexity, one that has thus far defied full and comprehensive explanation. Reducing it simply to a trait that has been selected for its survival value, for its fitness value for mate selection or simply as a spandrel or the fortuitous by-product of other adaptations – ‘auditory cheesecake, an exquisite confection’, in Steven Pinker’s (1997:534) famous phrase4 – has not convinced scholars who are aware of the many and diverse factors that have to be accounted for in musicking in the comprehensive sense of the term. Pinker’s phrase has spurred scholars who are set on explaining musicking as an adaptation after all, who are intent on finding a ‘music gene’,5 or who endeavour to examine music in all its complexity.

Given the constraints of space, this article can offer only a few perspectives on the above issues, bearing in mind also the nonspecialist reader.

A Darwinian perspective on musicking

To begin with, it has to be asked how a Darwinian approach can account for musicking by approaching it as a biological adaptation that has survival value. Is there a music gene,6 and does the brain have one or more areas dedicated to a capacity for music, the origin of which could be explained in adaptive terms? In his investigation of this question, in which he discusses brain lesions and the results of brain imaging, Steven Mithen (2005:64) comes to the conclusion that there is no single ‘spatially discrete area of the brain’ that represents a ‘music module’, roughly comparable to Broca’s and Wernicke’s areas for the capacity for language. He writes (Mithen 2005):

This need not be the case, as the neural networks might be very widely distributed throughout the brain even though they undertake a specific task. Indeed, even if there is a spatially discrete neural module that maps onto a music module, it need not be located in the same place in each person’s brain. Evolution may have ‘programmed’ the brain to develop neural networks for music without specifying where they are to be located. (p. 64)

This view is supported by Isabelle Peretz (2003:200): ‘[T]he demonstration of a similar brain organization for music in all humans remains elusive’. Nevertheless, she seems to support the view that ‘music may have biological roots’ and adds:

If music is biologically determined, then music is expected to have functional and neuro-anatomical specialization. That is, music is expected to be subserved by neural networks that are dedicated to its processing, in being unresponsive or inadequate for dealing with nonmusical input. Presently, support for the existence of such specialized neural networks is compelling. (p. 192)

From a palaeoanthropological perspective, however, it is more important to determine whether neural networking for music-specific processing is unique to H. sapiens. Mithen (2005) seems to believe so. He points out that Neandertals7 probably had what he calls ‘domain-specific intelligence’ and adds:

[T]hey had neural circuits equivalent to ours in number and complexity, enabling tool-making, socializing and interaction with the natural world. What they lacked were the additional neural circuits that made connections between these domains. […] The extra neural circuits that modern humans possess provide them with what I term ‘cognitive fluidity’. This is, in essence, the capacity for metaphor, which underlies art, science and religion – those types of behaviour that go unrepresented in the Neanderthal archaeological record. (p. 233)8

As to how this capacity emerged, Mithen (2005:264) suggests that ‘cognitive fluidity was a consequence of language: spoken and imaginary utterances acted as the conduits for ideas and information to flow from one separate intelligence to another’.

The implication of Mithen’s view is that either the emergence of language preceded the capacity for musicking, or the two co-evolved, or both emerged from an earlier, not yet differentiated system of protocommunication. He favours the last of these views, describing this system, as it emerged among early humans, as holistic, manipulative, multimodal, musical and mimetic, abbreviated by the author to Hmmmmm (Mithen 2005:172). Mithen suggests that approximately 200 000 years ago, this mode of communication split into what then developed into music and language, respectively, each of them retaining characteristics of the earlier system. Music is defined as ‘a communication system specializing in the expression of emotion’, while language is ‘a communication system specialising in the transmission of information’ (Mithen 2005:267).

In a similar vein, Steven Brown (2001) proposes a ‘musilanguage model of music’ as a precursor to the emergence of language and music as separate capacities. He bases his model on the idea that:

[I]t greatly simplifies thinking about the origins of music and language. As it uses common features of both as its starting point, the model avoids the endless semantic qualifications as to what constitutes an ancestral musical property versus what constitutes an ancestral linguistic property […]. The model forgoes this by saying that the common features of these two systems are neither musical nor linguistic but musilinguistic, and that these properties evolved first. (p. 277)

For his model, Brown (2001) relies strongly on a ‘grammar metaphor’ in music, even if:

[T]he notion that musical phrase structures (can) have a hierarchical organization similar to that of linguistic sentences, an idea presented elegantly by Lerdahl and Jackendoff […], must be viewed as pure parallelism. (p. 272)

He even hints at a ‘generative analogy’ in music.9 If this analogy is valid and thus provides a ‘biological justification for this potential for hierarchical organization in music’, then it means that music and language must ‘converge at some deep level’. He asks what this point of convergence is. ‘The answer, briefly, is combinatorial syntax and intonational phrasing’ (Brown 2001:273).

Whatever the merits of Lerdahl and Jackendoff’s ideas are – and they have already reached classical status – when it comes to an evolutionary theory of music, they lose some of their explanatory power because of their basis in ‘combinatorial syntax’. The latter presupposes, as in the case of Mithen, the co-evolution of music and language. As will be indicated below, this co-evolution is by no means a certainty, and the emergence of a ‘grammar’ or ‘syntax’ of music can be explained equally well by other means (see the discussion of the play element in music in the ‘Musicking and play’ section).

Two positions stand opposed here. On the one hand, scholars like Pinker are sceptical about musicking as an adaptation, in which case it would have to be categorised as an exaptation. On the other hand, Peretz and Brown hold that music does have ‘biological justification’. Together, these positions challenge scholars who support the notion of musicking as an adaptation to spell out what the selective advantages of this elusive phenomenon actually are. David Huron (2003:61), for one, places examples of such advantages in the domain of social interaction between hominins: mate selection, social cohesion, group effort, perceptual development, motor skill development, conflict reduction, safe time passing and transgenerational communication.

An additional level to the standard Darwinian explanation of the arts, and by implication music – a level that had not been appreciated fully before – is proposed by Denis Dutton in his book The art instinct: Beauty, pleasure, and human evolution (Dutton 2009). He criticises the idea that the arts are ‘evolutionarily useless spin-offs of the oversized human brain’, as suggested by Stephen Jay Gould (quoted in Dutton 2009:5), or ‘non-adaptive by-products’, as suggested by Pinker (Dutton 2009:95). Instead, he develops ideas from Darwin’s The descent of man (1871) to show how, in a process of selective ‘self-domestication’, the arts became markers or ‘signals’ of ‘fitness’ for sexual selection – as opposed to mere courtship calls associated with the concept of natural selection, developed in Darwin’s earlier work On the Origin of Species (1859) (Dutton 2009:146, 164). (The problem with explaining the origin of musicking in terms of courtship calls is that it would have to be understood in terms of sexual dimorphism.) Despite its novelty, Dutton’s approach merely succeeds in shifting the weight of an explanation for musicking from one single factor – various kinds of social cohesion, as proposed by Huron above – to another: fitness for sexual selection.

Musicking and consciousness

A serious deep history of musicking should be prepared to account for the emergence of a phenomenon of the mind within a universe that presents itself as the subject matter of physical science. By implication, such an account has to be linked to explanations for the emergence of consciousness in the broader sense, a topic that has prompted keen discussion in recent years. Three perspectives on this topic will be highlighted here.

David Bentley Hart (2013:153) articulates the nonmaterialist position aptly when he suggests that it may well turn out to be an ‘enormous […] category error’ to expect that neuroscience will one day discover an explanation of consciousness ‘solely within the brain’s electrochemical processes’. If this is correct, then an orthodox evolutionary account for the emergence of consciousness does not seem plausible. He adds (Hart 2013):

At the apex of the mind […] there is the experience of consciousness as an absolutely singular and indivisible reality, which no inventory of material constituents and physical events will ever be able to eliminate. Here again, and as nowhere else, we are dealing with an irreducibly primordial datum. (p. 171)

In a subsequent observation about mathematics, which may well be equally applicable to music, Hart (2013) points out the dilemma that:

[E]ven if one could wholly ground mathematics in empirical experience (which one cannot), it would still give evidence of abstractive powers in the human mind so far in excess of what the physical forces of mechanized matter or the requirements of evolution could produce that it would be hard not to view the whole phenomenon as a kind of miracle. (p. 186)

With respect to the highest forms of music, then, one would be equally hard-pressed to account for a phenomenon which is so far in excess of the basic requirements of biological adaptation. This leads to a consideration of the qualitative difference between consciousness, including musical consciousness, and its physiological or neurological substrate (Hart 2013):

This is the great non sequitur that pervades practically all attempts, evolutionary or mechanical, to reduce consciousness wholly to its basic physiological constituents. If there is something structurally problematic about consciousness for a physicalist view of things, a strictly genetic narrative of how consciousness might have evolved over a very long time, by a very great number of discrete steps, under the pressure of natural selection, cannot provide us with an answer to the central question. What, precisely, did nature select for survival, and at what point was the qualitative difference between brute physical causality and unified intentional subjectivity vanquished? (p. 206)

The logical consequence of these reflections has to be clear (Hart 2013):

[A]ny perfectly scrupulous consideration of consciousness as the unique phenomenon it is leads quite naturally toward the supposition of a ‘spiritual’ dimension of the mind, the simple and necessarily immaterial perspective of a noetic or transcendentally apperceptive power that abides, that knows, that holds reality together from some vantage of unassailable subjectivity. The conviction among many that this must be false is dictated not by logical considerations but only by an earnest devotion to a certain picture of the world; and, with regard to the mystery of consciousness no less than with regard to the mystery of being, the materialist position is the least coherent metaphysical position on offer, and the one that suffers from the greatest explanatory poverty. (p. 212)

By contrast, Thomas Nagel (2012:13) writes about consciousness from an atheist perspective. Nevertheless, he describes his position as ‘antireductionist’. More than Hart, he gives his scepticism ‘about the truth of reductionism in biology’ a rather stinging slant: ‘Physico-chemical reductionism in biology is the orthodox view, and any resistance to it is regarded as not only scientifically but politically incorrect’ (p. 5). His critique implies a call for an approach to ‘nonreducible conscious life’ (p. 52) that goes beyond standard evolutionary theory:

This would mean abandoning the standard assumption that evolution is driven by exclusively physical causes. Indeed, it suggests that the explanation may have to be something more than physical all the way down. The rejection of psychophysical reductionism leaves us with a mystery of the most basic kind about the natural order – a mystery whose avoidance is one of the primary motives of reductionism. […] The existence of consciousness is both one of the most familiar and one of the most astounding things about the world. No conception of the natural order that does not reveal it as something to be expected can aspire even to the outline of completeness. And if physical science, whatever it may have to say about the origin of life, leaves us necessarily in the dark about consciousness, that shows that it cannot provide the basic form of intelligibility for this world. There must be a different way in which things as they are make sense …. (p. 53)

Terrence Deacon’s (2012) ground-breaking work is an attempt to overcome the problems associated with this kind of reductionism and – without invoking the concept of a homunculus10 – to examine how, after all, ‘mind emerged from matter’ (the subtitle of his book). He turns to the concept of emergence11 and develops a theory called ‘emergent dynamics’. It explains how ‘processes at a higher [hierarchical] level emerge from, and are grounded in, simpler physical processes’. At this higher level, ‘teleodynamic12 (e.g. living and mental) processes’ account for the emergence of phenomena that are specific to the human mind, such as ‘intentional, purposeful, normative’ concepts (Deacon 2012:549).

The reason why Deacon’s (2012) ideas have specific relevance for the emergence of musicking is that they are based on the concept of ‘absence’ (p. 1). Music would count as one of those ‘phenomena whose existence is determined with respect to an essential absence’ (2012:3). This counterintuitive concept refers to phenomena that are characterised by an ‘absential feature’, a feature that is nonintrinsic to the physical substrate of the phenomenon yet determines its essence (Deacon 2012):

This paradoxical intrinsic quality of existing with respect to something missing, separate, and possibly nonexistent is irrelevant when it comes to inanimate things, but it is a defining property of life and mind. A complete theory of the world that includes us, and our experience of the world, must make sense of the way that we are shaped by and emerge from such specific absences. What is absent matters, and yet our current understanding of the physical universe suggests that it should not. (p. 3)

Turning to the manner in which these phenomena emerged, Deacon (2012) writes13:

Organisms with nervous systems, and particularly those with brains, have evolved to augment and elaborate a basic teleodynamic principle that is at the core of all life. Brains specifically evolved in animate multicelled creatures – animals – because being able to move about and modify the surroundings require predictive as well as reactive capacities. The evolution of the ‘anticipatory sentience’ – nested within, constituted by, and acting on behalf of the ‘reactive (or vegetative) sentience’ of the organism – has given rise to emergent features that have no precedent. Animal sentience is one of these. As brains have evolved to become more complex, the teleodynamic processes they support have become more convoluted as well, and with this the additional distinctively higher-order mode of human symbolically mediated sentience has emerged. These symbolic abilities provided what might be described as sentience of the abstract. (pp. 504–505)

The last two sentences take the discussion to the level of human consciousness. In this context, Deacon (2012) presents an extraordinarily novel view, warranting quotation at some length:

While we have only just begun to sketch the outlines of an emergent dynamics account of this one most enigmatic phenomenon – human consciousness – the results point us in very different directions than previously considered. With the autogenic creation of self as our model, we have broken the spell of dualism by focusing attention on the contributions of both what is present and what is absent. Surprisingly, this even points the way to a non-mystical account of the apparent non-materiality of consciousness. The apparent riddle of its non-materiality turns out not to be a riddle after all, but an accurate reflection of the fact that the locus of subjective experience is not, in fact, a material substrate. […] With the realization that specific absent tendencies – dynamical constraints – are critically relevant to the causal fabric of the world, and are the crucial mediators of non-spontaneous change, we are able to stop searching for consciousness ‘in’ the brain or ‘made of’ neural signals.

I believe that human subjectivity has turned out not to be the ultimate ‘hard problem’ of science. […] It was hard because we have stubbornly insisted on looking for it where it could not be, in the stuff of the world. […] The complex and convoluted dynamical processes that are the defining features of self, at any given level, are not embodied in molecules, or neurons, or even neural signals, but in the teleodynamics of processes generated in the vast networks of brains. The molecular interactions, propagating neuronal signals, and incessant energy metabolism that provide the substrate for this higher-order dynamical process are necessary substrates; but it is because of what these do not actualize, because of how their interactions are constrained, that there is agency, sentience, and valuation implicit in their patterns of interaction. We are what we are not: continually, intrinsically, necessarily incomplete in our very nature. Our sense of self, our experience of being the originative locus of agency, our interior subjective isolation, and the sense of emerging out of nothing and being our own prime mover – all these core characteristics of conscious experience – are accurate reflections of the fact that self is literally sui generis, emerging each moment from what is not there. (pp. 534–535)

Deacon’s ideas open the view to a much more sophisticated history of the evolutionary emergence of musicking than the exclusively socially orientated ideas of scholars like Huron. His reflections make it clear that the process by which musicking could have emerged, as part of sapient culture in general – that is, emergent dynamics – was not a simple and linear one. In one way or another, the linear paradigm still underlies the examples from the work of Brown, Huron and Dutton quoted above.

Neither should one imagine the emergence of musicking to be a phenomenon spread uniformly over the world. We must not think that there was unbroken continuity over the vast spatial and temporal distances (several millennia) under discussion. Rather, we should imagine similarities emerging independently from one another wherever similar environmental or demographic circumstances called for them.14

The work of Richerson and Boyd (2005:191), specifically their ground-breaking study Not by genes alone,15 already suggested that the emergence of culture (and musicking) was not like a one-way street from biology to culture. Ian Cross (2003b:42) also argues in this direction when he says that ‘music may have played a central role in the evolution of the modern human mind’. Tomlinson’s (2018:60, 134–138) work advances these ideas by a huge leap when he presents his model of ‘biocultural coevolution’. It is necessary to quote his description of the process at length (Tomlinson 2018):

Once we see that culture can be a force in the natural selective history of a species, we must conclude that the resulting new genotypes could likewise exert selective forces back on the capacities underlying its developing culture. […] For human ancestors, the ‘coevolutionary dance’16 of culture, environment, and genes was an unending round.

The approaches surveyed here track the deepening incorporation of culture into conceptions of natural selection and biological evolution across the last few decades of thought about evolution. By now, culture has found an important position in the extended evolutionary synthesis, and it is widely understood that its patterns of interaction with genes and environment bring about reciprocal dynamics that are usefully gathered under the rubric of feedback.

The classic coevolutionary situation involves mutual interaction of two species, while the sexual selection feedback cycle takes place within a single species. Niche construction, instead, draws populations of organisms and the environments they partly construct into the cycle, with feedback running from the population through ecological changes and back to the population. For cultural animals, gene-culture coevolution highlights the effects of genes on culture or culture on genes, which must always involve a middle term, the phenotypic expression of genes. Cultural niche construction, finally, interposes the environment in the midst of these genetic, phenotypic, and cultural elements. In all cases, the feedback exerts its force across generations, shaping and directing the operation of selection. (pp. 57–58)

Tomlinson’s (2018) model addresses many of the concerns highlighted by the authors quoted above. This co-evolution reaches back into the history of early hominins like Homo heidelbergensis (600 000–200 000 years ago; Roberts 2011:136) and Homo neanderthalensis (350 000–28 000 years ago; Roberts 2011:152), but then gains momentum with the emergence of H. sapiens:

It begins with the emergence by about 200 000 years ago of a species that deployed in its niche construction a novel flexibility of behavior, and it ends with a single lineage of that species some 70 000–60 000 years ago: the founding lineage of modern humanity, minimally but effectively differentiated from those around it, endowed in all essentials with our behavioral panoply, and setting off on its global expansion. (p. 135)

The processes he suggests were involved in this co-evolution include niche construction, feedback loops, epicycles17 and a gradual development of semiosis from iconic signs, through indexicality right up to full-fledged digital symbolism (see further discussion below). At the end stage of this evolution a genetically stable human modernity had been achieved. That is, ‘cultural capacity expanded to a point beyond which it buffered our species against selection for further genetic enhancement of cultural capacity’ (Tomlinson 2018:142).

One of the novel suggestions which Tomlinson (2018) introduces at this point, based on careful observation of the archaeological record, is to link the emergence of musicking not with that of language or the expression of emotion (as Brown and Mithen do), but with the indexical semiosis present in ritual:

The roots of the abiding coalition of ritual, music, and dance reach down to a period when their modern forms had not taken shape. The deep-seated connection of music and ritual should also not be mistaken for the idea that music came first, language flowing from it – a mode of language origins [which has been] repeated and elaborated from Darwin’s time to our own. (p. 159)

Tomlinson’s introduction of the phenomenon of ritual gives the discussion a decisive turn, a new paradigm in which semiosis, Deacon’s notion of absence and musicking can be linked.

Musicking, semiosis and ritual

A fully developed capacity for symbolic thinking is generally regarded as one of the defining characteristics of H. sapiens. It is maintained that this capacity distinguishes humans from animals. To what extent it distinguishes them from other hominins, however, is a rather more problematic matter.18 (The ‘Musicking and play’ section touches upon this question in more detail.)

The general term ‘symbolic thinking’ needs refining when it comes to the emergence of musicking as a human capacity. Musical semiosis has a complex genesis and consists of diverse strands that converged over a long period of time in hominin history. This history shows that the differences between linguistic, visual and musical kinds of semiosis are as important as their similarities. Lumping them together under a general term like ‘the art instinct’ (Dutton 2009) obscures important characteristics of what makes musicking, specifically, unique.

A closer look at musical semiosis will help to clarify the problematic questions of whether music is a language and whether modern music and spoken language have a common origin. What the first question entails, in essence, is not so much whether music is a language (i.e. whether music has language-like characteristics) but what the nature of the systematicity is that underlies musical semiosis. The crucial threshold between semiosis in modern humans and other hominins (and animals) is, therefore, not whether signs were utilised at all but whether the signs were integrated into a coherent system that governed their application (see Tomlinson 2018:77–78).19 In language, this system entails combinatorial principles like grammar and syntax, which grant individual signs their full referentiality and capacity for propositional meaning. Botha (2008) describes this as ‘fully syntactical language’. This raises the question of whether an equally rich systematicity of musical signs emerged and what the nature of such a system would have been.

Archaeological evidence from the Blombos cave in South Africa of the employment of ochre and beads by Middle Stone Age H. sapiens (approximately 75 000 years ago) is regarded as the earliest material proof of human symbolism.20 If it is correct that these artefacts were used as body ornamentation, to indicate status, to be worn as talismans or as offerings at burial sites, then that would represent a remarkable achievement in itself (see Henshilwood 2006; Tomlinson 2015:227–233).21 One can imagine a similar employment of sounds (‘singing’ in the form of advanced gesture-calls, or extracting sounds from litho- or other idiophones) for such purposes. However, this does not yet amount to a systematicity in which the respective sounds would have been fixed into coherent signification, a stage that was achieved only later with the arrival of stable human modernity approximately 60 000 years ago (Tomlinson 2018:142).

Before this question can be pursued any further, it is necessary to understand how semiosis and the human mind in general relate to each other. One of the earlier scholars to write about semiosis (or more specifically semiotics) in a systematic way was Charles William Morris. For him, the human mind comes into its own only through symbolic thinking. In his book Foundations of the Theory of Signs (1938), he expresses the opinion that:

[H]uman civilization is dependent upon signs and systems of signs, and the human mind is inseparable from the functioning of signs – if indeed mentality is not to be identified with such functioning. (p. 1)22

Underlying Morris’s statement is the assumption that at the very basis of semiosis – in fact, as its indispensable principle – lies the readiness of the mind to provide a phenomenon with a significance that is not intrinsic to its physical substrate. This is what Deacon calls the ‘absential feature’ of a phenomenon. Musicking is a prime example of such phenomena. The capacity and readiness to distinguish between sounds intended as music and sounds that have other origins and functions, and to respond to them appropriately or, in turn, to produce them oneself – both are included in the capacity for musicking – is common to all H. sapiens and not only to those who are trained in music.23 It is the capacity to respond to aural information in an intentional way. As opposed to other kinds of information processing, as they are found in the physical or biological realms or in computation,24 the readiness to register sounds as music implies subjectivity, agency or a sense of self in the perceiver as well as the capacity of the mind to be directed at something that in philosophical parlance is called its ‘aboutness’. It is the readiness to accept such sounds as representing something which they are not in and of themselves, bringing them into existence and, in doing so, creating a relationship between them and the perceiver. In short, it is the intentionality to recognise sounds as signs. In the words of Tomlinson (2018):

Signs form the foundation of the world of what philosophers call intentionality, the power of perceiving entities (philosophers usually say ‘minds’) to be directed at something, ‘to be about, to represent, or to stand for, things, properties and states of affairs’. (p. 66)25

Similarly, Hart (2013) defines intentionality as:

[T]he mind’s capacity for ‘aboutness’, by which it thinks, desires, believes, means, represents, wills, imagines, or otherwise orients itself toward a specific object, purpose, or end. Intentionality is present in all perception, conception, language, cogitation, imagination, expectation, hope and fear, as well as in every other determinate act of the conscious mind. (pp. 191–192)

In this context he refers to the work of Franz Brentano, who introduced the term into modern philosophy: ‘For Brentano, intentionality is the very “mark of the mental”, and is of its nature something entirely absent from the merely material physical order’. Elaborating on this important point, Hart (2013) states that:

[T]he mind knows nothing in a merely passive way, but always has an end or meaning toward which it is purposively directed, as toward a final cause. […] Physical reality, however, according to mechanistic metaphysics at least, is intrinsically devoid of purpose, determinacy, or meaning. […] Among many other things, this means that, in themselves, physical events cannot produce our representations of them, because it is we who supply whatever meaning or intelligibility those representations possess. (p. 193)

From an evolutionary perspective, this presupposes the emergence of an advanced level of consciousness, which, as was mentioned above, is possessed by H. sapiens alone.

Brentano’s idea that intentionality relates to something that is ‘entirely absent’ from the ‘material physical order’ is encapsulated in the following sentences:

Every mental phenomenon is characterised by what the Scholastics of the Middle Ages called the intentional (or mental) inexistence of an object, and what we might call, though not wholly unambiguously, reference to a content, direction toward an object (which is not to be understood here as meaning a thing), or immanent objectivity. […]

This intentional inexistence is characteristic exclusively of mental phenomena. No physical phenomenon exhibits anything like it. We can, therefore, define mental phenomena by saying that they are those phenomena which contain an object intentionally within themselves.26

Returning now to the matter of musical semiosis, the point has to be made that what makes musical signs different from other signs is that they are nonreferential, as opposed to linguistic signs, which are arbitrary in their referentiality. The nature of musical signs, as opposed to those of language, was already examined by Eduard Hanslick in the middle of the 19th century, even if he used different terminology (see reference to Hanslick above).27 Important studies from the latter part of the 20th century on the nature of musical semiotics and semantics were advanced by Reinhard Schneider (1980) and Karbusicky (1986). Additional insight can now be gained from a palaeoanthropological perspective. Specific attention is given to grouping, pitch and timbre.

One area where there is indeed a close correlation between language and musicking lies in our perception of comprehensible units of meaning, that is, grouping of sounds. The ‘intonational tunes of language [have] a close counterpart in musicking: our ability to perceive general intonational contours of melodies as a kind of acoustic Gestalt’ (Tomlinson 2015:121).28 Tomlinson (2015:121) adds that the ‘musical capacity, like the linguistic one, emerges early in ontogeny […], and the two seem to be generally related in the brain areas they recruit’.

On the other hand, discrete pitch discrimination emerged as unique to hominin musical semiosis (Tomlinson 2015):

We perceive melodies […] not only as general contours but also as arrays of successive pitches (or intervals between them), wholes built from joining discrete smaller units. This discretized pitch perception is a primary manifestation of combinatorial structuring and cognition in music. (p. 121)

Even if timbre is an important aspect of vowel and consonant formation and recognition in language, it assumes special semiotic value in music. It must have emerged as a correlate to our capacity for pitch discrimination. Besides allowing perception of different sound colours (e.g. the difference between the sounds of wind and thunder or between a violin and a clarinet), it also allows the perceiver to distinguish between discretely pitched sounds (sounds that boast a harmonic overtone spectrum) and sounds of indefinite pitch. In addition, the special resonance, ‘lustre’ or ‘sheen’ (to use a visual metaphor) of sounds deemed fit for musical use would have added an evocative quality to musicking, whereby belief that the sources of such sounds were animated or enchanted29 should not be excluded.30 Another capacity is the distinction between what could be described as extreme as opposed to moderate timbre, for example, the extreme compared with the middle register of a musical instrument or between shrieking and normal voice production. Finally, the capacity to pick out at will one timbre from a complex array of other timbres is of constitutive importance for a great deal of music, for example, the melody of an oboe from among the complex timbre of a full symphony orchestra.

These three capacities hark back to what Tomlinson (2015:123) terms ‘gesture-calls’, which already formed part of a presapient ‘protodiscourse’, and which he classifies as ‘analog’ with respect to their signification (Tomlinson 2015:109, 110). The emergence of each has to be seen as linked to that of the others. For one, octave equivalence – the similarity in tone quality (with simultaneous difference in lightness) of pitches related in a simple 2:1 frequency ratio – characterises not only the difference between male and female voices singing the ‘same’ note, but also divides the continuous pitch spectrum into discrete cyclically repetitive segments, which in turn allow for a hierarchical order of the pitches within these segments, as well as representing the first interval in the harmonic sound spectrum. Interestingly enough, our visual sense perception does not produce this kind of frequency ratio equivalence, which, in the case of music, probably has its adaptive origin in ‘aligned periodicities of neural firing’ (Tomlinson 2015:199) and in the ‘physiologically given’ structure of the inner ear and the voice box (Erpf 1967:188–189). Human voice production (and the neural networks to which it is linked) also has the unique ability to alternate at will between the overtone spectrum of a speaking voice and that of a singing voice. The physiological and mental ability – which includes a close link to an auditive capacity that exercises self-control over the voice – to produce the latter sound, that is, to ‘sing’, must certainly rank as the very earliest manifestation of musicking behaviour, together with the exploitation of primitive idio- or lithophones. The adaptive pressure for this double potential of the voice is not certain, but it could perhaps be linked to the ‘winnowing’ of discrete pitches from gesture-calls to be mentioned below. (The intimate knowledge of the quality of stone for the construction of a wide array of tools by Palaeolithic hominins must also have led to the discovery and use of lithophones, like bell stones. The archaeological record seems to be rather scant in this regard.)

As mentioned before, the signs in musicking are nonreferential, which means that music cannot be translated into language. For the most part, language makes use of what in Charles Sanders Peirce’s famous typology of signs is classified as symbolic signs. The other signs in this tripartite division are iconic and indexical. Musicking tends to be ‘derived from indexical roots’ (Tomlinson 2015:202–203), even if many examples of iconicity or symbolism come to mind. For Tomlinson, discrete pitch perception, specifically, was instrumental in the emergence of musicking from simple indexicality.31 He (Tomlinson 2015) suggests a process by which discrete pitch was ‘winnowed’ from ‘analog gesture-calls’. In relation to the broader contours of gesture-calls:

[D]iscrete pitches represented an atomizing of the shapes into newly perceived component units. Later, modern melodies retaining some of the general contour-indexicality of their ancient predecessors would be built from these components. (p. 202)

He adds:

To this day [pitches] carry little or no indexical association; they are signs only in extraordinary contexts, usually involving modern symbolism. This abstracting of pitch from meaning represents a momentous swerve in communicative means as, for the first time in the long development of hominin communication, a new ingredient appeared in vocalized gesture that attenuated meaning and referentiality rather than bolstering and specifying them. (p. 203)

This swerve resulted in a ‘new kind of sociality, a transcendental sociality that could sponsor both ritual and religion – and tightly bind musicking to them’ (Tomlinson 2015:203). More specifically, the model proposed here:

[E]nvisages a coevolutionary sequence whereby pitches arose from […] gesture-calls, only gradually to be systematized by the nature of their own acoustical interrelations (and developing cognition of them) and then put to use in the melodies of musicking.

In general, this description does not give music any chronological priority over language, a primary feature of most hypotheses of musical protolanguage. It offers an alternative scenario in which weakly rule-governed sets of signals of more than one type emerged from the advancing protodiscourse of late Middle Paleolithic taskscapes. It suggests that discrete pitch perception formed alongside protolinguistic elements, and that both were abetted by nascent hierarchic and combinatorial cognition, before either modern language or musicking appeared. […] Musicking in the world today is the extended spectacularly formalized, and complexly perceived systematization of ancient, indexical gesture-calls. (pp. 204–205)

In his discussion of early Aurignacian instruments – bone or ivory pipes and flutes from archaeological finds at Geissenklösterle, Vogelherd and Hohle Fels32 – Tomlinson (2015) arrives at a far-reaching conclusion. According to him, these instruments represent a further stage of abstraction with respect to musicking’s ‘own acoustical interrelations’. More specifically:

The winnowing of discrete pitches from the graded intonational contours of the calls of protodiscourse brought with it an abstraction, a distancing of the pitches themselves from meaning […]. While broad pitch contours continued to convey emotive and even semantic content, the pitches underwent an absolution from signifying. Music was from the first, in this sense, absolute. […] This abstraction was momentous, marking a point at which, from within the selective pressures for effective communication, there arose an element […] resistant to signification. (p. 258)

This abstraction reflects the growing ability to think at a distance among late hominins. The ‘release’ from ‘the constraints of copresent sociality’ which this capacity enabled and which led to a ‘vast expansion of the human imaginarium’ (Tomlinson 2015:259) comes close to Deacon’s concept of absence. The ability to think in terms of that which is absent represents an important step towards the emergence of human modernity during the Upper Pleistocene (Tomlinson 2015):

The connection of musicking to ritual, religion, metaphysics, and the institutions they fostered came about as an unfolding congruence between this musical cognition and the similar cognition that made them possible. […] Across tens of millennia, minds capable of formalized abstraction and immersed in cultures of epicyclic proliferation linked transcendentalized sociality to musicking in a dance coforming the two. (p. 278)

With these ideas, Tomlinson succeeds in overcoming the linguocentrism that marks the work of earlier authors like Mithen and Brown, positing a ‘musicocentrism’ instead. At the same time, his ideas refine sentiments that were already anticipated in the 1980s by Wolfgang Suppan (1984), who suggests that:

[W]hile spoken language originated from the need for interhuman communication, it appears that rhythmic and melodic sound structures originated primarily to create emotional states which had as their aim communication with the invisible world, the world of gods, spirits and demons. (p. 27)33

Tomlinson’s ideas represent a novel contribution to current thinking on musical semiosis. What they do not answer sufficiently is the question of musicking’s semiotic systematicity, mentioned at the outset of this section. On its own, the hierarchical order of discrete pitch perception related to the various factors surrounding octave equivalence, powerful as it is, does not suffice to give a full account of the systematicity that emerged during hominin evolution. Further consideration is called for, taking the deliberations back to the very earliest stages of hominin evolution.

Musicking and play

In musicking, the intentionality to invest sounds with significance intersects with humans’ impulse to play. The latter is a ubiquitous characteristic of musicking and – as one of its constituent strands – must have entered its emergence at a very early time, long before the first hominins made their appearance and became aware of the special qualities of sound. The impulse to play can be observed already in most mammals and therefore has to be regarded as precultural and presymbolic. Because playing is not only a social activity but can also be indulged in by individuals, its adaptational value for social cohesion should be seen to overlie earlier selective advantages. Sounds, as they then presented themselves in the voices of early hominins or in primitive idiophones, would have provided excellent material for indulging in the play impulse.34

It is noteworthy that the play-element has not featured strongly in the literature on evolutionary accounts of musicking thus far. Neither Mithen nor Tomlinson mentions it. For understandable reasons, the author of the classic study on the topic, Homo ludens (1938), Johan Huizinga ([1950] 1955:1), does not align his views with Darwinian theory either, even if he describes play as being ‘older than culture’. Whatever else one may have to say about Huizinga’s ideas – after all, they were conceived more than 80 years ago – some of his notions can be employed – or revived – productively for the topic at hand.

To begin with, the word ‘play’ is even to this day used widely and in many languages for making music, as in ‘playing a musical instrument’, or as a title for a piece of music, as in ‘pre-lude’. (The figuration preludes in the Well-Tempered Clavier by J.S. Bach would count as some of the most outstanding examples of this kind of music in our time.) But more importantly, play has characteristics that must have entered the emergence of musicking earlier than most other strands. These could have helped to define the parameters within which such musicking emerged. It would be important to examine the very earliest archaeological records for relevant evidence.

Even when tracing the roots of the play-element back to a pre-cultural animal level, Huizinga ([1950] 1955) regards play as:

[A] significant function – that is to say, there is some sense to it. […] However we may regard it, the very fact that play has meaning implies a non-materialistic quality in the nature of the thing itself. (p. 1)

He adds:

But in acknowledging play you acknowledge mind, for whatever else play is, it is not matter. […] From the point of view of a world wholly determined by the operation of blind forces,35 play would be altogether superfluous. Play only becomes possible, thinkable and understandable when an influx of mind breaks down the absolute determinism of the cosmos. The very existence of play continually confirms the supra-logical nature of the human situation. […] We play and know that we play, so we must be more than merely rational beings, for play is irrational. (pp. 3–4)

Another characteristic of play is that it ‘resist[s] any attempt to reduce it to other terms’. Its ‘rationale’ must ‘lie in a very deep layer of our mental being’ (Huizinga [1950] 1955:6). On a cultural level, play accounts for the capacity to employ metaphors: ‘… every metaphor is a play upon words. Thus, in giving expression to life man creates a second, poetic world alongside the world of nature’ (Huizinga [1950] 1955:4). This should be linked to a similar employment of myths and rituals. The former are described as (Huizinga [1950] 1955):

[A] transformation or an ‘imagination’ of the outer world, only here the process is more elaborate and ornate than is the case with individual words. In myth, primitive man seeks to account for the world of phenomena by grounding it in the Divine. (pp. 4–5)

Of the latter, he writes:

[P]rimitive society performs its sacred rites, its sacrifices, consecrations and mysteries, all of which serve to guarantee the well-being of the world, in a spirit of pure play truly understood. (p. 5)

Extending the notion that in play humans ‘create a second, poetic world’, Huizinga ([1950] 1955:9–13) continues to define some of the formal and structural characteristics that apply to play. Play is ‘distinct from “ordinary” life both as to locality and duration’. It is ‘played out’ within ‘certain limits of time and place. It contains its own course and meaning’. While it is in progress, ‘all is movement, change, alternation, succession, association, separation’. In nearly all the ‘higher forms of play the elements of repetition and alternation’ are essential characteristics. ‘More striking than the limitation as to time is the limitation as to space’. Play takes place in a ‘consecrated spot’ within which ‘special rules obtain’. Here an ‘absolute and peculiar order reigns’. Play ‘creates order, is order. […] Play demands order absolute and supreme’. Words we use to ‘denote the elements of play’ include ‘tension, poise, balance, contrast, variation, solution, resolution, etc. Play casts a spell over us; it is “enchanting”, “captivating”’. Tension, specifically, means ‘uncertainty, chanciness’. Furthermore, ‘[a]ll play has its rules. They determine what “holds” in the temporary world circumscribed by play. […] Indeed, as soon as the rules are transgressed the whole play-world collapses’. ‘The exceptional and special position of play is most tellingly illustrated by the fact that it loves to surround itself with an air of secrecy’. Moreover, the ‘“differentness” and secrecy of play are most vividly expressed in “dressing up”. […] The disguised or masked individual “plays” another part, another being. He is another being’.

Many of these characteristics can already be observed when animals play, and it has to be assumed that they would have been present in early hominin behaviour as well, thus confirming the earlier statement that play reaches back to precultural and presymbolic levels of hominin evolution. The idea that play is ‘distinct from ordinary life both as to locality and duration’, that it creates a ‘second world’ and is played out within ‘certain limits of time and place’, and that if these limits are transgressed the ‘whole play-world collapses’, has to be regarded as a fundamental prerequisite for musicking. Describing musicking in this manner creates something very close to what Karl Popper (1968:430) would call a ‘structurally different world’, that is, a world with its own set of constraints and possibilities.36

Seeing musicking as being grounded in such a world allows for a plausible explanation of the emergence of some of music’s most important characteristics. First to be mentioned here is our capacity for entrainment, our (Tomlinson 2015):

[A]bility to perceive the regularity of an even or isochronous series of pulses, to predict the continuation of the series, and to coordinate our activities in advance to this continued regularity. (p. 77)

Such synchronisation to an externally given beat or movement is not only an individual activity, but it includes the capacity to coordinate our sense of time to that of others, as in walking together, marching or dancing. Karbusicky (1986) makes the important point that this ‘archaic’ phenomenon is one of the strands of musical behaviour – he calls it an archetype – that reaches back into animal behaviour – for example, when one animal starts running the whole herd follows, or when one starts playing the others join in – and is therefore of great importance for our understanding of ‘anthropogenesis’.37 Tomlinson (2015:79) locates the physiological aspects of this phenomenon in multiple neural oscillations ‘synchronizing themselves to oscillating sequences of external events’. It may well be that the neural networking for this capacity is already partly established in the womb in synchrony with the mother’s heartbeat in the immediate vicinity of the foetus.

Linking entrainment to the notion of play allows us to recognise how the synchronisation of our activities to an external regularity or set of constraints, as it is constitutive for musicking, is a prime example of entry into the ‘temporary world circumscribed by play’. Herein lies a plausible explanation for the (albeit much earlier) emergence of rhythm as the ultimately indispensable counterpart to pitch in musicking. It also deserves to be pointed out that ritual, which Tomlinson links to discrete pitch discrimination, and which frequently suggests the existence of a parallel world, is regarded by Huizinga ([1950] 1955:5, 158) as a form of play.38

Taking this train of thought a step further, it can be argued that many of the formal structures of musicking can be seen to have their roots in play. These structures complement those that are associated with pitch, and together they serve to create the systematicity required to make musicking the coherent form of semiosis that it is, giving it the ‘order’ of which Huizinga speaks. This lends support to Tomlinson’s dismissal of linguocentrism in the discussion of musicking’s emergence. Most of the formal structures that manifest themselves in time – in contrast to the more ‘spatially’ perceived parameters like pitch and timbre – can be traced back to play. These include what Erpf (1967:196), in his novel anthropological approach to form in music, lists as the most basic principles or patterns of musical form: repetition, variation and contrast.39 To these could be added secondary principles such as imitation, elaboration,40 recursion and symmetry.41 Needless to say, these formal structures or patterns presuppose the grouping of sounds into smaller or larger units of comprehension or Gestalten.

Patterns of repetition, variation (which includes ornamentation), contrast, imitation, symmetry, etc., can be grounded in the psychology of play as much as in the psychological phenomena of tension and release, invoked by Tomlinson (2015:143, 167) in his discussion of formal structures in musicking. Here he relies on Leonard Meyer’s theory of ‘binary bifurcations’, based on our emotional response to music.42 With regard to an evolutionary view on musicking, these ideas also have more explanatory power than the linguocentric approaches to music based on Lerdahl and Jackendoff (1983), as was pointed out in the section ‘A Darwinian perspective on musicking’.43

An argument could be made that the factors incorporated into musicking from the realm of play, thereby adding to its systematicity, may also have tended to strengthen the indexical nature of music that was forgone through the ‘winnowing’ of pitch from gesture-calls.

Describing music as a language, then, makes sense only if it is meant to point to the systematicity with which musical semiosis manifests itself. Karbusicky (1986:74) says that ‘the contents of our thoughts are by no means only symbols. Aural impressions that are imprinted in our memory are thought images, but which defy translation into symbols’.44 It is in this sense that music can be regarded as a way of knowing.


Conclusion

What has been said in this article can only be provisional. As new archaeological knowledge is unearthed and existing knowledge is reinterpreted, there should be much scope for revision and expansion. In the meantime, these considerations have important implications for musicology (and theology) today.

For one, engaging with the deep history of musicking is bound to have a sobering effect on many other aspects of contemporary musical scholarship, in that it foregrounds some of the most fundamental questions about music, questions that may have become submerged in the wake of what has become known as ‘new musicology’. For example, it places the much-maligned concept ‘the music itself’ back onto the agenda and creates new interest in its semiotic and structural constitution. The vast vistas which such a deep history opens also tend to dwarf some of the seemingly more immediate issues with which scholars wrestle today. What Yuval Harari (2015) writes in his equally expansive treatise on the ‘history of humankind’ can be made to have a bearing on musicology as well:

There are schools of thought and political movements that seek to purge human culture of imperialism, leaving behind what they claim is a pure, authentic civilisation, untainted by sin. These ideologies are at best naive; at worst they serve as disingenuous window-dressing for crude nationalism and bigotry. Perhaps you could make a case that some of the myriad cultures that emerged at the dawn of recorded history were pure, untouched by sin and unadulterated by other societies. But no culture since that dawn can reasonably make that claim, certainly no culture that exists now on earth. All human cultures are at least in part the legacy of empires and imperial civilisations, and no academic or political surgery can cut out the imperial legacies without killing the patient. (pp. 227–228)

Furthermore, our understanding of cosmology and evolution has developed exponentially in the last decade or two. Each new discovery seems to open up new areas of enquiry. As the work of Deacon has shown, a final answer to the phenomenon of consciousness, for example, still eludes us. And if a philosopher of science like Stephen Meyer is anything to go by, there are questions about the origin of life that resist a neo-Darwinian explanation as much as they resist the ‘God-of-the-gaps’ model (Meyer 2021:412–430, especially 424 and 426). Evidence of the fine-tuning of the universe raises questions about the ‘anthropic principle’ and how conscious life fits into such a universe (Meyer 2021:151–157).45 Even more to the point: how does musicking, as an ‘anthropic’ phenomenon, fit into this universe? Questions about truth and the ‘reliability of our cognitive faculties’ are at the forefront of these considerations; they have a direct bearing on our understanding of musicking as well (Meyer 2021:443).

In a debate with Rabbi Jonathan Sacks, the eminent atheist scholar Richard Dawkins, on considering the ‘aspiration’ of ‘science answering the “how questions” and religion answering the “why questions”’, remarks that ‘there is no reason to suppose that “why questions” have any legitimacy, any right to be asked or answered’.46 As this journal serves a theological constituency, the ‘why question’ should be permitted and it certainly has legitimacy. Viewpoints like those of Dawkins represent, in the words of Klaus Nürnberger (2010:82), ‘stark physical and biological reductionism’. Tomlinson (2015:34) calls it ‘adaptationist fundamentalism’. Dawkins’s reductionism (quoted from Nürnberger 2010) finds perfect expression in the following words:

The universe we observe has precisely the properties we should expect if there is, at the bottom, no design, no purpose, no evil and no good, nothing but blind, pitiless indifference … DNA neither knows nor cares. DNA just is. And we dance to its music. (p. 81)

Nürnberger (2010:82) responds: ‘[r]eductionism denies meaning and purpose not only for the impersonal infrastructure of human consciousness, but for reality as a whole’. A reductionist approach to human consciousness would also result in an impoverished view of the value of musicking in the biocultural coevolution of humans. The theologian Hans Küng (2007) challenges this kind of reductionism when he writes:

Thus if science is to remain faithful to its method, it may not extend its judgment beyond the horizon of experience. Neither the stubbornness of a skeptical ignorance nor the arrogance of those who always know better befits it. In some circumstances, can’t musicians, poets, artists, and religious people have an inkling of, glimpse, hear, see, and express in their works realities that burst physical space, the space of energy and time? (p. 52)

At the end of his book on science and religion, Küng (2007:157) writes: ‘God and world, God and human beings are thus not two competing finite causalities side by side, where one wins what the other loses; they are in each other’. These domains should therefore not be seen as hermetically sealed ‘parallel universes’ but as folded into each other: God not only as initial creator, demiurge, deistic mechanic or God of the gaps, but also as ever-present sustainer of the universe. And for the believer, music is able to straddle both these domains and transcend the ‘blind’, ‘indifferent’ and ‘pitiless’ dance of our DNA.

A more comprehensive discussion of music and evolution than is possible here should certainly dare to deal with these far-reaching issues. Why is there something like consciousness,47 why do humans practise symbolic thinking and engage in rituals and religion, all of which go far beyond their need for physical survival, and why do they engage in musicking and articulate their humanity by doing that? All are questions worth raising. Deacon (2012:544), representing a materialist position with respect to these questions, laments as the ‘most tragic feature of our age’ the fact that, where we have begun to ‘appreciate the vastness of the cosmos, the causal complexity of material processes, and the chemical machinery of life’, we have alienated ourselves from the ‘realm of value’. He adds (Deacon 2012):

By rethinking the frame of the natural sciences in a way that has the metaphysical sophistication to integrate the realm of absential phenomena as we experience them, I believe that we can chart an alternative route out of the current crisis of the age. […] The universe is larger than just that which we can see, and touch, or manipulate with our hands or our cyclotrons. (p. 544)

In doing so, we might just happen to ‘reinvent the sacred’, to use Stuart Kauffman’s (2008) words. And yes, pursuing these larger questions may also help us to stand in amazement once again and marvel at one of the most astounding, intangible and elusive phenomena in the universe: music.

Acknowledgements


The author wishes to thank Barry Ross for stimulating discussions on this topic over many years and Felicity Grové for assistance with the final editing of the manuscript.

Competing interests

The author declares that he has no financial or personal relationships that may have inappropriately influenced him in writing this article.

Author’s contributions

W.L. is the sole author of this article.

Ethical considerations

This article followed all ethical standards for research without direct contact with human or animal subjects.

Funding information

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

Data availability

Data sharing is not applicable to this article as no new data were created or analysed in this study.

Disclaimer


The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of any affiliated agency of the author.

References


Adorno, T.W., [1948] 1978, Philosophie der neuen Musik, Suhrkamp, Frankfurt am Main.

Blacking, J., 1976, How musical is man?, Faber and Faber, London.

Botha, R., 2008, ‘Prehistoric shell beads as a window on language evolution’, Language & Communication 28(3), 197–212. https://doi.org/10.1016/j.langcom.2007.05.002

Brentano, F., 1955, Psychologie vom empirischen Standpunkt, Erster Band, Felix Meiner, Hamburg.

Breyl, M., 2020, ‘Triangulating Neanderthal cognition: A tale of not seeing the forest for the trees’, WIREs Cognitive Science 12(2), e1545. https://doi.org/10.1002/wcs.1545

Brown, S., 2001, ‘The “musilanguage” model of music evolution’, in N. Wallin, B. Merker & S. Brown (eds.), The origins of music, pp. 271–300, MIT Press, Cambridge, MA.

Cross, I., 2003a, ‘Music and evolution: Consequences and causes’, Contemporary Music Review 22(3), 79–89. https://doi.org/10.1080/0749446032000150906

Cross, I., 2003b, ‘Music, cognition, culture, and evolution’, in I. Peretz & R. Zatorre (eds.), The cognitive neuroscience of music, pp. 42–56, Oxford University Press, Oxford.

Davies, S., 1994, Musical meaning and expression, Cornell University Press, Ithaca, NY.

Dawkins, R., 1987, The blind watchmaker: Why the evidence of evolution reveals a universe without design, Norton, New York, NY.

Deacon, T.W., 2012, Incomplete nature: How mind emerged from matter, W.W. Norton, New York, NY.

Dutton, D., 2009, The art instinct: Beauty, pleasure, and human evolution, Bloomsbury Press, New York, NY.

Erpf, H., 1967, Form und Struktur in der Musik, Schott, Mainz.

Hamer, D., 2004, The God gene: How faith is hardwired into our genes, Anchor, New York, NY.

Hanslick, E., 1980, Vom musikalisch Schönen: Ein Beitrag zur Revision der Ästhetik der Tonkunst (Reprint of the 20th edn.), Breitkopf & Härtel, Wiesbaden.

Harari, Y.N., 2015, Sapiens: A brief history of humankind, Penguin Random House, London.

Hart, D.B., 2013, The experience of God: Being, consciousness, bliss, Yale University Press, New Haven, CT.

Henshilwood, C., 2006, ‘Reading the artefacts: Gleaning language skills from the Middle Stone Age in Southern Africa’, keynote address, Cradle of language conference, Stellenbosch, 6th–10th November.

Huizinga, J., [1950] 1955, Homo ludens. A study of the play-element in culture, Beacon Press, Boston, MA.

Huron, D., 2003, ‘Is music an evolutionary adaptation?’, in I. Peretz & R. Zatorre (eds.), The cognitive neuroscience of music, pp. 57–75, Oxford University Press, Oxford.

Jacob, P., 2014, ‘Intentionality’, in E.N. Zalta (ed.), The Stanford Encyclopedia of Philosophy, viewed n.d., from http://plato.stanford.edu/archives/win2014/entries/intentionality/.

Karbusicky, V., 1986, Grundriß der musikalischen Semantik, Wissenschaftliche Buchgemeinschaft, Darmstadt.

Kauffman, S., 2008, Reinventing the sacred: A new view of science, reason and religion, Basic Books, New York, NY.

Kivy, P., 1980, The corded shell: Reflections on musical expression, Princeton University Press, Princeton, NJ.

Kivy, P., 1993, The fine art of repetition: Essays in the philosophy of music, Cambridge University Press, Cambridge.

Kivy, P., 2002, Introduction to a philosophy of music, Clarendon Press, Oxford.

Küng, H., 2007, The beginning of all things: Science and religion, Eerdmans Publishing Co., Grand Rapids, MI.

Lerdahl, F. & Jackendoff, R., 1983, A generative theory of tonal music, MIT Press, Cambridge, MA.

May, A., 2022, ‘Since when have humans had a soul?’, HTS Teologiese Studies/Theological Studies 78(2), a7311. https://doi.org/10.4102/hts.v78i2.7311

Meyer, L.B., 1956, Emotion and meaning in music, University of Chicago Press, Chicago, IL.

Meyer, S.C., 2021, Return of the God hypothesis: Three scientific discoveries that reveal the mind behind the universe, HarperCollins, New York, NY.

Mithen, S., 2005, The singing Neanderthals: The origin of music, language, mind and the body, Weidenfeld & Nicolson, London.

Morris, C.W., 1938, Foundations of the theory of signs. International Encyclopedia of Unified Science, vols. 1, 2, University of Chicago Press, Chicago, IL.

Nagel, T., 2012, Mind and cosmos: Why the materialist neo-Darwinian conception of nature is almost certainly false, Oxford University Press, Oxford.

Nürnberger, K., 2010, Richard Dawkins’ God delusion: A repentant refutation, Xlibris Corporation, Bloomington, IN.

Peretz, I., 2003, ‘Brain specialization for music: New evidence from congenital amusia’, in I. Peretz & R. Zatorre (eds.), The cognitive neuroscience of music, pp. 192–203, Oxford University Press, Oxford.

Pinker, S., 1997, How the mind works, W.W. Norton, New York, NY.

Popper, K., 1968, The logic of scientific discovery, Harper, New York, NY.

Richerson, P. & Boyd, R., 2005, Not by genes alone: How culture transformed human evolution, University of Chicago Press, Chicago, IL.

Roberts, A., 2011, Evolution: The human story, Dorling Kindersley, London.

Ross, B., 2008, ‘An assessment of language and music: Co-evolution theories in evolutionary musicology’, unpublished Master’s dissertation, Stellenbosch University.

Schneider, R., 1980, Semiotik der Musik: Darstellung und Kritik, Wilhelm Fink Verlag, Munich.

Scruton, R., 1999, The aesthetics of music, Oxford University Press, Oxford.

Suppan, W., 1984, Der musizierende Mensch: Eine Anthropologie der Musik, Schott, Mainz.

Tomlinson, G., 2015, A million years of music: The emergence of human modernity, Zone Books, New York, NY.

Tomlinson, G., 2018, Culture and the course of human evolution, The University of Chicago Press, Chicago, IL.

Turk, M., Turk, I., Dimkaroski, L., Blackwell, B., Horusitzky, F., Otte, M. et al., 2018, ‘The Mousterian Musical Instrument from the Divje babe I cave (Slovenia): Arguments on the Material Evidence for Neanderthal Musical Behaviour’, L’anthropologie 122 (2018), 679–706. https://doi.org/10.1016/j.anthro.2018.10.001

Wallin, N., 1991, Biomusicology: Neurophysiological, neuropsychological, and evolutionary perspectives on the origins and purposes of music, Pendragon Press, New York, NY.

Wallin, N., Merker, B. & Brown, S., 2001, The origins of music, MIT Press, Cambridge, MA.

White, R., 2007, ‘Systems of personal ornamentation in the early upper Palaeolithic: Methodological challenges and new observations’, in P. Mellars, K. Boyle, O. Bar-Yosef & C. Stringer (eds.), Rethinking the human revolution. New behavioural and biological perspectives on the origin and dispersal of modern humans, pp. 287–302, McDonald Institute for Archaeological Research, Cambridge.

Wurz, S., 2009, ‘The evolutionary origins of music’, unpublished Master’s dissertation, Stellenbosch University.

Footnotes


1. Steven Pinker (1997:528) even calls it an ‘enigma’.

2. Eduard Hanslick’s famous statement is impossible to translate. A close English wording would be ‘sonically moving forms’. See Hanslick (1980:59); the first edition of this treatise appeared in 1854.

3. Gary Tomlinson (2018:ix) calls these endeavours a ‘biological turn’.

4. Pinker (1997:529) also maintains that music is a ‘technology, not an adaptation’. See Gary Tomlinson’s (2018:46) critical response to Pinker’s ideas.

5. Similar to a language or ‘God gene’ (see Hamer 2004). This term also appeared on the cover of Time magazine, 25 October 2004.

6. Tomlinson (2018:23–25, 49–50) discusses ‘gene-centrism’ and the derived concept of ‘memes’, famously proposed by Richard Dawkins, in very critical terms, the latter offering a ‘reductivist view of human culture’ (p. 25).

7. I follow recent conventions in the spelling of this term.

8. The term metaphor seems to refer to what is called elsewhere in the literature the capacity for symbolic thinking or what I prefer to call the capacity for semiosis.

9. These ideas are very close to the notion of a grammar of music based on generative linguistics, as espoused by Fred Lerdahl and Ray Jackendoff (1983).

10. See Deacon (2012:46) and elsewhere.

11. More generally, Klaus Nürnberger (2010) writes: ‘[R]eductionism is today being replaced across the board with the theory of emergence …’ (p. 3).

12. As opposed to the concept of teleology.

13. In the limited space available, a full assessment of Deacon’s ideas is not possible, least of all of his constant recourse to the second law of thermodynamics and the related concept of entropy.

14. See Tomlinson’s (2018:108–111) discussion of this important matter.

15. See especially chapter 6: ‘Culture and genes coevolve’ (pp. 191–236).

16. This is a quotation from the work of Richerson and Boyd.

17. Space limitations do not allow any elaboration of these terms and processes.

18. Breyl (2020) suggests that the distinction made between anatomically modern humans (H. sapiens) and other hominins, especially Neandertals – based on the supposed cognitive inferiority of the latter – is far less tenable than earlier scholars have taken for granted. The capacity for language has traditionally been regarded as the crucial dividing point between H. sapiens and other hominins. However, Breyl (2020:11) states that ‘it should be noted that there is some physiological and genetic evidence to suggest that Neanderthals perceived and used speech sounds in a similar manner as modern humans. The auditory capacities seem to have been fine-tuned to a basically modern constitution by H. heidelbergensis […] and the Neanderthal hyoid bone is documented to be physiologically and biomechanically essentially modern-like, implicating the possibility of a roughly human speech capacity. […] Adding to this picture, the modern variant of the FOXP2 gene, which is, among a multitude of other functions […], linked to a normative development of language, has been shown to be present in both Neanderthals and Denisovans and therefore seems to date back to at least their last common ancestor with the lineage toward anatomically modern humans’. See also May (2022).

19. Whether signs with an essentially ‘musical’ quality were utilised by other hominin species would require an investigation of its own. In this context, it is important to reference a recent article by Turk et al. (2018), in which it is claimed that a bone flute found in the Divje babe I cave in Slovenia is evidence of Neandertal musicking: ‘Discussed in the archaeological context of the site and taking into the account new findings about the cultural achievements and cognitive abilities of Neanderthals, the find of the Mousterian musical instrument is no longer surprising. At present, the musical instrument from Divje babe I, which predates 50 ka, firmly supported with a Mousterian context and chronology, remains the strongest material evidence for Neanderthal musical behaviour. As such, it cannot be excluded from any scientific discussion regarding the origins of music’ (Turk et al. 2018:703).

20. See Botha (2008) for additional, partially critical thoughts about this claim.

21. Note that Breyl (2020:9) attributes this kind of behaviour to Neandertals already: ‘Neanderthals perforated and colored seashells, apparently to string them, and behaved in similar ways with beads, geodes as well as prepared feathers and talons, sometimes carving geometric patterns on the materials and featuring some of this behavior by at least 130,000 years ago’. Furthermore, he quotes some scholars who claim ‘some of the currently oldest known cave paintings to be of Neanderthal origin’, while others even go so far as to ‘suggest a Neanderthal number sense by presenting evidence of Neanderthal cord production, which implies knowledge of numbers and sets in order to pair and intertwine individual fibers into the cord’ (Breyl 2020:9).

22. Karbusicky (1986:3) quotes Charles Sanders Peirce as saying: ‘We have no power of thinking without signs’. Schneider (1980:13–14) expands on Morris’s claim that his ideas are of universal validity.

23. This rather obvious point was brought home forcefully by John Blacking (1976).

24. See Tomlinson’s (2018:60–62) discussion of Claude Shannon’s systematic exposition of the concept of information, as well as Deacon’s (2012:91, 469) reference to computation and vegetative ‘informational relationships’ within organisms, respectively.

25. The latter part of the sentence is cited from the work of Pierre Jacob (2014).

26. This translation is quoted from Deacon (2012:375). The original German wording reads: ‘Jedes psychische Phänomen ist durch das charakterisiert, was die Scholastiker des Mittelalters die intentionale (auch wohl mentale) Inexistenz eines Gegenstandes genannt haben, und was wir, obwohl mit nicht ganz unzweideutigen Ausdrücken, die Beziehung auf einen Inhalt, die Richtung auf ein Objekt (worunter hier nicht eine Realität zu verstehen ist), oder die immanente Gegenständlichkeit nennen würden. […] Diese intentionale Inexistenz ist den psychischen Phänomenen ausschließlich eigentümlich. Kein physisches Phänomen zeigt etwas Ähnliches. Und somit können wir die psychischen Phänomene definieren, indem wir sagen, sie seien solche Phänomene, welche intentional einen Gegenstand in sich enthalten’ (Brentano 1955:124–125).

27. Lack of space does not allow for any engagement with Hanslick’s ideas. It would, however, make for an interesting study to evaluate his views on music from the vantage of present-day scholarship on semiotics and the evolution of musicking.

28. Karbusicky (1986:63) points out that Gestalt theory has long since recognised that ‘musical shapes are memorised holistically’: ‘… die in der Gestalttheorie längst bekannte Tatsache, dass Musikgebilde als Ganzheit im Gedächtnis gespeichert werden’. (All translations are by the author, unless stated otherwise.)

29. Karbusicky (1986:7) is at pains to point out: ‘Das magische Moment der Tonerzeugung und -gestaltung […] bewirkt, dass tatsächlich schon jeder einzelne Ton als ein geheimnisvolles “Zeichen” erscheinen kann. Musik legt die magischen Wurzeln der Zeichen- und Bedeutungsbildung frei’. [The aspect of enchantment in sound generation and shaping causes every single note, in fact, to appear as a secret or hidden ‘sign’. Music exposes the genesis of signs and meaning as being rooted in enchantment.]

30. An interesting parallel to this idea can be found in Randall White’s discussion of Aurignacian ornaments: ‘[R]aw materials to be transformed into ornaments were not chosen for their ready availability […]. Materials used to communicate social value and identity, such as ivory, amber, lignite, mother of pearl, dental enamel and soapstone, share a common character: visual and tactile lustre. It is hard to escape the notion that this evocative qualitative characteristic must have been construed in some meaningful way within Aurignacian society’ (White 2007:299).

31. Other parameters of musicking, especially those associated with the dimension of time, will have retained a closer connection to their indexical roots.

32. He refers here to the work of Nicholas Conard and his colleagues (Tomlinson 2015:253). However, see also Turk et al. (2018) for their investigation into a supposedly Neandertal bone flute, as mentioned in footnote 19.

33. ‘Doch während die gesprochene Sprache aus einem zwischenmenschlichen Kommunikationsbedürfnis entstanden ist, scheinen die rhythmischen und melodischen Klangformen zunächst mit dem Ziel entwickelt worden zu sein, emotionale Zustände zu schaffen, die eine Kommunikation mit der unsichtbaren Welt, mit der Geisterwelt ermöglichen’.

34. David Huron (2003:63) agrees that utilising the voice for musicking predated the construction of musical instruments and goes back to approximately 150 000 years ago.

35. Compare these ideas with Richard Dawkins’s (1987) notion of the ‘blind watchmaker’.

36. It is telling that Popper illustrates these constraints and possibilities – he calls them natural laws – with examples taken from the realm of musical form.

37. This summarises Karbusicky’s (1986:7) ideas. He writes: ‘In der Auffassung der Musik als “Zeichensystem” wirkt sich auch die Magie der Töne aus. Musik ist kreativ wie rezeptiv eine der archaischsten Erscheinungen; die vergleichende Rekonstruktion ihrer Entwicklungsstadien hat darum eine große Bedeutung für die Erkenntnis der Anthropogenese […]. In ihr sind Archetypen, die ins tierische Verhalten hineinreichen, lebendig (vgl. das Mitgerissen-Sein mit der Bewegung anderer usw.)’.

38. The opposite also holds true: many kinds of play, like international or provincial rugby or football matches, exhibit ritual characteristics.

39. Karbusicky (1986:25) is in agreement with the notion that repetition and variation represent basic procedures of play.

40. Prolongation, spinning out or Fortspinnung would be alternative terms for this principle.

41. These patterns are also integral to dance, ‘the purest and most perfect form of play that exists’ (Huizinga [1950] 1955:164).

42. He refers here to Meyer’s classic study Emotion and Meaning in Music (1956).

43. As an aside, the analysis of Mozart’s G Minor Symphony KV 550 presented by Lerdahl and Jackendoff (1983:259), to cite one example, can be arrived at equally well by employing formal principles like repetition, variation, contrast and especially symmetry, as they are found in play.

44. ‘Unsere Denkinhalte sind keineswegs nur Symbole. Auch gespeicherte Höreindrücke sind Situationsbilder, die gedacht, aber nicht in Symbole umgesetzt sind’.

45. See Meyer’s arguments for and against a ‘weak’ and a ‘strong’ anthropic principle here. Küng (2007:147) is much more circumspect in his assessment of these questions.

46. Jonathan Sacks and Richard Dawkins at the BBC RE:Think festival, 12 September 2012, accessible on YouTube, between 6:14 and 7:01 (viewed 06 May 2022, from https://www.youtube.com/watch?v=roFdPHdhgKQ).

47. The natural scientist Stuart Kauffman (2008:177) writes: ‘Consciousness is the aspect of our humanity that is most obviously – and famously – incompatible with reductionism. […] How the mind is able to generate the array of meanings and doings it does is beyond current theory’.
