Integrating Music, Language and the Voice in Music Therapy

By Joanne Loewy

[Editorial note: Audio excerpts (a), (b), (d), and (e) have been previously published in: Loewy, Joanne (1995). Musical stages of speech, Audio excerpts #1, #4, #6, and #7. Music Therapy, vol 13(1). Retrieved from Music Therapy Research, CD-ROM I, 1964-1998. Silver Spring: American Music Therapy Association.
The audio excerpts are republished here with the kind permission from the American Music Therapy Association, Inc., 8455 Colesville Road, Suite 1000, Silver Spring, Maryland 20910, USA. Web:]


The use of spoken language is one of the most uniquely human parameters that differentiate one human being from another. The words we choose and the musical qualities that we use to express our words are based on a broad spectrum of functioning. This includes our brain and neurological functioning, our emotions and ego state, our intellectual and historic use of cognitive processes and the cultural realm of our existence.

Although there has been a fair amount of research and clinical practice related to neurological music therapy, speech production within a musical context and music psychotherapeutic voice work, we rarely find these practices intertwined. The following article provides history, rationale, definition of practice and theory that together provide strong backing for the integration of the models and approaches currently available to music therapists.


The most personal and uniquely musical presentation in everyday human expression is speech. With words, we selectively combine morphemes: deliberate pieces of content that we string together with adherence to the history of our use. This history includes our capacity to listen (receptive), our ability to express (expressive) and the environment in which we have grown accustomed to the use of our voice through words in discourse (culture). Anderson (1982) would add cognition, association and autonomous skills to this acquisition process. When we speak words, we are consciously and often unconsciously selecting dynamic, rhythmic and timbral elements to express our wishes, thoughts and responses to others in a way that is uniquely indicative of who we are.

In the text The Musical Mind, psychologist John Sloboda (1989) identifies seven striking links between language and music. It is useful to note these similarities as we ponder the notion that speech is part content and part expression in its basis. The thoughts we have require an inner search for a contextual description. Our constant quest to locate the precise word to describe a particular idea occurs at virtually every moment in which we are engaging in discourse. Subsequently, the tones and timbres we select influence the ideas we seek to communicate and strongly influence our listeners, providing revelations about our emotions and our truest intentions through the way we express our words. The underlying premise of this paper is that speech is both a cognitive and a musical process and that humans are engaged in this process at every juncture of moment-to-moment communication.

Sloboda (1989) points out that language and music are:

  1. "universal to all humans and specific to humans" meaning that we have a unique propensity to use both music and language.
  2. The ability to create "an unlimited number of novel sequences" using words/musical contours is indicative of characteristics that are part of both language and music.
  3. "Spontaneous speech and spontaneous singing develop within infants at approximately the same time." Sloboda makes a case for an analogous progression of acquiring "rules" of language and music.
  4. "The natural medium for both language and music is auditory-vocal."
  5. Music and language can both be written down and notated, meaning that "the use of visual symbols" is integral to both systems.
  6. "Receptive skills precede productive skills in the development of both language and music."
  7. In both language and music, there are distinct variances in cultural forms and in the context in which such forms are presented, which affect understanding of acquisition. (p.18)

Today, the similarities involved in the mechanisms of the brain in both language and music acquisition are of interest to the music therapist. Music therapists are deft at working with issues associated with language and voice dynamics and content development. As it is nearly impossible to separate the speaker from the speech, or the singer from the song in vocal discourse, the reader may attain the most holistic model of vocal expression by viewing distinct aspects of thought, speech and expression as a gestalt which contributes to emotion and meaning in human communication and function.

This article will consider the essence of language and communication by identifying, and suggesting a means for integrating, four existing models:

  1. Music Therapy in the Pre-linguistic Stages
  2. Music Therapy in a Developmental Context
  3. Music Therapy in Recovery
  4. Music Therapy and the Voice in Psychotherapeutic Function

In this paper, I will describe each model and provide supporting research and practice applications for music therapists. It is my intention to highlight specific rationale as well as a developing model of research and practice to support each model's context. Undoubtedly, the music therapist working within the domain of any one model would benefit from thinking about that model within a continuum of the other three. It is essential that we take aspects of each model to create a comprehensive way of treating patients. It is perhaps a flaw in our attempts to refine our particularly unique practices that we veer away from integrating strategies based on other, perhaps less familiar approaches. The emphasis of this article rests on the foundation that the mind (brain), the body-voice (physiologically and culturally) and the psyche (emotions) may become more inherently unified and ultimately achieve greater capacity for precise and intentional expression through the music of our speech and the discourse-repairs of language that are supported by mechanisms of music and principles of music therapy.

Music Therapy in the Pre-linguistic Stages

In 2001, Valgerður Jónsdóttir posited that infants develop musically and considered what the role of music might be for the developing child, emphasizing that musical development, the sensitivity for music, and the predispositions for processing musical experiences all commence with the onset of hearing in the second trimester of fetal life. Indeed, the whoosh and flow of the womb as well as the pulsing variations of mother's heartbeat are the first musical elements that the fetus adjusts to or "entrains" with (Rider, 1997). According to Hanus Papousek (1996), the most essential aspects of prosody occur during the preverbal period of infancy. Indeed, music is instrumental to the development of language.

As Jónsdóttir notes (2001):

If musical intelligence is a gift that emerges earlier than others as Gardner believes (1983:99) one should draw upon it to enhance the possibilities and educational opportunities of the infant.  Supported by research the sensory stimulation and the education provided by music reflect a basic human need.  Thus the role of music in children's life is perhaps that of forming a base for all other education.  Through affiliative interaction and our innate capacity to seek optimal inter-personal relationships we engage in the world musically.  We also develop our interpersonal intelligence musically.

Jónsdóttir makes a compelling case for music therapy in early intervention. At twenty-four weeks, a baby will respond to music "by blinking its eyes and moving as though dancing to a beat" (p. 148). After birth, infants will distinguish the typical melody contours of their native language while still unaware of the meaning of words (Mehler & Dupoux, 1992, cited in Mora, 2000, p. 148). In light of this evidence, Mora responds:

Discourse intonation, the ordering of pitched sounds made by a human voice, is the first thing we learn when we are acquiring a language. Later on, it is through interaction that a child picks up not only the musicality of each language, but also the necessary communication skills. (p. 149)

The first sound that is received is the cry the infant makes at birth. Interestingly, it is the pitch, tone and dynamic strength of the cry that are assessed within moments of emerging from the womb and serve as a basis for the APGAR score. This cry is the first developmental parameter that is evaluated and scored. It is a major indicator of health and physical strength.

Subsequently, the new infant learns that the cry will soon have his needs met; the hunger cry informs the parent that the infant needs to be fed. In contrast, the comfort sound reflects a feeling of contentment. The crying-comfort sounds are cues of expression. The pitch and dynamic qualities of the cry indicate the emotional state of the baby as well as the individual's unique needs.

As the baby develops, the cry becomes more refined and is part of an interactive dynamic that comes to represent a mood (as evidenced through a musical state). The sounds are more specific and can expand to indicate discomfort, fear, pain, restlessness. There is a growing body of mood states.

The growing child can imitate the rhythm and musical contours of language long before he can say the words. The musical aspects of language (tone, pauses, stress, and timbre) can be thought of as sonorous units (Stansell, 2002; Loewy, 1995) into which phonemes are later placed.

Loewy (1995) proposes that language should be considered not in a cognitive context, but within a musically developed domain. Written and audio case examples illustrated infants' and children's development of speech through four distinct musical stages. This work was based on Van Riper's postulation that the stages of speech present as follows: Stage I, crying/comfort sounds; Stage II, babbling, lulling; Stage III, inflected vocal play; and Stage IV, single and double word utterances.

Physicians can predict problems an infant may incur through musical aspects of the cry, as in 'the meningeal cry' or 'cri-du-chat' (Cat Cry syndrome). The musical encouragement of cooing sound production, which serves as a precursor to and predictor of speech, can enhance the baby's comfort in language. The infant's encouragement of movements may indicate a broad range of emotions. Rhythmic movement and music are part of a collaborative system that affects the vocal/motoric apparatus, which enhances communicative intent. As Darwin stated in the 19th century as part of his linguistic evolutionary theory:

The wants of an infant are at first made intelligible by instinctive cries, which after a time are modified in part unconsciously, and in part, as I believe, voluntarily as a means of communication, by the unconscious expression of the features, by gestures and in a marked manner by different intonations, lastly by words of a general nature invented by himself, then of a more precise nature imitated from those which he hears. (1877, p. 50)

The intonation contours within crying and babbling stages are of unique interest to the music therapist. How is it that these essential, inherently emerging communicative qualities are ignored in the music therapy literature? In 1968, Peter Ostwald et al. asserted that the infant cry is a "remarkably efficient sound signal, which indicates that the baby is hungry, cold, lonely, or in distress" (p. 51). In inflected vocal play, no words are involved. Communicative vocalizations are reinforced through the inherently musical elements of the cry. Music therapists often sing in a child's tonality, mirroring the baby's sounds, which in turn provides refuge in resolving cries of distress. Use of drums may help to encourage internal rhythm. Communication through musically inflected vocal play helps to connect the baby with her surrounding environment.

Through a technique entitled 'tonal vocal holding' the music therapist can provide the vocal means for a patient to explore sound, breath and voice, thus developing the capacity to release a vocal sound, which is the most primitive of vocal expressions. An example of the use of this technique can be heard in the audio excerpt (a), entitled 'Tonal Vocal Holding' (Loewy, 1995).

Once the crying comfort stage has been achieved and explored sufficiently, most infants soon move to babbling, which enables them to more playfully explore the musical or prosodic-feeling elements of speech, such as tone, timbre, silences and rhythm. Turn taking, or repartee in babbling, can provide the first, most essential basis for how a child is socialized into a community of communicative intention. An audio example of "Discrimination of Vowel Sounds Through Structure" (Loewy, 1995) can be found on audio excerpt (b). Whereas the words a child says can be represented with letters, the sound contours that still accompany the emotion behind the new linguistic production of vowels and phonemes are musical in nature.

Diane Austin (2002) has developed a technique called 'vocal holding' which will be described later in this paper under section 4: Music Therapy and the Voice in Psychotherapeutic Function. Like the author's tonal vocal holding, Austin's technique utilizes the voice of both client and therapist in grounding and mirroring. However, in Austin's technique the therapist is holding the voice to help the client integrate a fragmented ego, whereas the author's technique of tonal vocal holding is used to assist the baby, toddler or regressed patient to release a first sound, or to promote a crying sound.

"The comparisons between music and language are most evident in children while one system of structured communication gradually replaces another." (Stansell, 2002) The babble introduces the use of phonemes; small vowel elements with a consonant. This construction carefully placed within a musical phrase can enhance vocalizations and serve as a catalyst for the development of morphemes; the first, most elemental unit of speech, the word. Babies and toddlers are constantly synthesizing the music of speech from their surrounding environment. Audio excerpt (c) provides evidence of this. A mother and three year old daughter are working on a toy house -building project. Mother is holding her 18 month old son, who is holding a toy doll. The daughter asks: "where's the glue?" We hear the young son respond with phonemes: "bee, dee, gee). When there is not an immediate response, he moves from phonemes back to a more insistent crying sound: "ee, ee, ee" in the tonality of his sister's speech. When his mother asks him if he wants to put be put down, he responds "ma ma ma." When mom addresses the daughter's project again ("roof/house") we hear the phonemes become more rhythmic and productive: (ba, ba, ba). The developing baby is constantly influenced by the music of others' speech. It indirectly influences the progression from one stage to the next. Mi140Loewyexcerpt c by Voices-mt

Chen-Hafteck (1996) adopts a similar stance and draws together developmental research in music and language to support this position. Infants can distinguish meaningful sounds from background noises. They notice the sound qualities of direction, frequency, intensity, duration, tempo, intonation, pitch, and rhythm. The infant's perception of sound, known as the BAP or Basic Auditory Perception, is far greater than many had previously believed (Stansell, 2002, citing Steinke, Cuddy, & Holden, 1997):

It is difficult to describe which utterances are pre-musical and which are pre-linguistic. This closeness retains its importance throughout the language and music acquisition processes, and into the melodic and telegraphic speech stage. Language is the means for communicating content, while music and the music of language, or prosody, works in tandem to represent emotion.

As evidence of this closeness, Stansell (2002) notes the process that occurs when someone begins to cry while speaking. Prosodic features of air control, pacing, tone, and tenor become more exaggerated and emotions break through in musical representations while language regresses into babbling.

In practice, I have used the 'musical stages of speech' model to assess and expand the vocal sound range of a baby or toddler and have employed it with some mute children who have ceased to vocally express themselves as the result of a traumatic event. Working first with the cry, a child can be encouraged to release tension and, through supportive and reflective vocalizing, may work the voice into a 'cry' state. Babbling can become more active and diverse when framed within a musical context. Whereas speech therapy may seem akin to an educational task, music therapy can address specific emotions coupled with motoric and vowel utterance exploration within a musically playful context. The enhancement of motoric functioning is further developed through rhythmic play. Boswell and Vidret (1993) based their work with severely and profoundly disabled adults on this concept. Through music and eurhythmy, they were able to increase the vocal production of teenagers, integrating a vocal and rhythmic response.

The concept of basic beat as developed by Nordoff and Robbins has provided rationale for the idea that a person's sense of order (physically) can enhance the ability to communicate. The strengthening of a person's ability to sense and internalize rhythm seems to strengthen the capacity to produce sounds vocally. Speech and movement tend to develop synchronously. The audio excerpt (d) entitled "Organized Beating with the Child's Name" (Loewy, 1995) illustrates the enhancement of vocal and motoric play at the same time. When phonemes are sung during moments of rhythmic synchrony, vocalization is induced and a feeling of self seems to come into the body and mind of the music maker. Music self-actualization is realized and further acclaimed when the person hears his/her name sung within the rhythm.

Music Therapy in a Developmental Context

Don Michel (1974) noted that, in 1958, researchers at Florida State University observed "close relationships between speech therapy and music therapy." In 1968, Michel developed a chapter in E. Gaston's Music in Therapy text on Speech Habilitation of Cleft-Palate Children. Ewers and Helmus (1968) wrote in the same text about enhancing speech in many varied populations with speech disorders (p.159). These clinicians included winds in their work, which enhanced motor control of the lips and lung volume motivation. They also mentioned humming as useful in addressing specific speech delays.

Music therapist Kay Roskam (1979) showed the effectiveness of specifically planned music activities in their ability to expand auditory perception and to improve language skills in learning disabled children.

Music therapist Kate Gfeller (1985) has been instrumental to the field of music therapy and language development in hearing impaired, bilingual and disadvantaged patients. Gfeller (1986) showed that music could induce recall in learning disabled children. In 1983, she demonstrated that math fact recall was enhanced with musical mnemonic treatment, in comparison to simply using verbal rehearsal strategies (Gfeller, 1983). The most successful melodies were similar to ones with which the children were already familiar, as we will see in the model of music therapy in language recovery. Song, and in particular rhythmic cuing with song, can enhance cognitive processes. Neuroscientist Karl Pribram and composer Leonard Bernstein (in Pribram, 1984) are advocates of the use of song in assisting the development of language. Both Kodaly and Orff (Chen-Hafteck, 1996; Beaton, 1995) impressively implemented the use of folk songs in language acquisition, largely because of its foundation of cultural centricity. Carl Orff and Dalcroze combined music, speech, movement, and dance into developmental models of music education.

Darrow (1996), Loewy (1985) and Leung (1985) have shown the benefits of music and total communication in facilitating language development in deaf and/or autistic children. Leung points to the similarities of music and language, the "two sisters" (as cited in Stansell, 2002).

A study presented at the American Academy of Neurology's 53rd Annual Meeting in Philadelphia, Pennsylvania, in 2001 used a magnetic resonance imaging sequence to compare high resolution anatomical datasets of professional musicians and non-musicians. It reflected significant differences in the gray matter distribution between professional musicians trained at an early age and non-musicians. The musicians in the study had more relative gray matter volume in left and right primary sensorimotor regions, the left more than the right intraparietal sulcus region, the left basal ganglia region and the left posterior perisylvian region, with pronounced differences also seen in the cerebellum bilaterally. Results of this cross-sectional study may indicate use-dependent brain growth or structural plasticity of gray matter volume in response to such demands during a critical period of brain maturation. And yet, a second interpretation of the data could be that these musicians were born with these differences and were, as such, predisposed toward utilizing their musical talent. The researchers have posed the following question: Did the intense musical schedule that commenced at an early age influence the actual development of the brain, or were the musicians born with these differences? The results of this limited (15/15 male split), yet age and gender matched, study have led to the pursuit of the question in a larger sample base.

Schlaug notes that "musicians typically commence training at an early age, making them ideal subjects for this type of investigation," ("Musical Training", 2001). These presumed cerebral adaptations may not only lead to modifications of functional sensory and motor maps, but may also lead to structural adaptations within the sensorimotor system. Additional study may help to substantiate relationships between the effect of repeated motor training for a long period of time and structural changes in motor and non-motor related brain regions.

Jon Weatherford Stansell (2002) at the University of Illinois at Urbana-Champaign presented a compelling review on The Use of Music for Learning Languages. He focused on music's effect on mental ability and early physical development. Researchers over the last twenty years, he points out, have made critical inquiries in advancing theories of language acquisition. He cites scholars from the fields of neurology, linguistics, music therapy and music research and notes the similarities among methods that pedagogically combine language and music. He also reflected upon current neurological theory and the effect of music on the mind. Through a summary of scholarly inquiry on the use of music in learning languages, Stansell created a compelling rationale that supports the use of music to enhance the foundation of non-musical learning. His outstanding and comprehensive literature review concluded with the notion that music and language ought to be studied together, and, in fact, have been investigated historically in a co-joint forum:

As scientists dissect the functions of the brain, dividing up even musical abilities into disparate modules having different roles in the learning process, they disseminate knowledge about information processing that continues to show why this is the case. Activities of melody recognition, contour processing, timbre discrimination, rhythm, tonality, predictions, body movement, tactile involvement, and the sound, sight, and form of symbols, with their context in song, phrases, and rule structures are all common in the musical and language learning processes. (Stansell, 2002)

In his review, which covered a broad range of linguistic, neurological and musical research, he noted that investigators have attempted to define the music-language connection, with results that are consonant on many levels and show a high degree of thematic cohesion. Stansell places music in a supportive role in the language acquisition used to process and learn both cognitively and literally, as in the classroom. Stansell's research led him to believe that music "has proven to lower anxiety, increase motivation, promote interest, contribute to enjoyment, and stimulate the memory response. Music helps in all areas of language learning, effecting vocabulary learning, proper accent, and grammar, as well as encouraging cultural fluency."

Innate apparatuses of music are present in young children. On the basis of clinical investigation of normal and developmentally delayed children, Loewy (1995), along with other researchers, asserts that a child can imitate the rhythm and musical contours of discourse prior to the production of words. Nordoff and Robbins (1977), in the famous case of Edward, write about his vocal impulsiveness and how his singing-crying supported by music turned into "interesponsivity." In this musically supported vocal play, the "singing-crying apparently brought some order into his relationship with us." (1977, p. 34). In vocal repartee using phonemes, mirroring and the child's name, the listener can hear Edward being self-actualized in audio excerpt (e) (Nordoff and Robbins, audio-cited by Loewy, 1995) entitled "mirroring". After sessions with Paul Nordoff and Clive Robbins, the speech pathologist observed that Edward "could possibly accept a structured speech learning program." (p.35) In fact, within several months, the authors reported that Edward's vocal-speech production went from 22 to 120 words.

The infant has a keen perception of sound. Hearing is the first sense to develop in the fetus, at approximately 16 weeks (Schwartz, 1999), and infants have receptivity to music that was played for them in the womb. Orff and Kodaly both advocate for the use of heritage folk songs in the language acquisition context because of music's deep roots in one's culture. Silvia Nakkach, whose work we will read about in section 4: Music Therapy and the Voice in Psychotherapeutic Function, has likewise based her Vox Mundi model on the significance of indigenous chant and vocalizations, and on the unique contribution cultural-vocal entities play in the expansion and opening of the psyche.

Music Therapy in Recovery

Composer and scientist Robert Jourdain (2002) considers phrasing to be the closest parallel between language and music. "Phrasing divides long streams into comprehensible chunks" (p.275). Jourdain, in his text Music, the Brain, and Ecstasy, explains how current theory in neurology complicates the commonly believed premise that music is a right brain skill while language is a left brain function. Even so, he points out that right brain auditory areas favor a tonal analysis and left brain auditory areas seem to favor speech consonants. In reviewing our current knowledge and attempts to understand the brain's lateralization of speech reception and production mechanisms (aphasia, alexia, agnosia, and apraxia), Jourdain brings to the forefront neurology's most current quandary: amusia. Amusia is thought to be the least specific of all such syndromes as it is the most difficult to localize. Furthermore, whereas there tends to be a specific distinction between amelodia (loss of right brain melodic skill) and arrhythmia (loss of left brain rhythmic skill), "the cerebral setup for music is far more diverse and changeable than that for language" (p.292); musical capacity is less distinctly mapped. In comparing scores for 100 participants, 41 males and 59 females, on 8 music tests to scores from 6 standardized psychological tests of cognitive ability, Steinke, Cuddy, and Holden (1997) showed that musical intelligence may be distinctly separate from other thinking capabilities.

Perhaps the most distressing characteristic of many degenerative diseases is the apparent disintegration of the mind's ability to organize, retrieve, synthesize and produce language. Within the past 20 years, there have been many reports of music and rhythm being used in gait training to improve functioning in the elderly (Thaut and McIntosh, 1999) as well as in patients with hemiparetic stroke (Thaut, McIntosh, and Rice, 1997) and Parkinson's (Miller, Thaut, McIntosh and Rice, 1996); however, there are fewer reports on the use of music therapy in speech recovery.

According to Sparks and Holland (1976), melodic intonation therapy (MIT) may be effective for some patients with aphasia. Melodic intonation is based on three elements of spoken prosody: the melodic line, rhythm, and stress points of speech. In an intoned utterance, the vowel is lengthened, the rhythm and stress are exaggerated and the constantly varying pitch of speech is reduced and stylized into a pattern involving the constant pitch of several whole notes. A typical intoned utterance only varies by one whole note, much like chanting. MIT has mixed reviews in terms of its effectiveness ("Evidence is", 2003). It does not teach correct production of individual speech sounds. It is not designed to replace other therapy approaches but to supplement and augment them.

Melodic intonation therapy is a type of speech therapy. It consists of speaking with a simplified and exaggerated prosody, characterized by a melodic component (two notes, high and low) and a rhythmic component (two durations, long and short). Many aphasic patients do not respond to melodic intonation therapy. This includes those with global and transcortical aphasia and almost all with evidence of significant posterior language area involvement (Neurology, 1994; Belin et al., 1996). MIT is not music therapy, as it does not encompass the utilization of aesthetic aural organization. It is ordered according to melodic and rhythmic characteristics of speech itself, without particular attention drawn to dynamics, harmonic phrasing, sequencing, lyrical style, or relationship (collaborative play).

In his article Music therapy and neurological rehabilitation: Recognition and the performed body in an ecological niche, Aldridge (2001) writes about temporal coherence and memory, recognizing that music therapy in neurological rehabilitation offers an external sense of temporal coherence that is otherwise latent in the patient. He asserts that music therapists can break this down through the "dialogue of relationship." Aldridge discusses the work of Orange et al. (1988) and the fact that couples in the early and middle stages of neurological disease achieve success in resolving communication breakdowns despite declining cognitive, linguistic, and conversation abilities of the individuals with the disease. His perception that these couples can still communicate leads him to support the influence of music therapy on the progression of neuro-degenerative diseases on both conversational performance and communication:

For sufferers, we are that recovery; we are the context offering them an ecological niche. That is, we are communication and our responsibility lies in developing our own communicative abilities beyond the use of words. As living beings we perform in relationship with our milieu, a milieu that includes others. This ecological milieu is established through a dynamic mutual interaction. We establish communicative forms; these are inherently of time and musical. In this sense, memory is not the recovery of facts but the performance of reality linking events. What we experience as development is a creative act of improvising forms for being in time, this we experience physically as the body but psychologically as consciousness. (Aldridge, 2001)

Working alongside Oliver Sacks for two decades, pioneer Connie Tomaino has worked in the recovery model of language in music therapy by developing a retrieval model for those with memory impairment. She has used familiar songs to connect "seemingly lost parts of the personality by providing a necessary link to the 'self'" (Tomaino, 1999, p. 116). Tomaino's work extends the elements of music that others working in neurological music therapy have used, such as rhythm and cuing, to the 'self' that songs are associated with and their ability to create and connect images and memories of the past. Her model intertwines theories of voice-self (to be presented next) with data from neurological music therapy.

The following example depicts a music therapy assessment vignette of a man with Alzheimer's who has frontal lobe damage, including limited speech and a variety of obsessions unrelated to aspects of his current day life. At this point in the session, he spontaneously retrieves his high school alma mater. For a few moments, he and his wife sing the Trenton high school theme. For several minutes at various points in the session (this particular theme demonstrating the most intimate of moments), Mr. C and his wife seemingly relive a momentous time from their past (audio excerpt (f)). Music's ability to lodge and then unlock memories and verses from specific moments in time may be one of its most potent qualities in recovery potential. Sacks illustrated that the irregular EEGs of post-encephalitic Parkinson's patients became normalized when merely thinking of music (in Tomaino, 1999, p.119).

Speech pathologist Paul Fleming and music therapist Alice Rogers teamed up in 1981 to develop a music-speech protocol that aimed to enhance "volitional resumption of verbalization in brain injured patients" (Rogers & Fleming, 1981, p. 39). Music therapists Cohen and Masse (1993) studied the application of singing and rhythmic instruction to the rate of speech in neurologically impaired persons. They found that the singing group made the most progress. Prickett and Moore (1991) showed that music aided the memory of Alzheimer's patients.

Music's potential to aid in the recovery of memory and language mechanisms, specifically through song, seems significant. Although there are a host of models and methods that test the effectiveness of specific components of music, especially rhythm, it may be that the most universal means of understanding how language is recovered includes models that integrate aspects of the unique qualities of the human being within the context of a particular musical element. For instance, one person may be more receptive to melody, and another to harmony. Emphasizing songs with a focus on significant periods and events in time may lead to the fullest retrieval capacity. There is much more for us to understand through future research endeavors, in particular the blending of known lyrics and familiar melodies that may recover pre-existing neural pathways. Such research may be further expanded when we consider the emotional context and history of the patient's life world as revealed through a comprehensive psychotherapeutic analysis of the voice.

Music Therapy and the Voice in Psychotherapeutic Function

Thus far we have discussed aspects of language and speech presentation within a developmental and recovery context. Aspects of ego and emotion were alluded to, though not as yet presented within a psychotherapeutic context. In his influential text "Emotion and Meaning in Music", Leonard Meyer (1956) discusses theories of music and interpretation, noting that "Meanings become objectified only under conditions of consciousness and when reflection takes place." (p. 39)

In viewing the process of language, speech and discourse as a result and function of the developing ego, that is representative of the presentation of self in everyday life, one must consider the omnipotence of the performer and the overriding, unavoidable emotional aspect of personal inflection involved within efforts of communication. Indeed, the role of the speaker and the musical aspects of speech as a distinct aspect of ego functioning or lack thereof, is another area that has been distinctly under-investigated (or under reported) in music therapy. Three distinct models have been developed and used in therapy and in clinical training of therapists, in particular, music psychotherapists. They are described below:

The Embodied Voice Work model was developed by Lisa Sokolov about 20 years ago. The method seeks to identify the Self through the Body's musical journey. The exploration and resources are uncovered as the power within the process of finding and freeing one's voice is identified and unleashed. Through free vocal improvisation each person connects with the power of his/her own music, as well as the experience of listening and improvising with others. The tools of this work are breath, tone, touch, imagery and vocal improvisation. Through a series of developmentally sequenced exercises, these tools help to connect impulse and information in the body. The next phase is attained by moving into a process of exploration, awareness, release, strengthening and integration of the body, the voice and the self. Individual work evolves into duet and group work. The goal of the work is to embody the voice, to come more fully into one's body, one's sound, one's music and one's expressiveness. (Sokolov, in Bruscia, 1977, p.359)

According to Sokolov, the experience of singing places us in a "vital hub of what it is to be human." It brings us into central contact with many levels of ourselves. Like different octaves of the same note we exist on many levels. We are not just our physical bodies but also are made up of an energetic aspect, an emotional and thinking aspect and a knowing aspect. The process of becoming embodied, bringing our awareness and breath and sound into the body catalyses the completion of inner processes. (Sokolov, in press)

Beyond this complex layering of our individual beings, Sokolov asserts that we also live within social systems that can be understood as complex functional "bodies". We are a part of our relational body, our family "body", our community "body" and our cultural "body". Each of us plays an important role in these larger systems. Over time all of these evolve into, function and communicate as group "body" systems. Within the context of Embodied Voice Work, we understand physical or emotional symptoms, problems or creative blockages as messages and attempts at communication within a system. These "problems" are held wound up within us and thus contain their own potential energy and path towards their unwinding. Awareness leads into release; the symptoms contain within themselves their own solutions.

What is it to bring one's awareness and breath and sound from the entire body and being? What process ensues from this experience? How does what is experienced in singing inform life? Embodied Voice Work is about deep listening. It is about connecting into and sensing the body, where the voice resides. It is about giving voice to what is heard and felt:

Embodied Voice Work is a process of contacting impulse, hearing what you hear, feeling what you feel, and connecting and moving onto the river of non-verbal singing. What is it like to sing what is true in the moment, no words? Not to sing about something but to go from the center of your experience and give voice to it? To follow the inner call and let it sing you? (Sokolov, in press)

After working with Sokolov's model, individuals can expect to be more grounded in their bodies and be able to improvise and sing more freely and expressively. They will be more fluent in the language of music. According to Sokolov, their listening skills will be awakened both internally and externally. They will have come into a new relationship with their own process and expressiveness. This work can open individuals to a powerful experience of emotional, energetic and expressive aliveness. (, n.d.)

Sokolov has applied this work in the training of actors, dancers and musicians in workshops in Europe, the US and Canada, and in long-term training at the Experimental Theater Wing at New York University. This work is also used for training therapists and physicians, and in group and private music therapy contexts.

Music therapist Diane Austin has developed a method of vocal improvisation that is analytically-oriented. Specifically, her Vocal Holding techniques have assisted a broad range of clients, from normal neurotics to those who have suffered from sexual abuse or eating disorders. Austin has invited her clients to explore the wounded parts of the vocalizer and to emerge and receive support through explorative vocal improvisation. Her 'Vocal Holding' describes two alternating harmonic areas underpinning a vocal interaction between client and therapist. Through a fluid and responsive use of unison, harmonizing, 'mirroring' and 'grounding', the therapist helps the client to 'give trauma a voice' and integrate a 'fragmented self'. In group work, these techniques can be demonstrated by a tutor working with a volunteer, and then can gradually be opened into dyadic work which would be later processed through discussion.

Vocal holding techniques are introduced into the music psychotherapy session in a number of ways. Austin most often begins by breathing with the client. "Deep breathing is critical in focusing, relaxing, and grounding the client in his or her body. Breathing together begins the process of vocal attunement that continues as the therapist attempts to match the client's vocal quality, dynamics, tempo and phrasing." (Austin, 2002, p. 239)

Austin's current research helped to illuminate her intentions for using Vocal Holding Techniques as interventions in the clinical process:

  1. to build trust and create a positive mother transference
  2. to soothe and comfort clients
  3. to offer an experience of being seen and deeply listened to
  4. to encourage vocal play and spontaneity
  5. to work through resistance to feelings
  6. to create an opportunity for the client to undergo a therapeutic regression in order to re-experience and repair early developmental injuries
  7. to access unconscious feelings, images and associations
  8. to release feelings
  9. to lead into and out of free associative singing (vocal holding with words)

(Austin, 2003, pp. 218-219)

According to Austin, our voices resonate inward to help us connect to our body-selves and they resonate outward to others. The self is revealed through the sound and characteristics of the voice (Austin, 2003):

Singing can enable clients to reconnect with their essential nature by providing them with access to, and an outlet for, intense feelings. Singing offers a way for the disembodied spirit to incarnate because the way home can be pleasurable and the painful feelings can be put into an aesthetically pleasing form. Singing can provide clients with an opportunity to express the inexpressible, to give voice to the whole range of their personality. This includes sounds not taught in traditional vocal classes, screams, sobs, moans and more primitive forms of self-expression. Claiming and giving voice to these stifled, repressed sounds is reclaiming aspects of ourselves that have been silenced or inhibited in our families and/or society in general. (p. 211)

A third model, which focuses on vocal training in a multi-cultural context, is Silvia Nakkach's Vox Mundi model. Since 1988, Nakkach has pioneered work that has been integrated in an innovative curriculum of scientific vocal principles and applications through the Vox Mundi Project programs. In collaboration with an increasing number of trained professionals, she works to meet an international demand for seminars and clinical endeavors. In Vox Mundi, one explores the transformational qualities of sound by focusing on distinct forms of sacred chants, such as Afro-Brazilian chant, Amazon healing songs, and Indian ragas and mantras. Nakkach's model affords trainees the opportunity to trace the historical, philosophical, and cultural contexts from which these traditions of healing with the voice, rhythm, and movement arose. Furthermore, her model encourages participants to discuss the reasons for their preservation and continuous evolution.

Vox Mundi stems from a theoretical framework for understanding how these systems of sacred chant relate to and complete one another, and how each has the capacity to express distinct emotional and specific states of consciousness through the body, voice and through elements of nature.

Learners of this model experience and come to understand the transformative potential of using sound and the voice to access higher states of creativity and as part of a yoga practice. They learn to apply vocal meditation exercises, including a repertoire of Sanskrit mantras, Yoruba Orixas chants and Amazon healing songs. (Lirio, n.d.)


This article has provided theoretical rationale for the basis of integrating mechanisms of speech, language, discourse and music through music therapy. The case for utilizing music therapy throughout the lifespan, from the pre-linguistic period, as a developmental and recovery intervention and finally as a means to enhance ego functioning, has been highlighted through various researchers and methods of clinical practice. Perhaps the most essential recommendation of this article, beyond the strengthening cases provided throughout the literature of music, neurology, neuroscience and music therapy, is the power attained through the integration of the four models.

In presenting the work under specified models, I am seeking to encourage music therapists to be aware of the rationale and evidence-based practice that is a part of co-existing models, and which, in turn, has an impact upon their own particular model of comfort. It is essential for music therapists not only to become aware of the models, but also to realize that we may expand practice by integrating other models into our practices.

Understanding regression, for instance, in terms of a 'crying sound' in a patient working within a rehabilitation music therapy model, might expand the musical context for how the therapist might proceed. The therapy might extend into a babbling phonemic repartee and further develop through the institution of a chant or song from Vox Mundi, which might speak to the healing needs of the patient's family based on cultural influence.

Implementing intervallic play (harmony) after moments of unison (as in Austin's vocal holding) may help a patient with Parkinson's disease to feel more balanced and synthesized. Assisting patients with Alzheimer's disease in sentence recall, as Gfeller's (1983) use of music did for math recall, by setting the sentences to a song that is linked to a memory or specific period of time may induce imagery from that period and lead to vocal improvisation that frees the patient's feeling of play in the moment, inviting exploration as in Sokolov's Embodied Voice Work model.

As we begin to learn about aspects of music that enhance the acquisition of self as evidenced through song and speech production, we strengthen the potential of what we can expect and realize as we communicate and help others to collaborate within a music therapy speech-language-vocalization continuum. As we separate out specific mechanisms, to research our areas of choice and need in the field, we expand our treatment basis in compiling the data from many venues of physical, mental and ego functioning to hear the true essential voice and desires of our patients.


Aldridge, D. (2001). Music Therapy and Neurological Rehabilitation: Recognition and the Performed Body in an Ecological Niche [online]. Music Therapy Today, November 01. Retrieved February 24, 2004, from

Anderson, J. (1982). Acquisition of Cognitive Skill. Psychological Review, 89, 369-406.

Austin, D. (2003). When Words Sing and Music Speaks: A Qualitative Study Of In Depth Music Psychotherapy With Adults. Unpublished Doctoral Dissertation. New York University, New York.

Austin, D. (2002). The Voice of Trauma: A Wounded Healer's Perspective. In Sutton, J. (Ed.). Music, Music Therapy and Trauma: International Perspectives. London: Jessica Kingsley Publishers.

Austin, D. (2001). In Search of the Self: The Use of Vocal Holding Techniques with Adults Traumatized as Children. Music Therapy Perspectives, 19(1), 22-30.

Beaton, Patricia (1995). The Importance of Music in the Early Childhood Language Curriculum. International Schools Journal, 15(1), 28-38.

Belin, P., et al. (1996). Recovery From Non-fluent Aphasia After Melodic Intonation Therapy: A PET Study. Neurology, 47, 1504-1511.

Boswell, B. & Vidret, M. (1993). Rhythmic Movement and Music for Adolescents with Severe and Profound Disabilities. Music Therapy Perspectives, 11, 37-41.

Boukydis, Z. (1985). Infant Crying: Theoretical and Research Perspectives. New York: Plenum press.

Chen-Hafteck, L (1996). Music and Language Development in Early Childhood: Integrating Past Research in the two Domains. Early Child Development and Care, 130, 85-97.

Cohen, N. & Masse, R. (1993). Application of Singing and Rhythmic Instruction on the Rate of Speech of Neurologically Impaired Persons. Journal of Music Therapy, 30(2), 81-99.

Craik, F., and Lockhart, R. (1972). Levels of Processing: A Framework for Memory Research. Journal of Verbal Learning and Verbal Behavior, 11, 671-84.

Darrow, A. A. (1996, April). The Effect of two Selected Interventions on Preservice Music Teachers' Attitudes Toward the Inclusion of Students with Disabilities into the Music Classroom. Paper presented at the Music Educators National Conference, Kansas City, MO.

Darwin, C. (1877). A Biographical Sketch of an Infant. New York: Appalacian.

Evidence is a bit off-key. Claims for Melodic Intonation Therapy don't Ring True (2003). Bottomlines: Research Findings Informing Early Childhood Practices 1(8), 1. Retrieved February 25 from

Ewers, K & Helmus, K. (1968). Music Therapy for Speech Disorders. In Gaston, E. T. (Ed.). Music in Therapy (pp. 159-161). New York: MacMillan.

Gfeller, K. & Bauman, A. (1988). Assessment Procedures for Music Therapy with Hearing Impaired Children: Language Development. Journal of Music Therapy, 25(4), 192-206.

Gfeller, K. (1987). Songwriting as a Tool for Language and Reading Remediation. Music Therapy, 6(2), 28-38.

Gfeller, K. (1983). Musical Mnemonics as an Aid to Retention with Normal and Learning Disabled Students. Journal of Music Therapy, 20(4), 179-190.

Helfrich-Miller, K. (1994). A Clinical Perspective: Melodic Intonation Therapy for Developmental Apraxia. Clinics in Communication Disorders, 4(3).

Jónsdóttir, Valgerður (2001). Early Intervention as a Framework for Music Therapy with Caretakers and Their Special-Needs Infants. Unpublished thesis. Sogn og Fjordane University College, Sandane, Norway.

Jourdain, R. (2002). Music, the Brain, and Ecstasy. New York: Quill Press.

Leung, K. (1985). Enhancing the Speech and Language Development of Communicatively Disordered Children Through Music and Movement. Paper presented at the third Annual Convention of the Council for Exceptional Children, Anaheim, CA. Retrieved February 25, 2004, from ERIC Reproduction Service Document No. ED257282.

Lirio, Alba (Ed.) (n.d.). Vox Mundi Project. Retrieved February 24 from

(n.d.) (online). Retrieved February 24, 2004 from

Loewy, J. V. (1995). The Musical Stages of Speech: A Developmental Model of Pre-verbal Sound Making. Music Therapy, 13(1), 47-73.

Loewy, J. V. (1985). Musical Sign in Autism: A Multi-Sensory Approach. Masters Thesis. New York: New York University.

Mehler, J. & Dupoux, E. (1997). What Infants Know (pp. 39-47; 174-181). Oxford: Blackwell Publishers.

Melodic Intonation Therapy (1994). Report of the Therapeutic and Technology Assessment Subcommittee of the American Academy of Neurology. Neurology, 44, 566-568.

Meyer, L. (1956). Emotion and Meaning in Music. Chicago: University of Chicago Press.

Michel, D. & May, N. (1974). The Development of Music Therapy Procedures with Speech and Language Disorders. Journal of Music Therapy, 11, 74-80.

Michel, D. (1968). Music Therapy in Speech Habilitation for Cleft-palate Children. In Gaston, E. T. (Ed.). Music in Therapy (pp. 162-166), New York: MacMillan.

Miller, R., Thaut, M., McIntosh, G. & Rice, R. (1996). Components of EMG Symmetry and Variability in Patients with Parkinson's Disease. Journal of Neurology, Neurosurgery, and Psychiatry, 62, 122-126.

Musical Training During Childhood May Influence Regional Brain Growth (2001). Science Daily, May 11, 2001. Retrieved February 25, 2004, from

Mora, C. (2000). Foreign Language Acquisition and Melody Singing. ELT Journal, 54(2), 146-152.

Nordoff, P & Robbins, C (1977). Creative Music Therapy. New York: John Day Company.

Orange, J., vanGenapp, K., Miller, L. & Johnson, A. (1998). Resolution of Communication Breakdown in Dementia of the Alzheimer's Type: A Longitudinal Study. Journal of Applied Communication Research, 26(1), 120-138.

Ostwald, P., Phibbs, R. & Fox, S. (1968). Diagnostic Use of Infant Cry. Biology of the Neonate, 13, 68-82.

Palmer, Caroline, & Kelly, Michael (1992). Linguistic Prosody and Musical Meter in Song. Journal of Memory and Language, 31, 525-541.

Prickett, C. & Moore, R. (1991). The Use of Music to Aid Memory of Alzheimer's Patients. Journal of Music Therapy, 27(2), 101-110.

Pribram, K. (1984). Prolegomena for a Theory of Meaning in Music. In Clynes, Manfred (Ed.). Brain Mechanisms in Music (pp. 21-35). Englewood, NJ: Plenum Press.

Rogers, A. & Fleming, P. (1981). Rhythm and Melody for Speech Therapy in the Neurologically Impaired. Music Therapy, 1(1), 33-39.

Roskam, K. (1979). Music Therapy as an Aid for Increasing Auditory Awareness and Improving Reading Skill. Journal of Music Therapy, 16, 31-42.

Sacks, O. (1995). An Anthropologist on Mars. New York: Alfred Knopf.

Schwartz, F. (1999). Music and Sound Affect on Perinatal Brain Development and the Premature Baby. In Loewy, J. (Ed.). Music Therapy in the Neonatal Intensive Care Unit (pp. 9-15). New York: Satchnote Press.

Sloboda, J. (1989). The Musical Mind: The Cognitive Psychology of Music. New York: Oxford University Press.

Sokolov, L. (2002). Embodied Voice Work: Voice through the Body to the Self (in press). Phoenixville, PA: Barcelona Publishers.

Sparks R.W. and Holland A. L. (1976). Melodic Intonation Therapy for Aphasia. Journal of Speech and Hearing Disorders 41, 298-300.

Stansell, J. W. (2001). The Use of Music in Learning Languages: A Review [online]. University of Illinois at Urbana-Champaign. Retrieved February 24, 2004, from

Steinke, W.R, Cuddy, L. L., & Holden, R. R. (1997). Dissociation of Musical Tonality and Pitch Memory from Nonmusical Cognitive Abilities. Canadian Journal of Experimental Psychology, 51(4), 316-334.

Thaut, M. & McIntosh, G. (1999). Music Therapy and Mobility Training with the Elderly: A Review of Current Research. Care Management Journal, 1, 71-74.

Thaut, M., McIntosh, G & Rice, R (1997). Rhythmic Facilitation of Gait Training in Hemiparetic Stroke Rehabilitation. Journal of Neurological Sciences. 151, 207-212.

Tomaino, C. (1999). Active Music Therapy Approaches for Neurologically Impaired Patients. In Dileo, C. (Ed). Music Therapy and Medicine (pp. 115-122). Silver Spring, MD: American Music Therapy Association.


Voices: A World Forum for Music Therapy (ISSN 1504-1611)