1. Introduction
This text is a summary of my doctoral thesis entitled "Sound-Image Relation", defended in May 2004 at the Federal University of the State of Rio de Janeiro (UNIRIO) and available for consultation at the University Library and at the National Library (RJ), Brazil.

The main goal of this research was to highlight a new genre of musical language that we called Music Video. The starting point was to investigate the increasing use of visual language in the practice of electroacoustic music composition. This study addresses the works of Brazilian composers living in Rio de Janeiro, although this musical genre is also being developed in other communities, in Brazil and abroad. Although Music Video results from the interaction between two languages, we treat it as a genre practiced by composers. Therefore, in order to study it, we borrowed some concepts from music theory and practice in a given context. The challenge was to review such concepts in a more comprehensive approach, beyond their sonic nature, investigating, for example, how images were present in, or interfered with, the various forms of sound construction that historically preceded Music Video. Our idea was to show that music has always been an audiovisual spectacle, and that images have always been part of the practice, diffusion and understanding of music in its three major historical/cultural manifestations: oral tradition music, written score music and electroacoustic music. We believe that other modes of expression end up being incorporated into every manifestation of artistic creation, whether implicitly or explicitly, even when this does not result from the artist/creator's program.

2. Visualization in sonic language
(seeing/hearing → hearing/seeing)
In early musical manifestations it was not possible to hear music without seeing the musician. Listening to music included looking at scenery and costumes, as well as all the mishaps involved in real-time perception: sounds, images and smells of the environment, tactile sensations etc., all interfering in the act of listening. In a context such as vocal/instrumental music we can state that music is a practice of seeing/hearing, in which the visibility of the sound sources and of the instrumental gesture anticipates, even if unconsciously, some aspects of the act of listening. Until the mid-twentieth century, this way of presenting music dominated both the written and the oral traditions.

In 1948, the advent of musique concrète (music recorded on a sound support) breaks with the audiovisual spectacle of the live concert. From the standpoint of perception, two abrupt changes take place:

1) The elimination of the link between sound and its source, until then explicit in both sensory aspects, visual and auditory: performers and instruments are replaced by a system of sound transmission; on stage, we can see only loudspeakers, capable of transmitting sounds of any origin.

2) The possibility of creating virtual images in the listener's mind as a result of the various changes in musical practice. For example, the use of natural sounds in the composition adds a sense of mimesis to musical language, as Emmerson comments in his article "The Relation of Language to Materials". Describing this "image" perception, he says:

“The term 'image' may be interpreted as lying somewhere between true synaesthesia with visual image and a more ambiguous complex of auditory, visual and emotional stimuli. We are concerned here not with how specific sources may evoke particular images but with how the imagery evoked interacts with more abstract aspects of musical composition”.

Thus, the practice of audiovisual concerts was broken up, reversing, so to speak, the perceptual process into hearing/seeing: it is the listener's mind that will evoke "images" related to the sounds used in the composition.

By the end of the twentieth century, the emergence of sound and image digitization techniques creates a new relationship between the aural and visual languages, opening a wide field of experimentation for the genres that comprise multimedia. Our point of view is that Music Video, as one of these genres, is represented by the sum of both stages of perception: seeing/hearing and hearing/seeing, as we will comment below.

Acoustic Music
Sound source visible
Vision/Audition (real time)

Acousmatic Music
Digital Support
Sound Source invisible
Audition (real time)/Vision (virtual)

Visual chart applied to Musical Concepts and Practices

3. The audiovisual
We did not find any specific literature on the Music Video genre as an audiovisual practice. The limited literature available at that time consisted of isolated texts or book chapters devoted to film or multimedia. Among these, two authors were of extreme importance for the conceptualization and analysis of Music Video: Nicholas Cook and Michel Chion, in Analysing Musical Multimedia (2000) and L'Audio-Vision (2002), respectively.

In the preface to his book, Cook rightly points to the absence of a theoretical basis for the analysis of multimedia genres. He notes that, until then, theories and criticism for the analysis of multimedia genres used essentially isolated criteria. In the book's title, Analysing Musical Multimedia, the term "musical" refers to a methodological orientation that starts from a musical approach in its relation to other media. In other words, he tries to expand the boundaries of music theory by mapping its limits with moving words, gestures or images, a premise that matches our research.

In his book, Chion conceptualises audiovisual relations in film, but in the end he outlines an analysis procedure to be applied to all genres comprised by multimedia.

Both authors, albeit by different paths, reach common ground: the place where these new artistic events face human perception, questioning the individual perception of the senses and their meanings. As Cook says, “The truth is that music is booming: but it is booming outside music theory”.

4. Methodological procedures
Chapter II of the thesis is devoted to a description of the modalities of sound/image relations found in the genre of Music Video, according to the conception, diffusion and perception of different interactions. To reach this goal, besides the developments of our own research, we used statements by composers whose work served to define our models. Their statements, which illustrate and reinforce the classification, were responses to a questionnaire intended to define the main issues involved in the creation of Music Video.

In the book Le Son (1998) Chion devotes chapter 10 – entitled Le couplage audio-visuel – to the demonstration of a series of concepts developed from the study of sound/image relations. Among them, we highlight the concept he calls “audio-vision et visu-audition”. "Audio-vision" means a kind of perception assigned to film and television, in which listening influences vision, in which sound at every moment incorporates a series of effects, of feelings, of meanings which, due to a projection phenomenon, are assigned to the image. Therefore, since a sound adds meaning to an image, this effect seems to emanate from the image itself; or, in other words, the sound remains implied in the image. Likewise, the term “visu-audition” applies to a type of perception that is consciously focused on the audible, as in a concert, where the act of hearing is accompanied, reinforced, transformed by the influence of a visual context.

If we transpose these concepts to our scheme of visualisation in sonic language, we notice that our first postulate, seeing/hearing, the stage that represents the practice of concert music, corresponds exactly to the criterion named "visu-audition"; and the second, hearing/seeing, or acousmatic perception, is a kind of “audio-vision” in which listening stimulates vision. In acousmatic music, the image is present neither in reality nor on the support but in the mind of the listener, stimulated by sound. Reinforcing what we said earlier, the sum of these perceptions seems to materialise in Music Video, in which image and sound can stimulate each other.

At this point, we can observe three basic features that distinguish Music Video:

1) The use of technology: digitization enables the manipulation of image and sound on the same support.
2) The use of two different languages conducting the creative process: with few exceptions, there are not only two languages but also two different artists interacting, each with their own styles, techniques etc.
3) The autonomous use of music: most of these works can be presented as pure music, even if the music then takes on a different meaning.

Just like electroacoustic music, Music Video may be presented primarily in three ways:

1 - Sound and image finalized on one support (such as acousmatic music)
2 - Sound and image on different supports, mixed with performances in real time (such as mixed music)
3 - Sound and image processing in real time (such as live electronic music)

So we can say that, although Music Video descends directly from electroacoustic music, it represents a third stage of the “visualization in sonic language”, generating a new form of expression – since it demands new technologies, diffusions and perceptions.

5. Models and modalities
Thus, from the works discussed, we arrived at an initial classification of the genre, including 5 modalities combining 12 models, from the traditional instrumental gesture to the newest interfaces for real-time interaction.

Just to illustrate, here is the chart of the first modality. The charts are presented in this order: modality, description, model description, work that originated the model, music composer, image author, equivalence in the category of electroacoustic music.

When image becomes the score, conducting the sonic construction process

Model I
The image inspires sonic materials
Di-Stances (1982)
Music: Vania Dantas Leite
Image: Paulo Garcez (Example 1)
Sound and image on support (digital version, image editing by Rara Dias, 2002)
Category: acousmatic music

Model II
The image leads to the building of music structures and forms
Dueto I + 1 (1978)
Music: Rodolfo Caesar / Vania D. Leite
Image: Milton Machado (Example 2)
Sound support, piano (traditional score) and the projection of design/score during the performance (2002 version)
Category: mixed music

Di-Stances (Model I)
This work was composed in 1982 as guest composer, at the invitation of Leo Kupper, studio director, at the Studio de Recherches Electroniques Auditives, Brussels, Belgium.

The title of the composition means "two-movements" in Greek, and describes the interaction between the movement in the drawing and in music composition.

Based on a series of 17 pages with drawings on music staves, research began on the creation of sound materials that would express as music the images drawn by Paulo Garcez: contours, textures, superimposition of figures, colours etc. Example 1 shows one of these 17 pages.

Once the materials had been prepared and mixed, they became a tape piece. Only electronic sound sources were used for the composition and the processing of the materials:

OBERHEIM OB-SX analogue polyphonic synthesiser

PUBLISON synthesiser/computer (Infernal Machine)

Analogue filters

Direct and bip-synchronised mixing on Nagra-Serion stereo recorders (6 channels)

Originally, this work was performed alongside the projection, on a large screen, of the 17 slides of the drawings, as if following the score. In 2002, the work gained a DVD version with animated drawings.


Example 1: one of the pages with drawings on music staves


"Duet 1 + I, for Extremely Attentive Performers Isolated from Each Other" is based on a 1978 drawing by Milton Machado, in watercolour on paper. In 1982, a first musical performance took place, based on a version redrawn with felt-tip pen. The score was digitized in 2002, leading to a second musical version, performed by Rodolfo Caesar (sound support) and Vania D. Leite (piano), which is used here to illustrate Model II.

The original drawing-score is structured in 45 bars arranged in 5 rows of 9 columns. Each bar adds one more sound event (+ 1) to those of the preceding bar (I + 1). Thus, as a result of this progressive accumulation, in the last bar we should hear 45 events performed by each musician.

Another parameter of the score is dictated by the choice of 12 sounds, or sound events, that correspond to the 12 colours used by Machado.
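
The accumulation rule and the twelve-sound palette described above can be sketched in a few lines of Python. This is only an illustrative model, not the score itself: the text does not state in which order the twelve sound events enter, so the sketch simply numbers them 0 to 11 and cycles through them. The names `build_score`, `ROWS`, `COLS` and `N_SOUNDS` are our own assumptions.

```python
# Sketch of the cumulative bar structure of the drawing-score.
# Assumption: the order in which the 12 sound events enter is not given
# in the text, so events are numbered 0-11 and cycled (hypothetical order).

ROWS, COLS = 5, 9   # layout of the drawing-score: 5 rows of 9 bars
N_SOUNDS = 12       # one sound event per colour used in the drawing


def build_score(rows=ROWS, cols=COLS, n_sounds=N_SOUNDS):
    """Return a rows x cols grid in which bar n (1-based) holds n events."""
    grid = []
    bar_number = 0
    for _ in range(rows):
        row = []
        for _ in range(cols):
            bar_number += 1
            # Bar n keeps the events of bar n-1 and adds one more.
            row.append([i % n_sounds for i in range(bar_number)])
        grid.append(row)
    return grid


score = build_score()
bars = [bar for row in score for bar in row]
print(len(bars))                   # 45 bars in total
print(len(bars[-1]))               # 45 events in the last bar
print(sum(len(b) for b in bars))   # 1035 events per performer overall
```

Under these assumptions, each performer plays 1 + 2 + ... + 45 = 1035 events over the whole piece.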


Example 2: digital drawing

In the conception of the piano score, for example, a series of twelve notes was built, each one corresponding to a colour:


Figure 1

These notes accumulate in the same order as in the drawing, as shown in Figure 2.


Figure 2

6. Final considerations
The musical genre we are pointing to is just one more among the artistic events of our days, born of the technological explosion that makes available to the artist not only new tools but also the real possibility of overlapping different languages: multimedia.

There are certainly many more Music Video models than those we examined in this study. From the video clip to the mixed media installation, modalities have been unfolding, and every day new horizons appear in the mixture of languages.

By May 2004, the date of examination of this thesis, we had not found any literature or reference on the subject with the specific intent adopted here. We believe we have taken a first step towards the study of the genre, contributing to the foundation of its theoretical and analytical bases. It was mainly for this purpose, to rethink the relationship between music, media and technology, that we immersed ourselves in this work.


