AdA Filmontology
– Levels, Types, Values
English version
Version 1.0 / July 2021
by Thomas Scherer, Jasper Stratil, Yvonne Pfeilschifter, Rebecca Zorko, Anton
Buzal, João Pedro Prado, Henning Agt-Rickauer, Christian Hentschel, Harald Sack,
Matthias Grotkopp, and Jan-Hendrik Bakels
Junior research group “audio-visual rhetorics of affect”, Freie Universität Berlin /
Hasso-Plattner Institut, Potsdam. Funded by the Federal Ministry of Education and
Research.
© 2021. This work is licensed under CC BY-SA 3.0 License.
Table of Contents
Preface
Preliminary Remarks AdA Filmontology
  What is the AdA Filmontology?
  Where can I use it?
  Where can I get other resources? (OWL; Template; Existing Datasets)
  What is the purpose of this document?
Labels & Descriptions
Acoustics
  ∎Volume
  ∎Dialogue Emotion
  Dialogue Type
  Dominant Acoustic Level
  ∎Sound Gesture Dynamics
  Dialogue Voice Quality
  Dialogue Intensity
  Dialogue Tonality
  Sound Segment
  Sound Gesture Description
  Music Piece
  Music Part
  Music Gesture
  Music Arrangement
  ∎Music Mood
  Music Intensity
  Music Tonality
  Music Figure Patterning
  Music Figure
  Music Accent
Bodily Expressivity
  ∎Body Language Emotion
  ∎Body Language Intensity
  Facial Expressions Emotion
  Gestures Emotion
  Facial Expressions Intensity
  Gestures Intensity
Camera
  ∎Recording Playback Speed
  Depth of Field
  Defocus
  Camera Movement Unit
  Camera Movement Type
  Camera Movement Speed
  ∎Camera Movement Direction
  ∎Camera Angle
  Camera Angle Canted
  Camera Angle Vertical Position
  Lens
Image Composition
  ∎Field Size
  ∎Image Brightness
  Light Contrast
  Colour Design
  Colour Composition
  Colour Saturation
  ∎Colour Range
  Colour Accent
  Texture
  Animation
  Visual Pattern
  Aspect Ratio
  ∎Image Intrinsic Movement
  ∎Dominant Movement Direction
  Movement Impression
  Splitscreen Number
  Splitscreen Shape
  Splitscreen Dynamics
  Frame in Frame
  Spatial Arrangement
Language
  ∎Dialogue Text
  Text Diegetic
  Text Nondiegetic
  Text Aesthetics
Montage
  ∎Shot
  ∎Shot Duration
  ∎Montage Figure Macro
  Montage Figure Micro
  Montage Rhythm
  Shot Transition
  Image Split
  Dynamics Of Space
  ∎Found Footage
Motifs
  ∎Setting
  ∎Image Content
  Important Persons
  Keywords
  Costume
Segmentation
  Expressive Movement
  Argumentation Unit
  Scene
Preface
Preliminary Remarks AdA Filmontology
What is the AdA Filmontology?
The AdA Filmontology is a systematic vocabulary and data model of film-analytical terms and
concepts for fine-grained semantic video annotations.
The vocabulary was developed in close collaboration between film scholars and computer
scientists as a tool for digital film studies. Its goal is to provide a standardized and systematic
basis for the joint annotation of audio-visual corpora to enable comparable, systematic film
analyses. The vocabulary is grounded in a methodological film-analytical consensus and is
made available as a machine-readable OWL ontology to publish annotations as Linked Open
Data for the exchange and comparison of analysis data.
Image Credit: Screenshot of our ontology visualisation tool.
The AdA Filmontology v1.0 currently consists of 8 annotation levels, 78 annotation types and
501 annotation values. Each level, type and value has a unique resource identifier (URI), an
English and German name, and an English and German description. In addition, types are
assigned colour codes for better differentiation in annotation software tools.
For example, the predefined value crescendo of the annotation type MusicAccent is defined
as follows:
URI | https://ada.cinepoetics.org/resource/2021/05/19/AnnotationValue/MusicAccent_crescendo.html
Label (de) | anschwellend
Label (en) | crescendo
Description (de) | Deutliche Intensivierung der Musik, z.B. durch ansteigende Lautstärke oder ansteigende Tonhöhe.
Description (en) | Noticeable intensification of the music, e.g. through increasing volume or rising pitch.
Belongs to Type | MusicAccent
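As an illustration, the fields of the entry above could be held in a small record. This is only a sketch: the class name AnnotationValue, its attribute names, and the helper label() are hypothetical and not part of the ontology or of Advene; only the field contents are taken from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationValue:
    """Hypothetical record mirroring the fields listed for each value."""
    uri: str
    label_de: str
    label_en: str
    description_de: str
    description_en: str
    belongs_to_type: str


# Field contents copied from the crescendo example.
crescendo = AnnotationValue(
    uri=("https://ada.cinepoetics.org/resource/2021/05/19/"
         "AnnotationValue/MusicAccent_crescendo.html"),
    label_de="anschwellend",
    label_en="crescendo",
    description_de=("Deutliche Intensivierung der Musik, z.B. durch "
                    "ansteigende Lautstärke oder ansteigende Tonhöhe."),
    description_en=("Noticeable intensification of the music, e.g. through "
                    "increasing volume or rising pitch."),
    belongs_to_type="MusicAccent",
)


def label(value: AnnotationValue, lang: str = "en") -> str:
    """Return the German label for lang='de', otherwise the English one."""
    return value.label_de if lang == "de" else value.label_en
```

Keeping both languages on one record reflects that every level, type, and value carries an English and a German name and description under a single URI.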
Where can I use it?
The AdA Filmontology is a semantic data structure with concrete application in the video
annotation software Advene. Through a specific template the ontology can be imported as a
predefined analysis vocabulary. The annotations based on this template can then be exported
as linked open data.
Where can I get other resources? (OWL; Template; Existing Datasets)
We provide a browsable online version of the AdA Filmontology. Each entry of the ontology
can be accessed by retrieving the respective URI of the term. The eMAEX annotation method
resource can be used as an entry point. More examples are listed below:
Annotation Level | Camera | Acoustics
Annotation Type | Camera Movement Type | Music Mood
Annotation Value | tracking shot | sad
The data is served using the RDF triplestore OpenLink Virtuoso and LodView, a tool for
W3C-standard-compliant IRI dereferencing. We also developed an interactive visualisation
of the AdA Filmontology that can be accessed in our OntoViz tool.
Download
The AdA Filmontology is available for download in our GitHub repository. The OWL file can,
for example, be viewed and edited with the Protégé ontology editor. We also offer a ready-to-use
Advene template package to create annotations that conform to the AdA Filmontology.
Furthermore, we provide a detailed user manual on “Annotating with Advene and the AdA
Filmontology”.
What is the purpose of this document?
This document gives an overview of the different concepts, their labels, and their descriptions.
It is meant as a reference sheet during the annotation process.
Basic Structure: Levels – Types – Values
The AdA Filmontology is structured into three different kinds of film-analytical concepts:
Annotation Levels | An annotation level is a category that groups a set of similar annotation types (e.g., all types related to camera or all types related to acoustics).
Annotation Types | An annotation type refers to a concept of the annotation routine under which a movie is analysed (e.g., camera movement speed, or dialogue intensity).
Annotation Values | An annotation value is a concrete characteristic an annotation type can have (e.g., for camera movement speed: slow, medium, fast, alternating).
Image: The structure of levels, types, and values visualised using LodLive.
The ontology allows free text annotations (around a quarter of the types, e.g., dialogue text)
and annotations with predefined values (around three quarters of the types, see above).
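The three-part structure can be pictured as a nested mapping from levels to types to values. The following is a minimal sketch with a tiny excerpt only; the names ONTOLOGY_EXCERPT, find_level, and is_allowed are hypothetical, and using None to mark a free text type is an assumption made for this illustration.

```python
# Levels -> types -> predefined values. None marks a free text type
# (an illustrative convention, not taken from the ontology itself).
ONTOLOGY_EXCERPT = {
    "Camera": {
        "Camera Movement Speed": ["slow", "medium", "fast", "alternating"],
    },
    "Language": {
        "Dialogue Text": None,
    },
}


def find_level(type_name):
    """Return the annotation level that groups the given annotation type."""
    for level, types in ONTOLOGY_EXCERPT.items():
        if type_name in types:
            return level
    raise KeyError(type_name)


def is_allowed(type_name, value):
    """Check a value against a type: free text types accept any string."""
    values = ONTOLOGY_EXCERPT[find_level(type_name)][type_name]
    return True if values is None else value in values
```

For example, is_allowed("Camera Movement Speed", "fast") holds, while a value outside the predefined list is rejected; any text is accepted for "Dialogue Text".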
Labels & Descriptions
In the following, all descriptions of the AdA Filmontology are listed according to the threefold
ontology structure. This overview serves as an aid for the annotation process. The ontology
is currently also available in German.
The document is structured by the different Annotation Levels. Each level chapter begins with
a brief description of the level, followed by an overview of all Annotation Types that are
assigned to it.
The symbol ∎ marks basic annotation types that we recommend as a starting point: these
types were compiled as a basis for comparative analyses of audiovisual dynamics. Other
Annotation Types from the AdA Filmontology can be added as needed, or new dimensions of
analysis can be created.
Each type section begins with a corresponding definition of the concept and a tabular listing
of all associated values, as well as their respective description.
In addition, types and their intended usage are characterised by the following traits:
Single Value Annotation Type | For this type only one single value should be assigned per annotation.
Multiple Value Annotation Type | For this type multiple values can be assigned per annotation.
Ordered from Value X to Value Y | Describes the system of value ordering for a specific type.
Evolving Annotation Type [TO] | Describes the possibility of using a syntax element that indicates a continuous development between two values.
Contrasting Annotation Type [VS] | Describes the possibility of using a syntax element that connects two contrasting values.
Advene Label | Associated annotation type label in Advene.
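How these traits constrain an annotation can be sketched as a small check. This is an illustration under stated assumptions: the comma-separated notation (e.g. "screams,[TO],whispers") is assumed here and is not specified in this document, and TypeTraits and validate are hypothetical names, not part of Advene or the ontology.

```python
from dataclasses import dataclass

# Syntax elements and the trait that must be set for them to be usable.
CONNECTORS = {"[TO]": "evolving", "[VS]": "contrasting"}


@dataclass(frozen=True)
class TypeTraits:
    values: tuple              # predefined values of the type
    multiple: bool = False     # Multiple Value Annotation Type
    evolving: bool = False     # [TO] may be used
    contrasting: bool = False  # [VS] may be used


def validate(content, traits):
    """Return a list of problems; an empty list means the content is valid."""
    # ASSUMPTION: values and syntax elements are separated by commas.
    tokens = [t.strip() for t in content.split(",") if t.strip()]
    problems = []
    for i, tok in enumerate(tokens):
        if tok in CONNECTORS:
            if not getattr(traits, CONNECTORS[tok]):
                problems.append(f"{tok} is not allowed for this type")
            between_values = (
                0 < i < len(tokens) - 1
                and tokens[i - 1] in traits.values
                and tokens[i + 1] in traits.values
            )
            if not between_values:
                problems.append(f"{tok} must connect two values")
        elif tok not in traits.values:
            problems.append(f"unknown value: {tok}")
    n_connectors = sum(1 for t in tokens if t in CONNECTORS)
    n_values = len(tokens) - n_connectors
    # A connected pair counts as one unit for single value types.
    if not traits.multiple and n_values - n_connectors > 1:
        problems.append("only one value (or one connected pair) is allowed")
    return problems


voice_quality = TypeTraits(
    values=("screams", "whispers", "metallic", "distorted", "cheering", "muffled"),
    multiple=True, evolving=True, contrasting=True,
)
music_intensity = TypeTraits(values=("1", "2", "3", "4", "5"), evolving=True)
```

With these definitions, validate("screams,[TO],whispers", voice_quality) and validate("2,[TO],4", music_intensity) report no problems, whereas "2,4" is flagged for the single value type and "1,[VS],5" is flagged because Music Intensity is not a contrasting type.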
Acoustics
This level encompasses all annotation types that refer to the staging of expressive acoustic
phenomena like music, sound design, or the expressive qualities of spoken language.
Types:
∎Volume
∎Dialogue Emotion
Dialogue Type
Dominant Acoustic Level
∎Sound Gesture Dynamics
Dialogue Voice Quality
Dialogue Intensity
Dialogue Tonality
Sound Segment
Sound Gesture Description
Music Piece
Music Part
Music Gesture
Music Arrangement
∎Music Mood
Music Intensity
Music Tonality
Music Figure Patterning
Music Figure
Music Accent
∎Volume
Volume dynamics of the entire audio track.
• Multiple Value Annotation Type
• Advene Label: AS | Volume
∎Dialogue Emotion
'Dialogue Emotion' aims at the emotional timbre/Klangfarbe of human and artificial acoustic
utterances (see Köng/Brandt 2006). This annotation type provides a basic classification of
these emotional qualities that are communicated through the phonetic qualities of utterances.
Two moods (e.g. in the case of simultaneous utterances) can be related as conflicting in the
sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Contrasting Annotation Type [VS]
• Advene Label: AS | Dialogue Emotion
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
neutral | 2 | Qualifying the speaking as neutral when the manner of speaking and/or the tone of voice are comparatively free of identifiable characteristics that hint at any of the other emotion values or appear as too vague.
angry | 3 | Qualifying the speaking as angry based on i. a. manner of speaking and/or the tone of voice that suggest an expression of irritation, anger, aggression or (out-)rage of the speaker. The characteristics include a pressed, hardened or sharpened voice, its pitch and volume might be heightened or lowered, an accentuated style of speech, awkward pauses, shouts or exclamations, and interrupting other speakers. One or several of these characteristics can occur with different levels of intensity.
caring | 4 | Qualifying the speaking as caring based on i. a. manner of speaking and/or the tone of voice that suggest an expression of empathy of the speaker or convey a soothing impression. The characteristics include a soft, calm style of speaking, a quieter or deeper voice compared to the usual pitch, warmth in the tone of voice (much chest voice, little head voice), or even purring, murmuring, or whispering. One or several of these characteristics can occur with different levels of intensity.
confident | 5 | Qualifying the speaking as confident based on i. a. the manner of speaking and/or the tone of voice that suggest an expression of (self-)confidence of the speaker. The characteristics include a medium to loud voice with a controlled tone, the use of chest voice, as well as emphases. One or several of these characteristics can occur with different levels of intensity.
disgusted | 6 | Qualifying the speaking as disgusted based on i. a. the manner of speaking and/or the tone of voice that suggest an expression of disgust or (strong) contempt of the speaker. The characteristics include an appalled or disdainful tone of voice, a rough or rude style of speech, a pressed or squeezed pitch of voice, exclamations, or snorting. One or several of these characteristics can occur with different levels of intensity.
insecure | 7 | Qualifying the speaking as insecure based on i. a. the manner of speaking and/or the tone of voice that suggest a timid, cautious or insecure expression of the speaker. The characteristics include a low volume, unclear pronunciation, interruptions or elongation in the speech melody, the frequent use of head voice, stuttering and stammering in alternation with very fast speech pattern. One or several of these characteristics can occur with different levels of intensity.
joyful | 8 | Qualifying the speaking as joyful based on i. a. the manner of speaking and/or the tone of voice that suggest an expression of joy, happiness, or other positive emotions of the speaker. The characteristics include laughing, cheering, an animated speech melody, exclamations. One or several of these characteristics can occur with different levels of intensity.
relaxed | 9 | Qualifying the speaking as relaxed based on i. a. the manner of speaking and/or the tone of voice that suggest a relaxed expression of the speaker. The characteristics include low to medium voice volume, a slow speech melody, and soft pronunciations. One or several of these characteristics can occur with different levels of intensity.
sad | - | Qualifying the speaking as sad based on i. a. the manner of speaking and/or the tone of voice that suggest an expression of sadness, grief, and sorrow of the speaker. The characteristics include sighing or crying, a delayed and hesitant style of speech, taciturnity, or a throaty, breaking, or distorted voice. One or several of these characteristics can occur with different levels of intensity.
scared | - | Qualifying the speaking as scared based on i. a. the manner of speaking and/or the tone of voice that suggest a timid, frightened, or anxious expression of the speaker. The characteristics include a weak or breaking voice, its pitch might be heightened or distorted, whimpering, gasping, audible breathing, or breathlessness. One or several of these characteristics can occur with different levels of intensity.
suffering | - | Qualifying the speaking as suffering based on i. a. the manner of speaking and/or the tone of voice that suggest an expression of pain and suffering of the speaker. The characteristics include repeated interruptions through breathing noises, an increased volume, and repeated screams and moans. One or several of these characteristics can occur with different levels of intensity.
surprised | - | Qualifying the speaking as surprised based on i. a. manner of speaking and/or the tone of voice that suggest an expression of surprise of the speaker. The characteristics include an uncontrolled style of speaking, a heightened pitch of voice, exclamations, giggling, laughing, an accelerated, accentuated, or unusual melody of speech, speechlessness, stuttering, or slipping into head voice. One or several of these characteristics can occur with different levels of intensity.
Dialogue Type
Basic distinction of dialogue types regarding forms of address and interaction with other
speakers, but also their diegetic status and their acoustic qualities.
• Multiple Value Annotation Type
• Advene Label: AS | Dialogue Type
Value | Shortcut | Description
monologue | 1 | A figure talks to him-/herself or uninterruptedly to other figures, e.g. in a speech or when verbalising his or her emotional state or inner thoughts without diegetic addressees.
dialogue | 2 | Several figures speak with each other. Single utterances are comparatively short and different characters take turns.
chorus | 3 | Several figures speak simultaneously in a coordinated manner, e.g. in unison, or in coordinated individual voices.
buzz | 4 | Several figures speak simultaneously in an uncoordinated manner.
voice over | 5 | A voice speaks from off screen, whose origin, i.e. a speaking body, is not part of the current action. Often it cannot be connected to any form of diegetic body. This may be the voice of a narrator or commentator. This voice is often directed at the audience.
interview | 6 | Answers and questions (not necessarily both) from an interview situation, e.g. questions can be inaudible or the speaker cannot be seen.
Viewer adressation | 7 | Viewers are addressed directly, either by a diegetic figure or through a voice over.
Dominant Acoustic Level
Indication of which acoustic level (of a basic distinction in music, language or sound) is
dominant in a given segment, i.e. in the centre of the viewers' attention.
• Single Value Annotation Type
• Advene Label: AS | Dominant Acoustic Level
Value | Shortcut | Description
music | 1 | The acoustic perception is dominated by musically organised sounds, i.e. sounds which appear to create a musical expression. This can be diegetic or non-diegetic music. The sounds can be voices, as well as instruments.
sounds | 2 | The acoustic perception is dominated by sounds which can neither be categorized as music, nor as language.
language | 3 | The acoustic perception is dominated by voices and voice-like sounds, which seem to be coordinated in such a way as to create a perceptible meaning – this also refers to foreign or fantasy languages.
expressive silence | 4 | Lasting and noticeable impression of silence that does not have to be the complete absence of any sound in a technical sense. Repeated and minimal sounds, as well as background noise can reinforce the impression of silence.
∎Sound Gesture Dynamics
Temporal gestalt of prominent sound structures that are perceived as sound (noise). This
annotation type provides a basic classification of the impression of these sound gestures in
the sound design.
• Single Value Annotation Type
• Advene Label: AS | Sound Gesture Dynamic
Value | Shortcut | Description
swell | 1 | Noticeable increase of the intensity of sounds (singular as well as a composition of multiple sounds). Examples for this include increasing volume, acceleration of the rhythm, or rising pitch of sound (e.g. an engine noise closing in).
subside | 2 | Noticeable decrease of the intensity of sounds (singular as well as a composition of multiple sounds). Examples for this include decreasing volume, deceleration of the rhythm, or falling pitch of sound (e.g. an engine noise distancing itself).
whirring | 3 | A sound – regarding its pitch and volume – jumps around a constant baseline. This whirring is audible for a couple of moments.
explosive | 4 | The sound comes and goes very quickly, thereby setting a noticeable accent.
pause | 5 | A moment of pause in the sound design, i.e. a noticeable interruption of the preceding sounds for a possibly short but noticeable duration.
Dialogue Voice Quality
Perceived acoustic quality of voices. This annotation type provides a basic distinction of
different voice qualities regarding a selection of acoustic traits. It encompasses expressive
forms such as screams, whispers, and cheering, as well as media aspects of voices (e.g. in
transmissions or non-human voices) such as metallic, distorted, muffled. Two conflicting vocal
qualities (e.g. simultaneous utterances) can be related in the sense of a 'versus' with [VS]. A
temporal sequence of or development between two vocal qualities (e.g. from screams to
whispers) can be related with [TO].
• Multiple Value Annotation Type
• Evolving Annotation Type [TO]
• Contrasting Annotation Type [VS]
• Advene Label: AS | Dialogue Voice Quality
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
[VS] | 2 | Syntax element that connects two contrasting values.
screams | 3 | The voice sounds very loud, very high, and very pressed. Often coincides with the loss of clearly articulated words.
whispers | 4 | The voice sounds very breathy, quiet. Consonants dominate and the vowels are almost voiceless. Often coincides with the loss of clearly articulated words.
metallic | 5 | The voice sounds noticeably mechanically or digitally distorted, often reverberant.
distorted | 6 | The voice sounds distorted and distanced, typically by transmission technology such as the telephone, radio, or loudspeaker. This quality of voice often comes along with interfering noises like crackling.
cheering | 7 | The voice is dominated by excited shouts, cheering, or happy whoops.
muffled | 8 | The voice sounds as if muted by a wall or another kind of barrier. The voice is thus often muted in regard to volume and clearness. Deep tones are often fuzzy but emphasised.
Dialogue Intensity
The perceived degree of intensity of an affective expression in utterances. It can also involve
an inward-looking form of tension, such as repressed anger. This annotation type provides a
scale for the intensity of utterances. Two intensities (e.g. in cases of simultaneous utterances)
can be related as conflicting in the sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Ordered from very low intensity to very high intensity
• Contrasting Annotation Type [VS]
• Advene Label: AS | Dialogue Intensity
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
1 | 2 | Very low intensity.
2 | 3 | Low intensity.
3 | 4 | Medium intensity.
4 | 5 | High intensity.
5 | 6 | Very high intensity.
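Note that for this type the shortcut keys are offset from the intensity values, because key 1 is taken by [VS]. A minimal sketch of the mapping listed for Dialogue Intensity follows; the function name expand_shortcuts and the comma-joined output are assumptions made for illustration and do not describe how Advene itself handles shortcuts.

```python
# Shortcut key -> value, transcribed from the Dialogue Intensity listing.
DIALOGUE_INTENSITY_SHORTCUTS = {
    1: "[VS]",
    2: "1",  # very low intensity
    3: "2",  # low intensity
    4: "3",  # medium intensity
    5: "4",  # high intensity
    6: "5",  # very high intensity
}


def expand_shortcuts(keys):
    """Translate a sequence of shortcut keys into a value string.

    ASSUMPTION: values are joined with commas; this notation is
    illustrative only.
    """
    return ",".join(DIALOGUE_INTENSITY_SHORTCUTS[k] for k in keys)
```

Pressing the keys 3, 1, 6 would thus correspond to low intensity versus very high intensity, i.e. the string "2,[VS],5" in this sketch.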
Dialogue Tonality
'Dialogue Tonality' refers to the perceived fundamental prosodic expressive quality of a
dialogue. This annotation type provides a basic classification of the tonality of single utterances
(or overlapping speech) into harmonic, neutral and tense.
• Multiple Value Annotation Type
• Ordered from harmonic to tense
• Contrasting Annotation Type [VS]
• Advene Label: AS | Dialogue Tonality
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
harmonic | 2 | The piece of dialogue/monologue sounds generally harmonic and rather free of conflict: the voices have a calm and/or pleasant speech melody. The interplay of different voices is perceived as harmonic, no harsh interruptions.
neutral | 3 | The piece of dialogue/monologue sounds neither harmonic nor tense.
tense | 4 | The section of dialogue/monologue sounds generally inharmonic and conflictual, the voice(s) have a tense speech melody. Conflictual interruptions of individual voices are typical in this case.
Sound Segment
A segment characterised by a coherent sound design. For example, a specific location such
as a restaurant or a street. This annotation type operates with free text description of the sound
segment. Example: 'Office sounds: muffled voices, distant ringing of phones, clacking of
computer keyboards, distant traffic noise.'
• Free Text Annotation Type
• Advene Label: AS | Sound Segment
Sound Gesture Description
Free description of a prominent dynamic sound structure in the sound design, e.g. regarding
the sound source but also the specific sound quality. Example: 'Intensifying buzz of a drawn
light saber.'
• Free Text Annotation Type
• Advene Label: AS | Sound Gesture Description
Music Piece
Free description of a piece of music or its characterization by free attributions, e.g. Upbeat,
Midtempo or Metal, K-Pop, Grunge, Hip Hop, Jazz, etc.
• Free Text Annotation Type
• Advene Label: AS | Music Piece
Music Part
Free description of a subsegment of a music piece. A part may for instance be marked by a
change of mood.
• Free Text Annotation Type
• Advene Label: AS | Music Part
Music Gesture
Free description of a subsegment of a piece of music that has a specific gestural quality. This
can, e.g., be a certain increasing/ rising movement or an accent.
• Free Text Annotation Type
• Advene Label: AS | Music Gesture
Music Arrangement
'Music Arrangement' refers to the instruments used, i.e. the perceived instrumentation. This
annotation type provides a basic generic classification of the arrangement based on an overall
impression of a piece of music or a part if it is characterised by a distinct arrangement.
• Single Value Annotation Type
• Advene Label: AS | Music Arrangement
Value | Shortcut | Description
orchestral | 1 | Decisive for the impression of orchestral music is a rich sound created by the comparatively large number of instruments. Typical instruments include strings and wind instruments, but also choirs.
chamber music | 2 | Chamber music is characterized by a small cast of instruments but may include very different instruments. Typical examples are the string quartet and the piano trio. Vocals with acoustic accompaniment may also be included.
electronic | 3 | Music within which sounds are perceived as predominantly electronically produced (e.g. by synthesizers). The subjective impression is decisive and not every imitation of analogue instruments must necessarily be classified as electronic, just as analogue drum machines can fall under the label of 'electronic' music.
pop | 4 | Music played by mostly modern instruments such as e-guitar, e-bass, amplified voice, drums, synthesizer. The sounds of the instruments are often altered by electronic effects.
solo | 5 | Music played only by one instrument or sung by one voice, usually a melody.
∎Music Mood
'Music Mood' refers to the perceived emotional state conveyed in a music piece. This
annotation type provides a basic classification of the general mood that is conveyed in a
coherent segment of music (either a piece or a part of it).
• Multiple Value Annotation Type
• Advene Label: AS | Music Mood
Value | Shortcut | Description
neutral | 1 | The mood of the music cannot be classified by any of the other values.
tense | 2 | The mood of the music is perceived as tense, i.e. it can either be perceived as continually rising tension or as a lasting state of expectation.
happy | 3 | The mood of the music is perceived as happy, often involving upbeat melodies, a major key, and soundscapes that are perceived as pleasant.
sad | 4 | The mood of the music is perceived as sad, often involving stretched and slow rhythms, a minor key, and low tones.
aggressive | 5 | The mood of the music is perceived as aggressive, often involving harsh and fast rhythms as well as high intensity and volume.
Music Intensity
Perceived degree of the intensity of an (affective) expression of music, e.g. regarding volume,
dynamics, instrumentation. This annotation type provides a scale for the intensity in a coherent
segment of music (either a piece or a part of it).
• Single Value Annotation Type
• Ordered from 1 to 5
• Evolving Annotation Type [TO]
• Advene Label: AS | Music Intensity
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
1 | 2 | Very low intensity.
2 | 3 | Low intensity.
3 | 4 | Medium intensity.
4 | 5 | High intensity.
5 | 6 | Very high intensity.
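Because Music Intensity is both ordered and evolving, a [TO] annotation can be read as a direction on the scale. The sketch below assumes a comma-separated notation such as "2,[TO],4", which is an illustrative assumption and not specified in this document; the function name describe_development is likewise hypothetical.

```python
# Ordered scale of Music Intensity, from very low (1) to very high (5).
MUSIC_INTENSITY_SCALE = ["1", "2", "3", "4", "5"]


def describe_development(content):
    """Classify a '[TO]' annotation as rising, falling, or constant.

    ASSUMPTION: the annotation has the form 'start,[TO],end'.
    """
    start, connector, end = [part.strip() for part in content.split(",")]
    if connector != "[TO]":
        raise ValueError("expected the syntax element [TO] between two values")
    delta = MUSIC_INTENSITY_SCALE.index(end) - MUSIC_INTENSITY_SCALE.index(start)
    if delta > 0:
        return "rising"
    if delta < 0:
        return "falling"
    return "constant"
```

The same idea would carry over to any other type that is documented as ordered and evolving, by swapping in that type's value order.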
Music Tonality
The impression of harmony or disharmony of a coherent segment of music (either a piece or
a part of it). This annotation type provides a basic scale for the musical tonality from harmonic
to neutral to tense. The focus is on perceptual qualities, which means that, besides aspects of
polyphony and melodics, rhythmic qualities are also taken into consideration.
• Single Value Annotation Type
• Ordered from harmonic to tense
• Advene Label: AS | Music Tonality
Value | Shortcut | Description
harmonic | 1 | The music sounds harmonic, it is dominated by pleasant sounds such as catchy melodies, consonant chords (often in a major key), or elating rhythms.
neutral | 2 | The music sounds neither harmonic nor tense.
tense | 3 | The music sounds tense. Characteristics can include dissonant chords, disturbing rhythms, or extremes in pitch and volume, among others.
Music Figure Patterning
Impression of (rhythmical) temporal patterns rather than the (objective) measurement of
characteristics of the musical structure such as beats or meter. This annotation type provides
a basic classification of the rhythmical patterning of a coherent segment of music (either a
piece or a part of it).
• Single Value Annotation Type
• Advene Label: AS | Music Figure Patterning
Value | Shortcut | Description
rhythmic | 1 | The music is dominated by a regular rhythm which is relatively catchy, such as 3/4 or 4/4, as opposed to more organic rhythms without a fixed musical measure.
irregular | 2 | The rhythm of the music varies or a strong phrasing places the rhythm in the background. Complicated rhythms (e.g., 7/4) can sometimes also be perceived as irregular.
spheric | 3 | The rhythm of the music is hardly perceptible. It is rather dominated by sound layers, strong notes, or melodies.
Music Figure
Impression of dynamic processes and complex temporal patterns which predominantly
characterize the musical gestalt over a certain period of time, in particular tempo and volume
changes, as well as melodic or tone repetitions. This annotation type provides a basic selection
of musical figures.
• Multiple Value Annotation Type
• Advene Label: AS | Music Figure
Value | Shortcut | Description
loop | 1 | Musical phrase which is repeated at least three times.
crescendo | 2 | A continuous increase in volume for a perceptible duration. The degree of progression is significant enough to be perceived as an acoustic gesture.
decrescendo | 3 | A continuous decrease in volume for a perceptible duration. The degree of progression is significant enough to be perceived as an acoustic gesture.
tremolo | 4 | The impression of a vibrating (melodic) flicker over a period of time, typically generated by the rapid change between two tones.
Music Accent
A prominent, isolated moment within a musical gestalt with a typically (very) short duration.
This annotation type provides a basic classification of musical accents.
• Single Value Annotation Type
• Advene Label: AS | Music Accent
Value | Shortcut | Description
crescendo | 1 | Noticeable intensification of the music, e.g. through increasing volume or rising pitch.
decrescendo | 2 | Noticeable decrease of the intensity of the music, e.g. through decreasing volume or falling pitch.
tremolo | 3 | A short vibrating or trill-like musical accent.
explosive | 4 | A very loud or very strident or otherwise very prominent, short musical accent.
pause | 5 | A moment of pause in the music (for a possibly short but noticeable duration, at least one crucial part of the instrumentation is interrupted).
Bodily Expressivity
Expressivity of bodies that are perceived as communicating bodies (e.g. humans, animals,
anthropomorphic machines). The expressivity is not understood as a speculation about an
assumed subjectivity, but as perceived surface phenomena of gestures, facial expressions
and postures.
Types:
∎Body Language Emotion
∎Body Language Intensity
Facial Expression Emotion
Gesture Emotion
Facial Expressions Intensity
Gestures Intensity
∎Body Language Emotion
Perceived emotional quality of the body language (gestures, posture, as well as facial
expression) of central figures within the image. This annotation type provides a basic
classification of the perceived mood through a selection of emotion words. Two moods (e.g. of
different figures in the image) can be related as conflicting in the sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Body Language Emotion
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
neutral | 2 | Bodily gestures and facial expression are comparatively free of classifiable emotion or unsuitable for any of the other emotion values.
angry | 3 | Qualifying the body language as an angry expression, i.e. bodily gestures and facial expression(s) of (an) irritated, angry or aggressive figure(s) in the image. Typical characteristics include clenched hands and jaw, intensified breathing, strong, fast or convulsive movements, tense facial features, narrow eyes, or focused staring. One or several of these characteristics can occur with different levels of intensity and the expression of anger can occur explosively as well as implosively.
caring | 4 | Qualifying the body language as caring, i.e. bodily gestures and facial expression express the empathy, solicitude, or tenderness of the figure(s) in the image. Typical characteristics include an open focused gaze, soft, harmonic movements which adjust to another figure (or an object), approaching, touching, smiling, and relaxed eye contact. One or several of these characteristics can occur with different levels of intensity. In its extreme form, the label 'caring' can also be used for expressions of lust and longing.
confident | 5 | Qualifying the body language as confident, i.e. bodily gestures and facial expressions communicate the confidence and self-assuredness of the figure(s) in the image. Typical characteristics include an upright posture, erect chest, raised chin, regular deep breathing, extensive movements, smooth facial features, regular blinking, smiling, and other forms of relating to other figures. One or several of these characteristics can occur with different levels of intensity. At the core of this label is the above-mentioned mode of claiming space within the image space and the choreography of other figures.
disgusted | 6 | Qualifying the body language as disgusted, i.e. bodily gestures and facial expressions communicate disgust or repulsion of the figure(s) in the image. Typical characteristics include a reluctant posture, the gaze turned away, repelling, distancing movements, wrinkled nose, pulled-up brows and lips, shaking of the head. One or several of these characteristics can occur with different levels of intensity. As an extreme form, the label 'disgusted' can also include strong forms of physical repulsion such as regurgitation or gagging.
insecure | 7 | Qualifying the body language as insecure, i.e. bodily gestures and facial expression communicate the insecurity of the figure(s) in the image. Typical characteristics include restless posture, irregular breathing, erratic movements, twitching hands, changing facial expressions, movements of the mouth without speaking, frequent blinking, and significant eye movement. Here the restless turning towards oneself as well as the nervous relating to one's environment can be central. One or several of these characteristics can occur with different levels of intensity.
joyful | 8 | Qualifying the body language as joyful, i.e. bodily gestures and facial expression communicate joy or happiness of the figure(s) in the image. Typical characteristics include smiling, laughing, deep breathing, big, exuberant movements, wide open shiny eyes, expansive chest. One or several of these characteristics can occur with different levels of intensity. The value 'joyful' can thus refer to subtle as well as excessive forms of joy.
relaxed | 9 | Qualifying the body language as relaxed, i.e. bodily gestures and facial expression communicate a relaxed state of the figure(s) in the image. Typical characteristics include a calm, tensionless posture, regular breathing, harmonic, extensive movements, smooth and tensionless facial expressions, regular to slow blinking, unobtrusive eye movement. One or several of these characteristics can occur with different levels of intensity. Relaxed body language thus refers to a harmonic and tensionless embedding of a figure in its environment.
sad | | Qualifying the body language as sad, i.e. bodily gestures and facial expression communicate sadness, gloom, or sorrow of the figure(s) in the image. Typical characteristics include limp, weak posture, hanging shoulders, little movement, wrinkled mouth, puckered brows, little blinking and eye movement, undirected or lowered gaze, tears, moaning, and sobbing. One or several of these characteristics can occur with different levels of intensity.
scared | | Qualifying the body language as scared, i.e. bodily gestures and facial expression communicate different forms of fear of the figure(s) in the image. Typical characteristics are a tense body posture (either crouched or stiff and upright), fast and heavy breathing, either highly dynamic eye movements in various directions or empty gazes or closed eyes. Fear can be expressed in rather different stances towards the figure's environment: avoiding detection, heightening perception, desperate assimilation, or complete loss of control in panic. One or several of these characteristics can occur with different levels of intensity.
suffering | | Qualifying the body language as suffering, i.e. bodily gestures and facial expression communicate the endurance of physical or psychological pain. Typical characteristics are an extremely tense body posture (often crouched), heavy and interrupted breathing, wide open or closed eyes, extreme facial tensions (from wide open mouth to the contraction of eyes, mouth, and forehead), rough forms of self-touching (beating one's head), or clinging onto an object or person. One or several of these characteristics can occur with different levels of intensity.
surprised | | Qualifying the body language as surprised, i.e. bodily gestures and facial expression communicate the surprise of the figure(s) in the image. Typical characteristics include a sudden pausing (or even wincing), a change of direction and intensity of the gaze, change of their way of moving, erratic movements, wide open eyes, open mouth, laughing, or giggling. One or several of these characteristics can occur with different levels of intensity.
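As a rough illustration of how the [VS] syntax element relates values, the following Python sketch reads a multiple-value annotation and collects the pairs that are marked as contrasting. The comma-separated notation (e.g. 'angry,[VS],scared') and the function name 'parse_contrasting' are assumptions made for this example; only the list of values is taken from the table above.

```python
VS = "[VS]"

# Values of 'Body Language Emotion', copied from the table above.
BODY_LANGUAGE_EMOTIONS = {
    "neutral", "angry", "caring", "confident", "disgusted", "insecure",
    "joyful", "relaxed", "sad", "scared", "suffering", "surprised",
}


def parse_contrasting(annotation: str) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (all values, contrasting pairs) of an annotation.

    A pair is recorded whenever [VS] stands between two values.
    """
    tokens = [t.strip() for t in annotation.split(",") if t.strip()]
    values: list[str] = []
    contrasts: list[tuple[str, str]] = []
    last = len(tokens) - 1
    for i, token in enumerate(tokens):
        if token == VS:
            # [VS] needs a value on both sides.
            if i == 0 or i == last or VS in (tokens[i - 1], tokens[i + 1]):
                raise ValueError("[VS] must stand between two values")
            contrasts.append((tokens[i - 1], tokens[i + 1]))
        elif token in BODY_LANGUAGE_EMOTIONS:
            values.append(token)
        else:
            raise ValueError(f"unknown value: {token!r}")
    return values, contrasts
```

For example, 'angry,[VS],scared' yields both values plus one contrasting pair, whereas 'joyful, relaxed' yields two values without any contrast.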
∎Body Language Intensity
Perceived degree of dynamicity and tension in an affective expression regarding the body
language (gestures, posture, as well as facial expression) of central figures within the image.
It can also involve an inward-oriented form of tension, such as repressed anger. This
annotation type provides a scale for the intensity of body language. Conflicting intensities (e.g.
different figures in the image or a difference between gestures and facial expressions) can be
related as conflicting in the sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Ordered from 1 to 5
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Body Language Emotion
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
1 | 2 | Very low intensity.
2 | 3 | Low intensity.
3 | 4 | Medium intensity.
4 | 5 | High intensity.
5 | 6 | Very high intensity.
Facial Expressions Emotion
Perceived emotional quality of the facial expression of central figures within the image. This
annotation type provides a basic classification of the mood through a selection of emotion
words. Two moods (e.g. of different figures in the image) can be related as conflicting in the
sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Facial Expression Emotion
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
neutral | 2 | The facial expressions are comparatively free of identifiable emotion or unsuitable for any of the other emotion values.
angry | 3 | The facial expressions suggest an irritated, angry, or aggressive emotional state of the figure(s) in the image. Typical characteristics include tense facial features, clenched jaws, narrow eyes, focused staring.
caring | 4 | The facial expressions suggest empathy, solicitude, or tenderness of the figure(s) in the image. Typical characteristics include an open focused gaze, smiling, and relaxed eye contact.
confident | 5 | The facial expressions suggest confidence of the figure(s) in the image. Typical characteristics include a raised chin, smooth facial features, regular blinking, smiling.
disgusted | 6 | The facial expressions suggest disgust or repulsion of the figure(s) in the image. Typical characteristics include the gaze turned away, wrinkled nose, pulled-up brows and lips.
insecure | 7 | The facial expressions suggest insecurity of the figure(s) in the image. Typical characteristics include changing facial expressions, movements of the mouth without speaking, frequent blinking and significant eye movement.
joyful | 8 | The facial expressions suggest joy or happiness of the figure(s) in the image. Typical characteristics include smiling, laughing, wide open shiny eyes.
relaxed | 9 | The facial expressions suggest a relaxed emotional state of the figure(s) in the image. Typical characteristics include regular breathing, smooth and natural-looking facial features, regular but not too frequent blinking, eye movement appropriate to the situation.
sad | | The facial expressions suggest sadness of the figure(s) in the image. Typical characteristics include wrinkled mouth, puckered brows, little blinking and eye movement, undirected or lowered gaze.
scared | | The facial expressions suggest different forms of fear of the figure(s) in the image. Typical characteristics are fast and heavy breathing, either highly dynamic eye movements in various directions or empty gazes or convulsively closed eyes.
suffering | | The facial expressions suggest the endurance of physical or psychological pain of the figure(s) in the image. Typical characteristics are heavy and interrupted breathing, wide open or closed eyes, extreme facial tensions (from wide open mouth to the contraction of eyes, mouth, and forehead).
surprised | | The facial expressions suggest the surprise of the figure(s) in the image. Typical characteristics include a sudden change of direction and intensity of the gaze, wide open eyes, open mouth.
Gestures Emotion
Perceived emotional quality of the gestures or postures of central figures within the image.
This annotation type provides a basic classification of the mood through a selection of emotion
words. Two moods (e.g. of different figures in the image) can be related as conflicting in the
sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Gestures Emotion
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
neutral | 2 | The bodily gestures are comparatively free of identifiable emotion or unsuitable for any of the other emotion values.
angry | 3 | The bodily gestures suggest an irritated, angry, or aggressive emotional state of the figure(s) in the image. Typical characteristics include clenched hands, strong, fast, or convulsive movements.
caring | 4 | The bodily gestures suggest empathy, solicitude, or tenderness of the figure(s) in the image. Typical characteristics include soft, harmonic movements which adjust to another figure (or an object), approaching or touching.
confident | 5 | The bodily gestures suggest confidence of the figure(s) in the image. Typical characteristics include an upright posture, erect chest, regular deep breathing, space-filling movements.
disgusted | 6 | The bodily gestures suggest disgust or repulsion of the figure(s) in the image. Typical characteristics include a reluctant posture and repelling, distancing movements. As an extreme form, the label 'disgusted' can also include strong forms of physical repulsion.
insecure | 7 | The bodily gestures suggest insecurity of the figure(s) in the image. Typical characteristics include a restless posture, erratic movements, twitching hands, self-touching.
joyful | 8 | The bodily gestures suggest joy or happiness of the figure(s) in the image. Typical characteristics include large, exuberant movements, upright posture, low body tension.
relaxed | 9 | The bodily gestures suggest a relaxed emotional state of the figure(s) in the image. Typical characteristics include a calm posture, harmonic, space-filling movements.
sad | | The bodily gestures suggest sadness or depression of the figure(s) in the image. Typical characteristics include limp, weak posture, hanging shoulders, little movement.
scared | | The bodily gestures suggest different forms of fear of the figure(s) in the image. Typical characteristics are a tense body posture (either crouched or stiff and upright) or hectic evasive movements. Fear can be expressed in rather different stances towards the figure's environment: avoiding detection, heightening perception, desperate assimilation, or complete loss of control in panic.
suffering | | The bodily gestures suggest the endurance of physical or psychological pain of the figure(s) in the image. Typical characteristics are an extremely tense body posture (often crouched), rough forms of self-touching (beating one's head), or clinging onto an object or person.
surprised | | The bodily gestures suggest the surprise of the figure(s) in the image. Typical characteristics include a sudden pausing (or even wincing), changes in the way of moving, erratic movements.
Facial Expressions Intensity
Perceived degree of dynamicity and tension in an affective expression regarding the facial
expressions of central figures within the image. This can also involve an inward-looking form
of tension, such as repressed anger. This annotation type provides a rating scale for the
intensity of facial expressions. Two intensities (e.g. different figures in the image or a difference
between gestures and facial expressions) can be related as conflicting in the sense of a
'versus' with [VS].
• Multiple Value Annotation Type
• Ordered from 1 to 5
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Facial Expressions Intensity
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
1 | 2 | Very low intensity.
2 | 3 | Low intensity.
3 | 4 | Medium intensity.
4 | 5 | High intensity.
5 | 6 | Very high intensity.
Gestures Intensity
Perceived degree of dynamicity and tension in an affective expression regarding the gestures
and postures of central figures within the image. This can also involve an inward-looking form
of tension, such as repressed anger. This annotation type provides a rating scale for the
intensity of gestures. Two intensities (e.g. different figures in the image) can be related as
conflicting in the sense of a 'versus' with [VS].
• Multiple Value Annotation Type
• Ordered from 1 to 5
• Contrasting Annotation Type [VS]
• Advene Label: BodExp | Gesture Intensity
Value | Shortcut | Description
[VS] | 1 | Syntax element that connects two contrasting values.
1 | 2 | Very low intensity.
2 | 3 | Low intensity.
3 | 4 | Medium intensity.
4 | 5 | High intensity.
5 | 6 | Very high intensity.
Camera
Visual traits that refer directly to the camera view as a sensory extension with its own
corporeality: from interactions of the camera in its surroundings (e.g. camera movements) to
image traits that refer directly to the mechanical eye (e.g. focus shifts or fast forwards).
Crucial here is the viewing impression and not the production techniques, therefore the
(digital or analogue) simulation of camera views is included.
Types:
∎Recording Playback Speed
Depth of Field
Defocus
Camera Movement Unit
∎Camera Movement Type
Camera Movement Speed
Camera Movement Direction
∎Camera Angle
Camera Angle Canted
Camera Angle Vertical Positioning
Lens
∎Recording Playback Speed
Modulation of time perceivable to the viewer that results (potentially) from the relation between
recording and playback rate. Basic classification of a selection of these phenomena.
• Multiple Value Annotation Type
• Evolving Annotation Type [TO]
• Advene Label: Cam | Recording/Playback Speed
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
slow motion | 2 | Noticeable deceleration of the viewers' time perception. Movements in slow motion appear unnaturally slow in comparison to everyday perception.
timelapse | 3 | Noticeable acceleration of the viewers' time perception. Movements appear unnaturally fast in comparison to everyday perception. Objects that may otherwise be perceived as static, such as plants, can become animated through this technique.
still | 4 | Continued immobilisation of the viewers' time perception. The same static image is shown over a perceivable time span.
freeze | 5 | Noticeable freezing of the viewers' time perception. A moving image is abruptly stopped and the static image is shown over a perceivable time span. 'Freeze' may also apply to a still that starts to move/is set into motion.
backwards | 6 | Noticeable reversal of the viewers' time perception. Motion sequences of objects and figures (such as falling rain) appear reversed and behave opposite to everyday expectations.
normal | 7 | The viewers' time perception is not noticeably altered. Playback speed does not show any conspicuous features. Objects and figures move at a 'normal' speed, in accordance with everyday expectations.
Depth of Field
Impression of the depth of field, i.e. the extension of the image area in which things appear
sharp. This annotation type provides a basic classification of the degree of depth of field.
• Single Value Annotation Type
• Advene Label: Cam | Depth of Field
Value | Shortcut | Description
high | 1 | Foreground to background, i.e. all planes of the image, are notably sharp, the outlines clearly visible.
low | 2 | Only a notably shallow plane of the image is sharp, closer or further planes are out of focus, the outlines blurred.
out of focus | 3 | All planes of the image are out of focus, so all outlines are blurred, however vaguely perceptible.
none | 4 | The image composition does not allow any conclusions about the focus of the camera. Examples include a black screen or an animation.
Defocus
Perceived dynamics of (un)sharpness. This annotation type provides a basic classification of
different forms of defocus, blur or racking focus. This can either be an effect of camera
recording or subsequently be done in post production.
• Single Value Annotation Type
• Advene Label: Cam | Defocus
Value | Shortcut | Description
motion blur | 1 | Because of movement, the outlines of an element of the image are blurred, even though they are focused by the camera. This often occurs when the shutter of the camera is opened long enough to capture the light of the moving object in changing places. Motion blur can also be added later as a visual effect in post production.
rack focus | 2 | The focused plane of the image moves during the shot. This is especially noticeable when the focused object changes, e.g. when first the foreground is in focus, then the background.
focus | 3 | The image is out of focus with all outlines blurred, then the focus is pulled on an object which suddenly becomes sharp and clear.
defocus | 4 | At least part of the image is focused, then the focus is changed in such a way that nothing in the image is sharp anymore and all outlines are blurred.
Camera Movement Unit
Free description of a coherently perceived camera movement, i.e. a short description of its
characteristics such as the quality of movement, orientation in space and towards figures. It
may also be a camera movement across shot borders, e.g. a continued panning.
• Free Text Annotation Type
• Advene Label: Cam | Camera Movement Unit
Camera Movement Type
Perception of the frame as mobile vision, i.e. the impression of a physical or virtual camera
movement. Different types of camera movement can occur at the same time, e.g. zooming in
while the camera pans. This annotation type provides a classification of different types of
camera movement.
• Multiple Value Annotation Type
• Advene Label: Cam | Camera Movement Type
Value | Shortcut | Description
pan | 1 | The camera rotates around its vertical axis (without changing its position in space, but can also be combined with other camera movements). Thereby the visual field shifts to the left or to the right.
tilt | 2 | The camera rotates around its horizontal axis (without changing its position in space, but can also be combined with other camera movements). Thereby the visual field shifts up or down.
tracking shot | 3 | The camera moves linearly, mostly on a horizontal plane. It often follows a moving object, keeping it framed.
zoom | 4 | When zooming, the camera itself stays unmoved; only its field of vision is widened or narrowed by altering the focal length of the lens. Thereby the gaze of the camera 'closes in' or 'backs off', while the relative distances in the image stay unchanged, whereas they would change during an actual movement of the camera.
shaking | 5 | The camera is constantly (slightly) moving through micro-movements in various directions. As a result, the camera view seems to be embedded into its surrounding world in a perceivable way. The attention can thereby be drawn to the carrier of the gaze and thus imply a subjective point of view.
floating | 6 | The camera moves freely, i.e. its position in space as well as its own axis can be in motion. Typically, the movement is rather fluent and smooth.
minimal | 7 | The camera moves very little, the movement is hardly perceptible, e.g. minimal reframings.
static | 8 | The camera is absolutely motionless.
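Since several camera movement types can occur at the same time, a multiple-value annotation can be treated as an unordered set of values from the table above. The following Python sketch shows one way to do this; the comma-separated notation and the function name 'parse_movement_types' are assumptions made for illustration.

```python
# Values of 'Camera Movement Type', copied from the table above.
CAMERA_MOVEMENT_TYPES = (
    "pan", "tilt", "tracking shot", "zoom",
    "shaking", "floating", "minimal", "static",
)


def parse_movement_types(annotation: str) -> frozenset[str]:
    """Return the set of movement types named in one annotation."""
    values = [v.strip() for v in annotation.split(",") if v.strip()]
    unknown = [v for v in values if v not in CAMERA_MOVEMENT_TYPES]
    if unknown:
        raise ValueError(f"unknown camera movement type(s): {unknown}")
    return frozenset(values)
```

With this representation, 'zoom, pan' and 'pan, zoom' describe the same combination, which matches the example of zooming in while the camera pans.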
Camera Movement Speed
Perceived degree of the (relative) movement speed of the camera. This annotation type
provides a scale for the perceived camera speed from slow to fast.
• Multiple Value Annotation Type
• Ordered from slow to fast + alternating
• Evolving Annotation Type [TO]
• Advene Label: Cam | Camera Movement Speed
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
slow | 2 | The camera moves comparatively slowly.
medium | 3 | The camera movement can be characterised by a comparatively medium degree of speed, i.e. the camera moves neither significantly fast nor slow.
fast | 4 | The camera moves comparatively fast.
alternating | 5 | The camera movement varies significantly in speed.
∎Camera Movement Direction
Perceived movement direction of the camera. This annotation type provides a basic
classification of movement directions. The order of the values describes the temporal
succession of the movement. The directions are named in relation to the viewers' perspective
in front of the screen/display.
• Multiple Value Annotation Type
• Advene Label: Cam | Camera Movement Direction
Value | Shortcut | Description
left | 1 | The camera moves noticeably to the left.
right | 2 | The camera moves noticeably to the right.
up | 3 | The camera moves noticeably upwards.
down | 4 | The camera moves noticeably downwards.
forward | 5 | The camera moves noticeably forwards, i.e. into the depth of the image.
back | 6 | The camera moves noticeably backwards.
circle | 7 | The camera circles around a center, often an object or a figure.
canted | 8 | The camera noticeably cants around its axis to the left or right.
undirected | 9 | The camera moves in a complex or subtle way that does not allow the attribution of specific dominant camera directions.
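Because the order of the values describes the temporal succession of the movement, an annotation of this type is best kept as an ordered sequence rather than a set. The sketch below expands a sequence of shortcut keys into direction values and renders the succession as text. The function names and the idea of entering shortcuts as one string of digits are assumptions made for illustration; the shortcut mapping itself is copied from the table above.

```python
# Shortcut key -> value, copied from the Camera Movement Direction table.
DIRECTION_SHORTCUTS = {
    "1": "left", "2": "right", "3": "up", "4": "down", "5": "forward",
    "6": "back", "7": "circle", "8": "canted", "9": "undirected",
}


def expand_shortcuts(keys: str) -> list[str]:
    """Map shortcut keys to directions, preserving the order in which they were entered."""
    return [DIRECTION_SHORTCUTS[k] for k in keys]


def describe(directions: list[str]) -> str:
    """Render the temporal succession of directions as readable text."""
    return ", then ".join(directions)
```

For instance, the keys '13' stand for a movement to the left followed by a movement upwards, which is a different annotation from '31'.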
∎Camera Angle
Perceived vertical angle of (camera) vision. This annotation type provides a scale for camera
angles from extreme high angle to extreme low angle.
• Single Value Annotation Type
• Ordered from extreme high-angle to extreme low-angle + neither
• Evolving Annotation Type [TO]
• Advene Label: Cam | Camera Angle
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
extreme high-angle | 2 | The camera is pointed extremely downwards, e.g. in a bird's eye perspective.
high-angle | 3 | The camera is pointed slightly downwards.
straight-angle | 4 | The camera is pointed horizontally, i.e. parallel to the ground.
low-angle | 5 | The camera is pointed slightly upwards.
extreme low-angle | 6 | The camera is pointed extremely upwards, away from the ground, e.g. in a worm's eye perspective.
neither | 7 | There is no reference point to account for the camera angle in relation to the ground.
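The scale above is ordered, but the value 'neither' stands outside that order. One possible way to model this is sketched below in Python: the five angles form an ordered enumeration, while 'neither' is mapped to no position at all. The class and function names are hypothetical and chosen only for this example.

```python
from enum import IntEnum
from typing import Optional


class CameraAngle(IntEnum):
    """Ordered positions on the scale from extreme high-angle to extreme low-angle."""
    EXTREME_HIGH_ANGLE = 1
    HIGH_ANGLE = 2
    STRAIGHT_ANGLE = 3
    LOW_ANGLE = 4
    EXTREME_LOW_ANGLE = 5


LABELS = {
    "extreme high-angle": CameraAngle.EXTREME_HIGH_ANGLE,
    "high-angle": CameraAngle.HIGH_ANGLE,
    "straight-angle": CameraAngle.STRAIGHT_ANGLE,
    "low-angle": CameraAngle.LOW_ANGLE,
    "extreme low-angle": CameraAngle.EXTREME_LOW_ANGLE,
}


def angle_rank(value: str) -> Optional[CameraAngle]:
    """Return the position on the ordered scale, or None for 'neither'."""
    if value == "neither":
        return None
    return LABELS[value]  # raises KeyError for unknown values


def steps_between(start: str, end: str) -> Optional[int]:
    """Number of scale steps from start to end; None if either value is 'neither'."""
    a, b = angle_rank(start), angle_rank(end)
    if a is None or b is None:
        return None
    return b - a
```

A positive result means that the view moves towards the low-angle end of the scale, a negative result towards the high-angle end.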
Camera Angle Canted
Perceived angle around the rolling axis of vision, so that the horizon is not/would not be
parallel to the lower border of the image. This annotation type provides a basic classification
of several forms of canted angle.
• Single Value Annotation Type
• Evolving Annotation Type [TO]
• Advene Label: Cam | Camera Angle Canted
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
tilt left | 2 | The camera is tilted towards the left.
tilt right | 3 | The camera is tilted towards the right.
level | 4 | The camera is level in relation to the ground.
inverted | 5 | The camera is tilted 180°. Up and down are thus inverted.
Camera Angle Vertical Position
Perceived height of the camera view. This annotation type provides a scale for the vertical
positioning of the camera in reference to the eye level of reference figures of a specific shot.
• Single Value Annotation Type
• Evolving Annotation Type [TO]
• Advene Label: Cam | Camera Angle Vertical Position
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
low | 2 | The camera position (relative to a vertical axis) is significantly beneath the eye level of a reference figure.
medium | 3 | The camera position (relative to a vertical axis) is approximately at the eye level of a reference figure. Films, and even individual scenes, can establish different average levels of the camera position height.
high | 4 | The camera position (relative to a vertical axis) is significantly above the eye level of a reference figure.
indifferent | 5 | The vertical positioning of the camera in relation to figures cannot be determined since there is no reference point.
Lens
Impression of a particular camera lens used in a specific shot. This annotation type provides
a basic classification of lens types.
• Single Value Annotation Type
• Evolving Annotation Type [TO]
• Advene Label: Cam | Lens
Value | Shortcut | Description
[TO] | 1 | Syntax element that indicates a continuous development between two values.
wide-angle | 2 | The image covers a notably wider visual field in comparison to the human eye. Thereby the distances on the width of the image are shortened, objects on the x-axis are compressed.
telephoto | 3 | The image covers a notably tighter visual field in comparison to the human eye. Thereby the distances on the depth of the image are shortened, objects on the z-axis are compressed.
fisheye | 4 | The image covers a much larger visual field in comparison to the human eye. Objects on the x- and y-axis are compressed not proportionally but increasingly towards the edges of the image. Thereby the lines are distorted like in a concave mirror.
Image Composition
Aesthetic parameters that concern individual shots (and segments of shots) as graphic surfaces,
i.e. as arrangements and connections of formal elements: from visual patterns and colour
design to the movements of the objects within the frame. All visual effects that refer directly
to the camera as viewer and agent are annotated in ‘camera’.
Types:
∎Field Size
∎Image Brightness
Light Contrast
Colour Design
Colour Composition
Colour Saturation
∎Colour Range
Colour Accent
Texture
Animation
Visual Pattern
Aspect Ratio
∎Image Intrinsic Movement
∎Dominant Movement Direction
Movement Impression
Splitscreen Number
Splitscreen Shape
Splitscreen Dynamics
Frame in Frame
Spatial Arrangement
∎Field Size
The 'Field Size' is determined by the perceived size relation between a central object and the
framing of a shot. This relation can be perceived as the distance to an object of reference or as
how much of the centred subject and its surroundings is visible; it thereby establishes the
distance/proximity of the spectator to the events. Besides human bodies, reference objects can
also be other figures (e.g. animals, machines). The spectrum is divided into 8 different field
sizes from wide to near in accordance with Faulstich: Grundkurs Filmanalyse, 2002; Hickethier:
Film- und Fernsehanalyse, 2001; Mikos: Film- und Fernsehanalyse, 2003. Additionally, there is a
category for shots without a distinct reference object.
• Multiple Value Annotation Type
• Ordered from far to close + neither
• Evolving Annotation Type [TO]
• Advene Label: ImCo | Field Size
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
extreme long shot
2
In this framing the reference object (most often a human
body) is or would be perceived as very small. This framing
is used, for example, to show vast landscapes or skylines.
long shot
3
In this framing the reference object (most often a human
body) is or would be completely visible (a standing person
from head to toe). This field size shows the entire reference
object in its surroundings.
medium long shot
4
In this framing the reference object (most often a human
body) is or would be visible almost completely (a standing
person from the head to the shins).
medium shot
5
In this framing the reference object (most often a human
body) is or would be only half visible (a standing person
from the head to the waist/the thighs), so that, for example,
the pistol in Western movies would still be visible.
medium closeup
6
In this framing only a third of the reference object (most
often a human body) is or would be visible (a standing
person from the head to the chest).
shoulder closeup
7
In this framing only a quarter of the reference object (most
often a human body) is or would be visible (a standing
person from the head to the shoulders). This field size is
often used for shot-reverse-shot dialogue sequences.
closeup
8
In this framing the reference object (most often a human
body) is or would only be partly visible (e.g. only the face of
a person). The close-up has a referential relationship to the
human face and can also depict other details in 'face-like'
framing.
extreme closeup
9
In this framing the reference object (most often a human
body) is or would only be visible minimally, i.e. a small
detail (e.g. only the eye of a person).
neither
In this framing the reference object is unclear or
nonexistent, e.g. the image of a cloudy sky filling up the
frame could be either a closeup of a cloud or a long shot of
the sky. This also applies when the composition of the image
is abstract or does not create an image space (e.g. the
credit sequence of a film).
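The ordering of the field sizes and the role of the [TO] element can be illustrated with a small Python sketch. It is not part of the ontology or of Advene: the comma-separated serialisation ("value,[TO],value") and the function names are assumptions made for illustration only; the value names are taken from the table above.

```python
# Field sizes in the documented order from far to close (shortcuts 2-9).
FIELD_SIZES = [
    "extreme long shot",
    "long shot",
    "medium long shot",
    "medium shot",
    "medium closeup",
    "shoulder closeup",
    "closeup",
    "extreme closeup",
]


def parse_field_size(annotation: str) -> list[str]:
    """Split a comma-separated annotation (assumed format) into its parts."""
    parts = [part.strip() for part in annotation.split(",")]
    for part in parts:
        if part not in FIELD_SIZES and part not in ("[TO]", "neither"):
            raise ValueError(f"unknown field size value: {part!r}")
    return parts


def development(annotation: str) -> str:
    """Return 'closer', 'farther', 'static', or 'none' for an annotation.

    'none' means the annotation has no [TO] element or no two ordered values.
    """
    parts = parse_field_size(annotation)
    if "[TO]" not in parts:
        return "none"
    ordered = [p for p in parts if p in FIELD_SIZES]
    if len(ordered) < 2:
        return "none"
    start = FIELD_SIZES.index(ordered[0])
    end = FIELD_SIZES.index(ordered[-1])
    if end > start:
        return "closer"
    if end < start:
        return "farther"
    return "static"
```

For example, under these assumptions development("long shot,[TO],closeup") reports a development towards the reference object, while a plain "medium shot" annotation reports no development.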
∎Image Brightness
Perceived light intensity of a shot. This annotation type provides a rating scale for image
brightness that refers to film-intrinsic variations and not to absolute values.
• Single Value Annotation Type
• Ordered from dark to bright
• Evolving Annotation Type [TO]
• Advene Label: ImCo | Image Brightness
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
dark
2
There is little light, the general impression is a dark image.
medium
3
The lighting is neither noticeably bright nor noticeably dark.
bright
4
There is a lot of light, the general impression is a bright
image.
bright-dark
5
There are noticeably bright and noticeably dark areas in the
image.
Light Contrast
'Light Contrast' refers to the perceived degree of difference between the darkest and brightest
areas of the frame. This annotation type provides a rating scale from low to high contrast.
• Single Value Annotation Type
• Ordered from low to high
• Evolving Annotation Type [TO]
• Advene Label: ImCo | Light Contrast
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
low
2
All areas of the image are equally bright or dark, there are
hardly any differences in the brightness.
medium
3
Different areas of the image are slightly differently lit, there
are inconspicuous differences in the brightness of the
image.
high
4
There is a noticeably sharp contrast between bright and
dark areas in the image.
Colour Design
Perceived quality of the colour space of a sequence. This annotation type provides the basic
distinction of colour design patterns, shapes, and qualities that are dominant in a particular
segment.
• Multiple Value Annotation Type
• Contrasting Annotation Type [VS]
• Advene Label: ImCo | Colour Design
Value
Shortcut
Description
[VS]
1
Syntax element that connects two contrasting values.
pastel
2
The image is dominated by light, pale colours.
neon
3
The image is dominated by glaring, gaudy colours of high
luminosity.
primary
4
The image is dominated by the primary colours red, yellow,
green, and blue. The colours used are kept pure and
luminous. Colour fields are typically monochrome and
clearly delineated.
shades
5
The image is dominated by mixed colours, which appear
rather muted and subdued due to their lack of purity as well
as their low luminosity. Shades include e.g. various hues of
brown and olive.
monochrome
6
Only gradations of a single colour varied in brightness
and/or saturation are used. When using colour filters, this
can also include black and white.
colourful
7
The image is dominated by various, mostly luminous
colours. Brightness and saturation of the colours/the
shades may vary.
dominated by
8
The image is dominated by one colour/one shade. Other
colours may appear in the image, but designated colour
selection particularly stands out.
broad range
9
The image is dominated by many different colours referring
to different areas in the colour spectrum, e.g. by blue,
yellow, and red shades.
narrow range
The image is dominated by only a few different colours
referring to a small area in the colour spectrum, e.g. only by
purple and blue shades.
areal
The image is dominated by large, coherent areas, each
defined by one colour/one shade.
pattern
Colours form a noticeable pattern, e.g. a black-and-white
floor tiled like a chessboard or a colourfully striped
wallpaper. The colour pattern may also be purely abstract,
i.e. without any representational relation.
combination of
The image is dominated by two (or more) colours/shades
that interact with each other, e.g. harmonize with or
contrast each other.
warm
The image is dominated by shades of red to yellow
(including brown). Warm colours often appear to be
pleasant, cozy, and welcoming.
cold
The image is dominated by blue to blueish-green shades.
Cold colours often appear to be cool, repellent and distant.
strong
The image is dominated by pure colours of high luminosity,
which therefore appear very present. A high saturation may
intensify the impression of strong colours.
muted
The image is dominated by mixed colours of low luminosity,
which therefore appear rather dim and subdued. Beige,
brown, and olive are examples of muted colours.
tint
An image is tinged when all depicted colours are shifted
towards one colour (the degree of colouration may vary).
This can be caused by the fact that the image or parts of it
are colourised (in post production) as well as by camera
work, e.g. if the camera's white balance does not
correspond to the temperature of the recorded light, so that
the image is tinged with blue or yellow-orange.
Colour Composition
Perceived spatial colour distribution within the image, as well as the general colour pattern of
a shot and its possible transformation over time. This annotation type operates with a free
description.
• Free Text Annotation Type
• Advene Label: ImCo | Colour Composition
Colour Saturation
Impression of colourfulness (in proportion to its brightness) in the range from pale to colourful.
This annotation type provides a rating scale for the colour saturation of an image.
• Single Value Annotation Type
• Ordered from low to high
• Evolving Annotation Type [TO]
• Advene Label: ImCo | Saturation
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
low
2
Weak colour application, achromatic impression.
medium
3
Medium degree of colour application, neither strong nor
weak saturation.
high
4
Strong colour application, colourful impression.
∎Colour Range
The perceived range of (main) colours in a sequence. In this annotation type, for the purpose
of comparability, colours have to be picked from a reduced set of colours. A description of the
colour impression is combined with a hexcode of the corresponding colour value as a
reference.
• Multiple Value Annotation Type
• Advene Label: ImCo | Colour Range
Value
Shortcut
Description
Colour
red
1
Strong, pure (luminous) red.
#ff0000
darkred
2
Dark, strong red.
#8b0000
tomato
3
Light, pale red.
#ff6347
firebrick
4
Dark, pale red.
#b22222
crimson
5
Strong red with a tinge of blue.
#dc143c
blue
6
Strong, pure (luminous) blue.
#0000ff
skyblue
7
Light, pale blue.
#87ceeb
royalblue
8
Dark, pale blue.
#4169e1
darkblue
9
Dark, strong blue.
#00008b
steelblue
Pale blue with a tinge of grey.
#4682b4
cyan
Strong, pure (luminous) cyan.
#00ffff
darkcyan
Dark, strong cyan.
#008b8b
aquamarine
Light cyan with a tinge of green.
#7fffd4
green
Strong, pure (luminous) green.
#00ff00
darkgreen
Dark, strong green.
#006400
greenyellow
Strong green with a tinge of yellow.
#adff2f
olivedrab
Muted olive.
#6b8e23
darkolivegreen
Dark, muted olive.
#556b2f
khaki
Light, pale khaki.
#f0e68c
darkkhaki
Dark, pale khaki.
#bdb76b
saddlebrown
Dark, strong brown.
#8b4513
sandybrown
Light, strong brown.
#f4a460
gold
Strong, pure gold.
#ffd700
goldenrod
Dark, strong gold.
#daa520
yellow
Strong, pure (luminous) yellow.
#ffff00
orange
Strong, pure orange.
#ffa500
darkorange
Dark, strong orange.
#ff8c00
coral
Strong orange with a tinge of pink.
#ff7f50
salmon
Light pink with a tinge of orange.
#fa8072
pink
Light, pale pink.
#ffc0cb
deeppink
Strong, pure (luminous) pink.
#ff1493
violet
Light, strong purple.
#ee82ee
purple
Strong, pure purple.
#a020f0
purple4
Dark, strong purple.
#551a8b
magenta
Strong, pure (luminous) magenta.
#ff00ff
wheat1
Dark, muted beige.
#f5deb3
antiquewhite
Light, muted beige.
#faebd7
ivory
Muted white with a tinge of beige.
#fffff0
white
Pure white.
#ffffff
grey
Strong, pure grey.
#bebebe
lightgrey
Light, pale grey.
#d3d3d3
dimgrey
Dark, strong grey.
#696969
silver
Strong, pure silver.
#d0d0d0
black
Pure black.
#000000
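Because every value pairs a name with a hexcode, the reduced colour set can be treated as a lookup table. The following Python sketch maps an arbitrary measured colour to the nearest palette entry. It is only an illustration: the annotation itself records a perceived colour impression, the function names are hypothetical, only a subset of the palette is listed, and plain Euclidean distance in RGB is an assumed simplification that does not model human colour perception.

```python
# Subset of the 'Colour Range' values (name -> hexcode as listed in this section).
PALETTE = {
    "red": "#ff0000",
    "darkred": "#8b0000",
    "blue": "#0000ff",
    "skyblue": "#87ceeb",
    "darkblue": "#00008b",
    "cyan": "#00ffff",
    "yellow": "#ffff00",
    "orange": "#ffa500",
    "white": "#ffffff",
    "black": "#000000",
}


def hex_to_rgb(code: str) -> tuple[int, int, int]:
    """Convert a hexcode such as '#87ceeb' into an (r, g, b) tuple."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def nearest_colour(code: str, palette: dict[str, str] = PALETTE) -> str:
    """Return the palette name whose hexcode is closest to `code` in RGB space."""
    r, g, b = hex_to_rgb(code)

    def squared_distance(name: str) -> int:
        pr, pg, pb = hex_to_rgb(palette[name])
        return (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2

    return min(palette, key=squared_distance)
```

A colour sampled from a frame, such as "#88cdea", would be assigned to skyblue under these assumptions; an annotator may still prefer a different value if the perceived impression differs.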
Colour Accent
Colour(s) that – despite covering only a small fragment of the image – stand(s) out prominently
from the predominant range of colours and thus capture(s) the attention of viewers.
• Multiple Value Annotation Type
• Advene Label: ImCo | Colour Accent
Same values as in ‘Colour Range’. Please see above.
Texture
'Texture' as the perceived material quality is understood here as surface attribute of the film
image itself and not of the represented surfaces. These textures can refer to the technical
basis of the mediality of the image. This annotation type provides a basic classification of image
textures.
• Single Value Annotation Type
• Advene Label: ImCo | Texture
Value
Shortcut
Description
grainy
1
The texture of the image is grainy, meaning interlaced with
visible microstructures. Film grain is typical for analogue
film.
blurred
2
The texture of the image is blurred, meaning that contours
seem to be washed out. This can result e.g. from a
softening effect.
clear
3
The texture of the image is inconspicuous, the materiality of
the image does not stand out.
traces of projection
4
The image repeatedly shows scratches, grinding marks,
and other distortions. These traces of projection (and
abrasion) become an integral part of the materiality of the
image and can e.g. draw attention to it.
pixelated
5
The texture of the image is pixelated, i.e. pixels as basic
elements of the image are perceived. This is often due to a
low resolution or compression artifacts. Thus the attention
is drawn to the digital mediality of the image.
other
6
The texture of the image cannot be classified with any of
the other values.
Animation
Impression that a sequence is animated and not live action. This annotation type provides a
basic classification of different forms of animation.
• Single Value Annotation Type
• Advene Label: ImCo | Animation
Value
Shortcut
Description
3d-animation
1
A digital animation which suggests depth of space.
2d-animation digital
2
A digital animation which does not suggest depth of space.
2d-animation drawing
3
A classical animation consisting of individually drawn
frames (or the impression thereof).
stop motion
4
An animation that is based on the series of still images of
unmoving objects which (in continuous playback) gives the
impression of a coherent movement. Dolls or flexible
material are frequently used.
composite
5
Different kinds of animation are combined.
Visual Pattern
Abstract patterns of visual, graphical forms and structures in the image. It includes prominent
shapes and lines, as well as divisions of the image. This annotation type provides a basic
classification of visual patterns that characterise the image in a given segment.
• Multiple Value Annotation Type
• Evolving Annotation Type [TO]
• Contrasting Annotation Type [VS]
• Advene Label: ImCo | Visual Pattern
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
[VS]
2
Syntax element that connects two contrasting values.
diagonal
3
Diagonal lines structure the image, i.e. lines which run
approximately from bottom left to top right or from top left to
bottom right.
vertical
4
Vertical lines structure the image, i.e. lines that run from
top to bottom.
horizontals
5
Horizontal lines structure the image, i.e. lines that run from
left to right.
circular
6
Circles, round or oval shapes characterise the image.
rectangular
7
Shapes with 4 or more corners characterise the image.
triangle
8
Triangular shapes characterise the image.
grid
9
Net-like crossing lines characterise the image.
chaos
Many different shapes and lines characterise the image.
vanishing point
The lines pointing into the depth of the image run towards
one vanishing point.
symmetry
The halves of the image appear to be approximately
mirrored in the middle.
Centre figure
One figure or object occupies the centre of the image.
2-division
The image is split in 2 distinct areas, e.g. because of a
diegetic framing such as a wall.
3-division
The image is split in 3 distinct areas, e.g. because of a
diegetic framing such as walls or windows.
4-division
The image is split in 4 distinct areas, e.g. because of a
diegetic framing such as walls or windows.
frame
The image is framed, e.g. by a television, a window, a
mirror, or a door frame.
Aspect Ratio
Proportional ratio between image width and height. In cases of split-screen sequences,
multiple aspect ratios can be present simultaneously. This annotation type provides a scale of
different aspect ratios.
• Single Value Annotation Type
• Ordered from >21:9 to <9:16
• Evolving Annotation Type [TO]
• Contrasting Annotation Type [VS]
• Advene Label: ImCo | Aspect Ratio
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
[VS]
2
Syntax element that connects two contrasting values.
>21:9
3
Aspect ratio wider than 21:9.
21:9
4
Aspect ratio approximately 21:9.
16:9
5
Aspect ratio approximately 16:9.
4:3
6
Aspect ratio approximately 4:3.
1:1
7
Aspect ratio approximately 1:1.
9:16
8
Aspect ratio approximately 9:16.
<9:16
9
Aspect ratio narrower than 9:16.
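The scale above can be read as a set of buckets around the named ratios. The following Python sketch assigns a pixel width and height to one of the values. The function name and the 5 % tolerance that decides when a ratio counts as wider than 21:9 or narrower than 9:16 are assumptions for illustration; the ontology only speaks of "approximately".

```python
import math

# Named ratios from the 'Aspect Ratio' table, ordered from wide to narrow.
RATIOS = [
    ("21:9", 21 / 9),
    ("16:9", 16 / 9),
    ("4:3", 4 / 3),
    ("1:1", 1.0),
    ("9:16", 9 / 16),
]


def classify_aspect_ratio(width: float, height: float, tolerance: float = 0.05) -> str:
    """Map image dimensions to the nearest 'Aspect Ratio' value.

    `tolerance` (assumed, not from the ontology) is the relative margin beyond
    the widest/narrowest named ratio before '>21:9' or '<9:16' is returned.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    if ratio > RATIOS[0][1] * (1 + tolerance):
        return ">21:9"
    if ratio < RATIOS[-1][1] * (1 - tolerance):
        return "<9:16"
    # Compare in log space so that relative, not absolute, differences count.
    return min(RATIOS, key=lambda item: abs(math.log(ratio / item[1])))[0]
```

With these assumptions a 1920x1080 image is classified as 16:9 and a 1080x1920 image as 9:16. In split-screen sequences the function would have to be applied to each partial image separately.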
∎Image Intrinsic Movement
Perceived overall degree of movement of all things within the frame. This annotation type
provides a scale from static to very dynamic for the rating of image-intrinsic movement.
• Single Value Annotation Type
• Ordered from 0 to 3
• Evolving Annotation Type [TO]
• Contrasting Annotation Type [VS]
• Advene Label: ImCo | Image Intrinsic Movement
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
[VS]
2
Syntax element that connects two contrasting values.
0
3
No movement in the image.
1
4
Little movement in the image.
2
5
Some movement in the image.
3
6
A lot of movement in the image.
∎Dominant Movement Direction
Dominant impression of the direction of movements in the image, e.g. of objects or figures. This
annotation type provides a basic classification of the movement direction.
• Multiple Value Annotation Type
• Evolving Annotation Type [TO]
• Contrasting Annotation Type [VS]
• Advene Label: ImCo | Dominant Movement Direction
Value
Shortcut
Description
[TO]
1
Syntax element that indicates a continuous development
between two values.
[VS]
2
Syntax element that connects two contrasting values.
left
3
Image elements move dominantly to the left.
right
4
Image elements move dominantly to the right.
up
5
Image elements move dominantly to the top of the image.
down
6
Image elements move dominantly to the bottom of the
image.
towards
7
Image elements move dominantly towards the camera.
away
8
Image elements move dominantly away from the camera.
undirected
9
Image elements move but without a dominant direction.
inward
Image elements move from different directions towards the
centre.
outward
Image elements move from different directions away from
the centre.
spin
Image elements move predominantly in spinning motions.
Movement Impression
Perceived overall impression of the forms of movement in the image. This annotation type
provides a basic classification of movement impressions: directional, harmonic, confrontative,
or chaotic.
• Single Value Annotation Type
• Advene Label: ImCo | Movement Impression
Value
Shortcut
Description
directional
1
The forms of movement in the image seem directional, like
striving towards an aim. This means they run in rather
straight lines, steady and focused, not jerky.
harmonic
2
The forms of movement in the image seem harmonic,
which means they are often soft, gentle, and adaptive to
one another or their surroundings.
confrontative
3
The forms of movement in the image seem confrontative,
i.e. they are often hard, disharmonic, antagonistic to
one another, colliding with their surroundings or bouncing off
one another.
chaotic
4
The forms of movement in the image seem chaotic, i.e.
they are often undirected or changing their direction,
unsteady or jerky.
Splitscreen Number
The number of splitscreens in the image in a given time segment. This annotation type provides
a scale for the number of splitscreens.
• Single Value Annotation Type
• Advene Label: ImCo | Splitscreen Number
Value
Shortcut
Description
1
1
The number of splitscreens is 1, i.e. one screen that
does not fill up the whole image; the rest of the image can be
in a monochromatic colour, often black.
2
2
The number of splitscreens is 2.
3
3
The number of splitscreens is 3.
4
4
The number of splitscreens is 4.
5
5
The number of splitscreens is 5.
6
6
The number of splitscreens is 6.
7
7
The number of splitscreens is 7.
8
8
The number of splitscreens is 8.
9
9
The number of splitscreens is 9.
10
The number of splitscreens is 10.
11
The number of splitscreens is 11.
12
The number of splitscreens is 12.
more
The number of splitscreens is larger than 12.
Splitscreen Shape
The shape of splitscreen segments in the image. Several different shapes can be present
simultaneously. This annotation type provides a basic selection of shapes.
• Multiple Value Annotation Type
• Advene Label: ImCo | Splitscreen Shape
Value
Shortcut
Description
rectangular
1
The shape of one or more splitscreens is rectangular.
round
2
The shape of one or more splitscreens is round.
triangular
3
The shape of one or more splitscreens is triangular.
organic
4
The shape of one or more splitscreens is organic.
other
5
The shape of one or more splitscreens cannot be assigned
to one of the other values. The specific quality or
characteristics can be specified in brackets.
Splitscreen Dynamics
Dynamicity of splitscreens in the image, i.e. their movement or scale variations or changes.
Different impressions can be present simultaneously, e.g. one splitscreen is moving and
another is static. This annotation type provides a basic distinction between dynamic and static.
• Single Value Annotation Type
• Advene Label: ImCo | Splitscreen Dynamics
Value
Shortcut
Description
static
1
One or more splitscreens are static, i.e. the screen(s) are not
moving or changing their form.
dynamic
2
One or more splitscreens are dynamic, i.e. the screen(s) are
moving or changing their form.
Frame in Frame
Free description of the frame-in-frame designs, e.g. the relation of frames to each other, such
as in the case of screens in the picture, photographs or other types of images.
• Free Text Annotation Type
• Advene Label: ImCo | Frame in Frame
Spatial Arrangement
'Spatial Arrangement' refers to the perceived composition of the image space. Different staging
strategies highlight different aspects of this space, e.g. highlighting the foreground or
establishing a dominant left-right axis. This annotation type provides a basic classification of
different spatial areas or axes which shape the spatial arrangement of the selected segment.
• Single Value Annotation Type
• Advene Label: ImCo | Spatial Arrangement
Value
Shortcut
Description
foreground
1
The foreground is emphasised through the spatial
arrangement and/or the staging.
background
2
The background is emphasised through the spatial
arrangement and/or the staging.
fore-/background
3
The fore-/background axis is emphasised through the
spatial arrangement and/or the staging.
left
4
The left side is emphasised through the spatial
arrangement and/or the staging.
right
5
The right side is emphasised through the spatial
arrangement and/or the staging.
left-right
6
The left-right axis is emphasised through the spatial
arrangement and/or the staging.
top
7
The top is emphasised through the spatial arrangement
and/or the staging.
bottom
8
The bottom is emphasised through the spatial arrangement
and/or the staging.
top-bottom
9
The top-bottom axis is emphasised through the spatial
arrangement and/or the staging.
Language
Semantic dimension of language use: transcriptions of all significant instances of written or
spoken word. Background murmurs and text inserts too complex or too small are excluded.
Types:
∎Dialogue Text
Text Diegetic
Text Nondiegetic
Text Aesthetics
∎Dialogue Text
'Dialogue Text' refers to the understandable, spoken language on the audio track of a film. This
usually refers to dialogue and voice-over commentary, but also to spoken chorus. This annotation
type provides a transcript of these utterances. A change of speaker or a pause marks the beginning
of a new transcription unit.
• Free Text Annotation Type
• Advene Label: Lg | Dialogue Text
Text Diegetic
'Text Diegetic' refers to written language that is visible in the frame, highlighted in the
audiovisual staging, and part of the world the film creates. Examples include text on screens
or company logos on buildings – both in fiction films and documentaries. This annotation type
provides the transcription of diegetic text.
• Free Text Annotation Type
• Advene Label: Lg | Text Diegetic
Text Nondiegetic
'Text Nondiegetic' refers to written language that is visible in the frame and not part of the world
the film creates. Examples include fixed subtitles, captions or intertitles. This annotation type
provides the transcription of nondiegetic text.
• Free Text Annotation Type
• Advene Label: Lg | Text Nondiegetic
Text Aesthetics
The aesthetic design of diegetic and nondiegetic writing in the image. This can include e.g.
colours, shapes, but also prominent movement impressions or other (temporal and spatial)
transformations. This annotation type operates with free text description. Example: "Subtitles
fade from white to red and dissolve as if they were liquid."
• Free Text Annotation Type
• Advene Label: Lg | Text Aesthetics
Montage
Staging strategies that only result from the interrelation of two or more shots. Montage refers
here to the cutting of subsequent or co-occurring shots, as well as to the assemblage of
sequences as temporal gestalts. The emphasis is on the visual editing; sound editing is
primarily annotated under ‘acoustics’.
Types:
Shot
∎Shot Duration
∎Montage Figure Macro
Montage Figure Micro
Montage Rhythm
Shot Transition
Image Split
Dynamics Of Space
Found Footage
∎Shot
A shot of a film is a perceivable continuous image and is bound by a "discontinuation of the
entire composition" (Fuxjäger: Wenn Filmwissenschaftler versuchen, sich Maschinen
verständlich zu machen, 2009, own translation). In this annotation type, all shots of a film are
numbered sequentially.
• Free Text Annotation Type
• Advene Label: Montg | Shot
∎Shot Duration
The temporal duration of a shot. A shot of a film is a perceivable continuous image and is
bound by a "discontinuation of the entire composition" (Fuxjäger: Wenn Filmwissenschaftler
versuchen, sich Maschinen verständlich zu machen, 2009, own translation). In this annotation
type, the shot duration is stated in seconds.
• Free Text Annotation Type
• Advene Label: Montg | Shot Duration
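How sequential shot numbers and shot durations in seconds relate to each other can be sketched in Python. The input format (a list of cut timestamps in seconds plus the end time of the film), the function name, and the dictionary keys are assumptions for illustration and do not describe how Advene or the AdA toolchain stores these annotations.

```python
def shots_from_cuts(cut_times: list[float], film_end: float) -> list[dict]:
    """Number shots sequentially and compute each shot's duration in seconds.

    `cut_times` are the (hypothetical) timestamps of the cuts between shots;
    the first shot is assumed to start at 0.0 and the last to end at `film_end`.
    """
    boundaries = [0.0] + sorted(cut_times) + [film_end]
    shots = []
    for number, (start, end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        shots.append(
            {
                "shot": number,                      # value for 'Shot'
                "start": start,
                "end": end,
                "duration": round(end - start, 3),   # value for 'Shot Duration'
            }
        )
    return shots
```

Given cuts at 2.5 and 4.0 seconds in a ten-second clip, the sketch yields three shots with durations of 2.5, 1.5, and 6.0 seconds.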
∎Montage Figure Macro
Clusters of shots or superordinate shot arrangements, i.e. multiple shots can interrelate
graphically, rhythmically, spatially, or temporally. This annotation type provides a selection of
macro montage figures, i.e. a classification of different relations between sequences of shots
or individual shots with respect to a larger segment of shots.
• Multiple Value Annotation Type
• Advene Label: Montg | Figure Macro
Value
Shortcut
Description
cross-cut
1
Two or more separate lines of action, which often refer to
each other in some meaningful way and/or take place at
the same time, are combined by switching between them.
This might be accompanied by a rhythmic culmination,
such as a cross-cut between the one chasing and the one
being chased.
shot-reverse-shot
2
In a conversation situation, at least two people are shown
alternately. Usually the shots have rather close field sizes
and approximately maintain the respective camera angles.
continuity
3
Continuity editing according to the rules of temporal and
spatial coherence.
montage
4
A series of comparatively short shots which compress time
and/or space in the viewers' perception. The sequence
often has a certain rhythm or a connective musical
accompaniment. Sometimes a thematic cluster of shots
without spatiotemporal connection.
circular
5
Repeating pattern in the direct succession of shots that not
only juxtaposes two types of shots (ABABA...) but develops
a more complex pattern, e.g. ABCDABCDA...
framing
6
Returning, similar, or referring shots which enclose one or
more other shots.
sequence shot
7
Single comparatively long shot which creates a meaningful
unit, often via elaborate/complex figure choreography
and/or camera movements.
Montage Figure Micro
Relations between directly successive shots, i.e. how one shot connects graphically,
rhythmically, spatially, or temporally to the shot before or after it. This annotation type
provides a selection of micro montage figures, i.e. a classification of different relations
between individual successive shots.
• Multiple Value Annotation Type
• Advene Label: Montg | Figure Micro
Value
Shortcut
Description
movement continuity
1
The shot picks up and continues the movement of the
preceding shot.
pov
2
The shot suggests being the subjective gaze of a figure
who was marked as the subject of the gaze in the
preceding shot or will be marked in the following shot.
match cut
3
Two successive shots have a noticeable resemblance. This
resemblance can concern image composition or movement
direction, but it can also concern the audio track in an audio
match cut.
cut in
4
Two successive shots are taken from approximately the
same perspective, only the second seems much closer.
cut out
5
Two successive shots are taken from approximately the
same perspective, only the second is much further away.
eyelinematch
6
The first shot shows a subject of a gaze, such as a figure
looking off-screen, and the successive shot shows the
object of the gaze, i.e. what the figure is looking at.
match on action
7
Two successive shots are matched at approximately the
same moment of movement, giving the impression of a
continuous, uninterrupted movement.
crossing the line
5
The concept of 'crossing the line' is based on the
convention of an axis of action, i.e. the axis between two
figures. If the axis is crossed between two shots, meaning if
the camera frames the action from the opposite side, the
objects in the image change position from left to right and
vice versa. Classic Hollywood cinema relies on an axis of
action within the image which must not be crossed to keep
the impression of continuity and the cuts ideally invisible.
jump cut
6
A cut between two shots which are (nearly) identical
regarding distance and framing but jump within a displayed
movement, often a temporal gap in the action.
match on action
7
Shot(s) which fall outside the established spatio-temporal
fabric, i.e. that seem to be inserted into a coherent
segment, e.g. a detail shot, or a cutaway within a space or
to another space or another time.
Montage Rhythm
'Montage Rhythm' refers to the perceived montage rhythm that is grounded in the variation of
consecutive shot lengths. This annotation type provides a basic classification of the rhythmic
profiles of segments.
• Multiple Value Annotation Type
• Advene Label: Montg | Rhythm
Value
Shortcut
Description
acceleration
1
In a series of successive shots, the duration of these shots
becomes shorter and shorter.
deceleration
2
In a series of successive shots, the duration of the shots
becomes longer and longer.
metric
3
In a series of successive shots, the duration of the shots is
remarkably unvarying.
alternating
4
The duration of a series of shots forms a repetitive pattern,
such as long-short-long-short or
short-short-long-short-short-long.
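The four rhythmic profiles can be approximated from a list of shot durations. The following Python sketch is a rough heuristic, not the annotation procedure: 'Montage Rhythm' describes a perceived rhythm, and both the function name and the 10 % tolerance for treating two durations as equal are assumptions for illustration.

```python
from typing import Optional


def classify_rhythm(durations: list[float], tolerance: float = 0.1) -> Optional[str]:
    """Suggest a 'Montage Rhythm' value for a series of shot durations (seconds).

    Returns None when no profile fits. `tolerance` (assumed) is the share of
    the mean duration within which two durations count as equal.
    """
    if len(durations) < 3:
        raise ValueError("need at least three shots to describe a rhythm")
    mean = sum(durations) / len(durations)
    margin = tolerance * mean
    pairs = list(zip(durations, durations[1:]))

    # metric: all durations stay close to the mean
    if all(abs(d - mean) <= margin for d in durations):
        return "metric"
    # acceleration / deceleration: strictly shorter or strictly longer shots
    if all(b < a for a, b in pairs):
        return "acceleration"
    if all(b > a for a, b in pairs):
        return "deceleration"
    # alternating: the series repeats with some period of two or more shots
    for period in range(2, len(durations) // 2 + 1):
        if all(
            abs(durations[i] - durations[i % period]) <= margin
            for i in range(len(durations))
        ):
            return "alternating"
    return None
```

For instance, durations of 4, 3, 2, and 1 seconds would be suggested as acceleration, while 1, 1, 3, 1, 1, 3 matches the short-short-long pattern and would be suggested as alternating.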
Shot Transition
Temporal segment of a graphical transition between two consecutive shots. This annotation
type provides a classification of different types of this transition, which can include overlapping
of shots, but also fade-ins and -outs.
• Single Value Annotation Type
• Advene Label: Montg | Shot Transition
Value
Shortcut
Description
fade out
1
A shot disappears gradually over a period of time. It
becomes more pale and translucent until only one colour is
left, usually black or white.
fade in
2
A shot appears gradually over a period of time. Starting
from a monochrome screen, usually black or white, the
image slowly becomes visible.
dissolve
3
One shot disappears while another becomes visible.
Thereby both shots overlap temporarily and both images
become transparent – one fades out while the other fades
in.
wipe
4
Shot A seems to be pushed out of the frame by shot B. For
the duration of the transition both shots can be seen in
different proportions.
iris
5
A shot fades in or out via a circular shape.
Image Split
'Image Split' refers to the graphical co-presence of distinct image areas within one shot. This
annotation type aims at a free description of image splits, for example by splitscreens, but also
by framing of doors and windows in the image.
• Multiple Value Annotation Type
• Advene Label: Montg | Image Split
Value
Shortcut
Description
Splitscreen
1
Multiscreen where at least two images appear
simultaneously but are visually separated on the screen.
frame in frame
2
Composition of the image from partial images or the
appearance of images within a superordinate image.
Examples are photographs in the image, screens such as
TVs or computers, but also virtual multiscreens in the form
of (digital) superimposition of different image levels.
scaled down
3
The image is scaled down, i.e. smaller than previous or
subsequent images, leaving the sides of the frame empty
and not filling the complete area of the respective image
format, e.g. through shots in a different aspect ratio.
Dynamics Of Space
'Dynamics Of Space' describes the narrowing and widening of the image space (Kappelhoff:
Der Bildraum des Kinos, 2005). This annotation type provides a basic classification of these
dynamics.
• Multiple Value Annotation Type
• Ordered from widening to narrowing + alternating
• Advene Label: Montg | Dynamics Of Space
Value | Shortcut | Description
widening | 1 | The perception of space widens, e.g. when going from a spatially narrow setting to a vast one, or switching in a noticeable way from a close shot to a distant one, or from a tele lens to a wide lens.
stasis | 2 | The perception of space stays similar during a series of shots. Distances, setting dimensions, and field sizes do not undergo remarkable changes in dimension.
narrowing | 3 | The perception of space narrows, e.g. when transitioning from a vast setting to a narrow one, or switching in a noticeable way from a distant shot to a close one, or from a wide lens to a tele lens.
alternating | 4 | The perception of space alternates between widening and narrowing, e.g. when switching back and forth between vast and narrow settings, between distant and close field sizes, or wide and tele lenses.
∎Found Footage
'Found Footage' refers to shots that are perceptibly from another (media) context. This
annotation type provides a basic classification of different types of found footage.
• Multiple Value Annotation Type
• Advene Label: Montg | Found Footage
Value | Shortcut | Description
historical | 1 | Footage that seems to originate from a past "then" and thereby another time than the audiovisual fabric in which it is embedded. The colouring of the image as well as the image texture can be but do not have to be markers that create this impression.
contemporary | 2 | Footage that seems to relate to the historical "now" of a film, contemporary to the audiovisual fabric in which it is embedded, but differs from it, e.g. through the origin and the thereby implied usage of the footage.
cartoon | 3 | Footage that seems to be taken from a cartoon and thereby differs from the audiovisual fabric in which it is embedded.
news | 4 | Footage that seems to be taken from (television) news and thereby differs from the audiovisual fabric in which it is embedded.
recorded session | 5 | Footage that seems to be from a recording of a session (e.g. parliament, a hearing, court proceedings) and thereby differs from the audiovisual fabric in which it is embedded.
home video | 6 | Footage that appears to be taken by amateur filmers, in a style and quality which suggest the original intention for a private purpose.
stock footage | 7 | Footage that does not seem to be produced for a specific audiovisual fabric but rather in regard to its potential to be used in a multitude of different contexts. It often has a generic professional look (in a commercial sense).
surveillance | 8 | Footage that seems to be intended for surveillance purposes. This footage is often black and white, has a low image quality and frame rate. The setting is often public, the camera angle is often high and the field of view rather large.
corporate film | 9 | Footage that seems to be produced on behalf of a corporation or company, e.g. an image film, portraying the corporate identity or films that portray/teach operational procedures.
feature film | | Footage that seems to be taken from a feature film, but not the audiovisual fabric in which it is embedded.
educational film | | Footage that seems to be produced for educational purposes. Such educational films often feature an explanatory or didactic mode of addressing their audience (often through the tone of the voice-over), explaining issues in a fact-oriented manner.
commercial | | Footage that seems to be taken from (audiovisual) advertisement.
music video | | Footage that seems to be taken from a music video.
video game | | Footage that seems to be taken from a video game.
scientific video | | Footage that seems to have a scientific (epistemological) purpose, e.g. the recording of an experiment, but also different types of imaging procedures as they are used in medicine and psychology.
web video format | | Footage that seems to be produced for or originate from a web context, e.g. a video that has apparently been produced for a specific web platform and uses its stylistics.
sports | | Footage that seems to be taken from a sportscast.
witness | | Footage that seems to be characterised by a documental act of witnessing and thereby differs from the audiovisual fabric in which it is embedded. Most often in this footage a subjective point of view is marked by the camera work. Examples are recordings of police brutality or violations of environmental requirements.
archive photography | | Photographs which seem to originate from another context (i.e. archive) and are embedded either statically or visually dynamized.
unspecified archive | | Footage which seems to originate from another context (i.e. archive) but cannot be assigned to any of the other values. Further descriptions can be added in brackets.
Motifs
Basic categorisation of representation, e.g. objects, figures, and places, within a film. From
motif classifications (person, group, mass, etc.) to persons of interest (e.g. George W. Bush,
Michael Moore, Judith Rakers) to research-question-specific key visuals and sounds
(recurring motifs within a corpus).
Types:
∎Setting
∎Image Content
Important Persons
Keywords
∎Setting
Places of (narrative) action. This annotation type aims at a basic characterisation of settings
along three criteria: time of day, the distinction between interior and exterior, and basic
landscape types. Places can additionally be described through freely chosen nominal
references, e.g. 'Mos Eisley Cantina'.
• Multiple Value Annotation Type
• Advene Label: Motf | Setting
Value | Shortcut | Description
interior | 1 | The events are set in an interior space.
exterior | 2 | The events are set in an exterior space.
day | 3 | The events are set during the day.
night | 4 | The events are set during the night.
twilight | 5 | The events are set at dusk or dawn.
nature | 6 | The image focuses on a space of seemingly pristine nature. This can include e.g. forests, the sea, or deserts, but also cultivated forms such as landscape parks or gardens.
rural | 7 | The image focuses on a space characterised by nature, agriculturally used areas, and only a few individual houses or small villages.
suburban | 8 | The image focuses on a space characterised as a residential area by detached houses with gardens and clearly separated properties. The impression of a quiet neighbourhood may be intensified by only very little movement depicted in the image.
urban | 9 | The image focuses on urban space characterised by dense development (e.g. skyscrapers), strong (transport) infrastructure, and mainly commercially used premises.
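Because 'Setting' is a multiple value annotation type, a single annotation can combine values from the different criteria with a free nominal reference. The following is a minimal sketch of how such an annotation could be separated into controlled values and free references. It assumes a comma-separated serialisation, which this document does not specify; `SETTING_VALUES` and `split_setting` are hypothetical names.

```python
# Controlled values copied from the 'Setting' table in this document.
SETTING_VALUES = {
    "interior", "exterior",                   # inside / outside
    "day", "night", "twilight",               # time of day
    "nature", "rural", "suburban", "urban",   # landscape type
}


def split_setting(annotation: str) -> tuple[list[str], list[str]]:
    """Split an annotation into controlled values and free nominal references.

    Assumption: entries are separated by commas. Anything that is not a
    listed value is treated as a freely chosen nominal reference.
    """
    entries = [part.strip() for part in annotation.split(",") if part.strip()]
    controlled = [entry for entry in entries if entry in SETTING_VALUES]
    free = [entry for entry in entries if entry not in SETTING_VALUES]
    return controlled, free
```

Under these assumptions, `split_setting("interior, night, Mos Eisley Cantina")` separates the two controlled values from the place name.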
∎Image Content
'Image Content' refers to the depicted elements (figures, objects, location) of a shot. This
annotation type aims at a basic classification of the represented image content that is
perceived as central to the shot.
• Multiple Value Annotation Type
• Advene Label: Motf | Image Content
Value | Shortcut | Description
person | 1 | The image focuses on a single person.
group | 2 | The image shows two or more persons who evoke an impression of togetherness due to their interaction, proximity, or arrangement in space. This impression is also one of a visually quantifiable number of people.
mass | 3 | The image creates the impression of a plurality of people that is difficult or impossible to count visually through simple observation due to their large number, their movement patterns, and/or their arrangement in space.
object | 4 | The image focuses on a single item or only one part of an object, e.g. a watch or one tyre of a car.
non-human being | 5 | The image focuses on a non-human being, e.g. a robot or a dog.
writing | 6 | The image focuses predominantly on writings (diegetic or nondiegetic) and thus invites the viewers to read.
location | 7 | The image focuses predominantly on the location (either as part of the action itself, e.g. when a location is perceived, marvelled at by a character, or when there is no focus on specific actions or events). The attention is thus drawn to the setting.
graphics | 8 | The image focuses on a graphic representation, e.g. a diagram, a chart.
Important Persons
'Important Persons' refers to recurring and noteworthy persons that are central to the selected
film corpus. Different research questions can lead to different criteria. This annotation type
operates with a free description of important persons in the context of the film, i.e. both
historical personalities and (fictional) movie characters. Further descriptions, such as
appearance or actions performed, can be added.
• Multiple Value Annotation Type
• Advene Label: Motf | Important Persons
Name | Company/Institution | Position
Applegarth, Adam J. | Northern Rock | CEO (former)
Bernanke, Ben | Federal Reserve Board | Chairman (2006-2014)
Blankfein, Lloyd C. | Goldman Sachs | CEO, Chairman (2006-present)
Keywords
'Important Persons' refers to recurring and noteworthy persons that are central to the selected
film corpus. Different research questions can lead to different criteria. This annotation type
operates with a free description of important persons in the context of the film, i.e. both
historical personalities and (fictional) movie characters. Further descriptions, such as
appearance or actions performed, can be added.
• Multiple Value Annotation Type
• Advene Label: Motf | Important Persons
Value | Shortcut | Description
screen | 1 | Computer screens, TVs, video screens on buildings, white noise.
ticker | 2 | Stock ticker & graphs (also in newspapers or animations, etc).
skyline | 3 | Skyscrapers, skylines, towers as landscape.
doors | 4 | Prominently staged doors as the focus of a shot.
clocks | 5 | Clock, watch, timer, countdown, the ticking or beeping of alarm clocks.
protests | 6 | Public protests, masses/groups of people with a (political) goal, chants of protesters.
logos | 7 | Logos of companies and institutions, e.g. EZB, Bank of America, Lehman Brothers, Bayerische Landesbank, Sparkasse.
camera | 8 | Photo & video cameras, but also microphones and indirect suggestions such as camera flashes or shutter noises.
street signs | 9 | Street signs as markers of specific places or as a symbol, e.g. Wall St. / Main St.
money | | Different forms of money: different currencies, e.g. dollar, euro, pound, bitcoin, but also gold bars. Bills and coins, as well as credit cards.
flags | | Flags of countries, organisations and companies (specifics in brackets).
revolving door | | Revolving doors as a motif.
white house | | Various places directly connected to the White House (specific place in brackets, e.g. the Rose Garden, Press Room, facade).
wall street building | | Buildings on Wall St. in Manhattan.
statue of liberty | | The Statue of Liberty itself as well as other depictions of it (e.g. drawings, snow globes).
plane | | Aerial transport vehicles (public and private), e.g. planes, jets, helicopters.
car | | Cars as emphasised image content, i.e. not every car that can be seen within a shot but only those central to the staging (e.g. when followed by the camera, or entered by the protagonist of a shot).
train | | Public transport as well as freight trains (but only those central to the staging).
sale sign | | Real estate sale signs.
lehmann building | | Former Headquarters of Lehman Brothers in Manhattan.
nature | | Parks, landscapes and prominently staged plants.
documents | | Prominently staged documents, folders, e.g. contracts.
monument | | Statues and other art in the public space.
globe | | Depictions of the entire world, e.g. globes, maps.
factory | | Production industries, factories, production halls, heavy industry, ...
children | | Children, playgrounds, schools, metonymic sounds.
parliament | | Assembly halls of important political entities.
facades | | Prominently staged glass facades of buildings, emphasised through reflections, people behind glass, looking from the outside into a building.
skyscraper | | Focus on a single skyscraper.
trading floor | | International trading floors (specific place in brackets, e.g. DAX, NYSE, TSE).
wind | | Revolving doors as a motif.
fog | | Prominently staged fog or low hanging clouds (especially interesting when used as means of distorting vision or to express confusion).
news | | TV studios, news audio streams, prominent headlines in print and online (specific context in brackets, e.g. the BBC News TV studio).
police | | Police forces, vehicles, symbols, sirens.
eviction | | Houses that are empty or have to be cleared because of the financial crisis or function as a symbol for it. Also eviction notices and foreclosure signs in close ups.
government building | | Buildings of political institutions (specific place in brackets, e.g. the German Kanzleramt, Downing Street, the US Congress, the FED).
bridge | | Prominently staged bridges (specific place in brackets, e.g. the Manhattan Bridge or the Brooklyn Bridge in New York City).
jingle | | Short audio cues that are directly connected to companies and organizations, e.g. in advertisements as the acoustic counterpart to a logo (specific target in brackets).
song | | Famous songs aiming at a recognition effect, e.g. Jay-Z feat. Alicia Keys, 'Empire State Of Mind'.
red light | | Motifs of red light include persons (e.g. prostitutes, call girls/boys, johns) as well as locations (e.g. redlight districts, strip clubs).
landmark | | Famous landmarks as prominent motifs (specific place in brackets, e.g. the Brandenburger Tor, the Eiffel Tower, the Grand Canyon).
drugs | | Drugs as objects or the consumption of drugs. Illegal drugs such as cocaine, as well as excessive use of alcohol.
luxury | | Excessive status display symbols (e.g. pools, yachts).
cargo | | Places of cargo handling (e.g. ports).
people in the street | | Group or mass of pedestrians in a street or pedestrian area as a central motif.
talking head | | Image form of a talking head, often a situation of interrogation, e.g. an interview, expert explanation, witness report. Mostly tied to a close camera shot. Thus 'talking head' is a combination of image composition and embedding in its context.
public statement | | Different forms of public statement, e.g. by politicians; this can be a press statement, a hearing, a speech. The addressed public is not the audience of the film.
poverty | | Explicit depiction of poverty, e.g. homeless persons, food stamps, run down neighbourhoods.
construction site | | Different forms of construction sites.
relocation | | Relocation situations (layoffs, moving house), as well as objects (e.g. cardboard boxes) connected to these situations.
residential area | | Private homes in suburban areas staged as a specific setting.
party | | Night clubs, excessive dancing, parties (including celebrations in offices).
irony | | Prominent use of irony, diegetically as well as nondiegetically.
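Several keyword values allow a specification "in brackets", e.g. a specific place for 'bridge' or 'trading floor'. The following is a minimal sketch of how such an entry could be separated into value and specification. It assumes the form `value (specification)` with round brackets, which this document implies but does not define formally; `parse_keyword` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Assumed entry form: a value, optionally followed by one bracketed specification.
_KEYWORD_PATTERN = re.compile(
    r"^\s*(?P<value>[^()]+?)\s*(?:\((?P<detail>[^()]*)\))?\s*$"
)


def parse_keyword(entry: str) -> Tuple[str, Optional[str]]:
    """Split an entry such as 'bridge (Brooklyn Bridge)' into value and detail."""
    match = _KEYWORD_PATTERN.match(entry)
    if match is None:
        raise ValueError(f"cannot parse keyword entry: {entry!r}")
    value = match.group("value")
    detail = match.group("detail")
    if detail is not None:
        detail = detail.strip()
    return value, detail
```

Under these assumptions, `parse_keyword("trading floor (NYSE)")` separates the value from the specified place, while an entry without brackets returns `None` as its specification.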
Costume
Free description of costumes (or a selection thereof) in a segment, e.g. clothes, but also
accessories or props. This may include designations of objects as well as a description of their
appearance (colour, shape, etc.). E.g. 'brown beige mixed trousers' (high waist, slim fit, cf.
trousers from 1910).
• Multiple Value Annotation Type
• Free Text Annotation Type
• Advene Label: Motf | Costume
Segmentation
Perceived units that structure the flow of film experience. ‘Perceived’ aims at excluding
intended structural units from paratexts (DVD chapters, screenplays) and focuses instead on
experienced figurations. These units can appear on the meso-level (scenes), as well as on
the micro-level (cinematic expressive movements, see Kappelhoff: Matrix der Gefühle, 2004,
Kappelhoff: Kognition und Reflexion, 2018).
Types:
Expressive Movement
Argumentation Unit
Scene
Expressive Movement
'Expressive Movement' [Ger. Ausdrucksbewegung] refers to a phenomenological concept (see
Plessner: Die Deutung des mimischen Ausdrucks, 1982, Bühler: Ausdruckstheorie, 1933,
Wundt: Völkerpsychologie, 1900–1920) that was adapted to describe the affective dynamics
of audiovisual images (Kappelhoff/Bakels: Zuschauergefühl, 2011). In this regard films are
understood as movement patterns that combine different staging tools such as sound
composition, montage rhythm, camera movements, and acting into one temporal gestalt.
These patterns organize the spectators' perception processes over the temporal course of film
viewing (see Müller/Kappelhoff: Cinematic Metaphor, 2018, 132). This annotation type
provides free descriptions of these cinematic expressive movements.
• Free Text Annotation Type
• Advene Label: Seg | Expressive Movement
Argumentation Unit
'Argumentation Unit' refers to a thematic, semantic unit of a film. This segmentation refers
predominantly to verbalized topics and themes, as well as representations in sound and image.
This annotation type provides the marking of beginning and end, as well as the labeling of
those segments. Argumentation units may correspond with scenes, but do not necessarily
have to.
• Free Text Annotation Type
• Advene Label: Seg | Argumentation Unit
Scene
Structural segmentation unit in the viewer's perception that is constituted by aesthetic and
narrative markers: for example, through plot and figure constellations (beyond a simple unity
of plot, place, and time) for the feature film, as well as through argumentative and other units
of meaning for non-fictional formats. The marked scenes are provided with a working title and
numbered consecutively.
• Free Text Annotation Type
• Advene Label: Seg | Scene