International Telecommunication Union

ITU-T Series G
OF ITU (06/2022)



Influencing factors on quality of experience (QoE) for video customized alerting tone (CAT)
and video customized ringing signal (CRS)

ITU-T G-series Recommendations – Supplement 77




Supplement 77 to ITU-T G-series Recommendations

Influencing factors on quality of experience (QoE) for video customized alerting
tone (CAT) and video customized ringing signal (CRS) services

Supplement 77 to ITU-T G-series Recommendations describes video customized alerting tone
(CAT) and video customized ringing signal (CRS) services and helps to identify the quality of
experience (QoE) key factors of video CAT and CRS.


1 Scope
As specified in [ETSI TS 124 182], the customized alerting tone (CAT) service is an operator
specific service by which an operator enables the subscriber to customize the media which is played
to the calling party during alerting of the called party. As specified in [ETSI TS 124 183], the
customized ringing signal (CRS) service is an operator specific service by which an operator
enables the subscriber to customize the media which is played to the called party during alerting of
the called party.
For both services the media can consist of user's favourite songs, multi-media clips or other
customized alerting tones. When the media consists of video clips, it is called video CAT and video
CRS service, respectively. High definition (HD) video CAT and CRS service can provide high-
definition, full-screen video.
In order to assess the quality of experience (QoE) of a specific video CAT or video CRS service,
analysis of influencing factors is critical. Compared with traditional video and audio, the short-time
experience in video CAT and CRS imposes a new set of requirements on QoE assessment. The
challenge is to characterize video CAT's real-life short video, terminal display strategy, and
This Supplement categorizes and summarizes the major factors that affect user-perceived
experience of an HD video CAT and CRS service, with the intention to help identify the
methodologies for assessing the video CAT or CRS quality.

2 References
[ITU-T H.262] Recommendation ITU-T H.262 (2012), Information technology – Generic
coding of moving pictures and associated audio information: Video.
[ITU-T P.10] Recommendation ITU-T P.10/G.100 (2017), Vocabulary for performance,
quality of service and quality of experience.
[ETSI TS 124 182] ETSI TS 124 182 (2022), Digital cellular telecommunications system
(Phase 2+) (GSM); Universal Mobile Telecommunications System
(UMTS); LTE; IP Multimedia Subsystem (IMS) Customized Alerting Tones
(CAT); Protocol specification.
[ETSI TS 124 183] ETSI TS 124 183 (2022), Universal Mobile Telecommunications System
(UMTS); LTE; IP Multimedia Subsystem (IMS) Customized Ringing Signal
(CRS); Protocol specification.

3 Definitions

3.1 Terms defined elsewhere

This Supplement uses the following terms defined elsewhere:
3.1.1 quality of experience (QoE) [ITU-T P.10]: The degree of delight or annoyance of the user
of an application or service.
3.1.2 QoE influencing factors [ITU-T P.10]: Include the type and characteristics of the
application or service, context of use, the user's expectations with respect to the application or
service and their fulfilment, the user's cultural background, socio-economic issues, psychological
profiles, emotional state of the user, and other factors whose number will likely expand with further
research.
3.1.3 frame rate [ITU-T H.262]: The rate at which frames are output from the decoding process.

4 Abbreviations and acronyms

This Supplement uses the following abbreviations and acronyms:
CAT Customized Alerting Tone
CRS Customized Ringing Signal
HD High Definition
IMS IP Multimedia Subsystem
UE User Equipment
VoIMS Voice over IMS

5 Conventions
This Supplement uses the following conventions:
– The person who is called by another person is referred to as called party or called user.
– The person calling the other person is referred to as calling party or calling user.

6 HD video CAT overview

After users apply for the video customized alerting tone (CAT) function, they can set their own
personalized video alerting tone. The video CAT service provides a flexible video medium that
replaces the ordinary network ring back tone heard by the calling user. During the ringing stage of
a call to the subscriber, the system plays the personalized media to the calling user. The user
equipment (UE) should support the voice over IMS (VoIMS) function when using the CAT service.
The service user can subscribe to the video CAT service, activate (or deactivate) the service, and
update the settings, e.g., to make changes by configuring the active CAT media. The media can
consist of videos, multimedia clips or other video customized alerting tones. The video CAT
subscriber is able to refine the CAT media selection behaviour with configured rules, e.g., time,
calling party's location, called party's location, the identity of the calling and called party. The video
CAT service enables the user to select the appropriate CAT media according to the service rules.
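The rule-based media selection described above can be illustrated with a minimal sketch. The Supplement does not define a rule format, so the `CatRule` class, its fields (a time window and a calling party identity) and the `select_cat_media` function are hypothetical names introduced only for illustration. The sketch assumes that the first matching rule wins and that a default media is used otherwise.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Sequence


@dataclass(frozen=True)
class CatRule:
    """Hypothetical rule: every condition that is set must match."""
    media_id: str
    start: Optional[time] = None          # start of time window (inclusive)
    end: Optional[time] = None            # end of time window (exclusive)
    calling_party: Optional[str] = None   # identity of the calling party

    def matches(self, now: time, calling_party: str) -> bool:
        if self.calling_party is not None and self.calling_party != calling_party:
            return False
        if self.start is not None and self.end is not None:
            if not (self.start <= now < self.end):
                return False
        return True


def select_cat_media(rules: Sequence[CatRule], default_media: str,
                     now: time, calling_party: str) -> str:
    """Return the media of the first matching rule, else the default media."""
    for rule in rules:
        if rule.matches(now, calling_party):
            return rule.media_id
    return default_media
```

Location-based conditions (calling or called party's location) could be added as further optional fields in the same way.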
When the precondition procedure is used in the initial call media negotiation stage, the calling party
should support the video CAT media negotiation with the video CAT platform in the video calling
After the calling party completes the video CAT media negotiation, the calling party should receive
resource reservation confirmation. After the calling party receives the video CAT media, the
requirements for playing the video CAT are met. The calling party can then start playing the video
CAT media.
When the called party answers, the calling party call interface should enter the call connection state.
At the same time, the calling party should stop displaying the video media image and other relevant
CAT prompt words, and stop playing the audio media of the video CAT.
When the called party hangs up, the calling party should stop displaying the video media image
and related CAT prompt words, and stop playing the audio media of the video CAT,
according to the result of media negotiation. Instead, it should play the network refusal notification
sound and display the related prompts. Then, when the calling party hangs up or the call times out,
the calling party should end the call, close the call details interface, stop displaying the
relevant prompts, and stop playing the network refusal notification sound.
When the called party does not respond before the timeout, or the calling party hangs up before
the called party answers, the calling party should end the call, close the call details interface, stop
displaying the video media image and related CAT prompt words, and stop playing the audio media
of the video CAT.
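The calling-party behaviour described in the preceding paragraphs can be summarized as a small state machine. The following is only a sketch under stated assumptions: the state and event names are hypothetical, negotiation and resource reservation are collapsed into a single `CAT_MEDIA_RECEIVED` event, and a timeout during CAT playback ends the call directly, as in this clause.

```python
from enum import Enum, auto


class State(Enum):
    ALERTING = auto()        # call placed, CAT media not yet playing
    PLAYING_CAT = auto()     # video CAT image, prompt words and audio active
    IN_CALL = auto()         # called party answered
    REFUSAL_NOTICE = auto()  # network refusal notification sound and prompts
    ENDED = auto()           # call ended, call details interface closed


class Event(Enum):
    CAT_MEDIA_RECEIVED = auto()
    CALLED_ANSWERED = auto()
    CALLED_HUNG_UP = auto()
    CALLING_HUNG_UP = auto()
    TIMEOUT = auto()


# Hypothetical transition table derived from the prose of this clause.
TRANSITIONS = {
    (State.ALERTING, Event.CAT_MEDIA_RECEIVED): State.PLAYING_CAT,
    (State.PLAYING_CAT, Event.CALLED_ANSWERED): State.IN_CALL,
    (State.PLAYING_CAT, Event.CALLED_HUNG_UP): State.REFUSAL_NOTICE,
    (State.PLAYING_CAT, Event.CALLING_HUNG_UP): State.ENDED,
    (State.PLAYING_CAT, Event.TIMEOUT): State.ENDED,
    (State.REFUSAL_NOTICE, Event.CALLING_HUNG_UP): State.ENDED,
    (State.REFUSAL_NOTICE, Event.TIMEOUT): State.ENDED,
}


def next_state(state: State, event: Event) -> State:
    """Return the next state; events with no listed transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def cat_is_playing(state: State) -> bool:
    """CAT video, prompt words and audio are presented only in PLAYING_CAT."""
    return state is State.PLAYING_CAT
```

Keeping the transitions in a table makes it easy to check that every exit from `PLAYING_CAT` stops the CAT media, which is the common requirement of all the cases above.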
Video customized ringing signal (CRS) service is similar to video CAT service. After the users
apply for the video CRS function, they can set their own personalized video ringing signals.
Through video CRS service, the customized media is played to the called party as an incoming
communication indication during establishment of a communication. The presentation of the
selected CRS media to the called party starts at a certain time after the initiation of a session, but
before the answer of the session. The procedure of CRS playing is similar to the procedure of CAT
playing.
Connections with high bandwidth and low delay enable the video CAT or CRS to achieve high
quality video alerting tone.

7 Key influencing factors of HD video CAT and CRS QoE

As recommended in [b-Le Callet], the factors influencing HD video CAT and CRS QoE can be
categorized into the following groups: human influence factors, system influence factors and context
influence factors. Compared with a typical video service, some major system influence factors
that are specific to video CAT and CRS are listed in clauses 7.1 to 7.5.

7.1 Key factors of video quality

7.1.1 Video content related
The HD video CAT and CRS media has high-definition resolution and a high frame rate, so it will
provide clearer and smoother video quality.
The HD video CAT content should be easily understandable to the calling user in a short time. The
duration of a video CAT should be less than the timeout of the calling user, and usually should be
less than 1 minute.
The HD CRS content should be easily understandable to the called user in a short time. The
duration of CRS content should be less than the timeout of the called user, and usually should be
less than 1 minute.
The HD video CAT and CRS content has strong social functions, so video content should be easily
configured by users.
7.1.2 Video codec related
A video codec is used to compress original scene data from raw format so that video CAT and CRS
media can be streamed from a platform to the user terminal via an IP multimedia subsystem (IMS)
network. Traditional video codecs (e.g., H.264, H.265) may be used for video CAT and CRS
The decompressed video should have clear texture, and blur should be imperceptible.
7.1.3 Video initial buffer time
The video initial buffer time is the duration between a user's calling time and the time when the first
video CAT and CRS frame is displayed on the screen. A shorter initial buffer time can provide the
user with a better experience.
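As a minimal illustration of the definition above, the initial buffer time can be computed from two timestamps. The function name and the use of seconds on a common clock are assumptions made for this sketch; the Supplement does not prescribe a measurement method.

```python
def initial_buffer_time(call_start_s: float, first_frame_s: float) -> float:
    """Seconds between placing the call and displaying the first frame."""
    if first_frame_s < call_start_s:
        raise ValueError("first frame cannot precede the start of the call")
    return first_frame_s - call_start_s
```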

7.1.4 Video stall
The number of stalling events, the length of stalling events, and the interval between stalling events
of video CAT and CRS will affect the quality of user experience.
7.1.5 Video mosaic
The number of mosaic events, the length of mosaic events, mosaic area ratio and the interval
between mosaic events of video CAT and CRS will affect the quality of user experience.
7.1.6 Frame skipping
The number of frame skipping events, the length of frame skipping events and the interval between
frame skipping events of video CAT and CRS will affect the quality of user experience.
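Clauses 7.1.4 to 7.1.6 name the same three statistics for stalling, mosaic and frame skipping events: the number of events, their length and the interval between them. The sketch below shows one possible way to derive these from a list of event timestamps. The `(start_s, end_s)` representation, the `EventStats` class and the `summarize_events` function are hypothetical, and events are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class EventStats:
    count: int                 # number of events
    total_length_s: float      # summed length of all events, in seconds
    intervals_s: List[float]   # gaps between consecutive events, in seconds


def summarize_events(events: Sequence[Tuple[float, float]]) -> EventStats:
    """Summarize (start_s, end_s) events, assumed not to overlap."""
    ordered = sorted(events)
    for start, end in ordered:
        if end < start:
            raise ValueError("event ends before it starts")
    lengths = [end - start for start, end in ordered]
    # Interval: from the end of one event to the start of the next one.
    intervals = [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]
    return EventStats(len(ordered), sum(lengths), intervals)
```

The mosaic area ratio of clause 7.1.5 is not covered here, since it requires per-frame image analysis rather than timestamps.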

7.2 Key factors of audio quality

7.2.1 Audio content related
The sampling frequency and the number of channels of CAT and CRS audio media will affect audio quality.
The audio CAT and CRS content should be easily understandable to the calling user in a short time.
7.2.2 Audio codec related
Audio codecs are used to compress original scene data from raw format so that audio CAT and CRS
media can be streamed from a platform to the user terminal via an IMS network. Traditional audio
codecs such as adaptive multi-rate wideband (AMR-WB) may be used for video content.
7.2.3 Audio stall
The number of stalling events, the length of stalling events and the interval between stalling events
of audio CAT and CRS will affect user quality of experience.

7.3 Key factors of video display on terminal

7.3.1 Video interface layout
The displayed video should maintain the same aspect ratio as the original CAT and CRS media and
expand the size to fill the terminal screen in one direction.
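The scaling rule above can be sketched as follows. This example assumes that the media is never cropped, so it fills the screen in exactly one direction and leaves margins in the other; the function name `fit_to_screen` and the use of pixel dimensions are illustrative assumptions.

```python
from typing import Tuple


def fit_to_screen(media_w: int, media_h: int,
                  screen_w: int, screen_h: int) -> Tuple[int, int]:
    """Scale media to fill the screen in one direction, keeping aspect ratio."""
    if min(media_w, media_h, screen_w, screen_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to avoid float rounding.
    if media_w * screen_h >= media_h * screen_w:
        # Media is relatively wider than the screen: fill the width.
        return screen_w, max(1, round(media_h * screen_w / media_w))
    # Media is relatively taller than the screen: fill the height.
    return max(1, round(media_w * screen_h / media_h)), screen_h
```

For example, a 1920x1080 landscape clip shown on a 960x2000 portrait screen is scaled to 960x540, filling the width only.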
Called information and the function menu should not be concealed when video CAT media is
displayed on the screen.
Calling information and the function menu should not be concealed when video CRS media is
displayed on the screen.
7.3.2 Dial plate display strategy
When video CAT and CRS media is played, the dial plate should be easily identifiable and hidden
after a fixed period of time.
The dial plate should be woken up and displayed on the screen immediately when the user touches
the screen.
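The dial plate display strategy above amounts to an auto-hide timer that is restarted by a touch. The following sketch illustrates that logic only; the `DialPlate` class is hypothetical, and the 5 second default is an assumed value, since the text specifies only "a fixed period of time".

```python
class DialPlate:
    """Hypothetical auto-hide logic for the dial plate during CAT/CRS playback."""

    def __init__(self, hide_after_s: float = 5.0) -> None:
        # hide_after_s is an assumed value, not taken from the Supplement.
        self.hide_after_s = hide_after_s
        self._shown_at_s = 0.0

    def start_playback(self, now_s: float) -> None:
        """Show the dial plate when media playback starts."""
        self._shown_at_s = now_s

    def touch(self, now_s: float) -> None:
        """Wake the dial plate immediately and restart the hide timer."""
        self._shown_at_s = now_s

    def is_visible(self, now_s: float) -> bool:
        """The dial plate is visible until the fixed period has elapsed."""
        return now_s - self._shown_at_s < self.hide_after_s
```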
7.3.3 Video prompt words
When the calling party starts playing the video CAT media, it should prompt the user's calling
status by displaying the prompt language, such as "the other party is ringing", "waiting for the other
party to answer the call".

7.4 Key factors of audio playback on terminal
7.4.1 Audio output related
CAT or CRS media is played using the same audio output as ringtone playback. By default, CAT
and CRS media are played from the loudspeaker.
7.4.2 Audio volume related
When CAT and CRS audio is played, the volume of the audio being played should be similar to the
audio volume of the raw media.
When CAT and CRS audio is played at maximum volume in speaker mode, there should be no
harsh sound. When the user sets the phone to silent mode, the CAT and CRS audio should be
muted.

7.5 Key factors of CAT and CRS switch

Video CAT and CRS service should not negatively affect the conversation between calling and
called parties. The switch speed of calling and called parties in video CAT and CRS service should
be equivalent to that of normal calling switch.
7.5.1 Called party answered related
When video CAT media are played, the video image of the local camera should be paused in the
calling interface.
After the called party answers, the calling party should stop playing video CAT media.
After the called party answers, the called party should stop playing video CRS media.
During the audio call, the terminal should not start the local camera, nor should it display the video
image of the local camera on the screen. During the video call, the video images of both local
cameras should be displayed normally.
7.5.2 Called party hangs up the call related
In CAT service, when the called party hangs up the call, the calling party should end the call, close
the call details interface, stop displaying the video media image and related CAT prompt words and
stop playing the audio media. Instead, the calling party should play the rejection notification sound
and display the relevant prompt words.
In CRS service, when the called party hangs up the call, the called party should end the call, close
the call details interface, stop displaying the video media image and related CRS prompt words, and
stop playing the audio media. The calling party should play the rejection notification sound and
display the relevant prompt words.
7.5.3 Called party timeout related
In CAT service, when there is a called party timeout without response, the calling party should end
the call, close the call details interface, stop displaying the video media image and related CAT
prompt words and stop playing the audio media of the video CAT. Instead, the calling party should
play the timeout notification sound and display the relevant prompt words.
In CRS service, when there is a called party timeout without response, the called party should end
the call, close the call details interface, stop displaying the video media image and related CRS
prompt words, and stop playing the audio media of the video CRS. The calling party should play
the timeout notification sound and display the relevant prompt words.

7.5.4 Calling party hangs up the call related
In CAT service, when the calling party hangs up the call without a called side answer, the calling
party should end the call, close the call details interface, stop displaying the video media image and
related CAT prompt words, and stop playing the audio media of the video CAT.
In CRS service, when the calling party hangs up the call without a called side answer, the called
party should end the call, close the call details interface, stop displaying the video media image and
related CRS prompt words, and stop playing the audio media of the video CRS.

[b-Le Callet] Le Callet, P., Möller, S., Perkis, A., et al. (2012), Qualinet white paper on
definitions of quality of experience, Eur. Netw. Qual. Exp. Multimed. Syst. Serv.
COST Action IC 1003.

