Jcomss v6n2 June2010 49 55
JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, VOL. 6, NO. 2, JUNE 2010
Abstract: Digital television produces video signals with different bit rates, encoding formats, and spatial resolutions. To deliver video to users with different receivers, the content needs to be dynamically adapted. Transcoding devices convert video from one format into another. The reception of digital video on mobile receivers implies that the spatial resolution of the video must be adjusted to fit the small display. This paper presents a subjective and objective quality analysis of spatially transcoded videos. Transcoding algorithms that downsample the video frames using the moving average, median, mode, weighted average, and sigma filters are considered.

Index Terms: Mobile TV, Performance evaluation, Video coding and processing, Transcoding.

I. INTRODUCTION

... must be much lower in order to fit the display of a mobile phone. One solution would be to transmit the video in full resolution and let the mobile receiver process the video in order to reduce its resolution. The problem with this solution is the limited processing capacity of a mobile device. Moreover, more processing implies an increase in energy consumption.

A better solution is to process the video using a spatial transcoder before sending its signal to the mobile receiver. This idea is illustrated in the block diagram shown in Figure 1 [4], [5]. Transcoding before transmitting saves space and production time, because only the content with maximum resolution is stored. It also keeps the computational load of the mobile receiver at a minimum, saving battery time and avoiding overheating.
... by the filter. The filters have been chosen for their simplicity. The 1 × 1 filter corresponds to a simple elimination.
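The downsampling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `downsample`, the `step` and `window` parameters, and the choice of anchoring each window at the top-left sampled pixel are hypothetical, since the exact window placement used by the authors is not given in this excerpt. With a 1 × 1 window the sketch reduces to simple elimination.

```python
from statistics import mean, median

def downsample(frame, step, window, reducer=mean):
    """Downsample a grayscale frame (list of rows) by `step` in each direction.

    Each output pixel is `reducer` applied to the window x window block whose
    top-left corner is the sampled pixel. Blocks are clipped at the borders.
    The window placement is an assumption made for this sketch.
    """
    height, width = len(frame), len(frame[0])
    out = []
    for i in range(0, height, step):
        row = []
        for j in range(0, width, step):
            block = [frame[r][c]
                     for r in range(i, min(i + window, height))
                     for c in range(j, min(j + window, width))]
            row.append(reducer(block))
        out.append(row)
    return out

# Toy 4 x 4 frame with pixel values 0..15 (illustrative data only).
frame = [[4 * r + c for c in range(4)] for r in range(4)]

eliminated = downsample(frame, step=2, window=1)                 # simple elimination
averaged = downsample(frame, step=2, window=2)                   # 2 x 2 moving average
med = downsample(frame, step=2, window=2, reducer=median)        # 2 x 2 median
```

Passing a different `reducer` (for example `statistics.mode`) gives the other order-statistic variants without changing the loop structure.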
... NOKIA N95. The distance between the subject and the device was 18 cm. This distance was computed by multiplying the height of the screen of the device by six (3 × 6 cm), as recommended by ITU-T [14]. The tests lasted, on average, 30 minutes. For presentation purposes, all the videos were encoded using the H.264 encoder with a bit rate of 243 kbit/s and 15 frames/s.

B. Objective Metrics

Two metrics were used for objective evaluation of the video quality: PSNR and SSIM (Structural Similarity Metric). The PSNR estimates the quality of a video frame by comparing a reference to the corresponding processed version of it using the following expression [15]

$$\mathrm{PSNR}(x, y) = \frac{1}{F} \sum_{k=0}^{F} 10 \log_{10} \frac{M N \, 255^2}{\sum_{i=0}^{M} \sum_{j=0}^{N} \left( x(i, j) - y(i, j) \right)^2},$$

... of one variable are related with the one of the other and given by [18]

$$\rho = \frac{\sum_{j=1}^{A} \left[ (\alpha_j - \bar{\alpha})(\beta_j - \bar{\beta}) \right]}{\sqrt{\sum_{j=1}^{A} (\alpha_j - \bar{\alpha})^2} \, \sqrt{\sum_{j=1}^{A} (\beta_j - \bar{\beta})^2}}, \tag{9}$$

in which $A$ is the number of samples and $\alpha$ and $\beta$ are the variables to be related. When the correlation equals 1, it is said to be strong.

IV. RESULTS

This section presents the results of the performance evaluation (objective and subjective) of the spatial transcoders presented in Section II. The original videos used in the tests were Mobile, News, and Foreman. Each video is 10 seconds long and is publicly available for download [19]. Those videos were chosen because they contain a good mixture of texture, movement, and colors.
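As an illustration of the objective metrics of Section III-B, the following Python sketch computes the PSNR of a frame pair, averages it over the frames of a video, and evaluates the correlation coefficient between two sets of scores. The function names (`frame_psnr`, `video_psnr`, `pearson`) are hypothetical, the frames are assumed to be 8-bit grayscale (peak value 255), and the sample values are toy data, not results from the paper.

```python
import math

def frame_psnr(ref, proc, peak=255.0):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    sq_err = sum((a - b) ** 2
                 for ref_row, proc_row in zip(ref, proc)
                 for a, b in zip(ref_row, proc_row))
    if sq_err == 0:
        return math.inf  # identical frames: the ratio is unbounded
    n_pixels = len(ref) * len(ref[0])  # M * N
    return 10.0 * math.log10(n_pixels * peak ** 2 / sq_err)

def video_psnr(ref_frames, proc_frames):
    """Average of the per-frame PSNR values over all frames of a video."""
    values = [frame_psnr(r, p) for r, p in zip(ref_frames, proc_frames)]
    return sum(values) / len(values)

def pearson(xs, ys):
    """Correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = (math.sqrt(sum((x - mean_x) ** 2 for x in xs))
           * math.sqrt(sum((y - mean_y) ** 2 for y in ys)))
    if den == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return num / den

# Toy data: an all-black reference against an all-white processed frame.
black = [[0, 0], [0, 0]]
white = [[255, 255], [255, 255]]
worst_case = frame_psnr(black, white)
```

In this sketch a video of identical frame pairs yields an infinite PSNR, so a practical evaluation would need to decide how to report that case.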
Fig. 3. PSNR curves for the transcoded videos.

Fig. 4. PSNR curves for an encoded video after transcoding.

Figure 5 shows the SSIM results obtained for the set containing the spatially transcoded videos. It can be observed that the best results for the Mobile video were obtained using the 2 × 2 Sigma, 2 × 2 Median, and 4 × 4 Median filters. For the videos News and Foreman, the best results were obtained using the 2 × 2 Median and 3 × 3 Moving Average filters.

Figure 6 shows the results and the SSIM curves for the transcoded video after coding. For the transcoded videos the best results, using the SSIM ...

Fig. 5. SSIM curves for the transcoded videos.

B. Subjective Evaluation

The transcoding techniques used in this subsection are those that provided the best results for the objective evaluation. The techniques are presented in Table II.

For the Foreman video, the MOS scores are shown in Figure 7. It can be noticed from the bar plots in Figure 7 that the best results for that video were obtained using the 2 × 2 Sigma, 2 × 2 Median, Weighted Average 3, and 3 × 3 Median filters.

For the Mobile video, the MOS values from the experiment are shown in Figure 8. The best results for this video were obtained using the Weighted Average 3 and 3 × 3 Median filters.
REGIS et al.: OBJECTIVE AND SUBJECTIVE EVALUATION OF SPATIALLY TRANSCODED VIDEOS 53
TABLE II
THE TRANSCODING TECHNIQUES.

Number   Filter
1        2 × 2 Sigma
2        2 × 2 Median
3        3 × 3 Moving Average
4        Weighted Average 3
5        3 × 3 Sigma
6        Weighted Average 2
7        Weighted Average 1
8        3 × 3 Median
Fig. 8. MOS bar plot obtained for the Mobile video.

C. Processing Time

Another important factor that should be considered when comparing different algorithms is the processing time, or computational complexity, of the algorithm. Table III shows the processing time for each of the transcoding algorithms under test, to indicate the time spent as the filter window increases.

Table III shows that the Sigma and Mode filters demand longer processing times as compared to the Moving Average and the Weighted Average filters. This is why those techniques need to be compared. Also, the Median processing time is slightly higher than for the Average filter. Considering only the processing time, the best results were obtained for the filters: Weighted Average, 2 × 2 and 3 × 3 Moving Average, and 2 × 2 Median.
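A processing-time comparison of this kind can be sketched as follows. The excerpt does not say how the times in Table III were measured, so the timing helper `best_time`, the block reducer `block_filter`, and the synthetic 64 × 48 frame are all assumptions made for illustration; the numbers printed by this sketch are not comparable with those of the paper.

```python
import time
from statistics import mean, median

def block_filter(frame, window, reducer):
    """Reduce each non-overlapping window x window block to a single value."""
    height, width = len(frame), len(frame[0])
    return [[reducer([frame[r][c]
                      for r in range(i, min(i + window, height))
                      for c in range(j, min(j + window, width))])
             for j in range(0, width, window)]
            for i in range(0, height, window)]

def best_time(func, *args, repeats=3):
    """Return the shortest wall-clock time, in seconds, over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Synthetic frame with deterministic pixel values in 0..255 (illustrative only).
frame = [[(r * 31 + c * 17) % 256 for c in range(64)] for r in range(48)]

timings = {name: best_time(block_filter, frame, 2, reducer)
           for name, reducer in (("2x2 mean", mean), ("2x2 median", median))}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the minimum over a few repetitions is one common way to reduce the influence of background load on the measurement.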
For the News video, the MOS values gathered from the experiment are shown in Figure 9. The best results for this video were obtained using the 2 × 2 Sigma and 2 × 2 Median filters.

The correlation between the MOS and PSNR results for each transcoded video was calculated, resulting in a low ...

V. CONCLUSION

This article presented an analysis of the subjective and objective quality of spatially transcoded videos. The transcoder operation consisted of downsampling the video frames using Moving Average, Median, Mode, Weighted Average, and Sigma filters with sizes 1 × 1, 2 × 2, 3 × 3, and 4 × 4.
54 JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS, VOL. 6, NO. 2, JUNE 2010
TABLE III
PROCESSING TIME FOR A VIDEO.

Transcoding Method       Time (seconds)
Simple Elimination       0.47
2 × 2 Moving Average     1.30
3 × 3 Moving Average     1.13
4 × 4 Moving Average     3.89
2 × 2 Median             1.59
3 × 3 Median             5.00
4 × 4 Median             13.69
2 × 2 Mode               7.78
3 × 3 Mode               8.22
4 × 4 Mode               56.47
Weighted Average 1       0.75
Weighted Average 2       1.19
Weighted Average 3       3.42
2 × 2 Sigma              5.76
3 × 3 Sigma              12.06
4 × 4 Sigma              20.50

The objective quality evaluation of the transcoded videos used the PSNR and SSIM metrics. The PSNR results were considered satisfactory, and the filters 4 × 4 Median, 2 × 2 Sigma, and 2 × 2 Median produced the best results. The results obtained with the SSIM metric were also satisfactory, and the filters 2 × 2 Sigma and 2 × 2 Median showed the best results.

A subjective experiment was performed to obtain a more reliable quality assessment of the transcoded videos. The experiment was performed using the Pair Comparison (PC) method described in the ITU-T P.910 Recommendation [14]. The test sequences were displayed on the NOKIA N95 cell phone. Subjects used a scale of discrete numbers, ranging from 0 to 10, to report their quality judgments (MOS).

The data gathered from the subjective experiment showed that all transcoded videos presented MOS values above 7, which is an acceptable subjective evaluation. The data also showed that the 2 × 2 Median and 2 × 2 Sigma filters provided the best results in terms of quality.

The computational complexity of the proposed transcoding algorithms was also tested. Considering only the processing time, the best results were obtained for the Weighted Average, 2 × 2 and 3 × 3 Moving Average, and 2 × 2 Median filters.

The spatial transcoding algorithms using the 2 × 2 Median and 2 × 2 Sigma filters produced the best results, both objectively and subjectively, and these techniques are the most appropriate to perform spatial transcoding. In particular, the 2 × 2 Median filter has the advantage of requiring less processing time.

VI. ACKNOWLEDGMENTS

The authors would like to thank CAPES and CNPq for funding this work and Iecom for providing the equipment and facilities.

REFERENCES

[1] M. S. Alencar, Digital Television Systems. Cambridge University Press, 2008.
[2] M. Bonuccelli, F. Lonetti, and F. Martelli, "Temporal transcoding for mobile video communication," in The Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services. Citeseer, 2005, pp. 18-29.
[3] J. Xin, C.-W. Lin, and M.-T. Sun, "Digital video transcoding," Proceedings of the IEEE, vol. 93, no. 1, pp. 84-97, Jan. 2005.
[4] J. Xin, M.-T. Sun, B.-S. Choi, and K.-W. Chun, "An HDTV-to-SDTV spatial transcoder," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 11, pp. 998-1008, Nov. 2002.
[5] Y.-R. Lee, C.-W. Lin, S.-H. Yeh, and Y.-C. Chen, "Low-complexity DCT-domain video transcoders for arbitrary-size downscaling," 2004 IEEE 6th Workshop on Multimedia Signal Processing, pp. 31-34, Sept.-1 Oct. 2004.
[6] I. Ahmad, X. Wei, Y. Sun, and Y.-Q. Zhang, "Video transcoding: an overview of various techniques and research issues," IEEE Transactions on Multimedia, vol. 7, no. 5, pp. 793-804, Oct. 2005.
[7] P. Yin, M. Wu, and B. Liu, "Video transcoding by reducing spatial resolution," International Conference on Image Processing, vol. 1, pp. 972-975, 2000.
[8] T. Acharya and A. K. Ray, Image Processing - Principles and Applications. Hoboken, New Jersey, USA: John Wiley & Sons, Inc., 2005.
[9] M. Ahmad and D. Sundararajan, "A fast algorithm for two dimensional median filtering," IEEE Transactions on Circuits and Systems, vol. 34, no. 11, pp. 1364-1374, Nov. 1987.
[10] A. Bovik, T. Huang, and D. Munson, Jr., "A generalization of median filtering using linear combinations of order statistics," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 31, no. 6, pp. 1342-1350, Dec. 1983.
[11] H. Wu and K. Rao, Digital Video Image Quality and Perceptual Coding. Boca Raton, FL, USA: CRC Press Taylor & Francis Group, 2006.
[12] R. Lukac, B. Smolka, K. Plataniotis, and A. Venetsanopoulos, "Generalized adaptive vector sigma filters," International Conference on Multimedia and Expo (ICME '03), vol. 1, pp. I-537-40, July 2003.
[13] T. H. Falk and W.-Y. Chan, "Performance study of objective speech quality measurement for modern wireless-VoIP communications," EURASIP Journal on Audio, Speech, and Music Processing, 11 pages, 2009.
[14] ITU-T, "ITU-T Recommendation P.910: Subjective video quality assessment methods for multimedia applications," September 1999.
[15] Q. Huynh-Thu and M. Ghanbari, "Scope of validity of PSNR in image/video quality assessment," Electronics Letters, vol. 44, no. 13, pp. 800-801, 2008.
[16] R. de Freitas Zampolo, D. de Azevedo Gomes, and R. Seara, "Avaliação e comparação de métricas de referência completa na caracterização de limiares de detecção em imagens," XXVI Simpósio Brasileiro de Telecomunicações - SBrT 2008, Sept. 2008.
[17] Z. Wang, L. Lu, and A. C. Bovik, "Video quality assessment using structural distortion measurement," in Proc. IEEE Int. Conf. Image Proc., 2002, pp. 65-68.
[18] J. Hu and J. Gibson, "New rate distortion bounds for natural videos based on a texture dependent correlation model in the spatial-temporal domain," in 2008 46th Annual Allerton Conference on Communication, Control, and Computing, Sept. 2008, pp. 996-1003.
[19] "YUV video sequences," http://trace.eas.asu.edu/yuv/index.html, November 2008.

Carlos Danilo Miranda Regis was born in Guarabira, Brazil. He received his Bachelor Degree in Electrical Engineering, in 2007, and his Masters Degree, in 2009, in Electrical Engineering, both from the Federal University of Campina Grande (UFCG), Brazil, where he is currently a doctoral student. He has been with the Iecom Executive Staff of the Journal of Communication and Information Systems (JCIS) since 2006. He is currently a substitute professor at the Federal Institute of Education, Science and Technology of Paraíba (IFPB), Brazil. His current interests include video quality metrics, video processing, multimedia, Digital TV, mobile TV and video transmission.

Raissa Bezerra Rocha was born in Campina Grande. She is a masters student of Electrical Engineering at the Federal University of Campina Grande (UFCG) and is currently taking part in research at the Iecom.