Voice over Internet Protocol (VoIP): Overview, Direction and Challenges
1. Introduction
Voice over Internet Protocol (VoIP) is a technology that allows users to make telephone calls over the Internet or an
intranet. Instead of using the traditional Public Switched Telephone Network (PSTN), calls are carried over an
Internet Protocol (IP) data network. VoIP offers cost savings, high-quality voice and video streaming, and several
other value-added services. Examples of VoIP software are Skype, Google Talk and Windows Live Messenger
(Di Wu, 2002).
2. Overview of VoIP
VoIP stands for Voice over Internet Protocol. VoIP converts the analog voice signal to a digital signal, compresses it,
and transmits it as packets over an Internet Protocol (IP)-enabled network such as the Internet, Ethernet or a
wireless LAN. VoIP uses the Internet Protocol to manage voice packets over the IP network.
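The conversion of a digitized voice stream into packets can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes 8-bit samples taken at 8 kHz and a 20 ms packet interval, and the names `packetize`, `SAMPLE_RATE_HZ` and `PACKET_INTERVAL_MS` are hypothetical.

```python
from typing import List

SAMPLE_RATE_HZ = 8000      # assumed telephony sampling rate
PACKET_INTERVAL_MS = 20    # assumed packetisation interval


def packetize(samples: bytes,
              sample_rate_hz: int = SAMPLE_RATE_HZ,
              interval_ms: int = PACKET_INTERVAL_MS) -> List[bytes]:
    """Split a stream of 8-bit voice samples into fixed-duration payloads."""
    samples_per_packet = sample_rate_hz * interval_ms // 1000
    return [samples[i:i + samples_per_packet]
            for i in range(0, len(samples), samples_per_packet)]


# One second of (silent) audio at one byte per sample.
one_second = bytes(SAMPLE_RATE_HZ)
packets = packetize(one_second)
print(len(packets), len(packets[0]))
```

Under these assumptions one second of speech becomes 50 payloads of 160 bytes each, before any IP, UDP or RTP headers are added.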
Types of VoIP Service
There are different types of VoIP service, depending on the infrastructure employed by the owner of the network.
Some popular services are described below.
Computer to Computer
Computer to computer service provides free Internet telephony between users running the same softphone software,
such as Skype or an instant messaging client (for example, AOL). It is a software-based VoIP service, and both the
caller and the receiver must be at their computers in order to place and receive calls. The requirements for computer
to computer VoIP service are softphone software, a sound card and a good Internet connection. With this service, the
user may not be able to call landline or mobile phones, and the recipient must be online in order to receive a
call.
Computer to phone and vice versa
This is a software-based and hardware-based service. Softphone software routes the call over an IP network and
hands it off to a conventional telephone network. To use the service, one needs to subscribe and is charged at a low
rate. Examples include Skype, MSN and Google Talk, which enable their customers to call landlines from their
computers. The requirements are an Internet-enabled phone and computer, a VoIP service subscription, a modem
and an Analog Terminal Adapter to convert the analog call signal to a digital signal and back again. Computer to
phone service does not support emergency calls, and it requires a computer connected to the Internet.
Phone to Phone
Journal of Information Engineering and Applications www.iiste.org
ISSN 2224-5782 (print) ISSN 2225-0506 (online)
Vol.3, No.4, 2013
This is a hardware-based service that allows the caller and receiver to call each other using the Internet. Many
telephone companies use it to handle long-distance calls. VoIP converts the audio into data packets and transfers
these packets over the Internet. The service allows emergency calls and does not need the PSTN for initiation and
termination of calls.
The server enables calls to be established and supports other features in the system. The session initiation server
allows the user to forward calls to different locations in the VoIP network.
H.323 protocol
The H.323 protocol was developed by the International Telecommunication Union Telecommunication Standardization
Sector (ITU-T), with its first version approved in 1996. It uses the Real-time Transport Protocol (RTP) and its
companion Real-time Transport Control Protocol (RTCP) to send voice, video and data over IP-based networks
(Rakesh, 2000). H.323 provides multimedia conferencing on a Local Area Network (LAN) and supports both
point-to-point communication and multipoint conferences. It was widely adopted because it is reliable and easy to
maintain. Its components are terminals, gateways, gatekeepers and multipoint control units (MCUs). A terminal is
the endpoint that provides real-time communication for the VoIP network. A gateway is the interface between the IP
network and the PSTN or another H.323 gateway; it provides translation between the different terminals. The
gatekeeper is the most vital component of the H.323 protocol. It acts as the central point for all calls and provides
services for endpoint registration. The functions of the gatekeeper are:
• Translate alias addresses to transport addresses.
• Grant or deny access based on call authorisation and on source and destination addresses.
• Handle call signalling with endpoint terminals.
• Control the number of terminals permitted at a time on the H.323 network (bandwidth management).
• Maintain the list of ongoing H.323 calls to determine the busy terminals for bandwidth management.
• Reject calls from a terminal when authorisation fails during H.225 signalling (call
authorisation).
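Two of the gatekeeper functions listed above, address translation and admission control, can be illustrated with a toy model. This is a sketch under stated assumptions and not an implementation of H.323: the class name `Gatekeeper`, its methods, the sample addresses and the use of a simple call-count limit as a stand-in for bandwidth management are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

TransportAddress = Tuple[str, int]  # (IP address, port)


@dataclass
class Gatekeeper:
    """Toy gatekeeper: an alias registry plus a cap on concurrent calls."""
    max_calls: int  # hypothetical stand-in for a bandwidth limit
    registry: Dict[str, TransportAddress] = field(default_factory=dict)
    active_calls: Set[Tuple[str, str]] = field(default_factory=set)

    def register(self, alias: str, address: TransportAddress) -> None:
        """Endpoint registration: remember where an alias can be reached."""
        self.registry[alias] = address

    def translate(self, alias: str) -> Optional[TransportAddress]:
        """Address translation: alias to transport address, or None."""
        return self.registry.get(alias)

    def admit(self, caller: str, callee: str) -> bool:
        """Admission control: refuse unknown endpoints or excess calls."""
        if caller not in self.registry or callee not in self.registry:
            return False
        if len(self.active_calls) >= self.max_calls:
            return False
        self.active_calls.add((caller, callee))
        return True

    def release(self, caller: str, callee: str) -> None:
        """Remove a finished call from the list of ongoing calls."""
        self.active_calls.discard((caller, callee))


gk = Gatekeeper(max_calls=1)
gk.register("alice", ("10.0.0.5", 1720))
gk.register("bob", ("10.0.0.6", 1720))
gk.register("carol", ("10.0.0.7", 1720))
first = gk.admit("alice", "bob")    # admitted: both registered, capacity free
second = gk.admit("carol", "bob")   # refused: the call limit is reached
print(gk.translate("alice"), first, second)
```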
Another important component of the H.323 protocol is the multipoint control unit (MCU). It acts as a bridge that
enables two or more terminals and gateways to participate in a multipoint conference. The MCU is made up of a
multipoint controller (MC) and a multipoint processor. The multipoint controller determines the capabilities of the
network terminals using the H.245 protocol stack but does not multiplex audio, video and data. The multipoint
processor is responsible for multiplexing the media streams. H.323 consists of a number of protocols. The protocols
and their functions are:
• H.245 provides capability advertisement, channel establishment and usage, and conference control.
• H.225 provides call control.
• Q.931 provides call signalling, call control and setup.
• Registration, Admission and Status (RAS) is used for communication between H.323 endpoints and the
gatekeeper.
Table 1 lists the protocol stacks used for audio, video and data packets, and their transport protocols.
The problems of the H.323 protocol are lack of flexibility, high connection setup latency and implementation
difficulties. Figures 2 and 3 show the architecture of H.323 and the connection procedures.
Figure 3: H.323 connection and call flows (Mona & Nirmala, 2002)
Session Initiation Protocol (SIP)
Session Initiation Protocol (SIP) was developed by the Internet Engineering Task Force (IETF) to initiate and
terminate VoIP sessions with one or more participants (Rakesh, 2000), (Mona & Nirmala, 2002). It is an ASCII-based,
peer-to-peer application protocol that creates, modifies and terminates interactive multimedia communication
sessions between users. Because of its flexibility, SIP is used for audio, video and data packet transmission.
SIP is similar to the Hypertext Transfer Protocol (HTTP) in following a client-server model: the client sends a
request to the server, and the server processes the request and sends a response back to the client, in an exchange
called a transaction. SIP is used in applications such as instant messaging, Apple iChat and MSN Messenger. SIP
uses the Session Description Protocol (SDP) to negotiate codecs, and it supports user mobility through proxy and
redirect servers that direct requests to the user’s current location.
Components of SIP
SIP consists of user agents and network servers. A user agent is the endpoint that acts on behalf of the user and may
be a client or a server. The client, called the user agent client, initiates SIP requests, while the server, known as the
user agent server, receives a request, processes it and returns responses on behalf of the user. Network servers
include the registration server, the proxy server and the redirect server. The registration server is used for uploading
the current location of a user. The proxy server receives a request and forwards it to the next hop. The redirect
server, on receiving a request, determines the next hop and returns its address to the client instead of forwarding
the request.
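The difference between the proxy and redirect servers described above can be made concrete with a small model. This is a hedged sketch rather than a SIP implementation: the class `Registrar` and the functions `proxy_handle` and `redirect_handle` are hypothetical names, the addresses are invented, and only the status codes 302 (Moved Temporarily) and 404 (Not Found) are taken from the SIP specification.

```python
from typing import Dict, Optional


class Registrar:
    """Registration server: stores each user's current contact address."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, user: str, contact: str) -> None:
        self._locations[user] = contact

    def lookup(self, user: str) -> Optional[str]:
        return self._locations.get(user)


def proxy_handle(registrar: Registrar, user: str) -> str:
    """Proxy server: forwards the request to the next hop itself."""
    contact = registrar.lookup(user)
    if contact is None:
        return "404 Not Found"
    return f"request forwarded to {contact}"


def redirect_handle(registrar: Registrar, user: str) -> str:
    """Redirect server: returns the next-hop address to the client."""
    contact = registrar.lookup(user)
    if contact is None:
        return "404 Not Found"
    return f"302 Moved Temporarily; Contact: {contact}"


registrar = Registrar()
registrar.register("bob@example.com", "sip:bob@192.0.2.10")
print(proxy_handle(registrar, "bob@example.com"))
print(redirect_handle(registrar, "bob@example.com"))
```

In the redirect case the client is expected to send a new request to the returned contact address itself, which keeps the redirect server out of the rest of the transaction.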
SIP Messages
SIP defines several messages for communication between the client and the SIP server (Rakesh, 2000). Some of the
messages are:
• INVITE: used to initiate a call by inviting a user to a SIP session.
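Because SIP is text-based, an INVITE request can be assembled as plain header lines. The sketch below is illustrative only: the function `build_invite`, the host `client.example.com` and all identifier values are assumptions, and the request carries no SDP body, so it shows the message layout rather than a complete call setup.

```python
def build_invite(caller: str, callee: str, call_id: str,
                 branch: str, tag: str) -> str:
    """Assemble a minimal SIP INVITE request with an empty body."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",                      # request line
        f"Via: SIP/2.0/UDP client.example.com;branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_invite(
    caller="alice@example.com",
    callee="bob@example.com",
    call_id="a84b4c76e66710",      # hypothetical identifier
    branch="z9hG4bK776asdhds",     # branch values start with "z9hG4bK"
    tag="1928301774",              # hypothetical tag
)
print(message)
```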
SIP                                     H.323
Simple to implement                     Very complex protocol
Uses textual representation             Uses binary representation for its messages
Very modular                            Not very modular
Highly scalable                         Not scalable
Does not need backward compatibility    Needs full backward compatibility
Uses simple signalling                  Uses complex signalling
Has only 37 elements                    Has a lot of elements
Loop detection is easy                  Loop detection is difficult
Backed by IETF                          Large share of market
Table 2: Comparison of SIP and H.323
G.711 was approved in 1972 and is the simplest way of digitizing an analog signal. The algorithm uses Pulse Code
Modulation (PCM) and tolerates a packet loss of less than 1%. The encoded audio stream of G.711 is 64 kbit/s, so it
is the worst of the schemes in terms of bandwidth but the best in quality.
G.722 was approved in 1988 and provides higher-quality digital coding of a 7 kHz audio spectrum at only 48, 56 or
64 kbit/s. It is mainly used for professional conversational voice applications such as video conferencing and IP
phone applications.
G.722.1 is a wideband coder designed by PictureTel, which operates at 24 kbit/s or 32 kbit/s. It encodes frames of
20 ms with a look-ahead of 20 ms. A 16 kbit/s version of G.722.1 is supported by Windows Messenger.
G.723.1 was approved in 1995 for use in H.323 communication and UMTS 99 video cell phones. It uses a frame
length of 30 ms and needs a look-ahead of 7.5 ms in its 6.3 kbit/s or 5.3 kbit/s operation modes. The algorithm is
not designed for music and is difficult to use for fax and modem signal transmission. The ITU-T recommends it for
use in narrowband video conferencing and 3G wireless multimedia devices.
G.726 was approved in 1990 and uses Adaptive Differential Pulse Code Modulation (ADPCM) to encode a G.711
bit stream in words of 2, 3, 4 or 5 bits, resulting in bit rates of 16, 24, 32 or 40 kbit/s.
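The bit rates of the sample-based codecs follow directly from the sampling rate and the number of bits per sample. The short calculation below assumes the standard 8 kHz telephony sampling rate; the function name `pcm_bit_rate_kbps` is hypothetical. G.726 defines word sizes of 2 to 5 bits, which is what the second line evaluates.

```python
SAMPLE_RATE_HZ = 8000  # assumed narrowband telephony sampling rate


def pcm_bit_rate_kbps(bits_per_sample: int,
                      sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Bit rate of a sample-based codec: samples per second times bits each."""
    return sample_rate_hz * bits_per_sample / 1000


g711_rate = pcm_bit_rate_kbps(8)                               # G.711 PCM
g726_rates = [pcm_bit_rate_kbps(bits) for bits in (2, 3, 4, 5)]  # G.726 ADPCM
print(g711_rate, g726_rates)
```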
G.728 uses the low-delay code-excited linear prediction (LD-CELP) coding technique, with a mean opinion score
(MOS) similar to that of G.726. The algorithm is used for fax and modem transmission, and also for H.323 video
conferencing.
G.729 is a Conjugate-Structure Algebraic Code-Excited Linear Prediction (CS-ACELP) speech compression algorithm
approved by the ITU-T for use in voice over frame relay applications. It produces 80-bit frames, each encoding
10 ms of speech, at a bit rate of 8 kbit/s. The scheme is not designed for music and does not carry Dual-Tone
Multi-Frequency (DTMF) signalling tones reliably. Table 3 lists the properties of common voice codec schemes.
Codec    Bit rate    Payload    Packets per   Quality    Bandwidth  Sample  Algorithm
                                second (pps)                        period
G.711    64 kbit/s   160 bytes  50 pps        Excellent  95.2 kbps  20 ms   PCM
G.729    8 kbit/s    20 bytes   50 pps        Good       39.2 kbps  10 ms   CS-ACELP
G.723.1  6.3 kbit/s  24 bytes   34 pps        Good       27.2 kbps  30 ms   MP-MLQ
G.723.1  5.3 kbit/s  20 bytes   34 pps        Good       26.1 kbps  30 ms   ACELP
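The bandwidth figures in the table can be reproduced from the payload size and the packet rate if a fixed per-packet overhead is added. The paper does not state how the figures were derived, so the following is an inference: a 78-byte overhead, made up of 40 bytes of IP, UDP and RTP headers plus 38 bytes of Ethernet framing (header, frame check sequence, preamble and inter-frame gap), with a 20 ms packet interval for G.711 and G.729 and a 30 ms interval for G.723.1. The names used are hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12     # IPv4 + UDP + RTP headers
ETHERNET_OVERHEAD_BYTES = 14 + 4 + 8 + 12  # header + FCS + preamble + gap (assumed)
OVERHEAD_BYTES = IP_UDP_RTP_HEADER_BYTES + ETHERNET_OVERHEAD_BYTES  # 78 bytes


def call_bandwidth_kbps(payload_bytes: int, packet_interval_ms: float) -> float:
    """One-direction bandwidth of a call, including per-packet overhead."""
    packets_per_second = 1000 / packet_interval_ms
    bits_per_packet = (payload_bytes + OVERHEAD_BYTES) * 8
    return bits_per_packet * packets_per_second / 1000


codecs = {
    "G.711": (160, 20),             # (payload bytes, packet interval in ms)
    "G.729": (20, 20),
    "G.723.1 6.3 kbit/s": (24, 30),
    "G.723.1 5.3 kbit/s": (20, 30),
}
for name, (payload, interval) in codecs.items():
    print(f"{name}: {call_bandwidth_kbps(payload, interval):.1f} kbps")
```

A 30 ms interval gives about 33.3 packets per second, which the table rounds up to 34 pps.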
• Widespread availability of Internet Protocol (IP): IP networks are readily available all over the world, with
people having access to PCs linked to the Internet. Furthermore, the availability of gateways to and from the PSTN
allows callers to use VoIP for voice and video calls (Mona & Nirmala, 2002).
• Reduced cost of ownership: VoIP integrates data and voice traffic into a single network, thereby reducing
the cost of infrastructure ownership and eliminating maintenance redundancies. It brings together different
network elements such as the call server, application server and client server (Bhogal et al, 2004).
• Efficient utilisation of network resources: A VoIP network improves bandwidth efficiency and quality of
service by eliminating silence during conversation, reducing repetitive patterns in human speech and
increasing effective data throughput.
• Greater operational flexibility: An IP-based network is made up of different layers of separate components
that can be integrated to form a whole system. This allows the system, applications and services to be
dynamically managed, resulting in a customised, flexible and extensible system.
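The silence elimination mentioned under efficient utilisation of network resources can be approximated with a simple energy threshold. This is a toy sketch and not the method of any particular codec: the function names, the sample values and the threshold of 500 are assumptions chosen for illustration, and frames are assumed to be non-empty lists of 16-bit samples.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def suppress_silence(frames: List[Sequence[int]],
                     threshold: float = 500.0) -> List[Sequence[int]]:
    """Keep only frames loud enough to be treated as speech."""
    return [frame for frame in frames if rms(frame) >= threshold]


speech = [4000, -3500, 3000, -4200]   # hypothetical loud frame
silence = [10, -8, 5, -12]            # hypothetical near-silent frame
kept = suppress_silence([speech, silence, speech])
print(len(kept))
```

Frames that are dropped need not be transmitted, which is where the bandwidth saving comes from.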
A voice packet is intolerant of packet loss, jitter and delay, unlike a traditional data packet, which tolerates some
delay in delivery and for which there is no need to address Quality of Service (QoS) issues. To convey voice traffic
over an IP network, the packets must arrive reliably. Quality of service therefore provides dedicated bandwidth,
controlled jitter and latency, and improved loss characteristics. This section discusses these QoS parameters as
they affect VoIP.
Delay: Delay is the amount of time it takes to transmit a data packet from source to destination. It is the end-to-end
delay incurred in speech by the VoIP system. To ensure high quality, the communication delay should be kept below
150 ms (Di Wu, 2002), (Jeomgoo, Inyong & Suh, 2010). Delay is caused by three major factors: the codec
algorithm, the queuing algorithm of the communicating equipment, and variable delay caused by network conditions
at the time of transmission. Codec (compression-decompression) introduces three kinds of delay:
• Processing or algorithmic delay, which is the time required for the codec to encode one voice frame.
• Look-ahead delay, which is the time required for the codec to examine part of the next frame.
• Frame delay, which is the time required for the sending system to transmit a single frame.
The compression algorithm affects delay: the higher the level of compression, the higher the delay in the system.
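The delay components above can be summed into a one-way delay budget and compared with the 150 ms limit. The sketch below uses the 30 ms frame length and 7.5 ms look-ahead quoted for G.723.1; the processing, queuing, network and jitter-buffer values are invented for illustration, and the function names are hypothetical.

```python
MAX_ONE_WAY_DELAY_MS = 150.0  # limit quoted in the text


def one_way_delay_ms(frame_ms: float, look_ahead_ms: float,
                     processing_ms: float, queue_ms: float,
                     network_ms: float, jitter_buffer_ms: float = 0.0) -> float:
    """Sum codec, queuing, network and jitter-buffer delay for one direction."""
    codec_ms = frame_ms + look_ahead_ms + processing_ms
    return codec_ms + queue_ms + network_ms + jitter_buffer_ms


def within_budget(total_ms: float) -> bool:
    """True if the delay stays below the 150 ms limit."""
    return total_ms < MAX_ONE_WAY_DELAY_MS


total = one_way_delay_ms(
    frame_ms=30.0,          # G.723.1 frame length
    look_ahead_ms=7.5,      # G.723.1 look-ahead
    processing_ms=30.0,     # assumed encoding time
    queue_ms=10.0,          # assumed queuing delay
    network_ms=50.0,        # assumed propagation and transmission delay
    jitter_buffer_ms=20.0,  # assumed jitter-buffer depth
)
print(total, within_budget(total))
```

With these assumed values the budget is almost used up, which illustrates why a heavily compressing codec leaves little room for network delay.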
Packet Loss: Packet loss occurs when packets are dropped in the network before they reach the destination.
Because VoIP packets are very sensitive to loss, packet loss can greatly affect the Quality of Service (QoS) of a
VoIP system. The acceptable packet loss in a VoIP system is below 1%; anything beyond this limit is unacceptable.
The major causes of packet drops are network congestion and limited buffer size, so every effort should be made to
design the network to counter congestion.
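The 1% limit can be checked from simple packet counts. This is a minimal sketch: the function names are hypothetical, and in practice the counts would come from sequence numbers or receiver reports rather than being supplied directly.

```python
MAX_LOSS_PERCENT = 1.0  # limit quoted in the text


def packet_loss_percent(sent: int, received: int) -> float:
    """Percentage of sent packets that never arrived."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100 * (sent - received) / sent


def loss_acceptable(sent: int, received: int) -> bool:
    """True if the loss stays below the 1% limit."""
    return packet_loss_percent(sent, received) < MAX_LOSS_PERCENT


print(packet_loss_percent(1000, 995), loss_acceptable(1000, 995))
print(packet_loss_percent(1000, 980), loss_acceptable(1000, 980))
```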
Jitter: Jitter is the variation in inter-packet arrival time, which introduces variable transmission delay over the
network. Because VoIP uses the User Datagram Protocol (UDP), the IP network cannot guarantee the delivery time of
packets, leading to an inconsistent rate of arrival. Jitter can be removed using a jitter buffer, which collects incoming
packets and stores them long enough for the slowest packets to arrive and be played out in the correct sequence.
The jitter buffer adds to the overall delay. To support VoIP traffic reliably, the network should guarantee the following:
• Packet-forwarding latency within the maximum tolerable for a VoIP conversation.
• Packet-forwarding jitter within a tolerable level to sustain a VoIP session.
• Guaranteed bandwidth and capacity for VoIP sessions in case of network congestion.
The network should provide low latency and jitter to maintain high quality. We need to control all the mentioned
parameters to ensure a high quality of VoIP service for students and staff. Sometimes we also need to prioritise
network applications and limit shared network resources.
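Jitter can be estimated from packet send and arrival times. The paper does not prescribe an estimator; the sketch below uses the running interarrival jitter calculation defined for RTP in RFC 3550, in which each new transit-time difference moves the estimate by one sixteenth. The function name and the timestamps are hypothetical.

```python
from typing import Sequence


def interarrival_jitter(send_times_ms: Sequence[float],
                        arrival_times_ms: Sequence[float]) -> float:
    """Running jitter estimate in the style of RFC 3550 (RTP)."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Change in transit time between consecutive packets.
        delta = ((arrival_times_ms[i] - arrival_times_ms[i - 1])
                 - (send_times_ms[i] - send_times_ms[i - 1]))
        # Smooth with gain 1/16 so single outliers do not dominate.
        jitter += (abs(delta) - jitter) / 16
    return jitter


send = [0.0, 20.0, 40.0, 60.0]          # packets sent every 20 ms
steady = [50.0, 70.0, 90.0, 110.0]      # constant 50 ms transit time
uneven = [50.0, 75.0, 90.0, 110.0]      # second packet arrives 5 ms late
print(interarrival_jitter(send, steady))
print(interarrival_jitter(send, uneven))
```

A constant transit time gives zero jitter however large the delay is, which is why delay and jitter are treated as separate parameters.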
• Protection of network servers and endpoints from well-known threats and man-in-the-middle attacks.
3 CONCLUSION
To keep abreast of global technological change while minimising cost, a reliable and cheap means of
communication is essential. This paper has given a critical and succinct digest of Voice over Internet Protocol
(VoIP). It systematically educates the reader on the VoIP system, its standards, protocols and security challenges.
VoIP is presented as a sure alternative to the Public Switched Telephone Network (PSTN).
REFERENCES
1 Bhogal Amit; Hamza Idrissi; Thai-son Nguyen; Michael Wakahe (2004), “Voice over Internet Protocol”.
Online at www3.sympatico.ca/albert_nguyen/project/VoIP.pdf
2 Di Wu (2002), “Performance studies of VoIP over Ethernet LANs”. Online at
http://www.autoresearchgateway.ac.uz/bitstream/10292/677/5/Diw/pdf. (Accessed: 28/06/2011)
3 Greg S Tucker (2004), “Voice over Internet Protocol (VoIP) and security”. Online at
www.sans.org/reading_room/whitepapers/voip/voice-internet-protocol-voip-security-1513. (Accessed:
30/08/2011)
4 Hersent Oliver (2011), “IP telephony: deploying VoIP protocol and IMS Infrastructure”, Second Edition, John
Wiley and Sons, UK
5 Jeomgoo Kim, Inyong Lee, Suh ron Noh (2010), “VoIP Quality of Service Design of measurement
management process model”, International Conference on Information Science and Applications (ICISA), pp.
1-6.
6 Mona Habib, Nirmala Bulusu (2002), “Improving QOS of VoIP over WLAN (IQ-VW)”. Online at
http://www.cs.iccs.edu (Accessed: 15/05/2011)
7 Rakesh Arora (2000), “Voice over IP: Protocol and Standards”. Online at
http://www.cse.wustl.edu/~jain/cis788-99/ftp/voip_protocols.pdf (Accessed: 29/05/2011)