Singly Linked Lists Algorithm
Singly Linked Lists Algorithm
Singly Linked Lists Algorithm
Introduction
A linked list is data structure based on connected nodes rather than a
backbone array. Each node in a linked list has an element (variable value)
and a pointer to the next node in the list (variable next). That is why is
called a linked list, because the nodes in it are “chained” to one another by
pointers. The linked list has only two components: an element counter to
keep count of how many elements are stored in the list (variable count), and a pointer to the first node
in the list (variable named head). From the first node we can reach any other node in the list.
As we have learned, a vector is an excellent data structure that, when used properly, provides a very
efficient way to store and access elements. Part of this efficiency is due to the locality of data in
memory (i.e., elements are stored in a contiguous chunk of memory as opposed to in fragmented parts),
as well as due to array-based indexing (i.e., random / immediate access). Nevertheless, those
advantages may be overshadowed by the need to find
and allocate a large chunk of memory that may not even
be used by the data structure in its entirety (e.g., as
when doubling the underlying array capacity to store just
one more element). In addition, some operations like
removing and inserting an element at the head or at the
middle of the vector not only are linear-time (O(n))
operations, but they require several write-access to
memory to shift elements around, or copy them into a
new array, which typically are more costly than read-
access.
1
A pointer is a value in memory that represents a memory address, and thus, it occupies a small, fixed amount of
space in memory no matter to which data type it points to. How much memory a pointer occupies exactly
depends on the computer architecture, but typically it occupies roughly 4 bytes (equivalent to an int) in a 32-bit
machine, or 8 bytes (equivalent to a long) in a 64-bits machine.
That said, linked lists are not necessarily better or more efficient than vectors, it all depends for what
they are used. For instance, linked lists, due to their linked nature, have sequential access to their
elements, that is, to reach the tenth (10th) element in a linked list you will have to traverse the list from
the first (1st) node, to the second (2nd) node, then to the third (3rd) node, and so on until you reach the
tenth (10th) node. Vectors, on the other hand, just jump to the desired slot instantly, no matter the
position. Sequential access is slow for fetching elements, and thus, it makes searching and sorting
algorithms VERY inefficient. Yet, linked lists are better suited for situations where memory is very
fragmented or limited, and when writing-access to memory is costly or inefficient (i.e., linked lists do not
shift elements around like vectors do, thus, they could be more efficient in that particular aspect).
Linked lists like vectors are data structures with a vast range of uses. They may be used to represent
bounded or unbounded queues, lists, or stacks, which in turn may be used to store items when we do
not know the number of items beforehand or we know the number of records will increase eventually.
Operations
Like a vector, the linked list data structure allows us to insert, remove, or fetch elements to and from the
list it represents. To do that, it provides a set of methods which allow us to insert, remove, or fetch an
element at the beginning (or head) of the list, or at the end (or tail) of the list. Optionally, an
implementation may include an operation to insert, remove, or fetch an element in the ith position (or at
the middle), or an operation that allows us to replace (set) an element at a given position rather than
insert a new one, to name a few. This document discusses the insert, remove and fetch operations on a
linked list.
Insertion
Insertion on a linked list occurs by creating a new node with the element value and inserting it in the list
in the specified position. For the operation to proceed, the position must be a valid position. A valid
position is that between the first element (0) or the head, and the position just past the last element
(count) or the tail.
If the position is valid, we traverse the list from node to node until we reach the node just before the
position we want. This node is pointed by the pointer ptr. Then, we make the new node point to the
node next to the node that ptr is pointing to, and then make the node pointed by ptr point to the new
node, effectively inserting the new node at the specified position in the list.
Those steps work for every position except when inserting at the head (or position 0). The reason why it
does not work on the head is because there is no node before the head node. So inserting at the head is
a special case that must be treated separately from the rest. When inserting a node at the head we just
make the new node point to the first node on the list, and then make the pointer to that first node
(called head) point to the new node. Remember that a linked list consists of two components an
element counter called count and a pointer to the first element called head? Well, that head pointer is
the one we change to point to the new node. This all may seem confusing at first but keep reading and
you will understand better.
After successfully inserting an element we should increment the variable count by one to indicate that
the list holds one more element than it did before. Doing this is also necessary to determine which
range of positions is valid later on.
Initial state.
Step 1.
Step 2.
Step 3.
Final state.
(It works!)
Inserting at the Tail
Contrary to inserting an element at the head, inserting an element at the tail requires that we reach the
last (or tail node). Since access in a linked list is sequential, this means that we need to traverse the list
from the first node up to the last node. Due to this fact, inserting an element at the tail of the list is an
O(n) or linear-time operation on the average case. (Read Appendix A to learn more about the Big-O
notation.)
if(index == 0)
{
newNode->next = head ;
head = newNode ;
}
else
{
Node<T>* ptr = getNode(index-1) ;
newNode->next = ptr->next ;
ptr->next = newNode ;
}
count++ ;
return true ;
}
return false ;
}
return ptr ;
}
Removal
Removal on a linked list occurs by re-chaining the nodes and deleting the node in the specified position.
For the operation to proceed, the position must be a valid position. A valid position is that between the
first element (0) or the head, and the position of the last element (count-1) or the tail.
If the position is valid, we traverse the list from node to node until we reach the node just before the
position we want to remove (ptr). Then, we create a temporary pointer (tmp) to the node at the position
we want to remove (i.e., the node next to ptr), and then make node ptr point to the node next to the
one pointed by tmp, basically skipping the node tmp. After that, we delete the content of the node
pointed by tmp and return the allocated space to memory.
Those steps work for every position except when removing at the head (or position 0). The reason why
it does not work on the head is because there is no node before the head node. So, removing at the
head is a special case that must be treated separately from the rest. When removing a node at the head
we just create a temporary pointer (tmp) point to the first node on the list (just like the pointer head
does), then we make the head pointer point to the node next to the node pointed by tmp, basically
skipping the first node. After that, we delete the content of the node pointed by tmp (i.e., the ex-first
node) and return the allocated space to memory. It may seem confusing at first but keep reading and
you will understand better.
After successfully removing an element we should decrement the variable count by one to indicate that
the list holds one less element than it did before. Doing this is also necessary to determine which range
of positions is valid later on.
Removing at the Head
Removing an element at the head of the list is one of the most simple and most efficient operations in a
linked list. Since there is a variable head pointing to the first node on the list, it takes a fixed amount of
steps to remove a node at that position no matter how many nodes there are on the list. Due to this
fact, removing an element at the head of the list is a O(1) or constant-time operation on the average
case. (Read Appendix A to learn more about the Big-O notation.)
Remove Implementations
count-- ;
return true ;
}
return false ;
}
Remove at the Head
bool removeFirst()
{
return removeMiddle(0) ;
}
bool removeLast()
{
return removeMiddle(count-1) ;
}
Fetching
Access on a list is sequential. To fetch an element in a specific position we need to traverse the list from
the head node up to the node we want. Once we get the node at that position we extract the element
value from it. Of course, the position we fetch must be a valid position. A valid position is that between
the first element (at position 0) and the last element (at position count-1). Fetching the element at the
head always takes a fixed amount of steps so, it is an O(1) or constant-time operation. Fetching an
element at the tail requires to traverse all the nodes on the list so it is an O(n) or linear-time operation
on the average case, while fetching an element at the middle requires a similar process which in the
average case also makes it an O(n) or linear-time operation on the average case. (Read Appendix A to
learn more about the Big-O notation.)
Fetch Implementations
T getMiddle(int index)
{
if(index >= 0 && index < count)
{
Node<T>* ptr = getNode(index) ;
return ptr->value ;
}
return NULL ;
}
Fetching at the Head
T getFirst()
{
return getMiddle(0) ;
}
T getLast()
{
return getMiddle(count-1) ;
}
Advanced Topics
Iterators
Fetching the ith element in a linked list takes ith steps due to sequential access. This may not seem as
much for short lists but when the element count reaches large quantity it becomes very inefficient.
Furthermore, when iterating through a list as to display every element, or process each individual
element in whichever way, we may be tempted to use the following snippet of code:
At first sight it looks fine, but if we analyze it we can easily see that it becomes very, VERY inefficient to
do it this way. For example, imagine that the element count is 10. If we follow the code, it takes one
jump for getMidddle(0) to reach the first position, then it takes two jumps for getMiddle(1) for the
second position, then three jumps for getMiddle(2) to the third position, and so on. So, in total, it takes
1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55 jumps to iterate through a list of only 10 elements!
In general, it would take [n (n + 1) / 2] steps for a list with n elements using that method, that is, O(n2) or
quadratic-time operation. In contrast, iterating through a vector of n elements using the same method
just takes n steps, that is, O(n) or linear-time operation, due to its random access property.
Iterators are an internal mechanism used by many data structures, particularly linked list, to optimize
iteration-based operations. Rather than iterate through a list using the getMiddle() operation,
whenever a data structure needs to iterate through its elements we obtain a list Iterator object, and use
its interface to iterate through the list more efficiently. The interface of an Iterator object consists of
two methods:
Iterator Interface
class Iterator<T>
{
public:
The following sample code does the same as the previous example, but using an Iterator:
Iterator<T>* i = list.getIterator() ;
while(i->hasNext())
{
cout << i->getNext() << endl ;
}
Iterator<T>* getIterator()
{
return new LinkedListIterator<T>(head) ;
}
In the case of a linked list the iterator will keep track of the last element it visited so consecutives calls to
the getNext() method will return the next value without starting from the first element like the
getMiddle() method does. Here is a simple implementation for the Iterator interface for linked lists:
Node<T>* ptr ;
public:
LinkedListIterator<T>(Node<T>* startNode)
{
ptr = startNode ;
}
bool hasNext()
{
return (ptr != NULL) ;
}
T getNext()
{
T value = ptr->value ;
ptr = ptr->next ;
return value ;
}
} ;
Optionally, an Iterator interface may include methods that allow inserting, and removing from and to
the current position of the iterator. Another common method could be one that allow us to traverse
the list bidirectionally (e.g., hasPrevious(), getPrevious()), but that is normally an exclusive operation of
double linked lists.
Exercises
1. Figure a way to know how many elements are in a linked list without using the variable count.
2. Unlike vectors, linked lists do not have an array, so indexing is not possible. Figure a way to optimize
the performance of a linked list to access the last node in O(1) or constant-time? Figure a way to
optimize the performance of a linked list to reach the ith position in no more than n/2 steps? Where
n is the number of elements in the list.
3. In future documents you are going to learn that the circular linked list and the double linked list
implementations are possible answers for the previous exercise. Can you think of other possible
answers? Explain your answer(s) and implement your solution(s).
4. Due to the fact that the head node has no previous node, the standard algorithm to insert or
remove a node at the head of a linked list does not work properly, so, we must treat this as special
case like the code snippet shows. Is there a way to modify the code so there is no need to treat
inserting or removing a node at the head of the linked list like a special case? Explain your answer(s)
and implement your solution(s).
References
D.S. Malik. C++ Programming: Program Design Including Data Structures, 5th Edition. Course
Technology, 2010. ISBN-10: 0538798092 ISBN-13: 978-0538798099
R. Lafore, Data Structures and Algorithms in Java, 2nd Edition. Sams, 2002 [Classic]. ISBN-10:
0672324539, ISBN-13: 978-0672324536
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd Edition. The
MIT Press, 2009. ISBN-10: 0262033844, ISBN-13: 978-0262033848
From Wikipedia, the free encyclopedia:
o http://en.wikipedia.org/wiki/Linked_list
o http://en.wikipedia.org/wiki/Queue_(data_structure)
o http://en.wikipedia.org/wiki/Stack_(data_structure)
o http://en.wikipedia.org/wiki/Iterator