Strings Code
Strings Code
Strings Code
Strings
Database to DNA!
INTRODUCTION
In this chapter we will learn how to represent a string in C. In computer science, we deal regularly with
strings everywhere, starting from details of customer in a service industry to DNA strands in the bio-
chemistry lab. C doesnt offer string as a basic built in data structure. We use char array or char pointer
to simulate string behavior. This type of representation is commonly known as C-Style String Represen-
tation. Once we learn basic string initialization, then we will learn about the C-library functions which
will be the building blocks for some user defined functions that we build towards the later of this chap-
ter. Moreover linked lists are also used to represent strings and store them. Most contemporary object
oriented programming languages, like Java and C#, support splitting a given string by delimiters. We
will create such a function in this chapter. This function will take a string and a delimiter to chop it and
will return a linked list of chopped strings. Later in the book, especially in the chapter on file handling
we will use this function for reading delimited notepad files and process them.
l String processing also finds application to find Plagiarism. Mostly Rabin-Karp algorithm is
used to detect it.
l String comparison algorithms/methods are used in different areas like Spell Checking, DNA
matching, Adaptive Text Prediction using T9 dictionary for mobile phones, etc.
J A C O B \0
//Calculative Initialization
//The name below has exactly 15 characters
//and thus it has been stored in
//a character array of 16 pockets. The
//last pocket of the array is for storing the null character.
char name[16]=Jacob Alexander;
The integers denote the locations of the characters in the name. Later it will be shown how to extract
a part of a given string. Then we can use that function to extract the first, middle and the last name.
int main()
{
char s[2]={12,11};
printf("The Female Symbol :%c\n",s[0]);
printf("The Male Symbol :%c\n",s[1]);
return 0;
}
Here is the output of the program,
The Female Symbol:E
The Male Symbol:G
This scheme for initializing strings are highly used in cryptography and user interface design for
console based C programs. (You will find these applications in departmental stores).
S i r I s s a c N e w t o n \0
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
strlen() returns length of the string including the null character. So the length of the string returned is 16.
Concatenating Two C-Style Strings
Functions used:
l strcat()
l strncat()
strcat()
To concatenate two strings entirely (One is concatenate at the end of the other) we can use strcat () as
shown below.
char *firstname = John;
//Note the carefully left blank space before the family name.
//If this blank space is not left then, given name and family name will
//be close and not readable
strcmp()
Two strings can be compared using the built-in library function strcmp(). This method compares two
strings considering case. That means if we compare the strings World and wOrLd then its compari-
son will tell us that the strings are not equal. In case the two passed arguments are equal then this
function returns zero. Otherwise this method will return a non zero value. Here is an example.
char *a=world;
char *b=worm;
if(strcmp(a,b)==0)
printf(Two Strings are same );
else
printf(They are unequal);
The output of this program snippet will be They are unequal as the two strings are unequal.
Besides using this built-in function you can write your own string comparison method. The algorithm
is to check letter by letter and then proceed until you encounter the null character or a mismatch.
There are two other methods that are used to compare strings. They are
strcmpi()
This functions compares two strings ignoring their cases. This function also returns 0 in case the two
string values are same. Two Strings World and wOrLd are same to strcmpi(). So the output of the
code snippet
char *s=World;
char *t=wOrLd;
if(strcmpi(s,t)==0)
printf(The two strings are same);
else
printf(The strings are different);
strncmp()
It accepts three arguments, two strings and one integer. It checks two strings up to that nth character. In
case they are same, then zero is returned. Otherwise, non zero value is returned. This function becomes
particularly handy, when we search for names in address-book programs. Suppose you want to find out
that how many names are there in the address-book that starts with Ja, then we can do something like
170 Data Structures using C
//Start searching
//Till the end of the address book
//pick the name of the person in a string called Name then
if(strncmp(Name,Ja,strlen(Ja))==0)
//Show the name
strlen(Ja) = 2. So the above piece of code will return success for all those entries in the
address book entries for which, the first 2 letters of the name are Ja, like Jacob, Jack, Jasmin,
Jana, etc.
But it will return false for Jonathon, because here only the first character matches.
strncmpi()
Same as strcmp(). The only difference is that this function is case-insenstive. This function compares
two strings up-to nth character ignoring their case. So in the above example (of address-book) the best
suited method is strncmpi() because that will match names starting with Ja, jA, JA and ja
This function also returns 0 on success and a non-zero value on failure.
Copying a String to Another
Functions used:
l stpcpy()
l strcpy()
l strncpy()
stpcpy()
This function copies all the characters of one string to other empty string till it reaches the null character.
So this function is appropriate when you are trying to add some more characters at the logical end of the
string being copied.
#include <stdio.h>
#include <string.h>
#include <conio.h>
int main()
{
char *a="FRANKFURT";
char *b=";
stpcpy(b,a);
puts(b);
getch();
return 0;
}
strcpy()
This function copies one string to another blank string. This function is mainly used for assigning the
new string variables.
#include <stdio.h>
#include <string.h>
#include <conio.h>
int main()
{
Strings (Database to DNA!) 171
char *a="FRANKFURT";
char *b="Hamburg";
//We will swap the contents of the strings using strcpy
char *temp="";
clrscr();
printf("Before Swapping a = %s and b = %s\n",a,b);
strcpy(temp,a);
strcpy(a,"");
strcpy(a,b);
strcpy(b,"");
strcpy(b,temp);
printf("After Swapping a = %s and b = %s\n",a,b);
getch();
return 0;
}
The output of the program will be
Before Swapping a = FRANKFURT and b = Hamburg
After Swapping a = Hamburg and b = FRANKFURT
strncpy()
This function is similar to that of strcpy(), but it will copy up to the nth character of the source string.
Suppose we have the complete name for a person and we want to extract the first name. To do that, we
first need to find out the index of the first white space in the string. Then we need to copy part of the
string from starting to the index one less than the index of the first white space.
#include <stdio.h>
#include <string.h>
#include <conio.h>
int main()
{
char *name="Jacob Alexander";
char *fname="";
int i;
clrscr();
//Finding the location of the first whitespace.
for(i=0;name[i]!=' ';i++);
printf("%d ",i);
strncpy(fname,name,i);
//Putting the null character manually.
fname[strlen(fname)]='\0';
puts(fname);
getch();
return 0;
}
The output of this code will be
Jacob.
Changing Case of a String to Lower or to UPPER
Functions used
l strlwr()
l strupr()
172 Data Structures using C
strlwr()
This function returns a string which is a lower cased version of the argument string.
strupr()
This function returns a string which is a upper cased version of the argument string.
int MAIN ()
{
puts ("I am here");
RETURN 0;
}
The program accepts the filename and then displays the corrected version of the program.
#include <stdio.h>
#include <string.h>
#include <conio.h>
int main()
{
char *line;
FILE *fp;
clrscr();
//Open the input wrong case program file
fp = fopen("D:\\test.C","r");
if(fp)
{
while(!feof(fp))
{
fgets(line,81,fp);
puts(strlwr(line));
}
}
else
perror("Error");
getch();
return 0;
}
Strings (Database to DNA!) 173
int main ()
{
puts ("i am here");
return 0;
}
Try to extend this application that will take care of the following situations:
1. User defined constants like #define USD2INR 44.5 should not be changed to lower case.
2. User defined function name like draw_rectangle() should not be changed to lower case.
3. Inbuilt constant names (like M_PI_2) and macro names (like EOF) should not be changed to lower
case.
int main()
{
char *line;
FILE *fp;
clrscr();
fp = fopen("D:\\who.txt","r");
if(fp)
{
while(!feof(fp))
{
fscanf(fp,"%s",line);
if(strcmpi(line,"who")==0 ||
strcmpi(line,"unicef")==0)
printf("%s ",strupr(line));
else
printf("%s ",line);
}
}
174 Data Structures using C
else
perror("Error");
getch();
return 0;
}
Who.txt contains
WHO, the World Health Organization, had identified 7th June 2006 as Global Polio Eradication Day.
Who said that whosoever may tell you that these 2 drops are useless, you shouldnt listen to them. unicef
and who requested the state and central government to take necessary measures to assure that all the kids
up to the age of 5, get the polio vaccine as per who standard. unicef along with local ngos are trying their
best to ensure that no child is left out from this gigantic who - unicef effort. I will take my children to the
nearest polio booth. As a who doctor, I ask you to do so my friend! lets fight polio along with who,
unicef and ngos.
and the output of the program will be
WHO the World Health Organization had identified 7th June 2006 as Global Polio Eradication Day.
WHO said that whosoever may tell you that these 2 drops are useless, you shouldnt listen to them.
UNICEF and WHO requested the state and central government to take necessary measures to assure
that all the kids up to the age of 5, get the polio vaccine as per WHO standard. UNICEF along with
local NGOs are trying their best to ensure that no child is left out from this gigantic WHO - UNICEF
effort. I will take my children to the nearest polio booth. As a WHO doctor, I ask you to do so my friend!
Lets fight polio along with WHO, UNICEF and NGOs.
See the changes are made bold
Try extending this program with the following changes.
1. Try to make the changes (i.e. capitalization) interactively. For example there may be a sentence
like who doesnt know who? It actually means who doesnt know WHO ? Whenever an abbrevia-
tion from the pool of abbreviations is found, display the full line and then ask user whether to
change that abbreviation or not. If yes then make the case change, else skip it and continue search
for next matching abbreviation.
2. Create a pool of abbreviations
3. Allow users to add new abbreviations.
int main()
{
char *s=ABACAS;
printf(Original String : %s\n,s);
printf(Reverse String : %s\n,strrev(s));
return 0;
}
Strings (Database to DNA!) 175
int main()
{
char *s=abcdefghijklmnop;
char symbol = x;
printf(The original string is : %s\n,s);
printf(After setting with x :%s\n,strset(s,symbol));
return 0;
}
The output of the program will be
The original string is: abcdefghijklmnop
After setting with x: xxxxxxxxxxxxxxxx
strnset()
Sometime you may be interested to set only a particular number of characters from the start of the string.
In those situations, you have to use strnset( ). This function sets the first n characters of a string to
another character as given by the user.
#include <stdio.h>
#include <string.h>
int main()
{
char *s=abcdefghijklmnop;
char symbol = x;
printf(The original string is : %s\n,s);
printf(After setting with x up to first 4 characters
:%s\n,strnset(s,symbol,4));
return 0;
}
The output of the program will be
The original string is: abcdefghijklmnop
After setting with x up to first 4 characters: xxxxefghijklmnop
176 Data Structures using C
These functions can be used in computer secrecy projects. For example, you may have noticed that when
you buy something over internet using your credit card, the digits of the credit card are set to some
different characters. Sometimes all the characters are set and sometime some predefined range of char-
acters are set. These sorts of operations can be achieved using these two functions.
int main()
{
char *s=Life is beautiful;
printf(The first occurrence of e is at %d\n,
strchr(s,e)-s);
return 0;
}
The output of the program will be
The first occurrence of e is at 3
strstr()
This function is used for finding out the first occurrence of a substring/word from a given string. This is
syntactically the same as strchr( ). This function returns the substring that starts from the given word. For
example if the given string is
char *s = Life is beautiful;
And the search pattern/word is is then if we call strstr() as strstr(s,is) then it will return
is beautiful
Using similar pointer arithmetic as in strchr() we can find the index location of the first occurrence of
the sought word. Here is the code
#include <stdio.h>
#include <conio.h>
#include <string.h>
int main()
Strings (Database to DNA!) 177
{
char *s=Life is beautiful;
printf("The first occurrence of \"is\" is at
%d\n",strstr(s,"is")-s);
return 0;
}
The output of this code is
The first occurrence of is is at 5
4.14 HOW TO FIND THE LOCATION FROM WHERE TWO STRINGS START
TO DIFFER
Functions used:
l strspn()
l strcspn()
strspn()
This function is used to find the location of a string where it started to differ from another given string.
Here is a code snippet to demonstrate the functions behaviour.
#include <stdio.h>
#include <string.h>
#include <malloc.h>
int main()
{
char *s="12345678";
char *t="1234ffsf";
printf("The strings intersect at %d\n",strspn(s,t));
return 0;
}
This program prints
The strings intersect at 4
The point where from the strings start differing is sometime referred as their intersection point.
strcspn()
This is just the complimentary function of strspn(). This function finds that position in a string till when
it doesnt match with another string. Here is the code
#include <stdio.h>
#include <string.h>
#include <malloc.h>
int main()
{
char *s="12345678";
char *t="ffsf5678";
printf("The strings intersect at %d\n",strcspn(s,t));
return 0;
}
Here is the output of the program:
The strings intersect at 4
178 Data Structures using C
int main()
{
char *s=Able was I as I saw Elba; //A Palindromic String!
char *dup_s = strdup(s);
printf(Duplicate String : %s\n,dup_s);
free(dup_s);
return 0;
}
This will print
Duplicate String: Able was I as I saw Elba
int main()
{
char *s=12,400;
printf(Part before , is %s\n,strtok(s,,));
printf(Part after , is %s\n,strtok(NULL,,));
return 0;
}
This will print
Part before , is 12
Part after , is 400
Strings (Database to DNA!) 179
So you may have noticed that to get the trailing part (Whichever part of the string is after the delimiter)
we need to call strtok(NULL,delimiter); after calling strtok(stringname, delimiter) ;
This function can be very handy to extract part of a code or phone number. Typically any land line
number has three parts. They are country code, city code and the phone number. We can write a small
program from which we will accept a phone number from the user and then extract these three sections
from that number. Here is the code and the sample output.
#include <stdio.h>
#include <string.h>
int main()
{
char s[16];
puts("Enter Number as [country_code-City_code-Phone Number ,
Eg 123-456-12345678] :" );
gets(s);
char *p;
p=strtok(s,"-");
printf("Country Code : %s\n",p);
p = strtok(NULL,"-");
printf("City Code : %s\n",p);
printf("Phone Number : %s\n",strtok(NULL,"-"));
return 0;
}
A Sample Run of this code
Enter Number as [country_code-City_code-Phone Number , Eg 123-456-12345678]:
91-011-25882746
Country Code: 91
City Code: 011
Phone Number: 25882746
Next in this chapter there is a function Split() that uses strtok() internally to tokenize a given line with
a given delimiter. If you want, you can jump there to understand how that works.
{
for(i=0,j=0;i<strlen(Pattern);i++,j++)
{
if(Text[j]!=Pattern[i])
{
Pref = 0;
break;
}
}
}
else
Pref = 0;
return Pref;
int j = 0;
int Suff = 1;
if(Text[strlen(Text)-1]==Pattern[strlen(Pattern)-1])
{
for(i=strlen(Text)-2,j=strlen(Pattern)-2;j>=0;i--,j--)
{
if(Text[i]!=Pattern[j])
{
Suff = 0;
break;
}
}
}
else
Suff = 0;
return Suff;
}
Strings (Database to DNA!) 181
//O(MN)
}
This above function finds whether b is a subsequence of a or not, provided a and b do NOT have any
duplicate characters. The time complexity of the above code is O(MN + K) where M is the length of the
text, N is the length of the Pattern b.
Try Yourself: Try to modify this above program to allow duplicate in pattern a and b:
For example the above code will say that order is a subsequence for wonderful but it will fail to
identify that rental is a subsequence of ornamental because a occurs once before e and the
above program does not store location information of characters of the word.
182 Data Structures using C
Maintain a linked list of the positions for each character. Write a structure to store each character
and its location list. Now once you preprocess the main word (example Ornamental in the figure)
then you will be ready for checking whether another string (Like rental) is a subsequence of it or not.
The picture explains the rest for you. Try it!
How to Check whether a Word starts with a Prefix or not
int startsWith(char *string,char *pattern)
{
int start = 1;
int i=0;
for(;i<strlen(pattern);i++)
{
if(string[i]!=pattern[i])
{
start=0;
break;
}
}
return start;
}
We can also write the above method using strncmp as
int startsWith(char *string,char *pattern)
{
return strncmp(string,pattern,strlen(pattern));
}
for(i=strlen(s)-1,j=0;i>=0;i--,j++)
rs[j]=s[i];
rs[j]='\0';
return rs;
}
//This function finds out whether the given string with the given
pattern
int endsWith(char *string,char *pattern)
{
char rs[20],rp[20];
strcpy(rs,rstring(string));
strcpy(rp,rstring(pattern));
return startsWith(rs,rp);//Notice how startsWith() is re-
used.
}
Example 4.4 Write a function to check whether a string matches the following basic asterisk (*)
wild character matches or not. Namely,
<pattern>*
*<pattern>
<pattern1>*<pattern2>
*<pattern>*
Solution Here is a simple function that uses the startsWith and endsWith function defined above in
order to match the four wildcard patterns mentioned. It doesnt work for a generalized combination of
those wildcard characters.
int wildCharMatch(char *string,char *pattern)
{
char t[20];
char *p = strchr(pattern,'*');
int starindex = p - pattern;
if(pattern[0]=='*')
{
if(pattern[strlen(pattern)-1]=='*')
{
strcpy(t,substring(pattern,1,strlen(pattern)-2));
return strstr(string,t)!=NULL?1:0;
}
else
{
strcpy(t,substring(pattern,1,strlen(pattern)-1));
return endsWith(string,t);
}
}
if(pattern[strlen(pattern)-1]=='*')
{
strcpy(t,substring(pattern,0,strlen(pattern)-2));
return startsWith(string,t);
}
if( i d ! 0 || i d ! l ( ) 1)
184 Data Structures using C
if(starindex!=0 || starindex!=strlen(pattern)-1)
{
strcpy(t,substring(pattern,0,starindex-1));
if(startsWith(string,t)==1)
{
strcpy(t,substring(pattern,starindex+1,strlen(pattern)-1));
return endsWith(string,t);
}
}
else
return 0;
}
How to accept a String of Words Delimited with Space and Return a Linked
List of Those Words as char*
typedef struct String
{
char s[20];
struct String *next;
}String;
h = push_back_String(h,p);
return ch;//Returning the head of the list
}
How to Count the Total Number of Words in a Sentence
int countWords(String *words)
{
int totalwords = 0;
String *cwords=words;
for(;cwords!=NULL;cwords=cwords->next)
totalwords++;
return totalwords;
}
How to Replace a Word by another Word from a Phrase
char* replace(char phrase[81],char *word,char *newword)
{
int count = 0;
int i,j=0;
String *words = NULL;
words = split(phrase," ");
String *cwords = NULL, *modphrase = NULL;
char newphrase[81];
for(;words!=NULL;words=words->next)
{
if(strcmp(words->s,word)==0)
{
cwords = push_back_String(cwords,newword);
count++;
if(count==1)
modphrase = cwords;
cwords = push_back_String(cwords," ");
}
else
{
cwords = push_back_String(cwords,words->s);
count++;
if(count==1)
modphrase = cwords;
cwords = push_back_String(cwords," ");
for(;modphrase!=NULL;modphrase=modphrase->next)
for(i=0;i<strlen(modphrase->s);i++,j++)
newphrase[j]=modphrase->s[i];
newphrase[j]='\0';
return newphrase;
}
186 Data Structures using C
}
for(;modifiedline!=NULL;modifiedline=modifiedline->next)
for(i=0;i<strlen(modifiedline->s);i++,j++)
modifiedsentence[j] = modifiedline->s[i];
modifiedsentence[j] = '\0';
return modifiedsentence;
}
Here is a client code snippet that demonstrates the usage of the above function.
strcpy(t,"Anger is never without reason but seldom with a good one);
strcpy(t,DeleteWord(t,"without"));
strcpy(t,DeleteWord(t,"reason"));
strcpy(t,DeleteWord(t,"but"));
strcpy(t,DeleteWord(t,"seldom"));
strcpy(t,DeleteWord(t,"with"));
strcpy(t,DeleteWord(t,"a"));
strcpy(t,DeleteWord(t,"one"));
puts(t);
This outputs
Anger is never good
Strings (Database to DNA!) 187
getch();
return 0;
}
it generates the following output:
Few people are
capable of expressing
with equanimity opinions
which differ from
the prejudices of
their social environment.
Most people are
even incapable of
forming such opinions
Albert Einstein
How to Demonstrate the Random Cipher Encryption of a Text
char* encrypt(char *pwd)
{
char *epwd;
int i,j;
//Add more symbols if you want
char a[]={'a','b','c','d','e','f','g','h','i','j','k'
,'l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'1','2','3','4','5','6','7','8','9','0'};
char b[]={'0','9','8','7','6','5','4','3','2',
'1','z','y','x','w','v','u','t','s','r','q','p','o','n','m','l',
'k','j','i','h','g','f','e','d','c','b','a'};
188 Data Structures using C
for(i=0;i<strlen(pwd);)
{
for(j=0;j<strlen(a);j++)
{
if(pwd[i]==a[j])
{
epwd[i]=b[j];
i++;
}
}
}
epwd[i]='\0';
return epwd;
}
How to Decrypt a Text Encrypted with the Above Function
char* decrypt(char *pwd)
{
char *dpwd;
int i,j;
char a[]={'a','b','c','d'
,'e','f','g','h'
,'i','j','k','l'
,'m','n','o','p','q'
,'r','s','t','u','v'
,'w','x','y','z','1'
,'2','3','4','5','6'
,'7','8','9','0'};
char b[]={'0','9','8','7'
,'6','5','4','3','2'
,'1','z','y','x','w'
,'v','u','t','s','r'
,'q','p','o','n','m'
,'l','k','j','i','h'
,g','f','e','d','c'
,'b','a'};
for(i=0;i<strlen(pwd);)
{
for(j=0;j<strlen(a);j++)
{
if(pwd[i]==b[j])
{
dpwd[i]=a[j];
i++;
}
}
}
dpwd[i]='\0';
return dpwd;
}
,'l','k','j','i','h'
,g','f','e','d','c'
,'b','a'};
for(i=0;i<strlen(pwd);)
Strings (Database to DNA!) 189
{
for(j=0;j<strlen(a);j++)
{
if(pwd[i]==b[j])
{
dpwd[i]=a[j];
i++;
}
}
}
dpwd[i]='\0';
return dpwd;
}
How to Represent a String as a Linked List of Characters
typedef struct character
{
char c;
struct character *next;
struct character *prev;
}character;
s=push_back(s,'\0');
return cs;
}
}
How to Trim a Specified Number of Characters from the Left of a Given String
char* trimleft(char *word,int index)
{
char *tl;
int last = strlen(word);
strcpy(tl,substring(word,index,last));
return tl;
}
This function will trim the specified number of characters from the left of the given word. Here is
how to call the function.
trimleft(,Wonderful,1) will return Wonderful
How to Trim a Specified Number of Characters from the Right of a Given String
char* trimright(char *word,int index)
{
char *t;
int x = strlen(word);
strcpy(t,substring(word,0,x-index-1));
return t;
}
The function will trim the specified number of characters from the right of the given word. Here is
how to call the function
trimright(Lovely,2) will return Love
How to Pad n Number of Specified Characters to the Left of a String
char* padleft(char *word,char c,int n)
{
char *t;
Strings (Database to DNA!) 191
int i;
for(i=0;i<n;i++)
t[i]=c;
t[i]='\0';
strcat(t,word);
return t;
}
The function will pad n number of characters c at the left of the word. Here is an example run.
padleft(100.00,$,1) will return $100.00
How to Pad n Number of Specified Characters to the Right of a String
//In this function we have not used strcat()
char* padright(char *word,char c,int n)
{
char *t;
int i,j;
for(i=0;i<strlen(word);i++)
t[i]=word[i];
for(j=i;j<n+i;j++)
t[j]=c;
t[j]='\0';
return t;
}
This function will pad the given word at the left with the given number of characters at the right side.
Like if the function is called like padright(chapter0,?,2) will return chapter0??
We can use these building block functions to create the following functions.
How to Trim White Spaces from Both Sides of a Word
//It is assumed that the word passed as a parameter to this function has
//whitespaces on both sides.
char* trim(char *word)
{
char *templ;
char *tempr;
int i = 0;
int index = 0;
for(;i<strlen(word);i++)
{
if(word[i]!=' ')
{
index = i;
break;
}
}
strcpy(templ,trimleft(word,index));//Left whitespaces trimmed
for(i=0;i<strlen(templ);i++)
{
if(templ[i]==' ')
break;
}
strcpy(tempr,trimright(templ,i));//Right whitespaces trimmed
return tempr;
return temp;
}
}
Here is a sample client code snippet that demonstrates the above function pad():
strcpy(t,pad("Love",20,'.',ALLIGN_LEFT));
puts(t);
printf("Length = %d\n",strlen(t));
strcpy(t,pad("Love",20,'.',ALLIGN_CENTER));
puts(t);
printf("Length = %d\n",strlen(t));
strcpy(t,pad("Love",20,'.',ALLIGN_RIGHT));
puts(t);
printf("Length = %d\n",strlen(t));
Love................
Length = 20
........Love........
Length = 20
................Love
Length = 20
How to Remove all the Blank Spaces in a Phrase
char* sweepwspace(char *phrase)
{
int i = 0,j=0;
char temp[100];
for(;i<strlen(phrase);i++)
Strings (Database to DNA!) 193
{
if(phrase[i]!=' ')
{
temp[j]=phrase[i];
j++;
}
}
temp[j]='\0';
return temp;
}
How to Extract all the k-grams of a String for a given k
String* kgrams(char *line,int k)
{
int i = 0;
int count = 0;
char temp[50];
//deleting all the white spaces
strcpy(line,sweepwspaces(line));
String *words = NULL,*ktokens = NULL;
int total = strlen(line)-k-1;
for(;i<=total;i+=k)
{
strcpy(temp,substring(line,i,i+k-1));
words = push_back_String(words,temp);
count++;
if(count == 1)
ktokens = words;
}
return ktokens;
}
k-grams are used for generating the digital fingerprint of intelligent materials like a document. See the
end of chapter on file. There is an algorithm called Winnowing which uses k-grams to detect plagiarism.
Different nearest neighbor identification algorithms, like Near Duplicate Document finding in a web
search uses k grams.
How to check whether a String is a Valid UPC (Universal Product Code)
Code or not
Each product being sold in the market has a different product code. This code is known as Universal
Product Code or UPC in short. This concept was first introduced for making grocery store checkout
194 Data Structures using C
time faster. The first digit in the UPC tells about the type of the product. The last digit (6 in this case) is
the check digit. This digit tells the scanner whether the code had been read properly or not.
The algorithm to check whether UPC is valid or not is popularly known as Module 10 algorithm or as
Luhn Algorith according to its discoverer Hans Peter Luhn.
Step 1: The even place digits are added
Step 2: The number at odd places are multiplied by 3 and then the sum of the digits of all generated
numbers are found.
Step 3: If the total of the numbers obtained in step 1 and step 2 is divided by 10.
Step 3: If the remainder is 0 that means the UPC code is valid. Otherwise it means that the UPC code is
invalid.
int digsum(int number)
{
int sum = 0;
while(number!=0)
{
sum+=number%10;
number/=10;
}
return sum;
}
int EvenSum = 0;
int OddSum = 0;
for(;kgs!=NULL;kgs=kgs->next)
{
if(count%2==0)
//Find digit sum of numbers we get by multiplying the
digits at
//even locations.
EvenSum+=digsum(atoi(kgs->s)*2);
else
OddSum+=atoi(kgs->s);
count++;
}
the card has a proper length for its type or not. For example, if the card is of type VISA then the length
of the card needs to be validated. Once that is done, then it is checked whether the card number starts
with a predefined globally accepted prefix for VISA type cards or not. Once all these criteria are passed
then the number is validated using check digit or Luhn or Module 10 algorithm.
The following two functions implement the algorithm.
int IsValidCheckDigit(char *ccn)
{
String *kgs = kgrams(ccn,1);
int count = 1;
int EvenSum = 0;
int OddSum = 0;
for(;kgs!=NULL;kgs=kgs->next)
{
if(count%2==0)
EvenSum+=digsum(2*atoi(kgs->s));
else
OddSum+=digsum(atoi(kgs->s));
count++;
}
return (EvenSum + OddSum)%10==0;
}
if(strcmpi(type,"American Express")==0)
{
if(strlen(ccn)==15)
{
if(startsWith(ccn,"34")==1 || startsWith(ccn,"37")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Carte Blanche")==0 || strcmpi(type,"Diners Club")==0 )
{
if(strlen(ccn)==14)
{
if(startsWith(ccn,"300")==1
|| startsWith(ccn,"301")==1
|| startsWith(ccn,"302")==1
|| startsWith(ccn,"303")==1
|| startsWith(ccn,"304")==1
|| startsWith(ccn,"305")==1
|| startsWith(ccn,"36")==1
|| startsWith(ccn,"38")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
Strings (Database to DNA!) 197
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Discover")==0 )
{
if(strlen(ccn)==16)
{
if(startsWith(ccn,"6011")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Enroute")==0 )
{
if(strlen(ccn)==15)
{
if(startsWith(ccn,"2014")==1 || startsWith(ccn,"2149")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"JCB")==0 )
{
if(strlen(ccn)==15 || strlen(ccn)==16)
{
if(startsWith(ccn,"3")==1
|| startsWith(ccn,"2131")==1 || startsWith(ccn,"1800")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
l
198 Data Structures using C
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Maestro")==0 )
{
if(strlen(ccn)==16)
{
if(startsWith(ccn,"6")==1 || startsWith(ccn,"5020")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Mastercard")==0 )
{
if(strlen(ccn)==16)
{
if(startsWith(ccn,"51")==1
|| startsWith(ccn,"52")==1
|| startsWith(ccn,"53")==1
|| startsWith(ccn,"54")==1
|| startsWith(ccn,"55")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Solo")==0 )
{
if(strlen(ccn)==16 || strlen(ccn)==18 || strlen(ccn)==19)
{
if(startsWith(ccn,"6334")==1 || startsWith(ccn,"6767")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
Strings (Database to DNA!) 199
return 0;
}
else
return 0;
}
if(strcmpi(type,"Switch")==0 )
{
if(strlen(ccn)==16 || strlen(ccn)==18 || strlen(ccn)==19)
{
if(startsWith(ccn,"4903")==1 ||
startsWith(ccn,"4905")==1
|| startsWith(ccn,"4911")==1
|| startsWith(ccn,"4936")==1
|| startsWith(ccn,"564182")==1
|| startsWith(ccn,"633110")==1
|| startsWith(ccn,"6333")==1
|| startsWith(ccn,"6759")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Visa")==0 )
{
if(strlen(ccn)==13 || strlen(ccn)==16)
{
if(startsWith(ccn,"4")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
return 0;
}
else
return 0;
}
else
return 0;
}
if(strcmpi(type,"Visa Electron")==0 )
{
{
if(strlen(ccn)==16)
{
if(startsWith(ccn,"417500")==1
||startsWith(ccn,"4917")==1 || startsWith(ccn,"4913")==1)
{
if(IsValidCheckDigit(ccn)==1)
return 1;
else
200 Data Structures using C
return 0;
}
else
return 0;
}
else
return 0;
}
toggledsentence[i] = toupper(sentence[i]);
continue;
}
if(sentence[i]>='A' && sentence[i]<='Z')
{
toggledsentence[i] = tolower(sentence[i]);
continue;
}
else
{
toggledsentence[i] = sentence[i];
continue;
}
}
toggledsentence[i]='\0';
return toggledsentence;
}
When called with the following client code snippet
char *t;
strcpy(t,I am here.lets fOlk agAin!);
strcpy(t, ToggleCase(t));
puts(t)
the following output is generated.
i AM HERE.LETS FoLK AGaIN!
How to Calculate the Frequency of a given Word in a Sentence
int WordFrequency(char *sentence,char *word)
{
int freq = 0;
char *csen;
strcpy(csen,sentence);
String *words = split(csen," ");
for(;words!=NULL;words=words->next)
if(strcmp(words->s,word)==0)
freq++;
return freq;
}
To know how this function can be called, see the next question.
How to Display the Word Histogram of a given Sentence
void WordHistogram(char *sentence)
{
char *longsentence;
char *mostusedword;
strcpy(longsentence,sentence);
String *words = split(sentence," ");
for(;words!=NULL;words=words->next)
printf("%s %d\n",words->s,WordFrequency(longsentence,
words->s));
}
Here is a client code snippet that demonstrates how to use the above
function.
202 Data Structures using C
Anger 1
is 2
never 1
without 1
a 2
reason 1
but 1
it 1
is 2
seldom 1
with 1
a 2
good 1
one 1
Try Yourself: As you can see this program shows same word multiple times if it occurs multiple times
in the sentence. Create an user defined Word-Frequency List to remove this duplication.
How to Find the Most Used Word in a Sentence/Phrase/String
char* MostUsedWord(char *sentence)
{
char longsentence[100];
int maxf;
char mostusedword[30];
strcpy(longsentence,sentence);
String *words = split(sentence," ");
maxf = WordFrequency(longsentence,words->s);
for(;words!=NULL;words=words->next)
if(maxf<WordFrequency(longsentence,words->s))
strcpy(mostusedword,words->s);
return mostusedword;
}
When called with the following client code snippet,
char *t,*s;
strcpy(t,"If you miss the train I'm on you will know that I am gone");
strcpy(s,MostUsedWord(t));
puts(The most used word is );
puts(s);
the following output is generated.
The most used word is
you
How to Find whether a Word/Phrase is an Anagram of Another One
An anagram is a different combination of same characters that occur in a word or phrase. While compar-
ing whether two words/phrases are anagrams of one another or not we check for the frequencies of all
the characters in two strings. If they match, then the two strings are anagram of one another. Punctuations
and special characters are discarded. For example Great Taste! and Gear at Test! are the anagrams
of one another. While generating anagrams all the punctuations are discarded.
Strings (Database to DNA!) 203
}
count = 0;
i=0;
//Creating the histogram of characters of the second string
for(;i<strlen(diffcomb);i++)
{
cf.freq = 0;
cf.c = diffcomb[i];
for(j=0;j<strlen(diffcomb);j++)
{
if(diffcomb[i]==diffcomb[j])
{
cf.freq++;
}
}
204 Data Structures using C
if(!Contains(t1,cf.c))
{
t = push_back(t,cf);
}
count++;
if(count==1)
t1 = t;
}
}
//Voila! Now we have both the histogram of strings. So we are ready to run
//through these histograms to see if they match or not.
for(;s1!=NULL;s1=s1->next)
{
temp = t1;
//Two words which are anagram of one another
//s=doctorwho
//d-1
//o-3
//c-1
//t-1
//r-1
//w-1
//h-1
//t=torchwood
//t-1
//o-3
//r-1
//c-1
//h-1
//w-1
//d-1
for(;temp!=NULL;temp=temp->next)
{
if(s1->c==temp->c)
{
flag=1;
//Well, we have the same character in both the strings
//but their frequencies dont match. So they are not anagram
//of one another
if(s1->freq!=temp->freq)
{
anagram = 0;
break;
}
}
}
}
return anagram;
}
Strings (Database to DNA!) 205
4.25 HOW TO FIND OUT THE SOUNDEX CODE FOR A GIVEN WORD
A soundex code is the phonetic code of the word. If the soundex codes of two words are same, that
means they are pronounced same way. This soundex algorithm was developed for indexing surnames in
a census.
Here is the soundex algorithm:
l Delete all occurrences of a,e,i,o,u,w,h and y from the word
l The first character of the soundex code will be the same as that of the words first character.
l If two consecutive characters are same, then the first one will be considered and the rest all will be
ignored.
l If the character encountered in the word is either b,f,p or v put 1 in the corresponding soundex
code of the word.
l If the character encountered in the word is either c,k,x,q,g,z,j, or s put 2 in the corresponding
soundex code of the word.
l If the character encountered in the word is either d or t put 3 in the corresponding soundex code of
the word.
l If the character encountered in the word is l put 4 in the corresponding soundex code of the word.
l If the character encountered in the word is either m or n put code 5 in the corresponding soundex
code of the word.
l If the character encountered in the word is r put 6 in the corresponding soundex code of the word.
th
l Find the substring of the Soundex code till the 4 character.
l If there are fewer than 4 characters in the soundex code, including the initial letter, then pad the
right of the soundex code with zeros.
//The following function returns the soundex code for a given word.
char* SoundexCode(char *word)
{
int i,j;
char temp[30];
temp[0]=word[0];
for(i=1,j=1;i<strlen(word);i++)
{
if(word[i]==word[i-1])
i++;//This is to avoid repetition of characters
if(word[i]!='a' || word[i]!='e' || word[i]!='h' ||
word[i]!='i'
|| word[i]!='o' || word[i]!='u' || word[i]!='w' ||
word[i]!='y')
{
if(word[i]=='b' || word[i]=='f' || word[i]== 'p' ||
word[i]== 'v')
{
temp[j]='1';
j++;
}
if(word[i]=='c' || word[i]=='g' || word[i]=='j'
|| word[i]=='k' || word[i]=='q' || word[i]=='s'
|| word[i]=='x' || word[i]=='z')
{
temp[j]='2';
j++;
}
if(word[i]=='d' || word[i]=='t')
206 Data Structures using C
{
temp[j]='3';
j++;
}
if(word[i]=='l')
{
temp[j]='4';
j++;
}
if(word[i]=='m' || word[i]=='n')
{
temp[j]='5';
j++;
}
if(word[i]=='r')
{
temp[j]='6';
j++;
}
}
}
temp[j]='\0';
strcpy(temp,substring(temp,0,3));
return temp;
}
How to Check whether two Words are Pronounced Same or not Using Their
Soundex Codes
int isSameSoundex(char *word,char *suspectedhomonym)
{
if(strcmp(word,suspectedhomonym)==0)
return 1;
else
return 0;
}
So build and billed both have the same soundex code b430 (Blank means zero) and thats
because they are pronounced the same way. If you want to test the above algorithm for different strings,
visit Alan Coopers most extensive list of homonyms http://www.cooper.com/alan/homonym_list.html
The above algorithm works fine for almost all these strings listed here. The algorithm doesnt work for
those homonyms that start with different letters and pronounced the same way somehow because it
retains the first letter of the word.
REVIEW QUESTIONS
PROGRAMMING PROBLEMS
1. Write a function to check whether a string matches the single wildcard pattern or not.
2. Write a program to demonstrate Naïve string search algorithm. This is also known as Brute Force
Algorithm.
3. Write a program to demonstrate Karp-Robin Algorithm.
4. Write a program to demonstrate Boyer Moore string search algorithm.
5. Write a program to demonstrate Boyer Moore Horspool string search algorithm.
6. What do you mean by string distance?
7. Write a program to calculate Hamming distance of two strings.
8. Write a program to calculate the levensthian distance of two strings.
9. What do you mean by similarity of two strings?
10. Write a function ExtractDigits() that extracts the digits from a string.
11. Write a function isValidIdentifier() that accepts a string as a variable name and returns 1, if the
name can be a valid C Identifier else it returns 0.
12. Write a function isCamelCase() that returns 1, if the string passed as argument is a valid C identi-
fier written in Camel Case.
13. Write a function isPascalCase() that returns 1, if the string passed as argument is a valid C identi-
fier written in Pascal Case.
14. Write a function isAComment() that return 1, if the string passed is a valid C++ single line com-
ment or C Style single line comment. Otherwise the function should return 0.
15. Write a function GetTense() that returns the tense of the sentence that is passed as the argument.
For example, the call like GetTense(I shall have been learning PHP next year by this time) will
return Future Perfect Continuous and a call like GetTense(I shall have finished my dinner by
then) will return Future Perfect.
16. Write a function Metaphone() to find the metaphone of a string.
5
Recursion
Time and Again!
INTRODUCTION
Recursion as the name suggests means re occurrence of the same method. Something that is occurring
over and over again is recurrence. In this chapter we will learn about different types of recursion and
then we will see how this technique is applied to solve different types of problems starting from solving
quadratic equations to creation of recursive fractals like Serpinski triangle, Monge Sponge, etc.
Tail Recursion A recursive procedure where the recursive call is the last action to be taken by the
function. Tail recursive functions are generally easy to transform into iterative functions.
In words: you start with 0 and 1, and then produce the next Fibonacci number (Fn) by adding the two
previous Fibonacci numbers:
Fig. 5.2
Fibonacci numbers have some very interesting properties. Here are some:
Lucas Theorem
Shifting Property
Fm+n = Fm ◊ Fn+1 + Fm1 ◊ Fn
#include <stdio.h>
Example 5.4 Write a program that calculates the numbers of a Fibonacci type series. Write a
function that will accept three values. The first two are the seeds (Like for Fibonacci numbers the
first two are 0 and 1) and the number of terms up to which the loop rotates.
Solution
#include <stdio.h>
Example 5.5 Write a program to find the greatest common divisor of two numbers, using
Euclids algorithm.
Solution Almost 2500 years ago, Euclid discovered an algorithm to find the greatest common factor
or the test common divisor of two numbers.
int gcf(int x,int y)
{
if(y ==0)
return x;
else
return gcf(y,x%y);
}
Notice that this is a Tail Recursive Algorithm.
Example 5.6 Write a program to find the GCD using Joseph Steins Algorithm. This method is
also known as binary method.
Solution Euclids algorithm involves division and thus it is computationally expensive. A quite differ-
ent GCD algorithm specially suited for the binary number was first proposed by Joseph Stein is given by
If u and v are both even then
GCD(u,v) = 2* GCD(u/2,v/2)
If u is even and v is odd then
GCD(u,v) = GCD(u/2,v)
If u and v are both odd then u-v is even so,
GCD(u,v) = GCD(u-v,v)
212 Data Structures using C
if(n!=0)
{
printf("%d\n",(x*x+x)%eto2);
n--;
x = (x*x+x)%eto2;
coveyou(a,m,c,x,n);
}
else
return;
}
}
int von(int n)
{
214 Data Structures using C
int main()
{
int i=0;
int seed = 123;
for(;i<40;i++)
{
printf("%d --> %d\n",i,seed);
seed = von(seed);
}
getch();
return 0;
}
Here is the output of the above program.
0 --> 123
1 --> 512
2 --> 621
3 --> 856
4 --> 327
5 --> 69
6 --> 761
7 --> 791
8 --> 256
9 --> 553
10 --> 58
11 --> 364
12 --> 324
13 --> 49
14 --> 401
15 --> 608
16 --> 696
17 --> 844
18 --> 123
19 --> 512
20 --> 621
21 --> 856
22 --> 327
23 --> 69
24 --> 761
25 --> 791
Recursion (Time and Again!) 215
26 --> 256
27 --> 553
28 --> 58
29 --> 364
30 --> 324
31 --> 49
32 --> 401
33 --> 608
34 --> 696
35 --> 844
36 --> 123
37 --> 512
38 --> 621
39 --> 856
The seed is chosen such that the repetition characteristics of PRNs generated through this scheme can
be well proved. If you notice the above curves and highlighted entries, you will find that each of the
PRN is repeating itself at every 18th iterations. Or in other words 1st and 18th iteration values are same,
2nd and 20th iterated values are same. (Look at the above output).
Ideally the seed should not have any zero amongst its digits, because that will kill the sequence even
faster. Here is an example output where the seed was 108:
0 --> 108
1 --> 166
2 --> 755
3 --> 700
4 --> 900
5 --> 100 (Border Line PNR)
6 --> 0
7 --> 0
8 --> 0
9 --> 0
See anything after 5th iteration is zero.
Even using nonzero digits in the initial seed doesnt guarantee a good sequence. See the following
example output generated with seed 222.
0 --> 222
1 --> 928
2 --> 611
3 --> 733
4 --> 372
5 --> 383
6 --> 466
7 --> 171
8 --> 924
9 --> 537
10 --> 883
11 --> 796
12 --> 336
13 --> 128
14 --> 638
15 --> 70
216 Data Structures using C
16 --> 900
17 --> 100
18 --> 0
19 --> 0
return n+1;
return ack(m-1,n);
return ack(m-1,ack(m,n-1));
Have you noticed the bold line above in the code? This type of calling is known as Nested Recursion
because one of the arguments in this call actually calls the ack() function again.
if (x<=y)
return z;
else
void Number2Words(number)
{
char *one2nine[9]=
218 Data Structures using C
{
"ONE",
"TWO",
"THREE",
"FOUR",
"FIVE",
"SIX",
"SEVEN",
"EIGHT",
"NINE"
};
char *tens[10]=
{
"TEN",
"TWENTY",
"THIRTY",
"FOURTY",
"FIFTY",
"SIXTY",
"SEVENTY",
"EIGHTY",
"NINETY"
};
char *teens[9]=
{
"ELEVEN",
"TWELVE",
"THIRTEEN",
"FOURTEEN",
"FIFTEEN",
"SIXTEEN",
"SEVENTEEN",
"EIGHTEEN",
"NINETEEN"
};
char *modifiers[]={"HUNDRED","THOUSAND"};
int copy=0;
int digits;
int dig[5];
int i=0;
int isTeens=0;
int isZero=0;
int noZeros=0;
int isAlreadySet=0;
copy = number;
digits = countdigits(number);
if(number==0 && isAlreadySet==0)
{
Recursion (Time and Again!) 219
printf("ZERO");
isAlreadySet=1;
}
if(digits==1 && isAlreadySet==0)
{
printf("%s ",one2nine[number-1]);
isAlreadySet=1;
}
while(number!=0)
{
dig[i]=number%10;
if(dig[i]==0 && digits>2 && isAlreadySet==0)
{
noZeros++;
isZero=1;
}
number/=10;
i++;
}
if(digits==2 && isZero==0 && copy>=21
&& copy<=99 && isAlreadySet==0)
{
printf("%s %s",tens[dig[1]-1],
one2nine[dig[0]-1]);
isAlreadySet=1;
}
if(digits==3 && isAlreadySet==0)
{
if(dig[1]==1)
isTeens=1;
if(digits==3
&& isTeens==0
220 Data Structures using C
&& isZero==0
&& isAlreadySet==0)
{
printf("%s HUNDRED AND %s %s ",
one2nine[dig[2]-1],
tens[dig[1]-1],one2nine[dig[0]-1]);
isAlreadySet=1;
}
if(digits==3
&& isTeens==1
&& isZero==0
&& isAlreadySet==0)
{
printf("%s HUNDRED AND %s ",
one2nine[dig[2]-1], teens[dig[0]%10-1]);
isAlreadySet=1;
}
if(digits==3
&& noZeros==2
&& isAlreadySet==0)
{
if(dig[0]==0 && dig[1]==0)
{
printf("%s HUNDRED ",one2nine[dig[2]-1]);
isAlreadySet=1;
}
}
if(digits==3 && noZeros==1 && isAlreadySet==0)
{
if(dig[1]==0)//102
printf("%s HUNDRED AND %s ",
one2nine[dig[2]-1],
one2nine[dig[0]-1]);
if(dig[0]==0)//210
printf("%s HUNDRED AND %s
",one2nine[dig[2]-1],
tens[dig[1]-1]);
isAlreadySet=1;
}
}
}
int main()
{
int number=0;
do
{
int number=0;
printf("Enter a number :");
scanf("%d",&number);
if(number>100000)
Recursion (Time and Again!) 221
{
Number2Words(number/100000);
printf("LAKH ");
number=number%100000;
}
if(number>1000)
{
Number2Words(number/1000);
printf("THOUSAND ");
number=number%1000;
}
if(number>0)
{
Number2Words(number);
}
}while(1);
}
Have you noticed that the program only has code to handle up to 3 digit number but it recursively
calls itself so that it can work properly to convert the bigger integers.
xk +1 = ÊÁ xk + A ˆ˜
1
2Ë xk ¯
Here is the C code:
#include <stdio.h>
#include <conio.h>
#include <math.h>
return r;
else
r = root(A,r,n,iter);
return r;
}
int main()
{
printf("Root is :%f",root(18,4,2,20));
return 0;
}
Example 5.12 Write a Program to find
the root of a function using NewtonRaphson
Method. This method is recursive in nature.
Solution When the derivation of f (x) is a sim-
ple expression and easily found, the real roots
of f (x) = 0 can be computed rapidly by a process
called Newton-Raphson method as shown in the
above figure:
#include <stdio.h>
#include <conio.h>
#include <math.h>
Fig. 5.3
double fun(double x)
{
//Define your Function here
return pow(x,3)-x-4;
}
double dfun(double x)
{
//Define the derivative of the function here
return 3*pow(x,2)-1;
}
}
int main()
Recursion (Time and Again!) 223
{
printf("Root is : %f",NewtonRaphson(2,10));
return 0;
}
Example 5.13 Write a program to find the root of a
function using bisection method.
Solution The method of bisection to find the roots of non-
linear equations is old and trivial and in a way inefficient. If
for two values of x, a and b the two values of y, y(a) and y(b)
have different signs, then, there exists a root of f (x) between a
and b. As the name suggests, method of bisection, bisects this
span and the point of bisection replaces either a or b depend-
ing on the sign of f (x) at point of bisection. For example if
f (a) is positive and f (b) is negative then there is a root be-
tween a and b. According to bisection method the close ap-
proximation of the root is x = (a+b)/2. Now if f (x) is positive
then a will be replaced by else b will be replaced by x. If f (x)
= 0, then x is a perfect root of f (x).
#include <stdio.h>
#include <conio.h>
#include <math.h> Fig. 5.4
double fun(double x)
{
//Define your Function here
return pow(x,3)-x-4;
}
int main()
{
printf("Bisection :%f\n",bisection(1,2,10));
return 0;
}
224 Data Structures using C
Example 5.15 Write a program to find the value of a function using Inverse Quadratic Interpo-
lation.
Fig. 5.6
Solution This method is not used individually but is used in hybrid numerical in Dekkers and Brents
method. The concept here, is first the inverse quadratic interpolation found at a point starting from three
initial guesses xn,xn-1 and xn-2 using Lagranges distribution. Let that point be f ^(1) (y) then the solu-
tion for y = f (x) = 0 is given by the following equation.
f n -1 f n fn - 2 fn
xn+1 = xn - 2 + xn -1
( f n - 2 - f n -1 ) ( f n - 2 - f n ) ( fn -1 - fn - 2 ) ( fn -1 - fn )
fn - 2 fn -1
+ x
( fn - fn - 2 ) ( fn - fn -1 ) n
Heres the function that calculates the value of Xn+1
double inversequad(double a,double b,double c)
{
double first=0,second=0,third=0,next=0;
double ab=
first=(fun(b)*fun(c)*a)/(((fun(a)-fun(b))*((fun(a)-fun(c)));
second=(fun(a)*fun(c)*b)/(((fun(b)-fun(a))*((fun(b)-fun(c)));
third=(fun(a)*fun(b)*c)/(((fun(c)-fun(a))*((fun(c)-fun(b));
226 Data Structures using C
next = first+second+third;
return next;
}
Note that fun() is the function that evaluates the function. You can
write your own functions.
Example 5.16 Write a program to find the root of a function using Secant Method.
Solution Newtons formula has a basic disad-
vantage. It involves differentiation of the func-
tion being sought. A secant or a chord of a curve
is defined as a straight line that intersects the
curve at least twice. This method calculates the
root of a nonlinear equation using the secant.
See in the above figure, the secant through x0
and x1 meets the x axis at x2. This will be the
new approximation of the root of the curve. This
process continues until we reach the desired level
of accuracy. The recursive relation in the figure
above gives the n+1th approximation of the root.
As you can probably understand from the fig-
ure above that if x0 and x1 are actually close to
the actual root, the rate of convergence will be
very high. Secant method is the base for Mull-
ers method which tries to approximate the curve
by a parabola instead of a straight line. Thus
Muller method is more accurate and its conver- Fig. 5.7
gence rate is comparable to that of Newton
Raphsons method which is really fascinating.
Here is the C function that solves the secant method.
double SecantMethod(double xn_1, double xn, int m)
{
double d;
d = (xn - xn_1)* fun(xn) / (fun(xn) - fun(xn_1)) ;
m--;
if(m==0)
return xn;
else
{
xn_1 = xn;
xn = xn - d;
d = SecantMethod (xn_1,xn,m);
}
return xn;
}
Recursion (Time and Again!) 227
Fig. 5.8
Solution Mullers method is a close cousin of secant method where at each iteration the curve is tried
to be mapped using a parabola that passes through the three points. xk, xk 1 and xk 2. Using Netwons
divided difference formula the curve can be approximated as the following equation below.
y = f (xk) + (x xk) f [xk, xk 1] + (x xk) (x xk 1)f [xk, xk 1, xk 2]
where f [xk, xk 1] and f [xk, xk 1, xk 2] denote divided differences. This can be rewritten as
y = f (xk) + w(x xk) f [xk, xk 1, xk 2](x xk)2
where
w = f [xk , xk 1] + f [xk, xk 2] f [xk 1, xk 2]
The next value in the iteration process is given by the root of the quadratic equation y = 0. This yields
the recurrence relation
2 f ( xk )
xk + 1 = xk -
w ± w 2 - 4 f ( xk ) f [ xk , xk -1 , xk - 2 ]
Note than xk + 1 should be one step close to the sought root than xk. So the denominator value for the
right half should be maximum. The sign + or is chosen depending on this logic.
Here is the C code. Here we are trying to solve the same equation as above.
228 Data Structures using C
int main()
{
printf("Muller :%f\n",Muller(0,1,2,10));
return 0;
}
Recursion (Time and Again!) 229
Fig. 5.9
The most fascinating fact about Mullers method is that it achieves almost the same level of conver-
gence as that of Newton-Raphson method without the need of differentiation. So from a computation
point of view, Mullers method is the best of all these methods discussed above. Muller method acquires
up to 2 decimal place accuracy fairly quickly (maximum by 5 iterations). In the above figure we see that
values from Newton-Raphson method are slightly more but the difference between values we get from
Newton-Raphson and Mullers method become steady after 4 iterations. If we notice carefully we will
see that Newton-Raphson method also took that time to stabilize. So we can conclude that Mullers
method accuracy and stability is comparable with Newton-Raphsons method and as an advantage it
doesnt require the knowledge of derivative of the function whose roots are being sought.
#include <stdio.h>
#include <conio.h>
#include <math.h>
int count=0;
int squareextractsum(int n)
{
int s=0;
int sum=0;
if(count==1)
s=n*n;
else
s = n;
while(s!=0)
{
sum+=(int)pow(double(s%10),2);
s/=10;
}
return sum;
}
int isHappy(int n)
{
if(n==1)
return 1;
if(n==4 || n==16 || n==20
||n==37 || n==42 || n==58
||n==89 || n==145)
return 0;
else
{
n=squareextractsum(n);
return isHappy(n);//Tail Recursive call
}
}
int main()
{
printf("%d \n,isHappy(230));
return 0;
}
This will output 1 as 230 is a happy number.
Try Yourself: Try to write a program that prints all happy numbers in a given range.
Example 5.19 Write a program to generate a section number pattern.
Say there are 2 big sections in a chapter and under each big section there are 5 small sections.
Write a program for printing the pattern
Recursion (Time and Again!) 231
1.1
1.2
:
:
:
2.5
Solution Here is the C code which recursively generates the pattern:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
int big=0,small=0;
int i=1,j=1;
int count=0;
FILE *fp;
2.1
2.2
2.3
2.4
2.5
and so on
*/
fprintf(fp,"%d.%d\n",i,j);
j++;
if(j<=small)
generateSectionNumberFile(file,i,j);
if(j>small && i<big)
{
j=1;
i++;
generateSectionNumberFile(file,i,j);
}
if(i>big && j>small)
fclose(fp);
int main()
{
232 Data Structures using C
char file[20];
printf("Number of Big sections :");
scanf("%d",&big);
printf("Number of sections under each Big sections :");
scanf("%d",&small);
printf("Enter the file name :");
fflush(stdin);
scanf("%s",file);
fp = fopen(file,"a");
generateSectionNumberFile(file,1,1);
return 0;
}
As you can see, this function can be extended to any level of hierarchy. For example this method can
be enhanced to generate a pattern like
1.1
1.1.1
1.1.2
1.2
1.2.1
1.2.2
The output of the above program when run with 2 and 4 as the inputs for Big and small will generate
a file with pattern
1.1
1.2
1.3
1.4
2.1
2.2
2.3
2.4
else
{
if(position == 1)
return compute_bell(row-1,row-1);
else
return compute_bell(row,position-1)+
compute_bell(row-1,position-1);
}
}
Ê x x2 5 x3 ˆ
e e = e Á1 + 1 + 2
x
+ + ...˜
Ë 1! 2! 3! ¯
else
return
compute_bernoulli(row-1,position-1)
+compute_bernoulli(row-1,position);
}
int catalan_sum(int r)
{
int i=1;
int sum=0;
for(i=1;i<=r;i++)
sum+=compute_catalan(r,i);
return sum;
}
1 Ê 2 nˆ ( 2 n )!
Cn = Á ˜ = for n ≥ 0
n + 1 Ë n ¯ ( n + 1)!n !
Here is the C Program that will generate nth Catalan Number:
int Catalan(int n)
{
return nCk(2*n,n)/(n+1);
}
Fig. 5.11
Recursion (Time and Again!) 237
A Dyck Path is a staircase walk from (0, 0) to (n, n) that lies strictly below the reverse diagonal which
is inclined at 45 degree to the horizon. The number of Dyck paths of order n is given by nth Catalan
Number. See the above image to understand the concept better.
The first box in the above image corresponds to a staircase where there is only one stair. So the
number of Dyck path is 1 for n = 2 it is the second catalan number 2, and for 3 it is the third Catalans
Number = 5, and so on.
}
238 Data Structures using C
else
{
return compute_Leibnitz(row-1,position-1)
-compute_Leibnitz(row,position-1);
}
}
This function calculates Leibnitz Harmonic Number at row and position.
Example 5.20 Write a program to find the Entringer numbers using recursion.
Solution
E(n,k) = E(n,k-1) + E(n-1,n-k)
This relation gives the number of
Where, E(0,0) = 1 and E(n,0) = 0
Here is the C code to generate the Entringer numbers.
int compute_Entringer(int n,int k)
{
if(n==0 && k==0)
return 1;
if(k==0)
return 0;
else
return compute_Entringer(n,k-1)
+compute_Entringer(n-1,n-k);
}
Example 5.21 Write a program to display the numbers of Hofstadter Conway $10,000 Sequence.
Solution Hofstadter Conway discovered a sequence which is recursive in nature and is given by the
following equation:
A(n) = A(A(n-1)) + A(n-A(n-2))
void display_HConway(int n)
{
240 Data Structures using C
for(i=1;i<=n;i++)
printf("%d \n",compute_HConway(i));
}
Write functions to generate and display the numbers of the Mallows Sequences
int compute_Mallows(int n)
{
if(n==1 || n==2)
return 1;
else
return compute_Mallows(compute_Mallows(n-2))
+compute_Mallows(n-compute_Mallows(n-2));
}
void display_Mallows(int n)
{
for(i=1;i<=n;i++)
printf("%d \n",compute_Mallows(i));
}
Example 5.22 Write a program to generate Hofstadters Q Sequence. These numbers are also
known as Q- Numbers.
Solution This is a recursive sequence and generated by the recurrence equation
Q(n) = Q(n-Q(n-1)) + Q(n-Q(n-2))
with Q(1) = Q(2) = 1 . The first few values are 1, 1, 2, 3, 3, 4, 5, 5, 6, 6, .
Here is the C code to generate and display the Q-number
int computeQ(int n)
{
if(n==1 || n==2)
return 1;
else
{
return computeQ(n-computeQ(n-1))
+computeQ(n-computeQ(n-2));
}
}
void display_Q(int n)
{
f (i 1 i i )
for(i=1;i<=n;i++)
printf("%d \n",computeQ(i));
}
for(i=1;i<=rows;i++)
{
for(m=0;m<(51/2)-i;m++)
printf(" ");
for(j=1;j<=i;j++)
{
n = compute_pascal(i,j);
if(n%2==0)
{
printf("%c ",'1');
}
else
{
printf("%c ",'0');
}
}
printf("\n");
}
}
Note that we have used compute_pascal( ) function described above.
This program generates a pattern like
Fig. 5.12
Fig. 5.13
Can you find similarity between the generated pattern and the fractal image?
242 Data Structures using C
for(int i=0,j=0;i<strlen(start);i++)
{
if(start[i]=='A')
{
temp[j]='A';
j++;
temp[j] = 'B';
j++;
}
if(start[i]=='B')
{
temp[j] = 'A';
Recursion (Time and Again!) 243
j++;
}
}
if(strcmpi(start,"ABAABABA")==0)
//The function is Tail-Recursive
return algee(start);
}
int main()
{
puts(algee("A"));
getch();
return 0;
}
Many recursive fractals or some of their variations can be generated using this L-System.
Fig. 5.16
One dimensional mid-point displacement is a great algorithm for drawing a ridgeline, as mountains
might appear on a distant horizon. Here is how it works.
Start with a straight line as shown in the figure above
Repeat for a sufficiently large number of times
{
Repeat over each line segment in the scene
{
Repeat Find the mid point of the line segment
Repeat Displace the midpoint in Y by a random amount
Repeat Reduce the range for random numbers,
}
}
How much we want to reduce the range for random numbers depends on how rough we want the
mountain to look like. The more we reduce it in each pass through the loop, the smoother the resulting
ridgeline will be.
Note two things about the above algorithm. First, it is recursive in nature; and secondly, it can create
fairly complex image with very low level details in
few simple steps.
The realization that a small, simple set of instruc-
tions can create a complex image has lead to research
in a new field known as fractal image compression.
The idea is to store the simple, recursive instructions
for creating the image rather than storing the image
itself. This works great for images which are truly
fractal in nature, since the instructions take up much
Recursion (Time and Again!) 245
less space than the image itself. Chaos and Fractals, New Frontiers of Science 3 has a chapter and an
appendix devoted to this topic and is a great read for any fractal nut in general.
Without much effort, you can read the output of this function into a paint program and come up with
something like this:
REVISION OF CONCEPTS
REVIEW QUESTIONS
1. What is nth Ackermanns number? Where n is the product of two consecutive primes starting with
1,2.
2. What is Hilbert Curve. What are the steps to create one recursively.
246 Data Structures using C
PROGRAMMING PROBLEMS
INTRODUCTION
Some real-life scenarios can be modeled easily with a first in, last out list. In these types of situations a
stack will be the ideal data structure. A stack can be designed either by using arrays or by using linked
list. The basic operations possible on a stack are popularly known as push and pop. All these operations
are described in this chapter. Stack is a simple data structure but it has tremendous usages in the software
industry. Starting from a simple postfix calculator to a complicated XML reader, stacks find their way.
All these diverse applications of this simple data structure have been discussed at length in this chapter.
}MyStack;
Here data [MAX] is an integer array that can hold 10 integers in the stack. Count is the variable that
holds the current number of elements in the stack. Typedef has been used so that from now on MyStack
can be used as a built-in data type.
248 Data Structures using C
Fig. 6.1
{
printf("MyStack is full!\n");
return 0;
}
else
{
printf("Enter the number :");
scanf("%d",&st->data[st->count]);
sc++;
printf("Successfully pushed");
return 1;
}
}
Since the counting of the elements for the stack starts from 1 (Empty Stack), the total number of
elements in the stack when the stack is full is given by MAX 1. This method returns 1 in case the
pushing of elements is successful, otherwise it returns 0. In case, the stack is not full, the element entered
is put at the last of the array in the location st->count.
Here the method itself asks for the element to be added at the top of the stack. But we can pass the
element to the method.
6.3 HOW TO POP THE MRA ELEMENT FROM THE ABOVE STACK
Deleting the last element from the stack is popularly known as popping of Most Recently Added(MRA)
element. If the stack is empty and we want to pop the MRA element, then the system will display the
message Stack is empty, Underflow Error. Otherwise it will pop the last element. from the Stack. Here
is the code to pop the last element from the stack defined above.
int pop(MyStack *st)
{
if(st->count<0)
printf("MyStack is Empty!\nUnderflow Error");
else
{
printf("Last number is popped\n");
sc--;
}
return st->data[st->count-1];
}
This method returns the popped element to the calling method.
else
printf("Top = %d\n",st->data[(st->count)-1]);
}
Now it checks whether the Stack is empty or not. If the Stack is not empty, then it displays the last
element of the array.
#define MAX 10
}MyStack;
MyStack ms;
MyStack *s = &ms;
int sc=-1;
//Copy and paste each methods code here that are listed above
Fig. 6.2
When we represent the stack using a linked list, pushing an element onto the top of the stack involves
two activities. First of all, we have to create enough space using malloc() method to insert one more
node in the list. Then the next address of the new node is assigned with the previous last node. This is
done because stack is a LIFO list, so MRA element will be at the top. Here is the code of the function
that pushes an element on to the stack top.
Stack * push(Stack *rec)
{
Stack *new_rec;
printf("Enter the value to push :");
new_rec = (Stack *)malloc(sizeof(Stack));
scanf("%d",&new_rec->info);
new_rec->next=rec;
rec = new_rec;//The old is now new
return rec;
}
This method returns the pointer to the last node. malloc() returns a void * . So it needs to be type
casted to the Stack * type.
Fig. 6.3
if(rec==NULL)
printf("Stack is Empty");
else
{
temp=rec->next;
free(rec);//Freeing up memory space
rec = temp;
}
return rec;
}
From this method also the pointer to the top element is returned.
Fig. 6.4
else
{
//Swapping the values of the top 2 nodes
temp = (rec->next)->info;
(rec->next)->info = rec->info;
rec->info = temp;
}
return rec;//returning the pointer to the top element
Example 6.1 Write a program using stack to evaluate a postfix expression. Assume that the
postfix expression can only take +,-,*,/ and all the parameters are from 0 to 9. The program will work
as follows. A postfix expression will be entered from the console by the user and then the program
will calculate the value of the expression and displays that on console. But the program doesnt check
whether the supplied postfix expression is valid or not. Model the stack
using linked list.
Algebraic expressions such as (A+B)/(CD) have an inherent tree-
like structure. For example, the figure below is a representation of the
expression in equation (A+B)/(CD). This kind of tree is called an ex-
pression tree.
If we travel the tree in-order we get infix notation. If we travel the tree
in post-order then we get the postfix expression. Verify that the expres-
sion tree values and the postfix expression match below. Fig. 6.5
Stack (One upon Another) 255
Fig. 6.6
Solution
//This shows how to use a Stack to evaluate the given postfix
//expression.
#include <stdio.h>
#include <conio.h>
#include <malloc.h>
#include <ctype.h>
Stack *start;
int main()
{
char *postfix;
int first=0;
int second=0;
int i=0;
int result=0;
int temp=0;
clrscr();
printf("Please enter a Postfix Expression :");
scanf("%s",postfix);
for(i=0;i<strlen(postfix);i++)
{
if(!isop(postfix[i]))
{
printf(Enter value of %c,postfix[i]);
Stack (One upon Another) 257
scanf(%d,&temp);
start = push(start,temp);
}
else
{
second = peep(start);
start = pop(start);
first = peep(start);
start = pop(start);
if(postfix[i]=='+')
{
result=0;
result+=(first+second);
start = push(start,result);
}
if(postfix[i]=='-')
{
result=0;
result+=(first-second);
start = push(start,result);
}
if(postfix[i]=='*')
{
result=0;
result+=(first*second);
start = push(start,result);
}
if(postfix[i]=='/')
{
result=0;
result+=(first/second);
start = push(start,result);
}
//Add your own custom defined operators
//Like ^ for exponentiation and so on
}
}
printf("Result is = %d",peep(start));
getch();
return 0;
}
A sample run of the program.
Please enter a Postfix Expression :ab+cd-/
Enter value of a 80
Enter value of b 20
258 Data Structures using C
Enter value of c 5
Enter value of d 1
Result is = 25
printf("Missing {\n");
pop(s);
}
}
}
if(!isEmpty(s))//We have reached the end of file
printf("} Missing\n");
Fig. 6.7
Fig. 6.8
Here is the code in C to solve the problem using stack.
#include <stdio.h>
#include <conio.h>
#include <malloc.h>
#include <ctype.h>
#include <stdlib.h>
Stack *start;
else
{
temp = (rec->next)->info;
(rec->next)->info = rec->info;
rec->info = temp;
}
return rec;
}
void display(Stack *rec)
{
if(rec==NULL)
printf("Stack is empty.");
else
{
while(rec!=NULL)
{
printf("Info = %d Address : %u Next Address :
%u\n",rec->info,rec,rec->next);
rec = rec->next;
}
}
}
else
return 0;
}
int main()
{
int i;
int pins;
int net[]={1,2,3,4,5,6,7,4};
printf("How many pins :");
scanf("%d",&pins);
for(i=0;i<pins;i++)
{
if(isempty(start))
{
start = push(start,i);
}
else
{
if(net[i]==net[peep(start)])
{
printf("%d",peep(start));
start = pop(start);
}
262 Data Structures using C
else
{
start = push(start,i);
}
}
}
if(isempty(start))
printf("Switchbox is routable");
else
printf("Switchbox is not routable");
return 0;
}
Example 6.3 Write a program to find out whether a given string is a palindrome or not using a
stack. Ignore white space and other characters. Check your program with the following inputs .
Madam I am Adam and was it a cat I saw and Level. The program should be case insensitive.
Solution Here is the code that uses stacks to find whether the string entered is a palindrome or not.
The program skips all the white spaces. The stack is modeled using a linked list, because the string can
be of any length. Three stacks are used to perform this operation
Here is the illustration. People speak about so many palindromes. I have used one of my favorites.
Fig. 6.9
#include <stdio.h>
#include <conio.h>
#include <malloc.h>
#include <ctype.h>
#include <stdlib.h>
Stack (One upon Another) 263
Stack *start;
Stack *rstart;
Stack *cstart;
int isaplha(char c)
{
//Checking whether c is a alphabet or not.
//White space is also not allowed. ASCII of space is 32
//by writing c!=32 I mean that ASCII of c will not be 32.
if(((c>='a' && c<='z') || (c>='A' && c<='Z')) && (c!=32))
return 1;
else
return 0;
}
return rec;
}
return rec;
int main()
{
char c;
int flag=0;
printf("Enter the characters :");
do
{
c = getche();
if(isalpha(c))
{
start = push(start,c);
cstart = push(cstart,c);
}
while(!isempty(start))
{
rstart = push(rstart,peep(start));
start = pop(start);
}
while(!isempty(rstart))
{
if(toupper(peep(rstart))==toupper(peep(cstart)))
flag = 1;
else
{
flag=0;
break;
}
rstart = pop(rstart);
Stack (One upon Another) 265
cstart = pop(cstart);
}
if(flag==1)
printf("The string is a palindrome");
else
printf("The string is not a palindrome");
return 0;
}
Try this program with the following inputs:
Mom
LiriL
Brother
Was it a cat i saw!
Madam I m Adam
A Man, A Plan, A Canal, Panama!
A saguaro cactus
This type of stack can be very useful. For example say the URL entered in internet explorer contains
some illegal characters, using saguaro stack we can extract the invalid characters. Branches of this URL
stack are www , the company name and the domain(.com, .edu., .org etc..) See the figure below for more
explanations.
266 Data Structures using C
A saguaro stack can be thought as the combination of stacks. Here is the structure that is used to
represent the stack.
typedef struct MyStack
{
int data[MAX];
int count;
}MyStack;
st->Stacks[whichStack].data[st->Stacks[whichStack].count] =
whattopush;
st->Stacks[whichStack].count++;
sc++;
printf("Successfully pushed");
return 1;
}
}
if(st->Stacks[whichStack].count<0)
printf("Specified Stack is Empty!\nUnderflow Error");
else
{
printf("Last number from Stack %d is
popped\n",whichStack);
sc--;
}
return st->Stacks[whichStack].data[st-
>Stacks[whichStack].count-1];
}
Stack (One upon Another) 267
Fig. 6.10