2 Python vs R Algorithmic Structures
Abstract
This article discusses algorithmic structures found in the Python and R languages,
with an emphasis on simple data structures such as variables and constants. It also
explores the operations commonly used with these structures and the different control
flows available in these programming languages. The objective is to present in a clear
way the correspondences between the elements of Python and R, thus facilitating the
understanding and the transition between the two languages.
∗
Researcher in Optimization and MAchine Learning Laboratory (OPTIMALL), DR Congo.
E-mail: g.kamingu@unikin.ac.cd.
1 Introduction
An algorithmic structure is a way of organizing the elements of an algorithm (and by extension of a
program) in such a way as to facilitate programming. These structures affect either the data (data
structures) or the layout of the instructions (control flows). Data structures and control flows are
therefore part of the major classes of algorithmic structures (Wirth (1987)).
Control flows are essential elements in imperative programming and are present in all imperative
programming languages, including Python and R. They play a crucial role in making decisions, repeat-
ing actions and handling errors in a program. Both Python and R offer a range of control flows that
facilitate logic and control of the flow of statement execution. As far as data structures are concerned,
there are different ones, ranging from the simplest to the most complex. In this paper, we will mainly
focus on simple data structures. Other structures will be explored in more detail in later publications.
However, interested readers can refer to the works of Goodrich et al. (2013), Venables et al. (2023)
and Prakash and Rao (2016), which address, among other things, data structures in the Python and
R languages.
The rest of the paper is structured as follows. Section 2 presents algorithmic structures in
general, Section 3 presents the algorithmic structures of the Python language and Section 4 those
of the R language. Finally, Section 5 compares the two languages before we conclude.
Our environment is composed of seven data. The numbers x1 , x2 , s and v̄. The data x1 , x2 , s
and v̄ can take any value; and they can even change value while the program is running. These are
variables. Moreover, if for example five people show up, each can enter the numbers they want, which
will also modify the calculated numbers s and v̄. On the other hand, a datum that cannot change
value is a constant. In our case, 2 is a constant.
• its name or identifier (invariable), i.e. a sequence of characters (letters, numbers or even special
characters) used to designate it. An identifier must begin with a letter;
• its value (variable);
• its type (can be (in)variable), which describes the possible use of the variable.
From the above, a variable is like a memory cell (box or cell) that can hold a value. To be able
to use a memory cell (a variable), it is necessary to indicate to the processor its identifier and its
type using a declaration. In our pseudo-code1 , the variables will be declared using the keyword Var,
followed by the comma-separated list of names, then the type of the variables in the list. We will
agree that at the time of the declaration of a variable, its value is indeterminate.
As said above, 2 is a constant. Memory space can also be reserved for this constant, with the
keyword Const, followed by the name of the constant.
Remark 2.1. Like named constants, the so-called variables of programming languages of certain
paradigms, such as functional and logic programming languages, keep a single value throughout their
lifetime because of referential transparency requirements. In these languages, variables are bound to
expressions.
We distinguish the elementary types called simple types or primitive types or even basic types, and
the compound types.
1. The primitive data types, which are data types from which all other data types are built.
Primitive data types are usually atomic and cannot be broken down into smaller pieces. They
generally occupy a fixed amount of memory and can be directly manipulated by arithmetic and
logical operations (Cormen et al. (2022)).
2. The compound data types, which are data types resulting from the association of several elemen-
tary types. Common compound data types include lists, sets, and dictionaries. The elements
of a collection can be of different types and they can be added, deleted or modified dynamically
(Sedgewick and Wayne (2011)).
The main difference between primitive data types and compound data types is their ability to
store and represent information. Primitive data types are used to represent individual and simple
data, while compound data types are used to represent more complex and structured data sets.
• The integer type, associated with relative integers, i.e. numbers belonging to the set Z.
• The real type, associated with real numbers (numbers having an integer part and a decimal
(or fractional) part), i.e. numbers belonging to the set R.
1
A pseudo-code is a "pseudo" language independent of all programming languages, which can be clear and
easily transcribable in any language.
Example 2.2. −2.25, 4.0, 95.2000 and 0.623 are reals.
In some programming languages, the "real" data type is referred to as "float" (floating-point
number). This name comes from the fact that reals are stored as floating-point numbers, i.e.
represented using a mantissa and an exponent. However, some programming languages reserve
the term float for single-precision real numbers, i.e. those that use 32 bits for storage. These
languages then make a distinction between float and double, where the latter refers to
double-precision floating-point real numbers, typically using 64 bits for storage.
Depending on the richness of a programming language, we can also distinguish the complex type,
which is associated with complex numbers, i.e. numbers belonging to the set C.
Programming languages such as R, Fortran and Python provide the complex type as a base type, while
others, such as C++, provide it through a library (the <complex> header of the standard library).
B. Alphanumeric types
The alphanumeric type is made up of the set of character sequences that the processor can handle.
A character can be a letter (uppercase or lowercase), a digit, a punctuation mark, but also a
space, a tabulation, a new line or some other non-printable character. The value of an alphanumeric
is enclosed in double quotes " " or, in some programming languages, in single quotes ’ ’. We distinguish:
Example 2.3. "d", "5", "K" and " " are characters.
Example 2.4. "Maëlys", "007", "Volume 1" and "vibration" are character strings.
C. Boolean Type
The boolean type is a two-state type of variable, usually denoted true and false. These values are also
represented respectively by 1 and 0 depending on the programming languages.
Example 2.5. In the expression 10+4, the operator is + and the operands are the constants 10 and 4.
Each expression has an associated value (Goulet (2016)). The simplest expressions, such as constants
and variables, directly represent their own value. It is possible to combine several expressions to
form a compound expression. Such an expression is usually evaluated from left to right, unless the
order of evaluation is changed by operator precedence or by parentheses, as in mathematics.
1. The arithmetic operators which allow you to perform arithmetic operations between numeric
operands:
• The addition: +;
• The subtraction: −;
• The multiplication: × or ∗;
• The exponentiation: ˆ;
• The (real) division: /;
• The integer division: //;
• The modulo: %;
Example 2.6. a // b returns the integer part of the quotient of the integer division of a by
b, while a % b gives the remainder of the integer division of a by b.
Here is the commonly used arithmetic operator precedence rule, in descending order of prece-
dence:
(a) Parentheses: operations between parentheses are performed first. Parentheses can also be
nested to indicate additional precedence levels;
(b) Power;
(c) Multiplication, division, and modulo, which are evaluated from left to right in the order
they appear in the expression.
(d) Addition and subtraction, which are also evaluated from left to right in the order they
appear in the expression.
Using parentheses allows you to change the order of evaluation and give specific priority to
certain parts of an expression. Everything inside parentheses is evaluated first, then other op-
erations follow the precedence rule mentioned above.
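As a small illustration of these precedence rules, here is a sketch in Python, where exponentiation is written ** rather than ˆ. The variable names a to e are ours and are only there to hold the results.

```python
# Operator precedence: parentheses, then power,
# then * / // %, then + and -.
a = 2 + 3 * 4 ** 2      # 4 ** 2 first, then 3 * 16, then 2 + 48
b = (2 + 3) * 4 ** 2    # the parentheses force the addition first
c = 17 // 5             # integer division
d = 17 % 5              # remainder of the integer division
e = 20 / 4 * 5          # same level: evaluated from left to right

print(a, b, c, d, e)
```

Removing or moving the parentheses in b is a quick way to check one's reading of the rule.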
• The negation or logical NOT, which returns the inverse of the Boolean value. If the
expression is true, then not returns false, and vice versa.
• The conjunction or logical AND, which returns true if all boolean expressions are true,
otherwise it returns false.
• The disjunction (inclusive) or logical OR, which returns true if at least one of the boolean
expressions is true, otherwise it returns false.
• The exclusive disjunction or logical XOR, returns true if exactly one of the expressions is
true, and false otherwise.
Here is the precedence rule for logical operators, in descending order of precedence:
(a) Negation, which has the highest precedence. It is generally evaluated before the other
logical operations;
(b) Conjunction;
(c) Inclusive disjunction;
(d) Exclusive disjunction.
Using parentheses can also change the order of evaluation of logical operations and give specific
priority to certain parts of a logical expression. Everything inside parentheses is evaluated first,
then other logical operations follow the precedence rule mentioned above.
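The first three levels of this rule can be observed in Python, whose keywords not, and and or follow the same order. This is only a sketch with names of our choosing (p, q, r, x, y, z). Note that Python has no dedicated logical XOR keyword: its ^ operator is a bitwise operator that binds more tightly than not, and and or, so level (d) of the pseudo-code rule does not carry over as is and parentheses are advisable.

```python
# Precedence among Python's logical operators: not, then and, then or.
p, q, r = True, False, False

x = p or q and r        # read as p or (q and r)
y = (p or q) and r      # parentheses change the result
z = not p or p          # read as (not p) or p

print(x, y, z)
```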
3. The comparison operators, also called relational operators which compare two operands and
produce a Boolean value. The comparison operators are:
The execution of an action brings the environment from a particular initial state to a new well-
defined state. An action will be primitive or elementary for a given processor if its instruction is
sufficient for the processor to execute it without additional information.
In short, the elementary actions are simple operations, directly usable.
A. Reading
Definition 2.3. Reading is an elementary action which consists of assigning to a variable a value
entered from the keyboard.
To perform a read action, the processor pauses and observes the keyboard. Each key typed on the
keyboard communicates part of the expected information to the processor. To indicate the end of the
information, the user must validate their entry. Before this validation, the user can still modify
the information entered.
B. Writing
Definition 2.4. Writing is an elementary action which consists in displaying information on the screen.
Example 2.7 (Hello, World!). Consider the following algorithm3 , which displays the message “Hello,
world!” :
Begin
Write "Hello, World!"
End.
Before reading a variable, it is strongly recommended to write a label on the screen, in order to
tell the user what to enter (Darmangeat (2008)).
Example 2.8. Consider the following program, which asks the user for their name:
C. Assignment
Definition 2.5. The assignment is an elementary action which consists in attributing a value to a
variable.
In this action, we establish a link between the name of the variable and its value (its content)
(Swinnen (2012)). Unlike the read action, which allows you to assign a value to an object externally,
i.e. by typing the value on the keyboard, the assignment allows you to do it internally.
Var x, y, z as integer
Begin
x ← 2
y ← 4
z ← x
x ← x + 4 ∗ y
End.
3
This is a test program. This piece of code was influenced by a sample program in the 1978 book The
C Programming Language (Ritchie and Kernighan (1978)) by Canadian-American computer scientist Brian
Kernighan and American computer scientist Dennis Ritchie. However, there is no evidence that it originated in
this book, and it is very likely that it was used earlier in BCPL. In an interview granted
to the Indian edition of Forbes magazine, Brian Kernighan explained that the sentence comes from a cartoon he
had seen, in which a chick came out of its egg saying “Hello, World!”. The program has remained popular in
the teaching of programming, in particular as an introduction to a new programming language.
Here, the value 2 is assigned to the variable x, the value 4 to the variable y, the current value
of the variable x to the variable z, and finally the value of x + 4 × y to the variable x.
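To make the order of these assignments concrete, the same four steps can be traced in Python. This is a minimal sketch in which = plays the role of the arrow ← of the pseudo-code.

```python
x = 2
y = 4
z = x          # z receives the current value of x, i.e. 2
x = x + 4 * y  # the right-hand side is evaluated first: 2 + 16

print(x, y, z)
```

The variable z keeps the value 2 even though x is modified afterwards: an assignment copies a value at a given moment, it does not link the two variables.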
Example 2.10. Let’s go back to the illustration of Section 2.1.1, the calculation of the arithmetic
mean of two numbers. We have the following algorithm:
Var x1 , x2 , s, v̄ as real
Begin
Write "Give the first number"
Read x1
Write "Give the second number"
Read x2
s ← x1 + x2
v̄ ← s/2
Write "The arithmetic mean of these two numbers:", v̄
End.
The action of assigning an initial value to a variable is called variable initialization. However,
several programming languages allow the possibility of declaring and initializing the variable at the
same time.
A control flow determines the flow of control (or execution flow), i.e. the order in which the in-
structions of the algorithm must be executed.
• Sequential structures;
• Alternative structures;
• Iterative structures;
• Routines4 .
The sequence is therefore the implicit control flow. Control thus passes from one instruction to
another according to the order in which they appear in the algorithm.
However, the sequence is sometimes interrupted by a jump, also called a branch or branching. A
jump is a transfer of control to a specific location in the algorithm. The line we branch to is specified
using its label or tag. A label is a number or identifier associated with a source code instruction. It
is intended to serve as a target for a control structure located elsewhere in the program. Apart from
this localization role, a label has no effect on the program: it does not modify the behavior of the
instruction with which it is associated.
There are two families of instructions that address these labels: unconditional jumps and condi-
tional jumps.
In an unconditional jump, control is transferred to a given line of the algorithm without any
condition. This jump is systematic and causes a break in the flow of execution. The instruction
following the jump in the program can therefore only be reached by another jump.
Example 2.11. Consider an algorithm that asks the user for their name and displays it by adding at
the beginning "Your name is:"
Remark 2.2. It should also be noted that jumps (including the Go To instruction) have been criticized,
in particular by the Dutch mathematician and computer scientist Edsger W. Dijkstra, who later
became their emblematic opponent. Wanting to fight against the misuse of the statement, he wrote
in 1968 for Communications of the ACM an article which he titled "A Case against the GOTO Statement"
(Dijkstra (1968a)). Wanting to publish the article quickly as a letter to the editor, the editor Niklaus
Wirth renamed it "Go To Statement Considered Harmful" (Dijkstra (1968b)).
B. Program termination
A program usually terminates after the execution of the last instruction. Most programming languages
also provide one or more instructions to stop program execution at an arbitrary position.
A. Simple alternative structures
Simple alternative structures exist in two forms. The first form is:
If p Then
s
End If
The second form is:
If p Then
s
Else
t
End If
If the condition p, also called the test, is verified, the sequence of instructions s is executed in
both forms. If it is not verified, the sequence of instructions t is executed in the second form.
End If marks the end of the sequence of instructions of the Then part in the first form. In the
second form, the end of the Then part is marked by the word Else.
Example 2.12. Consider writing an algorithm that calculates the absolute value of a real x.
In the first form, we can chain as many Else If clauses as desired: only the first one whose
condition is verified is executed. We can generally add an Else clause, which is executed only
if none of the preceding conditions has been verified.
Example 2.13. Consider an algorithm that asks the user for the child’s age and then determines the
child’s category:
Write "Youth Category"
Else If age ≥ 10 Then
Write "Intermediate Category"
Else If age ≥ 8 Then
Write "Pupil Category"
Else If age ≥ 6 Then
Write "Youngster Category"
Else
Write "Error"
End If
End If
End If
End If
End.
The second form selects the block to execute based on the value of a variable. It is used when a
branch offers several outcomes and conditions on the same variable must be tested several times.
Structurally, this is equivalent to a succession of Else If clauses, but knowing that the value of the
variable under test does not change while the conditions are evaluated allows the translator to make
some optimizations.
2.2.4 Iterative structures
Definition 2.9. An iterative structure, also called repetitive structure or loop, is a control flow allowing
certain instructions to be executed several times in succession.
There are several types of iterative structures, depending on whether the number of repetitions
(or iterations) is predetermined or not.
Repeat
s
Until p
This loop makes it possible to reiterate an instruction or a series of instructions s until a condition,
called exit condition is verified. The sequence of instructions is executed at least once, regardless of
the condition.
Example 2.15. Consider an algorithm allowing the user to enter a number randomly and which only
stops when this number is equal to 1524. At the end, the algorithm must provide the number of at-
tempts by the user. Hence the following algorithm:
Var x, i as integer
Begin
i←0
Repeat
Write "Enter the number please."
Read x
i←i+1
Until x = 1524
Write "You have found the number."
Write "The number of attempts is: "; i
End.
The integer variable i, whose initial value is 0, counts the number of times the instruction
Read x has been repeated. Such a variable is called a (loop) counter. With each iteration its value
increases by a constant amount: we say that the counter is incremented.
The constant amount by which the counter increases with each iteration is called the increment
step. In Example 2.15, the increment step is 1. When the value of the control variable decreases by a
constant amount from one iteration to the next, we speak of decrementing the counter, and of a
decrement step.
Thus, a counter is a variable that controls the iterations of a loop. It is so named because most uses
of this variable cause it to take a range of integer values in an ordered sequence. Because the
counter is a variable, it can be named using variable naming conventions. However, the common
convention is to name loop counters i, j and k (and so on if necessary). Some programmers also use
the reverse order. This practice goes back to mathematical notation, where the indices of sums and
products are often i, j, etc.
While p Do
s
End While
where s is a sequence of instructions and p is the iteration control predicate.
If the condition p is verified, the sequence of instructions s is executed; at the end of the block,
p is evaluated again and the process starts over. When p evaluates to false, we exit the loop. The
predicate p is usually called the continuation condition because execution continues with s as long as p is true.
Example 2.16. Consider again an algorithm allowing the user to enter a number randomly and which
only stops when this number is equal to 1524. At the end, the algorithm must provide the number of
attempts by the user. Hence the following algorithm:
Var x, i as integer
Begin
x ← 0
i ← 0
While x ≠ 1524 Do
Write "Enter the number please."
Read x
i ← i + 1
End While
Write "You have found the number."
Write "The number of attempts is: "; i
End.
Var i as integer /* Integer variable counting the number of times the operation is repeated */
Var P, a as integer
Begin
Write "Enter the number to multiply by 11:"
Read a
P ← 0
i ← 0
10. P ← P + a
i ← i + 1
If i < 11 Then
Go To 10
End If
Write "The product is: "; P
End.
To avoid the explicit declaration of the counter variable and the use of the conditional jump, the
following algorithm can be used:
Var P, a as integer
Begin
Write "Enter the number to multiply by 11:"
Read a
P ← 0
For i ← 1 To 11
P ← P + a
End For
Write "The product is: "; P
End.
This version of the algorithm shows that, when the number of repetitions is known in advance, the
For loop is more convenient than the first two. Unlike the While loop or the Until loop, the
For loop includes the counter variable (and its initialization) in its syntax, and manages the
progress of the counter itself. In most cases, we need a variable that increases by 1 with each
iteration, and then nothing needs to be specified about the increment step. If, on the other hand,
the step is different from 1, it suffices to specify it with the instruction step k, where
k is the increment (or decrement) step.
Example 2.17. Consider an algorithm that displays the first 3000 multiples of 3 without doing any
multiplications.
Var i as integer
Begin
For i ← 3 To 9000 (step 3)
Write i
End For
End.
C. General Case
Programming languages such as C and Common LISP allow the use of more general iterative structures
than those presented above. Their general format is as follows:
Loop
a
Exit If p
b
End Loop
where a and b are sequences of instructions and p is the predicate to be evaluated.
In this general scheme, the sequence of instructions a is always executed once more than the
sequence of instructions b, because the control predicate is evaluated in the middle of the scheme
(after the sequence a). When one of the sequences of instructions a or b is empty, the general
scheme reduces to an Until loop, a While loop or a For loop.
1. When b is empty, the general scheme reduces to the Until loop. We therefore have:
Loop
s
Exit If p
End Loop
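In Python, the general scheme with the exit test in the middle is usually written with while True and break. The sketch below is only an illustration: the function name sum_until_zero is hypothetical, and the list of values stands in for keyboard input and is assumed to contain a 0.

```python
def sum_until_zero(values):
    """Add numbers until a 0 is met (general loop: a, exit test, b)."""
    total = 0
    source = iter(values)
    while True:
        x = next(source)    # a: obtain the next value
        if x == 0:          # Exit If p
            break
        total += x          # b: runs once less often than a
    return total

print(sum_until_zero([3, 5, 2, 0, 7]))
```

Here the sequence a runs four times and the sequence b three times, in line with the remark that a is executed once more than b.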
The same problem can often be solved with an alternative structure, a For loop, a While loop or
an Until loop. Consider the problem of writing an algorithm that displays the first twenty positive
integers. This problem can be solved by the following algorithm:
Var i as integer
Begin
For i ← 1 to 20
Write i
End For
End.
With the simple alternative structure, the same algorithm can be written as:
Var i as integer
Begin
i ← 0
5. i ← i + 1
If i ≤ 20 Then
Write i
Go To 5
End If
End.
However, using the alternative structure for repetitive tasks requires the Go To statement.
Note that the Go To instruction is increasingly obsolete and less and less available in high-level
programming languages. For instance, Ruby has offered Go To only as a joke since version 1.9:
__goto__ and __label__ are enabled only if the variable SUPPORT_JOKE is set when compiling Ruby.
In 1966, the Böhm-Jacopini theorem (Böhm and Jacopini (1966); Ramshaw (1988)) showed that any
program with Go To can be transformed into a program using only subroutines, sequences,
alternative structures and iterative structures.
With the While loop, the same algorithm can be written as follows:
Var i as integer
Begin
i ← 0
While i < 20 Do
i ← i + 1
Write i
End While
End.
With the Until loop, the same algorithm can be written as follows:
Var i as integer
Begin
i←0
Repeat
i←i+1
Write i
Until i ≥ 20
End.
The only advantage of the For loop over the While loop or the Until loop is that it saves the
programmer a little effort, by sparing them from managing the progress of the counter variable
themselves (Darmangeat (2008)).
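The equivalence of the three loops can be checked with a short Python sketch. Python has no Until loop, so it is emulated here with while True and a test placed after the body. The list names with_for, with_while and with_until are ours, and values are collected in lists instead of being written on the screen so that the three results can be compared.

```python
with_for = []
for i in range(1, 21):      # the counter is managed by the loop itself
    with_for.append(i)

with_while = []
i = 0
while i < 20:               # While loop: test before each iteration
    i += 1
    with_while.append(i)

with_until = []
i = 0
while True:                 # Until loop emulated: test after the body
    i += 1
    with_until.append(i)
    if i >= 20:
        break

print(with_for == with_while == with_until)
```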
In [1]: a = 4
        b = 50.8
        c = "100"
In Example 3.1, the variables a, b and c receive the values 4, 50.8 and "100" respectively.
Example 3.2.
In [1]: d = e = 19
Example 3.3.
Here, the value "Gloria" is assigned to the variable e and the value "Maelys" is assigned to the
variable f.
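The two kinds of multiple assignment mentioned in Examples 3.2 and 3.3 can be sketched as follows. The listing of Example 3.3 is not visible in the text, so the second line is an assumption about its form (a parallel assignment), not a reproduction of it.

```python
d = e = 19                   # chained assignment: the same value for both
e, f = "Gloria", "Maelys"    # parallel assignment, matched position by position

print(d, e, f)
```

After the second line, e no longer holds 19: a later assignment simply replaces the previous value.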
In Python, there are no constants in the strict sense. By convention, names written in uppercase
indicate values that should not be changed, although Python does not prevent their modification.
Example 3.4.
• A variable identifier cannot start with a digit. It must start with a letter or an underscore
and may contain only letters, digits and underscores.
• Python is case sensitive: it distinguishes between upper and lower case. Thus, OPTIMALL,
OptiMaLL, Optimall and optimall are different variables.
• For a variable with a compound name, it is advisable to use the so-called "snake_case" style,
which consists of writing the words, usually in lowercase, separated by underscores.
Example 3.5. We will use per_capita_income to express per capita income and exchange_rate to
express exchange rate when programming in Python.
Table 1: List of Python keywords.
In the Python language, some keywords are reserved and have a special meaning.
Table 1 gives the complete list of these keywords.
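The naming rules and the list of reserved words can also be checked from Python itself, using the standard keyword module and the string method isidentifier(). In this sketch the candidate names are ours and serve only as test cases.

```python
import keyword

candidates = ["exchange_rate", "2nd_value", "while", "OPTIMALL", "per-capita"]

# A usable variable name is a valid identifier that is not a keyword.
usable = {
    name: name.isidentifier() and not keyword.iskeyword(name)
    for name in candidates
}

for name, ok in usable.items():
    print(f"{name!r}: {ok}")

# The reserved words of the running Python version
print(keyword.kwlist)
```

Since the list of keywords changes slightly between Python versions, keyword.kwlist is a convenient way to obtain the list that applies to the installed interpreter.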
Example 3.6.
Remark 3.1. Note that the use of variables and constants may vary depending on programming
conventions and program needs. The example above illustrates commonly used approaches, but it is
important to understand that variables can be changed and that constants are not strictly immutable
in Python.
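The uppercase convention, and the fact that it is only a convention, can be seen in the following sketch. The names MAX_ATTEMPTS and attempts_left and the values used are hypothetical.

```python
MAX_ATTEMPTS = 3        # uppercase name: a constant by convention only

def attempts_left(used):
    """Return how many attempts remain, based on the constant."""
    return MAX_ATTEMPTS - used

before = attempts_left(1)
MAX_ATTEMPTS = 5        # Python does not forbid this reassignment
after = attempts_left(1)

print(before, after)
```

The interpreter accepts the reassignment without any warning, which is why the convention relies on the discipline of the programmer.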
• The complex type, represented by the keyword complex5. A complex number is created using the
function complex(x, y), where x is the real part and y the imaginary part.
Example 3.9. The numbers complex(1, 4) and complex(7, -3) are of the complex type of the
Python language.
The character and the string types are represented by the keyword str.
• The boolean type, which has the value True or False (with a capital letter at the beginning).
5
The notion of function will be discussed below.
3.1.3 Casting
You can convert an object from one type to another. This technique is called casting or type
conversion.
In Python, conversion between numeric types is often implicit, which means that Python
automatically converts data types when necessary. For example, if we add an integer and a real,
Python automatically converts the integer to a real before performing the addition.
Remark 3.2. Note that to know the type of object (of the variable) in Python, we use type().
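The difference between implicit and explicit conversion can be sketched as follows, using the built-in functions int(), str(), complex() and type(). The variable names total, number, text and z are ours.

```python
a = 4            # int
b = 50.8         # float
c = "100"        # str

total = a + b          # implicit conversion: the int becomes a float
number = int(c)        # explicit conversion from str to int
text = str(a)          # explicit conversion from int to str
z = complex(a, 1)      # explicit construction of a complex number

print(type(total), type(number), type(text), type(z))
```

Implicit conversion applies between numeric types only: an expression such as a + c is not converted automatically and raises a TypeError.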
Table 2: Arithmetic operators in Python.
Operator                                   Python
Addition                                   +
Subtraction                                -
Multiplication                             *
(Real) division                            /
Exponentiation                             **
Modulo (remainder of integer division)     %
Integer division                           //
Table 3: Logical operators in Python.
Operator                                   Python
Negation (logical NOT)                     not
Conjunction (logical AND)                  and
Disjunction (logical OR)                   or
Exclusive disjunction (logical XOR)        ^
Table 4: Comparison operators in Python.
Operator                                   Python
Equal to (=)                               ==
Not equal to (≠)                           !=
Greater than (>)                           >
Less than (<)                              <
Greater than or equal to (≥)               >=
Less than or equal to (≤)                  <=
In [15]: True ^ False # The exclusive disjunction
Out [15]: True
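The full truth table of the exclusive disjunction can be generated in a few lines. This is only a sketch, with a dictionary name (xor_table) of our choosing. Applied to two booleans, ^ returns a boolean, and for booleans it gives the same result as the comparison operator !=.

```python
# Truth table of the exclusive disjunction on booleans
xor_table = {
    (p, q): p ^ q
    for p in (False, True)
    for q in (False, True)
}

for (p, q), result in xor_table.items():
    print(p, "^", q, "=", result, "| p != q gives", p != q)
```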
Remark 3.3 (Illegal operations). In Python, certain operations are illegal, which means that they
are invalid according to the rules of syntax and semantics of the language. Such operations cannot
be performed and raise an error when they are attempted. Here are some examples of illegal
operations in Python:
• The incorrect type conversion, when performing operations between incompatible data types (such
as adding a number to a string). The error generated is TypeError;
• The overflow, when a calculation exceeds the maximum capacity of a numeric type, which
generates OverflowError. In practice this concerns floating-point numbers, since Python integers
have arbitrary precision.
In [3]: 4/0
------------------------------------------------------------
ZeroDivisionError          Traceback (most recent call last)
<ipython-input-4-221068dc2815> in <module>
----> 1 4/0
In [4]: 0/0
------------------------------------------------------------
ZeroDivisionError          Traceback (most recent call last)
<ipython-input-4-221068dc2815> in <module>
----> 1 0/0
TypeError: can only concatenate str (not "int") to str
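Rather than letting such errors stop the program, they can be intercepted with try ... except. The sketch below is not taken from the article: the helper try_operation is hypothetical, and math.exp(1000) is our choice of a floating-point computation that overflows.

```python
import math

def try_operation(label, operation):
    """Run a zero-argument callable and report the error it raises, if any."""
    try:
        return f"{label}: {operation()}"
    except (ZeroDivisionError, TypeError, OverflowError) as err:
        return f"{label}: {type(err).__name__}"

results = [
    try_operation("4 / 0", lambda: 4 / 0),
    try_operation('"4" + 0', lambda: "4" + 0),
    try_operation("exp(1000)", lambda: math.exp(1000)),
]

for line in results:
    print(line)
```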
Read x
where x is the identifier of the data to which a value typed on the keyboard must be assigned. In
Python, this format can be translated as:
x = input()
Python also gives the possibility to ask the user to enter their own values from the keyboard. For
this, the syntax used is as follows:
variable_name = input("msg")
where "msg" is the optional waiting message that tells the user what to enter.
Example 3.15. The following Python script asks the user to enter their name:
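The original listing is not reproduced here; a minimal sketch of such a script could look as follows. To keep it usable without a keyboard, the reading function is passed as a parameter (read) that defaults to input(). This parameter, the function name ask_name and the wording of the messages are our assumptions.

```python
def ask_name(read=input):
    """Ask for a name and return the sentence to display.

    `read` defaults to the built-in input(); passing another callable
    makes the function usable without a keyboard.
    """
    name = read("Please enter your name: ")
    return f"Your name is: {name}"

# Interactive use: print(ask_name())
print(ask_name(lambda msg: "Gloria"))
```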
Write info
where info is the information to display on the screen. In Python, this is implemented by:
print(info)
Example 3.16. The Hello, World! program of Example 2.7 is implemented in Python as follows:
This method can be handy for quick output without using print(). However, if you want to
display multiple values or add text to your output, print() should be used. Also, if you want to
display a specific value at a specific time in the code, print() should also be used.
Example 3.17. Let’s go back to the illustration of Section 2.1.1, the calculation of the arithmetic
mean of two numbers. The Python implementation is given in Figure 2.
In the previous script, the input() instruction receives the input as a string. To perform
arithmetic calculations, these strings must be converted to real numbers using float(x1)
and float(x2). The following instruction combines the two steps, reading and conversion, into one:
variable_name = target_type(input(["msg"]))
where target_type is the target type, different from str, of the variable variable_name.
Example 3.18. Figure 3 illustrates well the change made to the program of Figure 2.
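A minimal sketch of this read-and-convert pattern, applied to the arithmetic mean, is given below. It is not the listing of Figure 3: the function name arithmetic_mean and the read parameter (which defaults to input() and allows the function to be run without a keyboard) are our assumptions.

```python
def arithmetic_mean(read=input):
    """Read two numbers and return their arithmetic mean."""
    x1 = float(read("Give the first number: "))    # read and convert at once
    x2 = float(read("Give the second number: "))
    s = x1 + x2
    return s / 2

# Interactive use:
# print("The arithmetic mean of these two numbers:", arithmetic_mean())
answers = iter(["12", "7.5"])
print(arithmetic_mean(lambda msg: next(answers)))
```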
To display the contents of variables by inserting them into a string, there are at least two commonly
used methods: (i) simple display and (ii) formatted display. These two approaches are illustrated
respectively in the code examples shown in Figures 4 and 5.
Example 3.20. Figure 5 illustrates the case of formatted display.
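The two kinds of display can be sketched as follows. The figures themselves are not reproduced here, so the use of an f-string for the formatted display is our assumption; the original may rely on another formatting mechanism. The values of x1 and x2 are fixed for the illustration.

```python
x1, x2 = 12.0, 7.5
mean = (x1 + x2) / 2

# (i) simple display: print() separates its arguments with a space
print("The arithmetic mean of", x1, "and", x2, "is", mean)

# (ii) formatted display with an f-string; .2f keeps two decimals
line = f"The arithmetic mean of {x1:.2f} and {x2:.2f} is {mean:.2f}"
print(line)
```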
Figure 3: Calculation of the arithmetic mean of two numbers with Python, second version.
Figure 4: Calculation of the arithmetic mean of two numbers and display by inserting the
values of the variables in writing without formatting.
Figure 5: Calculation of the arithmetic mean of two numbers and display by inserting the
values of the variables in formatted writing.
3.4 Control flows in Python
3.4.1 Alternative structures
The first form of the simple conditional structure is:
if p:
    s
where p is the condition and s the sequence of instructions to be executed if the condition is
satisfied.
Example 3.21. Taking Example 2.12, we have the following implementation:
if x >= 0:
    val_abs = x
else:
    val_abs = -x
if p1:
    s1
elif p2:
    s2
[...]
elif pn:
    sn
else:
    t
This structure can be explained as follows:
• The statement if p1: checks the condition p1. If the condition p1 is satisfied, the sequence of
instructions s1 that follows is executed. If the condition p1 is not satisfied, the program
goes to the first elif instruction.
• The instruction elif p2: checks a second condition, p2. If the condition p2 is satisfied
and the previous condition (p1) is not, the sequence of instructions s2 is
executed. If the condition p2 is not satisfied, the program goes to the next elif instruction, if
any.
• The elif instructions can be repeated as many times as necessary with different conditions
(p3, p4, ..., pn). Each elif instruction tests an additional condition. If one of the
pi conditions is satisfied, the corresponding sequence of instructions si is executed and
the remaining elif clauses and the else block are ignored.
• The else: instruction is optional. It is used when none of the previous conditions (p1, p2, ...,
pn) is satisfied. In that case, the sequence of instructions t which follows is executed.
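The behaviour described above can be sketched with a small example. The function name classify_sign and the labels are hypothetical and chosen only for the illustration.

```python
def classify_sign(x):
    """Return a label for x using an if...elif...else chain."""
    if x > 0:          # p1
        label = "positive"
    elif x < 0:        # p2, tested only when p1 is not satisfied
        label = "negative"
    else:              # reached when no previous condition is satisfied
        label = "zero"
    return label


print(classify_sign(-3.5))
```

Exactly one branch is executed on each call, since the chain stops at the first satisfied condition.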
Example 3.22. Example 2.13 can be implemented in Python as follows.
Since the input() function only returns strings, it is necessary to ensure that all characters
entered are digits. To do this, we can use the isdigit() method, which checks whether a character
string is composed only of digits.
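A minimal sketch of this check is given below. The helper name parse_non_negative_int is hypothetical. The additional call to isascii() is our own precaution: isdigit() accepts some non-ASCII characters, such as superscript digits, that int() cannot convert.

```python
def parse_non_negative_int(text):
    """Return int(text) if text is made of ASCII digits only, else None."""
    # isdigit() rejects the empty string, signs, spaces and decimal points.
    # isascii() additionally excludes characters such as "²", which
    # isdigit() accepts but int() cannot convert.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


print(parse_non_negative_int("1524"))
print(parse_non_negative_int("-5"))
```

With this helper, the conversion is attempted only on strings that int() is able to convert, so an invalid entry produces None rather than an exception.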
Among the methods available to carry out checks on character strings, we have: isdigit(),
isnumeric(), isdecimal() and isalnum().
• isnumeric(): This method is similar to isdigit(), but it also accepts other types of numeric
characters such as superscript digits, fractions, and foreign language digits.
• isdecimal(): This method checks if all characters in a string are decimal digits (base 10).
• isalnum(): This method checks if all characters in a string are alphanumeric, i.e. letters or
numbers.
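The differences between these methods can be observed on a few sample strings. The sketch below is only an illustration; the helper name digit_checks and the samples are our own choices.

```python
def digit_checks(s):
    """Return the result of the four string checks for s."""
    return {
        "isdecimal": s.isdecimal(),
        "isdigit": s.isdigit(),
        "isnumeric": s.isnumeric(),
        "isalnum": s.isalnum(),
    }


for sample in ["2024", "²", "½", "abc123", "3.14"]:
    print(repr(sample), digit_checks(sample))
```

In this sketch, the superscript "²" passes isdigit() but not isdecimal(), the fraction "½" passes only isnumeric() among the three digit checks, and "3.14" fails all four checks because of the decimal point.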
Python does not have a statement called “Select Case” (the closest built-in construct is the match
statement, available since Python 3.10). However, the generalized if...elif...else structure,
which corresponds to If...Then...Else...End If, can be used to handle multiple conditions in the
same way as “Select Case”.
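As a minimal sketch of this emulation, the function below selects an operation according to the value of a single variable. The name apply_operation and the set of cases are hypothetical.

```python
def apply_operation(op, a, b):
    """Emulate a Select Case on op with an if...elif...else chain."""
    if op == "+":
        result = a + b
    elif op == "-":
        result = a - b
    elif op == "*":
        result = a * b
    elif op == "/":
        result = a / b
    else:                      # plays the role of "Case Else"
        result = None
    return result


print(apply_operation("*", 6, 7))
```

Each elif compares the same variable with a different constant, which is what a “Select Case” expresses more compactly.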
Example 3.25. Figure 7 shows a Python implementation of Example 2.14 using this structure.
x = 0
i = 1
while x != 1524:
    print("Please enter the number.")
    x = int(input())
    i += 1
x, i = 0, 0
while x != 1524:
    print("Please enter the number.")
    x = int(input())
    i += 1
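A loop whose body must run at least once before the exit condition is tested can also be sketched with while True and break. In the illustration below, the function name count_attempts and the parameter answers are hypothetical: the sequence answers stands in for successive calls to input(), and it is assumed to contain the expected number.

```python
def count_attempts(answers, secret=1524):
    """Count how many answers are consumed until secret is entered."""
    answers = iter(answers)
    attempts = 0
    while True:                 # the body runs at least once
        x = int(next(answers))  # stands in for int(input())
        attempts += 1
        if x == secret:         # exit test placed at the end of the body
            break
    return attempts


print(count_attempts(["12", "99", "1524"]))
```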
Example 3.28. We return to the display of the first 3000 multiples of 3 without performing any multiplication.
# Declaration of the variable i
i = 3
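The listing above is incomplete in this version of the text. One possible way to carry out the idea, given here as a sketch and not as the original program, is to obtain each multiple from the previous one by adding 3. The function name multiples_of_three is hypothetical, and we assume, as the initial value i = 3 suggests, that the first multiple is 3.

```python
def multiples_of_three(count):
    """Return the first count positive multiples of 3 using additions only."""
    multiples = []
    i = 3                      # first multiple
    while len(multiples) < count:
        multiples.append(i)
        i += 3                 # next multiple obtained by addition
    return multiples


first = multiples_of_three(3000)
print(first[:5], "...", first[-1])
```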
4 Structures in the R language
4.1 Assignment and Data Types in R
Just like Python, the R language also uses dynamic typing.
> a <- 4
> 50.8 -> b
> c = "100"
It is important to note the position of the variable identifier and its value. When we use the "->"
operator, the value is on the left of the operator and the identifier is on the right.
Also, it is recommended to avoid using the "=" operator to assign a value to a variable, because
it can be confused with the "symbol=value" constructs used in function calls. R style conventions
recommend the "<-" or "->" operators for assignment.
Example 4.2.
> d <- 19
> e <- "Gloria"
> f <- "Maelys"
• The character "." can be used inside a variable identifier. The use of accented letters in object
names may be allowed depending on the linguistic environment of the computer. However, it is
strongly recommended to avoid this practice, as it can harm the portability of the code (Goulet
(2023)).
• A variable identifier cannot start with a digit. Also, if it starts with a dot, the second character
cannot be a digit.
• R is case sensitive: it distinguishes between upper and lower case letters. Thus OPTIMALL,
OptiMaLL, Optimall and optimall are different variables.
• For a variable with a compound name, it is advisable to use the so-called "camelCase" case style,
which consists of writing a set of words by linking them without spaces or punctuation, and by
capitalizing the first letter of each word except the first letter of the object name.
Example 4.3. We will use perCapitaIncome to express per capita income and exchangeRate to express
exchange rate when programming in R.
R also has certain keywords that are reserved and used in the language. Table 5 gives the list of
R-specific keywords.
Table 5: List of R keywords.
In R, there are no constants in the strict sense. To define a constant in R, it is common to
use the function const to indicate that the value should not be modified. However, it is important to
note that this use of const does not strictly prevent the value from being changed. Rather, it serves
as a hint to other programmers that the value should not be changed intentionally.
Example 4.4.
Remark 4.1. Note that the use of variables and constants may vary depending on specific programming
conventions and program needs. The example above demonstrates commonly used approaches, but it’s
important to understand that, like in Python, variables can be changed and constants are not strictly
immutable in R.
• The boolean type, which takes the value TRUE or FALSE (all uppercase letters); these values can be
abbreviated as T and F.
4.1.3 Casting
You can switch from one type of object to another. To do so, we must use a technique called
casting or type conversion.
In R, casting is often explicit, which means that you have to specify the type of data
you want to convert to. Additionally, type conversions can fail if the data is not compatible
with the target type: for instance, as.numeric("abc") returns NA and issues a warning.
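For comparison with the Python part of this paper, the sketch below shows explicit casting in Python, where an incompatible conversion raises a ValueError exception. The helper name to_int_or_none is hypothetical and only illustrates one way of handling the failure.

```python
def to_int_or_none(text):
    """Convert text to int, returning None when the conversion fails."""
    try:
        return int(text)
    except ValueError:
        return None


print(float("50.8"))          # string to real number
print(str(19))                # integer to string
print(to_int_or_none("abc"))  # incompatible data: the error is caught
```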
Remark 4.2. Note that to know the type of object (of the variable) in R, we use class().
Operator                                    R
Addition                                    +
Subtraction                                 -
Multiplication                              *
(Real) division                             /
Exponentiation                              ^
Modulo (remainder of integer division)      %%
Integer division                            %/%

Operator                                    R
Negation (logical NOT)                      !
Conjunction (logical AND)                   &&
Disjunction (logical OR)                    ||
Exclusive disjunction (logical XOR)         xor()
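To help the transition between the two languages, the following sketch lists Python counterparts of the R operators above. The operand values are arbitrary, and the use of != for the exclusive disjunction of two booleans is our own choice of idiom.

```python
a, b = 17, 5

print(a + b, a - b, a * b)   # addition, subtraction, multiplication
print(a / b)                 # real division
print(a ** b)                # exponentiation (R: ^)
print(a % b)                 # modulo (R: %%)
print(a // b)                # integer division (R: %/%)

p, q = True, False
print(not p)                 # negation (R: !)
print(p and q)               # conjunction (R: &&)
print(p or q)                # disjunction (R: ||)
print(p != q)                # exclusive disjunction on booleans (R: xor())
```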
Example 4.11. Here is an R script illustrating the operators:
Remark 4.3 (Some special values). In the R language, division by zero and division by infinity generate
specific results governed by the rules of floating-point arithmetic. Here are the results we get in R for
different divisions:
• The result Inf, which means "Infinite". It is written Inf or -Inf (for
positive and negative infinity respectively) and is obtained when dividing any non-zero number by zero.
• The result NaN, which means "Not a Number", when dividing zero by zero or infinity by infinity.
• If we divide zero by a non-zero number (even if it is infinity), the result is always zero.
[1] TRUE
> c <- 0/0
> c
[1] NaN
> is.nan(c)
[1] TRUE
> d <- Inf/Inf
> d
[1] NaN
> is.nan(d)
[1] TRUE
> e <- 0/1420
> e
[1] 0
> f <- 0/Inf
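For comparison, the same divisions can be tried in Python. The sketch below relies only on the standard math module; it illustrates that Python raises a ZeroDivisionError exception for a division by zero, while divisions involving infinity follow the same floating-point rules.

```python
import math

inf = math.inf

print(inf / inf)              # nan
print(math.isnan(inf / inf))  # True
print(0 / 1420)               # 0.0
print(0 / inf)                # 0.0

# Dividing by zero raises an exception in Python instead of returning
# an infinite or NaN value.
try:
    1 / 0
except ZeroDivisionError as exc:
    print("ZeroDivisionError:", exc)
```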
x <- readline()
where x is the identifier of the data to which a value typed on the keyboard must be assigned.
R also gives the possibility to ask the user to enter their own values from the keyboard. For this,
the syntax used is as follows:
Example 4.13. The following R script asks the user to enter their name:
Example 4.14.
4.3.2 Writing in R
In R, there are several ways to display the contents of a variable. The differences between the
functions print(), cat() and the direct display of the variable are described below.
The first display scheme is:
print(info).
Example 4.15. The Hello, world! from the 2.7 example is implemented in R as follows:
Remark 4.4. With print(info), the result is displayed preceded by the index, in square brackets, of the first element shown on the line (for example [1]).
The second display scheme is:
cat(info).
Example 4.16. With this scheme, the Hello, world! from the 2.7 example is implemented in R as
follows:
Remark 4.5. Unlike the scheme with print(), cat(info) displays the result
without the bracketed index.
The third scheme is the direct use of info. It can be the content of a variable or a constant. So,
info.
Example 4.17. The Hello, world! from the 2.7 example is implemented in R as follows:
Remark 4.6. Here too, the result will be displayed with the index of the line of the result.
As with Python, if you want to display multiple values or add text to your output, print() should
be used. Also, if you want to display a specific value at a specific time in the code, print() should
also be used.
Example 4.18. Let’s go back to the illustration of the point 2.1.1, with the calculation of the arithmetic
mean of two numbers. Figure 8 shows the R implementation.
In the preceding script, the readline() instruction is used to receive input as a string. How-
ever, to perform arithmetic calculations, we need to convert these strings to real numbers using
as.numeric(x1) and as.numeric(x2). So, the instruction combines these two steps into one to make
it easier to convert the user’s input to the desired type:
variableName <- as.type_target(readline(["msg"]))
where type_target is the target type, different from character, of the variable variableName.
Example 4.19. Figure 9 illustrates the change made to the program of Figure 8.
To display the contents of variables by inserting them into a string, there are two commonly used
methods: (i) writing with a simple display and (ii) writing with a formatted display. These two
approaches are illustrated respectively in the code examples shown in Figures 10 and 11.
Example 4.20. Figure 10 illustrates the case of display without formatting.
In the script in Figure 10, we used variables in the displayed message separated by commas (,)
as we would with Python. However, the displayed value has three decimal places after the decimal
point, which is the default number of decimal places.
Figure 8: Calculation of the arithmetic mean of two numbers with R.
Figure 10: Calculation of the arithmetic mean of two numbers and display by inserting the
values of the variables in writing without formatting.
Example 4.21. Figure 11 illustrates the case of the display in formatted writing.
In this script, we used the sprintf statement and format specifiers to format values according
to different criteria. These format specifiers can be combined with additional options to control field
width, padding, alignment, etc. Here are some cases of other commonly used format specifiers:
Figure 11: Calculation of the average of two numbers and display by inserting the values of the
variables in formatted writing.
where p is the condition and s the sequence of instructions to be executed if the condition is
satisfied.
if (p) {
  s
} else {
  t
}
Example 4.22. Taking Example 2.12, we have the following implementation:
if (x >= 0) {
  val_abs <- x
} else {
  val_abs <- -x
}
The format of the previous alternative structure can be simplified, especially when it comes to
performing a simple assignment. The general format of this simplification is:
if (p1) {
  s1
} else if (p2) {
  s2
} else if (p3) {
  s3
} [...] else {
  t
}
Example 4.24. The example 2.14 in Figure 12 shows an implementation of this structure in R.
x <- 0
i <- 1
while (x != 1524) {
  cat("Please enter the number.\n")
  x <- as.integer(readline())
  i <- i + 1
}
In R, the Until loop follows the same general pattern as its implementation in Python:
repeat {
  if (p) {
    break
  }
  s
}
with s a sequence of instructions and p the iteration control predicate.
x <- 0
i <- 0
repeat {
  cat("Please enter the number.\n")
  x <- as.integer(readline())
  i <- i + 1
  if (x == 1524) {
    break
  }
}
Example 4.27. We return to the display of the first 3000 multiples of 3 without performing any multiplication.
# Declaration of variable i
i <- 3
Table 8: Comparisons of data types.
Table 13: Alternative structures.
Pseudocode       Python       R
If p Then        if p:        if (p) {
    s.               s            s
End If                        }

If p Then        if p:        if (p) {
    s                s            s
Else             else:        } else {
    t.               t            t
End If                        }
6 Conclusion
The objective of this paper was to present in a simple way the main primitive data structures and
control flows in Python and R.
Python and R are programming languages that provide primitive data structures, such as variables
and constants, to represent integers, booleans, reals, characters, and complex numbers. Both of these
languages provide the ability to perform mathematical operations, comparisons, and logical opera-
tions on these basic data types. Nevertheless, it is important to point out that there are significant
distinctions between Python and R in the way they handle these data structures.
Python is distinguished by its clear and readable syntax, which facilitates the manipulation of
primitive data types. It also offers advanced features for converting and manipulating character
strings. In contrast, R is specifically designed for statistical analysis and has more powerful features
for working with vectors and arrays of data. R offers vectorized operations that simplify calculations
and transformations on datasets. Also, when it comes to complex numbers, Python and R support
mathematical operations on these numbers, but R offers more advanced functionality for complex
number calculations and manipulations thanks to its native support for this data type.
We also covered control flows such as sequential structures, alternative structures, and loop
structures. We have observed that Python and R use very similar control flows, but Python’s syntax
is more readable and less verbose than R’s. In contrast, R favors a syntax suitable for processing
large amounts of data. Moreover, R offers specific constructs such as switch() for multiple-choice
selection and repeat for Until-style loops. Python has no direct equivalent of repeat and only gained
a comparable selection statement (match) in version 3.10, but both can be easily emulated.
In our next paper, we will introduce advanced control flows called sub-algorithms, which further
structure the code and make it more modular and reusable.
References
Böhm, C. and Jacopini, G. (1966). Flow diagrams, Turing machines and languages with only two
formation rules. Communications of the ACM, 9(5):366–371.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2022). Introduction to Algorithms. MIT
Press and McGraw-Hill, 4th edition.
Darmangeat, C. (2008). Algorithme et Programmation pour les non-matheux. Cours complet avec
exercices, corrigés et citations philosophiques. Université Paris 7.
Goodrich, M. T., Tamassia, R., and Goldwasser, M. H. (2013). Data Structures and Algorithms in
Python. John Wiley & Sons, New York.
Prakash, P. K. S. and Rao, A. S. K. (2016). R Data Structures and Algorithms. Packt
Publishing, Birmingham.
Ramshaw, L. (1988). Eliminating go to’s while preserving program structure. Journal of the ACM,
35(4):893–920.
Sedgewick, R. and Wayne, K. (2011). Algorithms. Princeton University Press, Princeton, 4th edition.
Venables, W. N., Smith, D. M., and R Core Team (2023). An Introduction to R, Notes on R: A
Programming Environment for Data Analysis and Graphics. Version 4.3.0.