3 Python vs R Procedures and Functions
Abstract
This paper presents the notions of procedures and functions, as well as their implementation
using Python and R programming languages. In addition, the paper addresses certain notions of
functional programming, such as currying, presenting them in an informative way. We emphasize
the importance of understanding these concepts to develop effective and well-structured
programs. The paper also highlights the similarities and differences between Python and R
instructions, giving readers a comparative perspective.
Keywords: Procedure; Function; Functional programming; Python language; R language.
∗
Researcher in Optimization and MAchine Learning Laboratory (OPTIMALL), DR Congo.
E-mail : g.kamingu@unikin.ac.cd.
1 Introduction
As programs grow more complex, it becomes increasingly laborious to write them by directly
implementing the corresponding algorithms. It is therefore useful to decompose the problem into
independent sub-problems and solve each one on its own. This is where the notion of subroutine
(or sub-algorithm) comes in, a powerful tool in programming (Knuth, 1997a). Indeed, a judicious
use of subroutines, in particular within the structured programming approach, can considerably
reduce the development and maintenance costs of a complex program while improving its quality
and reliability. This is why the syntax of many programming languages includes the possibility
of writing and using subroutines.
While Alan Turing (1945) had already introduced the concept of subroutines in an article
discussing design proposals for the NPL ACE computer, the fundamental idea of subroutines
emerged from the work of John Mauchly and Kathleen Antonelli on ENIAC (Dasgupta, 2014). However,
the formalization of this concept took place during a symposium held at Harvard in January 1947
(Mauchly, 1982). David Wheeler¹, along with Maurice Wilkes and Stanley Gill, is generally
credited with the invention of subroutines (called closed sub-routines) around 1951. They also
provided the initial explanation of how to design software libraries (Wilkes et al., 1951;
Wheeler, 1952).
The rest of the paper is structured as follows. Section 2 presents the notion of subroutine,
Section 3 presents procedures and functions in Python, and Section 4 presents procedures and
functions in R. Finally, Section 6 discusses the elements of comparison between the two
languages before concluding.
2 Subroutines
2.1 Notion of subroutine
As an algorithm becomes more complex, it is more likely to perform the same or similar
processing in several places (Darmangeat, 2008). Hence the use of subroutines, also called
sub-programs (or sub-algorithms). Consider, for example, an algorithm that computes the
arithmetic mean of ten real numbers ten times, or an algorithm that asks for a "Yes" or "No"
answer twenty times, once for each of twenty different questions.
To solve this kind of problem, the repeated processing is isolated, together with the
instructions that compose it, in a separate module. This approach has several advantages,
including:
• Simpler algorithms: calling a subroutine avoids reproducing the same instructions in several
places.
• Division of work: the parts of a large algorithm can be developed independently, which saves
a significant amount of time.
• Readability and maintainability: because the algorithm is modular, errors are easy to locate,
and a single modification in the right place takes effect for the whole algorithm.
The body of the algorithm is then called the main algorithm (program) or main procedure. There
are two types of subroutines: procedures and functions:
1. David John Wheeler (1927 – 2004), a British computer scientist, the first in the world to obtain a doctorate
in Computer Science (in 1951). At the time, most academics associated with computer science research had
advanced degrees in Mathematics.
• The procedures, which are subroutines executed by a calling algorithm, which generally generate
changes in the values of certain variables or return several values (instead of only one) or no
value on each call.
• The functions, which are subroutines that must return one and only one value each time they
are called to the calling algorithm.
In most programming languages, the declaration (and therefore the definition) of a subroutine usually
includes :
• a keyword (procedure or function) in the case of a subroutine in a language clearly distinguishing
the different forms of subroutine ;
• the subroutine identifier (name given by the programmer to the subroutine) ;
• the description of parameters indicating for each one (if there are parameters) :
• the parameter identifier (name),
• the (explicit or implicit) type of the parameter (in the case of a subroutine in a typed
language).
2.2 Procedures
The declaration of a procedure is done as follows :
Here par1, par2, ..., parn represent the n parameters of the procedure and par_type1,
par_type2, ..., par_typen designate their respective types.
The execution (or the call) of a procedure can be triggered by using its identifier and, if necessary,
by providing its parameters, separated by commas. This action triggers the execution of the instruc-
tions that make up the procedure. When a procedure is called, the program interrupts its normal
flow of execution, executes the instructions specified in the procedure, and then returns to the calling
program to continue execution from the next instruction.
It is possible to have a procedure without any parameters, as shown in the following example :
#The procedure
Procedure greeting()
Write “Hello !”
End Procedure greeting
The above algorithm defines a procedure named greeting which displays the message “Hello !”. Then,
in the main algorithm, the procedure greeting() is called. So, when the algorithm is executed, it
displays the message “Hello !”.
Example 2.2. Still in the same context, we can modify the algorithm to ask the user for their first name.
#The procedure
Procedure greeting()
Write "Enter your first name :"
Read firstname
Write "Hello", firstname, " !"
End Procedure greeting
The pseudocode above defines a procedure called greeting(). This procedure asks the user to enter
their first name, reads the value entered and then displays the message “Hello [first name] !” where
[first name] represents the first name entered by the user. In the main algorithm, the greeting pro-
cedure is called. So when the algorithm is run, it asks the user for their first name and then displays
the corresponding greeting.
Example 2.3. Let’s modify the previous algorithm so that the procedure greeting() takes one
parameter:
#The procedure
Procedure greeting (firstname as string)
Write "Hello", firstname, " !"
End Procedure greeting
greeting(your_firstname)
End
The previous algorithm allows the user to enter their first name, then calls the greeting() procedure
with that first name as a parameter. The greeting() procedure then displays the message "Hello"
followed by the user’s first name, with an exclamation mark at the end.
Consider another example where the procedure has more than one parameter.
Example 2.4 (Multiplication table). Let’s assume we want to display a specific part of the multipli-
cation table of a given number (which we will call "base"), by displaying each multiplication within a
range of operations defined by two numbers (a starting number and an ending number). Here is the
corresponding algorithm :
When the procedure tableMulti() is called with the arguments 4, 1 and 12, as in the call
tableMulti(4, 1, 12) in the main algorithm, the algorithm displays a specific fragment of the
multiplication table for the number 4. Specifically, it displays the products of each integer
from 1 to 12 by 4, together with the corresponding results.
Note that if there are several parameters, the arguments must be provided, when calling the
procedure, in the same order as the corresponding parameters (also separated by commas). The
first argument is assigned to the first parameter, the second argument to the second parameter,
and so on, up to the n-th argument, which is assigned to the n-th parameter.
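The positional matching rule described above can be observed with a minimal Python sketch. The function name describe_range and its return value are assumptions introduced only for illustration; they do not come from the paper's own listings.

```python
def describe_range(base, starting, ending):
    """Return the value each parameter received, keyed by parameter name."""
    return {"base": base, "starting": starting, "ending": ending}


# The first argument goes to `base`, the second to `starting`, the third to `ending`.
bindings = describe_range(4, 1, 12)
print(bindings)

# Swapping two arguments is accepted silently but changes the meaning of the call.
swapped = describe_range(1, 4, 12)
print(swapped)
```

The second call shows why the order matters: nothing fails, but the base and the starting value have traded places.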
2.3 Functions
The declaration of a function is done as follows :
# Body of the function
Sequence of instructions
...
Return return_value
End Function function_name
Here par1, par2, ..., parn designate the n parameters of the function and par_type1,
par_type2, ..., par_typen designate their respective types.
Since a function is designed to return a value, it is necessary to specify the type of the
function, which corresponds to the type of return_value. This type is called return_type.
The execution of a function can be initiated by using its identifier and, if necessary, by supplying
its parameters. This action triggers the execution of the instructions that make up the function.
Example 2.5.
# Variable declaration
Var x, y, s as integer
# The function
Function sum (num1 as integer, num2 as integer) as integer
Var sum_result as integer
sum_result ← num1 + num2
Return sum_result
End Function sum
This algorithm performs a sum operation between two numbers entered by the user. It uses a function
called sum(), which takes two integer parameters and returns their sum. In the main algorithm, it
asks the user to enter two numbers, then calls the sum() function with those numbers as arguments.
The result of the function is then stored in a variable s. Finally, the algorithm displays the message
"The sum of x and y is : s".
A function differs from a procedure in that its call can appear by name within an expression
or an assignment, as in the following example:
Example 2.6.
# Variable declaration
Var x, y, z, result as integer
# The function sum()
Function sum (num1 as integer, num2 as integer, num3 as integer) as integer
Var sum_result as integer
sum_result ← num1 + num2 + num3
Return sum_result
End Function sum
This algorithm declares the integer variables x, y, z and result. It defines a function sum() which takes
three integer parameters and returns the sum of these three numbers. In the main algorithm, it asks
the user to enter three numbers, then it calls the sum() function with those numbers as arguments
and stores the result in the result variable. Finally, it subtracts 22 from the result and displays the
final result.
Two main types of parameter passing are used, each with distinct uses: (i) pass by value; and
(ii) pass by reference.
From a syntax perspective, in pass by value, nothing is placed before the parameter when declaring
the subroutine.
Example 2.7.
From a syntax perspective, in pass by reference, the keyword "Var", similar to the one used
in variable declarations, is placed in front of the formal parameter in the declaration of the
subroutine.
Example 2.8.
In the pass-by-value algorithm above, the procedure incrementer() receives a copy of the value
of the variable x, so modifying the value inside the procedure does not affect the original
value of x. Therefore, displaying x at the end returns the original value, which is 5. In
contrast, in the pass-by-reference algorithm, the procedure incrementer() receives a reference
to the variable x, so modifying the value inside the procedure directly modifies the original
value of x. Displaying x at the end therefore returns the modified value, which is 6.
Note also that pass by value protects the caller’s data, but it can be slower because the data
is copied and therefore occupies memory twice (it remains well suited to simple variables).
Pass by reference, on the other hand, offers fast access to the data and a smaller memory
footprint, because only addresses are passed. However, it exposes the caller’s data to
unintended modification and requires some understanding of how the data is stored in memory.
2.5 Scope of a variable
Definition 2.1. The scope of a variable refers to the area where this variable is visible and accessible.
• A variable declared outside the body of a subroutine (i.e., in the declaration section of the main
algorithm) is called global variable or globally scoped variable. It is accessible from anywhere in
the algorithm, including from procedures and functions. It exists for the lifetime of the program.
In the previous example, the x variable is a global variable that can be used both in the proce-
dure add() and in the main algorithm. The y variable is a local variable that is declared and
used only in the procedure add(). So in the procedure we can access both the global variable x
and the local variable y.
• A variable declared in a subroutine is called local variable or variable with local scope. It is only
accessible to the procedure in which it is defined and is not accessible to other procedures. The
lifetime of a local variable is limited to the execution of the procedure.
End Procedure display
In this example, we have both a global variable x and a local variable x in the procedure
display(). When we access the variable x inside the routine, we are referring to the local
variable of the same name. On the other hand, when we access the variable x outside the
procedure, we are referring to the global variable. Thus, using the same variable name creates
a distinction between the global variable and the local variable.
It can be called only by the main algorithm or by other nested subroutines, directly or indirectly,
in the same main algorithm.
Example 2.11. Let’s take the example of a function that calculates the triple of a number. This program
can be written in a simple way using an operation of multiplication by 3, or it can be formulated as
follows :
The function outer() takes an argument x of type integer and also returns an integer. Inside
this function we find another function called inner(), which takes an argument y of type
integer and returns an integer. The function inner() simply returns the product of x and y.
Then, inside the function outer(), we use the Return instruction to return the result of calling the
function inner() with the argument 3. Hence, the function outer() returns the result of multiplying
x by 3.
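The nested-subroutine description above can be sketched in Python as follows. This is only one possible rendering of the outer()/inner() pair described in the text, written under the assumption that both work on integers.

```python
def outer(x: int) -> int:
    # inner() is defined inside outer() and can read outer's parameter x.
    def inner(y: int) -> int:
        return x * y

    # outer() returns the result of calling inner() with the argument 3.
    return inner(3)


print(outer(7))  # 21
```

Because inner() exists only inside outer(), it cannot be called from the main program, which matches the restricted visibility of nested subroutines.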
Recursion provides the ability to define an infinite set of objects using a finite statement. Similarly, a
finite recursive program can describe an infinite number of computations, even if it contains no explicit
loops (Wirth, 1976).
Example 2.12. Suppose we want to write a function that calculates the factorial of an integer n. As a
reminder, the factorial of a number, denoted n!, is calculated as follows :
\[ n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1. \tag{1} \]
In other words:
\[ n! = n \times (n-1)!. \tag{2} \]
This function can also be defined equivalently as follows:
\[ n! = \begin{cases} 1 & \text{if } n = 0; \\ n \times (n-1)! & \text{if } n > 0. \end{cases} \tag{3} \]
This definition says that if n is zero, the result is 1. Otherwise, if n is greater than zero, the result
is n multiplied by the factorial of (n − 1). Hence the following algorithm :
The function facto() takes an integer parameter n and also returns an integer. It uses a conditional
structure to determine if n equals zero. If so, that means we reach the base case or the termination
case, because the factorial of zero is defined as equal to 1, in other words, the recursion chain is broken.
Without its presence, the algorithm cannot terminate.
When n is not equal to zero, this indicates that we are in the general case or the propagation
case. In this case, we need to perform the recursive computation. The function returns the result of
the multiplication of n by the recursive call of the function facto() with the parameter n − 1. This
approach reduces the initial problem to a smaller one by recursively computing the factorial of n − 1
until reaching the base case.
In summary, a recursive function is defined by one or more base cases, where the function
produces a direct result without calling itself, and one or more propagation cases, where the
function calls itself to solve a smaller problem. Neither kind of case constitutes a complete
definition on its own: designing a recursive function requires specifying both the base cases
and the propagation cases to fully define its behavior.
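As an illustration of the base case and the propagation case, here is a minimal Python sketch of the function facto() described above. The check for negative input is an assumption added for safety; the text only defines the factorial for n ≥ 0.

```python
def facto(n: int) -> int:
    # Assumption: negative inputs are rejected, since n! is only defined for n >= 0.
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    # Base case: the recursion chain stops here.
    if n == 0:
        return 1
    # Propagation case: reduce the problem to the factorial of n - 1.
    return n * facto(n - 1)


print(facto(5))  # 120
```

Without the base case, each call would trigger another one until Python raised a RecursionError.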
Consider another illustration: the Euclidean algorithm, one of the oldest known algorithms
(Knuth, 1997b), is described in Book VII (Propositions 1-3) of Euclid’s Elements, written
around 300 BC, where it is presented in the form of anthyphairesis (Euclide, 1994).
Example 2.13 (Euclidean algorithm). A common formulation of this algorithm, presented in the work
of Cormen et al. (2004), is as follows : the GCD (Greatest Common Divisor) of two integers a and
b can be calculated using the relation gcd(a, b) = gcd(b, r) , where r is the remainder of the Euclidean
division of a by b, i.e. r = a mod b.
Thus, the Euclidean algorithm can be rewritten as follows:
• If b = 0, the algorithm terminates and returns the value a: this is the base case.
• Otherwise, the algorithm calculates the remainder r of the Euclidean division of a by b, then
repeats the process, replacing a by b and b by r (i.e. a ← b and b ← r). We then apply the
relation gcd(a, b) = gcd(b, r): this is the propagation case.
Function gcd(a, b as integers) as integer
    If b = 0 then
        Return a
    Else
        Return gcd(b, a mod b)
    End If
End Function gcd
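The pseudocode above translates almost line by line into Python. The sketch below assumes non-negative integer inputs, as in the description of the algorithm.

```python
def gcd(a: int, b: int) -> int:
    # Base case: when b is 0, a is the greatest common divisor.
    if b == 0:
        return a
    # Propagation case: gcd(a, b) = gcd(b, a mod b).
    return gcd(b, a % b)


print(gcd(48, 18))  # 6
```

For the call gcd(48, 18), the successive calls are gcd(18, 12), gcd(12, 6) and gcd(6, 0), which returns 6.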
Anonymous functions have their origin in the lambda calculus, a mathematical system developed
by Alonzo Church in 1936 (Fernandez, 2009). In this system, all functions are anonymous, which means
that they do not have explicit names assigned to them. Alonzo Church, an American mathematician,
computer scientist, logician and philosopher, introduced the concept of anonymous functions as part
of his pioneering work in the field of computability theory before the advent of electronic computers.
By the way, the terms "lambda abstraction", "lambda function", and "lambda expression" refer to
the notation for the function abstraction in the lambda calculus, where the usual function f (x) = M
would be written (λx.M ) (M is an expression that uses x).
Where Expression represents the expression, i.e. the value or the calculation, whose result should be
returned.
Example 2.14.
For simplicity of writing, we can write the general format of anonymous functions as follows :
Example 2.15. The anonymous function in the example 2.14 can be rewritten in a simplified way as
follows :
The use of anonymous functions is a matter of style. This is never the only way to solve a problem,
because each anonymous function could instead be defined as a named function and called by name.
For the purposes of this document, we will only use anonymous functions to perform currying.
However, it is important to note that anonymous functions are limited in complexity and
functionality. They are generally used for simple operations and are not intended to replace
functions defined by name when advanced functionality is required.
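To make the equivalence between anonymous and named functions concrete, here is a small Python sketch. In Python, an anonymous function is written with the lambda keyword and is restricted to a single expression; the names square, square_named and words are illustrative assumptions.

```python
# Anonymous version: the function object is created by a lambda expression.
square = lambda x: x * x


# Named version: same behaviour, defined with def.
def square_named(x):
    return x * x


print(square(4), square_named(4))  # 16 16

# A typical short-lived use: passing a lambda as a sort key.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda word: len(word))
print(by_length)  # ['fig', 'pear', 'banana']
```

Both versions compute the same result; the anonymous form is mainly convenient when the function is used once and does not deserve a name.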
Anonymous functions are suited to functionality that does not need a name, possibly for
short-term use. A notable example is currying².
Definition 2.5 (Currying). Currying is a technique that consists of transforming a function with several
parameters into a sequence of functions that each take a single parameter.
Example 2.16. Consider the following function to calculate the sum of three real numbers :
The function sum_c() takes a parameter x of real type and returns an anonymous function taking a
parameter y of real type. This anonymous function in turn returns another anonymous function taking
a parameter z of real type. Inside this last anonymous function, the sum x + y + z is calculated and
returned.
The main idea is that each call of the function sum_c() with a parameter x returns a function
which will take the next parameter (y), then this function will return another function which will take
the last parameter (z) and calculate the final sum.
#Main algorithm
Begin
2. The term “currying” is a concept used by logician and mathematician Haskell B. Curry (1980), although
Moses Schönfinkel developed it six years earlier. It is also known as "Schönfinkelisation" (Heim and Kratzer,
1998). The precise origin of the term is unclear, but it is mentioned that Christopher Strachey may have coined
it in his 1967 lecture notes (Turner, 1997). However, the word itself does not appear in these notes. John C.
Reynolds (1972) then defined Currying in an article, without claiming invention of the term. The principle of
currying also dates back to the mathematical work of Frege in 1893 (Quine, 1967; Turner, 1997)
# Calling parameters one by one
step1 = sum_c(5)
step2 = step1(-3)
result = step2(5.0)
Write result
End
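The curried sum described above can be sketched in Python with nested lambda expressions. This is a minimal illustration that reuses the name sum_c() and the argument values 5, -3 and 5.0 from the text; it is not necessarily the form of the author's own listing.

```python
def sum_c(x: float):
    # Each level takes one parameter and returns a function waiting for the next one.
    return lambda y: lambda z: x + y + z


# Supplying the parameters one by one.
step1 = sum_c(5)
step2 = step1(-3)
result = step2(5.0)
print(result)  # 7.0

# Supplying the whole parameter sequence at once.
print(sum_c(5)(-3)(5.0))  # 7.0
```

The intermediate values step1 and step2 are ordinary function objects, so a partially applied sum can be stored and reused with different remaining arguments.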
#Main algorithm
Begin
# Calling the parameter sequence at once
sum_c(5)(-3)(5.0)
End
Here’s an explanation of the previous main program:
• sum_c(5) is the initial call to the function sum_c() with the argument x equal to 5. It
returns an anonymous function that takes a parameter y and in turn returns another anonymous
function.
3 Procedures and functions in Python
3.1 Notion of procedure and function in Python
In Python, the term "function" is used to encompass both strict functions (which return a value)
and procedures (which do not). Python uses the same def statement to define both types. Here is the
general syntax for defining a function in Python :
def function_name(par1, par2, ..., parn):
    # Declaration of local variables
    variable1 = variable1_value
    variable2 = variable2_value
    # ...
    variablen = variablen_value
    # Body of the subroutine
    # Sequence of instructions
    # ...
    return return_value
With:
• def function_name(par1, par2, ..., parn): is the declaration of the function, where we
specify the name of the function and its parameters. Python does not require the types of the
parameters or of the returned value to be declared; they can optionally be documented with
type annotations.
• Sequence of instructions: is where the code that will be executed when the function is
called is written.
• return return_value: is the return statement, which returns the specified value as the
result of the subroutine.
It is possible to choose any identifier to name a subroutine in Python, provided that you do not use
the reserved words of the language (see the paper by Kamingu (2022)). Also, it is recommended not
to use any special or accented characters (except the underscore “_”), and to adopt the snake_case
style of case, just like for variable names in Python.
Example 3.1.
It is important to note that, just like for loops and conditionals, the indentation of the body of a
procedure is mandatory.
If we later call the procedure greeting() in the script, the message Hello! will be displayed.
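A procedure of this kind can be sketched in a few lines of Python. The version below is an illustrative assumption based on the description above, not necessarily the author's exact listing.

```python
def greeting():
    # The body must be indented; an unindented body raises an IndentationError.
    print("Hello!")


# Calling the procedure displays the message; since there is no return
# statement, the call itself evaluates to None.
returned = greeting()
print(returned)  # None
```

The value None returned by the call is how Python represents a procedure in the sense of Section 2: a subroutine that acts without returning a result.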
However, calling a procedure defined without parameters with an argument results in an error,
as shown in the following example:
Example 3.2.
Similar to the previous section, we can modify the program to include the user’s first name.
Example 3.3.
The version of the example 3.3 with one parameter can be written as follows :
Example 3.4.
Let’s go back to the example of writing a program displaying a fragment of the multiplication table
that we mentioned earlier (example 2.4). Here is the corresponding script :
Example 3.5 (Python multiplication table).
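def tableMulti(base, starting, ending):
    # Display the products of base by each integer from starting to ending.
    n = starting
    while n <= ending:
        print(n, "x", base, "=", n * base)
        n += 1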
After calling the procedure tableMulti(4, 1, 12), the result obtained is the following :
Remark 3.1. In the script of example 3.5, we used the assignment statement n += 1 instead of
n = n + 1. The += notation is an augmented (combined) assignment operator in Python: it updates
the value of a variable by adding another value to it and storing the result in the same
variable. More generally, the statement x Opt= y is equivalent to x = x Opt y, where x is a
variable, y is a variable or an expression, and Opt represents an arithmetic operation such as
Opt ∈ {+, -, *, /, %}.
Example 3.6. The following function can be used to calculate the sum of two numbers :
3.5 Subroutine parameters in Python
3.5.1 Passing parameters in Python
In Python, arguments are passed by object reference (sometimes called pass by assignment): when
an object is passed to a subroutine, the parameter becomes a new name for that same object, and
no copy is made. Whether a change made inside the subroutine is visible outside it therefore
depends on the type of object passed as a parameter. Two cases can be distinguished:
1. Immutable objects (such as integers, reals, strings and tuples³) cannot be modified in
place. Assigning a new value to the parameter inside the subroutine only rebinds the local
name to another object, so the original object outside the subroutine is unaffected. In this
respect they behave as if they were passed by value.
Example 3.7.
2. Mutable objects⁴ (such as lists and dictionaries) can be modified in place. When a mutable
object is passed as an argument, in-place changes made to it inside the subroutine are
reflected in the original object outside the subroutine.
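The two cases can be observed with a short Python sketch. The function names rebind() and append_item() and the variable values are assumptions chosen for illustration.

```python
def rebind(n):
    # Assigning to n only changes the local name; the caller's variable is untouched.
    n = n + 1
    return n


def append_item(items):
    # append() modifies the list object in place, so the caller sees the change.
    items.append(99)


x = 5
rebind(x)
print(x)  # 5

values = [1, 2, 3]
append_item(values)
print(values)  # [1, 2, 3, 99]
```

The integer x keeps its value because integers cannot be modified in place, whereas the list values is changed because the subroutine and the caller share the same list object.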
Example 3.8.
In [46]: times(2, 3)
Out [46]: 6
Out [47]: 16.430045000000003
The * operator in Python can process different data types, such as integers, floating-point
numbers, strings and lists. This means that our function times() performs different tasks
depending on the types of the supplied arguments: it multiplies two numbers, but repeats a
string or a list when the other argument is an integer. This flexibility can sometimes lead to
unexpected results in your programs.
It is generally preferable for each argument to have a specific, clearly defined type rather
than to accept arbitrary input. For example, it is clearer and less error-prone to state
explicitly which types the function times() accepts and to reject the others. This helps ensure
consistent results and makes the code more robust and predictable.
The function times() can thus be modified so that it requires at least one of its arguments x
and y to be numeric.
Example 3.9.
TypeError: At least one of the arguments must be numeric.
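One way to obtain this behaviour is sketched below. The use of numbers.Number for the numeric test is an assumption; the author's listing may check the types differently.

```python
from numbers import Number


def times(x, y):
    # Reject the call when neither argument is numeric.
    if not (isinstance(x, Number) or isinstance(y, Number)):
        raise TypeError("At least one of the arguments must be numeric.")
    return x * y


print(times(2, 3))     # 6
print(times("ab", 2))  # abab

try:
    times("a", "b")
except TypeError as error:
    print("TypeError:", error)
```

The check happens before the multiplication, so the caller gets an explicit message instead of Python's generic error for multiplying two strings.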
Let’s go back to the example 3.3. We can customize the greeting as in the following example :
Example 3.10.
In the definition of a subroutine in Python, parameters without default values must be placed
before parameters with default values. This follows from the way arguments are associated with
parameters when the subroutine is called: positional arguments are matched to parameters in the
order in which the parameters are defined. With the parameters without default values first,
Python associates them with the first arguments provided in the call; the parameters with
default values are then associated with the remaining arguments, if any are provided, and keep
their default values otherwise.
In this example, if we placed parameter2, which has a default value, before parameter1, which
has none, the definition would cause a syntax error, because Python could not map the arguments
provided in a call to the corresponding parameters unambiguously. Respecting this order ensures
a consistent and predictable association between arguments and parameters.
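The ordering rule can be checked with the following Python sketch. The parameter names first_name and message and the sample values are assumptions; the rejected definition is compiled from a string so that the rest of the script still runs.

```python
def greeting(first_name, message="Hello"):
    # first_name has no default value, so it comes first.
    return f"{message} {first_name}!"


print(greeting("Ada"))                  # Hello Ada!
print(greeting("Ada", "Good evening"))  # Good evening Ada!

# Putting the parameter with a default value first is rejected when the
# definition is compiled, before any call takes place.
bad_definition = 'def greeting(message="Hello", first_name):\n    pass\n'
try:
    compile(bad_definition, "<sketch>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)  # True
```

The error is raised at definition time, which is why the text describes it as a syntax error rather than a runtime error.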
3.5.3 Arguments with labels in Python
In most programming languages, arguments must be supplied in the same order as the
corresponding parameters in the definition of a subroutine. Python offers greater flexibility
in this regard: a subroutine can be called with its arguments in any order, provided that the
name of the corresponding parameter is explicitly specified for each of them (keyword
arguments). Parameters that have default values can moreover be omitted from the call.
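A minimal sketch of labelled (keyword) arguments follows. The function table_multi(), its default values 1 and 10, and the choice to return the lines instead of printing them are assumptions made so that the example is easy to check.

```python
def table_multi(base, starting=1, ending=10):
    # Build one line of the multiplication table per value of n.
    return [f"{n} x {base} = {n * base}" for n in range(starting, ending + 1)]


# Arguments supplied by position, in the order of the definition.
by_position = table_multi(5, 3, 14)

# The same call with labelled arguments, supplied in a different order.
by_keyword = table_multi(starting=3, ending=14, base=5)

print(by_keyword[0])              # 3 x 5 = 15
print(by_position == by_keyword)  # True
```

Labelling the arguments makes the call self-documenting and removes the dependence on parameter order.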
Let’s go back to the example of writing a program that displays a fragment of the multiplication
table that we mentioned earlier. We can modify it in the following way :
n = starting
while n <= ending:
    print(n, "x", base, "=", n * base)
    n += 1
After calling the procedure tableMulti(starting=3, ending=14, base=5), the result obtained is
the following :
In [25]: # Declaration of a global variable
x = 5
Remark 3.2. In Python, it is possible to define a subroutine capable of modifying a global variable
using the instruction global. This statement is used inside the subroutine definition to indicate which
variables should be treated as global variables. This allows these global variables to be accessed and
modified within the subroutine.
12
In [451]: f (5)
Out [451]: 54
Example 3.18 (Currying in Python).
return ( return_value )
}
With:
• function_name <- function(par1, par2, ..., parn): is the declaration of the function, where
we specify the name of the function and its parameters. R does not require the types of the
parameters or of the return value to be declared.
• Sequence of instructions: is where the code that will be executed when the function is
called is written.
• return(return_value): is the return statement, which returns the specified value as the
result of the function.
It is possible to choose any identifier to name a function in R, provided that you do not use the
reserved words of the language (see Kamingu (2022)). In addition, it is recommended not to use any
special or accented characters, and to adopt the camelCase style of case, just like for variable names
in R.
4.2 Procedures without parameters
In R, it is possible to create a procedure without parameters. This can be illustrated with the
following example :
Example 4.1.
It is important to note that in R, just like for loops and conditionals, indenting the body of a function
or procedure is optional due to the use of braces.
If we later call the procedure greeting() in the script, the message Hello! will be
displayed.
Example 4.2.
The previous program can be modified to include the user’s first name.
Example 4.3.
# The procedure
greeting <- function() {
  firstName <- readline(prompt = "Input your first name: ")
  print(paste("Hello", firstName, "!"))
}
The version of the example 4.3 with one parameter can be written as follows :
Example 4.4.
# The procedure
greeting <- function(firstName) {
  print(paste("Hello", firstName, "!"))
}
Here is the script corresponding to the previous example where we wrote a program to display a
fragment of the multiplication table (example 2.4).
tableMulti <- function(base, starting, ending) {
  n <- starting
  while (n <= ending) {
    print(paste(n, "x", base, "=", n * base))
    n <- n + 1
  }
}
After calling the procedure tableMulti(4, 1, 12), the result obtained is the following :
4.4 Functions in R
In R, functions are structured like procedures, but they return a value to the caller. This
can be illustrated by the following example, which calculates the sum of two numbers:
Example 4.6.
Example 4.7.
Assigning a value to the variable x, calling the subroutine incrementer, and then displaying the
variable produces the following result:
> # The main program
> x <- 5
> incrementer(x)
> print(x)
[1] 5
The same holds for objects of type data.frame and matrix. R uses copy-on-modify semantics: when a
data frame or a matrix is passed to a function as a parameter, changes made to the object inside
the function apply to a local copy and are not reflected outside of it, unless the modified object
is returned and reassigned by the caller. The exception concerns objects with reference semantics,
such as environments, whose modifications inside a function are visible outside of it.
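For comparison, the following minimal Python sketch repeats the same experiment on the Python side. The names increment and append_item are hypothetical and chosen for illustration only. It suggests that rebinding a parameter inside a function leaves the caller's variable unchanged, whereas modifying a mutable object in place is visible to the caller.

```python
def increment(n):
    # Rebinding the local name n does not affect the caller's variable.
    n = n + 1
    return n


def append_item(items):
    # In-place mutation of a mutable object is visible to the caller.
    items.append(99)


x = 5
increment(x)          # the returned value is deliberately discarded
data = [1, 2, 3]
append_item(data)

print(x)     # 5
print(data)  # [1, 2, 3, 99]
```

To keep the incremented value, the caller has to reassign it explicitly, for instance with x = increment(x).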
Take the example of the greeting subroutine. We can customize the greeting message as shown
below :
Example 4.8.
greeting <- function(firstName, message = "Hello") {
  print(paste(message, firstName, "!"))
}
It is important to note that, by convention, parameters without default values are placed before
parameters with default values in the subroutine definition in R. This convention follows from the
way arguments are associated with parameters when the subroutine is called: arguments supplied by
name are matched first, and the remaining arguments are then matched by position. By respecting
this order, the first arguments provided during the call are associated with the parameters that
have no default value, and the parameters with default values are associated with the remaining
arguments, if provided. This approach ensures a consistent mapping between arguments and
parameters, making the subroutine easier to understand and use.
In this example, if we placed parameter2, with its default value, before parameter1, R would not
report a syntax error, but a positional call with a single argument would assign that argument to
parameter2 and leave parameter1 missing. Placing parameters without default values before
parameters with default values keeps the association between arguments and parameters correct and
predictable when the subroutine is called.
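On the Python side, this ordering is enforced by the parser itself. The sketch below is illustrative only: greeting mirrors the R example, while greeting_reversed and the first name "Ada" are hypothetical. The reversed signature is kept as source text and passed to the built-in compile so that the script itself still runs.

```python
def greeting(first_name, message="Hello"):
    """Parameters without default values come first."""
    return f"{message} {first_name} !"


# Hypothetical reversed signature, kept as text so that this script remains valid.
reversed_source = 'def greeting_reversed(message="Hello", first_name):\n    pass\n'

try:
    compile(reversed_source, "<reversed_signature>", "exec")
    rejected = False
except SyntaxError:
    # Python refuses a parameter without a default after one with a default.
    rejected = True

print(greeting("Ada"))             # Hello Ada !
print(greeting("Ada", "Welcome"))  # Welcome Ada !
print(rejected)                    # True
```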
Let’s take the example of the multiplication table that we mentioned earlier and adapt it to the
R language :
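# The procedure (header reconstructed; assumed parameter order: base, starting, ending)
tableMulti <- function(base, starting, ending) {
  n <- starting
  while (n <= ending) {
    print(paste(n, "x", base, "=", n * base))
    n <- n + 1
  }
}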
After calling the procedure tableMulti(starting=3, ending=14, base=5), the result obtained is
the following :
# Declaration of a global variable
x <- 4
In R, an ordinary assignment (<-) inside a subroutine creates a local variable and therefore does
not modify a global variable. However, a similar effect can be achieved with the superassignment
operator (<<-), or by returning the new value and assigning it to the global variable.
> display()
[1] 12
> print(x)
[1] 12
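For comparison, a minimal Python sketch of the same situation is given below. The function names display_local and display_global are hypothetical, and the tripling of x is an assumption chosen only so that the values match the R session above. In Python, an assignment inside a function creates a local variable unless the name is declared with the global statement.

```python
x = 4  # global variable


def display_local():
    # Without a global declaration, this assignment creates a local x.
    x = 12
    return x


def display_global():
    # The global statement makes the assignment target the module-level x.
    global x
    x = x * 3
    return x


local_result = display_local()
x_after_local = x                  # the global x is still 4 here
global_result = display_global()

print(local_result, x_after_local)  # 12 4
print(global_result, x)             # 12 12
```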
gcd <- function(a, b) {
  if (b == 0) {
    return(a)
  } else {
    return(gcd(b, a %% b))
  }
}
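The R function gcd above computes the greatest common divisor recursively, following Euclid's algorithm. As a point of comparison, a Python counterpart could be sketched as follows; the test values 48, 18, 17 and 5 are arbitrary.

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm, written recursively."""
    if b == 0:
        return a              # base case: gcd(a, 0) = a
    return gcd(b, a % b)      # recursive case: gcd(a, b) = gcd(b, a mod b)


print(gcd(48, 18))  # 6
print(gcd(17, 5))   # 1
```

The only notable differences from the R version are the modulo operator (% instead of %%) and the use of indentation instead of braces.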
f <- function(x) {
  return(2 * x^2 + 4)
}
5 Comparison of Functions and Procedures concepts in Python and R
Below is a comparative table of the notions of procedure and function in Python and R :
Table 2 – Comparison of Functions and Procedures concepts in Python and R.
Notion               Python                                  R
Definition           def procedure_name(parameters):         procedureName <- function(parameters) {
                         instructions                            instructions
                                                             }
Call                 procedure_name(arg1, arg2)              procedureName(arg1, arg2)
Return value         return value                            return(value)
                                                             Unlike Python, the use of the keyword
                                                             return is not necessary, because the
                                                             return value is determined by the last
                                                             evaluated expression. A procedure has an
                                                             environment that contains its local
                                                             variables.
Passing parameters   Arguments are passed by object          Arguments are passed by value
                     reference: in-place changes to a        (copy-on-modify): changes made inside
                     mutable argument are visible to the     the function are not visible to the
                     caller.                                 caller.
Local variables      local_variable = value                  localVariable <- value
Global variables     global global_variable                  globalVariable <<- value
Anonymous function   lambda arguments: expression            function(arguments) {
                                                                 expression
                                                             }
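The Python column of the table can be exercised with a short sketch. It is an illustration only: the function f mirrors the R function f shown earlier, and the name g is hypothetical. It shows a named definition with an explicit return, a call, and the equivalent anonymous function.

```python
def f(x):
    """Named function with an explicit return statement."""
    return 2 * x ** 2 + 4


# Anonymous counterpart: a lambda holds a single expression,
# whose value is returned implicitly.
g = lambda x: 2 * x ** 2 + 4

values = [f(v) for v in (0, 1, 3)]
same = all(f(v) == g(v) for v in range(10))

print(values)  # [4, 6, 22]
print(same)    # True
```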
6 Conclusion
In conclusion, procedures and functions are essential elements of structured programming: they
make it possible to solve problems by dividing them into smaller, reusable tasks. Both Python and R
allow data scientists to program with procedures and functions easily, which makes code easier to
organize and reuse and improves project efficiency and maintainability.
In addition, Python and R support functional programming, a programming paradigm particularly
well suited to Data Science applications. This paradigm offers powerful concepts for manipulating
and analyzing data, as well as for developing sophisticated algorithms.
In a later paper, we shall also discuss how to group these functions and/or procedures together to
form modules, providing an even more structured and modular approach to programming.
References
Cormen, T., Leiserson, C., Rivest, R., and Stein, C. (2004). Introduction à l’Algorithmique. Cours et
exercices. Dunod, Paris, 2nd edition.
Curry, H. B. (1980). Some Philosophical Aspects of Combinatory Logic. Studies in Logic and the
Foundations of Mathematics, 101:85–101.
Darmangeat, C. (2008). Algorithme et Programmation pour les non-matheux. Cours complet avec
exercices, corrigés et citations philosophiques. Université Paris 7.
Dasgupta, S. (2014). It Began with Babbage: The Genesis of Computer Science. Oxford University
Press, Oxford.
Davies, T. M. (2016). The Book of R - A First Course in Programming and Statistics. William
Pollock, San Francisco.
Euclide (1994). Les Éléments, volume 2. Livres V à IX. Notes et commentaires par Bernard
Vitrac. PUF, Paris.
Kamingu, G. L. (2022). Python vs. R pour Data Scientists : Structures algorithmiques. Série Optimall
Python vs. R pour Data Scientists, 001(003).
Mauchly, J. W. (1982). Preparation of problems for EDVAC-type machines. In Randell, B., editor,
The Origins of Digital Computers, pages 393–397. Springer.
Quine, W. V. O. (1967). On the building blocks of mathematical logic. In van Heijenoort, J., edi-
tor, A Source Book in Mathematical Logic, 1879–1931, pages 355–366. Harvard University Press,
Cambridge, MA. Translated by Stefan Bauer-Mengelberg.
Turing, A. M. (1945). Report by Dr. A.M. Turing on proposals for the development of an Automatic
Computing Engine (ACE): Submitted to the Executive Committee of the NPL in February 1946.
Reprinted in Copeland, B. J., ed. (2005). Alan Turing’s Automatic Computing Engine. Oxford
University Press, Oxford.
Wheeler, D. J. (1952). The use of sub-routines in programmes. In Proceedings of the 1952 ACM
national meeting, page 235, Pittsburgh. ACM.
Wilkes, M. V., Wheeler, D. J., and Gill, S. (1951). Preparation of Programs for an Electronic Digital
Computer. Addison-Wesley.