

INFORMATION TECHNOLOGY

2022 - 2023
DevSecOps : Empowering IT
Collaboration and Customer Success

Written by: Ahmed Ben Hmida

Academic Supervisor: Mr. KHECHANA Fares

Professional Supervisor: Mr. Ahmed Belhoula


Dedication

I dedicate this work to:

To my dear Mom, Amira, who has been my rock and cheerleader throughout this crazy
project. Your unending encouragement and your belief in me, even when I doubted myself,
carried me through. I owe it all to your good heart.

To my Dad, Adel, who never failed to remind me that hard work and determination
always pay off. Your strong sense of discipline and resilience has been my guiding light in
this journey.

To my brother, Oussema, who had my back and kept me smiling through the toughest
parts of this work. It was your jokes, your understanding, and your unflinching support
that kept me sane during the most intense parts of this project.

And to my buddy, Amine, whose friendship was the greatest gift during this journey.
For every late-night study session, every brainstorming coffee break, and every time you
believed in me, I say thank you.

This project is more than just my achievement—it’s a tribute to all of you: to your
love, your guidance, and your unwavering belief in me.

Acknowledgment

I would like to express my deepest appreciation and gratitude as I acknowledge and commend
all those who, through their unique contributions, played a pivotal role in enabling me to
complete this internship successfully.

The work presented in this graduation project was carried out within the company RidchaData.
I would like to express my gratitude and thanks to RidchaData’s CEO, Mr. Ridha Chamem,
for giving me the opportunity to carry out this project and for providing me with a modern
working atmosphere and all the equipment necessary to achieve the internship goals.

I would like to express my sincere appreciation and thanks to Mr. Ahmed Belhoula, my
supervisor at RidchaData, for his marvelous support during this internship and for his belief
and trust in my ideas and decisions. Thank you for your availability and for reviewing and
validating this work.

I am deeply grateful to have you by my side.

I want to express my deepest gratitude to Mr. KHECHANA Fares, my academic
supervisor at ESPRIT, for his effort and interest in my work.

I’m also delighted to thank the jury members for reviewing my report and for attending
my graduation defense.

Contents

Introduction 1

1 Project context 2

1.1 The host company: RIDCHA DATA . . . . . . . . . . . . . . . . . . . . . 2

1.1.1 Presentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.1.2 Domains of Operation . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.1.3 Expertise and Services . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.1.4 Innovation and Talent Sourcing . . . . . . . . . . . . . . . . . . . . 4

1.1.5 Company Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.3 The existing solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Proposed Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.5 Work Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5.1 Agile In DevOps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5.2 Introduction to Kanban Methodology . . . . . . . . . . . . . . . . . 7

1.5.3 Principles of Kanban: . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5.4 Kanban Board and Workflow Visualization . . . . . . . . . . . . . . 8

1.5.5 Kanban in DevOps and Software Development . . . . . . . . . . . . 8

1.5.6 Kanban vs. Scrum: Key Differences . . . . . . . . . . . . . . . . . . 9


2 Needs Analysis and Specifications 10

2.1 DevSecOps approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.2 Advantages of DevSecOps . . . . . . . . . . . . . . . . . . . . . . . 11

2.1.3 CI/CD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.1.4 Advantages of CI/CD . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.2 Terraform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2.1 Advantages of Terraform . . . . . . . . . . . . . . . . . . . . . . . 13

2.3 GitOps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3.1 Advantages of GitOps . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.4 Kubernetes and Helm Charts . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.4.1 Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.4.2 Helm Charts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.5 ArgoCD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.6 Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.6.1 Advantages of monitoring in DevOps: . . . . . . . . . . . . . . . . . 17

2.7 Comparative study of tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.7.1 Version management tools . . . . . . . . . . . . . . . . . . . . . . . 18

3 Analysis & Conception 23

3.1 Identification of Actors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.2 Project Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.2.1 Terraform Modules : . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.3 Functional and non-Functional needs . . . . . . . . . . . . . . . . . . . . . 27

3.3.1 Functional Needs : . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.2 Non-Functional Needs : . . . . . . . . . . . . . . . . . . . . . . . . . 28



4 Achievement 30

4.1 Environment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4.2 Gitlab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.2.1 Pipelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.2.2 Terraform Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.3 GitOps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.3.1 ArgoCD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.4 Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4.1 Prometheus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4.2 Grafana . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Conclusion 62
List of Figures

1.1 RIDCHA DATA Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.1 DevSecOps process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.1 Terraform Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.2 Project Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.3 Helm Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.4 Grafana and Prometheus Architecture . . . . . . . . . . . . . . . . . . . 26

4.1 Project GCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.2 Gitlab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.3 First Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.4 Second Pipeline 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.5 Second Pipeline 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.6 Second Pipeline 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.7 gcp-auth.yml 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.8 gcp-auth.yml 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.9 Kaniko Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.10 Custom Service Account . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.11 Service Account Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . 38


4.12 JOB APPLY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4.13 Cloud Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

4.14 Cloud Function Job apply . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4.15 BigQuery Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.16 main.tf of bigQuery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.17 Pipeline BigQuery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.18 Document AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.19 Document AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.20 Document AI Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4.21 Subscription . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.22 Publisher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.23 PUB/SUB Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.24 GKE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.25 GKE Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.26 Workload Identity Federation . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.27 Main.tf OF THE Workload Identity Federation . . . . . . . . . . . . . . . 49

4.28 Job Wif . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4.29 ArgoCD Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.30 Infra Of my GitOps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.31 Helm Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.32 Main Application (root) . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.33 Application folder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.34 Helm file of application in argoCD . . . . . . . . . . . . . . . . . . . . . . . 53

4.35 Helm file of Values of argoCD . . . . . . . . . . . . . . . . . . . . . . . . . 54

4.36 Helm File Of AppProject . . . . . . . . . . . . . . . . . . . . . . . . . . . 55



4.37 Main Dash Of argoCD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

4.38 ArgoCD application (chatGPT) 1 . . . . . . . . . . . . . . . . . . . . . . . 56

4.39 ArgoCD application (chatGPT) 2 . . . . . . . . . . . . . . . . . . . . . . . 57

4.40 Prometheus logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.41 Prometheus Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.42 Grafana logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

4.43 Grafana Dashboard 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

4.44 Grafana Dashboard 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

4.45 Grafana Dashboard 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61


List of Tables

1.1 Kanban vs. Scrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.1 Comparison of GitLab and GitHub . . . . . . . . . . . . . . . . . . . . . . 18

2.2 Comparison of GitLab and Jenkins . . . . . . . . . . . . . . . . . . . . . . 19

2.3 Comparison of Terraform and Chef . . . . . . . . . . . . . . . . . . . . . . 19

2.4 Comparison of Argo CD and Flux CD . . . . . . . . . . . . . . . . . . . . 20

2.5 Comparison of Grafana and Kibana . . . . . . . . . . . . . . . . . . . . . . 21

2.6 Comparison of Kubernetes and Docker Swarm . . . . . . . . . . . . . . . 21

Acronyms

CI Continuous Integration

CD Continuous Delivery

DevOps Development and Operations

SIP Service Integration Platform

UML Unified Modeling Language

ODM Object Document Mapper

CTA Call To Action

SDK Software Development Kit

JWT Json Web Token

AI Artificial Intelligence

Introduction

DevSecOps, an acronym for Development, Security, and Operations, is a philosophy
or framework that integrates security practices into the DevOps process. It is sometimes
referred to as "security as code".

The goal of DevSecOps is the involvement of security teams and considerations from
the beginning of the development cycle. The central principle is to incorporate security
measures as a fundamental part of the DevOps pipeline, rather than as an afterthought at
the end of the development process.

DevSecOps involves continuous, agile collaboration between release engineers, security


teams, and operations staff. Automated processes are used to ensure fast, consistent re-
sults. This shift aims to make everyone involved in the development process accountable
for security, encourage a more proactive approach to potential vulnerabilities, and ensure
rapid response times in the event of a breach.

This model is designed to increase the speed of software development and deployment
and reduce the risk of security issues. By bringing together developers, security experts
and operations staff, the goal is to create a more secure and efficient system for delivering
software. Core DevSecOps practices include continuous integration, continuous delivery,
automated builds, and infrastructure as code.

Chapter 1

Project context

Introduction

First, we present the host company, RIDCHA DATA, where we had the opportunity to work
on this project. Then, we describe the problem statement, the existing situation, and the
proposed solution. Finally, we present the methodology that we are going to adopt in order
to reach our end goal.

1.1 The host company: RIDCHA DATA

1.1.1 Presentation

RIDCHA DATA is an innovative digital services company that positions itself as a leader
in engineering and technology consulting. The organization operates as a subsidiary of the
RD Group, which focuses on the Information Technology (IT) and Healthcare sectors.


Figure 1.1: RIDCHA DATA Logo

In the next part, we’ll present the company’s Domains of Operation.

1.1.2 Domains of Operation

The company operates in two main areas: IT and Healthcare. They maintain a customer-
centric approach, treating each customer as unique, enabling them to provide tailored
services that meet each customer’s specific needs.

In what follows, we present the Expertise and Services.

1.1.3 Expertise and Services

RIDCHA DATA’s services are organized around three main areas of expertise and three
specific offers. Their areas of expertise include:

• Technical and operational expertise (DevOps/CloudOps/SecOps/DataOps)

• Organizational expertise (PMO/TestOps/CP/DP)

• Functional expertise (AMOA/MOA)

Their service offers include:

• DevOps and Cloud Services

• TestOps

• DataOps

1.1.4 Innovation and Talent Sourcing

The company prides itself on its ability to invent, develop, deliver, and rapidly scale dis-
ruptive innovations for its clients. RIDCHA DATA is also recognized for its talent-sourcing
capabilities, offering improved responsiveness, optimized skills, and competitive financial
terms. We move now to present the company Policy

1.1.5 Company Policy

The company’s policy is geared towards achieving general objectives, developing and re-
taining its human resources, and preparing the company to meet future challenges and
satisfy its clients.

1.2 Problem Statement

During my internship, our main focus was to improve the efficiency and organization of the
data team’s work. To achieve this, I took on the task of developing infrastructure as code
using Google Cloud Platform (GCP). This approach aimed to simplify and streamline their
workflows. At the same time, we aimed to improve the practicality and usefulness of the
deployment process in Kubernetes (k8s), while addressing any issues and errors that arose.
Another critical aspect we focused on was the implementation of effective monitoring and
an alert manager to ensure the stability of the system.

Let us now delve into the details of the existing situation that motivated this
end-of-studies project.

1.3 The existing solution

Prior to implementing our solution, the data team developers faced several challenges and
inefficiencies in their workflow. Their work involved manually setting up and configur-
ing infrastructure in Google Cloud Platform (GCP), which was a time-consuming and
error-prone process. Each deployment to Kubernetes (k8s) required manual intervention
and lacked a standardized approach, leading to inconsistencies and potential deployment
errors. Troubleshooting and identifying issues in the system was a tedious task, as develop-
ers had limited visibility into infrastructure and application performance. This resulted in
longer resolution times and hindered their productivity. In addition, without an effective
monitoring and alerting system, the team had to rely on reactive approaches where prob-
lems were only detected after they had already affected the system. Overall, prior to our
solution, the developers’ work was characterized by manual and labor-intensive processes,
a lack of standardization, limited visibility, and inadequate means of proactive problem
detection and resolution.

In the next section, we’ll present the proposed solution that we’re going to work on
during this end-of-studies internship.

1.4 Proposed Solution

For our proposed solution, we used a combination of powerful tools to address the challenges
we faced. Terraform, an infrastructure-as-code tool, played a key role in automating the
setup and configuration of the data team’s infrastructure on the Google Cloud Platform
(GCP). This allowed us to define and deploy resources consistently and efficiently, ensuring
a seamless workflow for the team.

We also integrated ArgoCD, a declarative continuous delivery tool, to simplify and


streamline the deployment process in Kubernetes (k8s). With ArgoCD, we achieved a

practical and user-friendly approach to managing application deployments, allowing the


data team to iterate and roll out updates seamlessly.

To ensure the stability and performance of the system, we implemented Grafana and
Prometheus. Grafana, a powerful monitoring and visualization platform, provided real-
time insight into infrastructure and application performance, enabling the team to proac-
tively identify and resolve issues. Prometheus, an open-source monitoring system, comple-
mented Grafana by collecting and storing metrics and alerting the team to anomalies or
potential problems.

By leveraging the capabilities of Terraform, ArgoCD, Grafana and Prometheus, we


successfully transformed the data team’s working environment. Our solution not only sim-
plified their workflows through infrastructure as code but also improved the deployment
process in Kubernetes while addressing and resolving issues and bugs. Furthermore, the
implementation of robust monitoring and alerting mechanisms ensured the stability and re-
liability of the system, allowing the data team to perform their work with greater efficiency
and confidence.

1.5 Work Methodology

In a professional environment, it is essential to adopt an effective methodology to be applied


among the members of the team.

1.5.1 Agile In DevOps

Agile methodology is essential for DevOps teams because it promotes collaboration, flexi-
bility, and continuous improvement. It breaks tasks down into smaller increments, encour-
ages cross-functional teamwork and emphasises transparency. Agile enables frequent code
integration and delivery, reducing time to production. Retrospectives drive learning and
refinement. Overall, Agile empowers DevOps teams to deliver value quickly and adapt to
changing needs.

This process defines four core values[1] to address these changes:

• Individuals and interactions: This value prioritizes the team over processes and
tools, since even the best tools are worthless in the wrong hands. The right team is
vital to success.

• Working software: Working software is valued over comprehensive documentation,
since a product that actually works for its users is the primary measure of progress.

• Customer collaboration: With this value, we favour collaboration with the customer
over contract negotiation and build a continuous feedback loop with them. This helps
in ensuring that the product works as intended for them.

• Responding to change: In every project, requirements are always shifting and


priorities are changing. That’s why we use a flexible roadmap to reflect these changes
whenever we need to.

Through these values, we have the definition of a philosophy that makes working on projects
much more efficient and successful. To apply this philosophy, we need a set of tools and
resources that we can use.
In the next section, we’ll talk about Kanban as an agile framework.

1.5.2 Introduction to Kanban Methodology

The Kanban methodology is a visual and flexible approach to work management. It orig-
inated in the manufacturing industry and has been widely adopted in various fields, in-
cluding software development. Kanban focuses on visualizing the workflow, limiting work
in progress (WIP) and optimizing the flow of work to deliver value efficiently. By provid-
ing real-time visibility of tasks and bottlenecks, Kanban enables teams to make informed
decisions and continuously improve their processes.

1.5.3 Principles of Kanban:

The principles of Kanban revolve around six core concepts: visualizing workflow, limiting
WIP, managing flow, making policies explicit, implementing feedback loops, and continuous

improvement. By visualizing the workflow on a Kanban board, teams gain transparency


and can easily identify areas that need attention. Limiting WIP ensures that work is not
overloaded, allowing teams to focus on completing tasks effectively. Flow management in-
volves minimizing bottlenecks and optimizing the movement of work through the system.
Making policies explicit helps to establish clear guidelines and expectations. Feedback
loops allow teams to learn from experience and make necessary adjustments. Continu-
ous improvement is at the heart of Kanban, encouraging teams to refine their processes
incrementally.

1.5.4 Kanban Board and Workflow Visualization

A Kanban board is a visual representation of the workflow, typically consisting of columns


representing different stages of work. As tasks or user stories move from one stage to
another, they are represented as cards and progress through the columns. This visualization
provides a clear and shared understanding of work status, bottlenecks, and overall progress.
It facilitates collaboration, enables stakeholders to track the status of the project, and
promotes transparency and accountability.

1.5.5 Kanban in DevOps and Software Development

Kanban is widely used in DevOps and software development environments. Its flexibility
and adaptability make it suitable for managing complex and dynamic projects. Kanban
enables teams to respond to changing requirements, accommodate unplanned work and
maintain a steady flow of value delivery. It aligns well with agile principles and complements
DevOps practices by fostering collaboration, transparency, and continuous improvement
across development and operations teams.

1.5.6 Kanban vs. Scrum: Key Differences

Kanban and Scrum are two of the most popular agile methodologies, each with unique
features. While Scrum follows fixed-length iterations (sprints) and emphasizes time-boxed
planning and delivery, Kanban focuses on optimizing flow and visualizing the workflow
without fixed iterations. Kanban uses a pull system, whereas Scrum assigns tasks during
sprint planning. Kanban’s WIP limits aim to balance and optimize work allocation, while
Scrum encourages full utilization of the team’s capacity. Roles and responsibilities differ
between the two methodologies, and they also vary in terms of change management, meet-
ings, customer collaboration, and documentation practices. It is important to take into
account the specific demands of the project and the dynamics of the team when choosing
between Kanban and Scrum.
Aspect | Kanban | Scrum
Focus | Flow of work | Time-boxed iterations
Planning | No fixed iterations or sprints | Fixed-length iterations (sprints)
Work Allocation | Pull system | Tasks assigned during sprint planning
Work Management | Visualizing workflow on a Kanban board | Backlog and sprint planning
Work in Progress (WIP) | Limits WIP to optimize flow | Encourages full utilization
Roles and Responsibilities | No prescribed roles | Defined roles (e.g., Scrum Master, Product Owner, etc.)
Change Management | Adapts to changes in real time | Changes accommodated in the next sprint
Meetings | Continuous improvement and retrospectives | Daily stand-ups, sprint planning, review, and retrospective
Customer Collaboration | Continuous collaboration | Collaboration during sprint planning and review

Table 1.1: Kanban vs. Scrum

Conclusion

In this first chapter, we begin with a presentation of our host organization. We then move
on to the analysis of the current situation and the proposed solution. Finally, we conclude
with the methodology adopted for our project.

In the next chapter, we will describe the concepts of DevOps and carry out a compar-
ative study to choose the appropriate tools.
Chapter 2

Needs Analysis and Specifications

Introduction

In this chapter, we begin by highlighting the various basic concepts required for the im-
plementation of our project. We then proceed with a comparative study of the various
existing tools in order to synthesize our technological choices.

2.1 DevSecOps approach

2.1.1 Definition

DevSecOps stands for Development, Security and Operations. This is a culture or practice
that incorporates security into every step of the software development cycle. In traditional
development processes, security was often considered at the end of the project or as a sepa-
rate process altogether. In DevSecOps, security considerations and controls are integrated
from the initial design and development stages through to deployment in production.


2.1.2 Advantages of DevSecOps

Working with the DevSecOps model provides the team with the following benefits:

• Quickly identify vulnerabilities: By integrating security measures from the get-go,


DevSecOps helps to quickly identify and resolve security issues, reducing the risk of
potential data breaches or system compromises.

• Faster Recovery: When a vulnerability is found or a breach occurs, DevSecOps


integrated approach allows for faster response and recovery, as security is a shared
responsibility.

• Cost efficiency: It is usually less costly to fix problems during the development process
than to fix them after production. Therefore, embedding security throughout the
development lifecycle can result in significant cost savings.

• Improved compliance: By incorporating safety measures through the development


cycle, organizations can better meet regulatory compliance standards.

• Enhanced collaboration: DevSecOps promotes greater collaboration and shared


accountability across development, security, and operations teams.

2.1.3 CI/CD

CI/CD stands for Continuous Integration and Continuous Deployment. Continuous
Integration (CI) is a development practice where developers merge code into a shared
repository frequently, often several times a day. Each integration can then be verified using
an automated build and automated tests. The aim is to identify and fix integration errors
quickly and to minimize the time spent resolving merge conflicts. Continuous Deployment
(CD) is the process of automatically pushing changes to production systems, ensuring that
the software can be reliably released at any time.

Figure 2.1: DevSecOps process

2.1.4 Advantages of CI/CD

• Increased efficiency: Automation of the build and testing process increases team
productivity and efficiency.

• Improved code quality: Frequent integration and testing can quickly detect bugs
and facilitate troubleshooting, improving overall code quality.

• Faster delivery speed: With CD, updates and enhancements can be shipped to end
users more frequently and reliably.

• Risk mitigation: Through smaller and more frequent updates, issues can be detected
and rolled back more quickly, greatly reducing the risk and impact of potential problems.

• Better collaboration: Fosters a culture of shared ownership and collaboration amongst


developers.

Continuous Integration (CI)

As mentioned above, continuous integration is a development practice in which developers


routinely merge their code modifications into a central repository where automated builds
and tests are performed. The primary objective of CI is to avoid integration issues caused
by developers working in isolation for long periods.

Continuous Deployment (CD)

As mentioned earlier, Continuous Deployment is a software development practice in which


every code change is automatically pushed through the pipeline and into production, re-
sulting in many production deployments every day. It does this reliably and securely,
making it faster and easier to release new features for users.
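To make the CI/CD workflow more concrete, the following is a minimal GitLab CI sketch;
the stage names, jobs and commands are illustrative examples rather than the project's
actual configuration:

```yaml
# .gitlab-ci.yml -- minimal CI/CD sketch (stages and commands are illustrative)
stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - echo "Compiling the application"            # placeholder build step

test-job:
  stage: test
  script:
    - echo "Running the automated test suite"     # placeholder automated tests

deploy-job:
  stage: deploy
  environment: production
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'           # only deploy from the main branch
  script:
    - echo "Deploying to the target environment"  # placeholder deployment step
```

Each push triggers the build and test stages automatically, while the deploy job only runs
for the main branch, which is the essence of the CI/CD flow described above.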

2.2 Terraform

Terraform is an open-source Infrastructure as Code (IaC) tool developed by HashiCorp. It lets
developers define and provision data center infrastructure using a high-level configuration
language called HashiCorp Configuration Language (HCL), or optionally JSON.

Terraform encodes APIs into declarative configuration files that can be shared between
team members, processed as code, edited, revised, and versioned. It supports a wide range
of service providers as well as custom in-house solutions and enables the management of a
variety of infrastructures such as public clouds, private clouds, and Software as a Service
(SaaS) with a unified workflow.

2.2.1 Advantages of Terraform

• Platform agnostic: Terraform can manage a wide range of cloud service providers
as well as custom in-house solutions, offering flexibility and reducing vendor
lock-in.

• State Management: Terraform captures the state of your infrastructure and can
generate plans to apply modifications, minimize disruptions and avoid manual errors.

• Declarative language: The declarative nature of Terraform syntax makes it easier


to understand and reflect on infrastructure design. You define what you want and
Terraform determines how you can get there.

• Modularity and reusability: Terraform code can be arranged in modules to create


reusable components, promote DRY (Don’t Repeat Yourself) principles and reduce
the risk of errors.

• Collaboration and Sharing: The infrastructure code can be monitored and shared,
allowing for collaboration, peer review, and testing of infrastructure changes.

• Efficiency: By automating infrastructure management, Terraform reduces the time


and effort it takes to build and maintain infrastructure, resulting in faster deploy-
ments and improved productivity.

2.3 GitOps

GitOps is a software development framework that implements the Git version control
system as a unique source of truth for declarative infrastructure and applications. It
applies the same principles used in version control, such as extraction requests and version
history, to managing the infrastructure. Its purpose is to provide a more effective and
collaborative means of managing deployments and operations.

2.3.1 Advantages of GitOps

• Enhanced Productivity and Speed: By using Git as the only source of truth, teams
can use the same tools and processes they use to build code to manage infrastructure,
streamline workflow, and increase productivity.

• Improved Auditability: Every change to the system is tracked in Git’s validation


history, which provides a complete version history of changes made, by who, and
when, improving auditability.

• Better Reliability and Stability: Changes are automatically tested and deployed to
the infrastructure based on Git changes. Doing so reduces the risk of human error
and increases the stability of the system.

• Easy Rollbacks: If a change causes a system failure, teams can quickly revert to a
previous state using the built-in version control features of Git.

2.4 Kubernetes and Helm Charts

2.4.1 Kubernetes

Kubernetes is an open-source container orchestration platform that automates the deploy-


ment, scale-up, and management of containerized applications. Helm, a package manager
for Kubernetes, makes it possible to define, install and update complex Kubernetes appli-
cations.

2.4.2 Helm Charts

Helm charts are packages of pre-configured Kubernetes resources. They help manage Ku-
bernetes apps by providing a way to templatize your services and setups. This allows us
to make deployments more reliable, repeatable and manageable.
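As an illustration of how a chart templatizes a service, a minimal chart could be organised
as in the hedged sketch below; the chart name, image repository and values are hypothetical:

```yaml
# Chart.yaml -- chart metadata (hypothetical chart name and versions)
apiVersion: v2
name: demo-app
description: A minimal Helm chart for a containerized service
type: application
version: 0.1.0
appVersion: "1.0.0"
```

```yaml
# values.yaml -- default values consumed by the chart templates
replicaCount: 2
image:
  repository: europe-docker.pkg.dev/my-project/demo/demo-app   # hypothetical registry path
  tag: "1.0.0"
service:
  type: ClusterIP
  port: 80
```

The files under templates/ then reference these values, so the same chart can be deployed
to different environments simply by overriding values at install time.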

2.5 ArgoCD

ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. It leverages


Git repositories as the source of truth for Kubernetes resources and the desired state of
applications.

With ArgoCD, any modification of the Git repository triggers a synchronization process
that aligns the real-time status of the Kubernetes cluster with the desired state defined
in the Git repository. It is particularly useful for managing deployments across multiple
environments, ensuring that all are synchronized and updated in an efficient and controlled
manner.
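To illustrate, a declarative ArgoCD Application resource typically looks like the sketch
below; the repository URL, chart path and namespaces are placeholders rather than the
project's actual values:

```yaml
# application.yaml -- ArgoCD Application sketch (repo URL, path and namespaces are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.com/example/gitops-repo.git   # placeholder Git repository
    targetRevision: main
    path: charts/demo-app                                 # Helm chart folder inside the repo
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: demo
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert manual changes made directly in the cluster
```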

Together, GitOps, Kubernetes, Helm and ArgoCD provide a robust framework for
managing a modern container-based infrastructure, streamlining workflows, and improving
global productivity and reliability. They allow teams to manage large, complex environ-
ments with enhanced control, visibility and ease.

2.6 Monitoring

Monitoring in DevOps is the practice of continually observing and gathering data from
various components of a software system, such as servers, applications, databases and net-
work infrastructure. It involves the use of specialized tools and techniques to monitor and
analyze system performance, identify issues or anomalies, and ensure the full operation
of the software delivery pipeline. The primary objective of monitoring in DevOps is to
provide real-time visibility of system status, enable proactive problem detection and fa-
cilitate quick remediation. By monitoring key metrics and indicators, DevOps teams can
gain valuable insights into resource utilisation, application behaviour and user experience,
enabling them to optimise performance, improve reliability and enhance overall system
stability.

2.6.1 Advantages of monitoring in DevOps:

• Quickly detect and respond to problems: Monitoring allows teams to identify
problems at an early stage, before they affect end users. By configuring alerts and
notifications, teams can be alerted immediately to abnormalities, allowing them to
react quickly and minimize downtime.

• Improve system performance: Monitoring provides valuable information on re-


source usage, application response times and other performance indicators. By an-
alyzing these data, teams can optimize resource allocation, fine-tune configurations
and identify bottlenecks, improving system performance and responsiveness.

• Improved reliability and stability: Monitoring helps to identify and resolve problems
that may cause system breakdowns or disruptions. By continually monitoring KPIs,
teams can take proactive action to manage potential risks and ensure a stable and
reliable environment for development and production systems.

Grafana:

Grafana is a powerful data visualization and analytics platform widely used in the monitoring
and observability space. It serves as a front-end tool that allows users to create
visually appealing and interactive dashboards, charts and graphs. Grafana supports a
wide range of data sources, including Prometheus, and provides an easy-to-use interface
for exploring and analysing monitoring data. With Grafana, users can easily customise
their dashboards, add different panels to display metrics and visualise data trends over
time. Its flexible querying capabilities and large plugin ecosystem make it a popular
choice for data visualization in the DevOps community.
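As an example of how Grafana can be connected to Prometheus declaratively, a data source
provisioning file along the following lines can be used; the URL is an assumption about the
in-cluster Prometheus service name:

```yaml
# datasources.yaml -- Grafana data source provisioning sketch
# (the URL assumes an in-cluster Prometheus service name)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.monitoring.svc:9090
    isDefault: true
```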

Prometheus:

Prometheus, on the other hand, is a feature-rich monitoring system and time-series database.
It is designed specifically for monitoring and alerting purposes. Prometheus follows a
pull-based model, gathering metrics and time-series data from a variety of monitored targets
at regular intervals. It stores the collected data in its own time-series database and
provides a powerful query language called PromQL. PromQL allows users to run complex
queries to retrieve specific metrics and gain insight into system performance and
behaviour. Prometheus also includes an alerting system that can send notifications based
on pre-defined conditions. This enables DevOps teams to set up customized alerts, receive
real-time notifications, and proactively take action to resolve issues quickly.
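At its core, Prometheus is driven by a configuration file that lists the targets to scrape;
a minimal sketch (the job names and targets are illustrative) looks like this:

```yaml
# prometheus.yml -- minimal scrape configuration (job names and targets are illustrative)
global:
  scrape_interval: 30s        # how often targets are scraped
  evaluation_interval: 30s    # how often alerting/recording rules are evaluated

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]          # Prometheus scrapes its own metrics

  - job_name: demo-app
    metrics_path: /metrics
    static_configs:
      - targets: ["demo-app.demo.svc:8080"]  # hypothetical application service
```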

2.7 Comparative study of tools

To accomplish this project, we needed several tools: GCP as our cloud platform, GitLab for
versioning and pipelines, Terraform for infrastructure as code, ArgoCD for GitOps, and
finally Grafana and Prometheus for monitoring.

2.7.1 Version management tools

Gitlab Vs Github

Feature | GitLab | GitHub
Hosting Platform | Self-hosted or cloud-based | Cloud-based
Pricing | Offers both free and paid options | Offers both free and paid options
Repository Types | Supports both public and private repositories | Supports both public and private repositories
Issue Tracking | Robust issue tracking system | Robust issue tracking system
Collaboration | Built-in collaboration tools | Built-in collaboration tools
Continuous Integration | Offers built-in CI/CD pipelines | Integrates with various CI/CD tools
Integration | Provides extensive integration capabilities | Provides extensive integration capabilities
Security | Strong emphasis on security and access control | Strong emphasis on security and access control
Community | Active community and open-source contributions | Active community and open-source contributions
Enterprise | Offers enterprise-level features and support | Offers enterprise-level features and support

Table 2.1: Comparison of GitLab and GitHub



Gitlab Vs Jenkins

Feature | GitLab | Jenkins
Version Control | Built-in Git repository management | Supports multiple version control systems
CI/CD | Built-in CI/CD pipelines and automation | Requires plugin installation for CI/CD
Pipeline Syntax | YAML-based configuration | Groovy-based configuration
User Interface | Web-based UI with modern design | Web-based UI with customizable themes
Community | Open-source with a large community | Open-source with a large community
Integration | Tight integration with Git and issue tracking | Extensive plugin ecosystem for various integrations
Security | Built-in security features like code scanning and vulnerability management | Plugin-based security options
Containerization | Built-in Docker container registry | Supports Docker and other container platforms
Extensibility | Customizable with integrations and extensions | Highly extensible through plugins
Scalability | Supports both small teams and large enterprises | Suitable for any team size or enterprise
Licensing | Available as both free and paid versions | Open-source and available for free

Table 2.2: Comparison of GitLab and Jenkins

Terraform Vs Chef

Aspect | Terraform | Chef
Use Case | IaC (Infrastructure as Code) | Configuration Management
Provider | HashiCorp | Chef Software Inc.
Language | HCL and JSON | Ruby and a DSL for recipes
Infrastructure | Immutable - new resources for changes | Mutable - configs apply to existing resources
Approach | Declarative - defines the final state | Imperative - defines the steps
Integration | Manages low-level and high-level components | Manages systems at OS level
Cloud Support | Multi-cloud and on-premises | Multi-cloud and on-premises
State Management | Tracks state | Does not inherently track state
Open Source | Yes | Yes
Code Reusability | Reuses modules | Reuses recipes and cookbooks
Community | Large, growing rapidly | Large, established longer
Rollback | Supports planning, allows rollback | Requires scripting

Table 2.3: Comparison of Terraform and Chef



ArgoCd Vs Flux CD

Argo CD is a GitOps continuous delivery tool for Kubernetes. It automates the deployment
of applications and configurations from Git repositories to Kubernetes clusters. FluxCD
is another GitOps tool that provides continuous delivery for Kubernetes applications, au-
tomatically synchronising container images and configurations from Git repositories to
Kubernetes clusters.

Aspect | Argo CD | Flux CD
Use Case | Kubernetes Continuous Delivery | GitOps for Kubernetes
Provider | Intuit Inc. | Weaveworks
Declarative Model | Yes | Yes
GitOps Ready | Yes | Yes
Primary Focus | Deployment and synchronization | Synchronization
Secret Management | Uses Sealed Secrets, External Secrets, or Vault | Uses SOPS
Manifest Generation | Kustomize and Helm | Kustomize, Helm and Jsonnet
Sync Strategy | Automatic or manual synchronization | Automatic synchronization
Rollback/History Support | Yes, supports rollback to previous configurations | Yes, supports rollback
Health Checks | Yes | Yes
Multi-Cluster Deployment | Yes | Yes

Table 2.4: Comparison of Argo CD and Flux CD

Grafana Vs Kibana

Grafana is an open source analytics and visualisation tool used for monitoring and ob-
servability. It integrates with various data sources and enables the creation of custom
dashboards to visualise metrics. Kibana is part of the Elastic Stack and provides powerful
visualisation capabilities for analysing and exploring data stored in Elasticsearch. It is
commonly used for log analysis and search.

Aspect | Grafana | Kibana
Use Case | Metrics and logs visualization, alerting | Log exploration, visualization, Elastic Stack interface
Provider | Grafana Labs | Elastic
Data Sources | Many, incl. Prometheus, Graphite, etc. | Primarily Elasticsearch
Visualization Types | Graphs, tables, heatmaps, etc. | Charts, maps, etc.
Alerting | Yes | Yes (via ElastAlert or X-Pack)
Dashboards | Highly customizable | Less flexible
Setup Ease | Easy to moderate | Moderate to hard
Log Exploration | Requires integration with Loki | Advanced with Elasticsearch
Cloud Support | Available | Available (Elastic Cloud)
Security | Supports Auth Proxy, LDAP, OAuth | RBAC, Spaces for isolation
Community | Large, growing | Large, established

Table 2.5: Comparison of Grafana and Kibana

Kubernetes (K8s) Vs Docker Swarm

• Kubernetes is an open source container orchestration platform that automates the
deployment, scaling and management of containerised applications. It provides a robust
ecosystem for managing distributed systems.

• Docker Swarm is a native clustering and orchestration solution from Docker. It allows
the creation of a swarm of Docker nodes and enables container orchestration across those
nodes.
Aspect | Kubernetes (K8s) | Docker Swarm
Provider | Cloud Native Computing Foundation (CNCF) | Docker Inc.
Complexity | Higher | Lower
Setup and Configuration | Can be complex to set up and configure | Easier to set up and configure
Scalability | Highly scalable | Scales well but may not handle large clusters as effectively
Networking | A bit complex; uses CNI for pod networking | Simpler networking
Load Balancing | Manual service configuration required | Automatic load balancing
Service Discovery | Uses CoreDNS for service discovery | DNS-based service discovery
Rolling Updates | Supported | Supported
Data Volumes | Supports shared volumes and storage mounting | Supports volume mounting
Logging and Monitoring | Integrates with multiple logging and monitoring tools | Limited built-in monitoring tools
Compatibility | Supports a wide range of infrastructure | More limited; best with Docker’s own platform

Table 2.6: Comparison of Kubernetes and Docker Swarm



Conclusion

In this chapter, we have presented the main concepts of our project: we defined the
DevSecOps approach, continuous integration, continuous delivery and deployment, and
concluded with monitoring. We also carried out a comparative study of the different tools
in order to justify our technological choices.
Chapter 3

Analysis & Conception

Introduction

In this section, we will detail the analysis of functional and non-functional requirements,
the overall architecture, and the project design.

3.1 Identification of Actors

• DevOps Engineer: this is our role in the project; we are responsible for managing the
infrastructure, the deployments, and the pipelines.

• Data Engineer: the members of the data team who will work on the project that we
are going to deploy.

3.2 Project Architecture

I used Terraform for infrastructure management, defining and versioning my data centre
setup in a codified way. I used Kubernetes (k8s) as my container orchestration platform
to automate the deployment, scaling and management of my containerised applications.


Figure 3.1: Terraform Architecture

3.2.1 Terraform Modules :

I created GCP resources using Terraform and built GitLab pipelines to provision what the
data team needs. These are the modules that I created:

Custom Service Account: a Terraform module, driven by a GitLab pipeline, that creates a
service account. It is used to manage IAM and to grant only the minimal roles required,
for security reasons.

Cloud Function: a Terraform module, driven by a GitLab pipeline, that creates a Cloud
Function. It is used for many purposes, such as creating alerts in GCP or supporting data
science use cases.

BigQuery: a Terraform module, driven by a GitLab pipeline, that creates BigQuery
resources. BigQuery lets you incorporate GoogleSQL functionality with software outside of
BigQuery by providing a direct integration with Cloud Functions and Cloud Run.

Document AI: a Terraform module, driven by a GitLab pipeline, that creates Document AI
resources. It is used to create seamless integrations and easy-to-use applications for your
users.

Pub/Sub: a Terraform module, driven by a GitLab pipeline, that creates Pub/Sub resources.
Pub/Sub is used in distributed systems for asynchronous communication between different
components or services.

GKE (Google Kubernetes Engine): a Terraform module, driven by a GitLab pipeline, that
creates a GKE cluster. GKE gives you complete control over every aspect of container
orchestration, from networking to storage.

Workload Identity Federation: a Terraform module, driven by a GitLab pipeline, that
creates the Workload Identity Federation setup. Using identity federation, you can grant
on-premises or multi-cloud workloads access to Google Cloud resources without using a
service account key.

ArgoCD played a critical role in my deployment pipeline, as it implemented GitOps


principles by synchronising application state from Git repositories to Kubernetes, helping
me to effectively maintain and scale my applications.

Figure 3.2: Project Architecture

Helm Charts

Figure 3.3: Helm Logo

Figure 3.4: Grafana and Prometheus Architecture

In my work, I often use Helm charts, an effective package manager for Kubernetes. Think
of it as Ubuntu’s apt or Red Hat’s yum, but designed specifically for Kubernetes
applications. A Helm chart is a bundle of files describing a set of related Kubernetes
resources, which can be as simple as a standalone pod or as complex as a full-stack
application.

During deployment, Helm combines the templates in the chart with the provided values
to create a Kubernetes manifest that is deployed to the cluster. In this way, I can customise
deployments for different environments without changing the core templates.

In my role, I use Helm charts with ArgoCD to enable GitOps workflows. The charts
are stored in a Git repository, and ArgoCD ensures that the Kubernetes cluster stays
in sync with that repo. This setup allows us to manage applications in a declarative,
version-controlled way, which is a more efficient way to handle application lifecycles and
complexity.

For monitoring, I used a combination of Prometheus and Grafana. Prometheus col-


lected and stored metrics from my applications as time-series data, while Grafana provided
an intuitive interface for querying, visualising and sharing these metrics. Overall, this ar-
chitecture efficiently managed my infrastructure, application deployment and monitoring,
while encouraging best practices such as IaC and GitOps.

3.3 Functional and non-Functional needs

3.3.1 Functional Needs :

Application Deployment:

Application deployment is a key responsibility within the DevOps domain which in-
volves the automated, reliable and consistent process of delivering project software to
users. The DevOps team uses Continuous Integration/Continuous Deployment (CI/CD)
pipelines to make sure all code changes are seamlessly and securely integrated, tested and
deployed in the production environment. This process includes managing dependencies,
ensuring compatibility across different environments, and finally pushing the application to
servers where end users can access it. By automating this process, we dramatically reduce
the risk of human error, shorten deployment cycles, and increase overall project efficiency.

Monitoring:

Monitoring in DevOps means continuous observation of application performance, user
experience, system behavior and anomalies in real time. It involves the use of tools and
applications that provide an overview of system state, resource utilization and application
performance. The objective of monitoring is to rapidly detect and diagnose system problems,
identify trends, prevent potential failures, and ensure that the system meets predefined
performance parameters. An effective monitoring strategy allows DevOps teams to
proactively manage systems and make data-driven decisions, reducing downtime and
enhancing user satisfaction.

Security:

Security within DevOps, often referred to as DevSecOps, integrates security practices


into the DevOps workflow. This means integrating security controls, audits and measures
into every step of the development and deployment process to minimize vulnerabilities and
risks. Security measures include code reviews, automated security testing, vulnerability
scanning and more. This approach makes security not a secondary consideration, but an
integral part of the software development cycle. It promotes the principle of ’security as
code’, where infrastructure security is version controlled and auditable. It also recommends
responding to threats in real time, allowing for rapid detection and correction of security
threats. The outcome is robust and secure applications that maintain data integrity and
confidentiality.

3.3.2 Non-Functional Needs :

Scalability:

Scalability refers to the system’s capacity to manage a growing amount of work, or


its potential to adapt to future growth. In the context of a DevOps project, this means
the application’s ability to accommodate a larger number of users, increased data volume
or enhanced functionality without negatively impacting the system’s performance or user
experience. Scalable application can be scaled or scaled down as required to ensure effi-
cient use of resources. This can involve load balancing, horizontal or vertical scaling, or
implementation of a microservice architecture, amongst other strategies.

Performance:

Performance, in the context of DevOps, is the ability of the system to respond quickly to
user demands and perform its functions efficiently. It encompasses various aspects such as
response time, throughput, resource utilisation and reliability under varying loads. A high-
performance system not only improves user satisfaction, but also improves the system’s
reliability and availability. Performance optimisation can involve many strategies, includ-
ing efficient code practices, database optimisation, caching strategies and the selection of
appropriate infrastructure.

Reusability:

Reusability is the ability to use existing software artifacts (such as code modules, libraries,
or frameworks) in various contexts and applications. In a DevOps environment, reuse can
dramatically simplify the development process, reduce mistakes and shorten time to market.

This is often facilitated by implementing modular programming, where software is divided


into separate modules that can perform tasks independently, but can be used together
to achieve the overall functionality of the software. Reusability also extends to DevOps
practices such as Infrastructure as Code (IaC), where common infrastructure setups are
codified and reused across different projects or environments, ensuring consistency and
reducing manual configuration efforts.

Conclusion

This chapter has allowed us to identify the stakeholders, study the functional and non-
functional requirements, and present the global architecture.
In the next chapter, we will present the realization of our work: the continuous integration
pipelines, the Terraform modules, the GitOps deployment with ArgoCD, and the monitoring
setup.
Chapter 4

Achievement

Introduction

In this final chapter, we present the work that has been done, detailing the continuous
integration tasks and the deployment of the solution, with screenshots for a good
understanding of the solution.

4.1 Environment Setup

Google Console

First of all, we have to create a GCP project that will contain all of the work and the
resources that will be created.


Figure 4.1: Project GCP

As you can see, this is the dashboard of our project, showing the name and the ID of the
project that we are going to work with.

4.2 Gitlab

GitLab is a web-based DevOps platform for software development. It offers version con-
trol, issue tracking, and project management features. With built-in CI/CD capabilities,
it enables automated testing and deployment. GitLab provides both cloud-based and self-
hosted options for flexibility and customization.

Figure 4.2: Gitlab

As you can see, I worked with GitLab to store my code and, at the same time, to host it
and run my pipelines.

4.2.1 Pipelines

First Pipeline

I worked with two principal pipelines to manage the infrastructure and create resources in
my GCP project:

Figure 4.3: First Pipeline

As you can see, this pipeline contains four stages (Validate, Plan, Apply and Destroy),
which are needed to create a resource with Terraform. This pipeline uses service account
authentication, with the key exported through the GOOGLE_APPLICATION_CREDENTIALS
environment variable.
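The exact jobs are shown in the figure above; a simplified sketch of the same four-stage
structure is given below, where the image tag, the variable name and the way the key is
injected are assumptions rather than the project's exact configuration:

```yaml
# .gitlab-ci.yml -- simplified sketch of the four-stage Terraform pipeline
# (image tag, variable name and key handling are assumptions)
image:
  name: hashicorp/terraform:1.5
  entrypoint: [""]

stages:
  - validate
  - plan
  - apply
  - destroy

before_script:
  - echo "$GCP_SA_KEY" > /tmp/sa.json            # GCP_SA_KEY: assumed CI/CD variable holding the key
  - export GOOGLE_APPLICATION_CREDENTIALS=/tmp/sa.json
  - terraform init

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=plan.tfplan
  artifacts:
    paths:
      - plan.tfplan

apply:
  stage: apply
  script:
    - terraform apply -auto-approve plan.tfplan
  when: manual          # apply only after manual approval

destroy:
  stage: destroy
  script:
    - terraform destroy -auto-approve
  when: manual
```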

Second Pipeline

This pipeline is the same as the first one, with one main change: it no longer authenticates
with a service account key, but with Workload Identity Federation, which we will discuss
next.

Figure 4.4: Second Pipeline 1

Figure 4.5: Second Pipeline 2



Figure 4.6: Second Pipeline 3

In order for this pipeline to work, it includes another pipeline called .gitlab-gcp-auth.yml,
which is used to create a temporary token and authenticate with it.
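The actual file is shown in the figures below; one common way to implement this kind of
keyless authentication with GitLab's OIDC id_tokens and the gcloud CLI is sketched here.
The project number, pool, provider and service account names are placeholders, and the job
image is assumed to contain the gcloud CLI:

```yaml
# .gitlab-gcp-auth.yml -- sketch of keyless GCP authentication with Workload Identity Federation
# (project number, pool, provider and service account are placeholders)
.gcp-auth:
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/gitlab-pool/providers/gitlab-provider
  before_script:
    - echo "$GITLAB_OIDC_TOKEN" > /tmp/oidc_token.txt
    # Exchange the GitLab OIDC token for short-lived GCP credentials
    - >
      gcloud iam workload-identity-pools create-cred-config
      "projects/123456/locations/global/workloadIdentityPools/gitlab-pool/providers/gitlab-provider"
      --service-account="terraform@my-project.iam.gserviceaccount.com"
      --credential-source-file=/tmp/oidc_token.txt
      --output-file=/tmp/gcp_credentials.json
    - export GOOGLE_APPLICATION_CREDENTIALS=/tmp/gcp_credentials.json
```

A Terraform job can then simply add "extends: .gcp-auth" instead of injecting a long-lived
service account key.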

Figure 4.7: gcp-auth.yml 1



Figure 4.8: gcp-auth.yml 2

Kaniko Pipeline

Kaniko is a tool from Google for building container images. It works without a Docker
daemon, allowing secure and reproducible image builds in Kubernetes. Compatible with GCR
and other container registries, it simplifies the containerization process for developers.
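The project's pipeline is shown in the figure below; a typical Kaniko build job in GitLab CI
looks roughly like the following sketch, where the destination registry path is a placeholder
and registry authentication is assumed to be configured elsewhere in the pipeline:

```yaml
# Kaniko job -- builds and pushes an image without a Docker daemon
# (the destination registry path is a placeholder)
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug   # the debug tag provides a shell usable by CI
    entrypoint: [""]
  script:
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "europe-docker.pkg.dev/my-project/demo/demo-app:$CI_COMMIT_SHORT_SHA"
```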

Figure 4.9: Kaniko Pipeline



4.2.2 Terraform Modules

Custom Service Account

A service account in Google Cloud Platform (GCP) is an identity that represents a non-
human entity, such as an application or virtual machine. It is used to authenticate and
authorise interactions between services and resources within GCP. Service accounts have
unique credentials that enable secure access control and management of GCP resources.
They play a critical role in automating tasks and enabling seamless integration between
services in the cloud environment.

Figure 4.10: Custom Service Account

As you can see, I have created main.tf, variables.tf, output.tf and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.
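To give an idea of what such a module can contain, here is a minimal sketch of a main.tf and of the test call; the variable names, roles and example values are assumptions, and the real module may expose more options.

# main.tf: minimal sketch of the custom service account module
resource "google_service_account" "this" {
  project      = var.project_id
  account_id   = var.account_id
  display_name = var.display_name
}

# Optional project-level roles granted to the new service account
resource "google_project_iam_member" "roles" {
  for_each = toset(var.roles)

  project = var.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.this.email}"
}

# test/main.tf: calling the module to verify it
module "service_account" {
  source       = "../"
  project_id   = "my-gcp-project"                  # hypothetical values
  account_id   = "ci-terraform"
  display_name = "CI Terraform service account"
  roles        = ["roles/storage.admin"]
}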

Figure 4.11: Service Account Pipeline

Figure 4.12: JOB APPLY

As you can see, the pipeline ran successfully and the service account has been created.

Cloud Function

Google Cloud Functions is a serverless execution environment on GCP for building and connecting cloud services. It allows developers to deploy simple, single-purpose functions
that are executed in response to cloud events without the need to manage a server or
runtime environment. These functions can be written in various programming languages
such as Node.js, Python, Go, etc. They facilitate the creation of scalable, cost-effective
applications and services.

Figure 4.13: Cloud Function

As you can see, I have created main.tf, variables.tf, output.tf and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.
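A minimal sketch of such a module could look like the following; the bucket naming, runtime and variable names are assumptions rather than the exact code used in the project.

# main.tf: sketch of the Cloud Function module
resource "google_storage_bucket" "source" {
  project  = var.project_id
  name     = "${var.project_id}-function-sources"
  location = var.region
}

resource "google_storage_bucket_object" "archive" {
  name   = "${var.function_name}.zip"
  bucket = google_storage_bucket.source.name
  source = var.source_archive_path   # local path to the zipped source code
}

resource "google_cloudfunctions_function" "this" {
  project     = var.project_id
  region      = var.region
  name        = var.function_name
  runtime     = var.runtime          # e.g. "python310"
  entry_point = var.entry_point

  source_archive_bucket = google_storage_bucket.source.name
  source_archive_object = google_storage_bucket_object.archive.name
  trigger_http          = true
}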

Figure 4.14: Cloud Function Job apply

We can see here that two resources have been created.

BigQuery

Google BigQuery is a fully managed, serverless data warehouse that enables super-fast
SQL queries by harnessing the processing power of Google’s infrastructure. It allows you
to analyse large data sets in real time by running SQL-like queries, while also providing
machine learning capabilities within the platform. BigQuery is designed to be scalable,
cost-effective and easy to use, making it ideal for organisations that need to leverage big
data analytics. It supports a wide range of data formats and integrates with various Google
Cloud services.

Figure 4.15: BigQuery Module

As you can see, I have created main.tf, variables.tf, output.tf and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.

Figure 4.16: main.tf of bigQuery

As you can see, this is the main.tf where we created the resources "google_bigquery_dataset" and "google_bigquery_dataset_iam_binding", while the variables are provided by the terraform.tfvars file.
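For illustration, such a main.tf together with a possible terraform.tfvars could look like the sketch below; the IAM role and the example values are assumptions.

# main.tf: sketch of the BigQuery module
resource "google_bigquery_dataset" "this" {
  project    = var.project_id
  dataset_id = var.dataset_id
  location   = var.location
}

resource "google_bigquery_dataset_iam_binding" "readers" {
  project    = var.project_id
  dataset_id = google_bigquery_dataset.this.dataset_id
  role       = "roles/bigquery.dataViewer"   # assumed role
  members    = var.reader_members
}

# test/terraform.tfvars: hypothetical input values
project_id     = "my-gcp-project"
dataset_id     = "analytics"
location       = "EU"
reader_members = ["group:data-team@example.com"]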

Figure 4.17: BigQuery Pipeline

Here we can see that the pipeline worked successfully.

Document AI

Google Cloud Document AI is a suite of machine learning models designed to analyze, understand, and extract valuable information from documents. It uses natural language
processing, OCR, and other AI technologies to process structured and unstructured data
in documents. It’s capable of performing tasks such as form parsing, invoice processing,
and content classification. This platform simplifies the automation of document workflows,
increasing efficiency and reducing manual effort.

Figure 4.18: Document AI

As you can see, I have created main.tf, variables.tf, output.tf and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.

Figure 4.19: Document AI

As you can see, this is the main.tf in the test folder, where we call the module and set the variables we want.
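As an illustration, the test main.tf that consumes the module could look like the sketch below; the module path, variable names, processor type and output are assumptions.

# test/main.tf: illustrative call of the Document AI module
module "document_ai" {
  source = "../"

  project_id   = "my-gcp-project"
  location     = "eu"              # Document AI location
  display_name = "invoice-parser"
  type         = "OCR_PROCESSOR"   # assumed processor type
}

output "processor_id" {
  value = module.document_ai.processor_id   # assumed module output
}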

Figure 4.20: Document AI Job

We can see here that two resources have been created.


Pub/Sub

Google Cloud Pub/Sub is a scalable, reliable, real-time messaging service that enables
asynchronous data streaming and event-driven computing. It provides a publish-subscribe
pattern where publishers categorize published messages into topics and subscribers consume them based on those topics. The service ensures at-least-once message delivery and
automatic scaling, making it suitable for distributed systems and large data scenarios.
Pub/Sub integrates seamlessly with other GCP services, enabling complex data processing
pipelines.

I have created two projects: one for the subscription and one for the publisher.

Figure 4.21: Subscription

As you can see, the subscription main.tf creates these resources.
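As a rough sketch of what such a main.tf can declare (names and settings are assumptions), a topic with an attached subscription looks like this:

# main.tf: sketch of a topic with its subscription
resource "google_pubsub_topic" "this" {
  project = var.project_id
  name    = var.topic_name
}

resource "google_pubsub_subscription" "this" {
  project = var.project_id
  name    = var.subscription_name
  topic   = google_pubsub_topic.this.name

  ack_deadline_seconds = 20   # assumed value
}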

Figure 4.22: Publisher

And here is the publisher project. As you can see, I have created main.tf, variables.tf, output.tf, and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.

Figure 4.23: PUB/SUB Job

And here you can see the resources that have been created.

GKE

Google Kubernetes Engine (GKE) is a managed, production-ready environment for deploying, scaling, and managing containerized applications. GKE provides a consistent platform to run your applications, backed by the reliability and security of Google’s infrastructure. It takes care of the underlying Kubernetes orchestration system, providing automated updates, scaling, and a developer-friendly environment. GKE supports multi-cluster deployments, auto-scaling, integrated developer tools, and multi-regional backups for high availability.

Figure 4.24: GKE

As you can see, this is the main.tf where we created the resources "google_container_cluster" and "google_container_node_pool", while the variables come from the terraform.tfvars file in the test folder.
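A minimal sketch of these two resources is shown below; the node pool settings and variable names are assumptions.

# main.tf: sketch of the GKE module (cluster plus a dedicated node pool)
resource "google_container_cluster" "this" {
  project  = var.project_id
  name     = var.cluster_name
  location = var.region

  # Remove the default node pool so that a managed pool can be defined below
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  project    = var.project_id
  name       = "${var.cluster_name}-pool"
  cluster    = google_container_cluster.this.name
  location   = var.region
  node_count = var.node_count

  node_config {
    machine_type = var.machine_type   # e.g. "e2-standard-4"
  }
}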

Figure 4.25: GKE Documentation

For each module, I have created its own documentation so that the Data team can understand and work with our modules; this is an example of the documentation I have created.

Workload Identity Federation

Google Cloud’s Workload Identity Federation is a feature that allows external identities
from vendors such as AWS, Azure, or OIDC identity providers to authenticate and access
Google Cloud resources without the need for service account keys.

Instead of storing and managing service account keys, an application can use an external
identity to impersonate a service account. This feature facilitates secure workload identity
pools and providers that vouch for these external identities and associate them with Google
Cloud service accounts.

This allows developers to maintain fewer service account keys, reducing the risk of key
leakage or misuse. It also provides a consistent authentication experience across different
cloud providers, improving the security and manageability of cross-cloud access control.

In addition, identity federation enables organizations to leverage existing identity solutions, making the transition to Google Cloud smoother and more secure. This approach simplifies identity management and strengthens the principle of least privilege by providing granular IAM roles and permissions.

Figure 4.26: Workload Identity Federation



As you can see, I have created main.tf, variables.tf, output.tf and versions.tf. The .gitlab-ci.yaml file is where I call the pipeline created earlier, and a test folder is also included so that the module can be called and tested.

Figure 4.27: main.tf of the Workload Identity Federation module

And here we have declared the resources required to create a Workload Identity Federation, one of them being the pool, which can contain several providers.
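To illustrate, a minimal sketch of these resources is given below; the GitLab issuer, attribute mapping and variable names are assumptions.

# main.tf: sketch of the Workload Identity Federation module
resource "google_iam_workload_identity_pool" "this" {
  project                   = var.project_id
  workload_identity_pool_id = var.pool_id
  display_name              = "GitLab CI pool"   # assumed display name
}

resource "google_iam_workload_identity_pool_provider" "gitlab" {
  project                            = var.project_id
  workload_identity_pool_id          = google_iam_workload_identity_pool.this.workload_identity_pool_id
  workload_identity_pool_provider_id = var.provider_id

  attribute_mapping = {
    "google.subject" = "assertion.sub"   # assumed mapping
  }

  oidc {
    issuer_uri = "https://gitlab.com"    # assumed GitLab instance
  }
}

# Allow identities from the pool to impersonate the CI service account
resource "google_service_account_iam_member" "wif" {
  service_account_id = var.service_account_name   # full resource name of the SA
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/${google_iam_workload_identity_pool.this.name}/*"
}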

Figure 4.28: WIF Job



4.3 GitOps

4.3.1 ArgoCD

Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes. It leverages Git
repositories as the source of truth for desired application state, including their configura-
tions. Argo CD automates the deployment of applications to specified target environments,
ensuring they match the state defined in the Git repository.

Figure 4.29: ArgoCD Logo

Architecture Using Helm Charts:

Argo CD’s architecture is designed to be compatible with Helm, one of the most popular Kubernetes package managers. Helm simplifies the deployment of applications on
Kubernetes using a packaging format called Helm charts.

Figure 4.30: Infrastructure of my GitOps setup

Figure 4.31: Helm Code



Figure 4.32: Main Application (root)

As you can see, this is the root application (project: root); from here, all the applications associated with it will be deployed and kept synchronised.

Figure 4.33: Application folder



As you can see, here is the list of applications associated with the alpha-apps (root) application.

Figure 4.34: Helm file of application in argoCD


And here is an example of an application in Argo CD. This is the Helm template for the application; as you can see, all the values are kept in a values file and the resource kind is Application. In Argo CD, an "Application" is a Kubernetes Custom Resource Definition that provides the specification for a deployed application. It defines the Git source repository (which can contain Helm charts), the path within the repository, and the target cluster and namespace.
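For illustration, an Application manifest of this kind typically looks like the sketch below; the repository URL, chart path and namespaces are assumptions.

# Illustrative Argo CD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: chat-gpt-app
  namespace: argocd
spec:
  project: alpha-apps
  source:
    repoURL: https://gitlab.com/example/gitops-repo.git   # hypothetical repository
    path: charts/chat-gpt-app
    targetRevision: main
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: chat-gpt-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true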

Figure 4.35: Helm file of Values of argoCD

A Helm chart is a collection of files that describe a related set of Kubernetes resources,
which can represent an entire application stack. These charts are customizable using values.yaml files, allowing for parameterized configuration that can be shared across multiple
environments. "Projects" in Argo CD provide a logical grouping of applications, often
corresponding to a team or a business unit. They can enforce specific policies about where
applications can be deployed and which resources they can include, providing a level of
isolation and security.
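As an example of the shape of such a project, a minimal AppProject manifest could look like this sketch; the repository URL and the policies are assumptions.

# Illustrative Argo CD AppProject
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: alpha-apps
  namespace: argocd
spec:
  description: Applications managed by the DevOps team   # assumed description
  sourceRepos:
    - https://gitlab.com/example/gitops-repo.git          # hypothetical repository
  destinations:
    - server: https://kubernetes.default.svc
      namespace: "*"
  clusterResourceWhitelist:
    - group: "*"
      kind: "*"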

Figure 4.36: Helm File Of AppProject

Figure 4.37: Main dashboard of Argo CD



As you can see here, there is the alpha-apps application, which is the root we talked about before; the chat-gpt-app, which is the application we have deployed; and finally the prometheus-stack, which we have also deployed with Argo CD.

Figure 4.38: ArgoCD application (chatGPT) 1

This is our Chat-GPT application with all the deployments it needs; everything is visible in the dashboard, which is clear and easy to understand.

Figure 4.39: ArgoCD application (chatGPT) 2

4.4 Monitoring

4.4.1 Prometheus

Prometheus is an open-source monitoring system and time series database. In our project, we used it to collect numerical data over time, providing valuable insights about the system’s performance. It uses a pull model to scrape metrics from services at regular intervals, based on defined targets and service discovery. We leveraged its powerful query language, PromQL, for alerting and visualization. Thus, Prometheus served as a key component in our observability stack, allowing us to maintain high service reliability and quickly troubleshoot issues.
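Since the prometheus-stack deployed with Argo CD typically includes the Prometheus Operator, scrape targets are usually declared through ServiceMonitor resources. The sketch below is an assumed example for the application; the labels, namespace and port name must match the application’s Service.

# Illustrative ServiceMonitor telling Prometheus how to scrape the application
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chat-gpt-app
  namespace: monitoring
  labels:
    release: prometheus-stack   # must match the Prometheus Operator's selector
spec:
  selector:
    matchLabels:
      app: chat-gpt-app
  namespaceSelector:
    matchNames:
      - chat-gpt-app
  endpoints:
    - port: http-metrics
      interval: 30s
      path: /metrics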

Figure 4.40: Prometheus logo

Figure 4.41: Prometheus Metrics

In Prometheus, metrics represent the time-series data that track the behavior of your services over time. Each metric has a unique name and can have multiple dimensions, represented by key-value pairs called labels, which provide additional context such as instance, environment, or version. Metric types include counters, gauges, histograms, and summaries, each serving distinct purposes in tracking counts, measurements, distributions, and quantiles, respectively.

4.4.2 Grafana

Grafana is a popular open-source tool for visualizing and analyzing metrics. In our project, it served as a dashboard for displaying real-time data about the system’s performance, using data sourced from Prometheus. With its flexible and intuitive interface, we were able to create custom graphs, gauges, and alerts, turning raw data into insightful visual representations. By providing a clear overview of the system’s health and behavior, Grafana played a pivotal role in our observability infrastructure, supporting decision-making and problem-solving.

Figure 4.42: Grafana logo

I will now present the dashboards that I have created with Grafana:

Figure 4.43: Grafana Dashboard 1

Figure 4.44: Grafana Dashboard 2



Figure 4.45: Grafana Dashboard 3

As you can see, all the data we collect from Prometheus is displayed in the Grafana dashboards, for example the number of replicas, the CPU usage, and the number of requests sent from the application.

Conclusion

In this final chapter, we have implemented our work and thoroughly described our project
using screenshots that illustrate the various stages of completion.
General conclusion & perspectives
This work is part of the final project within the organisation RidchaData. During this
six-month internship, we developed and implemented an innovative DevOps solution. Our
main tasks included setting up continuous integration and continuous deployment pipelines.
As part of our infrastructure management strategy, we used Terraform, and to ensure
smooth deployments, we used Kubernetes and ArgoCD.

By adopting a DevOps culture, we were able to effectively address the challenges that
RidchaData was trying to solve. Our solution automated tasks that were previously manual, resulting in time and cost savings through faster delivery cycles. It also improved
lifecycle predictability, fostered an innovation-driven culture and improved collaboration
between different teams.

The data team in particular benefited from our work, as our solutions improved the
efficiency and effectiveness of their processes. We also implemented monitoring systems
using Grafana and Prometheus, which improved our ability to track and manage system
performance.

The internship was an enriching experience that gave us the opportunity to apply our existing knowledge and develop new skills by exploring new tools and technologies, including DevOps on GCP, Terraform, Kubernetes, ArgoCD, Grafana and Prometheus. In addition, this professional experience has enhanced our team collaboration and project management skills.

In response to the challenges faced, and armed with this experience, I aim to further
adapt projects developed in other languages to the DevOps approach in the future.

Finally, I hope that our work will meet the expectations of the RidchaData management
and the jury.

