2022 - 2023
DevSecOps : Empowering IT
Collaboration and Customer Success
To my dear Mom, Amira, who has been my rock and cheerleader throughout this crazy
project. Your unending encouragement and belief in me, even when I doubted myself,
carried me through. I owe it all to your good heart.
To my Dad, Adel, who never failed to remind me that hard work and determination
always pay off. Your strong sense of discipline and resilience has been my guiding light in
this journey.
To my brother, Oussema, who had my back and kept me smiling through the toughest
parts of this work. Your jokes, your understanding, and your unflinching support kept
me sane during the most intense stretches of this project.
And to my buddy, Amine, whose friendship was the greatest gift during this journey.
For every late-night study session, every brainstorming coffee break, and every time you
believed in me, I say thank you.
This project is more than just my achievement; it is a tribute to all of you, to your
love, your guidance, and your unwavering belief in me.
Acknowledgment
The work presented in this graduation project was carried out within the RidchaData
company. I would like to express my gratitude and thanks to RidchaData's CEO, Mr.
Ridha Chamem, for giving me the opportunity to realize this project and for providing me
with a modern atmosphere and all the equipment necessary to achieve the internship goals.
I am also delighted to thank the jury members for reviewing my report and for attending
my graduation day.
Contents

Introduction
1 Project context
  1.1.1 Presentation
 1.2 Problematic
  2.1.1 Definition
  2.1.3 CI/CD
 2.2 Terraform
 2.3 GitOps
  2.4.1 Kubernetes
 2.5 ArgoCD
 2.6 Monitoring
4 Achievement
 4.2 Gitlab
  4.2.1 Pipelines
 4.3 GitOps
  4.3.1 ArgoCD
 4.4 Monitoring
  4.4.1 Prometheus
  4.4.2 Grafana
Conclusion
List of Figures

4.2 Gitlab
4.7 gcp-auth.yml 1
4.8 gcp-auth.yml 2
4.18 Document AI
4.19 Document AI
4.21 Subscription
4.22 Publisher
4.24 GKE
Acronyms
CI Continuous Integration
CD Continuous Delivery
AI Artificial Intelligence
Introduction
The goal of DevSecOps is the involvement of security teams and considerations from
the beginning of the development cycle. The central principle is to incorporate security
measures as a fundamental part of the DevOps pipeline, rather than as an afterthought at
the end of the development process.
This model is designed to increase the speed of software development and deployment
and reduce the risk of security issues. By bringing together developers, security experts
and operations staff, the goal is to create a more secure and efficient system for delivering
software. Core DevSecOps practices include continuous integration, continuous delivery,
automated builds, and infrastructure as code.
Chapter 1
Project context
Introduction
First, we briefly present the host company, RIDCHA DATA, where we had the opportunity
to work on this project. Then, we describe the project framework, the existing situation,
and the proposed solution. Finally, we present the methodology that we adopt in order to
reach our end goal.
1.1 The host company: RIDCHA DATA
1.1.1 Presentation
RIDCHA DATA is an innovative digital services company that positions itself as a leader
in engineering and technology consulting. The organization operates as a subsidiary of the
RD Group, which focuses on the Information Technology (IT) and Healthcare sectors.
The company operates in two main areas: IT and Healthcare. They maintain a customer-
centric approach, treating each customer as unique, enabling them to provide tailored
services that meet each customer’s specific needs.
RIDCHA DATA’s services are organized around three main areas of expertise and three
specific offers. Their areas of expertise include:
• TestOps
• DataOps
The company prides itself on its ability to invent, develop, deliver, and rapidly scale dis-
ruptive innovations for its clients. RIDCHA DATA is also recognized for its talent-sourcing
capabilities, offering improved responsiveness, optimized skills, and competitive financial
terms. We now move on to present the company's policy.
The company’s policy is geared towards achieving general objectives, developing and re-
taining its human resources, and preparing the company to meet future challenges and
satisfy its clients.
1.2 Problematic
During this internship, our main focus was to improve the efficiency and organization of the
data team's work. To achieve this, we took on the task of developing infrastructure as code
using Google Cloud Platform (GCP). This approach aimed to simplify and streamline their
workflows. At the same time, we aimed to improve the practicality and usefulness of the
deployment process in Kubernetes (k8s), while addressing any issues and errors that arose.
Another critical aspect we focused on was the implementation of effective monitoring and
an alert manager to ensure the stability of the system.
Let us now delve into the details of the existing situation that motivated our
final internship project.
1.3 The existing solution
Prior to implementing our solution, the data team developers faced several challenges and
inefficiencies in their workflow. Their work involved manually setting up and configur-
ing infrastructure in Google Cloud Platform (GCP), which was a time-consuming and
error-prone process. Each deployment to Kubernetes (k8s) required manual intervention
and lacked a standardized approach, leading to inconsistencies and potential deployment
errors. Troubleshooting and identifying issues in the system was a tedious task, as develop-
ers had limited visibility into infrastructure and application performance. This resulted in
longer resolution times and hindered their productivity. In addition, without an effective
monitoring and alerting system, the team had to rely on reactive approaches where prob-
lems were only detected after they had already affected the system. Overall, prior to our
solution, the developers’ work was characterized by manual and labor-intensive processes,
a lack of standardization, limited visibility, and inadequate means of proactive problem
detection and resolution.
In the next section, we’ll present the proposed solution that we’re going to work on
during this end-of-studies internship.
For our proposed solution, we used a combination of powerful tools to address the challenges
we faced. Terraform, an infrastructure-as-code tool, played a key role in automating the
setup and configuration of the data team’s infrastructure on the Google Cloud Platform
(GCP). This allowed us to define and deploy resources consistently and efficiently, ensuring
a seamless workflow for the team.
To ensure the stability and performance of the system, we implemented Grafana and
Prometheus. Grafana, a powerful monitoring and visualization platform, provided real-
time insight into infrastructure and application performance, enabling the team to proac-
tively identify and resolve issues. Prometheus, an open-source monitoring system, comple-
mented Grafana by collecting and storing metrics and alerting the team to anomalies or
potential problems.
1.5 Work Methodology
Agile methodology is essential for DevOps teams because it promotes collaboration, flexibility,
and continuous improvement. It breaks tasks down into smaller increments, encourages
cross-functional teamwork, and emphasises transparency. Agile enables frequent code
integration and delivery, reducing time to production. Retrospectives drive learning and
refinement. Overall, Agile empowers DevOps teams to deliver value quickly and adapt to
changing needs.
• Individuals and interactions: This value prioritizes the team over processes and
tools, which are worthless in the wrong hands. The right team is vital to success.
These values define a philosophy that makes working on projects much more efficient and
successful. To apply this philosophy, we need a set of tools and resources.
In the next section, we’ll talk about Kanban as an agile framework.
The Kanban methodology is a visual and flexible approach to work management. It orig-
inated in the manufacturing industry and has been widely adopted in various fields, in-
cluding software development. Kanban focuses on visualizing the workflow, limiting work
in progress (WIP) and optimizing the flow of work to deliver value efficiently. By provid-
ing real-time visibility of tasks and bottlenecks, Kanban enables teams to make informed
decisions and continuously improve their processes.
The principles of Kanban revolve around six core concepts: visualizing workflow, limiting
WIP, managing flow, making policies explicit, implementing feedback loops, and continuous
improvement.
Kanban is widely used in DevOps and software development environments. Its flexibility
and adaptability make it suitable for managing complex and dynamic projects. Kanban
enables teams to respond to changing requirements, accommodate unplanned work and
maintain a steady flow of value delivery. It aligns well with agile principles and complements
DevOps practices by fostering collaboration, transparency, and continuous improvement
across development and operations teams.
Kanban and Scrum are two of the most popular agile methodologies, each with unique
features. While Scrum follows fixed-length iterations (sprints) and emphasizes time-boxed
planning and delivery, Kanban focuses on optimizing flow and visualizing the workflow
without fixed iterations. Kanban uses a pull system, whereas Scrum assigns tasks during
sprint planning. Kanban’s WIP limits aim to balance and optimize work allocation, while
Scrum encourages full utilization of the team’s capacity. Roles and responsibilities differ
between the two methodologies, and they also vary in terms of change management, meet-
ings, customer collaboration, and documentation practices. It is important to take into
account the specific demands of the project and the dynamics of the team when choosing
between Kanban and Scrum.
| Aspect | Kanban | Scrum |
|---|---|---|
| Focus | Flow of work | Time-boxed iterations |
| Planning | No fixed iterations or sprints | Fixed-length iterations (sprints) |
| Work allocation | Pull system | Tasks assigned during sprint planning |
| Work management | Workflow visualized on a Kanban board | Backlog and sprint planning |
| Work in progress (WIP) | Limits WIP to optimize flow | Encourages full utilization |
| Roles and responsibilities | No prescribed roles | Defined roles (e.g., Scrum Master, Product Owner) |
| Change management | Adapts to changes in real time | Changes accommodated in the next sprint |
| Meetings | Continuous improvement and retrospectives | Daily stand-ups, sprint planning, review, and retrospective |
| Customer collaboration | Continuous collaboration | Collaboration during sprint planning and review |
Conclusion
In this first chapter, we began with a presentation of our host organization. We then moved
on to the analysis of the current situation and the proposed solution. Finally, we concluded
with the methodology adopted for our project.
In the next chapter, we will describe the concepts of DevOps and carry out a compar-
ative study to choose the appropriate tools.
Chapter 2
Introduction
In this chapter, we begin by highlighting the various basic concepts required for the im-
plementation of our project. We then proceed with a comparative study of the various
existing tools in order to synthesize our technological choices.
2.1 DevSecOps approach
2.1.1 Definition
DevSecOps stands for Development, Security and Operations. This is a culture or practice
that incorporates security into every step of the software development cycle. In traditional
development processes, security was often considered at the end of the project or as a sepa-
rate process altogether. In DevSecOps, security considerations and controls are integrated
from the initial design and development stages through to deployment in production.
Working with the DevSecOps model provides the team with the following benefits:
• Cost-effectiveness: It is usually less costly to fix problems during the development
process than after production. Embedding security throughout the development
lifecycle can therefore result in significant cost savings.
2.1.3 CI/CD
• Improved code quality: Integration and frequent testing can quickly detect bugs
and facilitate troubleshooting, improving overall code quality.
• Faster delivery speed: With CD, updates and enhancements can be shipped to end
users more frequently and reliably.
• Risk mitigation: Through smaller and more frequent updates, issues can be detected
and rolled back more quickly, greatly reducing the risk and impact of potential problems.
2.2 Terraform
Terraform is a free and open-source Infrastructure as Code (IaC) tool developed by HashiCorp.
It lets developers define and provision data center infrastructure using a high-level configuration
language called HashiCorp Configuration Language (HCL), or optionally JSON.
Terraform encodes APIs into declarative configuration files that can be shared between
team members, processed as code, edited, revised, and versioned. It supports a wide range
of service providers as well as custom in-house solutions and enables the management of a
variety of infrastructures such as public clouds, private clouds, and Software as a Service
(SaaS) with a unified workflow.
• Platform agnostic: Terraform can manage a wide range of cloud service providers
as well as custom in-house solutions, offering flexibility and reducing vendor
lock-in.
• State Management: Terraform captures the state of your infrastructure and can
generate plans to apply modifications, minimize disruptions and avoid manual errors.
• Collaboration and Sharing: The infrastructure code can be monitored and shared,
allowing for collaboration, peer review, and testing of infrastructure changes.
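To illustrate the declarative style, here is a minimal HCL sketch; the project ID, region, and bucket name are placeholder assumptions, not values from the project:

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 4.0"
    }
  }
}

provider "google" {
  project = "my-gcp-project" # placeholder project ID
  region  = "europe-west1"
}

# A single declarative resource: Terraform computes the plan needed
# to make the real bucket match this description.
resource "google_storage_bucket" "artifacts" {
  name     = "my-gcp-project-artifacts"
  location = "EU"
}
```

Running terraform plan then shows the changes Terraform would make to reach this state, and terraform apply executes them.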
2.3 GitOps
GitOps is a software development framework that uses the Git version control
system as the single source of truth for declarative infrastructure and applications. It
applies the same principles used in application development, such as pull requests and version
history, to managing infrastructure. Its purpose is to provide a more effective and
collaborative means of managing deployments and operations.
• Enhanced Productivity and Speed: By using Git as the only source of truth, teams
can use the same tools and processes they use to build code to manage infrastructure,
streamline workflow, and increase productivity.
• Better Reliability and Stability: Changes are automatically tested and deployed to
the infrastructure based on Git changes. Doing so reduces the risk of human error
and increases the stability of the system.
• Easy Rollbacks: If a change causes a system failure, teams can quickly revert to a
previous state using Git's built-in version control features.
2.4 Kubernetes and Helm Charts
2.4.1 Kubernetes
Helm charts are packages of pre-configured Kubernetes resources. They help manage Ku-
bernetes apps by providing a way to templatize your services and setups. This allows us
to make deployments more reliable, repeatable and manageable.
2.5 ArgoCD
With ArgoCD, any modification of the Git repository triggers a synchronization process
that aligns the real-time status of the Kubernetes cluster with the desired state defined
in the Git repository. It is particularly useful for managing deployments across multiple
environments, ensuring that all are synchronized and updated in an efficient and controlled
manner.
Together, GitOps, Kubernetes, Helm and ArgoCD provide a robust framework for
managing a modern container-based infrastructure, streamlining workflows, and improving
global productivity and reliability. They allow teams to manage large, complex environ-
ments with enhanced control, visibility and ease.
2.6 Monitoring
Monitoring in DevOps is the practice of continually observing and gathering data from
various components of a software system, such as servers, applications, databases and net-
work infrastructure. It involves the use of specialized tools and techniques to monitor and
analyze system performance, identify issues or anomalies, and ensure the full operation
of the software delivery pipeline. The primary objective of monitoring in DevOps is to
provide real-time visibility of system status, enable proactive problem detection and fa-
cilitate quick remediation. By monitoring key metrics and indicators, DevOps teams can
gain valuable insights into resource utilisation, application behaviour and user experience,
enabling them to optimise performance, improve reliability and enhance overall system
stability.
Grafana:
Grafana is a powerful data visualization and analytics platform widely used in the monitoring
and observability space. It serves as a front-end tool that allows users to create
visually appealing and interactive dashboards, charts and graphs. Grafana supports a
wide range of data sources, including Prometheus, and provides an easy-to-use interface
for exploring and analysing monitoring data. With Grafana, users can easily customise
their dashboards, add different panels to display metrics and visualise data trends over
time. Its flexible querying capabilities and large plugin ecosystem make it a popular
choice for data visualization in the DevOps community.
Prometheus:
Prometheus, on the other hand, is a feature-rich monitoring system and time-series database,
designed specifically for monitoring and alerting. It follows a pull model, gathering
metrics and time-series data from a variety of monitored targets at regular intervals.
It stores the collected data in its own time-series database and
provides a powerful query language called PromQL. PromQL allows users to run com-
plex queries to retrieve specific metrics and gain insight into system performance and
behaviour. Prometheus also includes an alerting system that can send notifications based
on pre-defined conditions. This enables DevOps teams to set up customized alerts, receive
real-time notifications, and proactively take action to resolve issues quickly.
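To make this concrete, here is a sketch of what a Prometheus alerting rule can look like; the http_requests_total metric and the threshold are illustrative assumptions, not values from our setup:

```yaml
groups:
  - name: availability
    rules:
      - alert: HighErrorRate
        # PromQL: fraction of 5xx responses over all responses, averaged over 5 minutes.
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "More than 5% of requests are failing"
```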
2.7 Comparative study of tools
To accomplish this project, we needed several tools: GCP as our cloud platform, GitLab
for versioning and pipelines, Terraform for infrastructure as code, Argo CD for GitOps,
and finally Grafana and Prometheus for monitoring.
GitLab vs GitHub

| Aspect | GitLab | GitHub |
|---|---|---|
| Hosting platform | Self-hosted or cloud-based | Cloud-based |
| Pricing | Free and paid options | Free and paid options |
| Repository types | Public and private repositories | Public and private repositories |
| Issue tracking | Robust issue tracking system | Robust issue tracking system |
| Collaboration | Built-in collaboration tools | Built-in collaboration tools |
| Continuous integration | Built-in CI/CD pipelines | Integrates with various CI/CD tools |
| Integration | Extensive integration capabilities | Extensive integration capabilities |
| Security | Strong emphasis on security and access control | Strong emphasis on security and access control |
| Community | Active community and open-source contributions | Active community and open-source contributions |
| Enterprise | Enterprise-level features and support | Enterprise-level features and support |
GitLab vs Jenkins
Terraform vs Chef

| Aspect | Terraform | Chef |
|---|---|---|
| Use case | Infrastructure as Code (IaC) | Configuration management |
| Provider | HashiCorp | Chef Software Inc. |
| Language | HCL and JSON | Ruby and a DSL for recipes |
| Infrastructure | Immutable: new resources for changes | Mutable: configs apply to existing resources |
| Approach | Declarative: defines the final state | Imperative: defines the steps |
| Integration | Manages low-level and high-level components | Manages systems at the OS level |
| Cloud support | Multi-cloud and on-premises | Multi-cloud and on-premises |
| State management | Tracks state | Does not inherently track state |
| Open source | Yes | Yes |
| Code reusability | Reuses modules | Reuses recipes and cookbooks |
| Community | Large, growing rapidly | Large, established longer |
| Rollback | Supports planning, allows rollback | Requires scripting |
Argo CD vs Flux CD
Argo CD is a GitOps continuous delivery tool for Kubernetes. It automates the deployment
of applications and configurations from Git repositories to Kubernetes clusters. FluxCD
is another GitOps tool that provides continuous delivery for Kubernetes applications, au-
tomatically synchronising container images and configurations from Git repositories to
Kubernetes clusters.
| Aspect | Argo CD | Flux CD |
|---|---|---|
| Use case | Kubernetes continuous delivery | GitOps for Kubernetes |
| Provider | Intuit Inc. | Weaveworks |
| Declarative model | Yes | Yes |
| GitOps ready | Yes | Yes |
| Primary focus | Deployment and synchronization | Synchronization |
| Secret management | Sealed Secrets, External Secrets, or Vault | SOPS |
| Manifest generation | Kustomize and Helm | Kustomize, Helm, and Jsonnet |
| Sync strategy | Automatic or manual synchronization | Automatic synchronization |
| Rollback/history support | Yes, rollback to previous configurations | Yes, supports rollback |
| Health checks | Yes | Yes |
| Multi-cluster deployment | Yes | Yes |
Grafana vs Kibana
Grafana is an open source analytics and visualisation tool used for monitoring and ob-
servability. It integrates with various data sources and enables the creation of custom
dashboards to visualise metrics. Kibana is part of the Elastic Stack and provides powerful
visualisation capabilities for analysing and exploring data stored in Elasticsearch. It is
commonly used for log analysis and search.
| Aspect | Grafana | Kibana |
|---|---|---|
| Use case | Metrics and log visualization, alerting | Log exploration and visualization, Elastic Stack interface |
| Provider | Grafana Labs | Elastic |
| Data sources | Many, incl. Prometheus, Graphite, etc. | Primarily Elasticsearch |
| Visualization types | Graphs, tables, heatmaps, etc. | Charts, maps, etc. |
| Alerting | Yes | Yes (via ElastAlert or X-Pack) |
| Dashboards | Highly customizable | Less flexible |
| Setup ease | Easy to moderate | Moderate to hard |
| Log exploration | Requires integration with Loki | Advanced with Elasticsearch |
| Cloud support | Available | Available (Elastic Cloud) |
| Security | Auth proxy, LDAP, OAuth | RBAC, Spaces for isolation |
| Community | Large, growing | Large, established |
Kubernetes vs Docker Swarm
• Kubernetes is an open-source container orchestration platform that automates the
deployment, scaling, and management of containerised applications. It provides a robust
ecosystem for managing distributed systems.
• Docker Swarm is Docker's native clustering and orchestration solution. It allows the
creation of a swarm of Docker nodes and enables container orchestration across those nodes.
| Aspect | Kubernetes (K8s) | Docker Swarm |
|---|---|---|
| Provider | Cloud Native Computing Foundation (CNCF) | Docker Inc. |
| Complexity | Higher | Lower |
| Setup and configuration | Can be complex to set up and configure | Easier to set up and configure |
| Scalability | Highly scalable | Scales well but may not handle large clusters as effectively |
| Networking | More complex; uses CNI for pod networking | Simpler networking |
| Load balancing | Manual service configuration required | Automatic load balancing |
| Service discovery | Uses CoreDNS | DNS-based service discovery |
| Rolling updates | Supported | Supported |
| Data volumes | Shared volumes and storage mounting | Volume mounting |
| Logging and monitoring | Integrates with multiple logging and monitoring tools | Limited built-in monitoring tools |
| Compatibility | Supports a wide range of infrastructure | More limited; best with Docker's own platform |
Conclusion
In this chapter, we have presented the main concepts of our project: we defined the
DevSecOps approach, continuous integration, continuous delivery and deployment, and
concluded with monitoring. We have also carried out a comparative study of the various
tools in order to justify our technological choices.
Chapter 3
Introduction
In this section, we will detail the analysis of functional and non-functional requirements,
the overall architecture, and the project design.
• DevOps engineer: this is our role, as we are responsible for managing the
infrastructure, deployment, and pipelines.
• Data engineer: the data team, who will work on the project that we are going
to deploy.
We used Terraform for infrastructure management, defining and versioning our data centre
setup in a codified way, and Kubernetes (k8s) as our container orchestration platform
to automate the deployment, scaling, and management of our containerised applications.
We created GCP resources using Terraform and built pipelines to manage what the
data team needs. These are the modules that we created:
Custom Service Account: a Terraform module, driven by a GitLab pipeline, that creates a
service account; it is used for managing IAM and minimising roles for security reasons.
Cloud Function: a Terraform module, driven by a GitLab pipeline, that creates a Cloud
Function; it serves many purposes, such as creating alerts in GCP or supporting
data-science use cases.
BigQuery: a Terraform module, driven by a GitLab pipeline, that creates BigQuery
resources; BigQuery lets you incorporate GoogleSQL functionality with software outside
of BigQuery through direct integration with Cloud Functions and Cloud Run.
Document AI: a Terraform module, driven by a GitLab pipeline, that creates Document AI
resources; it is used to build seamless integrations and easy-to-use applications for
your users.
Pub/Sub: a Terraform module, driven by a GitLab pipeline, that creates Pub/Sub
resources; Pub/Sub is used in distributed systems for asynchronous communication
between components or services.
GKE (Google Kubernetes Engine): a Terraform module, driven by a GitLab pipeline, that
creates a GKE cluster; GKE gives you complete control over every aspect of container
orchestration, from networking to storage.
Workload Identity Federation: a Terraform module, driven by a GitLab pipeline, that
creates a workload identity pool and provider; using identity federation, you can grant
on-premises or multi-cloud workloads access to Google Cloud resources without using a
service account key.
Helm Charts
In our work, we often use Helm charts, an effective package manager for Kubernetes.
Think of it as Ubuntu's apt or Red Hat's yum, but designed specifically
for Kubernetes applications. A Helm chart is a bundle of files describing a set of related
Kubernetes resources, which can be as simple as a standalone pod or as complex as a
full-stack application.
During deployment, Helm combines the templates in the chart with the provided values
to create a Kubernetes manifest that is deployed to the cluster. In this way, I can customise
deployments for different environments without changing the core templates.
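As a minimal sketch of this template-plus-values mechanism (the chart structure and image names here are hypothetical, not the project's actual chart):

```yaml
# templates/deployment.yaml -- a template filled in with values at deploy time
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

```yaml
# values-prod.yaml -- per-environment overrides, no template changes needed
replicaCount: 3
image:
  repository: europe-west1-docker.pkg.dev/my-project/my-repo/app
  tag: "1.0.0"
```

Deploying with a different values file is then enough to retarget the same chart at another environment.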
In my role, I use Helm charts with ArgoCD to enable GitOps workflows. The charts
are stored in a Git repository, and ArgoCD ensures that the Kubernetes cluster stays
in sync with that repo. This setup allows us to manage applications in a declarative,
version-controlled way, which is a more efficient way to handle application lifecycles and
complexity.
Application Deployment:
Application deployment is a key responsibility within the DevOps domain which in-
volves the automated, reliable and consistent process of delivering project software to
users. The DevOps team uses Continuous Integration/Continuous Deployment (CI/CD)
pipelines to make sure all code changes are seamlessly and securely integrated, tested and
deployed in the production environment. This process includes managing dependencies,
ensuring compatibility across different environments, and finally pushing the application to
servers where end users can access it. By automating this process, we dramatically reduce
the risk of human error, shorten deployment cycles, and increase overall project efficiency.
Monitoring:
Security:
DevSecOps makes security an integral part of the software development cycle. It promotes the principle of ’security as
code’, where infrastructure security is version controlled and auditable. It also recommends
responding to threats in real time, allowing for rapid detection and correction of security
threats. The outcome is robust and secure applications that maintain data integrity and
confidentiality.
Scalability:
Performance:
Performance, in the context of DevOps, is the ability of the system to respond quickly to
user demands and perform its functions efficiently. It encompasses various aspects such as
response time, throughput, resource utilisation and reliability under varying loads. A high-
performance system not only improves user satisfaction, but also improves the system’s
reliability and availability. Performance optimisation can involve many strategies, includ-
ing efficient code practices, database optimisation, caching strategies and the selection of
appropriate infrastructure.
Reusability:
Reusability is the ability to use existing software artifacts (such as code modules, libraries,
or frameworks) in various contexts and applications. In a DevOps environment, reuse can
dramatically simplify the development process, reduce mistakes, and shorten time to market.
Conclusion
This chapter has allowed us to identify the stakeholders, study the functional and non-
functional requirements, and present the global architecture.
In the next chapter, we present the achievement of our work, detailing the pipelines,
deployments, and monitoring that we put in place.
Chapter 4
Achievement
Introduction
In this final chapter, we present the work that has been done, detailing the tasks of
continuous integration and solution deployment, with screenshots for a good understanding
of the solution.
Google Console
First of all, we have to create a GCP project that will contain all of the
work and resources that will be created.
Here is the dashboard of our project, showing the name and the ID of the project that
we are going to work with.
4.2 Gitlab
GitLab is a web-based DevOps platform for software development. It offers version con-
trol, issue tracking, and project management features. With built-in CI/CD capabilities,
it enables automated testing and deployment. GitLab provides both cloud-based and self-
hosted options for flexibility and customization.
As shown here, we worked with GitLab to store our code and, at the same time, to
manage it and run our pipelines.
4.2.1 Pipelines
First Pipeline
We worked with two principal pipelines to manage the infrastructure and create
resources in our GCP project:
As shown here, this pipeline contains four stages, Validate, Plan, Apply, and Destroy,
which are needed to provision resources with Terraform. This pipeline uses service
account authentication, with the key exported in GOOGLE_APPLICATION_CREDENTIALS.
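A condensed sketch of what such a pipeline can look like in .gitlab-ci.yml; the Terraform image tag and the GCP_SA_KEY file variable are assumptions, not the exact project configuration:

```yaml
stages: [validate, plan, apply, destroy]

image:
  name: hashicorp/terraform:1.5
  entrypoint: [""]

before_script:
  # GCP_SA_KEY is assumed to be a file-type CI/CD variable holding the key.
  - export GOOGLE_APPLICATION_CREDENTIALS="$GCP_SA_KEY"
  - terraform init

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan
  artifacts:
    paths: [tfplan]

apply:
  stage: apply
  script:
    - terraform apply -auto-approve tfplan
  when: manual

destroy:
  stage: destroy
  script:
    - terraform destroy -auto-approve
  when: manual
```

Apply and Destroy are kept manual so that no infrastructure change happens without explicit human approval.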
Second Pipeline
This pipeline is the same as the first one, with one key change: it no longer uses the
service account key; instead, it authenticates through Workload Identity Federation,
which we discuss next.
In order for this pipeline to work, it includes another pipeline called .gitlab-gcp-auth.yml,
which is used to create a temporary token and authenticate with it.
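The mechanism behind such an include, sketched below, is GitLab's OIDC id_tokens feature exchanged for short-lived Google credentials; every identifier here (project number, pool, provider, service account) is a placeholder assumption:

```yaml
# .gitlab-gcp-auth.yml (sketch)
.gcp-auth:
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/gitlab-pool/providers/gitlab-provider
  before_script:
    - echo "$GITLAB_OIDC_TOKEN" > .ci_token
    # Exchange the GitLab OIDC token for short-lived GCP credentials.
    - gcloud iam workload-identity-pools create-cred-config
      projects/123456/locations/global/workloadIdentityPools/gitlab-pool/providers/gitlab-provider
      --service-account="terraform-ci@my-gcp-project.iam.gserviceaccount.com"
      --credential-source-file=.ci_token
      --output-file=.gcp_credentials.json
    - export GOOGLE_APPLICATION_CREDENTIALS=.gcp_credentials.json
```

A job can then extend .gcp-auth and run Terraform without any long-lived key stored in GitLab.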
Kaniko Pipeline
Kaniko is a Google open-source tool for building container images. It works without a
Docker daemon, allowing secure and reproducible image builds inside Kubernetes. Compatible
with GCR and other container registries, it simplifies the containerization process for developers.
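A typical Kaniko job in GitLab CI looks roughly like the following sketch (the destination registry path is a placeholder):

```yaml
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # Build from the repository's Dockerfile and push without a Docker daemon.
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --dockerfile "${CI_PROJECT_DIR}/Dockerfile"
      --destination "europe-west1-docker.pkg.dev/my-gcp-project/images/app:${CI_COMMIT_SHORT_SHA}"
```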
A service account in Google Cloud Platform (GCP) is an identity that represents a non-
human entity, such as an application or virtual machine. It is used to authenticate and
authorise interactions between services and resources within GCP. Service accounts have
unique credentials that enable secure access control and management of GCP resources.
They play a critical role in automating tasks and enabling seamless integration between
services in the cloud environment.
As shown here, we created main.tf, variables.tf, output.tf, and versions.tf; .gitlab-ci.yaml
is where we call the pipeline created earlier, and a test folder calls the module so we
can test it.
Here the pipeline runs successfully and the service account has been created.
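For illustration, the heart of such a service-account module's main.tf might look like this sketch (variable names and the role handling are assumptions, not the exact project code):

```hcl
# modules/service-account/main.tf (sketch)
resource "google_service_account" "this" {
  project      = var.project_id
  account_id   = var.account_id
  display_name = var.display_name
}

# Grant only the roles passed in explicitly, keeping permissions minimal.
resource "google_project_iam_member" "roles" {
  for_each = toset(var.roles)
  project  = var.project_id
  role     = each.value
  member   = "serviceAccount:${google_service_account.this.email}"
}
```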
Cloud Function
Google Cloud Functions (GCP) is a serverless execution environment for building and
connecting cloud services. It allows developers to deploy simple, single-purpose functions
that are executed in response to cloud events without the need to manage a server or
runtime environment. These functions can be written in various programming languages
such as Node.js, Python, Go, etc. They facilitate the creation of scalable, cost-effective
applications and services.
As before, the module contains main.tf, variables.tf, output.tf, and versions.tf, a
.gitlab-ci.yaml that calls our shared pipeline, and a test folder that consumes the
module to test it.
BigQuery
Google BigQuery is a fully managed, serverless data warehouse that enables super-fast
SQL queries by harnessing the processing power of Google’s infrastructure. It allows you
to analyse large data sets in real time by running SQL-like queries, while also providing
machine learning capabilities within the platform. BigQuery is designed to be scalable,
cost-effective and easy to use, making it ideal for organisations that need to leverage big
data analytics. It supports a wide range of data formats and integrates with various Google
Cloud services.
As before, the module contains main.tf, variables.tf, output.tf, and versions.tf, along
with a .gitlab-ci.yaml that calls our shared pipeline and a test folder that consumes
the module.
Here is main.tf, where we create the resources google_bigquery_dataset and
google_bigquery_dataset_iam_binding, with the variables coming from the
terraform.tfvars file.
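Based on those resource names, a minimal sketch of this main.tf could be (variable names are assumptions):

```hcl
resource "google_bigquery_dataset" "this" {
  project    = var.project_id
  dataset_id = var.dataset_id
  location   = var.location
}

# Bind a role on the dataset to the members supplied by the caller.
resource "google_bigquery_dataset_iam_binding" "readers" {
  dataset_id = google_bigquery_dataset.this.dataset_id
  role       = "roles/bigquery.dataViewer"
  members    = var.reader_members
}
```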
Document AI
As before, the module contains main.tf, variables.tf, output.tf, and versions.tf, a
.gitlab-ci.yaml that calls our shared pipeline, and a test folder that consumes the
module.
Here is the main.tf in the test folder, where we call the module and set the variables
we want.
Pub/Sub
Google Cloud Pub/Sub is a scalable, reliable, real-time messaging service that enables
asynchronous data streaming and event-driven computing. It provides a publish-subscribe
pattern where publishers categorize published messages into topics and subscribers con-
sume them based on those topics. The service ensures at-least-once message delivery and
automatic scaling, making it suitable for distributed systems and large data scenarios.
Pub/Sub integrates seamlessly with other GCP services, enabling complex data processing
pipelines.
We created two projects: one for the subscription and one for the publisher.
Here, the subscription main.tf creates the resources shown. And here is the publisher
project; as before, it contains main.tf, variables.tf, output.tf, and versions.tf, a
.gitlab-ci.yaml that calls our shared pipeline, and a test folder that consumes the
module.
And here you can see the resources that have been created.
GKE
Here is main.tf, where we create the resources google_container_cluster and
google_container_node_pool, with the variables coming from the terraform.tfvars
file in the test folder.
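A minimal sketch of these two resources (variable names are assumptions, not the exact project code):

```hcl
resource "google_container_cluster" "this" {
  name     = var.cluster_name
  location = var.region
  # Manage node pools separately, so create one default node and remove it.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  name       = "primary-pool"
  cluster    = google_container_cluster.this.name
  location   = var.region
  node_count = var.node_count

  node_config {
    machine_type = var.machine_type
    # Run nodes as a dedicated, least-privilege service account.
    service_account = var.node_service_account
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]
  }
}
```

Separating the node pool from the cluster lets us resize or replace nodes without recreating the cluster itself.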
For each module, we wrote dedicated documentation so that the data team can understand
and work with our modules; here is an example of the documentation we created.
Workload Identity Federation
Google Cloud's Workload Identity Federation is a feature that allows external identities
from providers such as AWS, Azure, or OIDC identity providers to authenticate and access
Google Cloud resources without the need for service account keys.
Instead of storing and managing service account keys, an application can use an external
identity to impersonate a service account. This feature relies on workload identity
pools and providers, which vouch for these external identities and associate them with
Google Cloud service accounts.
This allows developers to maintain fewer service account keys, reducing the risk of key
leakage or misuse. It also provides a consistent authentication experience across different
cloud providers, improving the security and manageability of cross-cloud access control.
As before, the module contains main.tf, variables.tf, output.tf, and versions.tf, a
.gitlab-ci.yaml that calls our shared pipeline, and a test folder that consumes the
module.
And here we have declared the resources required to create a workload identity federation,
among them the pool, which can contain several providers.
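For illustration, the core of such a configuration might look like this sketch, here mapped to GitLab as the OIDC provider; the pool, provider, GitLab project number, and service-account variable are placeholder assumptions:

```hcl
resource "google_iam_workload_identity_pool" "gitlab" {
  workload_identity_pool_id = "gitlab-pool"
  display_name              = "GitLab CI pool"
}

resource "google_iam_workload_identity_pool_provider" "gitlab" {
  workload_identity_pool_id          = google_iam_workload_identity_pool.gitlab.workload_identity_pool_id
  workload_identity_pool_provider_id = "gitlab-provider"
  # Map claims from the GitLab OIDC token onto Google attributes.
  attribute_mapping = {
    "google.subject"       = "assertion.sub"
    "attribute.project_id" = "assertion.project_id"
  }
  oidc {
    issuer_uri = "https://gitlab.com"
  }
}

# Let identities from one GitLab project impersonate the CI service account.
resource "google_service_account_iam_member" "wif" {
  service_account_id = var.ci_service_account_name
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/${google_iam_workload_identity_pool.gitlab.name}/attribute.project_id/123456"
}
```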
4.3 GitOps
4.3.1 ArgoCD
Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes. It leverages Git
repositories as the source of truth for desired application state, including their configura-
tions. Argo CD automates the deployment of applications to specified target environments,
ensuring they match the state defined in the Git repository.
Argo CD’s architecture is designed to be compatible with Helm, one of the most pop-
ular Kubernetes package managers. Helm simplifies the deployment of applications on
Kubernetes using a packaging format called Helm charts.
Here is the root application mentioned above (project: root); from it, all the associated
applications are deployed and kept in sync.
Here is the list of applications associated with alpha-apps (the root).
And here is an example of an application in Argo CD, deployed from a Helm chart: all
the values live in a values file, and the resource type is Application. In Argo CD, an
"Application" is a Kubernetes Custom Resource Definition that provides the specification
for a deployed application. It defines the Git source repository (which can contain Helm
charts), the path within the repository, and the target cluster and namespace.
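A sketch of what such an Application manifest can look like for our chat-gpt-app; the repository URL, path, and namespaces are placeholder assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: chat-gpt-app
  namespace: argocd
spec:
  project: root
  source:
    repoURL: https://gitlab.com/my-group/gitops-config.git # placeholder
    targetRevision: main
    path: charts/chat-gpt-app
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: chat-gpt
  syncPolicy:
    automated:
      prune: true    # delete resources removed from Git
      selfHeal: true # revert manual drift in the cluster
```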
A Helm chart is a collection of files that describe a related set of Kubernetes resources,
which can represent an entire application stack. These charts are customizable using val-
ues.yaml files, allowing for parameterized configuration that can be shared across multiple
environments. "Projects" in Argo CD provide a logical grouping of applications, often
corresponding to a team or a business unit. They can enforce specific policies about where
applications can be deployed and which resources they can include, providing a level of
isolation and security.
Here we can see alpha-app, the root application discussed above; chat-gpt-app, the
application we deployed; and prometheus-stack, which we also deployed with Argo CD.
This is our Chat-GPT application with all the deployments it needs; everything is visible
in the dashboard, clear and easy to understand.
4.4 Monitoring
4.4.1 Prometheus
Prometheus is an open-source monitoring system and time-series database. In our project,
we used it to collect numerical data over time, providing valuable insights into the
system's performance. It uses a pull model to scrape metrics from our services at regular
intervals, based on defined targets and service discovery. We leveraged its powerful query
language, PromQL, for alerting and visualization. Prometheus thus served as a key
component of our observability stack, allowing us to maintain high service reliability
and quickly troubleshoot issues.
In Prometheus, metrics represent the time-series data that track the behavior of your
services over time. Each metric has a unique name and can have multiple dimensions,
represented by key-value pairs called labels, which provide additional context like
instance, environment, or version. Metric types include counters, gauges, histograms,
and summaries, each serving distinct purposes in tracking counts, measurements,
distributions, and quantiles, respectively.
4.4.2 Grafana
Grafana is a popular open-source tool for visualizing and analyzing metrics. In our
project, it served as a dashboard for displaying real-time data about the system's
performance, using data sourced from Prometheus. With its flexible and intuitive
interface, we were able to create custom graphs, gauges, and alerts, turning raw data
into insightful visual representations. By providing a clear overview of the system's
health and behavior, Grafana played a pivotal role in our observability infrastructure,
supporting decision-making and problem-solving.
We now present the dashboards that we created with Grafana:
As shown here, the Grafana dashboards display the data we collect from Prometheus:
for example, the number of replicas, the CPU usage, and how many requests have been
sent to the application.
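For illustration, panels like these are typically backed by PromQL queries of the following shape; the namespace label and the http_requests_total metric are assumptions, not the exact project queries:

```promql
# Number of ready replicas per deployment (from kube-state-metrics)
kube_deployment_status_replicas_ready{namespace="chat-gpt"}

# CPU usage per pod, averaged over 5 minutes (from cAdvisor)
sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="chat-gpt"}[5m]))

# Request rate served by the application (assumes an http_requests_total counter)
sum(rate(http_requests_total{namespace="chat-gpt"}[5m]))
```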
Conclusion
In this final chapter, we have implemented our work and thoroughly described our project
using screenshots that illustrate the various stages of completion.
General conclusion & perspectives
This work is part of the final project within the organisation RidchaData. During this
six-month internship, we developed and implemented an innovative DevOps solution. Our
main tasks included setting up continuous integration and continuous deployment pipelines.
As part of our infrastructure management strategy, we used Terraform, and to ensure
smooth deployments, we used Kubernetes and ArgoCD.
By adopting a DevOps culture, we were able to effectively address the challenges that
RidchaData was trying to solve. Our solution automated tasks that were previously man-
ual, resulting in time and cost savings through faster delivery cycles. It also improved
lifecycle predictability, fostered an innovation-driven culture and improved collaboration
between different teams.
The data team in particular benefited from our work, as our solutions improved the
efficiency and effectiveness of their processes. We also implemented monitoring systems
using Grafana and Prometheus, which improved our ability to track and manage system
performance.
The internship was an enriching experience that gave us the opportunity to apply our
existing knowledge and develop new skills by exploring new tools and technologies, includ-
ing GCP DevOps, Terraform, Kubernetes, ArgoCD, Grafana and Prometheus. In addition,
this professional experience has enhanced our team collaboration and project management skills.
In response to the challenges faced, and armed with this experience, I aim to further
adapt projects developed in other languages to the DevOps approach in the future.
Finally, I hope that our work will meet the expectations of the RidchaData management
and the jury.