Last Data Analytics Report-1267

Uploaded by

Charan baru

A Summer Industry Internship – II Report

on
Empowering Sentiment Analysis with Hugging Face on Amazon
SageMaker

during
III Year II Semester Summer

submitted to
The Department of Information Technology

In partial fulfillment of the academic requirements of


Jawaharlal Nehru Technological University
for
The award of the degree of

Bachelor of Technology
in
Information Technology
by

Sai Charan Baru-20311A1282

Mr. M. Dhanaraju
Assistant Professor

Sreenidhi Institute of Science and Technology


Yamnampet, Ghatkesar, R.R. District, Hyderabad - 501301

An Autonomous Institution

Affiliated to
Jawaharlal Nehru Technological University
Hyderabad - 500085
Department of Information Technology
Sreenidhi Institute of Science and Technology
Department of Information Technology

CERTIFICATE

This is to certify that this Summer Industry Internship – II Report on “Empowering Sentiment Analysis
with Hugging Face on Amazon SageMaker”, submitted by Sai Charan Baru (20311A1282) in the year
2023 in partial fulfillment of the academic requirements of Jawaharlal Nehru Technological University for the
award of the degree of Bachelor of Technology in Information Technology, is a bonafide work in industry
internship that has been carried out during III B.Tech IT II Semester, under our guidance. This report has not
been submitted to any other institute or university for the award of any degree.

Internship Guide: Mr. G. Raja Ramesh, Assistant Professor, Department of IT
Internship Coordinator: Mr. M. Dhanaraju, Assistant Professor, Department of IT
Head of the Department: Dr. Sunil Bhutada, Professor, Department of IT

External Examiner:
Date:
Internship Certificate:
DECLARATION

I, Sai Charan Baru (20311A1282), student of SREENIDHI INSTITUTE OF SCIENCE AND
TECHNOLOGY, YAMNAMPET, GHATKESAR, studying IV year I semester, INFORMATION
TECHNOLOGY, solemnly declare that the Summer Industry Internship-II Report titled “Empowering
Sentiment Analysis with Hugging Face on Amazon SageMaker” is submitted to SREENIDHI INSTITUTE
OF SCIENCE AND TECHNOLOGY in partial fulfillment of the requirements for the award of the degree of
Bachelor of Technology in INFORMATION TECHNOLOGY. I declare, to the best of my knowledge, that the
work reported does not form part of any dissertation submitted to any other University or Institute for the
award of any degree.

SAI CHARAN BARU


20311A1282
ACKNOWLEDGEMENT

I would like to express my gratitude to all the people behind the scenes who helped me transform
an idea into a real application. I would like to thank the Internship Coordinator, Mr. M. Dhanaraju, for
his technical guidance, constant encouragement and support in carrying out my project at college. I
profoundly thank Dr. Sunil Bhutada, Head of the Department of Computer Science &
Engineering, who has been an excellent guide and a great source of inspiration for my work.
I would like to express my heartfelt gratitude to my parents, without whom I would not have been
privileged to achieve and fulfill my dreams. I am grateful to our principal, Dr. T. Ch. Siva Reddy,
who most ably runs the institution and has had a major hand in enabling me to do my project. The
satisfaction and euphoria that accompany the successful completion of a task would be
incomplete without mention of the people who made it possible, whose constant guidance and
encouragement crowned all the efforts with success. In this context, I would like to thank all the other
staff members, both teaching and non-teaching, who extended their timely help and eased my
task.

Sai Charan Baru


20311A1282
Analysis of data using AWS services

Abstract:
In this project, I explain a combined service built from several AWS services, namely
Amazon Redshift, S3, IAM, SageMaker and a few others. Deployed as a project or product,
this combination would be useful to companies as well as to individual users. We took a
sample dataset, India GDP data, and used Amazon Redshift, S3 and other modules to examine
the data and interpret it in the form of charts and graphs. To use the service, the user or
company follows the steps we provide: upload the data into the S3 storage service, load it
into the query editor, and then perform analysis and interpretation. These services support
effective and efficient analysis and interpretation of data. Every step is explained in detail,
with figures. We believe that, by reading the description and studying the figure for each
step, the user will be able to implement the desired services with ease. Data visualization
helps in analyzing the data and in deciding which steps to take next.

Rishitha
Goshika
20311A1267
INDEX

Abstract

1. INTRODUCTION
1.1 About the Internship and Plan of Training program
1.2 Scope
1.3 Proposed System

2. SYSTEM ANALYSIS

3. SYSTEM ARCHITECTURE AND UML DIAGRAM

4. SYSTEM IMPLEMENTATION AND OUTPUT

5. INTERNSHIP FEEDBACK (Experience)
* About Company
* Experience in Internship
* Challenges faced

6. CONCLUSIONS AND FUTURE SCOPE

7. REFERENCES
1. INTRODUCTION:

1.1. About the Internship and Plan of Training program :


The internship consists of two parts:
Part 1: AWS Cloud Foundations, and
Part 2: AWS Data Analytics

In Part 1 of the internship, on AWS cloud technology, the mentors explained in detail,
module by module, each service that AWS provides to users and how those services are
implemented. They also covered the basics of AWS cloud services, which are very
beneficial for a beginner. In this part, the student gets to know AWS cloud technology and
some of the crucial techniques that one can use to develop her/his idea.

In Part 2 of the internship, AWS Data Analytics, the students were taught how to
handle different kinds of operations on different types of data, based on the requirements.
The modules give clear knowledge of how to load data into the platform and handle it.
This part consists of 8 labs, and each module explains a service that AWS provides to
help in uploading data, understanding it and interpreting it using different techniques.

1.2. Scope:
Data visualization is the graphical representation of information and data. By using visual
elements like charts, graphs, and maps, data visualization tools provide an accessible way to
see and understand trends, outliers, and patterns in data. Additionally, it provides an
excellent way for employees or business owners to present data to non-technical audiences
without confusion. In the world of Big Data, data visualization tools and technologies are
essential to analyse massive amounts of information and make data-driven decisions.
As the “age of Big Data” kicks into high gear, visualization is an increasingly key tool to
make sense of the trillions of rows of data generated every day. Data visualization helps to
tell stories by curating data into a form easier to understand, highlighting the trends and
outliers.

A good visualization tells a story, removing the noise from data and highlighting useful
information.

Building a BI and data visualization service in the cloud allows you to take advantage of
capabilities such as scalability, availability, redundancy, and enterprise grade security. It also
lowers the barrier to data connectivity and allows access to a far wider range of data sources,
both traditional, such as databases, as well as non-traditional, such as SaaS sources. An
added advantage to a cloud-based data visualization service is the elimination of
undifferentiated heavy lifting related to managing server infrastructure.

Amazon Web Services (AWS) has numerous services for different applications, such as
storing data, analysing data, interpreting data, connection management and many more,
delivered through different modules like the EC2 service, S3 buckets, RDS, etc. One of them
is the visualization service, which is a combination of different services. Amazon Redshift is
one of the most helpful services available in AWS for data visualization, analysis and
interpretation. We used Amazon Redshift features such as clusters, the query editor and
query editor v2.

1.3. Proposed System:


The AWS Management Console provides a secure interface in which the user can complete
her/his tasks with confidence. Because every user works in their own AWS account, as an
IAM user or as the root user, the user's data stays protected. To avoid security attacks on the
main AWS account, the user can create IAM user accounts with restricted access, so that a
user can work only on the specific data they need to, and the private data of the account
owner remains safe. AWS has many services that help in visualizing and understanding
data, such as Amazon Redshift, Amazon QuickSight and Amazon S3.

In Amazon Redshift, the user uploads the data securely and then, using the query editors,
can visualize and understand the data with different techniques and save the results for
further analysis and prediction work.
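
As a rough illustration of the load step, the sketch below builds the kind of COPY statement one
might paste into the Redshift query editor to load a CSV file from S3. The table name, bucket name,
file name and role ARN are hypothetical placeholders, not values from this project, and the CSV
options shown are an assumption about the file layout.

```python
def build_copy_statement(table, bucket, key, iam_role_arn, ignore_header=1):
    """Return a Redshift COPY statement that loads a CSV file from S3."""
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CSV\n"
        f"IGNOREHEADER {ignore_header};"
    )


# Hypothetical names, for illustration only.
sql = build_copy_statement(
    table="india_gdp",
    bucket="example-analytics-bucket",
    key="india_gdp.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(sql)
```

The target table must already exist with columns that match the file before such a statement is run.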

2. SYSTEM ANALYSIS:

• Any operating system (Windows, macOS, Linux, etc.) with a reliable web browser and a
high-speed internet connection is sufficient to perform the operations.
• The user must have an AWS account to access the AWS Management Console.
• To work in a secure environment, the user should sign in with their own IAM user or root
user account, so that the user's private information and data stay protected.
• The system has to be fast enough to handle the operations efficiently.
• AWS is also available as a mobile application, but only some of the services can be used
there, so the web interface is the better option.
• Because the tasks involve working in more than one session or dialog at the same time, the
operating system and processor should be able to handle this; a fast computer is
recommended.

3. SYSTEM ARCHITECTURE AND UML DIAGRAM

UML diagram:

Fig. 3.1: AWS services functioning

Fig. 3.2: Use case diagram

In UML, use-case diagrams model the behavior of a system and help to capture the
requirements of the system. Use-case diagrams describe the high-level functions and
scope of a system. These diagrams also identify the interactions between the system
and its actors.

4. SYSTEM IMPLEMENTATION:

Task 1: Create an IAM role with the required permissions:


• Sign in to your AWS account with your user credentials.
• Search for IAM in Services.
• In the IAM console, choose Roles.
• Click Create role and create an IAM role with the user-specified details and
required permissions.
• Open the “Choose a service to view use case” drop-down and choose Redshift from
the list.
• Attach the required permissions:

* AmazonS3FullAccess

* AmazonRedshiftFullAccess

• Finally, click Create role.

Fig. 4.1: IAM service in AWS
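
The same console steps can also be scripted. The following is a minimal sketch, assuming the
boto3 SDK and configured credentials; the role name is a hypothetical placeholder, and the
function takes the IAM client as an argument so that it can be exercised without an AWS account.

```python
import json

# Trust policy that lets the Redshift service assume the role.
REDSHIFT_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policies matching the two permissions listed in the steps.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonRedshiftFullAccess",
]


def create_redshift_role(iam_client, role_name):
    """Create the role, attach both managed policies, and return the role ARN."""
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(REDSHIFT_TRUST_POLICY),
    )
    for policy_arn in MANAGED_POLICY_ARNS:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return response["Role"]["Arn"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   role_arn = create_redshift_role(boto3.client("iam"), "ExampleRedshiftRole")
```

Full-access policies are convenient for a lab exercise; a production role would normally be given
narrower permissions.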

Output Screen:

Fig. 4.2: Role creation in IAM service

Task 2: Create an S3 bucket and upload files into it:


• Sign in to the AWS Management Console with your user credentials.
• Go to S3 in Services.
• Click Create bucket.
• Fill in all the required fields.
• Choose Create bucket; the bucket is created.
• Click on the bucket that was just created.
• Click Upload.
• Browse for and select the required files.
• Choose Upload; the files are uploaded.
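
For readers who prefer a script, the steps above can be sketched with boto3 as follows. This is
an illustrative sketch only: the bucket name, region and file paths are hypothetical, and the S3
client is passed in so the function can be tried without an AWS account.

```python
import os


def create_bucket_and_upload(s3_client, bucket, region, file_paths):
    """Create a bucket in the given region, upload the files, and return their keys."""
    if region == "us-east-1":
        # us-east-1 is the default region and must not be given as a location constraint.
        s3_client.create_bucket(Bucket=bucket)
    else:
        s3_client.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    uploaded_keys = []
    for path in file_paths:
        key = os.path.basename(path)  # store each file under its own file name
        s3_client.upload_file(path, bucket, key)
        uploaded_keys.append(key)
    return uploaded_keys


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   s3 = boto3.client("s3", region_name="ap-south-1")
#   create_bucket_and_upload(s3, "example-analytics-bucket", "ap-south-1", ["data/india_gdp.csv"])
```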

Fig. 4.3: Creating buckets in S3

Output Screen:

Fig. 4.4: Output screen of S3

Task 3: Go to Amazon Redshift and create a cluster:

• Go to Services and search for Amazon Redshift.
• Open it and choose Clusters.
• Click Create cluster and fill in all the user-specific details.
• Enter the admin user name and password.
• Associate the IAM role created in Task 1.
• Click Create cluster.
• A “cluster created” notification appears. Wait until the cluster status shows Available.
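
The cluster creation and the wait for the Available status can be sketched with boto3 as shown
below. The cluster identifier, credentials and role ARN are hypothetical, and the single-node
`dc2.large` configuration is an assumption chosen for a small lab setup, not a detail from this
report.

```python
def create_cluster_and_wait(redshift_client, cluster_id, admin_user, admin_password, role_arn):
    """Create a single-node cluster, wait until it is available, and return its status."""
    redshift_client.create_cluster(
        ClusterIdentifier=cluster_id,
        ClusterType="single-node",   # assumption: smallest setup for a lab
        NodeType="dc2.large",        # assumption: node type is not stated in the report
        MasterUsername=admin_user,
        MasterUserPassword=admin_password,
        IamRoles=[role_arn],         # the role created in Task 1
    )
    # Block until the cluster reports the "available" state.
    redshift_client.get_waiter("cluster_available").wait(ClusterIdentifier=cluster_id)
    description = redshift_client.describe_clusters(ClusterIdentifier=cluster_id)
    return description["Clusters"][0]["ClusterStatus"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   status = create_cluster_and_wait(
#       boto3.client("redshift"), "example-cluster", "awsuser", "<password>", role_arn
#   )
```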

Fig. 4.7: Cluster dashboard

Output Screens:

Fig. 4.5: Output of cluster creation

Task 4: Creating a Jupyter notebook with Amazon SageMaker
• On the AWS Management Console, on the Services menu, choose Amazon SageMaker.

Fig. 4.6: IAM service
• From the navigation menu, choose Notebook instances.
• Fill in the required details and grant the required permissions.
• Click Create notebook instance.
• The Jupyter notebook is created.

Fig. 4.7: Creating a notebook instance
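
The notebook instance can also be created from code. The sketch below assumes boto3; the
instance name and role ARN are hypothetical, and `ml.t3.medium` is an assumed instance type
rather than one named in this report.

```python
def create_notebook_instance(sm_client, name, role_arn, instance_type="ml.t3.medium"):
    """Create a SageMaker notebook instance, wait for it to start, and return its status."""
    sm_client.create_notebook_instance(
        NotebookInstanceName=name,
        InstanceType=instance_type,  # assumption: a small general-purpose instance
        RoleArn=role_arn,            # execution role with the required permissions
    )
    # Block until the instance reaches the "InService" state.
    sm_client.get_waiter("notebook_instance_in_service").wait(NotebookInstanceName=name)
    info = sm_client.describe_notebook_instance(NotebookInstanceName=name)
    return info["NotebookInstanceStatus"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   create_notebook_instance(boto3.client("sagemaker"), "example-notebook", role_arn)
```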

Output Screens:

Fig. 4.8: Notebook instances

• This is the Jupyter notebook lab page for the instance we have just created.

Fig. 4.9: Jupyter notebook

Task 5: Creating visualizations (line graph) with Bokeh:
• After successfully creating the Jupyter notebook, open it.
• Using a few lines of code, create a line graph.
• After choosing Run, Bokeh creates a file called lines.html.
• Save the notebook as “Create line graph”.
• Open the Jupyter dashboard by choosing the Jupyter logo.
• From the list, open the lines.html file.

Output Screen:

Fig. 4.10: Visualization of data
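
The notebook code for Task 5 is not reproduced in this report. A minimal sketch of what such a
cell might look like is given below, assuming the Bokeh library is available in the notebook
kernel; the data values, title and labels are made up for illustration.

```python
def build_line_data():
    """Return sample x/y values (made up for illustration)."""
    x = [1, 2, 3, 4, 5]
    y = [6, 7, 2, 4, 5]
    return x, y


def render_line_graph(x, y, path="lines.html"):
    """Write an HTML line graph with Bokeh and return the output path."""
    # Imported here so the data helper above works even where Bokeh is not installed.
    from bokeh.plotting import figure, output_file, save

    output_file(path)  # Bokeh writes the plot to this HTML file
    plot = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
    plot.line(x, y, legend_label="Sample series", line_width=2)
    save(plot)
    return path


# Usage inside the notebook:
#   xs, ys = build_line_data()
#   render_line_graph(xs, ys)   # creates lines.html next to the notebook
```

Opening the resulting lines.html file from the Jupyter dashboard displays the interactive plot.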

5. INTERNSHIP FEEDBACK:

About AWS (Amazon Web Services):


In general, it is very hard to implement every service, e.g., dynamic storage or individual
user accounts, using one's own hardware, and relying on hardware all the time costs more.
Cloud-based technologies address this: the user, business enterprise or marketing company
often does not need to worry about storage, power and other utilities, because the cloud
provides virtual services to accomplish the tasks. AWS is one such cloud-based technology;
it provides services on a virtual platform, which reduces cost. Some of the services are free,
while others require the user to register and pay a fee. However, the costs are affordable.
The main advantage of using AWS is that the user pays only for what she/he actually uses.
AWS services are delivered to customers via a network of AWS server farms located
throughout the world. Fees are based on a combination of usage (known as a "Pay-as-you-go"
model), the hardware, operating system, software, or networking features chosen by the
subscriber, and the required availability, redundancy, security, and service options.
Subscribers can pay for a single virtual AWS computer, a dedicated physical computer, or
clusters of either.

Experience in Internship and Challenges faced:


While working on the internship, we were delighted, and we often discovered some of our own
skills. The content that was taught was very helpful for the project. Sometimes we felt
exhausted, but we recovered after a while, resumed work and successfully finished all the
tasks. At times, completing a task took all the time we had, which showed us our levels of
concentration and dedication towards work. Every lab that we completed during the internship
taught techniques that we later used in the project. Every module contains some unique
techniques and some familiar ones. In Part 1 of the internship, Cloud Computing, the mentors
taught us about the different kinds of services that AWS provides and how to use them. In
Part 2, Data Analytics, they taught us how to handle and work with different kinds of data.
We therefore feel that the internship with AWS was very helpful for us, and it gave us
further motivation to set our primary goals.

6. CONCLUSION AND FUTURE SCOPE:

Data visualization is one of the most effective ways of interpreting data. Compared with plain
representations such as tables and sheets, a visualized document communicates more about
the data. Using visualized data, one can more easily spot and remove outliers, handle noise,
understand and classify the data, and predict the future needs of a company, which is very
helpful for entrepreneurs. We therefore recommend that users or customers make use of this
service and the other services that the platform provides. If one's goal is to understand an
organization thoroughly, one will look at every aspect of the company, including its
production, its progress and the revenue it is making. This service is a reliable and
efficient platform for such goals because, as we have seen, the services the platform provides
and the way they are implemented cover some of the main tasks that every organization
expects. Using the services explained above, anyone can easily visualize, interpret and
analyze their data.

7. REFERENCES:
1. https://docs.aws.amazon.com/wellarchitected/latest/analytics-lens/data-visualization.html
2. https://www.tableau.com/learn/articles/data-visualization
3. https://en.wikipedia.org/wiki/Amazon_Web_Services
4. https://aws.amazon.com/console
5. https://signin.aws.amazon.com/signin?redirect
6. https://aws.amazon.com/certification/certification-prep/testing
7. https://aws.amazon.com/about-aws/whats-new/2017/05/aws-training-and-certification-portal-now-live/
8. https://aws.amazon.com/certification/certified-cloud-practitioner/

APPENDIX A

SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY


Department of Information Technology
Summer Internship - I
Roll No.: 20311A1282
Name: Sai Charan Baru
Title: Empowering sentiment analysis with Hugging Face on Amazon SageMaker

Abstract
In this project, we explain a combined service built from Amazon SageMaker and the Hugging Face
Hub. Amazon SageMaker provides several built-in machine learning (ML) algorithms that can be used
for a variety of problem types. These algorithms provide high-performance, scalable machine
learning and are optimized for speed, scale, and accuracy. Using these algorithms, one can train on
petabyte-scale data. They are designed to provide up to 10x the performance of the other available
implementations. The project uses Hugging Face's transformers library with a custom Amazon
sagemaker-sdk extension to fine-tune a pre-trained transformer on binary text classification. The
pre-trained model is fine-tuned using the sst2 dataset.

Student: Sai Charan Baru (20311A1282)

Internship Guide: Mr. G. Raja Ramesh, Assistant Professor, Department of IT
Internship Coordinator: Mr. M. Dhanaraju, Assistant Professor, Department of IT
Head of the Department: Dr. Sunil Bhutada, Professor, Department of IT

APPENDIX B
DOMAIN OF INTERNSHIP AND NATURE OF INTERNSHIP

Roll No.: 20311A1282
Name: Sai Charan Baru
Title: Empowering sentiment analysis with Hugging Face on Amazon SageMaker

Table 2: Nature of the Project/Internship work (Please tick √ as appropriate for your project)

Title: Empowering sentiment analysis with Hugging Face on Amazon SageMaker

Nature of Project:
• Product
• Application
• Research
• Others (please specify)

Student: Sai Charan Baru (20311A1282)

Internship Guide: Mr. G. Raja Ramesh, Assistant Professor, Department of IT
Internship Coordinator: Mr. M. Dhanaraju, Assistant Professor, Department of IT
Head of the Department: Dr. Sunil Bhutada, Professor, Department of IT

Table 3: Domain of the Project/Internship work (Please tick √ as appropriate for your project)

Title: Empowering sentiment analysis with Hugging Face on Amazon SageMaker

Domain of the Project:
• Artificial Intelligence, Machine Learning and Deep Learning
• Computer Networks, Information Security and Cyber Security
• Data Warehousing, Data Mining and Big Data Analysis
• Cloud Computing and Internet of Things
• Software Engineering and Image Processing

Student: Sai Charan Baru (20311A1282)

Internship Guide: Mr. G. Raja Ramesh, Assistant Professor, Department of IT
Internship Coordinator: Mr. M. Dhanaraju, Assistant Professor, Department of IT
Head of the Department: Dr. Sunil Bhutada, Professor, Department of IT
