
2. Data Processing in Shell

The document provides a comprehensive guide on fundamental commands for system management, application installation, data downloading, file extraction, data cleaning, database operations, and programming in Python and C++. It includes command-line instructions for various tasks such as managing services, installing applications, downloading files, and working with CSV files and databases. Additionally, it covers installation and execution of Python and C++ programs.

Compiled by: Nguyễn Đức Tây

fb.com/taaytungstieenf

PART A. Fundamental Commands


systemctl COMMANDS ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Restart machine $ systemctl reboot
2. Shutdown machine $ systemctl poweroff
3. Suspend machine $ systemctl suspend
4. Hibernate machine $ systemctl hibernate
5. Check a service state $ systemctl status <service_name>
6. Start a service $ systemctl start <service_name>
7. Stop a service $ systemctl stop <service_name>
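As a minimal sketch of how the status check above can be used in a script: `systemctl is-active --quiet` prints nothing and returns exit status 0 only when the unit is active, so it can serve as an `if` condition. The helper name `report_service` and the service name `ssh` are assumptions chosen for illustration, not part of the original sheet.

```shell
# Hypothetical helper: report a service state without starting or stopping it.
report_service() {
    service_name="$1"
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl is not available on this machine"
        return 0
    fi
    # Exit status 0 means the unit is active; anything else means it is not.
    if systemctl is-active --quiet "$service_name" 2>/dev/null; then
        echo "$service_name is active"
    else
        echo "$service_name is not active (start it with: sudo systemctl start $service_name)"
    fi
}

state_message=$(report_service ssh)   # "ssh" is only an example service name
echo "$state_message"
```

The helper only reads state, so it is safe to run without sudo.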
SWITCHING USERS ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Log into another user $ su - <user_name>
2. Log out of current user $ logout
3. Log in as root user $ sudo -i
NETWORK & UPDATES ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Update system $ sudo apt update
2. Upgrade system $ sudo apt upgrade
3. Show IP address $ ip addr
4. Show datetime information $ timedatectl
SYSTEM INFORMATION ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Show machine information $ hostnamectl
2. Show RAM state $ free -h
3. Show disk usage $ df -h

PART B. Application Installation


ELITE APPLICATIONS ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install wget $ sudo apt install wget
2. Install zip & unzip $ sudo apt install zip unzip
3. Install unrar $ sudo apt install unrar
4. Install vim $ sudo apt install vim
5. Install git $ sudo apt install git

PART C. Downloading Data on the Command Line


USING curl – Client for URL ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Download with the original file name $ curl -O <URL>
2. Download and change original name $ curl -o file_name <URL>
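The difference between `-O` and `-o` can be tried offline with a rough sketch like the one below. It assumes a `file://` URL pointing at a temporary stand-in file in place of a real web address, and adds `-s` (silent mode) to keep the output short. The directory and file names are made up for the example.

```shell
# Set up a stand-in "remote" file so the sketch runs without a network.
workdir=$(mktemp -d)
mkdir "$workdir/remote" "$workdir/local"
printf 'id,value\n1,10\n' > "$workdir/remote/data.csv"
url="file://$workdir/remote/data.csv"

if command -v curl >/dev/null 2>&1; then
    (
        cd "$workdir/local" || exit 1
        curl -s -O "$url"               # keeps the name from the URL: data.csv
        curl -s -o renamed.csv "$url"   # saves under the name given to -o
    )
    ls "$workdir/local"
fi
```

With a real `https://` URL the two calls behave the same way: `-O` takes the file name from the last part of the URL, `-o` takes it from the argument.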
USING wget – World Wide Web get ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install wget $ sudo apt-get install wget
2. Download $ wget <URL>
ADVANCED TECHNIQUES – Wget ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Download multiple files via a file of URLs
$ nano links.txt
~ https://dlcdn.apache.org/tomcat/tomcat-9/v9.0.104/src/apache-tomcat-9.0.104-src.tar.gz
~ https://dlcdn.apache.org/tomcat/tomcat-10/v10.1.40/src/apache-tomcat-10.1.40-src.tar.gz
~ https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.6/src/apache-tomcat-11.0.6-src.tar.gz
$ wget -i links.txt
2. Set up download bandwidth limit $ wget --limit-rate=<amount_of_kilobytes>k -i links.txt
3. Set up mandatory pause time between downloads $ wget --wait=<number_of_seconds> -i links.txt
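The three techniques above can be combined in one call. The sketch below writes the URL list with a here-document instead of opening nano, then passes both `--limit-rate` and `--wait` to the same `wget -i` run. The values `200k` and `2` and the guard variable `RUN_DOWNLOAD` are assumptions for illustration; the guard keeps the script from downloading anything unless it is explicitly switched on.

```shell
# Write the URL list non-interactively (same content as the nano step).
cat > links.txt <<'EOF'
https://dlcdn.apache.org/tomcat/tomcat-9/v9.0.104/src/apache-tomcat-9.0.104-src.tar.gz
https://dlcdn.apache.org/tomcat/tomcat-10/v10.1.40/src/apache-tomcat-10.1.40-src.tar.gz
https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.6/src/apache-tomcat-11.0.6-src.tar.gz
EOF

url_count=$(grep -c '^https://' links.txt)
echo "Queued $url_count URLs"

# Hypothetical guard: set RUN_DOWNLOAD=1 to actually start the download.
if [ "${RUN_DOWNLOAD:-0}" = "1" ]; then
    # Cap the speed at 200 KB/s and pause 2 seconds between files.
    wget --limit-rate=200k --wait=2 -i links.txt
fi
```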

PART D. Extracting Files


EXTRACTING COMMON FILES ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Extract .zip file $ unzip <file_name>
2. Extract .tar file $ tar -xvf <file_name>
3. Extract .tar.gz file $ tar -xzvf <file_name>
4. Extract .gz file $ gunzip <file_name>
5. Extract and move folder to a specific place $ tar -xzvf <file_name> -C /path/to/directory
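One detail worth knowing about item 5: `tar -C` does not create the target directory, so it has to exist before extraction. A small round-trip sketch, assuming a throwaway directory from `mktemp -d` and a made-up `project` folder (the `-c` step only builds a sample archive to extract):

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo "hello" > "$workdir/project/readme.txt"

# Build a sample archive; -C makes tar store the path as project/readme.txt.
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# Create the destination first, then extract into it.
mkdir -p "$workdir/extracted"
tar -xzf "$workdir/project.tar.gz" -C "$workdir/extracted"

extracted_text=$(cat "$workdir/extracted/project/readme.txt")
echo "$extracted_text"
```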
PART E. Data Cleaning and Munging on the Command Line
csvkit ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install csvkit $ sudo apt install csvkit
CONVERTING FILES ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Print the 1st sheet of xlsx file $ in2csv file.xlsx
2. Convert xlsx file to csv file $ in2csv file.xlsx > file.csv
3. Print all the sheet names of xlsx file $ in2csv -n file.xlsx
4. Save a specific xlsx file sheet to csv file $ in2csv file.xlsx --sheet "sheet_1" > sheet_1.csv
INSPECTING FILES ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Display csv file in readable format $ csvlook file.csv
2. Show descriptive stats on csv file data $ csvstat file.csv
3. List all columns $ csvcut -n file.csv
4. Filter data by field number $ csvcut -c 1,2,3 file.csv
5. Filter data by field name $ csvcut -c "field_1","field_2","field_3" file.csv
6. Filter data by row value $ csvgrep -c "field_name" -m <search_pattern> file.csv
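These tools read from standard input when no file is given, so they can be piped together. The sketch below is one possible combination: filter rows with `csvgrep`, keep two columns with `csvcut`, and render the result with `csvlook`. The file `sample.csv` and its contents are made-up sample data, and the csvkit part is skipped if csvkit is not installed.

```shell
# Made-up sample data for illustration only.
cat > sample.csv <<'EOF'
track,artist,popularity
Song A,Artist X,72
Song B,Artist Y,45
Song C,Artist X,60
EOF

# Data rows = total lines minus the header line.
row_count=$(($(wc -l < sample.csv) - 1))
echo "sample.csv has $row_count data rows"

if command -v csvgrep >/dev/null 2>&1; then
    # Rows whose artist contains "Artist X" -> two columns -> readable table.
    csvgrep -c artist -m "Artist X" sample.csv | csvcut -c track,popularity | csvlook
fi
```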
STACKING FILES TOGETHER ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Stack 2 csv files having the same schema $ csvstack file1.csv file2.csv > file.csv
2. Stack 2 csv files and add a grouping field $ csvstack -g "file1","file2" file1.csv file2.csv > file.csv
3. Same as above, and name the grouping field $ csvstack -g "file1","file2" -n "src" file1.csv file2.csv > file.csv
CHAINING COMMANDS ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Run commands sequentially $ csvcut -n file.csv; csvcut -c 1,2,3 file.csv
2. Run the 2nd command only if the 1st succeeds $ csvcut -n file.csv && csvcut -c 1,2,3 file.csv
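The difference between `;` and `&&` only shows when the first command fails. A minimal sketch, using the shell built-in `false` as a stand-in for a failing first command (the variable names are made up for the example):

```shell
set +e   # make sure a failing command does not abort the script

# "false" always fails, like a command pointed at a missing file.
with_semicolon=$(false; echo "second command ran")
with_and=$(false && echo "second command ran")

echo "using ;  -> '$with_semicolon'"
echo "using && -> '$with_and'"
```

With `;` the second command runs regardless, so the first variable holds the message. With `&&` the second command is skipped, so the second variable stays empty.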

PART F. Database Operations on the Command Line


INTRODUCTION TO MySQL ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install MySQL on terminal $ sudo apt install mysql-server
2. Access MySQL via root user $ sudo mysql (or: $ sudo mysql -u root -p)
3. Create MySQL account & password > create user 'tae'@'localhost' identified by '246357';
4. Give full permission to new account > grant all privileges on *.* to 'tae'@'localhost';
5. List all users > SELECT user, host FROM mysql.user;
6. Delete user > DROP USER 'tae'@'localhost';
7. Reload privilege tables so changes take effect immediately > FLUSH PRIVILEGES;
8. Log out > quit;
9. Clear screen > \! clear
10. Access MySQL via a specific user $ sudo mysql -u tae -p
QUERYING ON CSV FILE ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Apply SQL to local csv file $ csvsql --query "SELECT * FROM file_name" file_name.csv > output.csv
2. Render a better view $ csvsql --query "query" file_name.csv | csvlook
3. Assign long query to variable $ sql="SELECT field1, field2 FROM table_name"
4. Use that query $ csvsql --query "$sql" Spotify_Attributes.csv Spotify_Popularity.csv
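A small sketch of the variable approach, assuming a made-up `songs.csv`: csvsql names each table after its file without the extension, so the query refers to `songs`. Keeping the variable in double quotes when passing it to `--query` keeps the multi-line query as one argument. The csvsql call is skipped if csvkit is not installed.

```shell
# Made-up sample data for illustration only.
cat > songs.csv <<'EOF'
track,artist,popularity
Song A,Artist X,72
Song B,Artist Y,45
Song C,Artist X,60
EOF

# A longer query is easier to read when stored across several lines.
sql="SELECT track, popularity
     FROM songs
     WHERE popularity > 50"

if command -v csvsql >/dev/null 2>&1; then
    csvsql --query "$sql" songs.csv | csvlook
fi
```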
PART G. Using Python On Command Line
PYTHON INSTALLATION ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install Python programming language $ sudo apt install python3
2. Check Python version $ python3 --version
3. Check pip $ pip --version
4. Install package $ pip install <package_name>
5. Upgrade package $ pip install --upgrade <package_name>
6. Check all Python packages $ pip list
7. Install multiple packages
$ nano requirements.txt
~ numpy
~ pandas
~ matplotlib
~ scikit-learn
~ seaborn
~ xgboost
~ tensorflow
~ keras
$ pip install -r requirements.txt
INTRODUCTION TO PYTHON ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Run Python on terminal
$ python3
> print("Hello World")
> quit()
2. Run Python file
$ nano test.py
~ print("Hello World!")
$ python3 test.py
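Both steps can also be done without an editor or an interactive session. The sketch below writes `test.py` with a here-document and, as an addition not shown in the sheet, uses `python3 -c` to run a single statement directly from the shell. It skips the Python calls if `python3` is not installed.

```shell
# Create the script non-interactively (same content as the nano step).
cat > test.py <<'EOF'
print("Hello World!")
EOF

if command -v python3 >/dev/null 2>&1; then
    file_output=$(python3 test.py)                       # run a file
    inline_output=$(python3 -c 'print("Hello World")')   # run one statement
    echo "$file_output"
    echo "$inline_output"
fi
```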

PART H. Using C++ On Command Line


C++ INSTALLATION ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Install C++ compiler (g++) $ sudo apt install g++
2. Check compiler version $ g++ --version
INTRODUCTION TO C++ ───────────────────────── ⋆⋅☆⋅⋆ ─────────────────────────
1. Write C++ source file
$ nano test.cpp
~ #include <iostream>
~ using namespace std;
~
~ int main() {
~     cout << "Hello, C++ on terminal!" << endl;
~     return 0;
~ }
2. Compile program $ g++ -o test test.cpp
3. Run program $ ./test
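The write, compile, and run steps can be chained in one script. In this sketch the source file is created with a here-document, and `&&` makes sure the binary runs only if compilation succeeds. The variable name `program_output` is made up for the example, and the compile step is skipped if `g++` is not installed.

```shell
# Create the source file non-interactively (same content as the nano step).
cat > test.cpp <<'EOF'
#include <iostream>
using namespace std;

int main() {
    cout << "Hello, C++ on terminal!" << endl;
    return 0;
}
EOF

if command -v g++ >/dev/null 2>&1; then
    # Compile, then run the binary only if compilation succeeded.
    g++ -o test test.cpp && program_output=$(./test)
    echo "$program_output"
fi
```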
