
Ultimate Developer Command Guide

Python, PySpark & SQL Reference


Essential Commands for Developers

Python Commands

print(): Outputs data to the console.
  Example: print("Welcome to Python!")  # Prints: Welcome to Python!

len(): Returns the length of an object.
  Example: my_list = [1, 2, 3]; print(len(my_list))  # Output: 3

range(): Generates a sequence of numbers.
  Example: for i in range(3): print(i)  # Prints: 0, 1, 2

def: Defines a custom function.
  Example:
    def greet(name): return f"Hello, {name}"
    print(greet("Alice"))  # Prints: Hello, Alice

import: Imports a module or library.
  Example: import math; print(math.pi)  # Prints: 3.141592653589793

[x for x in iterable]: Creates a list using a comprehension.
  Example: squares = [x**2 for x in [1, 2, 3]]; print(squares)  # Prints: [1, 4, 9]

if/elif/else: Conditional logic.
  Example:
    x = 10
    if x > 5:
        print("Big")
    else:
        print("Small")
    # Prints: Big

for: Iterates over a sequence.
  Example: for fruit in ["apple", "banana"]: print(fruit)  # Prints: apple, banana

while: Loops until a condition is false.
  Example:
    count = 0
    while count < 3: print(count); count += 1
    # Prints: 0, 1, 2

try/except: Handles exceptions.
  Example:
    try:
        print(1 / 0)
    except ZeroDivisionError:
        print("Cannot divide by zero")
    # Prints: Cannot divide by zero

open(): Opens a file for reading or writing.
  Example: with open("example.txt", "w") as f: f.write("Hello")  # Creates the file with text

list.append(): Adds an item to a list.
  Example: my_list = []; my_list.append(5); print(my_list)  # Prints: [5]

dict.get(): Retrieves a value from a dictionary.
  Example: my_dict = {"key": "value"}; print(my_dict.get("key"))  # Prints: value
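
Putting several of these together: the short script below is a minimal sketch that combines def, a list comprehension, dict.get(), try/except, and open(). The variable names and the example.txt file are illustrative, not part of the guide.

    def greet(name):
        return f"Hello, {name}"

    names = ["Alice", "Bob"]
    greetings = [greet(n) for n in names]      # list comprehension
    for g in greetings:
        print(g)                               # Prints: Hello, Alice / Hello, Bob

    lengths = {n: len(n) for n in names}       # map each name to its length
    print(lengths.get("Alice"))                # Prints: 5

    try:
        print(1 / 0)                           # deliberate error
    except ZeroDivisionError:
        print("Cannot divide by zero")

    with open("example.txt", "w") as f:        # write a small text file
        f.write("Hello")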

PySpark Commands

SparkSession.builder: Initializes a Spark session.
  Example: from pyspark.sql import SparkSession; spark = SparkSession.builder.appName("MyApp").getOrCreate()

spark.read.csv(): Loads a CSV file into a DataFrame.
  Example: df = spark.read.csv("data.csv", header=True, inferSchema=True); df.show()  # Displays CSV data

df.show(): Displays the first n rows of a DataFrame.
  Example: df.show(3)  # Shows first 3 rows

df.printSchema(): Displays the DataFrame schema.
  Example: df.printSchema()  # Shows column names and types

df.select(): Selects specific columns.
  Example: df.select("name", "age").show()  # Shows name and age columns

df.filter(): Filters rows based on a condition.
  Example: df.filter(df.age > 25).show()  # Shows rows where age > 25

df.where(): Alias for filter.
  Example: df.where("salary > 50000").show()  # Filters rows where salary > 50000

df.groupBy().agg(): Groups data and applies an aggregation.
  Example: df.groupBy("department").agg({"salary": "avg"}).show()  # Shows avg salary per dept

df.join(): Joins two DataFrames.
  Example: df1.join(df2, df1.id == df2.id, "inner").show()  # Inner join on id

df.withColumn(): Adds or modifies a column.
  Example: df.withColumn("age_plus_10", df.age + 10).show()  # Adds column with age + 10

df.withColumnRenamed(): Renames a column.
  Example: df.withColumnRenamed("old_name", "new_name").show()  # Renames column

df.drop(): Drops specified columns.
  Example: df.drop("salary").show()  # Drops salary column

df.fillna(): Replaces null values.
  Example: df.fillna({"age": 0}).show()  # Replaces null ages with 0

df.dropDuplicates(): Removes duplicate rows.
  Example: df.dropDuplicates(["name"]).show()  # Drops duplicate names

df.write.csv(): Saves a DataFrame as CSV.
  Example: df.write.csv("output.csv", mode="overwrite")  # Saves DataFrame to CSV

df.createOrReplaceTempView(): Registers a DataFrame as a SQL table.
  Example: df.createOrReplaceTempView("temp_table")  # Creates SQL view

spark.sql(): Runs a SQL query on a DataFrame.
  Example: spark.sql("SELECT name FROM temp_table WHERE age > 30").show()  # Runs SQL query

Window.partitionBy(): Defines a window for ranking/aggregation.
  Example:
    from pyspark.sql.window import Window
    from pyspark.sql.functions import row_number
    w = Window.partitionBy("dept").orderBy("salary")
    df.withColumn("rank", row_number().over(w)).show()  # Adds rank column
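
As a quick illustration of how these DataFrame operations chain together, here is a minimal sketch of a small pipeline. The input file employees.csv, its columns (name, age, department, salary), and the output path are assumptions for the example, not part of the guide.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("MyApp").getOrCreate()

    # Assumed input: employees.csv with name, age, department, salary columns
    df = spark.read.csv("employees.csv", header=True, inferSchema=True)

    result = (
        df.filter(df.age > 25)                        # keep employees older than 25
          .withColumn("age_plus_10", df.age + 10)     # add a derived column
          .groupBy("department")
          .agg(F.avg("salary").alias("avg_salary"))   # average salary per department
    )
    result.show()

    result.write.csv("avg_salary_by_dept", mode="overwrite")  # save the aggregate
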
SQL Commands

SELECT: Retrieves data from a table.
  Example: SELECT name, age FROM employees  -- Selects name and age columns

WHERE: Filters rows based on a condition.
  Example: SELECT * FROM employees WHERE age > 30  -- Filters employees older than 30

ORDER BY: Sorts the result set.
  Example: SELECT * FROM employees ORDER BY salary DESC  -- Sorts by salary in descending order

GROUP BY: Groups rows for aggregation.
  Example: SELECT department, AVG(salary) FROM employees GROUP BY department  -- Avg salary per dept

HAVING: Filters grouped results.
  Example: SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 5  -- Depts with > 5 employees

JOIN: Combines rows from multiple tables.
  Example: SELECT e.name, d.dept_name FROM employees e JOIN departments d ON e.dept_id = d.id  -- Joins tables

LEFT JOIN: Includes all rows from the left table.
  Example: SELECT e.name, d.dept_name FROM employees e LEFT JOIN departments d ON e.dept_id = d.id  -- Left join

LIMIT: Restricts the number of returned rows.
  Example: SELECT * FROM employees LIMIT 5  -- Returns first 5 rows

INSERT INTO: Adds new rows to a table.
  Example: INSERT INTO employees (name, age) VALUES ('Alice', 28)  -- Inserts a new employee

UPDATE: Modifies existing rows.
  Example: UPDATE employees SET salary = 60000 WHERE name = 'Alice'  -- Updates salary

DELETE: Removes rows from a table.
  Example: DELETE FROM employees WHERE age < 18  -- Deletes rows where age < 18

CREATE TABLE: Creates a new table.
  Example: CREATE TABLE employees (id INT, name VARCHAR(50), age INT)  -- Creates employees table

ALTER TABLE: Modifies table structure.
  Example: ALTER TABLE employees ADD COLUMN salary DECIMAL(10,2)  -- Adds salary column

DROP TABLE: Deletes a table.
  Example: DROP TABLE employees  -- Deletes employees table
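
These SQL statements can also be exercised directly from Python. The sketch below uses Python's built-in sqlite3 module (an assumption for illustration; the guide does not name a specific database) to run the CREATE TABLE, ALTER TABLE, INSERT INTO, UPDATE, DELETE, and SELECT examples end to end.

    import sqlite3

    # In-memory database, purely for illustration
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    cur.execute("CREATE TABLE employees (id INT, name VARCHAR(50), age INT)")
    cur.execute("ALTER TABLE employees ADD COLUMN salary DECIMAL(10,2)")
    cur.execute("INSERT INTO employees (id, name, age, salary) VALUES (1, 'Alice', 28, 55000)")
    cur.execute("INSERT INTO employees (id, name, age, salary) VALUES (2, 'Bob', 17, 0)")
    cur.execute("UPDATE employees SET salary = 60000 WHERE name = 'Alice'")
    cur.execute("DELETE FROM employees WHERE age < 18")

    cur.execute("SELECT name, age, salary FROM employees ORDER BY salary DESC LIMIT 5")
    print(cur.fetchall())  # [('Alice', 28, 60000)]

    conn.close()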

Cheat Sheet Summary


Comprehensive reference for Python, PySpark, and SQL development tasks.
Version 2.0 | Updated: August 2024

Print Tip: Use Ctrl+P (Win) / Cmd+P (Mac) to save as PDF
