# [ Excel Automation with Python ] CheatSheet
1. Basic Excel Operations with openpyxl
● Open workbook: wb = openpyxl.load_workbook("file.xlsx")
● Create workbook: wb = openpyxl.Workbook()
● Save workbook: wb.save("file.xlsx")
● Get active sheet: ws = wb.active
● Create new sheet: ws = wb.create_sheet("Sheet1")
● Access cell value: value = ws['A1'].value
● Write to cell: ws['A1'] = "Value"
● Get cell range: cell_range = ws['A1:B10']
● Get row values: row_values = list(ws.rows)[0]
● Get column values: col_values = list(ws.columns)[0]
● Copy sheet: wb.copy_worksheet(ws)
● Delete sheet: wb.remove(ws)
● Rename sheet: ws.title = "NewName"
● Get sheet names: sheet_names = wb.sheetnames
● Select cell range: cell_range = ws["A1:D10"]
● Get max row: max_row = ws.max_row
● Get max column: max_col = ws.max_column
● Write formula: ws['A1'] = '=SUM(B1:B10)'
● Get cell coordinates: coord = ws['A1'].coordinate
● Set column width: ws.column_dimensions['A'].width = 20
2. Excel Data Manipulation with pandas
● Read Excel: df = pd.read_excel("file.xlsx")
● Write to Excel: df.to_excel("output.xlsx", index=False)
● Read specific sheet: df = pd.read_excel("file.xlsx",
sheet_name="Sheet1")
● Read multiple sheets: dfs = pd.read_excel("file.xlsx",
sheet_name=None)
● Read specific range: df = pd.read_excel("file.xlsx", usecols="A:D",
nrows=100)
● Write multiple sheets: with pd.ExcelWriter("output.xlsx") as writer:
df.to_excel(writer, sheet_name="Sheet1")
● Format Excel output: df.to_excel("output.xlsx", float_format="%.2f")
● Read with header: df = pd.read_excel("file.xlsx", header=0)
● Read with skiprows: df = pd.read_excel("file.xlsx", skiprows=5)
By: Waleed Mousa
● Write with index: df.to_excel("output.xlsx", index=True)
● Read specific cells: df = pd.read_excel("file.xlsx", usecols="A,C,E")
● Write with formatting: df.style.to_excel("output.xlsx")
● Read with date parser: df = pd.read_excel("file.xlsx",
parse_dates=['Date'])
● Write to specific sheet: df.to_excel("output.xlsx",
sheet_name="Data")
● Convert Excel to CSV: df.to_csv("output.csv")
● Merge Excel files: pd.concat([df1, df2]).to_excel("merged.xlsx")
● Filter Excel data: df[df['Column'] > 100].to_excel("filtered.xlsx")
● Sort Excel data: df.sort_values('Column').to_excel("sorted.xlsx")
● Pivot Excel data: df.pivot_table(values='Value',
index='Index').to_excel("pivot.xlsx")
● Group Excel data: df.groupby('Column').sum().to_excel("grouped.xlsx")
3. Excel Formatting with openpyxl
● Set font: ws['A1'].font = Font(bold=True, size=12)
● Set alignment: ws['A1'].alignment = Alignment(horizontal='center')
● Set border: ws['A1'].border = Border(left=Side(style='thin'))
● Set fill color: ws['A1'].fill = PatternFill(start_color="FFFF00",
fill_type="solid")
● Set number format: ws['A1'].number_format = '0.00%'
● Merge cells: ws.merge_cells('A1:D1')
● Unmerge cells: ws.unmerge_cells('A1:D1')
● Set row height: ws.row_dimensions[1].height = 20
● Set print area: ws.print_area = 'A1:D10'
● Set print title rows: ws.print_title_rows = '1:2'
● Apply conditional formatting: ws.conditional_formatting.add('A1:A10',
FormulaRule(formula=['$A1>5']))
● Set protection: ws.protection.sheet = True
● Set column width: ws.column_dimensions['A'].width = 15
● Hide column: ws.column_dimensions['A'].hidden = True
● Hide row: ws.row_dimensions[1].hidden = True
● Set sheet tab color: ws.sheet_properties.tabColor = "1072BA"
● Add hyperlink: ws['A1'].hyperlink = "http://example.com"
● Set comment: ws['A1'].comment = Comment("Comment text", "Author")
● Apply style template: ws['A1'].style = 'Hyperlink'
● Set print scaling: ws.page_setup.scale = 75
By: Waleed Mousa
4. Excel Automation with win32com
● Create Excel application: excel =
win32com.client.Dispatch("Excel.Application")
● Set visibility: excel.Visible = True
● Open workbook: wb = excel.Workbooks.Open("file.xlsx")
● Run macro: excel.Application.Run("MacroName")
● Get active sheet: ws = excel.ActiveSheet
● Save as: wb.SaveAs("newfile.xlsx")
● Close workbook: wb.Close(SaveChanges=True)
● Quit Excel: excel.Quit()
● Copy range: ws.Range("A1:D10").Copy()
● Paste range: ws.Range("E1").PasteSpecial()
● Auto fit columns: ws.UsedRange.Columns.AutoFit()
● Refresh all: wb.RefreshAll()
● Calculate sheet: ws.Calculate()
● Set calculation mode: excel.Calculation = -4105 #
xlCalculationAutomatic
● Get last row: last_row = ws.Cells(ws.Rows.Count, "A").End(-4162).Row
● Get last column: last_col = ws.Cells(1,
ws.Columns.Count).End(-4159).Column
● Clear contents: ws.Range("A1:D10").ClearContents()
● Clear formats: ws.Range("A1:D10").ClearFormats()
● Set print area: ws.PageSetup.PrintArea = "A1:D10"
● Save as PDF: wb.ExportAsFixedFormat(0, "output.pdf")
5. Excel Charts and Visualizations
● Create chart: chart = ws.Shapes.AddChart2()
● Set chart type: chart.Chart.ChartType = -4100 # xlLine
● Set chart source data: chart.Chart.SetSourceData(ws.Range("A1:D10"))
● Set chart title: chart.Chart.ChartTitle.Text = "Sales Report"
● Set chart position: chart.Left, chart.Top = 100, 100
● Add chart legend: chart.Chart.HasLegend = True
● Format chart axis: chart.Chart.Axes(1).HasTitle = True
● Set axis title: chart.Chart.Axes(1).AxisTitle.Text = "Values"
● Change chart style: chart.Chart.ChartStyle = 2
● Export chart: chart.Chart.Export("chart.png")
● Add trendline: chart.Chart.SeriesCollection(1).Trendlines.Add()
● Set chart size: chart.Height, chart.Width = 300, 400
By: Waleed Mousa
● Format data labels:
chart.Chart.SeriesCollection(1).DataLabels.ShowValue = True
● Add secondary axis: chart.Chart.SeriesCollection(2).AxisGroup = 2
● Set chart colors:
chart.Chart.SeriesCollection(1).Format.Fill.ForeColor.RGB = 255
● Add error bars: chart.Chart.SeriesCollection(1).ErrorBars.Add()
● Format grid lines:
chart.Chart.Axes(1).MajorGridlines.Format.Line.Weight = 0.5
● Set plot area: chart.Chart.PlotArea.Format.Fill.ForeColor.RGB =
16777215
● Add data table: chart.Chart.HasDataTable = True
● Format chart area: chart.Chart.ChartArea.Format.Fill.ForeColor.RGB =
16777215
6. Excel Reports Generation
● Create report template: template =
openpyxl.load_workbook("template.xlsx")
● Fill report data: for row, data in enumerate(data_list, start=1):
ws.cell(row=row, column=1, value=data)
● Add report header: ws.merge_cells('A1:D1'); ws['A1'] = 'Report Title'
● Add page numbers: ws.oddHeader.center.text = "Page &P of &N"
● Set print options: ws.page_setup.fitToPage = True
● Add summary formulas: ws['A10'] = '=SUM(A1:A9)'
● Add report footer: ws.oddFooter.center.text = "Generated on &D"
● Create table of contents: ws.append(['Section', 'Page'])
● Add report filters: ws.auto_filter.ref = 'A1:D10'
● Format report sections: ws['A1:D1'].style = 'Heading 1'
● Add conditional highlights: ws.conditional_formatting.add('A1:A10',
ColorScaleRule(start_color='FF0000', end_color='00FF00'))
● Generate report timestamp: ws['A1'] =
datetime.now().strftime('%Y-%m-%d %H:%M:%S')
● Add report metadata: wb.properties.title = "Monthly Report"
● Create dynamic ranges: ws.defined_names['DataRange'] =
'Sheet1!$A$1:$D$10'
● Add report validation: dv = DataValidation(type="list",
formula1='"Option1,Option2,Option3"')
● Set report password: wb.security.workbookPassword = 'password'
● Create report sections: [wb.create_sheet(f"Section{i}") for i in
range(1,4)]
By: Waleed Mousa
● Add cross-references: ws['A1'] = '=Sheet2!A1'
● Generate summary dashboard: dashboard = wb.create_sheet("Dashboard",
0)
● Export report formats: wb.save('report.xlsx'); wb.save('report.pdf')
7. Excel Data Processing
● Filter data: df = df[df['Column'] > value]
● Sort data: df = df.sort_values(['Column1', 'Column2'])
● Remove duplicates: df = df.drop_duplicates()
● Fill missing values: df = df.fillna(0)
● Apply formula to column: df['New'] = df['Column'].apply(lambda x:
x*2)
● Pivot data: pivot = pd.pivot_table(df, values='Value',
index='Index', columns='Column')
● Group and aggregate: grouped = df.groupby('Column').agg({'Value':
'sum'})
● Merge data: merged = pd.merge(df1, df2, on='Key')
● Convert data types: df['Column'] = df['Column'].astype('int')
● Calculate running totals: df['Running_Total'] = df['Value'].cumsum()
● Apply conditional logic: df['Flag'] = np.where(df['Value'] > 100,
'High', 'Low')
● Create time series: df['Date'] = pd.date_range(start='1/1/2023',
periods=len(df))
● Reshape data: reshaped = df.melt(id_vars=['ID'], value_vars=['Val1',
'Val2'])
● Calculate percentages: df['Percent'] = df['Value'] / df['Total'] *
100
● Create bins: df['Bins'] = pd.cut(df['Value'], bins=5)
● Calculate statistics: stats = df.describe()
● Apply rolling calculations: df['MA'] =
df['Value'].rolling(window=3).mean()
● Create cross-tabulation: ctab = pd.crosstab(df['Col1'], df['Col2'])
● Calculate correlation: corr = df.corr()
● Normalize data: normalized = (df - df.mean()) / df.std()
8. Excel File Operations
● Combine multiple files: pd.concat([pd.read_excel(f) for f in
excel_files])
By: Waleed Mousa
● Split workbook: for sheet in wb.sheetnames: pd.read_excel(file,
sheet_name=sheet).to_excel(f"{sheet}.xlsx")
● Convert to CSV: pd.read_excel("file.xlsx").to_csv("file.csv")
● Batch process files: [process_excel_file(f) for f in
glob.glob("*.xlsx")]
● Archive workbooks: shutil.make_archive("excel_backup", "zip",
"excel_folder")
● Monitor file changes:
watchdog.observers.Observer().schedule(ExcelHandler(), path='.')
● Compare workbooks:
pd.read_excel("file1.xlsx").equals(pd.read_excel("file2.xlsx"))
● Create backup: shutil.copy2("original.xlsx",
f"backup_{datetime.now():%Y%m%d}.xlsx")
● Clean temp files: [os.remove(f) for f in glob.glob("~$*.xlsx")]
● Check file exists: os.path.exists("file.xlsx")
● Get file metadata: os.stat("file.xlsx")
● Set file permissions: os.chmod("file.xlsx", 0o666)
● Move files: shutil.move("source.xlsx", "destination/source.xlsx")
● Delete old files: [os.remove(f) for f in glob.glob("*.xlsx") if
file_age(f) > 30]
● Encrypt workbook: pyAesCrypt.encryptFile("file.xlsx",
"encrypted.xlsx", password)
● Decrypt workbook: pyAesCrypt.decryptFile("encrypted.xlsx",
"decrypted.xlsx", password)
● Create directory structure: os.makedirs("excel/reports",
exist_ok=True)
● Get file size: os.path.getsize("file.xlsx")
● Check if file locked: try: os.rename(file, file)
● Set file attributes: win32api.SetFileAttributes("file.xlsx",
win32con.FILE_ATTRIBUTE_READONLY)
9. Excel Integration with Other Tools
● Send email with Excel:
outlook.CreateItem(0).Attachments.Add("report.xlsx")
● Upload to SharePoint:
ctx.web.get_file_by_path("file.xlsx").upload(content)
● Connect to SQL: pd.read_sql("SELECT * FROM table",
connection).to_excel("output.xlsx")
● Export to Power BI: powerbi_dataset.tables["Table"].upload_data(df)
By: Waleed Mousa
● Sync with OneDrive: graph_client.upload_file("/Documents/file.xlsx",
content)
● Import from Google Sheets:
gspread.authorize(creds).open("Sheet").sheet1.get_all_records()
● Export to Teams: teams_client.upload_file("channel", "file.xlsx")
● Connect to SAP:
pyrfc.Connection().call("RFC_READ_TABLE").to_excel("sap_data.xlsx")
● Azure Blob storage: blob_client.upload_blob("file.xlsx")
● AWS S3 upload: s3.upload_file("file.xlsx", "bucket", "key")
● Google Drive upload:
drive_service.files().create(media_body="file.xlsx")
● Dropbox sync: dbx.files_upload(data, "/file.xlsx")
● FTP upload: ftplib.FTP(server).storbinary("STOR file.xlsx",
open("file.xlsx", "rb"))
● REST API integration: requests.post(url, files={"file":
open("file.xlsx", "rb")})
● MongoDB export:
pd.DataFrame(mongo.find()).to_excel("mongo_data.xlsx")
● Redis cache: redis.set("excel_data", pickle.dumps(df))
● ElasticSearch export:
pd.DataFrame(es.search()["hits"]).to_excel("es_data.xlsx")
● Tableau integration: tableau.publish("workbook.xlsx", "project",
"name")
● Slack notification: slack_client.files_upload(channels="#channel",
file="report.xlsx")
● Jira attachment: jira.add_attachment(issue, "file.xlsx")
10. Excel Data Validation and Error Handling
● Validate data types: df.dtypes.apply(lambda x: x == expected_type)
● Check missing values: df.isnull().sum()
● Validate range: df['Column'].between(min_val, max_val).all()
● Check duplicates: df.duplicated().any()
● Validate formulas: xl.Range("A1").Formula.startswith("=")
● Check cell errors: xl.Range("A1").Text == "#N/A"
● Validate constraints: df.apply(lambda x: validate_rules(x), axis=1)
● Log validation errors: logging.error(f"Validation failed: {error}")
● Handle exceptions: try: process_excel() except Exception as e:
handle_error(e)
● Check file corruption: try: openpyxl.load_workbook("file.xlsx")
By: Waleed Mousa
● Validate date format: pd.to_datetime(df['Date'],
errors='coerce').notna().all()
● Check numeric values: df['Number'].apply(lambda x: str(x).isdigit())
● Validate email format: df['Email'].str.match(email_pattern)
● Check required fields: df[required_columns].notna().all()
● Validate unique keys: df['ID'].is_unique
● Check reference integrity:
set(df['ForeignKey']).issubset(set(master_df['ID']))
● Validate business rules: df.apply(validate_business_rules, axis=1)
● Track validation status: df['Valid'] = df.apply(validate_row, axis=1)
● Generate validation report:
validation_results.to_excel("validation_report.xlsx")
● Clean invalid data: df.where(df.apply(validate_data), np.nan)
11. Excel Automation Best Practices
● Set error handling: sys.excepthook = custom_exception_handler
● Implement logging:
logging.basicConfig(filename='excel_automation.log')
● Use context managers: with pd.ExcelWriter("file.xlsx") as writer:
● Optimize performance: pd.options.mode.chained_assignment = None
● Implement retry logic: @retry(stop_max_attempt_number=3)
● Use configuration files: config =
configparser.ConfigParser().read('config.ini')
● Create backup strategy: schedule.every().day.do(backup_excel_files)
● Monitor memory usage: psutil.Process().memory_info().rss
● Implement timeout: signal.alarm(timeout)
● Use progress bars: tqdm(excel_files)
● Implement parallel processing: with ThreadPoolExecutor() as executor:
● Cache frequent operations: @functools.lru_cache(maxsize=128)
● Use type hints: def process_excel(file: str) -> pd.DataFrame:
● Implement unit tests: pytest.mark.parametrize("input,expected")
● Create documentation: sphinx-quickstart
● Version control Excel files: git add *.xlsx
● Implement security measures: cryptography.fernet.Fernet(key)
● Monitor execution time: timeit.default_timer()
● Create cleanup procedures: atexit.register(cleanup_function)
● Implement health checks: healthcheck.add_check(check_excel_system)
12. Advanced Excel Formulas with Python
By: Waleed Mousa
● Create array formula: ws["A1"] = "=SUM(IF(A2:A10>0,A2:A10,0))"
● Dynamic VLOOKUP: f'=VLOOKUP({cell},range,{col_index},FALSE)'
● Nested IF statements:
f'=IF({cond1},{val1},IF({cond2},{val2},{val3}))'
● SUMIFS equivalent: df.loc[df['Col1'] > 0, 'Col2'].sum()
● COUNTIFS in Python: len(df.loc[(df['Col1'] > 0) & (df['Col2'] ==
'Value')])
● AVERAGEIFS: df.loc[df['Col1'] > 0, 'Col2'].mean()
● INDEX/MATCH: f'=INDEX({range},MATCH({lookup_val},{lookup_range},0))'
● Pivot formulas: pd.pivot_table(df, values='Val', index='Idx',
aggfunc='sum')
● Running totals: df['Total'] = df['Value'].cumsum()
● Conditional formatting formulas: f'=AND({range}>=0,{range}<=100)'
● Date calculations: df['Days'] = (df['Date2'] - df['Date1']).dt.days
● Text concatenation: df['Full'] = df['First'] + ' ' + df['Last']
● Rank calculation: df['Rank'] = df['Value'].rank(method='min')
● Percentage of total: df['Pct'] = df['Value'] / df['Value'].sum() *
100
● Moving average: df['MA'] = df['Value'].rolling(window=3).mean()
● Compound growth: (df['End'] / df['Start']) ** (1/periods) - 1
● Weighted average: np.average(values, weights=weights)
● Financial calculations: npf.irr(cash_flows)
● Statistical formulas: df['ZScore'] = (df['Value'] -
df['Value'].mean()) / df['Value'].std()
● Lookup with multiple criteria: df.loc[(df['Col1'] == val1) &
(df['Col2'] == val2), 'Result'].iloc[0]
13. Excel Reporting Automation
● Create report template:
Template("template.xlsx").generate(data=report_data)
● Generate KPI dashboard: create_dashboard(kpi_data,
template="dashboard.xlsx")
● Automated charts: create_chart(data, chart_type='line', title='Trend
Analysis')
● Schedule reports: schedule.every().monday.do(generate_weekly_report)
● Email distribution: send_report_email(recipients,
attachment="report.xlsx")
● Dynamic headers: ws.cell(row=1, column=1, value=f"Report
{datetime.now():%Y-%m-%d}")
By: Waleed Mousa
● Conditional sections: if show_section: add_report_section(ws,
section_data)
● Report pagination: add_page_numbers(ws, format="Page {0} of {1}")
● Table of contents: create_toc(wb, sheet_list)
● Custom styling: apply_report_style(ws, style_config)
● Data validation rules: add_validation(ws, range="A1:A10",
type="list", formula1='"Yes,No"')
● Automated summary: create_executive_summary(data)
● Report footnotes: add_footnotes(ws, notes_list)
● Cross-references: add_sheet_references(wb)
● Report versioning: save_report_version(wb, version_info)
● Watermarks: add_watermark(ws, text="Confidential")
● Custom headers/footers: ws.oddHeader.center.text = "&[Date] &[Time]"
● Report metadata: set_document_properties(wb, properties_dict)
● Print setup: configure_print_settings(ws, orientation='landscape')
● Export formats: save_in_formats(wb, formats=['xlsx', 'pdf', 'csv'])
14. Excel Data Analysis Automation
● Descriptive statistics: df.describe().to_excel("stats.xlsx")
● Correlation analysis: df.corr().to_excel("correlation.xlsx")
● Time series analysis: perform_time_series_analysis(df['Values'])
● Regression analysis: statsmodels.OLS(y, X).fit()
● Cluster analysis: KMeans(n_clusters=3).fit_predict(data)
● Factor analysis: FactorAnalyzer().fit_transform(data)
● Hypothesis testing: scipy.stats.ttest_ind(group1, group2)
● Variance analysis: scipy.stats.f_oneway(data1, data2)
● Outlier detection: detect_outliers(df, method='zscore')
● Trend analysis: calculate_trend(time_series)
● Seasonal decomposition: seasonal_decompose(df['Values'])
● Forecast generation: generate_forecast(historical_data)
● Sensitivity analysis: perform_sensitivity_analysis(model, params)
● Monte Carlo simulation: run_monte_carlo(iterations=1000)
● Portfolio optimization: optimize_portfolio(returns, constraints)
● Risk analysis: calculate_var(returns, confidence=0.95)
● Decision tree analysis: DecisionTreeClassifier().fit(X, y)
● Pattern recognition: find_patterns(data_series)
● Scenario analysis: analyze_scenarios(base_case, scenarios)
● Attribution analysis: perform_attribution_analysis(returns)
By: Waleed Mousa
15. Custom Excel Functions
● Create UDF: @xw.func def custom_function(x): return x * 2
● Array function: @xw.func @xw.arg('x', ndim=2) def matrix_operation(x)
● Date function: @xw.func def workdays(start, end): return
np.busday_count(start, end)
● String manipulation: @xw.func def clean_text(text): return
re.sub(r'\W+', '', text)
● Lookup function: @xw.func def smart_lookup(lookup_value,
lookup_array, return_array)
● Statistical function: @xw.func def outlier_count(data): return
detect_outliers(data)
● Financial function: @xw.func def npv_custom(rate, cashflows)
● Range function: @xw.func def range_operation(range_input)
● Conditional function: @xw.func def conditional_calc(condition,
value1, value2)
● Array aggregation: @xw.func def custom_aggregate(data, method='sum')
● Data validation: @xw.func def validate_data(input_value, rules)
● Format conversion: @xw.func def format_number(number, format_type)
● Complex calculation: @xw.func def multi_step_calculation(inputs)
● Dynamic array: @xw.func def dynamic_array_calc(input_range)
● Text analysis: @xw.func def text_analytics(text_input)
● Custom sorting: @xw.func def sort_custom(data, criteria)
● Range lookup: @xw.func def range_lookup(lookup_value, lookup_range)
● Data transformation: @xw.func def transform_data(input_data,
transformation)
● Custom aggregation: @xw.func def aggregate_custom(data, weights=None)
● Error handling: @xw.func def safe_calculation(input_value)
16. Advanced Excel Automation Patterns
● State Management Pattern:
ExcelState(initial_state='ready').transition_to('processing')
● Observer Pattern: ExcelObserver().register_callback(on_change_event)
● Command Pattern: ExcelCommandExecutor().execute(command)
● Factory Pattern: ExcelReportFactory.create_report(report_type)
● Builder Pattern:
ExcelReportBuilder().add_header().add_data().add_footer().build()
● Strategy Pattern: ExcelProcessor(strategy=CustomStrategy()).process()
By: Waleed Mousa
● Chain of Responsibility:
ExcelChain().add_handler(validation).add_handler(processing)
● Template Method: class CustomReport(ExcelReportTemplate): def
process_data(self)
● Decorator Pattern: @excel_logging @excel_validation def
process_workbook()
● Facade Pattern: ExcelFacade().process_complex_operation()
● Singleton Connection: ExcelConnection().get_instance()
● Proxy Pattern: ExcelProxy(real_workbook).save()
● Pipeline Pattern:
ExcelPipeline().pipe(clean).pipe(transform).pipe(validate)
● Repository Pattern: ExcelRepository().save_workbook(wb)
● Unit of Work: with ExcelUnitOfWork() as uow:
uow.process_and_commit()
● Event Sourcing: ExcelEventStore().record_event(event)
● CQRS Pattern: ExcelCommandHandler().handle(command)
● Mediator Pattern: ExcelMediator().coordinate_processing()
● Specification Pattern: ExcelSpecification().is_satisfied_by(workbook)
● Circuit Breaker: ExcelCircuitBreaker(failure_threshold=3).execute()
By: Waleed Mousa