0% found this document useful (0 votes)
13 views

S02-Accessing A SQLite Database With Pandas

Uploaded by

AB
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views

S02-Accessing A SQLite Database With Pandas

Uploaded by

AB
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.

ipynb - Colab

keyboard_arrow_down Accessing a SQLite Database with Pandas


Pandas, a great python pakage, provides us a cool funtion of read_sql_query to access a SQLite database, directly.

This notebook is extracted from another tutorial of pySQLiteSWAT. If you are intested in more python-based tools such as Pandas, GeoPandas,
PySAL, folium, etc., you can have a look.

keyboard_arrow_down 1. Load all needed libraries


import pandas as pd
import sqlite3
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline

keyboard_arrow_down 2. Use pandas to read SQLite database


keyboard_arrow_down 2.1 Create a db connection

con = sqlite3.connect('data\\demo.db3')
cursor = con.cursor()

keyboard_arrow_down 2.2 Check table names

Here we take the RCH table as an example

cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")


print(cursor.fetchall())

[(u'rch',), (u'hru',), (u'sub',), (u'sed',), (u'watershed_daily',), (u'watershed_monthly',), (u'watershed_yearly',), (u'channel_dimensio

keyboard_arrow_down 2.3 Check columns of the RCH table

def table_columns(db, table_name):


curs = db.cursor()
sql = "select * from %s where 1=0;" % table_name
curs.execute(sql)
return [d[0] for d in curs.description]

table_columns(con, 'rch')

['RCH',
'YR',
'MO',
'FLOW_INcms',
'FLOW_OUTcms',
'EVAPcms',
'TLOSScms',
'SED_INtons',
'SED_OUTtons',
'SEDCONCmg_kg',
'ORGN_INkg',
'ORGN_OUTkg',
'ORGP_INkg',
'ORGP_OUTkg',
'NO3_INkg',
'NO3_OUTkg',
'NH4_INkg',
'NH4_OUTkg',
'NO2_INkg',
'NO2_OUTkg',
'MINP_INkg',
'MINP_OUTkg',
'CHLA_INkg',

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 1/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab
'CHLA_OUTkg',
'CBOD_INkg',
'CBOD_OUTkg',
'DISOX_INkg',
'DISOX_OUTkg',
'SOLPST_INmg',
'SOLPST_OUTmg',
'SORPST_INmg',
'SORPST_OUTmg',
'REACTPSTmg',
'VOLPSTmg',
'SETTLPSTmg',
'RESUSP_PSTmg',
'DIFFUSEPSTmg',
'REACBEDPSTmg',
'BURYPSTmg',
'BED_PSTmg',
'BACTP_OUTct',
'BACTLP_OUTct',
'CMETAL_1kg',
'CMETAL_2kg',
'CMETAL_3kg',
'TOT_Nkg',
'TOT_Pkg',
'NO3ConcMg_l',
'WTMPdegc']

keyboard_arrow_down 2.4 Read specified columns of RCH table inot a pandas DataFrame

e.g., Reach ID, Year, Month and Flow-out

df = pd.read_sql_query("SELECT RCH, YR, MO, FLOW_OUTcms from rch", con)


df = df.set_index(['MO'])
print(df.head(10))

RCH YR FLOW_OUTcms
MO
1 1 1981 146.252487
1 2 1981 96.182854
1 3 1981 11.861368
1 4 1981 49.406513
1 5 1981 272.106018
1 6 1981 486.318573
1 7 1981 23.289026
1 8 1981 214.281433
1 9 1981 193.491211
1 10 1981 53.856945

keyboard_arrow_down 2.5 Finally, do not forget to close the db connection

con.close()

keyboard_arrow_down 3. Further process flow data with pandas


keyboard_arrow_down 3.1 Custom quarters definition

quarters = {1: 'DJF', 2: 'DJF', 3: 'MAM', 4: 'MAM', 5: 'MAM', 6: 'JJA',


7: 'JJA', 8: 'JJA', 9: 'SON', 10: 'SON', 11: 'SON', 12: 'DJF'}

keyboard_arrow_down 3.2 Carry out seasonal statistics for each reach

ssndf = df.groupby(['RCH',quarters])['FLOW_OUTcms'].mean()
ssndf.head(5)

RCH
1 DJF 99.204991
JJA 1405.262298
MAM 559.746932
SON 454.737985
2 DJF 56.328539
Name: FLOW_OUTcms, dtype: float64

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 2/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab

ssndf = ssndf.reset_index()
ssndf.set_index(['RCH'])
ssndf.head(5)

RCH level_1 FLOW_OUTcms

0 1 DJF 99.204991

1 1 JJA 1405.262298

2 1 MAM 559.746932

3 1 SON 454.737985

4 2 DJF 56.328539

ssndf = ssndf.rename(index=str, columns={"level_1":"SSN"})


ssndf.head(5)

RCH SSN FLOW_OUTcms

0 1 DJF 99.204991

1 1 JJA 1405.262298

2 1 MAM 559.746932

3 1 SON 454.737985

4 2 DJF 56.328539

pivoted = ssndf.pivot(index='RCH', columns='SSN', values='FLOW_OUTcms')


pivoted.head()

SSN DJF JJA MAM SON

RCH

1 99.204991 1405.262298 559.746932 454.737985

2 56.328539 773.664280 203.242828 297.934326

3 32.838328 166.310382 379.771358 27.425488

4 28.384519 619.184517 278.996801 155.766145

5 336.824441 1635.518864 1215.423962 499.265579

keyboard_arrow_down 4. Visualize
Set some parameters to make figure pretty

# Plot size to 15" x 7"


matplotlib.rc('figure', figsize = (15, 7))
# Font size to 14
matplotlib.rc('font', size = 14)
# Display top and right frame lines
matplotlib.rc('axes.spines', top = True, right = True)
# Remove grid lines
matplotlib.rc('axes', grid = False)
# Set backgound color to white
matplotlib.rc('axes', facecolor = 'white')

pivoted.plot(kind='bar', title='Seasonal Mean Discharge from 1981 to 2010 ($m^3/s$)')

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 3/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab

<matplotlib.axes._subplots.AxesSubplot at 0xbed9128>

keyboard_arrow_down References
Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29
(2007), DOI:10.1109/MCSE.2007.53

John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55

Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51-56 (2010)

https://www.sqlite.org/lang.html

Start coding or generate with AI.

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 4/4

You might also like