|
| 1 | +# Importing_and_Exporting_Data_in_Pandas |
| 2 | + |
| 3 | +>Created by Krishna Kaushik |
| 4 | +
|
| 5 | +- **Now we're able to create `Series` and `DataFrames` in pandas, but we usually do not do this , in practice we import the data which is in the form of .csv (Comma Seperated Values) , a spreadsheet file or something similar.** |
| 6 | + |
| 7 | +- *Good news is that pandas allows for easy importing of data like this through functions such as ``pd.read_csv()`` and ``pd.read_excel()`` for Microsoft Excel files.* |
| 8 | + |
| 9 | +## 1. Importing from a Google sheet to a pandas dataframe |
| 10 | + |
| 11 | +*Let's say that you wanted to get the information from Google Sheet document into a pandas DataFrame.*. |
| 12 | + |
| 13 | +*You could export it as a .csv file and then import it using ``pd.read_csv()``.* |
| 14 | + |
| 15 | +*In this case, the exported .csv file is called `Titanic.csv`* |
| 16 | + |
| 17 | + |
| 18 | +```python |
| 19 | +## Importing Titanic Data set |
| 20 | +import pandas as pd |
| 21 | + |
| 22 | +titanic_df= pd.read_csv("https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Titanic.csv") |
| 23 | +titanic_df |
| 24 | +``` |
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +<div> |
| 30 | +<style scoped> |
| 31 | + .dataframe tbody tr th:only-of-type { |
| 32 | + vertical-align: middle; |
| 33 | + } |
| 34 | + |
| 35 | + .dataframe tbody tr th { |
| 36 | + vertical-align: top; |
| 37 | + } |
| 38 | + |
| 39 | + .dataframe thead th { |
| 40 | + text-align: right; |
| 41 | + } |
| 42 | +</style> |
| 43 | +<table border="1" class="dataframe"> |
| 44 | + <thead> |
| 45 | + <tr style="text-align: right;"> |
| 46 | + <th></th> |
| 47 | + <th>pclass</th> |
| 48 | + <th>survived</th> |
| 49 | + <th>name</th> |
| 50 | + <th>sex</th> |
| 51 | + <th>age</th> |
| 52 | + <th>sibsp</th> |
| 53 | + <th>parch</th> |
| 54 | + <th>ticket</th> |
| 55 | + <th>fare</th> |
| 56 | + <th>cabin</th> |
| 57 | + <th>embarked</th> |
| 58 | + <th>boat</th> |
| 59 | + <th>body</th> |
| 60 | + <th>home.dest</th> |
| 61 | + </tr> |
| 62 | + </thead> |
| 63 | + <tbody> |
| 64 | + <tr> |
| 65 | + <th>0</th> |
| 66 | + <td>1</td> |
| 67 | + <td>1</td> |
| 68 | + <td>Allen, Miss. Elisabeth Walton</td> |
| 69 | + <td>female</td> |
| 70 | + <td>29.00</td> |
| 71 | + <td>0</td> |
| 72 | + <td>0</td> |
| 73 | + <td>24160</td> |
| 74 | + <td>211.3375</td> |
| 75 | + <td>B5</td> |
| 76 | + <td>S</td> |
| 77 | + <td>2</td> |
| 78 | + <td>NaN</td> |
| 79 | + <td>St Louis, MO</td> |
| 80 | + </tr> |
| 81 | + <tr> |
| 82 | + <th>1</th> |
| 83 | + <td>1</td> |
| 84 | + <td>1</td> |
| 85 | + <td>Allison, Master. Hudson Trevor</td> |
| 86 | + <td>male</td> |
| 87 | + <td>0.92</td> |
| 88 | + <td>1</td> |
| 89 | + <td>2</td> |
| 90 | + <td>113781</td> |
| 91 | + <td>151.5500</td> |
| 92 | + <td>C22 C26</td> |
| 93 | + <td>S</td> |
| 94 | + <td>11</td> |
| 95 | + <td>NaN</td> |
| 96 | + <td>Montreal, PQ / Chesterville, ON</td> |
| 97 | + </tr> |
| 98 | + <tr> |
| 99 | + <th>2</th> |
| 100 | + <td>1</td> |
| 101 | + <td>0</td> |
| 102 | + <td>Allison, Miss. Helen Loraine</td> |
| 103 | + <td>female</td> |
| 104 | + <td>2.00</td> |
| 105 | + <td>1</td> |
| 106 | + <td>2</td> |
| 107 | + <td>113781</td> |
| 108 | + <td>151.5500</td> |
| 109 | + <td>C22 C26</td> |
| 110 | + <td>S</td> |
| 111 | + <td>NaN</td> |
| 112 | + <td>NaN</td> |
| 113 | + <td>Montreal, PQ / Chesterville, ON</td> |
| 114 | + </tr> |
| 115 | + <tr> |
| 116 | + <th>3</th> |
| 117 | + <td>1</td> |
| 118 | + <td>0</td> |
| 119 | + <td>Allison, Mr. Hudson Joshua Creighton</td> |
| 120 | + <td>male</td> |
| 121 | + <td>30.00</td> |
| 122 | + <td>1</td> |
| 123 | + <td>2</td> |
| 124 | + <td>113781</td> |
| 125 | + <td>151.5500</td> |
| 126 | + <td>C22 C26</td> |
| 127 | + <td>S</td> |
| 128 | + <td>NaN</td> |
| 129 | + <td>135.0</td> |
| 130 | + <td>Montreal, PQ / Chesterville, ON</td> |
| 131 | + </tr> |
| 132 | + <tr> |
| 133 | + <th>4</th> |
| 134 | + <td>1</td> |
| 135 | + <td>0</td> |
| 136 | + <td>Allison, Mrs. Hudson J C (Bessie Waldo Daniels)</td> |
| 137 | + <td>female</td> |
| 138 | + <td>25.00</td> |
| 139 | + <td>1</td> |
| 140 | + <td>2</td> |
| 141 | + <td>113781</td> |
| 142 | + <td>151.5500</td> |
| 143 | + <td>C22 C26</td> |
| 144 | + <td>S</td> |
| 145 | + <td>NaN</td> |
| 146 | + <td>NaN</td> |
| 147 | + <td>Montreal, PQ / Chesterville, ON</td> |
| 148 | + </tr> |
| 149 | + <tr> |
| 150 | + <th>...</th> |
| 151 | + <td>...</td> |
| 152 | + <td>...</td> |
| 153 | + <td>...</td> |
| 154 | + <td>...</td> |
| 155 | + <td>...</td> |
| 156 | + <td>...</td> |
| 157 | + <td>...</td> |
| 158 | + <td>...</td> |
| 159 | + <td>...</td> |
| 160 | + <td>...</td> |
| 161 | + <td>...</td> |
| 162 | + <td>...</td> |
| 163 | + <td>...</td> |
| 164 | + <td>...</td> |
| 165 | + </tr> |
| 166 | + <tr> |
| 167 | + <th>1304</th> |
| 168 | + <td>3</td> |
| 169 | + <td>0</td> |
| 170 | + <td>Zabour, Miss. Hileni</td> |
| 171 | + <td>female</td> |
| 172 | + <td>14.50</td> |
| 173 | + <td>1</td> |
| 174 | + <td>0</td> |
| 175 | + <td>2665</td> |
| 176 | + <td>14.4542</td> |
| 177 | + <td>NaN</td> |
| 178 | + <td>C</td> |
| 179 | + <td>NaN</td> |
| 180 | + <td>328.0</td> |
| 181 | + <td>NaN</td> |
| 182 | + </tr> |
| 183 | + <tr> |
| 184 | + <th>1305</th> |
| 185 | + <td>3</td> |
| 186 | + <td>0</td> |
| 187 | + <td>Zabour, Miss. Thamine</td> |
| 188 | + <td>female</td> |
| 189 | + <td>NaN</td> |
| 190 | + <td>1</td> |
| 191 | + <td>0</td> |
| 192 | + <td>2665</td> |
| 193 | + <td>14.4542</td> |
| 194 | + <td>NaN</td> |
| 195 | + <td>C</td> |
| 196 | + <td>NaN</td> |
| 197 | + <td>NaN</td> |
| 198 | + <td>NaN</td> |
| 199 | + </tr> |
| 200 | + <tr> |
| 201 | + <th>1306</th> |
| 202 | + <td>3</td> |
| 203 | + <td>0</td> |
| 204 | + <td>Zakarian, Mr. Mapriededer</td> |
| 205 | + <td>male</td> |
| 206 | + <td>26.50</td> |
| 207 | + <td>0</td> |
| 208 | + <td>0</td> |
| 209 | + <td>2656</td> |
| 210 | + <td>7.2250</td> |
| 211 | + <td>NaN</td> |
| 212 | + <td>C</td> |
| 213 | + <td>NaN</td> |
| 214 | + <td>304.0</td> |
| 215 | + <td>NaN</td> |
| 216 | + </tr> |
| 217 | + <tr> |
| 218 | + <th>1307</th> |
| 219 | + <td>3</td> |
| 220 | + <td>0</td> |
| 221 | + <td>Zakarian, Mr. Ortin</td> |
| 222 | + <td>male</td> |
| 223 | + <td>27.00</td> |
| 224 | + <td>0</td> |
| 225 | + <td>0</td> |
| 226 | + <td>2670</td> |
| 227 | + <td>7.2250</td> |
| 228 | + <td>NaN</td> |
| 229 | + <td>C</td> |
| 230 | + <td>NaN</td> |
| 231 | + <td>NaN</td> |
| 232 | + <td>NaN</td> |
| 233 | + </tr> |
| 234 | + <tr> |
| 235 | + <th>1308</th> |
| 236 | + <td>3</td> |
| 237 | + <td>0</td> |
| 238 | + <td>Zimmerman, Mr. Leo</td> |
| 239 | + <td>male</td> |
| 240 | + <td>29.00</td> |
| 241 | + <td>0</td> |
| 242 | + <td>0</td> |
| 243 | + <td>315082</td> |
| 244 | + <td>7.8750</td> |
| 245 | + <td>NaN</td> |
| 246 | + <td>S</td> |
| 247 | + <td>NaN</td> |
| 248 | + <td>NaN</td> |
| 249 | + <td>NaN</td> |
| 250 | + </tr> |
| 251 | + </tbody> |
| 252 | +</table> |
| 253 | +<p>1309 rows × 14 columns</p> |
| 254 | +</div> |
| 255 | + |
| 256 | + |
| 257 | + |
| 258 | +The dataset I am using here for your reference is taken from the same repository i.e ``learn-python`` (https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Titanic.csv) I uploaded it you can use it from there. |
| 259 | + |
| 260 | +**Now we've got the same data from the Google Spreadsheet , but now available as ``pandas DataFrame`` which means we can now apply all pandas functionality over it.** |
| 261 | + |
| 262 | +#### Note: The quiet important thing i am telling is that ``pd.read_csv()`` takes the location of the file (which is in your current working directory) or the hyperlink of the dataset from the other source. |
| 263 | + |
| 264 | +#### But if you want to import the data from Github you can't directly use its link , you have to first convert it to raw by clicking on the raw button present in the repo . |
| 265 | + |
| 266 | +#### Also you can't use the data directly from `Kaggle` you have to use ``kaggle API`` |
| 267 | + |
| 268 | +## 2. The Anatomy of DataFrame |
| 269 | + |
| 270 | +**Different functions use different labels for different things, and can get a little confusing.** |
| 271 | + |
| 272 | +- Rows are refer as ``axis=0`` |
| 273 | +- columns are refer as ``axis=1`` |
| 274 | + |
| 275 | +## 3. Exporting Data |
| 276 | + |
| 277 | +**OK, so after you've made a few changes to your data, you might want to export it and save it so someone else can access the changes.** |
| 278 | + |
| 279 | +**pandas allows you to export ``DataFrame's`` to ``.csv`` format using ``.to_csv()``, or to a spreadsheet format using .to_excel().** |
| 280 | + |
| 281 | +### Exporting a dataframe to a CSV |
| 282 | + |
| 283 | +**We haven't made any changes yet to the ``titanic_df`` DataFrame but let's try to export it.** |
| 284 | + |
| 285 | + |
| 286 | +```python |
| 287 | +#Export the titanic_df DataFrame to csv |
| 288 | +titanic_df.to_csv("exported_titanic.csv") |
| 289 | +``` |
| 290 | + |
| 291 | +Running this will save a file called ``exported_titanic.csv`` to the current folder. |
0 commit comments