|
| 1 | +# Importing_and_Exporting_Data_in_Pandas |
| 2 | + |
| 3 | +>Created by Krishna Kaushik |
| 4 | +
|
| 5 | +- **Now we're able to create `Series` and `DataFrames` in pandas, but we usually do not do this , in practice we import the data which is in the form of .csv (Comma Seperated Values) , a spreadsheet file or something similar.** |
| 6 | + |
| 7 | +- *Good news is that pandas allows for easy importing of data like this through functions such as ``pd.read_csv()`` and ``pd.read_excel()`` for Microsoft Excel files.* |
| 8 | + |
| 9 | +## 1. Importing from a Google sheet to a pandas dataframe |
| 10 | + |
| 11 | +*Let's say that you wanted to get the information from Google Sheet document into a pandas DataFrame.*. |
| 12 | + |
| 13 | +*You could export it as a .csv file and then import it using ``pd.read_csv()``.* |
| 14 | + |
| 15 | +*In this case, the exported .csv file is called `Titanic.csv`* |
| 16 | + |
| 17 | + |
| 18 | +```python |
| 19 | +## Importing Titanic Data set |
| 20 | +import pandas as pd |
| 21 | + |
| 22 | +titanic_df= pd.read_csv("https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Datasets/Titanic.csv") |
| 23 | +print(titanic_df) |
| 24 | +``` |
| 25 | + |
| 26 | + pclass survived name \ |
| 27 | + 0 1 1 Allen, Miss. Elisabeth Walton |
| 28 | + 1 1 1 Allison, Master. Hudson Trevor |
| 29 | + 2 1 0 Allison, Miss. Helen Loraine |
| 30 | + 3 1 0 Allison, Mr. Hudson Joshua Creighton |
| 31 | + 4 1 0 Allison, Mrs. Hudson J C (Bessie Waldo Daniels) |
| 32 | + ... ... ... ... |
| 33 | + 1304 3 0 Zabour, Miss. Hileni |
| 34 | + 1305 3 0 Zabour, Miss. Thamine |
| 35 | + 1306 3 0 Zakarian, Mr. Mapriededer |
| 36 | + 1307 3 0 Zakarian, Mr. Ortin |
| 37 | + 1308 3 0 Zimmerman, Mr. Leo |
| 38 | + |
| 39 | + sex age sibsp parch ticket fare cabin embarked boat \ |
| 40 | + 0 female 29.00 0 0 24160 211.3375 B5 S 2 |
| 41 | + 1 male 0.92 1 2 113781 151.5500 C22 C26 S 11 |
| 42 | + 2 female 2.00 1 2 113781 151.5500 C22 C26 S NaN |
| 43 | + 3 male 30.00 1 2 113781 151.5500 C22 C26 S NaN |
| 44 | + 4 female 25.00 1 2 113781 151.5500 C22 C26 S NaN |
| 45 | + ... ... ... ... ... ... ... ... ... ... |
| 46 | + 1304 female 14.50 1 0 2665 14.4542 NaN C NaN |
| 47 | + 1305 female NaN 1 0 2665 14.4542 NaN C NaN |
| 48 | + 1306 male 26.50 0 0 2656 7.2250 NaN C NaN |
| 49 | + 1307 male 27.00 0 0 2670 7.2250 NaN C NaN |
| 50 | + 1308 male 29.00 0 0 315082 7.8750 NaN S NaN |
| 51 | + |
| 52 | + body home.dest |
| 53 | + 0 NaN St Louis, MO |
| 54 | + 1 NaN Montreal, PQ / Chesterville, ON |
| 55 | + 2 NaN Montreal, PQ / Chesterville, ON |
| 56 | + 3 135.0 Montreal, PQ / Chesterville, ON |
| 57 | + 4 NaN Montreal, PQ / Chesterville, ON |
| 58 | + ... ... ... |
| 59 | + 1304 328.0 NaN |
| 60 | + 1305 NaN NaN |
| 61 | + 1306 304.0 NaN |
| 62 | + 1307 NaN NaN |
| 63 | + 1308 NaN NaN |
| 64 | + |
| 65 | + [1309 rows x 14 columns] |
| 66 | + |
| 67 | + |
| 68 | +The dataset I am using here for your reference is taken from the same repository i.e ``learn-python`` (https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Datasets/Titanic.csv) I uploaded it in the Datasets folder,you can use it from there. |
| 69 | + |
| 70 | +You can also place the filename with its path in `pd.read_csv()`. |
| 71 | + |
| 72 | +**Now we've got the same data from the Google Spreadsheet , but now available as ``pandas DataFrame`` which means we can now apply all pandas functionality over it.** |
| 73 | + |
| 74 | +#### Note: The quiet important thing i am telling is that ``pd.read_csv()`` takes the location of the file (which is in your current working directory) or the hyperlink of the dataset from the other source. |
| 75 | + |
| 76 | +#### But if you want to import the data from Github you can't directly use its link , you have to first convert it to raw by clicking on the raw button present in the repo . |
| 77 | + |
| 78 | +#### Also you can't use the data directly from `Kaggle` you have to use ``kaggle API`` |
| 79 | + |
| 80 | +## 2. The Anatomy of DataFrame |
| 81 | + |
| 82 | +**Different functions use different labels for different things, and can get a little confusing.** |
| 83 | + |
| 84 | +- Rows are refer as ``axis=0`` |
| 85 | +- columns are refer as ``axis=1`` |
| 86 | + |
| 87 | +## 3. Exporting Data |
| 88 | + |
| 89 | +**OK, so after you've made a few changes to your data, you might want to export it and save it so someone else can access the changes.** |
| 90 | + |
| 91 | +**pandas allows you to export ``DataFrame's`` to ``.csv`` format using ``.to_csv()``, or to a spreadsheet format using .to_excel().** |
| 92 | + |
| 93 | +### Exporting a dataframe to a CSV |
| 94 | + |
| 95 | +**We haven't made any changes yet to the ``titanic_df`` DataFrame but let's try to export it.** |
| 96 | + |
| 97 | + |
| 98 | +```python |
| 99 | +#Export the titanic_df DataFrame to csv |
| 100 | +titanic_df.to_csv("exported_titanic.csv") |
| 101 | +``` |
| 102 | + |
| 103 | +Running this will save a file called ``exported_titanic.csv`` to the current folder. |
0 commit comments