Commit 8c95bb1

Update handling-missing-values.md
1 parent 5d15c73 commit 8c95bb1

1 file changed

+14
-23
lines changed

contrib/pandas/handling-missing-values.md

Lines changed: 14 additions & 23 deletions
@@ -1,34 +1,28 @@
11
# Handling Missing Values in Pandas
22

3-
**Upuntil now we're working on complete data i.e not having any missing values. But in real life it is the one of the main problem.**
4-
5-
*Many datasets arrive with missing data either because it exists and was not collected or it never existed.*
3+
In real life, many datasets arrive with missing data, either because the data exists but was not collected or because it never existed.
64

75
In Pandas, missing data is represented by two values:
86

97
* `None`: a Python keyword that represents an empty or absent value.
108
* `NaN`: short for `Not a Number`, a special floating-point value.
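
As a minimal sketch of how the two markers behave, the snippet below builds a small toy Series (hypothetical values, not taken from the car-sales dataset) and checks that pandas reports both `None` and `NaN` as missing.

```python
import numpy as np
import pandas as pd

# Toy Series (hypothetical values) mixing both missing-value markers.
values = pd.Series([1.0, None, np.nan, 4.0])

# In a numeric Series pandas stores None as NaN, so both are flagged as missing.
print(values.isnull().tolist())

# Summing the boolean mask counts the missing entries.
missing_count = int(values.isnull().sum())
print(missing_count)
```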
119

12-
**There are several useful functions for detecting, removing, and replacing null values in Pandas DataFrame :**
10+
There are several useful functions for detecting, removing, and replacing null values in a Pandas DataFrame:
1311

14-
1. isnull()
15-
2. notnull()
16-
3. dropna()
17-
4. fillna()
18-
5. replace()
12+
1. `isnull()`
13+
2. `notnull()`
14+
3. `dropna()`
15+
4. `fillna()`
16+
5. `replace()`
1917

2018
## 1. Checking for missing values using `isnull()` and `notnull()`
2119

2220
Let's import pandas and load our car-sales dataset, which has some missing values.
2321

24-
2522
```python
2623
import pandas as pd
27-
```
28-
2924

30-
```python
31-
car_sales_missing_df = pd.read_csv("https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Datasets/car-sales-missing-data.csv")
25+
car_sales_missing_df = pd.read_csv("Datasets/car-sales-missing-data.csv")
3226
print(car_sales_missing_df)
3327
```
3428

@@ -128,7 +122,7 @@ Note here:
128122
* `True` means the value is not `NaN`
129123
* `False` means the value is `NaN`
130124

131-
#### A little note here : `isnull()` means having null values so it gives boolean `True` for NaN values. And `notnull()` means having no null values so it gives `True` for no NaN value.
125+
`isnull()` flags null values, so it returns `True` wherever a value is `NaN`. `notnull()` is its inverse, so it returns `True` wherever a value is present.
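
The relationship between the two methods can be checked directly. This is a small sketch on a toy DataFrame; the column names echo the car-sales data, but the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical values; one missing entry per column.
toy_df = pd.DataFrame({
    "Make": ["Toyota", None, "BMW"],
    "Price": [4000.0, np.nan, 22000.0],
})

null_mask = toy_df.isnull()      # True where a value is missing
present_mask = toy_df.notnull()  # True where a value is present

# The two masks are exact element-wise complements of each other.
masks_are_complements = bool((null_mask == ~present_mask).all().all())
print(masks_are_complements)

# Summing the null mask gives the number of missing values per column.
missing_per_column = null_mask.sum()
print(missing_per_column)
```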
132126

133127
## 2. Filling missing values using `fillna()` and `replace()`
134128

@@ -191,18 +185,15 @@ print(car_sales_missing_df.bfill())
191185

192186
#### Filling null values using the `replace()` method
193187

194-
**Now we are going to replace the all Nan value in the data frame with -125 value**
188+
Now we are going to replace all `NaN` values in the DataFrame with the value -125.
195189

196-
*For this we will need numpy also*
190+
For this we will also need NumPy.
197191

198192

199193
```python
200194
import numpy as np
201-
```
202195

203-
204-
```python
205-
print(car_sales_missing_df.replace(to_replace = np.nan, value = -125) )
196+
print(car_sales_missing_df.replace(to_replace = np.nan, value = -125))
206197
```
207198

208199
Make Colour Odometer Doors Price
@@ -220,7 +211,7 @@ print(car_sales_missing_df.replace(to_replace = np.nan, value = -125) )
220211

221212
## 3. Dropping missing values using `dropna()`
222213

223-
**In order to drop a null values from a dataframe, we used `dropna()` function this function drop Rows/Columns of datasets with Null values in different ways.**
214+
To drop null values from a DataFrame, we use the `dropna()` function, which can drop rows or columns containing null values in different ways.
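
To make "different ways" concrete, here is a hedged sketch of a few common `dropna()` options for rows (`how`, `thresh`, and `subset`). The toy DataFrame and its values are assumptions for illustration only, not the car-sales dataset.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): row 2 is entirely null, row 4 has only "Make".
toy_df = pd.DataFrame({
    "Make": ["Toyota", "Honda", None, "BMW", "Nissan"],
    "Doors": [4.0, np.nan, np.nan, 5.0, np.nan],
    "Price": [4000.0, 5000.0, np.nan, 22000.0, np.nan],
})

# Default: drop every row that contains at least one null value.
rows_any = toy_df.dropna()

# how="all": drop only the rows in which every value is null.
rows_all = toy_df.dropna(how="all")

# thresh=2: keep rows that have at least two non-null values.
rows_thresh = toy_df.dropna(thresh=2)

# subset=[...]: only look at the listed columns when deciding what to drop.
rows_subset = toy_df.dropna(subset=["Price"])

for name, result in [("any", rows_any), ("all", rows_all),
                     ("thresh", rows_thresh), ("subset", rows_subset)]:
    print(name, result.index.tolist())
```

None of these calls modifies `toy_df`; each returns a new DataFrame.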
224215

225216
#### Dropping rows with at least 1 null value.
226217

@@ -270,4 +261,4 @@ print(car_sales_missing_df.dropna(axis = 1))
270261

271262
Now we drop the columns that have at least one missing value.
272263

273-
**Here the dataset becomes empty after dropna() because each column as atleast 1 null value so it remove that columns resulting in an empty dataframe.**
264+
Here the dataset becomes empty after `dropna()` because each column has at least one null value, so every column is removed, resulting in an empty DataFrame.
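
The same effect can be reproduced on a small scale. The sketch below uses a toy DataFrame with made-up values in which every column has a null. It also shows `thresh` as one possible way to keep columns that are mostly populated; that option is an addition here, not part of the original walkthrough.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): each column contains exactly one null.
toy_df = pd.DataFrame({
    "Make": ["Toyota", None, "BMW"],
    "Price": [np.nan, 5000.0, 22000.0],
})

# Every column has a null, so dropping columns removes all of them.
no_columns_left = toy_df.dropna(axis=1)
print(no_columns_left.shape)  # rows are kept in the index, but no columns remain
print(no_columns_left.empty)

# With thresh=2, a column is kept if it has at least two non-null values.
mostly_full_columns = toy_df.dropna(axis=1, thresh=2)
print(mostly_full_columns.columns.tolist())
```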
