Commit d1c64d0

Merge pull request animator#1100 from anamika23428/my_new_branch
Pandas Series
2 parents dc42be5 + 33be240 commit d1c64d0

File tree

2 files changed

+318
-0
lines changed

contrib/pandas/index.md

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@
 - [Working with Date & Time in Pandas](datetime.md)
 - [Importing and Exporting Data in Pandas](import-export.md)
 - [Handling Missing Values in Pandas](handling-missing-values.md)
+- [Pandas Series](pandas-series.md)

contrib/pandas/pandas-series.md

Lines changed: 317 additions & 0 deletions
@@ -0,0 +1,317 @@
# Pandas Series

A Series is a pandas data structure that represents a one-dimensional, array-like object containing an array of data and an associated array of labels, called its index.

## Creating a Series object

### Basic Series
To create a basic Series, you can pass a list or array of data to the `pd.Series()` function.

```python
import pandas as pd

s1 = pd.Series([4, 5, 2, 3])
print(s1)
```

#### Output
```
0    4
1    5
2    2
3    3
dtype: int64
```

### Series from a Dictionary

If you pass a dictionary to `pd.Series()`, the keys become the index and the values become the data of the Series.
```python
import pandas as pd

s2 = pd.Series({'A': 1, 'B': 2, 'C': 3})
print(s2)
```

#### Output
```
A    1
B    2
C    3
dtype: int64
```
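
As a small sketch of how the dictionary keys act as labels, the example below also passes an `index` argument alongside the dictionary. The variable names `prices` and `s_subset` are illustrative only. Labels are looked up in the dictionary, and a label with no matching key gets NaN.

```python
import pandas as pd

prices = {'A': 1, 'B': 2, 'C': 3}  # illustrative data, same shape as s2 above

# Only the requested labels are kept, in the requested order.
# 'D' is not a key in the dictionary, so its value becomes NaN,
# which in turn makes the whole Series float64.
s_subset = pd.Series(prices, index=['C', 'A', 'D'])
print(s_subset)
```

The result holds `3.0` for `C`, `1.0` for `A`, and `NaN` for `D`.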

## Additional Functionality

### Specifying Data Type and Index
You can specify the data type and index while creating a Series.
```python
import pandas as pd

s4 = pd.Series([1, 2, 3], index=['a', 'b', 'c'], dtype='float64')
print(s4)
```

#### Output
```
a    1.0
b    2.0
c    3.0
dtype: float64
```

### Specifying NaN Values:
* Sometimes you need to create a Series of a certain size before all the data is available. In such cases you can fill the missing entries with NaN (Not a Number).
* A Series that stores NaN must use a floating-point data type. Even if every other value is an integer, pandas promotes the Series to a floating-point type automatically, because NaN cannot be represented by NumPy's integer types.

```python
import numpy as np
import pandas as pd

s3 = pd.Series([1, np.nan, 2])  # np.nan marks the missing value
print(s3)
```

#### Output
```
0    1.0
1    NaN
2    2.0
dtype: float64
```
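
To make the type promotion visible, the following sketch compares a Series of plain integers with one that contains a NaN. The names `ints` and `with_nan` are illustrative. `isna()` returns a boolean mask marking the missing entries.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2, 3])            # no missing values
with_nan = pd.Series([1, np.nan, 3])   # one missing value

print(ints.dtype)        # integer dtype
print(with_nan.dtype)    # float64, because NaN is a floating-point value
print(with_nan.isna())   # True only at position 1
```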

### Creating Data from Expressions
You can create a Series using an expression or function.

`<series_object> = pd.Series(data=<function|expression>, index=None)`

```python
import numpy as np
import pandas as pd

a = np.arange(1, 5)  # [1, 2, 3, 4]
s5 = pd.Series(data=a**2, index=a)  # squares of a, labelled by a
print(s5)
```

#### Output
```
1     1
2     4
3     9
4    16
dtype: int64
```
104+
105+
## Series Object Attributes
106+
107+
| **Attribute** | **Description** |
108+
|--------------------------|---------------------------------------------------|
109+
| `<series>.index` | Array of index of the Series |
110+
| `<series>.values` | Array of values of the Series |
111+
| `<series>.dtype` | Return the dtype of the data |
112+
| `<series>.shape` | Return a tuple representing the shape of the data |
113+
| `<series>.ndim` | Return the number of dimensions of the data |
114+
| `<series>.size` | Return the number of elements in the data |
115+
| `<series>.hasnans` | Return True if there is any NaN in the data |
116+
| `<series>.empty` | Return True if the Series object is empty |
117+
118+
- If you use len() on a series object then it return total number of elements in the series object whereas <series_object>.count() return only the number of non NaN elements.
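
The attributes in the table, and the difference between `len()` and `count()`, can be checked on a small Series that contains one NaN. This is a minimal sketch; the Series `s` is made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, np.nan, 30.0], index=['a', 'b', 'c'])

print(s.shape)    # (3,)
print(s.ndim)     # 1
print(s.size)     # 3
print(s.hasnans)  # True
print(s.empty)    # False

print(len(s))     # 3, NaN is counted
print(s.count())  # 2, NaN is skipped
```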
119+
120+
## Accessing a Series object and its elements
121+
122+
### Accessing Individual Elements
123+
You can access individual elements using their index.
124+
'legal' indexes arte used to access individual element.
125+
```python
126+
import pandas as pd
127+
128+
s7 = pd.Series(data=[13, 45, 67, 89], index=['A', 'B', 'C', 'D'])
129+
print(s7['A'])
130+
```
131+
132+
#### Output
133+
```
134+
13
135+
```
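
As a hedged sketch of the access options pandas offers, the example below reads the same element by label with `.loc` and by position with `.iloc`, then shows what happens with a label that is not in the index. The label `'Z'` and the fallback value `-1` are arbitrary choices for illustration.

```python
import pandas as pd

s7 = pd.Series(data=[13, 45, 67, 89], index=['A', 'B', 'C', 'D'])

print(s7.loc['B'])   # 45, explicit label-based access
print(s7.iloc[1])    # 45, explicit position-based access

# A label that is not in the index raises KeyError.
try:
    s7['Z']
except KeyError:
    print("label 'Z' is not in the index")

# get() returns a default instead of raising.
print(s7.get('Z', -1))  # -1
```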
136+
137+
### Slicing a Series
138+
139+
- Slices are extracted based on their positional index, regardless of the custom index labels.
140+
- Each element in the Series has a positional index starting from 0 (i.e., 0 for the first element, 1 for the second element, and so on).
141+
- `<series>[<start>:<end>]` will return the values of the elements between the start and end positions (excluding the end position).
142+
143+
#### Example
144+
145+
```python
146+
import pandas as pd
147+
148+
s = pd.Series(data=[13, 45, 67, 89], index=['A', 'B', 'C', 'D'])
149+
print(s[:2])
150+
```
151+
152+
#### Output
153+
```
154+
A 13
155+
B 45
156+
dtype: int64
157+
```
158+
159+
This example demonstrates that the first two elements (positions 0 and 1) are returned, regardless of their custom index labels.
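
For comparison, here is a minimal sketch of the two explicit slicing accessors. `.iloc` slices by position and excludes the end, as described above, while `.loc` slices by label and includes the end label. The names `by_position` and `by_label` are illustrative.

```python
import pandas as pd

s = pd.Series(data=[13, 45, 67, 89], index=['A', 'B', 'C', 'D'])

by_position = s.iloc[0:2]   # positions 0 and 1; position 2 is excluded
by_label = s.loc['A':'C']   # labels A, B and C; the end label is included

print(by_position.tolist())  # [13, 45]
print(by_label.tolist())     # [13, 45, 67]
```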
160+
161+
## Operation on series object
162+
163+
### Modifying elements and indexes
164+
* <series_object>[indexes]=< new data value >
165+
* <series_object>[start : end]=< new data value >
166+
* <series_object>.index=[new indexes]
167+
168+
```python
169+
import pandas as pd
170+
171+
s8 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
172+
s8['a'] = 100
173+
s8.index = ['x', 'y', 'z']
174+
print(s8)
175+
```
176+
177+
#### Output
178+
```
179+
x 100
180+
y 20
181+
z 30
182+
dtype: int64
183+
```
184+
185+
**Note: Series object are value-mutable but size immutable objects.**
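
The slice form of assignment is not shown above, so here is a small sketch of it. It uses `.iloc`, the explicit positional accessor, as an assumption about the intended behaviour of `<series_object>[start : end]`; the value `0` is arbitrary.

```python
import pandas as pd

s8 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# Assign one value to every element in positions 0 and 1.
s8.iloc[0:2] = 0
print(s8.tolist())  # [0, 0, 30]
print(len(s8))      # 3, the values changed but the size did not
```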
186+
187+
### Vector operations
188+
We can perform vector operations such as `+`,`-`,`/`,`%` etc.
189+
190+
#### Addition
191+
```python
192+
import pandas as pd
193+
194+
s9 = pd.Series([1, 2, 3])
195+
print(s9 + 5)
196+
```
197+
198+
#### Output
199+
```
200+
0 6
201+
1 7
202+
2 8
203+
dtype: int64
204+
```
205+
206+
#### Subtraction
207+
```python
208+
print(s9 - 2)
209+
```
210+
211+
#### Output
212+
```
213+
0 -1
214+
1 0
215+
2 1
216+
dtype: int64
217+
```
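
The remaining operators work the same way. The sketch below applies `*`, `/`, and `%` to the same data; note that true division returns floating-point values even when the inputs are integers.

```python
import pandas as pd

s9 = pd.Series([1, 2, 3])

print((s9 * 2).tolist())  # [2, 4, 6]
print((s9 / 2).tolist())  # [0.5, 1.0, 1.5]
print((s9 % 2).tolist())  # [1, 0, 1]
```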
218+
219+
### Arthmetic on series object
220+
221+
#### Addition
222+
```python
223+
import pandas as pd
224+
225+
s10 = pd.Series([1, 2, 3])
226+
s11 = pd.Series([4, 5, 6])
227+
print(s10 + s11)
228+
```
229+
230+
#### Output
231+
```
232+
0 5
233+
1 7
234+
2 9
235+
dtype: int64
236+
```
237+
238+
#### Multiplication
239+
240+
```python
241+
print("s10 * s11)
242+
```
243+
244+
#### Output
245+
```
246+
0 4
247+
1 10
248+
2 18
249+
dtype: int64
250+
```
251+
252+
Here one thing we should keep in mind that both the series object should have same indexes otherwise it will return NaN value to all the indexes of two series object .
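
A short sketch of what happens when the indexes of the two Series only partly overlap. The Series `left` and `right` are made up for illustration. `Series.add()` with `fill_value` is one way to treat a missing label as a number instead of producing NaN.

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Labels 'a' and 'd' exist in only one Series, so they become NaN.
total = left + right
print(total)

# Treat a missing label as 0 instead.
filled = left.add(right, fill_value=0)
print(filled.tolist())  # [1.0, 12.0, 23.0, 30.0]
```

Here `total` holds `NaN` for `a` and `d`, `12.0` for `b`, and `23.0` for `c`.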
253+
254+
255+
### Head and Tail Functions
256+
257+
| **Functions** | **Description** |
258+
|--------------------------|---------------------------------------------------|
259+
| `<series>.head(n)` | return the first n elements of the series |
260+
| `<series>.tail(n)` | return the last n elements of the series |
261+
262+
```python
263+
import pandas as pd
264+
265+
s12 = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
266+
print(s12.head(3))
267+
print(s12.tail(3))
268+
```
269+
270+
#### Output
271+
```
272+
0 10
273+
1 20
274+
2 30
275+
dtype: int64
276+
7 80
277+
8 90
278+
9 100
279+
dtype: int64
280+
```
281+
282+
If you dont provide any value to n the by default it give results for `n=5`.
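
The default can be checked with a quick sketch that calls both functions without an argument on the same ten-element Series:

```python
import pandas as pd

s12 = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

print(s12.head().tolist())  # [10, 20, 30, 40, 50]
print(s12.tail().tolist())  # [60, 70, 80, 90, 100]
```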
283+
284+
### Few extra functions
285+
286+
| **Function** | **Description** |
287+
|----------------------------------------|------------------------------------------------------------------------|
288+
| `<series_object>.sort_values()` | Return the Series object in ascending order based on its values. |
289+
| `<series_object>.sort_index()` | Return the Series object in ascending order based on its index. |
290+
| `<series_object>.sort_drop(<index>)` | Return the Series with the deleted index and its corresponding value. |
291+
292+
```python
293+
import pandas as pd
294+
295+
s13 = pd.Series([3, 1, 2], index=['c', 'a', 'b'])
296+
print(s13.sort_values())
297+
print(s13.sort_index())
298+
print(s13.drop('a'))
299+
```
300+
301+
#### Output
302+
```
303+
a 1
304+
b 2
305+
c 3
306+
dtype: int64
307+
a 1
308+
b 2
309+
c 3
310+
dtype: int64
311+
c 3
312+
b 2
313+
dtype: int64
314+
```
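
Two details are worth a quick sketch: `sort_values()` accepts `ascending=False` for descending order, and these functions return a new Series and leave the original untouched. The names `descending` and `without_a` are illustrative.

```python
import pandas as pd

s13 = pd.Series([3, 1, 2], index=['c', 'a', 'b'])

descending = s13.sort_values(ascending=False)
print(descending.tolist())  # [3, 2, 1]

without_a = s13.drop('a')
print(list(without_a.index))  # ['c', 'b']
print(list(s13.index))        # ['c', 'a', 'b'], the original is unchanged
```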
315+
316+
## Conclusion
317+
In short, Pandas Series is a fundamental data structure in Python for handling one-dimensional data. It combines an array of values with an index, offering efficient methods for data manipulation and analysis. With its ease of use and powerful functionality, Pandas Series is widely used in data science and analytics for tasks such as data cleaning, exploration, and visualization.
