Print the Standard Deviation of a Pandas Series
In this program, we will find the standard deviation of a Pandas series. Standard deviation is a statistic that measures the dispersion of a dataset relative to its mean and is calculated as the square root of the variance.
Syntax
Series.std(axis=None, skipna=True, ddof=1, numeric_only=False, **kwargs)
Note: older pandas releases also accepted a level parameter; it was removed in pandas 2.0.
Parameters
The std() method accepts several parameters:
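- ddof: Delta Degrees of Freedom. The divisor used in the calculation is N - ddof, where N is the number of non-missing values (default is 1)
- skipna: Exclude NaN values (default is True)
- axis: Unused for a Series; kept for compatibility with DataFrame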
Example
Let's calculate the standard deviation of a Pandas series using the std() function:
import pandas as pd
series = pd.Series([10, 20, 30, 40, 50])
print("Series:")
print(series)
series_std = series.std()
print("\nStandard Deviation of the series:", series_std)
Series:
0    10
1    20
2    30
3    40
4    50
dtype: int64

Standard Deviation of the series: 15.811388300841896
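To make the definition concrete, here is a minimal sketch that reproduces the result step by step, assuming the default ddof=1 (sample standard deviation). The variable names (mean, squared_deviations, variance, manual_std) are illustrative only and are not part of the pandas API.

```python
import math
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

# Step 1: mean of the values
mean = series.sum() / len(series)

# Step 2: sum of squared deviations from the mean
squared_deviations = ((series - mean) ** 2).sum()

# Step 3: sample variance divides by N - 1 (ddof=1, the pandas default)
variance = squared_deviations / (len(series) - 1)

# Step 4: standard deviation is the square root of the variance
manual_std = math.sqrt(variance)

print("Manual calculation:", manual_std)
print("Series.std():      ", series.std())
```

Both printed values should agree up to floating-point rounding, since std() performs the same calculation internally.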
Working with Missing Values
By default, std() excludes NaN values. With skipna=False, the result is NaN if the series contains any missing value:
import pandas as pd
import numpy as np
series_with_nan = pd.Series([10, 20, np.nan, 40, 50])
print("Series with NaN:")
print(series_with_nan)
std_skip_nan = series_with_nan.std()
std_include_nan = series_with_nan.std(skipna=False)
print(f"\nStandard deviation (skip NaN): {std_skip_nan}")
print(f"Standard deviation (include NaN): {std_include_nan}")
Series with NaN:
0    10.0
1    20.0
2     NaN
3    40.0
4    50.0
dtype: float64

Standard deviation (skip NaN): 18.257418583505537
Standard deviation (include NaN): nan
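One way to check what skipna=True does is to compare it with dropping the missing values explicitly. The sketch below assumes the same data as above; count() and dropna() are standard Series methods, while the variable names are illustrative.

```python
import math
import numpy as np
import pandas as pd

series_with_nan = pd.Series([10, 20, np.nan, 40, 50])

# count() reports how many non-missing values take part in the calculation
print("Non-missing values:", series_with_nan.count())

# Dropping NaN explicitly should match the default skipna=True behaviour
std_default = series_with_nan.std()
std_dropna = series_with_nan.dropna().std()
print("Same result:", math.isclose(std_default, std_dropna))

# With skipna=False, a single NaN makes the whole result NaN
std_no_skip = series_with_nan.std(skipna=False)
print("skipna=False gives NaN:", math.isnan(std_no_skip))
```

Because only four values remain after the NaN is skipped, the divisor with the default ddof=1 is 3 rather than 4.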
Comparison with Different ddof Values
The ddof parameter changes the divisor used in the variance calculation, which is N - ddof, where N is the number of values:
import pandas as pd
data = pd.Series([2, 4, 6, 8, 10])
std_ddof_0 = data.std(ddof=0) # Population standard deviation
std_ddof_1 = data.std(ddof=1) # Sample standard deviation (default)
print(f"Data: {data.tolist()}")
print(f"Standard deviation (ddof=0): {std_ddof_0}")
print(f"Standard deviation (ddof=1): {std_ddof_1}")
Data: [2, 4, 6, 8, 10]
Standard deviation (ddof=0): 2.8284271247461903
Standard deviation (ddof=1): 3.1622776601683795
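The effect of ddof can also be sketched in plain Python. The helper std_with_ddof below is hypothetical, written only for illustration, and is not a pandas function. The example also shows a common source of confusion: NumPy's np.std defaults to ddof=0, while pandas defaults to ddof=1.

```python
import math
import numpy as np
import pandas as pd

data = pd.Series([2, 4, 6, 8, 10])


def std_with_ddof(values, ddof):
    """Hypothetical helper: standard deviation with divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))


values = data.tolist()
manual_population = std_with_ddof(values, 0)
manual_sample = std_with_ddof(values, 1)
print("Manual ddof=0:", manual_population)
print("Manual ddof=1:", manual_sample)

# NumPy's std defaults to ddof=0, while pandas defaults to ddof=1
numpy_default = np.std(values)
pandas_default = data.std()
print("np.std default:    ", numpy_default)
print("Series.std default:", pandas_default)
```

When results from the two libraries need to match, passing ddof explicitly to both avoids relying on their different defaults.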
Conclusion
The std() function in Pandas provides an easy way to calculate standard deviation. Use ddof=0 for population standard deviation and ddof=1 (default) for sample standard deviation.
