Python Pandas - Draw a Bar Plot and use median as the estimate of central tendency

A bar plot in Seaborn shows a point estimate for each category as the height of a rectangular bar and draws an error bar (a 95% confidence interval by default) around that estimate. The default estimate is the mean; pass the estimator parameter to seaborn.barplot() to use the median as the measure of central tendency instead.

Required Libraries

Import the necessary libraries for creating bar plots with median estimation:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

Creating Sample Data

Let's create sample cricket data to demonstrate median estimation in bar plots:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create sample cricket data
data = {
    'Academy': ['Mumbai', 'Delhi', 'Chennai', 'Mumbai', 'Delhi', 
                'Chennai', 'Mumbai', 'Delhi', 'Chennai'],
    'Matches': [45, 32, 28, 52, 38, 35, 41, 29, 33]
}

df = pd.DataFrame(data)
print("Sample Cricket Data:")
print(df)

Output:

Sample Cricket Data:
   Academy  Matches
0   Mumbai       45
1    Delhi       32
2  Chennai       28
3   Mumbai       52
4    Delhi       38
5  Chennai       35
6   Mumbai       41
7    Delhi       29
8  Chennai       33

Bar Plot with Default Mean Estimator

First, let's see the default bar plot that uses mean as the estimator:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create sample data
data = {
    'Academy': ['Mumbai', 'Delhi', 'Chennai', 'Mumbai', 'Delhi', 
                'Chennai', 'Mumbai', 'Delhi', 'Chennai'],
    'Matches': [45, 32, 28, 52, 38, 35, 41, 29, 33]
}
df = pd.DataFrame(data)

# Bar plot with default mean estimator
plt.figure(figsize=(8, 5))
sns.barplot(data=df, x='Academy', y='Matches')
plt.title('Bar Plot with Mean Estimator (Default)')
plt.ylabel('Average Matches')
plt.show()

Bar Plot with Median Estimator

Now let's create a bar plot using median as the central tendency estimator:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create sample data
data = {
    'Academy': ['Mumbai', 'Delhi', 'Chennai', 'Mumbai', 'Delhi', 
                'Chennai', 'Mumbai', 'Delhi', 'Chennai'],
    'Matches': [45, 32, 28, 52, 38, 35, 41, 29, 33]
}
df = pd.DataFrame(data)

# Bar plot with median estimator
plt.figure(figsize=(8, 5))
sns.barplot(data=df, x='Academy', y='Matches', estimator=np.median)
plt.title('Bar Plot with Median Estimator')
plt.ylabel('Median Matches')
plt.show()
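
Conceptually, the estimator is a function that reduces each category's values to a single number, and that number becomes the bar height. The sketch below mimics that step in plain Python, without Seaborn. The helper name bar_heights is hypothetical, and this is only an illustration of the idea, not how Seaborn is implemented internally.

```python
from collections import defaultdict
from statistics import mean, median

# Same sample data as in the examples above
academies = ['Mumbai', 'Delhi', 'Chennai', 'Mumbai', 'Delhi',
             'Chennai', 'Mumbai', 'Delhi', 'Chennai']
matches = [45, 32, 28, 52, 38, 35, 41, 29, 33]


def bar_heights(categories, values, estimator):
    """Group values by category and reduce each group with estimator.

    Hypothetical helper for illustration; not part of Seaborn.
    """
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    return {category: estimator(group) for category, group in groups.items()}


# Swapping the estimator changes the height of every bar
mean_heights = bar_heights(academies, matches, mean)
median_heights = bar_heights(academies, matches, median)

for academy in mean_heights:
    print(academy, mean_heights[academy], median_heights[academy])
```

Any callable that maps a list of numbers to one number can be passed as the estimator here, which is the same contract that np.median fulfils in the barplot() call.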

Comparison of Mean vs Median

Let's compare the actual values to understand the difference:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create sample data
data = {
    'Academy': ['Mumbai', 'Delhi', 'Chennai', 'Mumbai', 'Delhi', 
                'Chennai', 'Mumbai', 'Delhi', 'Chennai'],
    'Matches': [45, 32, 28, 52, 38, 35, 41, 29, 33]
}
df = pd.DataFrame(data)

# Calculate mean and median for each academy
summary = df.groupby('Academy')['Matches'].agg(['mean', 'median']).round(2)
print("Mean vs Median Comparison:")
print(summary)

Output:

Mean vs Median Comparison:
         mean  median
Academy              
Chennai  32.0    33.0
Delhi    33.0    32.0
Mumbai   46.0    45.0

Key Parameters

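Parameter    Description                                          Example
estimator    Function used to estimate central tendency           np.median, np.mean
errorbar     Error bar method; supersedes ci since seaborn 0.12   ('ci', 95) (default), None
orient       Plot orientation                                     'v' (vertical), 'h' (horizontal)

In seaborn versions before 0.12, the confidence interval size is set with ci (95 by default, None to hide the error bars); ci is deprecated from 0.12 onward.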
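
The confidence interval drawn on each bar comes from bootstrapping: the group's values are resampled with replacement many times, the estimator is applied to each resample, and percentiles of those estimates form the interval. The sketch below is a simplified percentile bootstrap in plain Python. The function name bootstrap_ci, its parameters, and the fixed seed are assumptions made for illustration, and Seaborn's own implementation may differ in detail.

```python
import random
from statistics import median


def bootstrap_ci(values, estimator=median, n_boot=1000, level=95, seed=0):
    """Return a (low, high) percentile bootstrap interval for estimator.

    Hypothetical helper for illustration; not Seaborn's actual code.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = sorted(
        estimator(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_boot)
    )
    # Integer arithmetic for the percentile positions, e.g. 2.5% and 97.5%
    low_index = n_boot * (100 - level) // 200
    high_index = n_boot - 1 - low_index
    return estimates[low_index], estimates[high_index]


mumbai = [45, 52, 41]  # Mumbai values from the sample data
low, high = bootstrap_ci(mumbai)
print("median:", median(mumbai), "interval:", (low, high))
```

With only three observations per academy, the resampled medians can only take the three observed values, so the interval is wide. That is why the error bars in the plots above span most of each group's range.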

When to Use Median vs Mean

Use the median when your data is skewed or contains outliers, because a few extreme values barely move it. Use the mean for roughly symmetric data without significant outliers.
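
As a quick illustration of that robustness, the sketch below adds one extreme value to the Mumbai figures from the sample data. The value 150 is made up for this example and is not part of the original dataset.

```python
from statistics import mean, median

mumbai = [45, 52, 41]          # values from the sample data
with_outlier = mumbai + [150]  # 150 is a hypothetical outlier

# Compare both estimators before and after adding the outlier
print("without outlier:", mean(mumbai), median(mumbai))
print("with outlier:   ", mean(with_outlier), median(with_outlier))
```

The single outlier pulls the mean from 46 up to 72, while the median only moves from 45 to 48.5.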

Conclusion

The estimator parameter in seaborn.barplot() allows you to use median instead of mean for central tendency. This is particularly useful when dealing with skewed data or outliers that might distort the mean.

Updated on: 2026-03-26T13:32:36+05:30
