Finding the Quantile and Decile Ranks of a Pandas DataFrame Column

Quantile and decile ranks are statistical measures that describe the position of an observation relative to the other values in a dataset. A quantile rank gives the fraction of values at or below each observation, while a decile rank assigns each observation to one of 10 groups numbered 1 to 10. In this tutorial, we will calculate both for a Pandas DataFrame column.

Understanding Quantile and Decile Ranks

A quantile rank represents the proportion of values in the dataset that are less than or equal to a given value. For example, a quantile rank of 0.7 means that 70% of the data is at or below that value.

A decile rank divides the dataset into 10 equal parts, where each decile represents 10% of the data. Values are assigned ranks from 1 to 10 based on which decile they fall into.
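
To make the quantile rank definition concrete, here is a minimal sketch that computes it by hand as the share of values less than or equal to each observation, and compares the result with rank(pct=True). The variable names manual and builtin are illustrative only, and the two agree here because the sample data contains no ties.

```python
import pandas as pd

scores = pd.Series([85, 92, 78, 88, 95, 82, 90, 87, 93, 89])

# Quantile rank by definition: share of values less than or equal to x
manual = scores.apply(lambda x: (scores <= x).mean())

# Built-in equivalent (identical here because there are no tied scores)
builtin = scores.rank(pct=True)

print(pd.DataFrame({'scores': scores, 'manual': manual, 'builtin': builtin}))
```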

Calculating Quantile Ranks

Use the rank() method with pct=True to calculate quantile ranks. By default, tied values share the average of their ranks:

import pandas as pd

# Create a DataFrame
data = {'scores': [85, 92, 78, 88, 95, 82, 90, 87, 93, 89]}
df = pd.DataFrame(data)

# Calculate quantile ranks
df['quantile_rank'] = df['scores'].rank(pct=True)

print(df)
   scores  quantile_rank
0      85            0.3
1      92            0.8
2      78            0.1
3      88            0.5
4      95            1.0
5      82            0.2
6      90            0.7
7      87            0.4
8      93            0.9
9      89            0.6
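
The scores above are all distinct, so tie handling never comes into play. The following sketch, using a small made-up Series with one repeated value, shows how the method parameter of rank() changes the quantile rank assigned to ties. Of the three options shown, method='max' matches the "less than or equal to" definition.

```python
import pandas as pd

# Hypothetical data with a tie: 80 appears twice
s = pd.Series([70, 80, 80, 90])

ranks = pd.DataFrame({
    'score': s,
    'average': s.rank(pct=True),             # default: ties share the mean rank
    'min': s.rank(pct=True, method='min'),   # ties take the lowest rank in the group
    'max': s.rank(pct=True, method='max'),   # ties take the highest rank in the group
})

print(ranks)
```

With four values, the tied scores receive 0.625 under 'average', 0.5 under 'min', and 0.75 under 'max'.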

Calculating Decile Ranks

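Use pd.cut() with bins=10 to split the range of scores into 10 equal-width bins labeled 1 to 10. Because the bins are equal in width rather than in count, each one holds 10% of the data only when the values are evenly spread (pd.qcut() creates equal-frequency bins instead):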

import pandas as pd

# Create a DataFrame
data = {'scores': [85, 92, 78, 88, 95, 82, 90, 87, 93, 89]}
df = pd.DataFrame(data)

# Calculate decile ranks
df['decile_rank'] = pd.cut(df['scores'], bins=10, labels=range(1, 11)).astype(int)

print(df)
   scores  decile_rank
0      85            5
1      92            9
2      78            1
3      88            6
4      95           10
5      82            3
6      90            8
7      87            6
8      93            9
9      89            7
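
Notice that bin 6 contains two scores while bins 2 and 4 are empty, because pd.cut() splits the range of values rather than the observations. If each group should hold roughly 10% of the rows, pd.qcut() is the usual alternative. The sketch below puts both side by side on the same data; the column names cut_bin and qcut_decile are illustrative, and labels=False is used so that both functions return integer bin codes starting at 0.

```python
import pandas as pd

df = pd.DataFrame({'scores': [85, 92, 78, 88, 95, 82, 90, 87, 93, 89]})

# Equal-width bins: every bin spans the same range of scores
df['cut_bin'] = pd.cut(df['scores'], bins=10, labels=False) + 1

# Equal-frequency bins: every bin holds about 10% of the rows
df['qcut_decile'] = pd.qcut(df['scores'], q=10, labels=False) + 1

print(df.sort_values('scores'))
```

With 10 distinct scores, qcut_decile assigns each score its own decile from 1 to 10. On data with many repeated values, pd.qcut() can raise a ValueError about duplicate bin edges; passing duplicates='drop' avoids the error at the cost of producing fewer than 10 groups.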

Working with Larger Datasets

Here's an example using a larger dataset with multiple columns:

import pandas as pd
import numpy as np

# Create a larger DataFrame
np.random.seed(42)
data = {
    'math_scores': np.random.normal(75, 15, 100),
    'english_scores': np.random.normal(80, 12, 100)
}
df = pd.DataFrame(data)

# Calculate quantile ranks for math scores
df['math_quantile_rank'] = df['math_scores'].rank(pct=True)

# Calculate decile ranks for english scores
df['english_decile_rank'] = pd.cut(df['english_scores'], bins=10, labels=range(1, 11)).astype(int)

# Display first 5 rows
print(df.head())
print(f"\nDataset shape: {df.shape}")
   math_scores  english_scores  math_quantile_rank  english_decile_rank
0    82.456140       89.383252                0.69                    8
1    73.207039       82.188204                0.32                    5
2    84.715335       81.431127                0.75                    5
3    97.845451       84.444752                0.98                    7
4    71.690718       82.758936                0.27                    5

Dataset shape: (100, 4)
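
When several columns need the same treatment, the ranks can be computed for all of them in one step instead of column by column. This is a sketch under the same setup as above: DataFrame.rank(pct=True) ranks every column independently, and apply() runs pd.cut() on each column. The suffixes and the names pct_ranks, bins, and result are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame({
    'math_scores': np.random.normal(75, 15, 100),
    'english_scores': np.random.normal(80, 12, 100),
})

# Quantile ranks for every column at once
pct_ranks = df.rank(pct=True).add_suffix('_quantile_rank')

# Equal-width bin numbers (1 to 10) for every column at once
bins = df.apply(lambda col: pd.cut(col, bins=10, labels=False) + 1)
bins = bins.add_suffix('_decile_rank')

result = pd.concat([df, pct_ranks, bins], axis=1)
print(result.head())
print(f"\nResult shape: {result.shape}")
```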

Comparison of Methods

| Method | Output Range | Best For | Use Case |
|---|---|---|---|
| Quantile Rank | Above 0, up to 1.0 | Precise percentile positions | Statistical analysis |
| Decile Rank | 1 to 10 | Grouping data into categories | Performance rankings |
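
The two measures are closely related: an equal-frequency decile can be derived directly from an observation's rank. The sketch below uses a made-up Series of 20 distinct values and assumes ties should be broken by order of appearance (method='first'). It works from integer ranks rather than from the pct=True output to avoid floating-point rounding at the decile boundaries. Note that the result is an equal-frequency grouping, so it can differ from the equal-width bins that pd.cut() produces.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 distinct values
values = pd.Series([12, 45, 7, 33, 28, 19, 50, 3, 41, 25,
                    16, 38, 9, 30, 22, 47, 5, 36, 14, 43])

# Integer ranks 1..n; ties (none here) would be broken by order of appearance
ranks = values.rank(method='first')

# Map rank r out of n to a decile: ceil(10 * r / n)
decile = np.ceil(ranks * 10 / len(values)).astype(int)

print(pd.DataFrame({'value': values, 'decile': decile}).sort_values('value'))
```

With 20 values, each decile from 1 to 10 receives exactly two observations.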

Common Applications

  • Performance evaluation: Ranking students or employees

  • Outlier detection: Identifying extreme values (quantile rank < 0.05 or > 0.95)

  • Market analysis: Categorizing products by sales performance

  • Risk assessment: Grouping investments by risk levels
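
As a small illustration of the outlier-detection item, the sketch below flags rows whose quantile rank falls below 0.05 or above 0.95. The data (the integers 1 to 100) and the names is_extreme and extremes are made up for the example. Keep in mind that this rule always flags a fixed share of the rows, whether or not those values are genuinely anomalous.

```python
import pandas as pd

# Hypothetical data: the integers 1 to 100
df = pd.DataFrame({'value': range(1, 101)})
df['quantile_rank'] = df['value'].rank(pct=True)

# Flag the lowest and highest tails using the thresholds from the list above
is_extreme = (df['quantile_rank'] < 0.05) | (df['quantile_rank'] > 0.95)
extremes = df[is_extreme]

print(extremes)
```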

Conclusion

Quantile ranks give each value's precise percentile position via rank(pct=True), while pd.cut() with bins=10 groups values into 10 equal-width bins labeled 1 to 10. If each group must hold roughly 10% of the observations, pd.qcut() is the equal-frequency alternative. Both rankings help identify patterns, outliers, and relative positions within a dataset.

Updated on: 2026-03-27T13:06:31+05:30
