How can the 'lmplot' function be used to fit values to data if one of the variables is discrete in Python?

When building regression models, it helps to check for multicollinearity, that is, strong correlation among the continuous predictor variables. If multicollinearity exists, it is usually addressed (for example, by dropping or combining predictors), because it makes the coefficient estimates unstable and hard to interpret.

Seaborn provides two key functions for visualizing linear relationships: regplot and lmplot. The regplot function accepts x and y in several forms, including NumPy arrays, Pandas Series, or column names of a DataFrame passed as data. The lmplot function requires the data parameter, and x and y must be strings naming columns of that long-form DataFrame.
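The difference in accepted inputs can be sketched as follows. This is a minimal illustration, not the article's own example: the data is synthetic, and the names party_size, tip, and demo_df are made up for the sketch. The Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import numpy as np
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

# Synthetic stand-in data (assumption): a discrete x and a continuous y
rng = np.random.default_rng(0)
party_size = rng.integers(1, 7, size=100)            # integers 1..6
tip = 1.0 + 0.7 * party_size + rng.normal(0, 0.8, size=100)
demo_df = pd.DataFrame({"size": party_size, "tip": tip})

# regplot: x and y can be passed directly as arrays; it draws on a single Axes
fig, ax = plt.subplots()
ax = sb.regplot(x=party_size, y=tip, ax=ax)

# lmplot: x and y are column names looked up in `data`; it returns a FacetGrid
grid = sb.lmplot(x="size", y="tip", data=demo_df)
```

Call plt.show() afterwards (with an interactive backend) to display both figures.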

Using lmplot with Discrete Variables

The lmplot function can handle cases where one variable is discrete. Here's how to visualize the relationship between party size (discrete) and tip amount:

import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

# Load the tips dataset
my_df = sb.load_dataset('tips')

# Display first few rows to understand the data
print("First 5 rows of tips dataset:")
print(my_df.head())
print("\nData types:")
print(my_df.dtypes)

Output

First 5 rows of tips dataset:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

Data types:
total_bill     float64
tip            float64
sex           category
smoker        category
day           category
time          category
size             int64

Creating Linear Model Plot

Now let's create a linear model plot with party size (discrete variable) on the x-axis and tip amount on the y-axis:

import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

# Load the tips dataset
my_df = sb.load_dataset('tips')

# Create lmplot with discrete x variable (size) and continuous y variable (tip)
sb.lmplot(x="size", y="tip", data=my_df, height=6, aspect=1.2)

plt.title("Linear Relationship: Party Size vs Tip Amount")
plt.xlabel("Party Size (Discrete Variable)")
plt.ylabel("Tip Amount ($)")
plt.show()

Enhanced Visualization with Grouping

You can further enhance the plot by adding a categorical variable through the hue parameter, which fits a separate regression line for each group:

import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

# Load the tips dataset
my_df = sb.load_dataset('tips')

# Create lmplot with hue parameter for additional grouping
sb.lmplot(x="size", y="tip", hue="time", data=my_df, height=6, aspect=1.2)

plt.title("Party Size vs Tip Amount by Meal Time")
plt.xlabel("Party Size")
plt.ylabel("Tip Amount ($)")
plt.show()

Key Features of lmplot with Discrete Variables

| Feature | Description | Benefit |
| Regression Line | Shows linear trend despite discrete x-values | Reveals overall relationship pattern |
| Confidence Interval | Translucent shaded band around regression line | Indicates uncertainty in the fit |
| Scatter Points | Individual data points at discrete x-values | Shows data distribution at each level |
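Because a discrete x variable stacks many points at the same x position, the scatter can be hard to read. Two lmplot parameters are commonly used for this: x_jitter, which adds random noise to the plotted x positions only (the fit itself is unaffected), and x_estimator, which collapses the observations at each x value into a single estimate with an error bar. The sketch below uses synthetic data and hypothetical names (demo_df, grid_jitter, grid_mean); the jitter amount of 0.1 is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to open a window
import numpy as np
import pandas as pd
import seaborn as sb

# Synthetic stand-in data (assumption): discrete x with many repeated values
rng = np.random.default_rng(1)
party_size = rng.integers(1, 7, size=100)
tip = 1.0 + 0.7 * party_size + rng.normal(0, 0.8, size=100)
demo_df = pd.DataFrame({"size": party_size, "tip": tip})

# Option 1: spread overlapping points horizontally; the regression is still
# fitted on the original, un-jittered x values
grid_jitter = sb.lmplot(x="size", y="tip", data=demo_df, x_jitter=0.1)

# Option 2: show the mean tip at each party size, with a confidence bar,
# instead of every individual point
grid_mean = sb.lmplot(x="size", y="tip", data=demo_df, x_estimator=np.mean)
```

Jitter keeps every observation visible, while the estimator view is easier to read when each level holds many points.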

Parameters for lmplot

Important parameters when working with discrete variables:

import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt

my_df = sb.load_dataset('tips')

# Advanced lmplot with multiple parameters
sb.lmplot(
    x="size",
    y="tip",
    data=my_df,
    height=6,                    # Height of each facet, in inches
    aspect=1.3,                  # Width of each facet = aspect * height
    ci=95,                       # Confidence level (%) for the shaded band
    scatter_kws={"alpha": 0.6},  # Transparency for points
    line_kws={"color": "red"}    # Regression line color
)

plt.title("Customized lmplot: Size vs Tip")
plt.show()

Conclusion

The lmplot function effectively handles discrete variables by fitting a regression line through the scattered data points. This visualization helps identify linear trends even when one variable has limited distinct values, making it valuable for regression analysis with mixed variable types.
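Note that lmplot only draws the fit; it does not return the fitted coefficients. If the fitted values themselves are needed, one option is to compute the same kind of straight-line least-squares fit separately. The sketch below uses np.polyfit on synthetic data; the variable names and the data-generating numbers are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in data (assumption): discrete x, continuous y
rng = np.random.default_rng(0)
party_size = rng.integers(1, 7, size=200)
tip = 1.0 + 0.7 * party_size + rng.normal(0, 0.8, size=200)

# Degree-1 least-squares fit; polyfit returns the highest-order coefficient first
slope, intercept = np.polyfit(party_size, tip, deg=1)

# Fitted tip at each distinct party size
levels = np.unique(party_size)
fitted = intercept + slope * levels
for level, value in zip(levels, fitted):
    print(f"size={level}: fitted tip = {value:.2f}")
```

For standard errors or hypothesis tests on the coefficients, a statistics package such as statsmodels is the more usual choice.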

---
Updated on: 2026-03-25T13:25:29+05:30
