How is Seaborn used to filter and select specific rows or columns from my data?

Seaborn is primarily a data visualization library and does not provide its own methods for filtering or selecting rows or columns. It is designed to work with pandas, Python's data manipulation library, so the usual workflow is to filter and select with pandas and then pass the resulting DataFrame to Seaborn for plotting.

By combining the data manipulation capabilities of pandas with the visualization capabilities of Seaborn, we can gain insights from our data and effectively communicate our findings through visualizations.

Import the Necessary Libraries

First, import the required libraries (seaborn, pandas, and matplotlib) in your Python environment:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

Load Data into a Pandas DataFrame

After importing the required libraries, we can create data with pd.DataFrame(), load it from a file with pd.read_csv(), or, as in this example, load one of Seaborn's built-in example datasets with sns.load_dataset():

import seaborn as sns
import pandas as pd

# Load the Titanic dataset
df = sns.load_dataset('titanic')
print(df.head())
   survived  pclass     sex   age  sibsp  parch     fare embarked  class    who  adult_male deck  embark_town alive  alone
0         0       3    male  22.0      1      0   7.2500        S  Third    man        True  NaN  Southampton    no  False
1         1       1  female  38.0      1      0  71.2833        C  First  woman       False    C    Cherbourg   yes  False
2         1       3  female  26.0      0      0   7.9250        S  Third  woman       False  NaN  Southampton   yes   True
3         1       1  female  35.0      1      0  53.1000        S  First  woman       False    C  Southampton   yes  False
4         0       3    male  35.0      0      0   8.0500        S  Third    man        True  NaN  Southampton    no   True
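
The DataFrame() and read_csv() routes mentioned above could look roughly like the sketch below. The values and the CSV text are made up for illustration, and io.StringIO is an assumption that stands in for a real file path.

```python
import io
import pandas as pd

# Option 1: build a DataFrame from a dictionary (illustrative values only)
df_manual = pd.DataFrame({
    "age": [22.0, 38.0, 26.0],
    "fare": [7.25, 71.2833, 7.925],
    "survived": [0, 1, 1],
})

# Option 2: read CSV data; StringIO stands in for a path such as "data.csv"
csv_text = "age,fare,survived\n22.0,7.25,0\n38.0,71.2833,1\n26.0,7.925,1\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_manual.shape, df_csv.shape)
print(list(df_csv.columns))
```

Either DataFrame can be filtered and plotted in the same way as the Titanic dataset used in the rest of this article.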

Filter Rows Based on Conditions

Pandas provides several ways to filter rows based on conditions. The most common is boolean indexing:

import seaborn as sns
import pandas as pd

df = sns.load_dataset('titanic')

# Filter rows where age is greater than 30
filtered_df = df[df['age'] > 30]
print(f"Original dataset shape: {df.shape}")
print(f"Filtered dataset shape: {filtered_df.shape}")
print(filtered_df.head())
Original dataset shape: (891, 15)
Filtered dataset shape: (365, 15)
   survived  pclass     sex   age  sibsp  parch     fare embarked  class    who  adult_male deck  embark_town alive  alone
1         1       1  female  38.0      1      0  71.2833        C  First  woman       False    C    Cherbourg   yes  False
3         1       1  female  35.0      1      0  53.1000        S  First  woman       False    C  Southampton   yes  False
4         0       3    male  35.0      0      0   8.0500        S  Third    man        True  NaN  Southampton    no   True
6         0       1    male  54.0      0      0  51.8625        S  First    man        True    E  Southampton    no   True
11        1       1  female  58.0      0      0  26.5500        S  First  woman       False    C  Southampton   yes   True
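
Beyond a single comparison, boolean indexing also works with combined conditions and helper methods such as isin(), between(), and notna(). The sketch below uses a small made-up sample (not the real Titanic rows) so that it runs without downloading anything.

```python
import pandas as pd

# Small made-up sample with the same column names as the Titanic dataset
df = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0, 54.0],
    "sex": ["male", "female", "female", "female", "male"],
    "embark_town": ["Southampton", "Cherbourg", "Queenstown",
                    "Southampton", "Southampton"],
})

# Combine conditions with & (and), | (or), ~ (not); wrap each in parentheses
women_over_30 = df[(df["sex"] == "female") & (df["age"] > 30)]

# isin() keeps rows whose value appears in the given list
from_two_ports = df[df["embark_town"].isin(["Cherbourg", "Queenstown"])]

# between() is inclusive on both ends by default
aged_30_to_40 = df[df["age"].between(30, 40)]

# notna() keeps rows with a known age; a comparison such as age > 30
# evaluates to False for missing values, so those rows are dropped silently
known_age = df[df["age"].notna()]

print(len(women_over_30), len(from_two_ports), len(aged_30_to_40), len(known_age))
# prints: 2 2 2 4
```

The missing-value behaviour is worth keeping in mind: any filter on age also removes passengers whose age was not recorded.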

Select Specific Columns

We can use pandas to select specific columns from a DataFrame by column name or with an indexer such as loc:

import seaborn as sns
import pandas as pd

df = sns.load_dataset('titanic')

# Select specific columns by name
selected_columns = df[['age', 'fare', 'survived']]
print("Selected columns:")
print(selected_columns.head())

# Alternative method using loc
selected_with_loc = df.loc[:, ['age', 'fare', 'survived']]
print("\nUsing loc method:")
print(selected_with_loc.head())
Selected columns:
    age     fare  survived
0  22.0   7.2500         0
1  38.0  71.2833         1
2  26.0   7.9250         1
3  35.0  53.1000         1
4  35.0   8.0500         0

Using loc method:
    age     fare  survived
0  22.0   7.2500         0
1  38.0  71.2833         1
2  26.0   7.9250         1
3  35.0  53.1000         1
4  35.0   8.0500         0
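
Columns can also be selected by position, by a label range, or by data type. The following is a minimal sketch on a made-up three-row sample that reuses the Titanic column names.

```python
import pandas as pd

# Made-up sample using a few Titanic-style columns
df = pd.DataFrame({
    "survived": [0, 1, 1],
    "pclass": [3, 1, 3],
    "sex": ["male", "female", "female"],
    "age": [22.0, 38.0, 26.0],
    "fare": [7.25, 71.2833, 7.925],
})

# iloc selects by integer position: all rows, first two columns
first_two = df.iloc[:, :2]

# A label slice with loc includes both endpoints
sex_to_fare = df.loc[:, "sex":"fare"]

# select_dtypes picks columns by data type
numeric_only = df.select_dtypes(include="number")

print(list(first_two.columns))     # ['survived', 'pclass']
print(list(sex_to_fare.columns))   # ['sex', 'age', 'fare']
print(list(numeric_only.columns))  # ['survived', 'pclass', 'age', 'fare']
```

Selecting only numeric columns is handy before plots that expect numbers, such as a correlation heatmap.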

Combine Filtering and Column Selection

We can combine filtering and column selection for more targeted data analysis:

import seaborn as sns
import pandas as pd

df = sns.load_dataset('titanic')

# Filter survivors and select specific columns
survivors = df[(df['survived'] == 1) & (df['age'] > 20)][['age', 'fare', 'pclass']]
print("Survivors over 20 years old:")
print(survivors.head())
print(f"Number of survivors over 20: {len(survivors)}")
Survivors over 20 years old:
     age     fare  pclass
1   38.0  71.2833       1
2   26.0   7.9250       3
3   35.0  53.1000       1
8   27.0  11.1333       3
11  58.0  26.5500       1
Number of survivors over 20: 220
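
The same row-and-column selection can be written as a single loc call, which avoids chaining two indexing steps. This is a sketch on made-up rows; the fare_rounded column is a hypothetical addition used only to show that the copied result can be modified safely.

```python
import pandas as pd

# Made-up sample using Titanic-style column names
df = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0],
    "age": [22.0, 38.0, 26.0, 14.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 30.0708, 8.05],
    "pclass": [3, 1, 3, 2, 3],
})

# Build the row condition once, then select rows and columns together
mask = (df["survived"] == 1) & (df["age"] > 20)
survivors = df.loc[mask, ["age", "fare", "pclass"]].copy()

# Because of .copy(), adding a column does not touch the original DataFrame
survivors["fare_rounded"] = survivors["fare"].round(1)

print(survivors.index.tolist())  # [1, 2]
```

The 14-year-old survivor at index 3 is excluded because the age condition is not met.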

Visualize Filtered Data with Seaborn

Once we have filtered or selected the desired data using pandas, we can pass it to Seaborn to create visualizations:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

df = sns.load_dataset('titanic')

# Filter data for adults (age > 18)
adults = df[df['age'] > 18]

# Create a scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(data=adults, x='age', y='fare', hue='survived', alpha=0.7)
plt.title('Age vs Fare for Adult Passengers')
plt.xlabel('Age')
plt.ylabel('Fare')
plt.show()
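
To check what a filter actually removed, one option is to draw the full and the filtered data side by side. The sketch below assumes a headless run (the Agg backend line can be dropped in a notebook) and uses made-up rows rather than the real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample using Titanic-style column names
df = pd.DataFrame({
    "age": [4.0, 22.0, 38.0, None, 35.0, 54.0, 14.0, 58.0],
    "fare": [16.7, 7.25, 71.2833, 8.4583, 8.05, 51.8625, 30.0708, 26.55],
    "survived": [1, 0, 1, 0, 0, 0, 1, 1],
})

adults = df[df["age"] > 18]  # rows with a missing age are dropped as well

# Shared axes make the two panels directly comparable
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
sns.scatterplot(data=df, x="age", y="fare", hue="survived", ax=axes[0])
axes[0].set_title(f"All passengers (n={len(df)})")
sns.scatterplot(data=adults, x="age", y="fare", hue="survived", ax=axes[1])
axes[1].set_title(f"Adults only (n={len(adults)})")
fig.tight_layout()
```

Putting the row count in each title makes it easy to see how much data the filter kept.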

Multiple Visualization Examples

Here are several different visualizations of the same filtered data:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

df = sns.load_dataset('titanic')

# Filter first class passengers
first_class = df[df['pclass'] == 1]

# Create subplots for different visualizations
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Box plot of fare by survival status
sns.boxplot(data=first_class, x='survived', y='fare', ax=axes[0,0])
axes[0,0].set_title('First Class: Fare by Survival Status')

# Count plot of gender
sns.countplot(data=first_class, x='sex', hue='survived', ax=axes[0,1])
axes[0,1].set_title('First Class: Gender Distribution by Survival')

# Histogram of age
sns.histplot(data=first_class, x='age', bins=15, ax=axes[1,0])
axes[1,0].set_title('First Class: Age Distribution')

# Count plot of embarkation port
sns.countplot(data=first_class, x='embark_town', ax=axes[1,1])
axes[1,1].set_title('First Class: Embarkation Ports')
axes[1,1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

Comparison of Filtering Methods

Method              | Syntax                     | Best For
Boolean indexing    | df[condition]              | Simple conditions
loc with conditions | df.loc[condition, columns] | Combining row and column selection
query() method      | df.query('condition')      | Complex string-based conditions
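
As a rough illustration of the three approaches in the table, the sketch below applies each one to the same made-up sample and checks that they select identical rows. The min_age variable is a hypothetical name; query() can reference local variables with the @ prefix.

```python
import pandas as pd

# Made-up sample using Titanic-style column names
df = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0],
    "age": [22.0, 38.0, 26.0, 14.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 30.0708, 8.05],
})

cols = ["age", "fare"]
min_age = 20  # hypothetical threshold for this example

# Boolean indexing, followed by column selection
by_boolean = df[(df["survived"] == 1) & (df["age"] > min_age)][cols]

# loc with a condition and a column list in one step
by_loc = df.loc[(df["survived"] == 1) & (df["age"] > min_age), cols]

# query() with a string expression; @min_age refers to the local variable
by_query = df.query("survived == 1 and age > @min_age")[cols]

print(by_boolean.equals(by_loc) and by_loc.equals(by_query))  # True
```

All three return the same rows here, so the choice is mostly a matter of readability.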

Conclusion

While Seaborn focuses on visualization, combining it with pandas filtering provides powerful data analysis capabilities. Use pandas for data manipulation and Seaborn for creating insightful visualizations of your filtered datasets.

Updated on: 2026-03-27T10:54:00+05:30
