How is Seaborn used to filter and select specific rows or columns from my data?
Seaborn is primarily a data visualization library and does not provide methods of its own for filtering or selecting rows or columns. It is, however, built to work with pandas DataFrames, so the usual workflow is to filter and select the data with pandas and then pass the result to Seaborn for plotting.
By combining the data manipulation capabilities of pandas with the visualization capabilities of Seaborn, we can gain insights from our data and effectively communicate our findings through visualizations.
Import the Necessary Libraries
First, import the required libraries (seaborn, pandas, and matplotlib) into the Python environment:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
Load Data into a Pandas DataFrame
After importing the libraries, we can build a DataFrame with pd.DataFrame(), read one from a file with pd.read_csv(), or, as in this example, load one of Seaborn's sample datasets with sns.load_dataset():
import seaborn as sns
import pandas as pd
# Load the Titanic dataset
df = sns.load_dataset('titanic')
print(df.head())
   survived  pclass     sex   age  sibsp  parch     fare  embarked  class    who  adult_male  deck  embark_town  alive  alone
0         0       3    male  22.0      1      0   7.2500         S  Third    man        True   NaN  Southampton     no  False
1         1       1  female  38.0      1      0  71.2833         C  First  woman       False     C    Cherbourg    yes  False
2         1       3  female  26.0      0      0   7.9250         S  Third  woman       False   NaN  Southampton    yes   True
3         1       1  female  35.0      1      0  53.1000         S  First  woman       False     C  Southampton    yes  False
4         0       3    male  35.0      0      0   8.0500         S  Third    man        True   NaN  Southampton     no   True
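The two other loading routes mentioned above, DataFrame() and read_csv(), are not shown in the Titanic example, so here is a minimal sketch of both. The names and values are invented for illustration, and io.StringIO stands in for a real CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only
passengers = pd.DataFrame({
    "name": ["Ann", "Ben", "Cara"],
    "age": [29, 41, 17],
    "fare": [12.5, 80.0, 7.25],
})

# read_csv() accepts a file path or any file-like object;
# StringIO is used here so the example needs no file on disk
csv_text = "name,age,fare\nAnn,29,12.5\nBen,41,80.0\nCara,17,7.25\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(passengers)
print(from_csv.dtypes)
```

Either DataFrame can then be filtered and plotted in the same way as the Titanic data.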
Filter Rows Based on Conditions
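pandas offers several ways to filter rows by condition. The most common is boolean indexing: a comparison such as df['age'] > 30 produces a boolean Series, and indexing the DataFrame with it keeps only the rows where the value is True. Rows with a missing age compare as False and are therefore dropped: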
import seaborn as sns
import pandas as pd
df = sns.load_dataset('titanic')
# Filter rows where age is greater than 30
filtered_df = df[df['age'] > 30]
print(f"Original dataset shape: {df.shape}")
print(f"Filtered dataset shape: {filtered_df.shape}")
print(filtered_df.head())
Original dataset shape: (891, 15)
Filtered dataset shape: (365, 15)
    survived  pclass     sex   age  sibsp  parch     fare  embarked  class    who  adult_male  deck  embark_town  alive  alone
 1         1       1  female  38.0      1      0  71.2833         C  First  woman       False     C    Cherbourg    yes  False
 3         1       1  female  35.0      1      0  53.1000         S  First  woman       False     C  Southampton    yes  False
 4         0       3    male  35.0      0      0   8.0500         S  Third    man        True   NaN  Southampton     no   True
 6         0       1    male  54.0      0      0  51.8625         S  First    man        True     E  Southampton     no   True
11         1       1  female  58.0      0      0  26.5500         S  First  woman       False     C  Southampton    yes   True
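Boolean indexing is not limited to a single comparison. The sketch below shows a few common variations on a small hand-built DataFrame; the rows are hypothetical and only borrow the Titanic column names, so the example runs without downloading anything.

```python
import numpy as np
import pandas as pd

# Hypothetical rows that reuse a few Titanic column names
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, 54.0, 4.0],
    "sex": ["male", "female", "female", "female", "male", "female"],
    "pclass": [3, 1, 3, 1, 1, 2],
})

# AND: wrap each comparison in parentheses, because & binds tighter than >
women_over_30 = df[(df["age"] > 30) & (df["sex"] == "female")]

# OR uses |, NOT uses ~
first_or_child = df[(df["pclass"] == 1) | (df["age"] < 18)]
not_third = df[~(df["pclass"] == 3)]

# isin() tests membership; between() is inclusive on both ends by default
upper_classes = df[df["pclass"].isin([1, 2])]
thirties = df[df["age"].between(30, 39)]

# Missing ages never satisfy a comparison, so select them explicitly if needed
missing_age = df[df["age"].isna()]

print(women_over_30)
print(missing_age)
```

Python's own `and`, `or`, and `not` do not work element-wise on a Series, which is why the bitwise operators are used here.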
Select Specific Columns
We can select specific columns from a DataFrame by passing a list of column names in square brackets, or with the loc indexer:
import seaborn as sns
import pandas as pd
df = sns.load_dataset('titanic')
# Select specific columns by name
selected_columns = df[['age', 'fare', 'survived']]
print("Selected columns:")
print(selected_columns.head())
# Alternative method using loc
selected_with_loc = df.loc[:, ['age', 'fare', 'survived']]
print("\nUsing loc method:")
print(selected_with_loc.head())
Selected columns:
age fare survived
0 22.0 7.2500 0
1 38.0 71.2833 1
2 26.0 7.9250 1
3 35.0 53.1000 1
4 35.0 8.0500 0
Using loc method:
age fare survived
0 22.0 7.2500 0
1 38.0 71.2833 1
2 26.0 7.9250 1
3 35.0 53.1000 1
4 35.0 8.0500 0
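Besides selecting by name, pandas can select columns by position, by data type, or by a name pattern. The following is a small sketch of those options; the DataFrame is hand-built with hypothetical rows so the example is self-contained.

```python
import pandas as pd

# Hypothetical rows that reuse a few Titanic column names
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0],
    "fare": [7.25, 71.2833, 7.925],
    "survived": [0, 1, 1],
    "sex": ["male", "female", "female"],
    "embark_town": ["Southampton", "Cherbourg", "Southampton"],
})

# By position: all rows, first two columns
first_two = df.iloc[:, :2]

# By label slice: unlike positional slices, both endpoints are included
label_slice = df.loc[:, "age":"survived"]

# By data type: keep only numeric columns
numeric = df.select_dtypes(include="number")

# By name pattern: columns whose name contains "embark"
embark = df.filter(like="embark")

# By exclusion: everything except the listed columns
without_sex = df.drop(columns=["sex"])

print(list(numeric.columns))
print(list(embark.columns))
```

Selecting numeric columns by type is handy before calling Seaborn functions that expect numeric input.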
Combine Filtering and Column Selection
We can combine filtering and column selection for more targeted data analysis:
import seaborn as sns
import pandas as pd
df = sns.load_dataset('titanic')
# Filter survivors and select specific columns
survivors = df[(df['survived'] == 1) & (df['age'] > 20)][['age', 'fare', 'pclass']]
print("Survivors over 20 years old:")
print(survivors.head())
print(f"Number of survivors over 20: {len(survivors)}")
Survivors over 20 years old:
age fare pclass
1 38.0 71.2833 1
2 26.0 7.9250 3
3 35.0 53.1000 1
8 27.0 11.1333 3
11 58.0 26.5500 1
Number of survivors over 20: 220
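The chained form df[condition][columns] works for reading data, but the same selection can be written as a single loc call. The sketch below uses hypothetical rows and an invented derived column, is_first_class, to show why taking an explicit copy is useful when the subset will be modified afterwards.

```python
import pandas as pd

# Hypothetical rows that reuse a few Titanic column names
df = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 1],
    "age": [22.0, 38.0, 26.0, 35.0, 35.0, 14.0],
    "fare": [7.25, 71.2833, 7.925, 53.1, 8.05, 30.0708],
    "pclass": [3, 1, 3, 1, 3, 2],
})

# Build the row mask once, then select rows and columns in one step
mask = (df["survived"] == 1) & (df["age"] > 20)
survivors = df.loc[mask, ["age", "fare", "pclass"]].copy()

# Because survivors is an explicit copy, adding a column
# does not touch df and does not raise a chained-assignment warning
survivors["is_first_class"] = survivors["pclass"] == 1

print(survivors)
print("is_first_class" in df.columns)
```

The named mask can also be reused, for example as df.loc[~mask] to get the complementary rows.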
Visualize Filtered Data with Seaborn
Once we have filtered or selected the desired data with pandas, we can pass it to Seaborn through the data parameter of its plotting functions:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
df = sns.load_dataset('titanic')
# Filter data for adults (age > 18)
adults = df[df['age'] > 18]
# Create a scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(data=adults, x='age', y='fare', hue='survived', alpha=0.7)
plt.title('Age vs Fare for Adult Passengers')
plt.xlabel('Age')
plt.ylabel('Fare')
plt.show()
Multiple Visualization Examples
The same filtered DataFrame can feed several plot types. The example below draws four plots of first-class passengers on one figure:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
df = sns.load_dataset('titanic')
# Filter first class passengers
first_class = df[df['pclass'] == 1]
# Create subplots for different visualizations
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
# Box plot of fare by survival status
sns.boxplot(data=first_class, x='survived', y='fare', ax=axes[0,0])
axes[0,0].set_title('First Class: Fare by Survival Status')
# Count plot of gender
sns.countplot(data=first_class, x='sex', hue='survived', ax=axes[0,1])
axes[0,1].set_title('First Class: Gender Distribution by Survival')
# Histogram of age
sns.histplot(data=first_class, x='age', bins=15, ax=axes[1,0])
axes[1,0].set_title('First Class: Age Distribution')
# Count plot of embarkation town
sns.countplot(data=first_class, x='embark_town', ax=axes[1,1])
axes[1,1].set_title('First Class: Embarkation Ports')
axes[1,1].tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()
Comparison of Filtering Methods
| Method | Syntax | Best For |
|---|---|---|
| Boolean indexing | df[condition] | Simple conditions |
| loc with conditions | df.loc[condition, columns] | Combining row and column selection |
| query() method | df.query('condition') | Complex string-based conditions |
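To make the comparison concrete, here is a short sketch that applies all three methods to the same condition and checks that they return identical results. The data and the min_age threshold are hypothetical; inside query(), the @ prefix refers to a Python variable from the surrounding scope.

```python
import pandas as pd

# Hypothetical rows that reuse a few Titanic column names
df = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 1],
    "age": [22.0, 38.0, 26.0, 35.0, 35.0, 14.0],
    "fare": [7.25, 71.2833, 7.925, 53.1, 8.05, 30.0708],
})

min_age = 20
cols = ["age", "fare"]

# 1. Boolean indexing, then column selection
by_bool = df[(df["survived"] == 1) & (df["age"] > min_age)][cols]

# 2. loc with the condition and the columns together
by_loc = df.loc[(df["survived"] == 1) & (df["age"] > min_age), cols]

# 3. query() with a string expression; @min_age reads the local variable
by_query = df.query("survived == 1 and age > @min_age")[cols]

print(by_bool.equals(by_loc) and by_loc.equals(by_query))
```

Which form to use is mostly a readability choice: query() tends to read well for long conditions, while loc keeps row and column selection in one place.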
Conclusion
While Seaborn focuses on visualization, combining it with pandas filtering provides powerful data analysis capabilities. Use pandas for data manipulation and Seaborn for creating insightful visualizations of your filtered datasets.
