Seaborn.boxplot() method



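The seaborn.boxplot() method draws a box plot, which shows the distribution of a numeric variable for each category. Each box marks the quartiles of the data, the whiskers extend over the rest of the distribution, and points judged to be outliers are drawn individually. This makes it easy to compare variables or the different levels of a categorical variable.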
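To make the elements of a box plot concrete, here is a minimal sketch of the summary statistics behind a single box. It assumes a made-up sample and the default whis=1.5 from the signature; the variable names are illustrative only and are not part of seaborn.

```python
import numpy as np

# Made-up sample with one extreme value (an assumption for illustration).
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)

# The box spans the first to the third quartile; the line inside is the median.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

whis = 1.5  # same value as the default whis in the signature
lower_fence = q1 - whis * iqr
upper_fence = q3 + whis * iqr

# Whiskers end at the most extreme observations that lie inside the fences.
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences is drawn as an individual outlier point.
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3, whisker_low, whisker_high, outliers)
```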

Inputs passed to this method can be of various types, such as an array, wide-form data, long-form data or a list of vectors.

Syntax

Following is the syntax of seaborn.boxplot() method −

seaborn.boxplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8, dodge=True, fliersize=5, linewidth=None, whis=1.5, ax=None, **kwargs)

Parameters

Some of the parameters in this method are discussed below.

S.No Parameter and Description
1 x,y

These parameters take the names of the variables in data that are plotted on the x and y axes when long-form data is used.

2 data

This is the dataframe that is used to plot graphs.

3 hue

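This takes the name of a categorical variable in the dataframe. The data in each category is split by the levels of this variable, and each level is drawn in a different color.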

4 linewidth

This parameter takes floating values and determines the width of the gray lines that frame the elements in the plot.

5 dodge

This parameter takes a boolean value. When hue nesting is used, passing True shifts the boxes for different hue levels along the categorical axis so that they sit side by side. If False is passed, the boxes for each hue level are drawn at the same position, on top of each other.

6 orient

It takes values “h” or “v” and the orientation of the graph is determined based on this.

7 color

A matplotlib color is taken as input, and it determines the color of all the elements.

8 fliersize

This parameter determines the size of the markers used to draw outlier observations.
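A few of the parameters above can be tried on a small frame. This is a sketch only: the frame df, its column names and its values are made up for illustration, and a non-interactive backend is chosen so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two groups, a yes/no flag, and one extreme value per group.
df = pd.DataFrame({
    "group": ["a"] * 6 + ["b"] * 6,
    "flag": ["yes", "no"] * 6,
    "value": [1, 2, 3, 4, 5, 40, 2, 3, 4, 5, 6, 50],
})

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# color sets one color for every box; fliersize shrinks the outlier markers.
sns.boxplot(data=df, x="group", y="value", color="skyblue",
            fliersize=3, ax=axes[0])

# With hue nesting, dodge=False draws the boxes of each hue level
# at the same position instead of side by side.
sns.boxplot(data=df, x="group", y="value", hue="flag",
            dodge=False, ax=axes[1])
```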

Loading the seaborn library

Let us load the seaborn library and the dataset before moving on to developing the plots. To import the seaborn library, the following line of code can be used.

import seaborn as sns

Loading the dataset

In this article, we will make use of the Titanic dataset that is built into the seaborn library. The following command is used to load the dataset.

titanic=sns.load_dataset("titanic")

The below mentioned command is used to view the first 5 rows in the dataset. This enables us to understand what variables can be used to plot a graph.

titanic.head()

The first few rows of the output for the above piece of code are shown below.

index,survived,pclass,sex,age,sibsp,parch,fare,embarked,class,who,adult_male,deck,embark_town,alive,alone
0,0,3,male,22.0,1,0,7.25,S,Third,man,true,NaN,Southampton,no,false
1,1,1,female,38.0,1,0,71.2833,C,First,woman,false,C,Cherbourg,yes,false
2,1,3,female,26.0,0,0,7.925,S,Third,woman,false,NaN,Southampton,yes,true
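The preview is what tells us which columns suit which parameter. The sketch below assumes a hand-built frame named preview (a hypothetical name) that copies a few columns from the three rows shown above; it is not the full dataset. It separates numeric columns from the rest with the pandas select_dtypes() method.

```python
import pandas as pd

# A few columns copied from the three preview rows shown above.
preview = pd.DataFrame({
    "survived": [0, 1, 1],
    "pclass": [3, 1, 3],
    "sex": ["male", "female", "female"],
    "age": [22.0, 38.0, 26.0],
    "fare": [7.25, 71.2833, 7.925],
    "who": ["man", "woman", "woman"],
    "alive": ["no", "yes", "yes"],
})

# Numeric columns are candidates for the value axis of a box plot.
numeric_cols = preview.select_dtypes(include="number").columns.tolist()

# Text columns are candidates for the categorical axis and for hue.
other_cols = preview.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("other:", other_cols)
```

Note that survived and pclass are stored as numbers even though they encode categories, so the data type alone does not settle how a column should be used.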

Now that we have loaded the dataset, we will explore a few examples.

Example 1

A simple shortcut can be used to draw a box plot of a single variable. Instead of passing the dataframe to data and a column name to x, the column itself can be passed directly to the x parameter.

In this example, we use the Titanic dataset and pass its age column to x. The following code can be used to do so.

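import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset bundled with seaborn
titanic = sns.load_dataset("titanic")
titanic.head()  # preview of the first rows (displayed only in a notebook)

# Pass the age column itself to x
sns.boxplot(x=titanic["age"])
plt.show()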

Output

The output for the above line of code can be seen below.

[Image: horizontal box plot of the age column]
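The shortcut and the longer form can be compared side by side. This is a minimal sketch that assumes a small made-up frame df with an age column rather than the Titanic dataset, and a non-interactive backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up ages used only for illustration.
df = pd.DataFrame({"age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0, 27.0, 14.0]})

fig, (ax_series, ax_named) = plt.subplots(1, 2, figsize=(8, 3))

# Shortcut: pass the column (a pandas Series) directly to x.
sns.boxplot(x=df["age"], ax=ax_series)

# Longer form: pass the frame to data and the column name to x.
sns.boxplot(data=df, x="age", ax=ax_named)

# Both axes are labelled with the column name.
print(ax_series.get_xlabel(), ax_named.get_xlabel())
```

Both calls draw the same box, so the choice between them is a matter of convenience.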

Example 2

Now, we will see how to plot a simple box plot by passing the x, y and hue parameters to the seaborn.boxplot() method. Since the Titanic dataset is being used, who, fare and alive are the columns considered.

boxplot() draws a categorical plot, so one of the variables passed for plotting needs to be categorical. In this example, x is given the categorical column who and y the numeric column fare. hue should also be given a categorical variable, here the column alive.

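import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset bundled with seaborn
titanic = sns.load_dataset("titanic")
titanic.head()  # preview of the first rows (displayed only in a notebook)

# Categorical x, numeric y, and a categorical hue
sns.boxplot(x="who", y="fare", hue="alive", data=titanic)
plt.show()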

Output

The output of the above code is as follows −

[Image: box plot of fare for each level of who, split by alive]
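The order and hue_order parameters from the signature control the order in which the categories and the hue levels are drawn. The sketch below assumes a small made-up frame df that only borrows the column names who, fare and alive; the values are not taken from the Titanic dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows that reuse the Titanic column names for illustration.
df = pd.DataFrame({
    "who": ["man"] * 4 + ["woman"] * 4 + ["child"] * 4,
    "fare": [7.25, 8.05, 26.0, 52.0, 71.28, 7.93, 53.1, 13.0,
             21.08, 16.7, 29.13, 31.39],
    "alive": ["no", "no", "yes", "no", "yes", "yes", "yes", "no",
              "no", "yes", "yes", "no"],
})

fig, ax = plt.subplots(figsize=(6, 3))

# order fixes the order of the categories on the x axis,
# hue_order fixes the order of the hue levels within each category.
sns.boxplot(data=df, x="who", y="fare", hue="alive",
            order=["man", "woman", "child"],
            hue_order=["yes", "no"], ax=ax)

legend = ax.get_legend()
print(legend.get_title().get_text())
```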

Example 3

There are many parameters in the seaborn.boxplot() method and we will see the working of another parameter called linewidth. This parameter takes floating values and determines the width of the gray lines that frame the elements in the plot.

The below line of code can be used to understand the working of the linewidth parameter.

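import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset bundled with seaborn
titanic = sns.load_dataset("titanic")
titanic.head()  # preview of the first rows (displayed only in a notebook)

# linewidth=3 thickens the lines that frame the plot elements
sns.boxplot(x="who", y="fare", hue="embark_town", data=titanic, linewidth=3)
plt.show()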

Output

The output obtained is as follows −

[Image: box plot of fare for each level of who, split by embark_town, drawn with thicker lines]
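The effect of linewidth is easiest to see when two settings are drawn next to each other. This is a sketch under the assumption of a small made-up frame df with the column names who and fare; the two width values are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows that reuse the Titanic column names for illustration.
df = pd.DataFrame({
    "who": ["man"] * 5 + ["woman"] * 5,
    "fare": [7.25, 8.05, 26.0, 52.0, 13.0, 71.28, 7.93, 53.1, 30.0, 16.7],
})

fig, (ax_thin, ax_thick) = plt.subplots(1, 2, figsize=(8, 3))

# The same plot drawn with thin and with thick framing lines.
sns.boxplot(data=df, x="who", y="fare", linewidth=0.5, ax=ax_thin)
sns.boxplot(data=df, x="who", y="fare", linewidth=4, ax=ax_thick)

# Compare the widest line drawn on each axes.
thin_max = max(line.get_linewidth() for line in ax_thin.lines)
thick_max = max(line.get_linewidth() for line in ax_thick.lines)
print(thin_max, thick_max)
```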

Example 4

orient is a parameter of the seaborn.boxplot() method that changes the orientation of the graph. It takes one of two values, “v” (vertical) or “h” (horizontal). In this example, we will see how this parameter works.

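import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset bundled with seaborn
titanic = sns.load_dataset("titanic")
titanic.head()  # preview of the first rows (displayed only in a notebook)

# Numeric x and categorical y, drawn horizontally
sns.boxplot(x="fare", y="who", hue="embark_town", data=titanic, orient="h")
plt.show()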

Output

In the above line of code, the orientation is set to “h” that is, it is horizontal. The output thus obtained can be seen below.

[Image: horizontal box plot of fare for each level of who, split by embark_town]

To orient the plot horizontally, the x variable must be numerical, so fare is passed to x in the above example. To set orient to “v”, the variable on the y axis must be numerical instead. This is how the orient parameter of the seaborn.boxplot() method can be used.
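The two orientations can be drawn from the same data by swapping the variables. The sketch below assumes a small made-up frame df that borrows the column names who and fare, and a non-interactive backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows that reuse the Titanic column names for illustration.
df = pd.DataFrame({
    "who": ["man"] * 4 + ["woman"] * 4 + ["child"] * 4,
    "fare": [7.25, 8.05, 26.0, 52.0, 71.28, 7.93, 53.1, 13.0,
             21.08, 16.7, 29.13, 31.39],
})

fig, (ax_h, ax_v) = plt.subplots(1, 2, figsize=(8, 3))

# Horizontal boxes: the numeric variable goes on the x axis.
sns.boxplot(data=df, x="fare", y="who", orient="h", ax=ax_h)

# Vertical boxes: the numeric variable goes on the y axis.
sns.boxplot(data=df, x="who", y="fare", orient="v", ax=ax_v)

print(ax_h.get_xlabel(), ax_h.get_ylabel())
print(ax_v.get_xlabel(), ax_v.get_ylabel())
```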
