Joint Plot
A joint plot combines several plots in one figure: it shows the relationship between two variables together with the individual distribution of each.
By default, it has three elements:
Histogram on the top, which represents the distribution of the x variable;
Histogram on the right, which represents the distribution of the y variable;
Scatter plot in the middle, which shows the relationship between these two variables.
Here is an example of a joint plot:
Data for the Joint Plot
seaborn has a jointplot() function which, like countplot() and kdeplot(), has three key parameters: data, x, and y.
The x and y parameters specify the variables to plot: x is summarized by the histogram on the top, and y by the histogram on the right. These parameters can be array-like objects, or column names when the data parameter is a DataFrame.
import seaborn as sns
import matplotlib.pyplot as plt

# Loading the dataset with data about three different iris flower species
iris_df = sns.load_dataset("iris")

sns.jointplot(data=iris_df, x="sepal_length", y="sepal_width")

plt.show()
The initial example has been recreated by assigning a DataFrame to the data parameter and specifying column names for x and y.
Plot in the Middle
Another useful parameter is kind, which specifies the plot in the middle. Its default value is 'scatter'. The other possible values are 'kde', 'hist', 'hex', 'reg', and 'resid'. Feel free to experiment with different plots:
import seaborn as sns
import matplotlib.pyplot as plt

# Loading the dataset with data about three different iris flower species
iris_df = sns.load_dataset("iris")

sns.jointplot(data=iris_df, x="sepal_length", y="sepal_width", kind='reg')

plt.show()
Plot Kinds
Although the scatter plot is the most common choice for the central plot, there are several other options available:
reg: Adds a linear regression fit to the scatter plot, useful for checking correlation between variables;
resid: Displays the residuals from a linear regression;
hist: Shows a bivariate histogram for two variables;
kde: Draws a bivariate kernel density estimate (KDE) plot;
hex: Produces a hexbin plot, where hexagonal bins replace individual points, and bin color indicates data density.
As usual, you can explore more options and parameters in the jointplot() documentation.
It is also worth exploring the following topics:
residplot() documentation;
Bivariate histogram example;
Hexbin plot example.
Task
- Use the correct function to create a joint plot.
- Use weather_df as the data for the plot (the first argument).
- Set the 'Boston' column as the x-axis variable (the second argument).
- Set the 'Seattle' column as the y-axis variable (the third argument).
- Set the plot in the middle to have a regression line (the rightmost argument).