Learn KDE Plot | Plotting with Seaborn
Ultimate Visualization with Python
KDE Plot

A kernel density estimation (KDE) plot visualizes an estimate of the probability density function of a variable. It resembles the histogram discussed in the previous section, but the KDE plot is a continuous curve rather than a set of bars, and it is built from every individual data point rather than from counts within intervals. Let’s have a look at an example of a KDE plot:

The example shows a histogram combined with a KDE plot (the orange curve). This combination gives a clearer approximation of the probability density function than the histogram alone.
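To make "based on all of the data points" concrete, here is a minimal sketch of the computation behind a KDE curve, assuming a Gaussian kernel: every data point contributes one small bell-shaped bump, and the curve is the average of those bumps. The function name gaussian_kde_curve, the synthetic temperatures, and the fixed bandwidth of 0.5 are all assumptions made for illustration; seaborn picks a bandwidth automatically.

```python
import numpy as np

def gaussian_kde_curve(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid` (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # One Gaussian bump per sample, evaluated at every grid point
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the bumps and rescale so the curve integrates to 1
    return kernels.mean(axis=1) / bandwidth

# Synthetic temperatures (an assumption, not the course dataset)
rng = np.random.default_rng(0)
temperatures = rng.normal(loc=52, scale=1.5, size=200)

grid = np.linspace(temperatures.min() - 5, temperatures.max() + 5, 400)
density = gaussian_kde_curve(temperatures, grid, bandwidth=0.5)

# Riemann-sum check: the area under a density curve should be close to 1
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth makes the curve follow individual points more closely, while a larger one smooths it out.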

With seaborn, creating a KDE plot is straightforward thanks to the dedicated kdeplot() function. Its most important parameters, data, x, and y, work the same way as in the countplot() function.

First Option

The first option is to set only one of these parameters by passing a sequence of values. Here is an example:

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    url = 'https://staging-content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
    # Loading the dataset with the average yearly temperatures in Boston and Seattle
    weather_df = pd.read_csv(url, index_col=0)

    # Creating a KDE plot setting only the data parameter
    sns.kdeplot(data=weather_df['Seattle'], fill=True)
    plt.show()

Here only the data parameter is set, by passing a Series object, and the fill parameter fills in the area under the curve (it is not filled by default).

Second Option

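It is also possible to pass a 2D object such as a DataFrame as data and a column name (or a key, if data is a dictionary) as x (vertical orientation: values along the horizontal axis, density rising vertically) or y (horizontal orientation: values along the vertical axis, density extending horizontally):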

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    url = 'https://staging-content-media-cdn.codefinity.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/weather_data.csv'
    weather_df = pd.read_csv(url, index_col=0)

    # Creating a KDE plot setting both the data and x parameters
    sns.kdeplot(data=weather_df, x='Seattle', fill=True)
    plt.show()

This produces the same result as before, this time by passing the whole DataFrame as data and the column name as x.

By the way, the KDE plot we created has a characteristic bell shape and closely resembles a normal distribution with a mean of approximately 52°F.

To explore the kdeplot() function further, refer to its documentation.

Task

  1. Use the correct function to create a KDE plot.
  2. Use countries_df as the data for the plot (the first argument).
  3. Set 'GDP per capita' as the column to use and the orientation to horizontal via the second argument.
  4. Fill in the area under the curve via the third (rightmost) argument.

Section 5. Chapter 4