Simple Solution for Scraping | Tables
Simple Solution for Scraping

The pandas library provides a quick and convenient way to convert HTML tables into DataFrames. The read_html() function can scrape tables from many websites without requiring you to fetch and parse the page's HTML yourself. It works best on tables with a simple structure, such as the tables on Wikipedia pages.

import pandas as pd

# Download the page and parse every HTML table on it
tables = pd.read_html('https://en.wikipedia.org/wiki/Florida')

In the code above, read_html() retrieves all the tables from the Wikipedia page about Florida. The variable tables is a list of those tables, each already converted to a DataFrame.
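To see the shape of the result without depending on a live page, here is a minimal sketch that feeds read_html() a small inline HTML document. The HTML string and its placeholder values are made up for illustration and are not taken from the Wikipedia page; wrapping the string in StringIO is one way to pass in-memory HTML. The sketch assumes an HTML parser such as lxml is installed alongside pandas.

```python
from io import StringIO

import pandas as pd

# Hypothetical page content with two simple tables (placeholder data only).
html = """
<table>
  <thead><tr><th>City</th><th>Population</th></tr></thead>
  <tbody>
    <tr><td>Alpha</td><td>100</td></tr>
    <tr><td>Beta</td><td>200</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>Institution</th><th>Location</th></tr></thead>
  <tbody>
    <tr><td>Example University</td><td>Alpha</td></tr>
  </tbody>
</table>
"""

# read_html() returns a list with one DataFrame per table it finds.
tables = pd.read_html(StringIO(html))

print(len(tables))   # how many tables were found
print(tables[0])     # the first table as a DataFrame
```

Because the result is a plain Python list, an individual table is picked by index, for example tables[0] or tables[1].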

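When a page contains many tables, it can be hard to find the one you need. To narrow the selection, pass the match parameter: read_html() then returns only the tables that contain text matching the given string or regular expression. The result is still a list, even if only one table matches. For example: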

import pandas as pd

# Keep only the tables whose text matches the given string
tables = pd.read_html('https://en.wikipedia.org/wiki/Florida', match='State University System of Florida')
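The effect of match can be sketched on inline HTML as well. The two captioned tables below are invented placeholders, not the real Wikipedia tables, and the sketch again assumes an HTML parser such as lxml is available. It shows that only the table containing the matching text comes back, and that the DataFrame still has to be taken out of the returned list.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two captioned tables (placeholder data only).
html = """
<table>
  <caption>Largest cities (sample)</caption>
  <thead><tr><th>City</th><th>Population</th></tr></thead>
  <tbody>
    <tr><td>Alpha</td><td>100</td></tr>
    <tr><td>Beta</td><td>200</td></tr>
  </tbody>
</table>
<table>
  <caption>Universities (sample)</caption>
  <thead><tr><th>Institution</th><th>Location</th></tr></thead>
  <tbody>
    <tr><td>Example University</td><td>Alpha</td></tr>
  </tbody>
</table>
"""

# Only tables containing text that matches "Universities" are returned.
matched = pd.read_html(StringIO(html), match="Universities")

# The result is a list, so take the first (and here only) DataFrame.
df = matched[0]

print(len(matched))
print(df)
```

Picking a match string that appears in only one table, such as its caption, keeps the returned list short and makes matched[0] a safe choice.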

Task

Get a table from the Wikipedia page about Florida and convert it to a DataFrame.

  1. Import the pandas library with the alias pd.
  2. Get the table 'Largest cities or towns in Florida' from the page.
  3. Print the DataFrame df.
