Automating Data Collection from Web Sources
Parse the HTML Content Using BeautifulSoup
BeautifulSoup is a Python library for parsing HTML and XML documents. It builds a parse tree from the document, which makes it straightforward to locate and extract data. It sits on top of an HTML or XML parser and provides Pythonic idioms for iterating over, searching, and modifying the parse tree.
Here is an example of how to use BeautifulSoup to parse an HTML document so that data can then be extracted from it:
from bs4 import BeautifulSoup

# Open the HTML file (assumed here to be UTF-8 encoded) and create a
# BeautifulSoup object using Python's built-in "html.parser"
with open("document.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")
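Once the document is parsed, data can be pulled out of the tree. The following is a minimal sketch rather than part of the original lesson: the `sample_html` string and its contents are hypothetical and stand in for whatever `document.html` actually contains.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the contents of document.html
sample_html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Welcome</h1>
    <a href="https://example.com/a">First link</a>
    <a href="https://example.com/b">Second link</a>
  </body>
</html>
"""

# Parse the string with Python's built-in HTML parser
soup = BeautifulSoup(sample_html, "html.parser")

# Text of the <title> tag
page_title = soup.title.string

# Text of the first <h1> tag
heading = soup.find("h1").get_text()

# The href attribute of every <a> tag in the document
links = [a["href"] for a in soup.find_all("a")]

print(page_title)
print(heading)
print(links)
```

`find` returns the first matching tag, while `find_all` returns a list of every match, so the list comprehension collects one URL per link.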
Task
- Import the BeautifulSoup library.
- Use the BeautifulSoup library to parse the content of the website (html).
- Print the variable that holds the parsed content.
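The steps above can be sketched as follows. This is one possible approach, not the course's official solution. It assumes the page source is already available in a variable named `html`, as the task suggests; the string assigned to it here is a hypothetical placeholder so the snippet runs on its own, and the name `soup` is likewise an illustrative choice.

```python
from bs4 import BeautifulSoup

# Assumption: in the exercise, `html` already holds the page source.
# A short placeholder string is used here instead.
html = "<html><body><p>Hello, BeautifulSoup!</p></body></html>"

# Parse the content with Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")

# Print the parsed document
print(soup)
```

Calling `print(soup.prettify())` instead prints the same tree with indentation, which can be easier to read for larger pages.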
Section 1. Chapter 3