Navigating HTML Document

After reading the HTML document, you have the flexibility to navigate it in several ways. To delve deeper, you can specify a tag just like an attribute. For example, let's examine the <head> element and represent it in a 'structured' form (by employing the .prettify() method).


              123456789101112
            
# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
print(soup.head.prettify())

Feel free to experiment by substituting the .head attribute with .body, for example. As shown above, the <head> element encompasses several children. You can iterate through all the children of elements using a for loop and the .children attribute.


              1234567891011121314
            
# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
# Iterating over all element children
for child in soup.head.children:
  print(child)

Everything was clear?

Thanks for your feedback!

Section 2. Chapter 2

Ask AI

Ask anything or try one of the suggested questions to begin our chat

Course Content

Web Scraping with Python

1. Getting Acquainted with HTML

2. Decoding HTML with Beautiful Soup

What is Beautiful Soup?Navigating HTML Document Challenge: The BeautifulSoup Object Challenge: Iterating Over Lists Working with Specific Elements Working with Paragraph Elements

3. Working with Element Attributes in Beautiful Soup

Attributes and Contents of Element Challenge: Attributes Attributes and Contents of Multiple Elements Challenge: Text from HTML Elements Advanced Search Challenge: Find All

Navigating HTML Document


              123456789101112
            
# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
print(soup.head.prettify())


              1234567891011121314
            
# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
# Iterating over all element children
for child in soup.head.children:
  print(child)

Everything was clear?

Thanks for your feedback!

Section 2. Chapter 2