Web Scraping with Python
Attributes & Contents of Multiple Elements
The attributes and methods discussed in the previous chapter can also be applied to every element with a specific tag, that is, to the results of the .find_all() method. Keep in mind, however, that .find_all() returns a list of elements (a ResultSet) rather than a single element, so attributes and methods such as .attrs must be applied to each element individually. As before, a for loop does the job. For example, let's retrieve the attributes of every <div> element.
    # Importing libraries
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Reading web page
    url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
    page = urlopen(url)
    html = page.read().decode("utf-8")

    # Reading HTML with BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')

    # Printing the attribute dictionary of every <div> element
    for div in soup.find_all('div'):
        print(div.attrs)
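To see why the loop is needed, here is a minimal offline sketch. It assumes a small inline HTML string (sample_html is made up for illustration and is not the course page), so it runs without a network connection. It shows that the result of .find_all() has no .attrs of its own, while each element inside it does.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used here instead of the course page
sample_html = """
<div id="header" class="box wide">Header</div>
<div class="box">Body</div>
<div>No attributes</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
divs = soup.find_all("div")

# The result behaves like a list, so len() and indexing work
print(len(divs))

# Asking the whole result for .attrs fails: only single elements have it
try:
    divs.attrs
    result_has_attrs = True
except AttributeError:
    result_has_attrs = False
print(result_has_attrs)

# Each element has its own .attrs dictionary
all_attrs = [div.attrs for div in divs]
for attrs in all_attrs:
    print(attrs)

# .get() returns None when an attribute is missing instead of raising KeyError
ids = [div.get("id") for div in divs]
print(ids)
```

Note that the class attribute comes back as a list of class names, because an element may carry several classes at once.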
The same approach applies to extracting text. For instance, let's obtain the text of every <p> element.
    # Importing libraries
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Reading web page
    url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
    page = urlopen(url)
    html = page.read().decode("utf-8")

    # Reading HTML with BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')

    # Printing the text of every <p> element
    for p in soup.find_all('p'):
        print(p.get_text())
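When the texts are needed for further processing rather than just printed, one option is to collect them into a list. The sketch below again assumes a made-up inline HTML string (sample_html is hypothetical) and shows one possible way to strip surrounding whitespace and drop empty paragraphs.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used here instead of the course page
sample_html = """
<p>First <b>bold</b> paragraph.</p>
<p>  Second paragraph.  </p>
<p></p>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# .get_text() also includes the text of nested tags such as <b>
texts = [p.get_text().strip() for p in soup.find_all("p")]

# Keep only the paragraphs that actually contain text
non_empty = [text for text in texts if text]

for text in non_empty:
    print(text)
```

A list comprehension is just a compact form of the for loop used above: .get_text() is still called on each element separately.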
Section 3. Chapter 3