Work with Soup
Let's continue exploring BeautifulSoup and learn some important functions. We can extract not only whole tags but also their parts, such as the name or the attributes:
print(soup.div.name)
print(soup.div.attrs)
In the code, we used the .name attribute to get the tag's name and the .attrs attribute, which returns all of the tag's attributes as a dictionary.
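To make the two attributes concrete, here is a minimal sketch on a small HTML fragment. The fragment and the variable names are made up for illustration and are not the page used in this lesson. Indexing the tag like a dictionary and calling .get() are standard BeautifulSoup features, though they are not covered above.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, used only for illustration
html = '<div id="main" class="content wide"><p>Hello</p></div>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.div
print(tag.name)    # div
print(tag.attrs)   # dictionary with the keys 'id' and 'class'

# A single attribute can be read like a dictionary entry
print(tag["id"])         # main
# class may hold several values, so it comes back as a list
print(tag["class"])
# .get() returns None instead of raising an error for a missing attribute
print(tag.get("title"))  # None
```

Note that soup.div refers to the first div in the document only.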
Another useful method is .get_text(), which extracts all the text from the parsed document without the HTML tags.
print(soup.get_text())
The output will contain a lot of extra blank lines. This happens because of the newline characters in the original HTML file.
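If the blank lines get in the way, .get_text() accepts the separator and strip arguments. The sketch below uses a made-up HTML string rather than the lesson's page, and the variable names raw and clean are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with newlines between the tags
html = """<html>
<body>
<h1>Title</h1>

<p>First paragraph.</p>
</body>
</html>"""
soup = BeautifulSoup(html, "html.parser")

# Default call: the newlines from the source are kept
raw = soup.get_text()
print(repr(raw))

# strip=True trims each piece of text and drops whitespace-only pieces;
# separator is placed between the remaining pieces
clean = soup.get_text(separator="\n", strip=True)
print(clean)
```

Printing repr(raw) makes the newline characters visible, which is a quick way to see where the blank lines come from.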
Similarly, you can get only the text inside an extracted tag using the .get_text() method or the .string attribute:
print(soup.h1.string)
print(soup.h1.get_text())
If a tag contains more than one child (or none at all), it is unclear what .string should refer to, so .string is None.
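The difference between .string and .get_text() is easiest to see side by side. This is a small sketch on an invented fragment, not on the lesson's page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: one simple tag, one tag with mixed content, one empty tag
html = "<h1>Title</h1><p>Some <b>bold</b> text</p><div></div>"
soup = BeautifulSoup(html, "html.parser")

# h1 has exactly one child, a piece of text
print(soup.h1.string)        # Title

# p has three children: text, a <b> tag, and more text
print(soup.p.string)         # None
print(soup.p.get_text())     # Some bold text

# div has no children at all
print(soup.div.string)       # None
print(repr(soup.div.get_text()))  # ''
```

In short, .get_text() always returns a string, while .string may be None, so .get_text() is usually the safer choice when the tag's structure is not known in advance.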
Task
Here you will work on the same page about Christ the Redeemer as in the previous task.
- Import the BeautifulSoup library.
- Print the attributes of the p tag.
- Print only the text of the ul tags.
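One possible way to approach these steps is sketched below. The HTML string is a stand-in, since the actual task page is not reproduced here, and the names html and ul_texts are hypothetical. The find_all() method is not introduced in this lesson; it is used here on the assumption that the page has several ul tags.

```python
from bs4 import BeautifulSoup

# Stand-in HTML; replace it with the content of the real page
html = """
<p class="intro" id="lead">Placeholder paragraph.</p>
<ul><li>First item</li><li>Second item</li></ul>
<ul><li>Third item</li></ul>
"""
soup = BeautifulSoup(html, "html.parser")

# Attributes of the first p tag, as a dictionary
print(soup.p.attrs)

# Text of every ul tag, without the HTML markup
ul_texts = [ul.get_text(separator=" ", strip=True) for ul in soup.find_all("ul")]
for text in ul_texts:
    print(text)
```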