Beautiful Soup - descendants Property



Property Description

The descendants property of a Tag object (or of the BeautifulSoup object itself) in the Beautiful Soup API lets you iterate over everything nested under it: its direct children, their children, and so on, including text nodes. The property returns a generator that yields these elements in document order, that is, depth-first rather than breadth-first.

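In a depth-first (pre-order) traversal, each node is visited before its children, and the whole subtree under a node is explored before moving on to that node's next sibling. A breadth-first traversal, by contrast, would visit every node at one depth before descending to the next level.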
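The traversal order can be checked directly on a small tree. The snippet below is a minimal sketch; the sample markup and the helper name label are made up for illustration and are not part of the Beautiful Soup API.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample markup: <p> has nested content, <span> is its sibling.
html = "<div><p>one <b>two</b></p><span>three</span></div>"
soup = BeautifulSoup(html, "html.parser")

def label(el):
    # Illustrative helper: show tags as <name> and text nodes as their text.
    return f"<{el.name}>" if isinstance(el, Tag) else str(el)

order = [label(el) for el in soup.div.descendants]
print(order)
```

Everything inside <p> is yielded before its sibling <span>, which is the order in which the elements appear in the document.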

Syntax

tag.descendants

Return value

The descendants property returns a generator object.
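Because the return value is a generator, items are produced on demand and a given generator can be consumed only once. The following sketch, using a made-up one-line document, shows one way to observe this; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>Hello <b>World</b></p>", "html.parser")

descendants = soup.p.descendants   # no traversal has happened yet
first = next(descendants)          # pull a single item on demand
rest = list(descendants)           # consume whatever is left
again = list(descendants)          # the generator is now exhausted

# Accessing the property again gives a fresh generator.
fresh = list(soup.p.descendants)

print(repr(first), len(rest), again, len(fresh))
```

If the elements are needed more than once, wrapping the property in list() keeps them available for repeated passes.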

Example 1

In the code below, we have an HTML document with nested unordered list tags. We iterate over all descendants of the parsed document, which are yielded in document (depth-first) order.

html = '''
   <ul id='outer'>
   <li class="mainmenu">Accounts</li>
      <ul>
      <li class="submenu">Anand</li>
      <li class="submenu">Mahesh</li>
      </ul>
   <li class="mainmenu">HR</li>
      <ul>
      <li class="submenu">Anil</li>
      <li class="submenu">Milind</li>
      </ul>
   </ul>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Iterate from the BeautifulSoup object itself, so the outer <ul> is included.
# Whitespace between tags is also yielded as text nodes.
for desc in soup.descendants:
    print(desc)

Output

<ul id="outer">
<li class="mainmenu">Accounts</li>
<ul>
<li class="submenu">Anand</li>
<li class="submenu">Mahesh</li>
</ul>
<li class="mainmenu">HR</li>
<ul>
<li class="submenu">Anil</li>
<li class="submenu">Milind</li>
</ul>
</ul>

<li class="mainmenu">Accounts</li>
Accounts
<ul>
<li class="submenu">Anand</li>
<li class="submenu">Mahesh</li>
</ul>

<li class="submenu">Anand</li>
Anand
<li class="submenu">Mahesh</li>
Mahesh

<li class="mainmenu">HR</li>
HR
<ul>
<li class="submenu">Anil</li>
<li class="submenu">Milind</li>
</ul>

<li class="submenu">Anil</li>
Anil
<li class="submenu">Milind</li>
Milind
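The blank lines in the output come from whitespace-only text nodes between the tags. If only the tags or only the visible text are of interest, the generator can be filtered. The sketch below uses a shortened version of the list above and starts from the outer <ul> tag, so that tag itself is not part of the result; the variable names are illustrative.

```python
from bs4 import BeautifulSoup, Tag

html = '''
<ul id="outer">
   <li class="mainmenu">Accounts</li>
   <ul>
      <li class="submenu">Anand</li>
   </ul>
</ul>
'''
soup = BeautifulSoup(html, "html.parser")
outer = soup.find("ul", {"id": "outer"})

# Keep only Tag objects, skipping every text node.
tags_only = [el.name for el in outer.descendants if isinstance(el, Tag)]

# Keep only text nodes that contain something other than whitespace.
texts = [
    str(el).strip()
    for el in outer.descendants
    if not isinstance(el, Tag) and el.strip()
]

print(tags_only)
print(texts)
```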

Example 2

In the following example, we list the descendants of the <head> tag.

html = """
<html><head><title>TutorialsPoint</title></head>
<body>
<p>Hello World</p>
"""
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
tag = soup.head
for element in tag.descendants:
    print(element)

Output

<title>TutorialsPoint</title>
TutorialsPoint