Using BeautifulSoup to search HTML for a string

The following line looks for a NavigableString that is exactly ‘Python’, and finds none:

>>> soup.body.findAll(text='Python')
[]

Note that the following NavigableString is found:

>>> soup.body.findAll(text='Python Jobs') 
[u'Python Jobs']
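
If you want to reproduce this without the original page, here is a minimal sketch. The HTML is a made-up stand-in (an assumption, not the page from the question), and it uses the bs4 spellings find_all and string=, which correspond to findAll and text= above:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page being searched.
html = """
<html><body>
<a href="/jobs">Python Jobs</a>
<p>Learn Python today</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string is compared against each whole text node.
exact_word = soup.body.find_all(string="Python")
exact_phrase = soup.body.find_all(string="Python Jobs")

print(exact_word)    # no text node is exactly "Python"
print(exact_phrase)  # the link text matches as a whole
```

The word ‘Python’ appears in two text nodes here, but neither node consists of that word alone, so only the full phrase is found.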

Note that a regexp anchored at both ends behaves like the exact string and also finds nothing:

>>> import re
>>> soup.body.findAll(text=re.compile('^Python$'))
[]

So your regexp is looking for any occurrence of ‘Python’ within a NavigableString, not for a NavigableString that is exactly ‘Python’.
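
The difference can be modelled with the standard library alone. This is only a sketch: the list of strings and the matching() helper are hypothetical, and it assumes a compiled pattern is applied with search() semantics, which is consistent with the results above:

```python
import re

# Hypothetical text nodes, standing in for the strings in a parsed page.
text_nodes = ["Python Jobs", "Learn Python today", "Python", "Downloads"]

def matching(nodes, criterion):
    """Return the nodes selected by a plain string or a compiled pattern."""
    if isinstance(criterion, str):
        # A plain string must equal the whole node.
        return [n for n in nodes if n == criterion]
    # A compiled pattern is applied with search(), so it can match anywhere.
    return [n for n in nodes if criterion.search(n)]

exact = matching(text_nodes, "Python")
contains = matching(text_nodes, re.compile("Python"))
anchored = matching(text_nodes, re.compile("^Python$"))

print(exact)
print(contains)
print(anchored)
```

With these sample strings, the plain string and the anchored pattern both select only the node that is exactly ‘Python’, while the unanchored pattern also picks up every node that merely contains it.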
