by shigemk2

当面は技術的なことしか書かない

beautifulsoup findAll

要素を配列に突っ込みながら、Lambdaでフィルタリングもできる

soup.findAll(text="one")
# [u'one']
soup.findAll(text=u'one')
# [u'one']

soup.findAll(text=["one", "two"])
# [u'one', u'two']

soup.findAll(text=re.compile("paragraph"))
# [u'This is paragraph ', u'This is paragraph ']

soup.findAll(text=True)
# [u'Page title', u'This is paragraph ', u'one', u'.', u'This is paragraph ',
#  u'two', u'.']

soup.findAll(text=lambda(x): len(x) < 12)
# [u'Page title', u'one', u'.', u'two', u'.']

https://tdoc.info/beautifulsoup/