html5lib is a library for parsing and serializing HTML documents and fragments in Python, with ports to Dart, PHP, and Ruby.
lxml is a full-featured, high-performance Python library for processing XML and HTML.
Better job than html.parser
"Html5lib parser does a better job than lxml or html.parser handling the debate element in this case"
from question "How do I remove a spurious tag in BeautifulSoup"
Parser generally faster
"Lxml parser is generally faster; html5lib is the most lenient one. This kind of difference would be relevant if you have broken or non-well-formed HTML to parse"
from question "Python beautifulsoup : lxml html.parser"
"The standard html.parser option handles broken HTML less well than other options, while the html5lib option is closest to how a modern browser would handle broken HTML, albeit at a slower rate than lxml would handle HTML parsing"
from question "Beautiful Soup Remove Tag Error"
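One way to see the differences described above is to feed the same broken markup to each parser and compare the repaired trees. The following is a minimal sketch, not taken from any of the quoted answers: it assumes Beautiful Soup (`bs4`) is installed, the sample markup and the helper names `available_parsers` and `compare_parsers` are hypothetical, and only the parsers actually installed are exercised.

```python
import importlib.util

# Hypothetical sample of broken HTML: a stray </b> and two unclosed <p> tags.
MARKUP = "<p>one</b> two<p>three"


def available_parsers():
    """Return the Beautiful Soup parser names usable in this environment."""
    names = ["html.parser"]  # backed by the standard library
    for module in ("lxml", "html5lib"):
        # find_spec returns None when the package is not installed
        if importlib.util.find_spec(module) is not None:
            names.append(module)
    return names


def compare_parsers(markup):
    """Map each available parser name to the document it builds from `markup`."""
    from bs4 import BeautifulSoup  # third-party; imported lazily

    return {name: str(BeautifulSoup(markup, name)) for name in available_parsers()}
```

Printing the items of `compare_parsers(MARKUP)` side by side shows how each parser repairs the stray end tag and the unclosed paragraphs. The exact output depends on the installed versions, so it is not reproduced here.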
"Lxml is the faster parser and can handle broken HTML quite well; html5lib comes closest to how your browser would parse broken HTML but is a lot slower"
from question "BeautifulSoup: how to ignore spurious end tags"
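The speed claims in these answers can be checked on your own documents with a rough timing harness. This is only a sketch under stated assumptions: `make_broken_markup` and `time_parser` are hypothetical helper names, the generated markup is an arbitrary stand-in for real input, Beautiful Soup is assumed to be installed, and the parser passed in must be installed as well.

```python
import timeit


def make_broken_markup(rows):
    """Build a hypothetical test document with unclosed <tr>/<td> tags and stray </b> tags."""
    body = "".join(f"<tr><td>cell {i}</b>" for i in range(rows))
    return "<table>" + body + "</table>"


def time_parser(markup, parser, number=5):
    """Return the average number of seconds Beautiful Soup takes to parse `markup`."""
    from bs4 import BeautifulSoup  # third-party; imported lazily

    total = timeit.timeit(lambda: BeautifulSoup(markup, parser), number=number)
    return total / number
```

For example, `time_parser(make_broken_markup(1000), "lxml")` can be compared against the same call with `"html5lib"` or `"html.parser"`. Results vary with the machine, the library versions, and the shape of the input, so treat them as indicative only.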