

Software library for parsing XML documents.


lxml is a full-featured, high-performance Python library for processing XML and HTML.



libxml2 and libxslt also come with their own, much lower-level Python bindings, but lxml is much more straightforward and Pythonic, and it seems to have great performance as well.

from question  

Does Python 2.5 include a package to natively transform an XML document?
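
As a rough illustration of the kind of transformation this question asks about, here is a minimal sketch that applies an XSLT stylesheet with lxml. The stylesheet, the sample document, and the transform helper are all made up for this example; only etree.XML and etree.XSLT come from lxml itself.

```python
from lxml import etree

# Hypothetical stylesheet: turns <items>/<item> into an HTML-like list.
XSLT_SOURCE = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/items">
    <ul><xsl:apply-templates select="item"/></ul>
  </xsl:template>
  <xsl:template match="item">
    <li><xsl:value-of select="."/></li>
  </xsl:template>
</xsl:stylesheet>
"""

# Hypothetical input document.
XML_SOURCE = b"<items><item>one</item><item>two</item></items>"


def transform(xml_bytes, xslt_bytes):
    """Parse both inputs and return the transformed document as a string."""
    transformer = etree.XSLT(etree.XML(xslt_bytes))  # compile the stylesheet
    result = transformer(etree.XML(xml_bytes))       # apply it to the document
    return str(result)


output = transform(XML_SOURCE, XSLT_SOURCE)
print(output)
```

The compiled transformer object can be reused for many documents, which is usually preferable to recompiling the stylesheet each time.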

It's installed with lxml. In fact, you probably want to use lxml instead of libxml2, because lxml is based on libxml2 and is more Pythonic; install it with sudo pip install lxml. The libxml2 page says: "Note that some of the Python purists dislike the default set of Python bindings. Rather than complaining, I suggest they have a look at lxml, the more Pythonic bindings for libxml2 and libxslt, and check the mailing list."

from question  

Cannot Install libxml2 in virtualenv

You're not using lxml wrong, and it's not that lxml doesn't support preserving whitespace in this scenario, as so many other SO entries might have you think; it's just that you were unwittingly using a version of libxml2 with a bug that has since been fixed.

from question  

Why does lxml.html sometimes swallow/remove whitespace instead of preserving it?
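
Since the answer above attributes the behaviour to the libxml2 version in use, it can help to check which versions an lxml installation is actually running against. The sketch below reads the version tuples that lxml exposes on its etree module; the helper name libxml2_versions is made up for this example.

```python
from lxml import etree


def libxml2_versions():
    """Collect the version tuples lxml reports for itself and libxml2."""
    return {
        "lxml": etree.LXML_VERSION,
        # libxml2 version loaded at runtime
        "libxml2_runtime": etree.LIBXML_VERSION,
        # libxml2 version lxml was compiled against
        "libxml2_compiled": etree.LIBXML_COMPILED_VERSION,
    }


versions = libxml2_versions()
for name, version in versions.items():
    print(name, ".".join(str(part) for part in version))
```

If the runtime and compiled versions differ, or the runtime version is old, upgrading libxml2 (or installing an lxml build that bundles a newer one) is the first thing to try.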

The chapter starts with a short course on XML (a general discussion, but built around the Atom syndication feed example), then continues with the standard xml.etree.ElementTree, and finally moves on to the third-party lxml, which implements more behind the same interface (full XPath 1.0, based on libxml2).

from question  

Reading text from XML nodes using Python's libxml2
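
The shared interface mentioned above means the same code can often run on either library. The following sketch uses a common import fallback and reads text from nodes in a small Atom-style feed; the feed content is invented for this example.

```python
try:
    from lxml import etree  # third-party, backed by libxml2
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback

# Hypothetical Atom-style feed used only for illustration.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

# Prefix-to-URI mapping used in the path expression below.
NS = {"atom": "http://www.w3.org/2005/Atom"}

root = etree.fromstring(FEED)

# findall() with a namespace map works in both libraries.
entry_titles = [t.text for t in root.findall("atom:entry/atom:title", NS)]
print(entry_titles)
```

When lxml is the library in use, the same query can also be written as root.xpath("//atom:entry/atom:title/text()", namespaces=NS); the standard-library module has no xpath() method and only supports a limited path subset.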

Between lxml and BeautifulSoup, lxml is the closer equivalent to Nokogiri, because it is based on libxml2 and supports XPath and CSS selectors.

from question  

Going from Ruby to Python : Crawlers

lxml is much easier to use than the XML libraries included in the standard Python library. It's a binding for the C library libxml2, so I'm assuming it's also faster.

from question  

Improving text extraction routine from XML

The parsers page in the lxml docs goes into more detail about setting up a parser and iterating over the contents.

from question  

Python libxml2 reader and XML_PARSE_RECOVER
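
As a rough lxml counterpart to libxml2's XML_PARSE_RECOVER, the sketch below sets up a parser with recover=True and iterates over the elements it managed to build. The malformed input is invented for this example, and how much of a broken document is recovered depends on libxml2, so treat the result as best effort.

```python
from lxml import etree

# Hypothetical malformed input: the second <item> is never closed.
BROKEN = b"<root><item>one</item><item>two</root>"

# recover=True asks libxml2 to keep going after well-formedness errors.
parser = etree.XMLParser(recover=True)
root = etree.fromstring(BROKEN, parser)

# Iterate over whatever <item> elements the parser could build.
texts = [element.text for element in root.iter("item")]
print(texts)

# The parser keeps a log of the problems it worked around.
for entry in parser.error_log:
    print(entry.line, entry.message)
```

Without recover=True, the same input raises etree.XMLSyntaxError, which is usually the safer default when the data is expected to be well formed.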

Data comes from Stack Exchange with CC-BY-SA-4.0