O'Reilly European Open Source Convention - October 17-20, 2005 - Amsterdam, The Netherlands


lxml, a Pythonic Binding for libxml2
Martijn Faassen, Infrae

Track: Python
Date: Wednesday, 19 October 2005
Time: 10:45 - 11:30
Location: St. John's Room II

The C libraries libxml2 and libxslt have major benefits: they offer standards-compliant XML support, are full-featured, are actively maintained by XML experts, and are very fast.

These libraries already ship with Python bindings, but those bindings have problems. In particular, they are very low-level and C-ish (not Pythonic), underdocumented, and huge; their API uses UTF-8 byte strings instead of Python unicode strings; they can cause segfaults from Python; and the programmer has to manage memory manually.

lxml is a new Python binding for libxml2 and libxslt, completely independent of the existing Python bindings. Its aims are a Pythonic API, documentation, Python unicode strings in the API, safety (no segfaults), and no manual memory management.

lxml aims to provide a Pythonic API by following the ElementTree API as closely as possible. We're trying to avoid inventing too many new APIs, and to spare you from having to learn new things: XML is complicated enough.
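To give a feel for what an ElementTree-style API looks like, here is a minimal sketch. It is an illustration, not code from the talk: it assumes that `lxml.etree` exposes the same core calls as ElementTree (`Element`, `SubElement`, `findall`, `tostring`, `fromstring`), and it falls back to the standard library's `xml.etree.ElementTree` when lxml is not installed. The element names and values are made up for the example.

```python
# Prefer lxml if it is installed; otherwise fall back to the standard
# library's ElementTree, which offers the same core API used below.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Build a small tree. The tag names and attributes are illustrative only.
root = etree.Element("doc")
first = etree.SubElement(root, "item", id="1")
first.text = "caf\u00e9"  # text is an ordinary Python unicode string
second = etree.SubElement(root, "item", id="2")
second.text = "second"

# An element iterates over its children like a list, and attributes are
# read with a dict-style get().
ids = [child.get("id") for child in root]
texts = [child.text for child in root.findall("item")]

# Serialize to a unicode string and parse it back. No manual memory
# management is involved at any point.
xml_text = etree.tostring(root, encoding="unicode")
reparsed = etree.fromstring(xml_text)
```

Elements here behave like familiar Python containers, which is the sense in which the API is "Pythonic": there are no explicit free calls and no byte-string encoding steps in user code.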

