EPUB to XML conversion library in Python

Closed - This job posting has been filled and work has been completed.

Job Description

I'm looking for an experienced Python developer, with an attention to detail. Ideally you know a bit the lxml library, XSLT transforms, and the EPUB format.

The job is to write a small Python library that:
(Part 1) converts an EPUB file to an XML document with a specific syntax (see below).
(Part 2) the opposite, ie. convert an XML document to EPUB.

XML document syntax
======================
I will give you the complete reference for the syntax but essentially it is simply:
<document>
<title>My doc</title>
<chapter>
<page>
<title>Introduction</title>
<section>
<subsection>
<text>Lorem ipsum dolor sit amet...</text>
</subsection>

[...]

</section>
</page>
</chapter>
</document>

For (Part 1), I will provide a small open-source library to read an EPUB file. You will then use lxml to parse the EPUB pages and convert them to XML. The difficulty is to retrieve the document structure (section & subsection) from the EPUB html titles h1..h6.

(Part 2) will probably need an XSLT transformation and an EPUB creation library.