Posted to user@poi.apache.org by Angelo zerr <an...@gmail.com> on 2012/10/12 11:09:30 UTC

POI XWPF converters : docx->xhtml and docx->pdf converters

Hi POI Team,

I am contacting you because we have developed two docx converters based on POI
(in other words, XWPFDocument converters) in our
XDocReport <http://code.google.com/p/xdocreport/> project:


   1. *docx->xhtml* converter: this converter loads a docx into a
   POI XWPFDocument, loops over each structure of the document
   (XWPFParagraph, XWPFTable, etc.) and generates HTML content with SAX (and
   not with DOM, as you have done with your doc->html converter). Using SAX
   makes it possible to merge several docx files converted to HTML into the
   same page by using SAX pipelines.
   2. *docx->pdf* converter: I'm not sure that you will be interested in
   this converter because it is based on iText and not FOP. Why iText? Because
   it is faster to create PDF structures directly than to generate FO
   content and parse it to generate PDF structures with FOP. Our goal was to
   provide a very fast docx->pdf converter.
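
As a rough illustration of the SAX approach described in item 1, here is a
minimal sketch using only the JDK. It is not the XDocReport code: the
Paragraph and Table classes are hypothetical stand-ins for XWPFParagraph and
XWPFTable, and the XHTML namespace and styling are left out. It shows body
elements being walked and written as SAX events, and two documents being
merged into one page by sharing a single handler.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class SaxXhtmlSketch {

    // Hypothetical stand-ins for the body elements of a loaded document.
    interface BodyElement {}

    static final class Paragraph implements BodyElement {
        final String text;
        Paragraph(String text) { this.text = text; }
    }

    static final class Table implements BodyElement {
        final List<List<String>> rows;
        Table(List<List<String>> rows) { this.rows = rows; }
    }

    private static final AttributesImpl NO_ATTRS = new AttributesImpl();

    static void start(ContentHandler out, String name) throws SAXException {
        out.startElement("", name, name, NO_ATTRS);
    }

    static void end(ContentHandler out, String name) throws SAXException {
        out.endElement("", name, name);
    }

    static void text(ContentHandler out, String s) throws SAXException {
        out.characters(s.toCharArray(), 0, s.length());
    }

    // Walks one document body and emits SAX events; no DOM tree is built.
    static void convertBody(List<BodyElement> body, ContentHandler out) throws SAXException {
        for (BodyElement e : body) {
            if (e instanceof Paragraph) {
                start(out, "p");
                text(out, ((Paragraph) e).text);
                end(out, "p");
            } else if (e instanceof Table) {
                start(out, "table");
                for (List<String> row : ((Table) e).rows) {
                    start(out, "tr");
                    for (String cell : row) {
                        start(out, "td");
                        text(out, cell);
                        end(out, "td");
                    }
                    end(out, "tr");
                }
                end(out, "table");
            }
        }
    }

    // Renders several documents into one page: each body is written into its
    // own <div>, and all of them go through the same handler.
    static String renderPage(List<List<BodyElement>> documents) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            handler.setResult(new StreamResult(writer));

            handler.startDocument();
            start(handler, "html");
            start(handler, "body");
            for (List<BodyElement> doc : documents) {
                start(handler, "div");
                convertBody(doc, handler);
                end(handler, "div");
            }
            end(handler, "body");
            end(handler, "html");
            handler.endDocument();
            return writer.toString();
        } catch (SAXException | TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds two tiny sample documents and merges them into one page.
    static String demo() {
        List<BodyElement> first = Arrays.<BodyElement>asList(
                new Paragraph("First document"),
                new Table(Arrays.asList(Arrays.asList("a", "b"))));
        List<BodyElement> second = Arrays.<BodyElement>asList(
                new Paragraph("Second document"));
        return renderPage(Arrays.asList(first, second));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because convertBody only needs a ContentHandler, the same method could in
principle feed any other SAX consumer (a filter, a serializer, another
pipeline stage) without change.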

These converters are not finished (we are still improving them), but I think they
can be used. We also handle complex styles (e.g. for a paragraph linked to
StyleA, an indentation defined in StyleB is retrieved when StyleA extends
StyleB; tblLook is handled to set the style of firstRow, lastRow, etc.).
The current release is 0.9.8, but its results are very bad. The 1.0.0 release
will improve the converters a lot.
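
The style inheritance case mentioned above (StyleA extends StyleB) can be
sketched as a walk up the parent chain until a value is found. This is only
an illustration under assumptions, not the converter's implementation: the
Style class and resolveIndentLeft method are hypothetical, and the
indentation is taken to be a left indent in twips.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class StyleInheritanceSketch {

    // Hypothetical simplified style: an id, an optional parent id and an
    // optional left indentation (null means "not defined on this style").
    static final class Style {
        final String id;
        final String basedOn;
        final Integer indentLeft;

        Style(String id, String basedOn, Integer indentLeft) {
            this.id = id;
            this.basedOn = basedOn;
            this.indentLeft = indentLeft;
        }
    }

    // Returns the first indentation found on the style or on its ancestors,
    // or null if none defines it. The 'seen' set guards against cycles.
    static Integer resolveIndentLeft(Map<String, Style> styles, String styleId) {
        Set<String> seen = new HashSet<>();
        String current = styleId;
        while (current != null && seen.add(current)) {
            Style style = styles.get(current);
            if (style == null) {
                return null;
            }
            if (style.indentLeft != null) {
                return style.indentLeft;
            }
            current = style.basedOn;
        }
        return null;
    }

    // StyleA defines no indentation itself, so the value comes from StyleB.
    static Integer demo() {
        Map<String, Style> styles = new HashMap<>();
        styles.put("StyleB", new Style("StyleB", null, 720));
        styles.put("StyleA", new Style("StyleA", "StyleB", null));
        return resolveIndentLeft(styles, "StyleA");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```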

If you want, you can test the 1.0.0 version of our converters in our live demo at
http://xdocreport-converter.opensagres.cloudbees.net/

If you want to see the sources:


   - XWPF core converter :
   http://code.google.com/p/xdocreport/source/browse/#git%2Fthirdparties-extension%2Forg.apache.poi.xwpf.converter.core
   - XWPF xhtml converter :
   http://code.google.com/p/xdocreport/source/browse/#git%2Fthirdparties-extension%2Forg.apache.poi.xwpf.converter.xhtml
   - XWPF pdf converter:
   http://code.google.com/p/xdocreport/source/browse/#git%2Fthirdparties-extension%2Forg.apache.poi.xwpf.converter.pdf

I hope you will like our docx converters.

Regards Angelo

Re: POI XWPF converters : docx->xhtml and docx->pdf converters

Posted by Yegor Kozlov <ye...@dinom.ru>.
Yes, sure.

On Fri, Oct 12, 2012 at 3:35 PM, Angelo zerr <an...@gmail.com> wrote:
> Hi Yegor,
>
> Many thanks for your reply.
>
> 2012/10/12 Yegor Kozlov <ye...@dinom.ru>
>
>> Thanks for this.
>>
>> Would you care to write a short abstract to put on the POI Case
>> Studies page: http://poi.apache.org/casestudies.html ?
>>
>
> Yes, I can. Would it be possible for me to post a description of our
> converters on this mailing list, so that you can add it to
> http://poi.apache.org/casestudies.html ?
>
> Regards Angelo

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: POI XWPF converters : docx->xhtml and docx->pdf converters

Posted by Angelo zerr <an...@gmail.com>.
Hi Yegor,

Many thanks for your reply.

2012/10/12 Yegor Kozlov <ye...@dinom.ru>

> Thanks for this.
>
> Would you care to write a short abstract to put on the POI Case
> Studies page: http://poi.apache.org/casestudies.html ?
>

Yes, I can. Would it be possible for me to post a description of our
converters on this mailing list, so that you can add it to
http://poi.apache.org/casestudies.html ?

Regards Angelo


> We can also add a link in the POI FAQ. Cross-format conversion is a
> common topic on the mailing lists and your code is a good addition to
> existing code provided by POI.
>
> Yegor

Re: POI XWPF converters : docx->xhtml and docx->pdf converters

Posted by Yegor Kozlov <ye...@dinom.ru>.
Thanks for this.

Would you care to write a short abstract to put on the POI Case
Studies page: http://poi.apache.org/casestudies.html ?
We can also add a link in the POI FAQ. Cross-format conversion is a
common topic on the mailing lists and your code is a good addition to
existing code provided by POI.

Yegor

