Posted to user@poi.apache.org by Алексей Ушаровский <us...@mail.ru> on 2016/05/18 10:55:08 UTC

Re[8]: ltChunk using in POI

I think I found something by reading the sources of the Apache POI library!

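import javax.xml.namespace.QName;
import org.apache.xmlbeans.impl.values.XmlComplexContentImpl;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDocument1;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTBodyImpl;

// Qualified names of the <w:altChunk> element and its r:id attribute.
QName ALTCHUNK$58 = new QName("http://schemas.openxmlformats.org/wordprocessingml/2006/main", "altChunk");
QName ID$2 = new QName("http://schemas.openxmlformats.org/officeDocument/2006/relationships", "id");

// Append an empty altChunk element to the body through the XmlBeans store ("doc" is presumably an XWPFDocument).
CTDocument1 ctDoc = doc.getDocument();
CTBodyImpl b = (CTBodyImpl) ctDoc.getBody();
XmlComplexContentImpl ac = (XmlComplexContentImpl) b.get_store().add_element_user(ALTCHUNK$58);

// Set the r:id attribute; add_attribute_user returns a TypeStoreUser, hence the cast.
org.apache.xmlbeans.SimpleValue target = (org.apache.xmlbeans.SimpleValue) ac.get_store().add_attribute_user(ID$2);
target.setStringValue(id);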

This code creates a correct altChunk element in the body of the docx document. Now I'm thinking about how to put another docx inside this one with the correct content type and relationship object.
As I understand it, this approach is useful for many different low-level modifications of OOXML.
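
For the open question above (wiring the embedded docx into the package), here is a minimal sketch of the three XML changes that seem to be needed. It deliberately does not use POI: it edits the package XML with plain JDK DOM, so it only illustrates what the parts have to contain. The class name AltChunkParts, the relationship id rId100 and the part name /word/chunk1.docx are hypothetical. The relationship type ending in aFChunk and the content type used for the embedded document are my understanding of what Word expects and should be treated as assumptions to verify.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: shows the XML each package part needs.
class AltChunkParts {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships";
    static final String CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types";
    // Assumed values for an embedded WordprocessingML chunk.
    static final String CHUNK_REL_TYPE = R_NS + "/aFChunk";
    static final String CHUNK_CONTENT_TYPE =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }

    static String serialize(Document d) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            t.transform(new DOMSource(d), new StreamResult(w));
            return w.toString();
        } catch (Exception e) {
            throw new IllegalStateException("cannot serialize XML", e);
        }
    }

    // word/document.xml: add <w:altChunk r:id="..."/> to w:body, before w:sectPr if present.
    static String addAltChunk(String documentXml, String relId) {
        Document d = parse(documentXml);
        Element body = (Element) d.getElementsByTagNameNS(W_NS, "body").item(0);
        if (body == null) throw new IllegalArgumentException("no w:body element");
        Element chunk = d.createElementNS(W_NS, "w:altChunk");
        chunk.setAttributeNS(R_NS, "r:id", relId);
        Node sectPr = null;
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (W_NS.equals(n.getNamespaceURI()) && "sectPr".equals(n.getLocalName())) sectPr = n;
        }
        body.insertBefore(chunk, sectPr); // a null reference node means "append"
        return serialize(d);
    }

    // word/_rels/document.xml.rels: relationship from the main document to the chunk part.
    static String addRelationship(String relsXml, String relId, String target) {
        Document d = parse(relsXml);
        Element rel = d.createElementNS(PKG_REL_NS, "Relationship");
        rel.setAttribute("Id", relId);
        rel.setAttribute("Type", CHUNK_REL_TYPE);
        rel.setAttribute("Target", target);
        d.getDocumentElement().appendChild(rel);
        return serialize(d);
    }

    // [Content_Types].xml: declare the content type of the chunk part.
    static String addContentTypeOverride(String contentTypesXml, String partName) {
        Document d = parse(contentTypesXml);
        Element override = d.createElementNS(CT_NS, "Override");
        override.setAttribute("PartName", partName);
        override.setAttribute("ContentType", CHUNK_CONTENT_TYPE);
        d.getDocumentElement().appendChild(override);
        return serialize(d);
    }

    public static void main(String[] args) {
        String doc = "<w:document xmlns:w=\"" + W_NS + "\" xmlns:r=\"" + R_NS + "\">"
            + "<w:body><w:p/><w:sectPr/></w:body></w:document>";
        String rels = "<Relationships xmlns=\"" + PKG_REL_NS + "\"/>";
        String types = "<Types xmlns=\"" + CT_NS + "\"/>";
        System.out.println(addAltChunk(doc, "rId100"));
        System.out.println(addRelationship(rels, "rId100", "chunk1.docx"));
        System.out.println(addContentTypeOverride(types, "/word/chunk1.docx"));
    }
}
```

The bytes of the second docx would still have to be written into the zip package under the same part name (here /word/chunk1.docx); that step is not shown. The id used in the relationship must match the r:id set on the altChunk element.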

>Tuesday, 17 May 2016, 22:43 +03:00, from Алексей Ушаровский <us...@mail.ru>:
>
>
>Thank you very much for sharing your experience. I'm only beginning to work with office documents in Java and am now searching for a good library for my task. It's very interesting to read what you've written.
>Have you ever worked with the docx4j library?
>--
>Sent from Mail.Ru for Android, Tuesday, 17 May 2016, 21:20 +03:00, from "Javen O'Neal" <javenoneal@gmail.com>:
>
>>I haven't worked with POI's XWPF package yet, so your guess is better than
>>mine.
>>
>>However, I would assume that POI doesn't support document concatenating and
>>that people have written their own code to define how they want to combine
>>the documents. For example, how do you combine two documents with different
>>headers and footers, different style themes, different XML namespaces,
>>different VBA macros, etc? How do you resolve collisions of named fields?
>>Do you start the second document on the last page of the first document or
>>on a new page? How do you handle different page layouts, margins, and
>>printer settings? Combining bibliographies?
>>
>>It might be possible to write a function that indisputably combines two
>>very simple documents, but it'd be tricky to implement something that
>>satisfies everyone's needs that handles complex documents.
>>
>>If all you're after is merging the paragraphs, then you could for-loop over
>>the paragraphs and copy them into the first document, creating a new page
>>before you start copying if that's the behavior you want. If you control
>>the format of the files that will be merged (say documents are rich text
>>plus pictures, text boxes, and tables), you might be able to get away with
>>this. If you don't have control, this would take a lot of effort with POI.
>>
>>If you're just after the text content, look at XWPFWordExtractor
>> http://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xwpf/extractor/XWPFWordExtractor.java?view=markup
>>
>>If you're willing to embed a document rather than joining a document, you
>>could use UpdateEmbeddedDoc
>> http://svn.apache.org/viewvc/poi/trunk/src/examples/src/org/apache/poi/xwpf/usermodel/UpdateEmbeddedDoc.java?view=markup
>>
>>For non-POI solutions:
>>Look at what LibreOffice/OpenOffice do, both in application behavior and
>>source code. I think they have a headless API if you're still evaluating
>>software libraries that meet your needs.
>>
>>If you have MS Office installed on your system, you could use VBA scripts
>>to automate this. You could also write code that remote-controls Word
>>through COM automation.
>>
>>Searching on Google for "altChunk POI", someone said Aspose (a commercial
>>$$$ library) has support for altChunk. I recently migrated 3 software
>>products at my company from Aspose Cells to POI Spreadsheet due to
>>increased licensing costs and poorly documented API, no source code access
>>(to make up for the API documentation), inability to add missing features
>>with a forked version, and lack of transparency of memory/speed performance
>>due to closed source.
>>
>>Best of luck solving your problem!
>>On May 17, 2016 09:29, "Алексей Ушаровский" <  usharik@mail.ru > wrote:
>>
>>
>>And another question: is there any standard way to join two docx files into
>>one document with POI? Unfortunately, I found nothing about it on the internet.
>>--
>>Sent from Mail.Ru for Android, Tuesday, 17 May 2016, 18:55 +03:00, from
>>Алексей Ушаровский <usharik@mail.ru>:
>>
>>>
>>>Thank you, Javen!
>>>As I understand it, the problem is not only in the high-level interface: POI
>>>has its own classes, which implement many but not all OOXML items.
>>>--
>>>Sent from Mail.Ru for Android, Tuesday, 17 May 2016, 18:49 +03:00, from
>>>"Javen O'Neal" <javenoneal@gmail.com>:
>>>
>>>>Yes, if you're willing to write using CT* classes.
>>>>
>>>> http://www.atetric.com/atetric/javadoc/org.apache.poi/ooxml-schemas/1.1/org/openxmlformats/schemas/wordprocessingml/x2006/main/CTBody.html
>>>>
>>>>I couldn't find a higher-level abstraction on top of this in POI though.
>>>>If you get something working, please submit it back to POI so that your
>>>>work can benefit others with a similar problem.
>>>>On May 17, 2016 8:34 AM, "Алексей Ушаровский" <  usharik@mail.ru > wrote:
>>>>
>>>>Hello!
>>>>Is it possible to use altChunk OOXML elements in a docx with the POI library?
>>>>
>>>>--
>>>>Regards,
>>>>Alex


Re[10]: ltChunk using in POI

Posted by Алексей Ушаровский <us...@mail.ru.INVALID>.
I think it's a good goal to develop a new class for AltChunk. But I'm only at the beginning of this path))
Do you know whether there is a place where MS Word or Excel saves logs with detailed errors that occur when it tries to open a corrupted document?

>Wednesday, 18 May 2016, 18:30 +03:00, from "Javen O'Neal" <ja...@gmail.com>:
>
>Thanks for looking into this. If you can figure out how to wrap this into
>an easy-to-use class, certainly submit a patch on bugzilla to get this
>added to the baseline. I'd be happy to help you get this integrated.


Re: Re[8]: ltChunk using in POI

Posted by Javen O'Neal <ja...@gmail.com>.
Thanks for looking into this. If you can figure out how to wrap this into
an easy-to-use class, certainly submit a patch on bugzilla to get this
added to the baseline. I'd be happy to help you get this integrated.