You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by Dhaval Parekh <dh...@gmail.com> on 2009/10/15 00:52:46 UTC

XSSFExportToXML help trying to export xlsx to xml

Hey All,

I am trying to convert an Excel 2007 (xlsx) file to XML
I am trying to use the XSSFExportToXML However I am running into issues.
Please any help is greatly appreciated.


File inputfile = new File("test.xlsx");
FileInputStream inp = new FileInputStream(inputfile);

XSSFWorkbook wb1=new XSSFWorkbook(inp);

Collection temp = new ArrayList();
temp = wb1.getCustomXMLMappings();

System.out.println(temp.size());

// I GET THE SIZE OF THIS AS 0, I have tried using MapInfo trying to get
wb1.getMapInfo() but i get null,


// rest is just trial
XSSFMap = temp[0];

XSSFExportToXml ex = new XSSFExportToXml((xssfmap);
			
OutputStream writer = new FileOutputStream("outxml.txt");
ex.exportToXML(writer ,false);


Thanks in advance.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Dhaval Parekh <dh...@gmail.com>.
Hey David,

really appreciate your help. Thanks for the info below. 
I have tried all of those options Reading cells, using extractor, xssf sax. 
They all give me the data that are in the regular sheets but not the data that
is in the pivot table cache which is what I am hunting for.


One thing I tried that partially worked was to get the package parts and dump
then in a zip
XSSFWorkbook wb = new XSSFWorkbook(inp);
String outFilename = "outfile.zip";
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(outFilename));

for (POIXMLDocumentPart p : wb.getRelations()) {
  PackagePart pt = p.getPackagePart();
  pt.save(out);

}
out.close();

Upon unzipping I can see the xml files however they do not have the schema and
hence the data is pretty much useless.
For Example one of the XML files was 
pivotCacheDefinition1.xml

<pivotCacheDefinition r:id="rId1" refreshedBy=""
refreshedDate="40097.989743171296" createdVersion="1" recordCount="6907">
<cacheSource type="external" connectionId="2"/>
<cacheFields count="86">
<cacheField name="Number" numFmtId="0" sqlType="4">
<sharedItems containsSemiMixedTypes="0" containsString="0" containsNumber="1"
containsInteger="1" minValue="1" maxValue="9750" count="6904">
<n v="9739"/>
<n v="9738"/>
<n v="9736"/>

... 

It does not have any style associated with it or no schema. 
However when you use the Save As option in Excel to manually save the xlsx to
xml the schema is well formed and XML file is usable.
 
I will keep trying. Hopefully I wont have to get an intern to press the SAVE AS
button. :)
Again I really appreciate your help. Thanks to all working on this project.
If there is something I am missing or if I am thinking in the wrong direction
please feel free to let me know.

THanks.



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Dhaval Parekh <dh...@gmail.com>.
Hey Roberto,
THanks for the help.

This still does not give me the information in the pivot tables I am looking
for. It seems like its cached. I have tried a few other things I mentioned in my
reponse to David.I am not sure where to try next.



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Roberto Manicardi <ro...@gmail.com>.
Hi,

Il giorno 16/ott/09, alle ore 20:48, David Fisher ha scritto:

>> The excel file is generated automatically and come from a source. I  
>> do not want
>> to modify it or manually open it.
>
> Of course not. We all must automate, that's why we use POI.
>
>> The data I need exists in the Pivot Tables and that can change.
>>
>> I want to do the equivalent of opening an test.xlsx file in Excel  
>> 2007 and
>> Saving AS a test.xml file but programatically so I can use the data  
>> on the
>> server for further use.
>
> Understood.
>
>> Let me know if that can be done using the Apache POI API.
>
> To do so would require modifying the code that Roberto donated to  
> perform Excel's default behavior.


my code requires a user defined mapping in which the user can select  
which part of the XSLX will be extracted and set an custom output XML  
Schema.

The 'save as xml' feature, i think, is much more simple (fixed mapping  
and fixed schema) and requires few lines of code to implement it.

It can be implemented like this:

- Create a DOM node
-for each Sheet
   - create a SHEET element in the DOM
      - for each row in the sheet
        - create a ROW element in the SHEET element
          - for each column in the row
           - create a COLUMN element (or attribute) in the ROW element  
and set the cell identified by the row and column value in it
- serialize the DOM

for both on DOM management and XFFSWorbook tree navigation you can  
find some examples in my class:

http://svn.apache.org/repos/asf/poi/trunk/src/ooxml/java/org/apache/poi/xssf/extractor/XSSFExportToXml.java



Cheers,
Roberto



>
> If you would like to try that would be great, otherwise you can file  
> a bugzilla report and include an example xlsx and xml files and code  
> that you have tried. That way this issue won't get lost, and has a  
> chance of being eventually resolved.
>
> Now if you know where in the xlsx the cells are then I'm sure you  
> can extract those and then output in xml from your program. Not a  
> general solution, and not nearly as elegant, but you could get that  
> done with POI as it is now.
>
> These links to examples might be a starting point for you:
>
> http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api
>
> These examples are HSSF, but SS combined versions can be done.
> http://poi.apache.org/spreadsheet/converting.html
>
> http://poi.apache.org/spreadsheet/quick-guide.html#Iterator
>
> http://poi.apache.org/spreadsheet/quick-guide.html#CellContents
>
> http://poi.apache.org/spreadsheet/quick-guide.html#TextExtraction
>
>
>> Thanks for all the help.
>
> You're welcome
>
> Regards,
> Dave
>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>

-- 
Roberto Manicardi
= 
= 
= 
========================================================================
http://www.jugroma.it
http://roma.javaday.it
JUG Roma Coordinator


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by David Fisher <df...@jmlafferty.com>.
> The excel file is generated automatically and come from a source. I  
> do not want
> to modify it or manually open it.

Of course not. We all must automate, that's why we use POI.

> The data I need exists in the Pivot Tables and that can change.
>
> I want to do the equivalent of opening an test.xlsx file in Excel  
> 2007 and
> Saving AS a test.xml file but programatically so I can use the data  
> on the
> server for further use.

Understood.

> Let me know if that can be done using the Apache POI API.

To do so would require modifying the code that Roberto donated to  
perform Excel's default behavior.

If you would like to try that would be great, otherwise you can file a  
bugzilla report and include an example xlsx and xml files and code  
that you have tried. That way this issue won't get lost, and has a  
chance of being eventually resolved.

Now if you know where in the xlsx the cells are then I'm sure you can  
extract those and then output in xml from your program. Not a general  
solution, and not nearly as elegant, but you could get that done with  
POI as it is now.

These links to examples might be a starting point for you:

http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api

These examples are HSSF, but SS combined versions can be done.
http://poi.apache.org/spreadsheet/converting.html

http://poi.apache.org/spreadsheet/quick-guide.html#Iterator

http://poi.apache.org/spreadsheet/quick-guide.html#CellContents

http://poi.apache.org/spreadsheet/quick-guide.html#TextExtraction


> Thanks for all the help.

You're welcome

Regards,
Dave

>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Dhaval Parekh <dh...@gmail.com>.
Thanks David for the reply.

The excel file is generated automatically and come from a source. I do not want
to modify it or manually open it. 
The data I need exists in the Pivot Tables and that can change. 

I want to do the equivalent of opening an test.xlsx file in Excel 2007 and
Saving AS a test.xml file but programatically so I can use the data on the
server for further use.

Let me know if that can be done using the Apache POI API.

Thanks for all the help.






---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by David Fisher <df...@jmlafferty.com>.
Roberto contributed this functionality to POI - Thanks!

Here is the bugzilla titled - [PATCH] Implementation of an XLSX to XML  
Exporter using Custom XML Mapping
https://issues.apache.org/bugzilla/show_bug.cgi?id=47520

When he says that it requires a Custom Mapping to work then that must  
be true.

You could try patching in a default mapping.

Good luck,
Dave

On Oct 16, 2009, at 9:59 AM, Dhaval Parekh wrote:

> Thanks for the reply.
>
> I just have to convert a xlsx to xml.
> I do not have any custom mappings defined.
>
> This is what I am doing at the moment and its throwing an exception
> java.io.IOException: error: Illegal XML character: 0x3
> org.apache.xmlbeans.impl.piccolo.io.IllegalCharException:
> Illegal XML character: 0x3 at
> (UTF8XMLDecoder.java:196)
>
> My code is below
>
> File inputfile = new File("test.xlsx");
> FileInputStream inp = new FileInputStream(inputfile);
> MapInfo info = new MapInfo();
>
>
> info.readFrom(inp);
> //this is where its throwing an error above
>
>
> XSSFMap xmap = info.getXSSFMapById(0);
> CTMap ctMap = null;
> XSSFExportToXml ex = new XSSFExportToXml(xmap);
> OutputStream writer = new FileOutputStream("outxml.txt");			
> ex.exportToXML(writer ,false);
>
> Thanks a lot for your help
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Dhaval Parekh <dh...@gmail.com>.
Here is the latest.

I found out that the Pivot tables that I need are in cache stored in sheet rId11
So I use the following code to convert to XML. However it does not have any
style or schema. So its pretty much useless. 
I used the example that Roberto and Dave pointed to.

OPCPackage pkg = OPCPackage.open("test.xlsx");
XSSFReader r = new XSSFReader( pkg );
SharedStringsTable sst = r.getSharedStringsTable();
			
DOMParser parser = new DOMParser();
InputStream inp = r.getSheet("rId11");
InputSource inpSource = new InputSource(inp);
parser.parse(inpSource);
			
Document doc = parser.getDocument();
inp.close();

OutputStream writer = new FileOutputStream("outtrId11.xml");
TransformerFactory transfac = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
			trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
trans.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

//create string from xml tree
			
StreamResult result = new StreamResult(writer);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);


The XML file generated was as below
 
This XML file does not appear to have any style information associated with it.
The document tree is shown below.      
−
<pivotCacheDefinition createdVersion="1" r:id="rId1" recordCount="6907"
refreshedBy="" refreshedDate="40097.989743171296">
<cacheSource connectionId="2" type="external"/>
−
<cacheFields count="86">
−
<cacheField name="Number" numFmtId="0" sqlType="4">
−
<sharedItems containsInteger="1" containsNumber="1" containsSemiMixedTypes="0"
containsString="0" count="6904" maxValue="9750" minValue="1">
<n v="9739"/>
<n v="9738"/>
<n v="9736"/>
<n v="9740"/>
<n v="9733"/>
<n v="9732"/>

... 

Not sure where to go from here.

Thanks.



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Roberto Manicardi <ro...@gmail.com>.
Hi,

Il giorno 16/ott/09, alle ore 18:59, Dhaval Parekh ha scritto:

> Thanks for the reply.
>
> I just have to convert a xlsx to xml.
> I do not have any custom mappings defined.

the error is reported because you have to load first the XLSX with the  
XXSFWorkbook object and then retrive the MapInfo object loaded in it.  
You can find an example on how to use the code here:

http://svn.apache.org/repos/asf/poi/trunk/src/ooxml/testcases/org/apache/poi/xssf/extractor/TestXSSFExportToXML.java


Cheers,
Roberto

>
> This is what I am doing at the moment and its throwing an exception
> java.io.IOException: error: Illegal XML character: 0x3
> org.apache.xmlbeans.impl.piccolo.io.IllegalCharException:
> Illegal XML character: 0x3 at
> (UTF8XMLDecoder.java:196)
>
> My code is below
>
> File inputfile = new File("test.xlsx");
> FileInputStream inp = new FileInputStream(inputfile);
> MapInfo info = new MapInfo();
>
>
> info.readFrom(inp);
> //this is where its throwing an error above
>
>
> XSSFMap xmap = info.getXSSFMapById(0);
> CTMap ctMap = null;
> XSSFExportToXml ex = new XSSFExportToXml(xmap);
> OutputStream writer = new FileOutputStream("outxml.txt");			
> ex.exportToXML(writer ,false);
>
> Thanks a lot for your help
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>

-- 
Roberto Manicardi
= 
= 
= 
========================================================================
http://www.jugroma.it
http://roma.javaday.it
JUG Roma Coordinator


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Dhaval Parekh <dh...@gmail.com>.
Thanks for the reply. 

I just have to convert a xlsx to xml. 
I do not have any custom mappings defined.

This is what I am doing at the moment and its throwing an exception
java.io.IOException: error: Illegal XML character: 0x3
org.apache.xmlbeans.impl.piccolo.io.IllegalCharException: 
Illegal XML character: 0x3 at
(UTF8XMLDecoder.java:196)

My code is below

File inputfile = new File("test.xlsx");
FileInputStream inp = new FileInputStream(inputfile);
MapInfo info = new MapInfo();


info.readFrom(inp); 
//this is where its throwing an error above


XSSFMap xmap = info.getXSSFMapById(0);
CTMap ctMap = null;
XSSFExportToXml ex = new XSSFExportToXml(xmap);
OutputStream writer = new FileOutputStream("outxml.txt");			
ex.exportToXML(writer ,false);

Thanks a lot for your help




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: XSSFExportToXML help trying to export xlsx to xml

Posted by Roberto Manicardi <ro...@gmail.com>.
Hi,

have you configured any Custom XML Mapping in the test.xml file ?

Here's how to do it http://openxmldeveloper.org/articles/2268.aspx.  
(sorry MS Excel for Windows only).

Thanks,
Roberto




Il giorno 15/ott/09, alle ore 00:52, Dhaval Parekh ha scritto:

> Hey All,
>
> I am trying to convert an Excel 2007 (xlsx) file to XML
> I am trying to use the XSSFExportToXML However I am running into  
> issues.
> Please any help is greatly appreciated.
>
>
> File inputfile = new File("test.xlsx");
> FileInputStream inp = new FileInputStream(inputfile);
>
> XSSFWorkbook wb1=new XSSFWorkbook(inp);
>
> Collection temp = new ArrayList();
> temp = wb1.getCustomXMLMappings();
>
> System.out.println(temp.size());
>
> // I GET THE SIZE OF THIS AS 0, I have tried using MapInfo trying to  
> get
> wb1.getMapInfo() but i get null,
>
>
> // rest is just trial
> XSSFMap = temp[0];
>
> XSSFExportToXml ex = new XSSFExportToXml((xssfmap);
> 			
> OutputStream writer = new FileOutputStream("outxml.txt");
> ex.exportToXML(writer ,false);
>
>
> Thanks in advance.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>

-- 
Roberto Manicardi
= 
= 
= 
========================================================================
http://www.jugroma.it
http://roma.javaday.it
JUG Roma Coordinator