Posted to solr-user@lucene.apache.org by "Nguyen, Vincent (CDC/OD/OADS) (CTR)" <vn...@cdc.gov> on 2012/08/22 21:57:35 UTC
Full Text Indexing for DOCX files
Has anyone been able to index DOCX files? I get this error message when indexing Office 2007 documents:
(Location of error unknown)org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. POI only supports OLE2 Office documents
We're currently using Solr 1.3.
Vincent Vu Nguyen
RE: Full Text Indexing for DOCX files
Posted by "Nguyen, Vincent (CDC/OD/OADS) (CTR)" <vn...@cdc.gov>.
Thanks, Jack. I'll give that version of Solr a try.
Vincent Vu Nguyen
Web Applications Developer
Division of Science Quality and Translation
Office of the Associate Director for Science
Centers for Disease Control and Prevention (CDC)
404-498-0384 vng0@cdc.gov
Century Bldg 2400
Atlanta, GA 30329
-----Original Message-----
From: Jack Krupansky [mailto:jack@basetechnology.com]
Sent: Wednesday, August 22, 2012 4:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Full Text Indexing for DOCX files
I've indexed Office 2007 .docx using Solr 3.6.
It sounds as if Solr 1.3 had an old release of Tika/POI. No big surprise there.
-- Jack Krupansky
Re: Full Text Indexing for DOCX files
Posted by Jack Krupansky <ja...@basetechnology.com>.
I've indexed Office 2007 .docx using Solr 3.6.
It sounds as if Solr 1.3 had an old release of Tika/POI. No big surprise there.
-- Jack Krupansky
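
For anyone hitting the same exception: it is raised when POI's OLE2 (POIFS) reader is handed a file that is actually in the newer Office Open XML format, which is a ZIP package rather than an OLE2 compound file. The sketch below is one way to tell the two apart before sending files to an older Solr. It is not part of Solr, Tika, or POI, and the function name sniff_office_container is made up for illustration. It relies only on the well-known file signatures (OLE2 files start with the bytes D0 CF 11 E0 A1 B1 1A E1, ZIP files with "PK\x03\x04") and on the [Content_Types].xml entry that Office Open XML packages contain.

```python
import io
import zipfile

# File signatures: OLE2 compound document (.doc, .xls, .ppt) and ZIP local file header.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
ZIP_MAGIC = b"PK\x03\x04"


def sniff_office_container(data: bytes) -> str:
    """Hypothetical helper: classify file bytes as 'ole2', 'ooxml', 'zip', or 'unknown'."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                # Office Open XML packages (.docx, .xlsx, .pptx) carry this entry.
                if "[Content_Types].xml" in zf.namelist():
                    return "ooxml"
        except zipfile.BadZipFile:
            # Starts like a ZIP but cannot be opened as one.
            return "unknown"
        return "zip"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python sniff.py file1.docx file2.doc ...
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, sniff_office_container(fh.read()))
```

Files reported as "ooxml" are the ones an old Tika/POI will reject with OfficeXmlFileException. Files reported as "ole2" are in the format that the error message says POI supports.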