Posted to user@poi.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2016/05/12 14:18:57 UTC

extracting hyperlinks from xlsx with XSSFEventBasedExcelExtractor?

All,
  On TIKA-1454, one of our users asked to add extraction of hyperlinks from xlsx.  It looks like hyperlink info appears at the end of the sheet.xml (after the <sheetData> in the <hyperlinks> section).  Are there any recommendations for merging hyperlink info with the actual cells via XSSFEventBasedExcelExtractor?  Double-pass on the sheet.xml or just give up and dump the hyperlinks at the end of the sheet... or other options?

          Best,

                        Tim


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: extracting hyperlinks from xlsx with XSSFEventBasedExcelExtractor?

Posted by Nick Burch <ap...@gagravarr.org>.
On Thu, 12 May 2016, Allison, Timothy B. wrote:
>  On TIKA-1454, one of our users asked to add extraction of hyperlinks 
> from xlsx.  It looks like hyperlink info appears at the end of the 
> sheet.xml (after the <sheetData> in the <hyperlinks> section).  Are 
> there any recommendations for merging hyperlink info with the actual 
> cells via XSSFEventBasedExcelExtractor?  Double-pass on the sheet.xml or 
> just give up and dump the hyperlinks at the end of the sheet... or other 
> options?

The only option I can think of that'd work would be an optional flag which 
would trigger an extra SAX pass with a different handler. That pass would 
run first and capture the (hopefully!) small set of hyperlink data, which 
the "normal" second pass could then use.
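
A minimal sketch of that two-pass idea, using only the JDK's SAX parser 
rather than any POI API. The class and handler names are hypothetical, and 
the sketch makes several simplifying assumptions: each hyperlink "ref" names 
a single cell (not a range), cell text is read straight from <v> (no shared 
strings lookup), and the relationship id is emitted as-is instead of being 
resolved to a URL through the sheet's relationships part.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of POI or Tika.
public class TwoPassHyperlinkSketch {
    static final String REL_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    /** Pass 1: map cell reference -> link target from the <hyperlinks> section. */
    static class HyperlinkCollector extends DefaultHandler {
        final Map<String, String> targetByCell = new HashMap<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (!"hyperlink".equals(local)) return;
            String ref = atts.getValue("", "ref");
            String target = atts.getValue(REL_NS, "id");           // external link (r:id)
            if (target == null) target = atts.getValue("", "location"); // internal link
            if (ref != null && target != null) targetByCell.put(ref, target);
        }
    }

    /** Pass 2: the "normal" cell pass, consulting the map built in pass 1. */
    static class CellHandler extends DefaultHandler {
        private final Map<String, String> targetByCell;
        final StringBuilder out = new StringBuilder();
        private final StringBuilder value = new StringBuilder();
        private String currentRef;
        private boolean inValue;

        CellHandler(Map<String, String> targetByCell) { this.targetByCell = targetByCell; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("c".equals(local)) {
                currentRef = atts.getValue("", "r");
                value.setLength(0);
            } else if ("v".equals(local)) {
                inValue = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inValue) value.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if ("v".equals(local)) {
                inValue = false;
            } else if ("c".equals(local)) {
                out.append(currentRef).append('=').append(value);
                String target = targetByCell.get(currentRef);
                if (target != null) out.append(" <").append(target).append('>');
                out.append('\n');
            }
        }
    }

    /** Parses the same sheet XML twice: hyperlinks first, then cells. */
    static String extract(byte[] sheetXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        HyperlinkCollector links = new HyperlinkCollector();
        factory.newSAXParser().parse(new ByteArrayInputStream(sheetXml), links);
        CellHandler cells = new CellHandler(links.targetByCell);
        factory.newSAXParser().parse(new ByteArrayInputStream(sheetXml), cells);
        return cells.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<worksheet xmlns='http://schemas.openxmlformats.org/spreadsheetml/2006/main'"
            + " xmlns:r='" + REL_NS + "'>"
            + "<sheetData><row r='1'><c r='A1'><v>1</v></c><c r='B1'><v>2</v></c></row></sheetData>"
            + "<hyperlinks><hyperlink ref='A1' r:id='rId1'/></hyperlinks>"
            + "</worksheet>";
        System.out.print(extract(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The extra pass costs a second read of the sheet part, which is why it seems 
better suited to an opt-in flag; the map it builds stays small as long as 
the sheet has few hyperlinks.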

Nick
