Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/15 22:18:38 UTC
[jira] [Closed] (TIKA-1181) RTFParser not keeping HTML font colors and underscore tags.
[ https://issues.apache.org/jira/browse/TIKA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Palsulich closed TIKA-1181.
---------------------------------
Resolution: Won't Fix
Closing as Won't Fix, since we aren't interested in adding color support. But feel free to reopen if you believe we should add underline support!
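
In the meantime, one possible workaround for the underline part is to read the RTF with the JDK's own javax.swing.text.rtf.RTFEditorKit instead of Tika and inspect the character attributes of each run. The following is only a rough sketch under that assumption: the class name RtfUnderlineSketch and the helper toMarkup are hypothetical, RTFEditorKit understands only a subset of RTF, and the result is a flat string with <u> markers rather than Tika's XHTML.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical sketch, not part of Tika: uses the JDK's Swing RTF reader.
class RtfUnderlineSketch {

    // Returns the text of an RTF string with underlined runs wrapped in <u>...</u>.
    static String toMarkup(String rtf) throws Exception {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        try (InputStream in = new ByteArrayInputStream(
                rtf.getBytes(StandardCharsets.US_ASCII))) {
            new RTFEditorKit().read(in, doc, 0);
        }

        StringBuilder out = new StringBuilder();
        boolean open = false; // true while a <u> tag is open
        Element root = doc.getDefaultRootElement();
        for (int p = 0; p < root.getElementCount(); p++) {
            Element para = root.getElement(p);
            for (int r = 0; r < para.getElementCount(); r++) {
                Element run = para.getElement(r);
                int start = run.getStartOffset();
                // Skip the implicit newline that ends every styled document.
                int end = Math.min(run.getEndOffset(), doc.getLength());
                if (start >= end) {
                    continue;
                }
                boolean underlined = StyleConstants.isUnderline(run.getAttributes());
                // Open or close the tag only when the underline state changes.
                if (underlined && !open) {
                    out.append("<u>");
                }
                if (!underlined && open) {
                    out.append("</u>");
                }
                open = underlined;
                out.append(escape(doc.getText(start, end - start)));
            }
        }
        if (open) {
            out.append("</u>");
        }
        return out.toString();
    }

    // Minimal escaping so the text is safe to embed in HTML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toMarkup("{\\rtf1\\ansi plain \\ul under\\ul0  rest}"));
    }
}
```

The same loop could also read StyleConstants.getForeground(run.getAttributes()) if the font colors are needed, at the cost of mapping each color to markup yourself.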
> RTFParser not keeping HTML font colors and underscore tags.
> -----------------------------------------------------------
>
> Key: TIKA-1181
> URL: https://issues.apache.org/jira/browse/TIKA-1181
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.4
> Environment: Windows server 2008
> Reporter: Leo
> Labels: RTFParser
>
> Hi,
> I'm having problems with this code. The HTML it produces from the RTF string is missing the font colors and the underline tags ("<u></u>"). Is there anything I can do to keep them?
> Code:
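> import java.io.ByteArrayInputStream;
> import java.io.InputStream;
> import java.io.StringWriter;
> import javax.xml.transform.OutputKeys;
> import javax.xml.transform.sax.SAXTransformerFactory;
> import javax.xml.transform.sax.TransformerHandler;
> import javax.xml.transform.stream.StreamResult;
> import org.apache.tika.metadata.Metadata;
> import org.apache.tika.parser.ParseContext;
> import org.apache.tika.parser.rtf.RTFParser;
>
> InputStream in = new ByteArrayInputStream(rtfString.getBytes("UTF-8"));
> RTFParser parser = new RTFParser();
> Metadata metadata = new Metadata();
> StringWriter sw = new StringWriter();
> // Serialize the parser's SAX events as XML into sw.
> SAXTransformerFactory factory = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
> TransformerHandler handler = factory.newTransformerHandler();
> handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
> handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "no");
> handler.setResult(new StreamResult(sw));
> parser.parse(in, handler, metadata, new ParseContext());
> String xhtml = sw.toString();
> // Put an HTML line break in front of every CRLF in the output.
> xhtml = xhtml.replaceAll("\r\n", "<br>\r\n");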
> Thanks for looking at it.
> Leo