Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/15 21:00:39 UTC
[jira] [Updated] (TIKA-1165) Autodetect and parse Asciidoc
[ https://issues.apache.org/jira/browse/TIKA-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Palsulich updated TIKA-1165:
----------------------------------
Labels: new-parser (was: )
> Autodetect and parse Asciidoc
> -----------------------------
>
> Key: TIKA-1165
> URL: https://issues.apache.org/jira/browse/TIKA-1165
> Project: Tika
> Issue Type: Wish
> Components: languageidentifier, parser
> Affects Versions: 1.4
> Reporter: David Pilato
> Priority: Trivial
> Labels: new-parser
>
> When Tika extracts metadata from an AsciiDoc file, it currently reports only generic plain-text values:
> {noformat}
> Content-Encoding: ISO-8859-1
> Content-Length: 66363
> Content-Type: text/plain; charset=ISO-8859-1
> resourceName: asciidoc.adoc
> {noformat}
> Steps to reproduce:
> {code:title=asciidoc.sh|borderStyle=solid}
> curl https://raw.github.com/asciidoctor/asciidoctor.org/master/docs/asciidoc-syntax-quick-reference.adoc -O -s
> java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc
> {code}
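
The reproduction steps above can be turned into a repeatable check. What follows is only a minimal sketch, assuming the `-m` output keeps the "Key: value" layout shown in the issue. The helper names `tika_media_type` and `is_generic_text` are hypothetical and are not part of Tika.

```shell
# Hypothetical helper (not part of Tika): reads "Key: value" metadata lines
# on stdin, as printed by `tika-app -m`, and prints the bare media type from
# the first Content-Type line, dropping parameters such as "; charset=...".
tika_media_type() {
  sed -n 's/^Content-Type:[[:space:]]*\([^;]*\).*$/\1/p' | head -n 1
}

# Hypothetical check: succeeds (exit status 0) when the metadata on stdin
# reports only the generic text/plain type, i.e. no format-specific detection.
is_generic_text() {
  [ "$(tika_media_type)" = "text/plain" ]
}

# Example use, assuming the jar and the .adoc file from the steps above
# are in the current directory:
#   java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc \
#     | is_generic_text && echo "detected only as text/plain"
```

With the metadata quoted in the issue, `is_generic_text` succeeds, which matches the reported behaviour: the file is identified only as plain text.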
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)