Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/15 21:00:39 UTC

[jira] [Updated] (TIKA-1165) Autodetect and parse Asciidoc

     [ https://issues.apache.org/jira/browse/TIKA-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Palsulich updated TIKA-1165:
----------------------------------
    Labels: new-parser  (was: )

> Autodetect and parse Asciidoc
> -----------------------------
>
>                 Key: TIKA-1165
>                 URL: https://issues.apache.org/jira/browse/TIKA-1165
>             Project: Tika
>          Issue Type: Wish
>          Components: languageidentifier, parser
>    Affects Versions: 1.4
>            Reporter: David Pilato
>            Priority: Trivial
>              Labels: new-parser
>
> When Tika extracts metadata from an AsciiDoc file, it currently reports the file as plain text:
> {noformat}
> Content-Encoding: ISO-8859-1
> Content-Length: 66363
> Content-Type: text/plain; charset=ISO-8859-1
> resourceName: asciidoc.adoc
> {noformat}
> Steps to reproduce:
> {code:title=asciidoc.sh|borderStyle=solid}
> curl https://raw.github.com/asciidoctor/asciidoctor.org/master/docs/asciidoc-syntax-quick-reference.adoc -O -s
> java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc
> {code}


