Posted to dev@any23.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/12/12 16:07:00 UTC

[jira] [Created] (ANY23-314) Service fails to return extraction in case of extraction error

Lewis John McGibbney created ANY23-314:
------------------------------------------

             Summary: Service fails to return extraction in case of extraction error
                 Key: ANY23-314
                 URL: https://issues.apache.org/jira/browse/ANY23-314
             Project: Apache Any23
          Issue Type: Bug
          Components: service
    Affects Versions: 2.1
         Environment: Any23 2.2-SNAPSHOT
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 2.2
         Attachments: extraction.json, output.log

See the following command-line extraction:
{code}
lmcgibbn@LMC-056430 /usr/local/any23(master) $ ./cli/target/appassembler/bin/any23 rover -l output.log -o extraction.json https://www.jobcluster.de

------------------------------------------------------------------------
Apache Any23 :: rover
------------------------------------------------------------------------

0    [main] WARN  org.apache.tika.parser.image.ImageParser  - JBIG2ImageReader not loaded. jbig2 files will be ignored
128  [main] INFO  org.apache.any23.rdf.PopularPrefixes  - Loading prefixes from /org/apache/any23/prefixes/prefixes.properties
1388 [main] WARN  org.apache.commons.httpclient.HttpMethodBase  - Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
4790 [main] INFO  org.apache.any23.extractor.SingleDocumentExtraction  - Processing https://www.jobcluster.de/
[Fatal Error] :12:46: The entity name must immediately follow the '&' in the entity reference.

------------------------------------------------------------------------
Apache Any23 FAILURE

Execution terminated with errors: Error while parsing RDF document.

Total time: 5s
Finished at: Tue Dec 12 08:01:14 PST 2017
Final Memory: 31M/184M
------------------------------------------------------------------------
{code}
This results in the attached extraction result (extraction.json) and associated log (output.log).
If I run the same extraction using the service at [any23.org|http://any23.org/any23/?format=json&uri=https%3A%2F%2Fwww.jobcluster.de%2F&validation-mode=none], the (partial) extraction result should be returned regardless of whether the entire extraction was successful.

The service servlet seems to be returning the extraction exception as opposed to the preferred (partial) extraction result. This issue tracks the fix for that.
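
The desired behaviour could be sketched roughly as follows. This is only an illustration of the pattern (buffer whatever the extractor wrote before it failed, and return that alongside the error), not the actual servlet code. Every name here (PartialExtractionSketch, SketchExtractor, SketchExtractionException, Result, run) is hypothetical and none of them is an Any23 class; the payload and error strings are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class PartialExtractionSketch {

    /** Hypothetical stand-in for an extraction failure; not an Any23 class. */
    static class SketchExtractionException extends Exception {
        SketchExtractionException(String message) {
            super(message);
        }
    }

    /** Hypothetical extractor: writes what it can to the sink, then may fail. */
    interface SketchExtractor {
        void extract(OutputStream sink) throws IOException, SketchExtractionException;
    }

    /** Whatever was written before the failure, plus the error message (if any). */
    static final class Result {
        final String body;
        final String error; // null when the extraction completed without error

        Result(String body, String error) {
            this.body = body;
            this.error = error;
        }
    }

    /**
     * Runs the extractor against an in-memory buffer. If the extraction fails
     * part-way, the bytes written so far are still returned instead of being
     * discarded in favour of the exception.
     */
    static Result run(SketchExtractor extractor) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        String error = null;
        try {
            extractor.extract(buffer);
        } catch (SketchExtractionException e) {
            // Remember the failure, but keep the partial output.
            error = e.getMessage();
        }
        return new Result(buffer.toString(StandardCharsets.UTF_8.name()), error);
    }

    public static void main(String[] args) throws IOException {
        // Simulated extractor: emits a placeholder payload, then fails.
        Result result = run(sink -> {
            sink.write("partial-output".getBytes(StandardCharsets.UTF_8));
            throw new SketchExtractionException("simulated failure");
        });
        System.out.println("body:  " + result.body);
        System.out.println("error: " + result.error);
    }
}
```

In a servlet, the buffered body would then be written to the response, with the error reported separately (for example in a log or as part of the response report) rather than replacing the extraction result. How the real fix surfaces the error is left open here.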



