Posted to dev@any23.apache.org by "David Cockbill (JIRA)" <ji...@apache.org> on 2019/01/31 09:39:00 UTC
[jira] [Commented] (ANY23-422) Error message when any23 cli tool used
[ https://issues.apache.org/jira/browse/ANY23-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757059#comment-16757059 ]
David Cockbill commented on ANY23-422:
--------------------------------------
The 2 commits of relevance are:
———————————
commit 242b130b4670507e240bf9fec1fb8f9aad647870
Author: Peter Ansell <p_ansell@yahoo.com>
Date: Thu Jan 12 10:35:17 2017 +1100
ANY23-80 : Split out CLI into its own module
Signed-off-by: Peter Ansell <p_ansell@yahoo.com>
———————————
commit 692c583f848c5b7ae5a7940c857bfb0a9542c0d5
Author: Hans <firedrake93@gmail.com>
Date: Fri Sep 14 10:29:33 2018 -0500
ANY23-396 Overhaul WriterFactory API
———————————
The first commit introduced the file cli/pom.xml, where the app assembler plugin configuration contains the line:
<configurationSourceDirectory>${basedir}/src/test/resources</configurationSourceDirectory>
This copies resources from the src/test/resources directory when assembling the app.
The cli test directory has the ServiceLoader configuration file org.apache.any23.cli.flows.PeopleExtractorFactory.
The second commit introduced the WriterFactoryRegistry class. Its constructor uses a ServiceLoader, which loads every provider listed under its resources/META-INF/services directory. That directory now contains the service configuration above, copied from the test/resources directory, so running the any23 cli command tries to load the PeopleExtractorFactory class. PeopleExtractorFactory doesn’t exist in the assembled app because, I believe, it is a test-only class.
My feeling is that the <configurationSourceDirectory>${basedir}/src/test/resources</configurationSourceDirectory> block shouldn’t be in cli/pom.xml.
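
To illustrate the failure mode described above, here is a minimal, self-contained sketch of what ServiceLoader does when a META-INF/services entry names a class that is not on the classpath. It is not Any23 code: the Plugin interface, the com.example.MissingPlugin provider name and the StaleProviderDemo class are all hypothetical stand-ins (for WriterFactory, the test-only PeopleExtractorFactory and WriterFactoryRegistry respectively), and the temporary directory only simulates a stale service file ending up in an assembled app.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class StaleProviderDemo {

    /** Hypothetical service interface, standing in for WriterFactory. */
    public interface Plugin { }

    /**
     * Writes a service file that lists a provider class which does not exist,
     * then iterates a ServiceLoader over it.
     * Returns the error message, or null if loading did not fail.
     */
    public static String loadWithStaleEntry() {
        try {
            // Simulated classpath root containing only the stale service file.
            Path root = Files.createTempDirectory("stale-provider-demo");
            Path services = Files.createDirectories(root.resolve("META-INF/services"));

            // The file is named after the service interface and lists provider
            // class names, one per line. This provider is deliberately missing.
            Files.write(services.resolve(Plugin.class.getName()),
                    "com.example.MissingPlugin\n".getBytes(StandardCharsets.UTF_8));

            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] { root.toUri().toURL() },
                    StaleProviderDemo.class.getClassLoader())) {
                Iterator<Plugin> it = ServiceLoader.load(Plugin.class, loader).iterator();
                try {
                    // Providers are resolved lazily, so the error only
                    // surfaces while iterating.
                    while (it.hasNext()) {
                        it.next();
                    }
                    return null;
                } catch (ServiceConfigurationError e) {
                    return e.getMessage();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadWithStaleEntry());
    }
}
```

The message returned has the same "Provider ... not found" form as the ServiceConfigurationError in the stack trace quoted below. If the analysis above is right, removing the stale entry from what gets assembled is what would make the error go away.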
> Error message when any23 cli tool used
> --------------------------------------
>
> Key: ANY23-422
> URL: https://issues.apache.org/jira/browse/ANY23-422
> Project: Apache Any23
> Issue Type: Test
> Components: CLI
> Affects Versions: 2.3
> Environment: Linux data 4.4.0-140-generic #166~14.04.1-Ubuntu SMP Sat Nov 17 01:52:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: dhirajforyou
> Priority: Critical
> Fix For: 2.3
>
>
>
> observation with cli tool:
> any23 2.2:
> ./bin/any23 rover "https://www.bbc.com/sport/football/46377603" -o /tmp/any23_2.2
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
>
> Nov 30, 2018 7:45:32 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> TIFFImageWriter not loaded. tiff files will not be processed
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
>
> Nov 30, 2018 7:45:32 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> 0 [main] INFO org.apache.any23.rdf.PopularPrefixes - Loading prefixes from /org/apache/any23/prefixes/prefixes.properties
> 1113 [main] INFO org.apache.any23.extractor.SingleDocumentExtraction - Processing https://www.bbc.com/sport/football/46377603
> 3127 [main] INFO org.apache.any23.cli.Rover - Extractors used: [html-head-meta, html-head-title, html-rdfa11]
> 3127 [main] INFO org.apache.any23.cli.Rover - 55 triples, 3083ms
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 4s
> Finished at: Fri Nov 30 19:45:35 IST 2018
> Final Memory: 40M/143M
> ------------------------------------------------------------------------
>
>
>
> with any23 2.3 snapshot cli released locally:
> /bin/any23 rover "https://www.bbc.com/sport/football/46377603" -o /tmp/any23_2.3
>
> 1 [main] ERROR org.apache.any23.writer.WriterFactoryRegistry - Found error loading a WriterFactory
> java.util.ServiceConfigurationError: org.apache.any23.writer.WriterFactory: Provider org.apache.any23.cli.flows.PeopleExtractorFactory not found
> at java.util.ServiceLoader.fail(ServiceLoader.java:239)
> at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at org.apache.any23.writer.WriterFactoryRegistry.<init>(WriterFactoryRegistry.java:90)
> at org.apache.any23.writer.WriterFactoryRegistry$InstanceHolder.<clinit>(WriterFactoryRegistry.java:54)
> at org.apache.any23.writer.WriterFactoryRegistry.getInstance(WriterFactoryRegistry.java:129)
> at org.apache.any23.cli.Rover.<clinit>(Rover.java:76)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at org.apache.any23.cli.ToolRunner.execute(ToolRunner.java:95)
> at org.apache.any23.cli.ToolRunner.execute(ToolRunner.java:72)
> at org.apache.any23.cli.ToolRunner.main(ToolRunner.java:68)
>
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
>
> 2244 [main] WARN org.apache.http.client.protocol.ResponseProcessCookies - Invalid cookie header: "Set-Cookie: BBC-UID=ca727e6c3a3b33f842e8878f6fafd0e83567ff24f7978b58d536e1eec83ce2590Any23-CLI; expires=Tue, 29 Nov 2022 14:15:41 GMT; path=/; domain=.bbc.com". Invalid 'expires' attribute: Tue, 29 Nov 2022 14:15:41 GMT
> 4384 [main] INFO org.apache.any23.cli.Rover - Extractors used: [html-head-meta, html-scraper, html-head-title, html-rdfa11]
> 4384 [main] INFO org.apache.any23.cli.Rover - 59 triples, 2568ms
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 4s
> Finished at: Fri Nov 30 19:45:43 IST 2018
> Final Memory: 75M/187M
> ------------------------------------------------------------------------
>
>
> With the snapshot released locally, the output starts with:
> [main] ERROR org.apache.any23.writer.WriterFactoryRegistry - Found error loading a WriterFactory
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)