Posted to dev@tika.apache.org by Tilman Hausherr <TH...@t-online.de> on 2022/04/23 13:25:24 UTC

tika-main windows build fails in TikaResourceFetcherTest

I have unsuccessfully tried to build tika-main on Windows 10 with JDK 8 for 
several weeks. Here are the failures I get:

[ERROR] Failures:
[ERROR] 
TikaResourceFetcherTest.testHeader:100->CXFTestBase.assertContains:65 
hello world not found in:
  ==> expected: <true> but was: <false>
[ERROR] 
TikaResourceFetcherTest.testQueryPart:108->CXFTestBase.assertContains:65 
hello world not found in:
  ==> expected: <true> but was: <false>
[ERROR] Errors:
[ERROR] TikaServerIntegrationTest.test1WayTLS:341->configure1WayTLS:456 
» InvalidPath Illegal char <"> at index 0: 
"XXX\tika-main\tika-server\tika-server-core\target\test-classes\ssl-keys\tika-client-truststore.p12"
[ERROR] TikaServerIntegrationTest.test2WayTLS:377->configure2WayTLS:428 
» InvalidPath Illegal char <"> at index 0: 
"XXX\tika-main\tika-server\tika-server-core\target\test-classes\ssl-keys\tika-client-keystore.p12"
[INFO]
[ERROR] Tests run: 70, Failures: 2, Errors: 2, Skipped: 4

The two TLS errors are new, but the TikaResourceFetcherTest failures have 
been there for weeks. The reason is that the entity read from the response 
is an empty string: response.getEntity() is a ByteArrayInputStream that is 
empty.


One output is this:


INFO  [main] 14:23:17,531 
org.apache.tika.pipes.fetcher.fs.FileSystemFetcher A FileSystemFetcher 
(fsf) has been initialized. Clients will be able to read all files under 
'XXX\tika-main\tika-server\tika-server-core\XXXtika-maintika-servertika-server-coretargettest-classestest-documents' 
if this process has permission to read them.

Note that the two XXX here are the same. It's the Windows path where I 
keep my Java projects.

I investigated a bit... FetcherManager.load loads a file from the temp 
directory. Its content is like this:

<?xml version="1.0" encoding="UTF-8"?>

... license...

<properties>
   <fetchers>
     <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
       <params>
         <name>fsf</name>
<basePath>XXXtika-maintika-servertika-server-coretargettest-classestest-documents</basePath>
       </params>
     </fetcher>
   </fetchers>

...

Something goes wrong in

         configXML = configXML.replaceAll("\\$\\{FETCHER_BASE_PATH\\}",
                 inputDir.toAbsolutePath().toString());

in TikaResourceFetcherTest.java: the backslashes from the path are lost.

The javadoc of String.replaceAll() warns about this:

     Note that backslashes (\) and dollar signs ($) in the 
replacement string may cause the results to be different than if it were 
being treated as a literal replacement string

Using replace() with the literal "${FETCHER_BASE_PATH}" instead of 
replaceAll() fixes this.
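
To illustrate the difference, here is a minimal standalone sketch. It is not the Tika test code: the class name, the TEMPLATE string and the sample path are made up for the demonstration. It compares replaceAll(), replace(), and replaceAll() combined with Matcher.quoteReplacement() on a Windows-style path.

```java
import java.util.regex.Matcher;

public class ReplaceBackslashDemo {

    // Made-up stand-in for the config template used by the test.
    static final String TEMPLATE = "<basePath>${FETCHER_BASE_PATH}</basePath>";

    // replaceAll() interprets the replacement string: a backslash escapes
    // the character after it, so the backslash itself is dropped.
    static String withReplaceAll(String path) {
        return TEMPLATE.replaceAll("\\$\\{FETCHER_BASE_PATH\\}", path);
    }

    // replace() treats both arguments as literal text.
    static String withReplace(String path) {
        return TEMPLATE.replace("${FETCHER_BASE_PATH}", path);
    }

    // Alternative: keep the regex but quote the replacement string.
    static String withQuotedReplaceAll(String path) {
        return TEMPLATE.replaceAll("\\$\\{FETCHER_BASE_PATH\\}",
                Matcher.quoteReplacement(path));
    }

    public static void main(String[] args) {
        // Hypothetical Windows path, as returned by Path.toString().
        String path = "C:\\work\\tika-main\\target";
        System.out.println(withReplaceAll(path));       // backslashes lost
        System.out.println(withReplace(path));          // backslashes kept
        System.out.println(withQuotedReplaceAll(path)); // backslashes kept
    }
}
```

On Linux and macOS the path contains no backslashes, which is presumably why the problem only shows up on Windows.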

Related: shouldn't FileSystemFetcher.checkInitialization() check whether 
the path exists?
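
What I have in mind is roughly the following. This is only a sketch of the idea, not the actual FileSystemFetcher code: the class and method names are hypothetical, and the exception type is a placeholder for whatever checkInitialization() would really throw.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BasePathCheck {

    // Hypothetical helper: fail early if the configured base path
    // is missing or is not a directory.
    static void requireExistingDirectory(Path basePath) {
        if (!Files.isDirectory(basePath)) {
            throw new IllegalStateException(
                    "basePath does not exist or is not a directory: "
                            + basePath.toAbsolutePath());
        }
    }

    public static void main(String[] args) {
        // The temp directory exists, so this call passes silently.
        requireExistingDirectory(Paths.get(System.getProperty("java.io.tmpdir")));
        System.out.println("base path check passed");
    }
}
```

With such a check, the mangled basePath above would have failed at initialization instead of producing an empty response later.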

Tilman

Re: tika-main windows build fails in TikaResourceFetcherTest

Posted by Tim Allison <ta...@apache.org>.
Thank you for catching this, Tilman.  I do get a test failure on my
windows laptop after I installed exiftool. :(  Will fix.

On Mon, Apr 25, 2022 at 2:45 PM Tilman Hausherr <TH...@t-online.de> wrote:
>
> .replaceAll() is also used in ExternalParser.java with a filename
> parameter. But no tests fail because of it.

Re: tika-main windows build fails in TikaResourceFetcherTest

Posted by Tilman Hausherr <TH...@t-online.de>.
.replaceAll() is also used in ExternalParser.java with a filename 
parameter. But no tests fail because of it.

Re: tika-main windows build fails in TikaResourceFetcherTest

Posted by Tilman Hausherr <TH...@t-online.de>.
Same problem also in TikaServerIntegrationTest with replaceAll(); plus 
the problem that I mentioned in TIKA-3719.