Posted to dev@tika.apache.org by Chris Mattmann <ma...@apache.org> on 2020/03/18 14:35:07 UTC

Re: [EXTERNAL] Re: JDK 12 build issues

So I was able to get past my issues with Tesseract by reinstalling the latest version with Brew.

 

I have a new issue!

I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:

 

[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/src/tika/tika-dl/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
log4j:WARN No appenders could be found for logger (org.nd4j.linalg.factory.Nd4jBackend).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.38 s <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
[ERROR] org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise  Time elapsed: 3.29 s  <<< ERROR!
org.apache.tika.exception.TikaConfigException: java.io.UTFDataFormatException: malformed input around byte 11
       at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
Caused by: java.lang.RuntimeException: java.io.UTFDataFormatException: malformed input around byte 11
       at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
Caused by: java.io.UTFDataFormatException: malformed input around byte 11
       at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)

[INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.392 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   DL4JVGG16NetTest.recognise:36 » TikaConfig java.io.UTFDataFormatException: mal...
[INFO]
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  25.628 s
[INFO] Finished at: 2020-03-18T07:34:08-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-dl: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/mattmann/src/tika/tika-dl/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
pomodoro:tika-dl mattmann$

 

Thamme, do you have any ideas what is going on here?
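
For anyone reading the trace above: java.io.UTFDataFormatException is the exception that DataInputStream.readUTF throws when the bytes it is handed are not valid modified UTF-8. The sketch below only illustrates that behaviour with hand-made byte arrays. It is an assumption, not something confirmed in this thread, that the model loader hits the same kind of call, and the class and method names (MalformedUtfDemo, isMalformed) are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;

public class MalformedUtfDemo {

    /** Returns true if DataInputStream.readUTF rejects the bytes as malformed modified UTF-8. */
    static boolean isMalformed(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            in.readUTF(); // reads a 2-byte length, then that many bytes of modified UTF-8
            return false;
        } catch (UTFDataFormatException e) {
            return true;  // same exception type as in the stack trace
        } catch (IOException e) {
            // Any other I/O problem (for example a truncated stream) is not what we are testing.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical inputs: length prefix 2 followed by "ok" is well formed.
        byte[] valid = {0x00, 0x02, 'o', 'k'};
        // Length prefix 1 followed by a lone continuation byte (0x80) is not.
        byte[] corrupt = {0x00, 0x01, (byte) 0x80};

        System.out.println("valid bytes malformed?   " + isMalformed(valid));
        System.out.println("corrupt bytes malformed? " + isMalformed(corrupt));
    }
}
```

If a file on disk is truncated or otherwise damaged, a reader built on readUTF can fail in this way regardless of the JDK version in use.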


Cheers,

Chris

 

 

 

 

From: Tim Allison <ta...@apache.org>
Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>, "Allison, Timothy B (US 1760-Affiliate)" <ti...@jpl.nasa.gov>
Date: Wednesday, March 18, 2020 at 2:35 AM
To: "dev@tika.apache.org" <de...@tika.apache.org>
Subject: [EXTERNAL] Re: JDK 12 build issues

 

Haven’t tried...we should add java 12-14 to Jenkins.

 

Wait, are we up to 18 yet...

 

Will look into it...

 

On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann <ma...@apache.org> wrote:

 

Hey Tim et al.,

 

 

 

Do the tests fail for you with Java 12?

 

 

 

[INFO] Running org.apache.tika.parser.pkg.GzipParserTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.397 s - in org.apache.tika.parser.pkg.GzipParserTest
[INFO] Running org.apache.tika.TestXMLEntityExpansion
[WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.085 s - in org.apache.tika.TestXMLEntityExpansion
[INFO] Running org.apache.tika.mime.MimeTypeTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypeTest
[INFO] Running org.apache.tika.mime.MimeTypesTest
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypesTest
[INFO] Running org.apache.tika.mime.TestMimeTypes
[INFO] Tests run: 80, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.997 s - in org.apache.tika.mime.TestMimeTypes
[INFO] Running org.apache.tika.TestCorruptedFiles
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.001 s - in org.apache.tika.TestCorruptedFiles

 

[INFO]

 

[INFO] Results:

 

[INFO]

 

[ERROR] Failures:
[ERROR]   TesseractOCRParserTest.confirmMultiPageTiffHandling:290->TikaTest.assertContains:110 Page 2 not found in:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Exif Image:Page Number" content="1 2" />
<meta name="Exif IFD0:Strip Offsets" content="8 28680 46835 73454" />
<meta name="Exif IFD0:JPEG Tables" content="[289 values]" />
<meta name="Exif Image:Samples Per Pixel" content="3 samples/pixel" />
<meta name="Exif Image:Image Height" content="600 pixels" />
<meta name="tiff:ImageLength" content="600" />
<meta name="Exif Image:Compression" content="JPEG" />
<meta name="Exif Image:Y Resolution" content="96 dots per inch" />
<meta name="Exif IFD0:X Resolution" content="96 dots per inch" />
<meta name="tiff:ResolutionUnit" content="Inch" />
<meta name="Exif IFD0:Image Height" content="600 pixels" />
<meta name="Exif IFD0:Strip Byte Counts" content="28672 18155 26619 4002 bytes" />
<meta name="File Size" content="156867 bytes" />
<meta name="Exif IFD0:Image Width" content="800 pixels" />
<meta name="Exif Image:Photometric Interpretation" content="RGB" />
<meta name="Exif IFD0:Samples Per Pixel" content="3 samples/pixel" />
<meta name="Exif IFD0:Planar Configuration" content="Chunky (contiguous for each subsampling pixel)" />
<meta name="Exif IFD0:Rows Per Strip" content="160 rows/strip" />
<meta name="Exif Image:Image Width" content="800 pixels" />
<meta name="File Name" content="apache-tika-17704590698477286878.tmp" />
<meta name="Exif IFD0:Bits Per Sample" content="8 8 8 bits/component/pixel" />
<meta name="tiff:BitsPerSample" content="8" />
<meta name="Exif IFD0:Resolution Unit" content="Inch" />
<meta name="Content-Type" content="image/tiff" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.ocr.TesseractOCRParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.image.TiffParser" />
<meta name="Exif Image:Planar Configuration" content="Chunky (contiguous for each subsampling pixel)" />
<meta name="File Modified Date" content="Wed Mar 18 01:27:47 +00:00 2020" />
<meta name="tiff:XResolution" content="96.0" />
<meta name="tiff:SamplesPerPixel" content="3" />
<meta name="exif:PageCount" content="2" />
<meta name="Exif Image:Strip Byte Counts" content="28672 18155 27500 4002 bytes" />
<meta name="Exif IFD0:Orientation" content="Top, left side (Horizontal / normal)" />
<meta name="tiff:Orientation" content="1" />
<meta name="Exif IFD0:Compression" content="JPEG" />
<meta name="Exif IFD0:Page Number" content="0 1" />
<meta name="Exif Image:Rows Per Strip" content="160 rows/strip" />
<meta name="Exif Image:X Resolution" content="96 dots per inch" />
<meta name="Exif Image:Orientation" content="Top, left side (Horizontal / normal)" />
<meta name="Exif IFD0:Photometric Interpretation" content="RGB" />
<meta name="Exif Image:JPEG Tables" content="[289 values]" />
<meta name="tiff:ImageWidth" content="800" />
<meta name="tiff:YResolution" content="96.0" />
<meta name="Exif Image:Bits Per Sample" content="8 8 8 bits/component/pixel" />
<meta name="Exif Image:Strip Offsets" content="77997 106669 124824 152324" />
<meta name="Exif IFD0:Y Resolution" content="96 dots per inch" />
<meta name="Exif Image:Resolution Unit" content="Inch" />
<title></title>
</head>
<body><div class="ocr">Multipage
TIFF
Example
Page 1
</div>
</body></html>

 

[ERROR]   TesseractOCRParserTest.testOCROutputsHOCR:146->TikaTest.assertContains:110 Happy</span> not found in:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="pdf:docinfo:custom:AAPL:Keywords" content="" />
<meta name="pdf:PDFVersion" content="1.3" />
<meta name="pdf:docinfo:title" content="Presentation1" />
<meta name="xmp:CreatorTool" content="PowerPoint" />
<meta name="pdf:hasXFA" content="false" />
<meta name="access_permission:modify_annotations" content="true" />
<meta name="access_permission:can_print_degraded" content="true" />
<meta name="AAPL:Keywords" content="" />
<meta name="dc:creator" content="grantingersoll" />
<meta name="dcterms:created" content="2014-02-08T19:57:12Z" />
<meta name="dcterms:modified" content="2014-02-08T19:57:12Z" />
<meta name="Last-Modified" content="2014-02-08T19:57:12Z" />
<meta name="dc:format" content="application/pdf; version=1.3" />
<meta name="pdf:docinfo:creator_tool" content="PowerPoint" />
<meta name="access_permission:fill_in_form" content="true" />
<meta name="pdf:docinfo:keywords" content="" />
<meta name="pdf:docinfo:modified" content="2014-02-08T19:57:12Z" />
<meta name="meta:save-date" content="2014-02-08T19:57:12Z" />
<meta name="pdf:encrypted" content="false" />
<meta name="dc:title" content="Presentation1" />
<meta name="cp:subject" content="" />
<meta name="pdf:docinfo:subject" content="" />
<meta name="pdf:hasMarkedContent" content="false" />
<meta name="Content-Type" content="application/pdf" />
<meta name="pdf:docinfo:creator" content="grantingersoll" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.pdf.PDFParser" />
<meta name="meta:author" content="grantingersoll" />
<meta name="dc:subject" content="" />
<meta name="dc:subject" content="" />
<meta name="dc:subject" content="" />
<meta name="dc:subject" content="" />
<meta name="meta:creation-date" content="2014-02-08T19:57:12Z" />
<meta name="access_permission:extract_for_accessibility" content="true" />
<meta name="access_permission:assemble_document" content="true" />
<meta name="xmpTPg:NPages" content="1" />
<meta name="pdf:hasXMP" content="false" />
<meta name="access_permission:extract_content" content="true" />
<meta name="access_permission:can_print" content="true" />
<meta name="meta:keyword" content="" />
<meta name="access_permission:can_modify" content="true" />
<meta name="pdf:docinfo:producer" content="Mac OS X 10.9.1 Quartz PDFContext" />
<meta name="pdf:docinfo:created" content="2014-02-08T19:57:12Z" />
<title>Presentation1</title>
</head>
<body><div class="page"><p />
<img src="embedded:image0.png" alt="image0.png" /></div>
</body></html><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Transparency Alpha" content="none" />
<meta name="tiff:ImageLength" content="261" />
<meta name="Compression CompressionTypeName" content="deflate" />
<meta name="Data BitsPerSample" content="8 8 8" />
<meta name="Data PlanarConfiguration" content="PixelInterleaved" />
<meta name="Dimension VerticalPixelSize" content="0.35273367" />
<meta name="IHDR" content="width=934, height=261, bitDepth=8, colorType=RGB, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none" />
<meta name="embeddedResourceType" content="INLINE" />
<meta name="Chroma ColorSpaceType" content="RGB" />
<meta name="tiff:BitsPerSample" content="8 8 8" />
<meta name="Content-Type" content="image/png" />
<meta name="height" content="261" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.ocr.TesseractOCRParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.image.ImageParser" />
<meta name="pHYs" content="pixelsPerUnitXAxis=2835, pixelsPerUnitYAxis=2835, unitSpecifier=meter" />
<meta name="Dimension PixelAspectRatio" content="1.0" />
<meta name="resourceName" content="image0.png" />
<meta name="pdf:hasXMP" content="false" />
<meta name="Compression NumProgressiveScans" content="1" />
<meta name="Dimension HorizontalPixelSize" content="0.35273367" />
<meta name="Chroma BlackIsZero" content="true" />
<meta name="Compression Lossless" content="true" />
<meta name="X-TIKA:embedded_depth" content="1" />
<meta name="width" content="934" />
<meta name="Dimension ImageOrientation" content="Normal" />
<meta name="X-TIKA:embedded_resource_path" content="/image0.png" />
<meta name="tiff:ImageWidth" content="934" />
<meta name="Chroma NumChannels" content="3" />
<meta name="Data SampleFormat" content="UnsignedIntegral" />
<title></title>
</head>
<body><div class="ocr">

   <div class="ocr_page" id="page_1" title="image &quot;/var/folders/n5/1d_k3z4s2293q8ntx_n8sw54mm5n_8/T/apache-tika-1878472717677617651.tmp&quot;; bbox 0 0 934 261; ppageno 0">
    <div class="ocr_carea" id="block_1_1" title="bbox 21 34 465 66">
     <p class="ocr_par" id="par_1_1" lang="eng" title="bbox 21 34 465 66">
      <span class="ocr_line" id="line_1_1" title="bbox 21 34 465 66; baseline 0.005 -7; x_size 33; x_descenders 7; x_ascenders 8">
       <span class="ocrx_word" id="word_1_1" title="bbox 21 36 135 66; x_wconf 96"><strong>Happy</strong></span>
       <span class="ocrx_word" id="word_1_2" title="bbox 161 37 232 60; x_wconf 96"><strong>New</strong></span>
       <span class="ocrx_word" id="word_1_3" title="bbox 259 38 345 60; x_wconf 96"><strong>Year</strong></span>
       <span class="ocrx_word" id="word_1_4" title="bbox 375 34 465 61; x_wconf 96"><strong>2003!</strong></span>
      </span>
     </p>
    </div>
   </div>

</div>
</body></html>

 

[INFO]
[ERROR] Tests run: 1188, Failures: 2, Errors: 0, Skipped: 48
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Tika 2.0.0-SNAPSHOT:
[INFO]
[INFO] Apache Tika parent ................................. SUCCESS [ 8.822 s]
[INFO] Apache Tika core ................................... SUCCESS [ 39.589 s]
[INFO] Apache Tika parsers ................................ FAILURE [09:04 min]
[INFO] Apache Tika OSGi bundle ............................ SKIPPED
[INFO] Apache Tika XMP .................................... SKIPPED
[INFO] Apache Tika serialization .......................... SKIPPED
[INFO] Apache Tika batch .................................. SKIPPED
[INFO] Apache Tika language detection ..................... SKIPPED
[INFO] Apache Tika application ............................ SKIPPED
[INFO] Apache Tika translate .............................. SKIPPED
[INFO] Apache Tika server ................................. SKIPPED
[INFO] Apache Tika eval ................................... SKIPPED
[INFO] Apache Tika examples ............................... SKIPPED
[INFO] Apache Tika Java-7 Components ...................... SKIPPED
[INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED
[INFO] Apache Tika Natural Language Processing ............ SKIPPED
[INFO] Apache Tika ........................................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  09:57 min
[INFO] Finished at: 2020-03-17T18:31:10-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-parsers: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/mattmann/src/tika/tika-parsers/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tika-parsers

pomodoro:tika mattmann$ java -version
openjdk version "12.0.1" 2019-04-16
OpenJDK Runtime Environment (build 12.0.1+12)
OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
pomodoro:tika mattmann$

 

 

 

Any ideas?

 

 

 

Cheers,

 

Chris

 

 

 

 

 


Re: [EXTERNAL] Re: JDK 12 build issues

Posted by Tim Allison <ta...@apache.org>.
Oh, and welcome back, Chris!!!

On Wed, Mar 18, 2020 at 11:21 AM Tim Allison <ta...@apache.org> wrote:

> FWIW, I got a clean build on 11 and 13 just now.  We've been getting the
> malformed stream error in Jenkins quite a bit in the dl4j model.  I suspect
> this is just a problem with downloading the file, but I'm not sure...
>
> On Wed, Mar 18, 2020 at 10:57 AM Chris Mattmann <ma...@apache.org>
> wrote:
>
>> Thanks, Oleg. I was using OpenJDK 12 and 13, but I fixed it!
>>
>>
>>
>> I needed to delete the $HOME/.tika-dl folder. All good now!
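
A minimal sketch of that cleanup in Java, for anyone who wants to script it. The folder name $HOME/.tika-dl comes from the message above; that it only holds re-downloadable model files is an assumption based on the download comment earlier in the thread, so check the contents before running this. The class and method names (ClearTikaDlCache, deleteRecursively) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ClearTikaDlCache {

    /** Recursively deletes dir if it exists and returns the number of entries removed. */
    static int deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return 0;
        }
        try {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
            return paths.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to $HOME/.tika-dl; pass another directory as the first argument to override.
        Path cache = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"), ".tika-dl");
        System.out.println("Removed " + deleteRecursively(cache) + " entries under " + cache);
    }
}
```

After the folder is gone, the next test run has to fetch the model again, which would explain why DL4JVGG16NetTest takes minutes rather than seconds in the log below.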
>>
>>
>>
>> [WARNING] Invalid POM for commons-net:commons-net:jar:3.1, transitive
>> dependencies (if any) will not be available, enable debug logging for more
>> details
>>
>> [WARNING] Invalid POM for net.ericaro:neoitertools:jar:1.0.0, transitive
>> dependencies (if any) will not be available, enable debug logging for more
>> details
>>
>> [INFO]
>>
>> [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika-dl
>> ---
>>
>> [WARNING] Invalid project model for artifact
>> [commons-net:commons-net:3.1]. It will be ignored by the remote resources
>> Mojo.
>>
>> [WARNING] Invalid project model for artifact
>> [neoitertools:net.ericaro:1.0.0]. It will be ignored by the remote
>> resources Mojo.
>>
>> [INFO]
>>
>> [INFO] --- maven-resources-plugin:2.7:resources (default-resources) @
>> tika-dl ---
>>
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>>
>> [INFO] skip non existing resourceDirectory
>> /Users/mattmann/src/tika/tika-dl/src/main/resources
>>
>> [INFO] Copying 3 resources
>>
>> [INFO]
>>
>> [INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @
>> tika-dl ---
>>
>> [INFO] Changes detected - recompiling the module!
>>
>> [INFO] Compiling 2 source files to
>> /Users/mattmann/src/tika/tika-dl/target/classes
>>
>> [INFO]
>>
>> [INFO] --- maven-resources-plugin:2.7:testResources
>> (default-testResources) @ tika-dl ---
>>
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>>
>> [INFO] Copying 4 resources
>>
>> [INFO] Copying 3 resources
>>
>> [INFO]
>>
>> [INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile)
>> @ tika-dl ---
>>
>> [INFO] Changes detected - recompiling the module!
>>
>> [INFO] Compiling 2 source files to
>> /Users/mattmann/src/tika/tika-dl/target/test-classes
>>
>> [INFO]
>>
>> [INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl
>> ---
>>
>> [INFO]
>>
>> [INFO] -------------------------------------------------------
>>
>> [INFO]  T E S T S
>>
>> [INFO] -------------------------------------------------------
>>
>> [INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>>
>> log4j:WARN No appenders could be found for logger
>> (org.nd4j.linalg.factory.Nd4jBackend).
>>
>> log4j:WARN Please initialize the log4j system properly.
>>
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>> more info.
>>
>> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> 272.202 s - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>>
>> [INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>>
>> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> 44.616 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>>
>> [INFO]
>>
>> [INFO] Results:
>>
>> [INFO]
>>
>> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
>>
>> [INFO]
>>
>> [INFO]
>> ------------------------------------------------------------------------
>>
>> [INFO] BUILD SUCCESS
>>
>> [INFO]
>> ------------------------------------------------------------------------
>>
>> [INFO] Total time:  05:27 min
>>
>> [INFO] Finished at: 2020-03-18T07:51:56-07:00
>>
>> [INFO]
>> ------------------------------------------------------------------------
>>
>> pomodoro:tika-dl mattmann$
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> From: Oleg Tikhonov <ol...@apache.org>
>> Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>
>> Date: Wednesday, March 18, 2020 at 7:53 AM
>> To: "dev@tika.apache.org" <de...@tika.apache.org>
>> Subject: Re: [EXTERNAL] Re: JDK 12 build issues
>>
>>
>>
>> Hi Chris,
>>
>> I'm currently trying to build an env with Java 12/13 ... in order to try
>> your setup.
>>
>> What Java version are you using? OpenJDK or Oracle?
>>
>> Once upon a time there was a bug in OpenJDK:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8131146
>>
>> But it seems to be OK in recent releases.
>>
>>
>>
>> Will keep you updated.
>>
>> Cheers,
>>
>> Oleg
>>
>>
>>
>>
>>
>> On Wed, Mar 18, 2020 at 4:35 PM Chris Mattmann <ma...@apache.org>
>> wrote:
>>
>>
>>
>> So I was able to get past my issues with Tesseract by reinstalling the
>>
>> latest version with Brew.
>>
>>
>>
>>
>>
>>
>>
>> I have a new issue!
>>
>>
>>
>> I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:
>>
>>
>>
>>
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @
>>
>> tika-dl ---
>>
>>
>>
>> [INFO] Changes detected - recompiling the module!
>>
>>
>>
>> [INFO] Compiling 2 source files to
>>
>> /Users/mattmann/src/tika/tika-dl/target/test-classes
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl
>> ---
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [INFO] -------------------------------------------------------
>>
>>
>>
>> [INFO]  T E S T S
>>
>>
>>
>> [INFO] -------------------------------------------------------
>>
>>
>>
>> [INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>>
>>
>>
>> log4j:WARN No appenders could be found for logger
>>
>> (org.nd4j.linalg.factory.Nd4jBackend).
>>
>>
>>
>> log4j:WARN Please initialize the log4j system properly.
>>
>>
>>
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>>
>> more info.
>>
>>
>>
>> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>>
>> 3.38 s <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>>
>>
>>
>> [ERROR] org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise  Time
>>
>> elapsed: 3.29 s  <<< ERROR!
>>
>>
>>
>> org.apache.tika.exception.TikaConfigException: java.io.UTFDataFormatException: malformed input around byte 11
>>
>>
>>
>>         at
>>
>>
>> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>>
>>
>>
>> Caused by: java.lang.RuntimeException: java.io.UTFDataFormatException:
>>
>> malformed input around byte 11
>>
>>
>>
>>         at
>>
>>
>> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>>
>>
>>
>> Caused by: java.io.UTFDataFormatException: malformed input around byte 11
>>
>>
>>
>>         at
>>
>>
>> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>>
>>
>>
>> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>>
>> 5.392 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [INFO] Results:
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [ERROR] Errors:
>>
>>
>>
>> [ERROR]   DL4JVGG16NetTest.recognise:36 » TikaConfig java.io.UTFDataFormatException: mal...
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
>>
>>
>>
>> [INFO]
>>
>>
>>
>> [INFO]
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> [INFO] BUILD FAILURE
>>
>>
>>
>> [INFO]
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> [INFO] Total time:  25.628 s
>>
>>
>>
>> [INFO] Finished at: 2020-03-18T07:34:08-07:00
>>
>>
>>
>> [INFO]
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> [ERROR] Failed to execute goal
>>
>> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test
>> (default-test)
>>
>> on project tika-dl: There are test failures.
>>
>>
>>
>> [ERROR]
>>
>>
>>
>> [ERROR] Please refer to
>>
>> /Users/mattmann/src/tika/tika-dl/target/surefire-reports for the
>> individual
>>
>> test results.
>>
>>
>>
>> [ERROR] Please refer to dump files (if any exist) [date].dump,
>>
>> [date]-jvmRun[N].dump and [date].dumpstream.
>>
>>
>>
>> [ERROR] -> [Help 1]
>>
>>
>>
>> [ERROR]
>>
>>
>>
>> [ERROR] To see the full stack trace of the errors, re-run Maven with the
>>
>> -e switch.
>>
>>
>>
>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>
>>
>>
>> [ERROR]
>>
>>
>>
>> [ERROR] For more information about the errors and possible solutions,
>>
>> please read the following articles:
>>
>>
>>
>> [ERROR] [Help 1]
>>
>> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>>
>>
>>
>> pomodoro:tika-dl mattmann$
>>
>>
>>
>>
>>
>>
>>
>> Thamme, do you have any ideas what is going on here?
>>
>>
>>
>>
>>
>> Cheers,
>>
>>
>>
>> Chris
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> From: Tim Allison <ta...@apache.org>
>>
>> Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>, "Allison, Timothy
>>
>> B (US 1760-Affiliate)" <ti...@jpl.nasa.gov>
>>
>> Date: Wednesday, March 18, 2020 at 2:35 AM
>>
>> To: "dev@tika.apache.org" <de...@tika.apache.org>
>>
>> Subject: [EXTERNAL] Re: JDK 12 build issues
>>
>>
>>
>>
>>
>>
>>
>> Haven’t tried...we should add java 12-14 to Jenkins.
>>
>>
>>
>>
>>
>>
>>
>> Wait, are we up to 18 yet...
>>
>>
>>
>>
>>
>>
>>
>> Will look into it...
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann <ma...@apache.org>
>>
>> wrote:
>>
>>
>>
>>
>>
>>
>>
>> Hey Tim et al.,
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Do the tests fail for you with Java 12?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.parser.pkg.GzipParserTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>>
>>
>>
>> 0.397 s - in org.apache.tika.parser.pkg.GzipParserTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.TestXMLEntityExpansion
>>
>>
>>
>>
>>
>>
>>
>> [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
>>
>>
>>
>> 0.085 s - in org.apache.tika.TestXMLEntityExpansion
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.mime.MimeTypeTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>>
>>
>>
>> 0.001 s - in org.apache.tika.mime.MimeTypeTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.mime.MimeTypesTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>>
>>
>>
>> 0.001 s - in org.apache.tika.mime.MimeTypesTest
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.mime.TestMimeTypes
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Tests run: 80, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>>
>>
>>
>> 8.997 s - in org.apache.tika.mime.TestMimeTypes
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Running org.apache.tika.TestCorruptedFiles
>>
>>
>>
>>
>>
>>
>>
>> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
>>
>>
>>
>> 0.001 s - in org.apache.tika.TestCorruptedFiles
>>
>>
>>
>>
>>
>>
>>
>> [INFO]
>>
>>
>>
>>
>>
>>
>>
>> [INFO] Results:
>>
>>
>>
>>
>>
>>
>>
>> [INFO]
>>
>>
>>
>>
>>
>>
>>
>> [ERROR] Failures:
>>
>>
>>
>>
>>
>>
>>
>> [ERROR]
>>
>>
>>
>>
>>
>>
>> TesseractOCRParserTest.confirmMultiPageTiffHandling:290->TikaTest.assertContains:110
>>
>>
>>
>> Page 2 not found in:
>>
>>
>>
>>
>>
>>
>>
>> <html xmlns="http://www.w3.org/1999/xhtml">
>>
>>
>>
>>
>>
>>
>>
>> <head>
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Page Number" content="1 2" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Strip Offsets" content="8 28680 46835 73454" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:JPEG Tables" content="[289 values]" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Samples Per Pixel" content="3 samples/pixel" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Image Height" content="600 pixels" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:ImageLength" content="600" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Compression" content="JPEG" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Y Resolution" content="96 dots per inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:X Resolution" content="96 dots per inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:ResolutionUnit" content="Inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Image Height" content="600 pixels" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Strip Byte Counts" content="28672 18155 26619 4002
>>
>>
>>
>> bytes" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="File Size" content="156867 bytes" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Image Width" content="800 pixels" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Photometric Interpretation" content="RGB" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Samples Per Pixel" content="3 samples/pixel" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Planar Configuration" content="Chunky (contiguous
>>
>>
>>
>> for each subsampling pixel)" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Rows Per Strip" content="160 rows/strip" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Image Width" content="800 pixels" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="File Name" content="apache-tika-17704590698477286878.tmp" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Bits Per Sample" content="8 8 8
>>
>>
>>
>> bits/component/pixel" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:BitsPerSample" content="8" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Resolution Unit" content="Inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Content-Type" content="image/tiff" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By"
>>
>>
>>
>> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By" content="org.apache.tika.parser.image.TiffParser"
>>
>>
>>
>> />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Planar Configuration" content="Chunky (contiguous
>>
>>
>>
>> for each subsampling pixel)" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="File Modified Date" content="Wed Mar 18 01:27:47 +00:00 2020"
>>
>>
>>
>> />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:XResolution" content="96.0" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:SamplesPerPixel" content="3" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="exif:PageCount" content="2" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Strip Byte Counts" content="28672 18155 27500 4002
>>
>>
>>
>> bytes" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Orientation" content="Top, left side (Horizontal /
>>
>>
>>
>> normal)" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:Orientation" content="1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Compression" content="JPEG" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Page Number" content="0 1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Rows Per Strip" content="160 rows/strip" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:X Resolution" content="96 dots per inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Orientation" content="Top, left side (Horizontal /
>>
>>
>>
>> normal)" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Photometric Interpretation" content="RGB" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:JPEG Tables" content="[289 values]" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:ImageWidth" content="800" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:YResolution" content="96.0" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Bits Per Sample" content="8 8 8
>>
>>
>>
>> bits/component/pixel" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Strip Offsets" content="77997 106669 124824 152324"
>>
>>
>>
>> />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif IFD0:Y Resolution" content="96 dots per inch" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Exif Image:Resolution Unit" content="Inch" />
>>
>>
>>
>>
>>
>>
>>
>> <title></title>
>>
>>
>>
>>
>>
>>
>>
>> </head>
>>
>>
>>
>>
>>
>>
>>
>> <body><div class="ocr">Multipage
>>
>>
>>
>>
>>
>>
>>
>> TIFF
>>
>>
>>
>>
>>
>>
>>
>> Example
>>
>>
>>
>>
>>
>>
>>
>> Page 1
>>
>>
>>
>>
>>
>>
>>
>> </div>
>>
>>
>>
>>
>>
>>
>>
>> </body></html>
>>
>>
>>
>>
>>
>>
>>
>> [ERROR] TesseractOCRParserTest.testOCROutputsHOCR:146->TikaTest.assertContains:110 Happy</span> not found in:
>>
>>
>>
>>
>>
>>
>>
>> <html xmlns="http://www.w3.org/1999/xhtml">
>>
>>
>>
>>
>>
>>
>>
>> <head>
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:custom:AAPL:Keywords" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:PDFVersion" content="1.3" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:title" content="Presentation1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="xmp:CreatorTool" content="PowerPoint" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:hasXFA" content="false" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:modify_annotations" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:can_print_degraded" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="AAPL:Keywords" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:creator" content="grantingersoll" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dcterms:created" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dcterms:modified" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Last-Modified" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:format" content="application/pdf; version=1.3" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:creator_tool" content="PowerPoint" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:fill_in_form" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:keywords" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:modified" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="meta:save-date" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:encrypted" content="false" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:title" content="Presentation1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="cp:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:hasMarkedContent" content="false" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Content-Type" content="application/pdf" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:creator" content="grantingersoll" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By" content="org.apache.tika.parser.pdf.PDFParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="meta:author" content="grantingersoll" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="dc:subject" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="meta:creation-date" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:extract_for_accessibility" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:assemble_document" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="xmpTPg:NPages" content="1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:hasXMP" content="false" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:extract_content" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:can_print" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="meta:keyword" content="" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="access_permission:can_modify" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:producer" content="Mac OS X 10.9.1 Quartz
>>
>>
>>
>> PDFContext" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:docinfo:created" content="2014-02-08T19:57:12Z" />
>>
>>
>>
>>
>>
>>
>>
>> <title>Presentation1</title>
>>
>>
>>
>>
>>
>>
>>
>> </head>
>>
>>
>>
>>
>>
>>
>>
>> <body><div class="page"><p />
>>
>>
>>
>>
>>
>>
>>
>> <img src="embedded:image0.png" alt="image0.png" /></div>
>>
>>
>>
>>
>>
>>
>>
>> </body></html><html xmlns="http://www.w3.org/1999/xhtml">
>>
>>
>>
>>
>>
>>
>>
>> <head>
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Transparency Alpha" content="none" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:ImageLength" content="261" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Compression CompressionTypeName" content="deflate" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Data BitsPerSample" content="8 8 8" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Data PlanarConfiguration" content="PixelInterleaved" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Dimension VerticalPixelSize" content="0.35273367" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="IHDR" content="width=934, height=261, bitDepth=8,
>>
>>
>>
>> colorType=RGB, compressionMethod=deflate, filterMethod=adaptive,
>>
>>
>>
>> interlaceMethod=none" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="embeddedResourceType" content="INLINE" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Chroma ColorSpaceType" content="RGB" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:BitsPerSample" content="8 8 8" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Content-Type" content="image/png" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="height" content="261" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By"
>>
>>
>>
>> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-Parsed-By"
>>
>>
>>
>> content="org.apache.tika.parser.image.ImageParser" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pHYs" content="pixelsPerUnitXAxis=2835,
>>
>>
>>
>> pixelsPerUnitYAxis=2835, unitSpecifier=meter" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Dimension PixelAspectRatio" content="1.0" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="resourceName" content="image0.png" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="pdf:hasXMP" content="false" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Compression NumProgressiveScans" content="1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Dimension HorizontalPixelSize" content="0.35273367" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Chroma BlackIsZero" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Compression Lossless" content="true" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-TIKA:embedded_depth" content="1" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="width" content="934" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Dimension ImageOrientation" content="Normal" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="X-TIKA:embedded_resource_path" content="/image0.png" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="tiff:ImageWidth" content="934" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Chroma NumChannels" content="3" />
>>
>>
>>
>>
>>
>>
>>
>> <meta name="Data SampleFormat" content="UnsignedIntegral" />
>>
>>
>>
>>
>>
>>
>>
>> <title></title>
>>
>>
>>
>>
>>
>>
>>
>> </head>
>>
>>
>>
>>
>>
>>
>>
>> <body><div class="ocr">
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>     <div class="ocr_page" id="page_1" title="image &quot;/var/folders/n5/1d_k3z4s2293q8ntx_n8sw54mm5n_8/T/apache-tika-1878472717677617651.tmp&quot;; bbox 0 0 934 261; ppageno 0">
>>      <div class="ocr_carea" id="block_1_1" title="bbox 21 34 465 66">
>>       <p class="ocr_par" id="par_1_1" lang="eng" title="bbox 21 34 465 66">
>>        <span class="ocr_line" id="line_1_1" title="bbox 21 34 465 66; baseline 0.005 -7; x_size 33; x_descenders 7; x_ascenders 8">
>>         <span class="ocrx_word" id="word_1_1" title="bbox 21 36 135 66; x_wconf 96"><strong>Happy</strong></span>
>>         <span class="ocrx_word" id="word_1_2" title="bbox 161 37 232 60; x_wconf 96"><strong>New</strong></span>
>>         <span class="ocrx_word" id="word_1_3" title="bbox 259 38 345 60; x_wconf 96"><strong>Year</strong></span>
>>         <span class="ocrx_word" id="word_1_4" title="bbox 375 34 465 61; x_wconf 96"><strong>2003!</strong></span>
>>        </span>
>>       </p>
>>      </div>
>>     </div>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> </div>
>>
>>
>>
>>
>>
>>
>>
>> </body></html>
>>
>>
>>
>>
>>
>>
>>
>> [INFO]
>> [ERROR] Tests run: 1188, Failures: 2, Errors: 0, Skipped: 48
>> [INFO]
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Reactor Summary for Apache Tika 2.0.0-SNAPSHOT:
>> [INFO]
>> [INFO] Apache Tika parent ................................. SUCCESS [ 8.822 s]
>> [INFO] Apache Tika core ................................... SUCCESS [ 39.589 s]
>> [INFO] Apache Tika parsers ................................ FAILURE [09:04 min]
>> [INFO] Apache Tika OSGi bundle ............................ SKIPPED
>> [INFO] Apache Tika XMP .................................... SKIPPED
>> [INFO] Apache Tika serialization .......................... SKIPPED
>> [INFO] Apache Tika batch .................................. SKIPPED
>> [INFO] Apache Tika language detection ..................... SKIPPED
>> [INFO] Apache Tika application ............................ SKIPPED
>> [INFO] Apache Tika translate .............................. SKIPPED
>> [INFO] Apache Tika server ................................. SKIPPED
>> [INFO] Apache Tika eval ................................... SKIPPED
>> [INFO] Apache Tika examples ............................... SKIPPED
>> [INFO] Apache Tika Java-7 Components ...................... SKIPPED
>> [INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED
>> [INFO] Apache Tika Natural Language Processing ............ SKIPPED
>> [INFO] Apache Tika ........................................ SKIPPED
>> [INFO] ------------------------------------------------------------------------
>> [INFO] BUILD FAILURE
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Total time:  09:57 min
>> [INFO] Finished at: 2020-03-17T18:31:10-07:00
>> [INFO] ------------------------------------------------------------------------
>>
>>
>>
>>
>>
>>
>>
>> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-parsers: There are test failures.
>> [ERROR]
>> [ERROR] Please refer to /Users/mattmann/src/tika/tika-parsers/target/surefire-reports for the individual test results.
>> [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
>> [ERROR] -> [Help 1]
>> [ERROR]
>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> [ERROR]
>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>> [ERROR]
>> [ERROR] After correcting the problems, you can resume the build with the command
>> [ERROR]   mvn <goals> -rf :tika-parsers
>>
>>
>>
>>
>>
>>
>>
>> pomodoro:tika mattmann$ java -version
>> openjdk version "12.0.1" 2019-04-16
>> OpenJDK Runtime Environment (build 12.0.1+12)
>> OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
>> pomodoro:tika mattmann$
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Any ideas?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Cheers,
>>
>>
>>
>>
>>
>>
>>
>> Chris
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>

Re: [EXTERNAL] Re: JDK 12 build issues

Posted by Tim Allison <ta...@apache.org>.
FWIW, I got a clean build on JDK 11 and 13 just now.  We've been getting the
malformed stream error in Jenkins quite a bit with the dl4j model.  I suspect
this is just a problem with downloading the file, but I'm not sure...
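
For anyone hitting the same error locally: the quoted message below reports that deleting the $HOME/.tika-dl folder cleared it, which fits a bad download left on disk. Here is a minimal Java sketch of that cleanup, assuming the cached model files live under the directory you pass in. The class and method names are hypothetical and are not part of Tika.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Tika: removes a cache directory so that
// the next test run has to fetch the model files again.
public class ModelCacheReset {

    /** Deletes dir and everything under it; returns false if dir did not exist. */
    static boolean deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // The path is taken from the command line so nothing is deleted by accident.
        if (args.length != 1) {
            System.err.println("Usage: java ModelCacheReset <cache-dir>");
            return;
        }
        Path cache = Paths.get(args[0]);
        System.out.println(deleteRecursively(cache)
                ? "Removed " + cache
                : "Nothing to remove at " + cache);
    }
}
```

Run it with the cache path as the only argument (the thread mentions $HOME/.tika-dl), then re-run the tika-dl tests so the model is downloaded afresh.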

On Wed, Mar 18, 2020 at 10:57 AM Chris Mattmann <ma...@apache.org> wrote:

> Thanks Oleg I was using OpenJDK 12 and 13, but I fixed it!
>
>
>
> I needed to delete the $HOME/.tika-dl folder. All good now!
>
>
>
> [WARNING] Invalid POM for commons-net:commons-net:jar:3.1, transitive
> dependencies (if any) will not be available, enable debug logging for more
> details
>
> [WARNING] Invalid POM for net.ericaro:neoitertools:jar:1.0.0, transitive
> dependencies (if any) will not be available, enable debug logging for more
> details
>
> [INFO]
>
> [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika-dl
> ---
>
> [WARNING] Invalid project model for artifact
> [commons-net:commons-net:3.1]. It will be ignored by the remote resources
> Mojo.
>
> [WARNING] Invalid project model for artifact
> [neoitertools:net.ericaro:1.0.0]. It will be ignored by the remote
> resources Mojo.
>
> [INFO]
>
> [INFO] --- maven-resources-plugin:2.7:resources (default-resources) @
> tika-dl ---
>
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>
> [INFO] skip non existing resourceDirectory
> /Users/mattmann/src/tika/tika-dl/src/main/resources
>
> [INFO] Copying 3 resources
>
> [INFO]
>
> [INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ tika-dl
> ---
>
> [INFO] Changes detected - recompiling the module!
>
> [INFO] Compiling 2 source files to
> /Users/mattmann/src/tika/tika-dl/target/classes
>
> [INFO]
>
> [INFO] --- maven-resources-plugin:2.7:testResources
> (default-testResources) @ tika-dl ---
>
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>
> [INFO] Copying 4 resources
>
> [INFO] Copying 3 resources
>
> [INFO]
>
> [INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @
> tika-dl ---
>
> [INFO] Changes detected - recompiling the module!
>
> [INFO] Compiling 2 source files to
> /Users/mattmann/src/tika/tika-dl/target/test-classes
>
> [INFO]
>
> [INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
>
> [INFO]
>
> [INFO] -------------------------------------------------------
>
> [INFO]  T E S T S
>
> [INFO] -------------------------------------------------------
>
> [INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>
> log4j:WARN No appenders could be found for logger
> (org.nd4j.linalg.factory.Nd4jBackend).
>
> log4j:WARN Please initialize the log4j system properly.
>
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
>
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 272.202 s - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>
> [INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 44.616 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>
> [INFO]
>
> [INFO] Results:
>
> [INFO]
>
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
>
> [INFO]
>
> [INFO]
> ------------------------------------------------------------------------
>
> [INFO] BUILD SUCCESS
>
> [INFO]
> ------------------------------------------------------------------------
>
> [INFO] Total time:  05:27 min
>
> [INFO] Finished at: 2020-03-18T07:51:56-07:00
>
> [INFO]
> ------------------------------------------------------------------------
>
> pomodoro:tika-dl mattmann$
>
>
>
>
>
>
>
>
>
> From: Oleg Tikhonov <ol...@apache.org>
> Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>
> Date: Wednesday, March 18, 2020 at 7:53 AM
> To: "dev@tika.apache.org" <de...@tika.apache.org>
> Subject: Re: [EXTERNAL] Re: JDK 12 build issues
>
>
>
> Hi Chris,
>
> I'm currently trying to set up an environment with Java 12/13 in order to
> try your setup.
>
> Which Java version are you using: OpenJDK or Oracle?
>
> Once upon a time there was a bug in OpenJDK:
>
> https://bugs.openjdk.java.net/browse/JDK-8131146
>
> But it seems to be fixed in recent releases.
>
> I'll keep you updated.
>
> Cheers,
>
> Oleg
>
>
>
>
>
> On Wed, Mar 18, 2020 at 4:35 PM Chris Mattmann <ma...@apache.org>
> wrote:
>
>
>
> So I was able to get past my issues with Tesseract by reinstalling the
>
> latest version with Brew.
>
>
>
>
>
>
>
> I have a new issue!
>
>
>
> I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:
>
>
>
>
>
>
>
> [INFO]
> [INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ tika-dl ---
> [INFO] Changes detected - recompiling the module!
> [INFO] Compiling 2 source files to /Users/mattmann/src/tika/tika-dl/target/test-classes
> [INFO]
> [INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
> [INFO]
> [INFO] -------------------------------------------------------
> [INFO]  T E S T S
> [INFO] -------------------------------------------------------
> [INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
> log4j:WARN No appenders could be found for logger (org.nd4j.linalg.factory.Nd4jBackend).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.38 s <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
> [ERROR] org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise  Time elapsed: 3.29 s  <<< ERROR!
> org.apache.tika.exception.TikaConfigException: java.io.UTFDataFormatException: malformed input around byte 11
>         at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
> Caused by: java.lang.RuntimeException: java.io.UTFDataFormatException: malformed input around byte 11
>         at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
> Caused by: java.io.UTFDataFormatException: malformed input around byte 11
>         at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>
> [INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.392 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Errors:
> [ERROR]   DL4JVGG16NetTest.recognise:36 » TikaConfig java.io.UTFDataFormatException: mal...
> [INFO]
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
> [INFO]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time:  25.628 s
> [INFO] Finished at: 2020-03-18T07:34:08-07:00
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-dl: There are test failures.
> [ERROR]
> [ERROR] Please refer to /Users/mattmann/src/tika/tika-dl/target/surefire-reports for the individual test results.
> [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>
> pomodoro:tika-dl mattmann$
>
>
>
>
>
>
>
> Thamme, do you have any ideas what is going on here?
>
>
>
>
>
> Cheers,
>
>
>
> Chris
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> From: Tim Allison <ta...@apache.org>
>
> Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>, "Allison, Timothy
>
> B (US 1760-Affiliate)" <ti...@jpl.nasa.gov>
>
> Date: Wednesday, March 18, 2020 at 2:35 AM
>
> To: "dev@tika.apache.org" <de...@tika.apache.org>
>
> Subject: [EXTERNAL] Re: JDK 12 build issues
>
>
>
>
>
>
>
> Haven’t tried...we should add java 12-14 to Jenkins.
>
>
>
>
>
>
>
> Wait, are we up to 18 yet...
>
>
>
>
>
>
>
> Will look into it...
>
>
>
>
>
>
>
> On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann <ma...@apache.org>
>
> wrote:
>
>
>
>
>
>
>
> Hey Tim et al.,
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Do the tests fail for you with Java 12?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> [INFO] Running org.apache.tika.parser.pkg.GzipParserTest
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.397 s - in org.apache.tika.parser.pkg.GzipParserTest
> [INFO] Running org.apache.tika.TestXMLEntityExpansion
> [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.085 s - in org.apache.tika.TestXMLEntityExpansion
> [INFO] Running org.apache.tika.mime.MimeTypeTest
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypeTest
> [INFO] Running org.apache.tika.mime.MimeTypesTest
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypesTest
> [INFO] Running org.apache.tika.mime.TestMimeTypes
> [INFO] Tests run: 80, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.997 s - in org.apache.tika.mime.TestMimeTypes
> [INFO] Running org.apache.tika.TestCorruptedFiles
> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.001 s - in org.apache.tika.TestCorruptedFiles
>
>
>
>
>
>
>
> [INFO]
>
>
>
>
>
>
>
> [INFO] Results:
>
>
>
>
>
>
>
> [INFO]
>
>
>
>
>
>
>
> [ERROR] Failures:
>
>
>
>
>
>
>
> [ERROR] TesseractOCRParserTest.confirmMultiPageTiffHandling:290->TikaTest.assertContains:110 Page 2 not found in:
>
>
>
>
>
>
>
> <html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
>
>
>
>
> <head>
>
>
>
>
>
>
>
> <meta name="Exif Image:Page Number" content="1 2" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Strip Offsets" content="8 28680 46835 73454" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:JPEG Tables" content="[289 values]" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Samples Per Pixel" content="3 samples/pixel" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Image Height" content="600 pixels" />
>
>
>
>
>
>
>
> <meta name="tiff:ImageLength" content="600" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Compression" content="JPEG" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Y Resolution" content="96 dots per inch" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:X Resolution" content="96 dots per inch" />
>
>
>
>
>
>
>
> <meta name="tiff:ResolutionUnit" content="Inch" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Image Height" content="600 pixels" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Strip Byte Counts" content="28672 18155 26619 4002
>
>
>
> bytes" />
>
>
>
>
>
>
>
> <meta name="File Size" content="156867 bytes" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Image Width" content="800 pixels" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Photometric Interpretation" content="RGB" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Samples Per Pixel" content="3 samples/pixel" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Planar Configuration" content="Chunky (contiguous
>
>
>
> for each subsampling pixel)" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Rows Per Strip" content="160 rows/strip" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Image Width" content="800 pixels" />
>
>
>
>
>
>
>
> <meta name="File Name" content="apache-tika-17704590698477286878.tmp" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Bits Per Sample" content="8 8 8
>
>
>
> bits/component/pixel" />
>
>
>
>
>
>
>
> <meta name="tiff:BitsPerSample" content="8" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Resolution Unit" content="Inch" />
>
>
>
>
>
>
>
> <meta name="Content-Type" content="image/tiff" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By"
>
>
>
> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.image.TiffParser"
>
>
>
> />
>
>
>
>
>
>
>
> <meta name="Exif Image:Planar Configuration" content="Chunky (contiguous
>
>
>
> for each subsampling pixel)" />
>
>
>
>
>
>
>
> <meta name="File Modified Date" content="Wed Mar 18 01:27:47 +00:00 2020"
>
>
>
> />
>
>
>
>
>
>
>
> <meta name="tiff:XResolution" content="96.0" />
>
>
>
>
>
>
>
> <meta name="tiff:SamplesPerPixel" content="3" />
>
>
>
>
>
>
>
> <meta name="exif:PageCount" content="2" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Strip Byte Counts" content="28672 18155 27500 4002
>
>
>
> bytes" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Orientation" content="Top, left side (Horizontal /
>
>
>
> normal)" />
>
>
>
>
>
>
>
> <meta name="tiff:Orientation" content="1" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Compression" content="JPEG" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Page Number" content="0 1" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Rows Per Strip" content="160 rows/strip" />
>
>
>
>
>
>
>
> <meta name="Exif Image:X Resolution" content="96 dots per inch" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Orientation" content="Top, left side (Horizontal /
>
>
>
> normal)" />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Photometric Interpretation" content="RGB" />
>
>
>
>
>
>
>
> <meta name="Exif Image:JPEG Tables" content="[289 values]" />
>
>
>
>
>
>
>
> <meta name="tiff:ImageWidth" content="800" />
>
>
>
>
>
>
>
> <meta name="tiff:YResolution" content="96.0" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Bits Per Sample" content="8 8 8
>
>
>
> bits/component/pixel" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Strip Offsets" content="77997 106669 124824 152324"
>
>
>
> />
>
>
>
>
>
>
>
> <meta name="Exif IFD0:Y Resolution" content="96 dots per inch" />
>
>
>
>
>
>
>
> <meta name="Exif Image:Resolution Unit" content="Inch" />
>
>
>
>
>
>
>
> <title></title>
>
>
>
>
>
>
>
> </head>
>
>
>
>
>
>
>
> <body><div class="ocr">Multipage
>
>
>
>
>
>
>
> TIFF
>
>
>
>
>
>
>
> Example
>
>
>
>
>
>
>
> Page 1
>
>
>
>
>
>
>
> </div>
>
>
>
>
>
>
>
> </body></html>
>
>
>
>
>
>
>
> [ERROR] TesseractOCRParserTest.testOCROutputsHOCR:146->TikaTest.assertContains:110 Happy</span> not found in:
>
>
>
>
>
>
>
> <html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
>
>
>
>
> <head>
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:custom:AAPL:Keywords" content="" />
>
>
>
>
>
>
>
> <meta name="pdf:PDFVersion" content="1.3" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:title" content="Presentation1" />
>
>
>
>
>
>
>
> <meta name="xmp:CreatorTool" content="PowerPoint" />
>
>
>
>
>
>
>
> <meta name="pdf:hasXFA" content="false" />
>
>
>
>
>
>
>
> <meta name="access_permission:modify_annotations" content="true" />
>
>
>
>
>
>
>
> <meta name="access_permission:can_print_degraded" content="true" />
>
>
>
>
>
>
>
> <meta name="AAPL:Keywords" content="" />
>
>
>
>
>
>
>
> <meta name="dc:creator" content="grantingersoll" />
>
>
>
>
>
>
>
> <meta name="dcterms:created" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="dcterms:modified" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="Last-Modified" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="dc:format" content="application/pdf; version=1.3" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:creator_tool" content="PowerPoint" />
>
>
>
>
>
>
>
> <meta name="access_permission:fill_in_form" content="true" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:keywords" content="" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:modified" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="meta:save-date" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="pdf:encrypted" content="false" />
>
>
>
>
>
>
>
> <meta name="dc:title" content="Presentation1" />
>
>
>
>
>
>
>
> <meta name="cp:subject" content="" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:subject" content="" />
>
>
>
>
>
>
>
> <meta name="pdf:hasMarkedContent" content="false" />
>
>
>
>
>
>
>
> <meta name="Content-Type" content="application/pdf" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:creator" content="grantingersoll" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pdf.PDFParser" />
>
>
>
>
>
>
>
> <meta name="meta:author" content="grantingersoll" />
>
>
>
>
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
>
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
>
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
>
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
>
>
>
>
> <meta name="meta:creation-date" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <meta name="access_permission:extract_for_accessibility" content="true" />
>
>
>
>
>
>
>
> <meta name="access_permission:assemble_document" content="true" />
>
>
>
>
>
>
>
> <meta name="xmpTPg:NPages" content="1" />
>
>
>
>
>
>
>
> <meta name="pdf:hasXMP" content="false" />
>
>
>
>
>
>
>
> <meta name="access_permission:extract_content" content="true" />
>
>
>
>
>
>
>
> <meta name="access_permission:can_print" content="true" />
>
>
>
>
>
>
>
> <meta name="meta:keyword" content="" />
>
>
>
>
>
>
>
> <meta name="access_permission:can_modify" content="true" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:producer" content="Mac OS X 10.9.1 Quartz
>
>
>
> PDFContext" />
>
>
>
>
>
>
>
> <meta name="pdf:docinfo:created" content="2014-02-08T19:57:12Z" />
>
>
>
>
>
>
>
> <title>Presentation1</title>
>
>
>
>
>
>
>
> </head>
>
>
>
>
>
>
>
> <body><div class="page"><p />
>
>
>
>
>
>
>
> <img src="embedded:image0.png" alt="image0.png" /></div>
>
>
>
>
>
>
>
> </body></html><html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
>
>
>
>
> <head>
>
>
>
>
>
>
>
> <meta name="Transparency Alpha" content="none" />
>
>
>
>
>
>
>
> <meta name="tiff:ImageLength" content="261" />
>
>
>
>
>
>
>
> <meta name="Compression CompressionTypeName" content="deflate" />
>
>
>
>
>
>
>
> <meta name="Data BitsPerSample" content="8 8 8" />
>
>
>
>
>
>
>
> <meta name="Data PlanarConfiguration" content="PixelInterleaved" />
>
>
>
>
>
>
>
> <meta name="Dimension VerticalPixelSize" content="0.35273367" />
>
>
>
>
>
>
>
> <meta name="IHDR" content="width=934, height=261, bitDepth=8,
>
>
>
> colorType=RGB, compressionMethod=deflate, filterMethod=adaptive,
>
>
>
> interlaceMethod=none" />
>
>
>
>
>
>
>
> <meta name="embeddedResourceType" content="INLINE" />
>
>
>
>
>
>
>
> <meta name="Chroma ColorSpaceType" content="RGB" />
>
>
>
>
>
>
>
> <meta name="tiff:BitsPerSample" content="8 8 8" />
>
>
>
>
>
>
>
> <meta name="Content-Type" content="image/png" />
>
>
>
>
>
>
>
> <meta name="height" content="261" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By"
>
>
>
> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>
>
>
>
>
>
>
> <meta name="X-Parsed-By"
>
>
>
> content="org.apache.tika.parser.image.ImageParser" />
>
>
>
>
>
>
>
> <meta name="pHYs" content="pixelsPerUnitXAxis=2835,
>
>
>
> pixelsPerUnitYAxis=2835, unitSpecifier=meter" />
>
>
>
>
>
>
>
> <meta name="Dimension PixelAspectRatio" content="1.0" />
>
>
>
>
>
>
>
> <meta name="resourceName" content="image0.png" />
>
>
>
>
>
>
>
> <meta name="pdf:hasXMP" content="false" />
>
>
>
>
>
>
>
> <meta name="Compression NumProgressiveScans" content="1" />
>
>
>
>
>
>
>
> <meta name="Dimension HorizontalPixelSize" content="0.35273367" />
>
>
>
>
>
>
>
> <meta name="Chroma BlackIsZero" content="true" />
>
>
>
>
>
>
>
> <meta name="Compression Lossless" content="true" />
>
>
>
>
>
>
>
> <meta name="X-TIKA:embedded_depth" content="1" />
>
>
>
>
>
>
>
> <meta name="width" content="934" />
>
>
>
>
>
>
>
> <meta name="Dimension ImageOrientation" content="Normal" />
>
>
>
>
>
>
>
> <meta name="X-TIKA:embedded_resource_path" content="/image0.png" />
>
>
>
>
>
>
>
> <meta name="tiff:ImageWidth" content="934" />
>
>
>
>
>
>
>
> <meta name="Chroma NumChannels" content="3" />
>
>
>
>
>
>
>
> <meta name="Data SampleFormat" content="UnsignedIntegral" />
>
>
>
>
>
>
>
> <title></title>
>
>
>
>
>
>
>
> </head>
>
>
>
>
>
>
>
> <body><div class="ocr">
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     <div class="ocr_page" id="page_1" title="image
>
>
>
>
>
>
> &quot;/var/folders/n5/1d_k3z4s2293q8ntx_n8sw54mm5n_8/T/apache-tika-1878472717677617651.tmp&quot;;
>
>
>
> bbox 0 0 934 261; ppageno 0">
>
>
>
>
>
>
>
>      <div class="ocr_carea" id="block_1_1" title="bbox 21 34 465 66">
>
>
>
>
>
>
>
>       <p class="ocr_par" id="par_1_1" lang="eng" title="bbox 21 34 465 66">
>
>
>
>
>
>
>
>        <span class="ocr_line" id="line_1_1" title="bbox 21 34 465 66;
>
>
>
> baseline 0.005 -7; x_size 33; x_descenders 7; x_ascenders 8">
>
>
>
>
>
>
>
>         <span class="ocrx_word" id="word_1_1" title="bbox 21 36 135 66;
>
>
>
> x_wconf 96"><strong>Happy</strong></span>
>
>
>
>
>
>
>
>         <span class="ocrx_word" id="word_1_2" title="bbox 161 37 232 60;
>
>
>
> x_wconf 96"><strong>New</strong></span>
>
>
>
>
>
>
>
>         <span class="ocrx_word" id="word_1_3" title="bbox 259 38 345 60;
>
>
>
> x_wconf 96"><strong>Year</strong></span>
>
>
>
>
>
>
>
>         <span class="ocrx_word" id="word_1_4" title="bbox 375 34 465 61;
>
>
>
> x_wconf 96"><strong>2003!</strong></span>
>
>
>
>
>
>
>
>        </span>
>
>
>
>
>
>
>
>       </p>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>      </div>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     </div>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> </div>
>
>
>
>
>
>
>
> </body></html>
>
>
>
>
>
>
>
> [INFO]
>
>
>
>
>
>
>
> [ERROR] Tests run: 1188, Failures: 2, Errors: 0, Skipped: 48
>
>
>
>
>
>
>
> [INFO]
>
>
>
>
>
>
>
> [INFO]
>
>
>
> ------------------------------------------------------------------------
>
>
>
>
>
>
>
> [INFO] Reactor Summary for Apache Tika 2.0.0-SNAPSHOT:
>
>
>
>
>
>
>
> [INFO]
>
>
>
>
>
>
>
> [INFO] Apache Tika parent ................................. SUCCESS [
>
>
>
> 8.822 s]
>
>
>
>
>
>
>
> [INFO] Apache Tika core ................................... SUCCESS [
>
>
>
> 39.589 s]
>
>
>
>
>
>
>
> [INFO] Apache Tika parsers ................................ FAILURE [09:04
>
>
>
> min]
>
>
>
>
>
>
>
> [INFO] Apache Tika OSGi bundle ............................ SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika XMP .................................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika serialization .......................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika batch .................................. SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika language detection ..................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika application ............................ SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika translate .............................. SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika server ................................. SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika eval ................................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika examples ............................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika Java-7 Components ...................... SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika Natural Language Processing ............ SKIPPED
>
>
>
>
>
>
>
> [INFO] Apache Tika ........................................ SKIPPED
>
>
>
>
>
>
>
> [INFO]
>
>
>
> ------------------------------------------------------------------------
>
>
>
>
>
>
>
> [INFO] BUILD FAILURE
>
>
>
>
>
>
>
> [INFO]
>
>
>
> ------------------------------------------------------------------------
>
>
>
>
>
>
>
> [INFO] Total time:  09:57 min
>
>
>
>
>
>
>
> [INFO] Finished at: 2020-03-17T18:31:10-07:00
>
>
>
>
>
>
>
> [INFO]
>
>
>
> ------------------------------------------------------------------------
>
>
>
>
>
>
>
> [ERROR] Failed to execute goal
>
>
>
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test)
>
>
>
> on project tika-parsers: There are test failures.
>
>
>
>
>
>
>
> [ERROR]
>
>
>
>
>
>
>
> [ERROR] Please refer to
>
>
>
> /Users/mattmann/src/tika/tika-parsers/target/surefire-reports for the
>
>
>
> individual test results.
>
>
>
>
>
>
>
> [ERROR] Please refer to dump files (if any exist) [date].dump,
>
>
>
> [date]-jvmRun[N].dump and [date].dumpstream.
>
>
>
>
>
>
>
> [ERROR] -> [Help 1]
>
>
>
>
>
>
>
> [ERROR]
>
>
>
>
>
>
>
> [ERROR] To see the full stack trace of the errors, re-run Maven with the
>
>
>
> -e switch.
>
>
>
>
>
>
>
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>
>
>
>
>
>
>
> [ERROR]
>
>
>
>
>
>
>
> [ERROR] For more information about the errors and possible solutions,
>
>
>
> please read the following articles:
>
>
>
>
>
>
>
> [ERROR] [Help 1]
>
>
>
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>
>
>
>
>
>
>
> [ERROR]
>
>
>
>
>
>
>
> [ERROR] After correcting the problems, you can resume the build with the
>
>
>
> command
>
>
>
>
>
>
>
> [ERROR]   mvn <goals> -rf :tika-parsers
>
>
>
>
>
>
>
> pomodoro:tika mattmann$ java -version
>
>
>
>
>
>
>
> openjdk version "12.0.1" 2019-04-16
>
>
>
>
>
>
>
> OpenJDK Runtime Environment (build 12.0.1+12)
>
>
>
>
>
>
>
> OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
>
>
>
>
>
>
>
> pomodoro:tika mattmann$
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Any ideas?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Cheers,
>
>
>
>
>
>
>
> Chris
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

Re: [EXTERNAL] Re: JDK 12 build issues

Posted by Chris Mattmann <ma...@apache.org>.
Thanks Oleg, I was using OpenJDK 12 and 13, but I fixed it!

 

I needed to delete the $HOME/.tika-dl folder. All good now!
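
For anyone hitting the same error, here is a minimal shell sketch of that cleanup. It moves the folder aside instead of deleting it outright, so the old contents can still be inspected. The function name `set_aside_cache` and the `.bak.<timestamp>` suffix are made up for illustration; only the `$HOME/.tika-dl` path comes from this thread, and the thread does not say what that folder contains.

```sh
# Hypothetical helper (not part of Tika): move a directory aside rather than
# deleting it, and print the backup location.
set_aside_cache() {
  dir="$1"
  # Nothing to do if the directory does not exist.
  [ -d "$dir" ] || return 0
  # Timestamped backup name so earlier backups are not overwritten.
  backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
  mv "$dir" "$backup" && printf '%s\n' "$backup"
}
```

Calling `set_aside_cache "$HOME/.tika-dl"` and then re-running the tika-dl build should have the same effect as the deletion described above. In the successful run below, DL4JVGG16NetTest took 272 s versus about 3 s in the failing run, which would be consistent with the folder's contents being fetched again, but that is only a guess.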

 

[WARNING] Invalid POM for commons-net:commons-net:jar:3.1, transitive dependencies (if any) will not be available, enable debug logging for more details
[WARNING] Invalid POM for net.ericaro:neoitertools:jar:1.0.0, transitive dependencies (if any) will not be available, enable debug logging for more details
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika-dl ---
[WARNING] Invalid project model for artifact [commons-net:commons-net:3.1]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [neoitertools:net.ericaro:1.0.0]. It will be ignored by the remote resources Mojo.
[INFO]
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ tika-dl ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Users/mattmann/src/tika/tika-dl/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/src/tika/tika-dl/target/classes
[INFO]
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ tika-dl ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/src/tika/tika-dl/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
log4j:WARN No appenders could be found for logger (org.nd4j.linalg.factory.Nd4jBackend).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 272.202 s - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
[INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.616 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  05:27 min
[INFO] Finished at: 2020-03-18T07:51:56-07:00
[INFO] ------------------------------------------------------------------------
pomodoro:tika-dl mattmann$

From: Oleg Tikhonov <ol...@apache.org>
Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>
Date: Wednesday, March 18, 2020 at 7:53 AM
To: "dev@tika.apache.org" <de...@tika.apache.org>
Subject: Re: [EXTERNAL] Re: JDK 12 build issues

 

Hi Chris,
I'm currently trying to build an env with java 12/13 ... in order to try your setup.
What Java version are you using? OpenJDK or Oracle?
Once upon a time there was a bug in OpenJDK:
https://bugs.openjdk.java.net/browse/JDK-8131146
But it seems to be OK in recent releases.

I'll keep you updated.
Cheers,
Oleg

On Wed, Mar 18, 2020 at 4:35 PM Chris Mattmann <ma...@apache.org> wrote:

 

So I was able to get past my issues with Tesseract by reinstalling the latest version with Brew.

I have a new issue!

I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:

[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/src/tika/tika-dl/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
log4j:WARN No appenders could be found for logger (org.nd4j.linalg.factory.Nd4jBackend).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.38 s <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
[ERROR] org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise  Time elapsed: 3.29 s  <<< ERROR!
org.apache.tika.exception.TikaConfigException: java.io.UTFDataFormatException: malformed input around byte 11
        at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
Caused by: java.lang.RuntimeException: java.io.UTFDataFormatException: malformed input around byte 11
        at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
Caused by: java.io.UTFDataFormatException: malformed input around byte 11
        at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)

[INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.392 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   DL4JVGG16NetTest.recognise:36 » TikaConfig java.io.UTFDataFormatException: mal...
[INFO]
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  25.628 s
[INFO] Finished at: 2020-03-18T07:34:08-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-dl: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/mattmann/src/tika/tika-dl/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
pomodoro:tika-dl mattmann$

Thamme, do you have any ideas what is going on here?

Cheers,
Chris

From: Tim Allison <ta...@apache.org>
Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>, "Allison, Timothy B (US 1760-Affiliate)" <ti...@jpl.nasa.gov>
Date: Wednesday, March 18, 2020 at 2:35 AM
To: "dev@tika.apache.org" <de...@tika.apache.org>
Subject: [EXTERNAL] Re: JDK 12 build issues

Haven’t tried...we should add java 12-14 to Jenkins.

Wait, are we up to 18 yet...

Will look into it...

On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann <ma...@apache.org> wrote:

Hey Tim et al.,

Do the tests fail for you with Java 12?


Any ideas?

Cheers,
Chris


Re: [EXTERNAL] Re: JDK 12 build issues

Posted by Oleg Tikhonov <ol...@apache.org>.
Hi Chris,
I'm currently setting up an environment with Java 12/13 in order to try
your setup.
Which Java distribution are you using: OpenJDK or Oracle?
Once upon a time there was a bug in OpenJDK:
https://bugs.openjdk.java.net/browse/JDK-8131146
But it seems to be OK in recent releases.
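
One way to answer the distribution question from inside the JVM is to print the standard system properties. This is only a rough sketch; the class name JvmInfo is hypothetical and not part of Tika, and the exact values printed depend on the installed JDK.

```java
// Hypothetical helper (not part of Tika): reports which JVM is running.
public class JvmInfo {

    // Builds a four-line report from standard JVM system properties.
    static String describe() {
        return String.join(System.lineSeparator(),
                "java.version    = " + System.getProperty("java.version"),
                "java.vendor     = " + System.getProperty("java.vendor"),
                "java.vm.name    = " + System.getProperty("java.vm.name"),
                "java.vm.version = " + System.getProperty("java.vm.version"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this with the same JDK that Maven uses should show whether the build runs on an OpenJDK or an Oracle VM, much like the output of java -version.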

I'll keep you updated.
Cheers,
Oleg


On Wed, Mar 18, 2020 at 4:35 PM Chris Mattmann <ma...@apache.org> wrote:

> So I was able to get past my issues with Tesseract by reinstalling the
> latest version with Brew.
>
>
>
> I have a new issue!
>
> I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:
>
>
>
> [INFO]
>
> [INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @
> tika-dl ---
>
> [INFO] Changes detected - recompiling the module!
>
> [INFO] Compiling 2 source files to
> /Users/mattmann/src/tika/tika-dl/target/test-classes
>
> [INFO]
>
> [INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-dl ---
>
> [INFO]
>
> [INFO] -------------------------------------------------------
>
> [INFO]  T E S T S
>
> [INFO] -------------------------------------------------------
>
> [INFO] Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>
> log4j:WARN No appenders could be found for logger
> (org.nd4j.linalg.factory.Nd4jBackend).
>
> log4j:WARN Please initialize the log4j system properly.
>
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
>
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 3.38 s <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
>
> [ERROR] org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise  Time
> elapsed: 3.29 s  <<< ERROR!
>
> org.apache.tika.exception.TikaConfigException: java.io.UTFDataFormatException:
> malformed input around byte 11
>
>        at
> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>
> Caused by: java.lang.RuntimeException: java.io.UTFDataFormatException:
> malformed input around byte 11
>
>        at
> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>
> Caused by: java.io.UTFDataFormatException: malformed input around byte 11
>
>        at
> org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:36)
>
>
>
> [INFO] Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 5.392 s - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
>
> [INFO]
>
> [INFO] Results:
>
> [INFO]
>
> [ERROR] Errors:
>
> [ERROR]   DL4JVGG16NetTest.recognise:36 » TikaConfig java.io.UTFDataFormatException:
> mal...
>
> [INFO]
>
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
>
> [INFO]
>
> [INFO]
> ------------------------------------------------------------------------
>
> [INFO] BUILD FAILURE
>
> [INFO]
> ------------------------------------------------------------------------
>
> [INFO] Total time:  25.628 s
>
> [INFO] Finished at: 2020-03-18T07:34:08-07:00
>
> [INFO]
> ------------------------------------------------------------------------
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test)
> on project tika-dl: There are test failures.
>
> [ERROR]
>
> [ERROR] Please refer to
> /Users/mattmann/src/tika/tika-dl/target/surefire-reports for the individual
> test results.
>
> [ERROR] Please refer to dump files (if any exist) [date].dump,
> [date]-jvmRun[N].dump and [date].dumpstream.
>
> [ERROR] -> [Help 1]
>
> [ERROR]
>
> [ERROR] To see the full stack trace of the errors, re-run Maven with the
> -e switch.
>
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>
> [ERROR]
>
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:
>
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>
> pomodoro:tika-dl mattmann$
>
>
>
> Thamme, do you have any ideas what is going on here?
>
>
> Cheers,
>
> Chris
>
>
>
>
>
>
>
>
>
> From: Tim Allison <ta...@apache.org>
> Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>, "Allison, Timothy
> B (US 1760-Affiliate)" <ti...@jpl.nasa.gov>
> Date: Wednesday, March 18, 2020 at 2:35 AM
> To: "dev@tika.apache.org" <de...@tika.apache.org>
> Subject: [EXTERNAL] Re: JDK 12 build issues
>
>
>
> Haven’t tried...we should add java 12-14 to Jenkins.
>
>
>
> Wait, are we up to 18 yet...
>
>
>
> Will look into it...
>
>
>
> On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann <ma...@apache.org>
> wrote:
>
>
>
> Hey Tim et al.,
>
>
>
>
>
>
>
> Do the tests fail for you with Java 12?
>
>
>
>
>
>
>
> [INFO] Running org.apache.tika.parser.pkg.GzipParserTest
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.397 s - in org.apache.tika.parser.pkg.GzipParserTest
> [INFO] Running org.apache.tika.TestXMLEntityExpansion
> [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.085 s - in org.apache.tika.TestXMLEntityExpansion
> [INFO] Running org.apache.tika.mime.MimeTypeTest
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypeTest
> [INFO] Running org.apache.tika.mime.MimeTypesTest
> [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.tika.mime.MimeTypesTest
> [INFO] Running org.apache.tika.mime.TestMimeTypes
> [INFO] Tests run: 80, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.997 s - in org.apache.tika.mime.TestMimeTypes
> [INFO] Running org.apache.tika.TestCorruptedFiles
> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.001 s - in org.apache.tika.TestCorruptedFiles
> [INFO]
> [INFO] Results:
> [INFO]
>
>
>
> [ERROR] Failures:
>
>
>
> [ERROR]
>
>
> TesseractOCRParserTest.confirmMultiPageTiffHandling:290->TikaTest.assertContains:110
>
> Page 2 not found in:
>
>
>
> <html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
> <head>
>
>
>
> <meta name="Exif Image:Page Number" content="1 2" />
>
>
>
> <meta name="Exif IFD0:Strip Offsets" content="8 28680 46835 73454" />
>
>
>
> <meta name="Exif IFD0:JPEG Tables" content="[289 values]" />
>
>
>
> <meta name="Exif Image:Samples Per Pixel" content="3 samples/pixel" />
>
>
>
> <meta name="Exif Image:Image Height" content="600 pixels" />
>
>
>
> <meta name="tiff:ImageLength" content="600" />
>
>
>
> <meta name="Exif Image:Compression" content="JPEG" />
>
>
>
> <meta name="Exif Image:Y Resolution" content="96 dots per inch" />
>
>
>
> <meta name="Exif IFD0:X Resolution" content="96 dots per inch" />
>
>
>
> <meta name="tiff:ResolutionUnit" content="Inch" />
>
>
>
> <meta name="Exif IFD0:Image Height" content="600 pixels" />
>
>
>
> <meta name="Exif IFD0:Strip Byte Counts" content="28672 18155 26619 4002
>
> bytes" />
>
>
>
> <meta name="File Size" content="156867 bytes" />
>
>
>
> <meta name="Exif IFD0:Image Width" content="800 pixels" />
>
>
>
> <meta name="Exif Image:Photometric Interpretation" content="RGB" />
>
>
>
> <meta name="Exif IFD0:Samples Per Pixel" content="3 samples/pixel" />
>
>
>
> <meta name="Exif IFD0:Planar Configuration" content="Chunky (contiguous
>
> for each subsampling pixel)" />
>
>
>
> <meta name="Exif IFD0:Rows Per Strip" content="160 rows/strip" />
>
>
>
> <meta name="Exif Image:Image Width" content="800 pixels" />
>
>
>
> <meta name="File Name" content="apache-tika-17704590698477286878.tmp" />
>
>
>
> <meta name="Exif IFD0:Bits Per Sample" content="8 8 8
>
> bits/component/pixel" />
>
>
>
> <meta name="tiff:BitsPerSample" content="8" />
>
>
>
> <meta name="Exif IFD0:Resolution Unit" content="Inch" />
>
>
>
> <meta name="Content-Type" content="image/tiff" />
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
> <meta name="X-Parsed-By"
>
> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.image.TiffParser"
>
> />
>
>
>
> <meta name="Exif Image:Planar Configuration" content="Chunky (contiguous
>
> for each subsampling pixel)" />
>
>
>
> <meta name="File Modified Date" content="Wed Mar 18 01:27:47 +00:00 2020"
>
> />
>
>
>
> <meta name="tiff:XResolution" content="96.0" />
>
>
>
> <meta name="tiff:SamplesPerPixel" content="3" />
>
>
>
> <meta name="exif:PageCount" content="2" />
>
>
>
> <meta name="Exif Image:Strip Byte Counts" content="28672 18155 27500 4002
>
> bytes" />
>
>
>
> <meta name="Exif IFD0:Orientation" content="Top, left side (Horizontal /
>
> normal)" />
>
>
>
> <meta name="tiff:Orientation" content="1" />
>
>
>
> <meta name="Exif IFD0:Compression" content="JPEG" />
>
>
>
> <meta name="Exif IFD0:Page Number" content="0 1" />
>
>
>
> <meta name="Exif Image:Rows Per Strip" content="160 rows/strip" />
>
>
>
> <meta name="Exif Image:X Resolution" content="96 dots per inch" />
>
>
>
> <meta name="Exif Image:Orientation" content="Top, left side (Horizontal /
>
> normal)" />
>
>
>
> <meta name="Exif IFD0:Photometric Interpretation" content="RGB" />
>
>
>
> <meta name="Exif Image:JPEG Tables" content="[289 values]" />
>
>
>
> <meta name="tiff:ImageWidth" content="800" />
>
>
>
> <meta name="tiff:YResolution" content="96.0" />
>
>
>
> <meta name="Exif Image:Bits Per Sample" content="8 8 8
>
> bits/component/pixel" />
>
>
>
> <meta name="Exif Image:Strip Offsets" content="77997 106669 124824 152324"
>
> />
>
>
>
> <meta name="Exif IFD0:Y Resolution" content="96 dots per inch" />
>
>
>
> <meta name="Exif Image:Resolution Unit" content="Inch" />
>
>
>
> <title></title>
>
>
>
> </head>
>
>
>
> <body><div class="ocr">Multipage
>
>
>
> TIFF
>
>
>
> Example
>
>
>
> Page 1
>
>
>
> </div>
>
>
>
> </body></html>
>
>
>
> [ERROR]
>
>
> TesseractOCRParserTest.testOCROutputsHOCR:146->TikaTest.assertContains:110
>
> Happy</span> not found in:
>
>
>
> <html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
> <head>
>
>
>
> <meta name="pdf:docinfo:custom:AAPL:Keywords" content="" />
>
>
>
> <meta name="pdf:PDFVersion" content="1.3" />
>
>
>
> <meta name="pdf:docinfo:title" content="Presentation1" />
>
>
>
> <meta name="xmp:CreatorTool" content="PowerPoint" />
>
>
>
> <meta name="pdf:hasXFA" content="false" />
>
>
>
> <meta name="access_permission:modify_annotations" content="true" />
>
>
>
> <meta name="access_permission:can_print_degraded" content="true" />
>
>
>
> <meta name="AAPL:Keywords" content="" />
>
>
>
> <meta name="dc:creator" content="grantingersoll" />
>
>
>
> <meta name="dcterms:created" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="dcterms:modified" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="Last-Modified" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="dc:format" content="application/pdf; version=1.3" />
>
>
>
> <meta name="pdf:docinfo:creator_tool" content="PowerPoint" />
>
>
>
> <meta name="access_permission:fill_in_form" content="true" />
>
>
>
> <meta name="pdf:docinfo:keywords" content="" />
>
>
>
> <meta name="pdf:docinfo:modified" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="meta:save-date" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="pdf:encrypted" content="false" />
>
>
>
> <meta name="dc:title" content="Presentation1" />
>
>
>
> <meta name="cp:subject" content="" />
>
>
>
> <meta name="pdf:docinfo:subject" content="" />
>
>
>
> <meta name="pdf:hasMarkedContent" content="false" />
>
>
>
> <meta name="Content-Type" content="application/pdf" />
>
>
>
> <meta name="pdf:docinfo:creator" content="grantingersoll" />
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pdf.PDFParser" />
>
>
>
> <meta name="meta:author" content="grantingersoll" />
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
> <meta name="dc:subject" content="" />
>
>
>
> <meta name="meta:creation-date" content="2014-02-08T19:57:12Z" />
>
>
>
> <meta name="access_permission:extract_for_accessibility" content="true" />
>
>
>
> <meta name="access_permission:assemble_document" content="true" />
>
>
>
> <meta name="xmpTPg:NPages" content="1" />
>
>
>
> <meta name="pdf:hasXMP" content="false" />
>
>
>
> <meta name="access_permission:extract_content" content="true" />
>
>
>
> <meta name="access_permission:can_print" content="true" />
>
>
>
> <meta name="meta:keyword" content="" />
>
>
>
> <meta name="access_permission:can_modify" content="true" />
>
>
>
> <meta name="pdf:docinfo:producer" content="Mac OS X 10.9.1 Quartz
>
> PDFContext" />
>
>
>
> <meta name="pdf:docinfo:created" content="2014-02-08T19:57:12Z" />
>
>
>
> <title>Presentation1</title>
>
>
>
> </head>
>
>
>
> <body><div class="page"><p />
>
>
>
> <img src="embedded:image0.png" alt="image0.png" /></div>
>
>
>
> </body></html><html xmlns="http://www.w3.org/1999/xhtml">
>
>
>
> <head>
>
>
>
> <meta name="Transparency Alpha" content="none" />
>
>
>
> <meta name="tiff:ImageLength" content="261" />
>
>
>
> <meta name="Compression CompressionTypeName" content="deflate" />
>
>
>
> <meta name="Data BitsPerSample" content="8 8 8" />
>
>
>
> <meta name="Data PlanarConfiguration" content="PixelInterleaved" />
>
>
>
> <meta name="Dimension VerticalPixelSize" content="0.35273367" />
>
>
>
> <meta name="IHDR" content="width=934, height=261, bitDepth=8,
>
> colorType=RGB, compressionMethod=deflate, filterMethod=adaptive,
>
> interlaceMethod=none" />
>
>
>
> <meta name="embeddedResourceType" content="INLINE" />
>
>
>
> <meta name="Chroma ColorSpaceType" content="RGB" />
>
>
>
> <meta name="tiff:BitsPerSample" content="8 8 8" />
>
>
>
> <meta name="Content-Type" content="image/png" />
>
>
>
> <meta name="height" content="261" />
>
>
>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
>
>
>
> <meta name="X-Parsed-By"
>
> content="org.apache.tika.parser.ocr.TesseractOCRParser" />
>
>
>
> <meta name="X-Parsed-By"
>
> content="org.apache.tika.parser.image.ImageParser" />
>
>
>
> <meta name="pHYs" content="pixelsPerUnitXAxis=2835,
>
> pixelsPerUnitYAxis=2835, unitSpecifier=meter" />
>
>
>
> <meta name="Dimension PixelAspectRatio" content="1.0" />
>
>
>
> <meta name="resourceName" content="image0.png" />
>
>
>
> <meta name="pdf:hasXMP" content="false" />
>
>
>
> <meta name="Compression NumProgressiveScans" content="1" />
>
>
>
> <meta name="Dimension HorizontalPixelSize" content="0.35273367" />
>
>
>
> <meta name="Chroma BlackIsZero" content="true" />
>
>
>
> <meta name="Compression Lossless" content="true" />
>
>
>
> <meta name="X-TIKA:embedded_depth" content="1" />
>
>
>
> <meta name="width" content="934" />
>
>
>
> <meta name="Dimension ImageOrientation" content="Normal" />
>
>
>
> <meta name="X-TIKA:embedded_resource_path" content="/image0.png" />
>
>
>
> <meta name="tiff:ImageWidth" content="934" />
>
>
>
> <meta name="Chroma NumChannels" content="3" />
>
>
>
> <meta name="Data SampleFormat" content="UnsignedIntegral" />
>
>
>
> <title></title>
>
>
>
> </head>
>
>
>
> <body><div class="ocr">
>
>    <div class="ocr_page" id="page_1" title="image
>
>
> &quot;/var/folders/n5/1d_k3z4s2293q8ntx_n8sw54mm5n_8/T/apache-tika-1878472717677617651.tmp&quot;;
>
> bbox 0 0 934 261; ppageno 0">
>
>
>
>     <div class="ocr_carea" id="block_1_1" title="bbox 21 34 465 66">
>
>
>
>      <p class="ocr_par" id="par_1_1" lang="eng" title="bbox 21 34 465 66">
>
>
>
>       <span class="ocr_line" id="line_1_1" title="bbox 21 34 465 66;
>
> baseline 0.005 -7; x_size 33; x_descenders 7; x_ascenders 8">
>
>
>
>        <span class="ocrx_word" id="word_1_1" title="bbox 21 36 135 66;
>
> x_wconf 96"><strong>Happy</strong></span>
>
>
>
>        <span class="ocrx_word" id="word_1_2" title="bbox 161 37 232 60;
>
> x_wconf 96"><strong>New</strong></span>
>
>
>
>        <span class="ocrx_word" id="word_1_3" title="bbox 259 38 345 60;
>
> x_wconf 96"><strong>Year</strong></span>
>
>
>
>        <span class="ocrx_word" id="word_1_4" title="bbox 375 34 465 61;
>
> x_wconf 96"><strong>2003!</strong></span>
>
>
>
>       </span>
>
>
>
>      </p>
>
>
>
>
>
>
>
>     </div>
>
>
>
>
>
>
>
>    </div>
>
>
>
>
>
>
>
>
>
>
>
> </div>
>
>
>
> </body></html>
>
>
>
> [INFO]
> [ERROR] Tests run: 1188, Failures: 2, Errors: 0, Skipped: 48
> [INFO]
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary for Apache Tika 2.0.0-SNAPSHOT:
> [INFO]
> [INFO] Apache Tika parent ................................. SUCCESS [ 8.822 s]
> [INFO] Apache Tika core ................................... SUCCESS [ 39.589 s]
> [INFO] Apache Tika parsers ................................ FAILURE [09:04 min]
> [INFO] Apache Tika OSGi bundle ............................ SKIPPED
> [INFO] Apache Tika XMP .................................... SKIPPED
> [INFO] Apache Tika serialization .......................... SKIPPED
> [INFO] Apache Tika batch .................................. SKIPPED
> [INFO] Apache Tika language detection ..................... SKIPPED
> [INFO] Apache Tika application ............................ SKIPPED
> [INFO] Apache Tika translate .............................. SKIPPED
> [INFO] Apache Tika server ................................. SKIPPED
> [INFO] Apache Tika eval ................................... SKIPPED
> [INFO] Apache Tika examples ............................... SKIPPED
> [INFO] Apache Tika Java-7 Components ...................... SKIPPED
> [INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED
> [INFO] Apache Tika Natural Language Processing ............ SKIPPED
> [INFO] Apache Tika ........................................ SKIPPED
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time:  09:57 min
> [INFO] Finished at: 2020-03-17T18:31:10-07:00
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project tika-parsers: There are test failures.
> [ERROR]
> [ERROR] Please refer to /Users/mattmann/src/tika/tika-parsers/target/surefire-reports for the individual test results.
> [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the command
> [ERROR]   mvn <goals> -rf :tika-parsers
> pomodoro:tika mattmann$ java -version
> openjdk version "12.0.1" 2019-04-16
> OpenJDK Runtime Environment (build 12.0.1+12)
> OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
> pomodoro:tika mattmann$
>
> Any ideas?
>
> Cheers,
> Chris