Posted to commits@camel.apache.org by GitBox <gi...@apache.org> on 2020/03/30 09:39:00 UTC

[GitHub] [camel-quarkus] JiriOndrusek opened a new pull request #998: 799 tika support wip2

JiriOndrusek opened a new pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998
 
 
   WIP: Replaces https://github.com/ppalaga/camel-quarkus/pull/3

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402914096
 
 

 ##########
 File path: integration-tests/tika/src/main/java/org/apache/camel/quarkus/component/tika/it/TikaResource.java
 ##########
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.quarkus.component.tika.it;
+
+import java.net.URI;
+
+import javax.enterprise.context.ApplicationScoped;
+import javax.inject.Inject;
+import javax.ws.rs.Consumes;
+import javax.ws.rs.POST;
+import javax.ws.rs.Path;
+import javax.ws.rs.Produces;
+import javax.ws.rs.core.MediaType;
+import javax.ws.rs.core.Response;
+
+// import org.apache.camel.ProducerTemplate;
 
 Review comment:
   Please remove the commented-out code.


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402910272
 
 

 ##########
 File path: .github/workflows/pr-build.yaml
 ##########
 @@ -154,11 +154,12 @@ jobs:
               braintree
               compression
               graphql
-              mustache
+              mustache              
 
 Review comment:
   Please remove the trailing whitespace.


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402910816
 
 

 ##########
 File path: extensions/tika/deployment/src/main/java/org/apache/camel/quarkus/component/tika/deployment/TikaProcessor.java
 ##########
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.quarkus.component.tika.deployment;
+
+import io.quarkus.arc.deployment.BeanContainerBuildItem;
+import io.quarkus.deployment.annotations.BuildProducer;
+import io.quarkus.deployment.annotations.BuildStep;
+import io.quarkus.deployment.annotations.ExecutionTime;
+import io.quarkus.deployment.annotations.Record;
+import io.quarkus.deployment.builditem.FeatureBuildItem;
+import io.quarkus.deployment.builditem.nativeimage.ReflectiveClassBuildItem;
+import org.apache.camel.quarkus.component.tika.TikaRecorder;
+import org.apache.camel.quarkus.core.deployment.CamelRuntimeBeanBuildItem;
+import org.apache.camel.quarkus.core.deployment.CamelServiceFilter;
+import org.apache.camel.quarkus.core.deployment.CamelServiceFilterBuildItem;
+import org.jboss.logging.Logger;
+
+class TikaProcessor {
+
+    private static final Logger LOG = Logger.getLogger(TikaProcessor.class);
+    private static final String FEATURE = "camel-tika";
+
+    @BuildStep
+    FeatureBuildItem feature() {
+        return new FeatureBuildItem(FEATURE);
+    }
+
+    /*
+     * The bean-validator component is programmatically configured by the extension thus
 
 Review comment:
   ```suggestion
        * The tika component is programmatically configured by the extension thus
   ```


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402916608
 
 

 ##########
 File path: poms/bom-deployment/pom.xml
 ##########
 @@ -34,6 +34,7 @@
 
     <properties>
         <camel-quarkus.version>1.1.0-SNAPSHOT</camel-quarkus.version><!-- kept in sync with project.version by the release plugin -->
+        <quarkus-version>1.3.0.Final</quarkus-version>
 
 Review comment:
   `quarkus.version` is already defined in the top-level pom, so I think this one is not needed?


[GitHub] [camel-quarkus] ppalaga commented on issue #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on issue #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#issuecomment-608368076
 
 
   You can add the missing license headers by running `mvn process-resources -Pformat`
   The order of imports can be fixed by re-compiling the failing modules.


[GitHub] [camel-quarkus] JiriOndrusek edited a comment on issue #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
JiriOndrusek edited a comment on issue #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#issuecomment-608494313
 
 
   @ppalaga all suggestions are applied, doc created, rebased onto the integration branch, and squashed.
   
   Although I wasn't able to test or compile it, as this branch cannot be compiled.


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402915881
 
 

 ##########
 File path: integration-tests/tika/src/test/java/org/apache/camel/quarkus/component/tika/it/TikaTest.java
 ##########
 @@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.quarkus.component.tika.it;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+
+import io.quarkus.test.junit.QuarkusTest;
+import io.restassured.RestAssured;
+import io.restassured.http.ContentType;
+import org.apache.tika.metadata.Metadata;
+import org.apache.tika.parser.txt.UniversalEncodingDetector;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.Matchers.containsStringIgnoringCase;
+import static org.hamcrest.Matchers.not;
+
+@QuarkusTest
+class TikaTest {
+
+    @Test
+    public void testPdf() throws Exception {
+        test("quarkus.pdf", "application/pdf", "Hello Quarkus");
+    }
+
+    @Test
+    public void testOdf() throws Exception {
+        String body = test("testOpenOffice2.odt", "application/vnd.oasis.opendocument.text",
+                "This is a sample Open Office document, written in NeoOffice 2.2.1 for the Mac");
+
+        Charset detectedCharset = null;
+        try {
+            InputStream bodyIs = new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_16));
+            UniversalEncodingDetector encodingDetector = new UniversalEncodingDetector();
+            detectedCharset = encodingDetector.detect(bodyIs, new Metadata());
+        } catch (IOException e1) {
+            Assertions.fail();
+        }
+
+        Assertions.assertTrue(detectedCharset.name().startsWith(StandardCharsets.UTF_16.name()));
+    }
+
+    @Test
+    public void testOffice() throws Exception {
+        String body = test("test.doc", "application/msword", "test");
+
+        Charset detectedCharset = null;
+        try {
+            InputStream bodyIs = new ByteArrayInputStream(body.getBytes());
+            UniversalEncodingDetector encodingDetector = new UniversalEncodingDetector();
+            detectedCharset = encodingDetector.detect(bodyIs, new Metadata());
+        } catch (IOException e1) {
+            Assertions.fail();
+        }
+
+        Assertions.assertTrue(detectedCharset.name().startsWith(Charset.defaultCharset().name()));
+    }
+
+    //    @Test
 
 Review comment:
   Maybe this would be better?
   ```suggestion
       @Test
       @Disabled("https://github.com/quarkusio/quarkus/issues/8375")
   ```
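
A minimal sketch of how the suggested annotations would sit on a test method, assuming JUnit 5 (which `TikaTest` already imports). The class name `TikaImageSketchTest`, the method name `testImage`, and the method body are hypothetical, since the quoted diff ends at the commented-out `@Test`; only the `@Disabled` reason URL comes from the suggestion.

```java
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;

// Hypothetical class and method names, used only to illustrate the suggestion.
class TikaImageSketchTest {

    @Test
    @Disabled("https://github.com/quarkusio/quarkus/issues/8375")
    public void testImage() throws Exception {
        // The real test body is not visible in the quoted diff.
        // While @Disabled is present, JUnit 5 does not execute this method.
    }
}
```

With `@Disabled`, the test is reported as skipped together with its reason, instead of silently dropping out of the run as a commented-out `@Test` would.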


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402913066
 
 

 ##########
 File path: extensions/tika/deployment/src/main/java/org/apache/camel/quarkus/component/tika/deployment/TikaProcessor.java
 ##########
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.quarkus.component.tika.deployment;
+
+import io.quarkus.arc.deployment.BeanContainerBuildItem;
+import io.quarkus.deployment.annotations.BuildProducer;
+import io.quarkus.deployment.annotations.BuildStep;
+import io.quarkus.deployment.annotations.ExecutionTime;
+import io.quarkus.deployment.annotations.Record;
+import io.quarkus.deployment.builditem.FeatureBuildItem;
+import io.quarkus.deployment.builditem.nativeimage.ReflectiveClassBuildItem;
+import org.apache.camel.quarkus.component.tika.TikaRecorder;
+import org.apache.camel.quarkus.core.deployment.CamelRuntimeBeanBuildItem;
+import org.apache.camel.quarkus.core.deployment.CamelServiceFilter;
+import org.apache.camel.quarkus.core.deployment.CamelServiceFilterBuildItem;
+import org.jboss.logging.Logger;
+
+class TikaProcessor {
+
+    private static final Logger LOG = Logger.getLogger(TikaProcessor.class);
+    private static final String FEATURE = "camel-tika";
+
+    @BuildStep
+    FeatureBuildItem feature() {
+        return new FeatureBuildItem(FEATURE);
+    }
+
+    /*
+     * The bean-validator component is programmatically configured by the extension thus
+     * we can safely prevent camel to instantiate a default instance.
+     */
+    @BuildStep
+    CamelServiceFilterBuildItem serviceFilter() {
+        return new CamelServiceFilterBuildItem(CamelServiceFilter.forComponent("tika"));
+    }
+
+    @Record(ExecutionTime.STATIC_INIT)
+    @BuildStep
+    CamelRuntimeBeanBuildItem tikaComponent(BeanContainerBuildItem beanContainer, TikaRecorder recorder) {
+        return new CamelRuntimeBeanBuildItem(
+                "tika",
+                TikaRecorder.class.getName(),
 
 Review comment:
   `TikaRecorder` looks strange. `TikaComponent` maybe?


[GitHub] [camel-quarkus] JiriOndrusek commented on issue #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
JiriOndrusek commented on issue #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#issuecomment-608494313
 
 
   @ppalaga all suggestions are applied, doc created, rebased onto the integration branch, and squashed.


[GitHub] [camel-quarkus] ppalaga commented on issue #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on issue #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#issuecomment-608366434
 
 
   And please rebase and squash your commits.


[GitHub] [camel-quarkus] JiriOndrusek commented on issue #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
JiriOndrusek commented on issue #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#issuecomment-608313750
 
 
   @ppalaga The current state is that the Tika parser "works" in JVM and native mode, BUT
   
   - The Tika parser obviously has some issues in Quarkus. Not every parser works. There is a hardcoded list of parsers that are not native-ready in the Quarkus sources: https://github.com/quarkusio/quarkus/blob/master/extensions/tika/deployment/src/main/java/io/quarkus/tika/deployment/TikaProcessor.java#L39
   - Even some of the parsers that should work do not. One example is the image parser; I've reported an issue about it: https://github.com/quarkusio/quarkus/issues/8375
   - The following change has to be present in Camel: https://github.com/apache/camel/pull/3705 (already merged). I'll prepare a PR for the integration branch once the branch works with the current Camel.
   
   Do you see any direction for what to do with this issue until the integration branch is working?


[GitHub] [camel-quarkus] ppalaga commented on a change in pull request #998: 799 tika support wip2

Posted by GitBox <gi...@apache.org>.
ppalaga commented on a change in pull request #998: 799 tika support wip2
URL: https://github.com/apache/camel-quarkus/pull/998#discussion_r402917426
 
 

 ##########
 File path: poms/bom-deployment/pom.xml
 ##########
 @@ -822,6 +823,16 @@
                 <artifactId>camel-quarkus-telegram-deployment</artifactId>
                 <version>${camel-quarkus.version}</version>
             </dependency>
+            <dependency>
+                <groupId>org.apache.camel.quarkus</groupId>
+                <artifactId>camel-quarkus-tika-deployment</artifactId>
+                <version>${camel-quarkus.version}</version>
+            </dependency>
+            <dependency>
+                <groupId>io.quarkus</groupId>
+                <artifactId>quarkus-tika-deployment</artifactId>
+                <version>${quarkus.version}</version>
+            </dependency>
 
 Review comment:
   Please move this one to the top, where the MySQL driver is, and add a link to the Quarkus PR where it is fixed in their BOM.
