Posted to commits@camel.apache.org by ac...@apache.org on 2019/01/08 09:24:20 UTC

[camel] branch master updated: Big XML file split example (#2699)

This is an automated email from the ASF dual-hosted git repository.

acosentino pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/camel.git


The following commit(s) were added to refs/heads/master by this push:
     new 53e563a  Big XML file split example (#2699)
53e563a is described below

commit 53e563aeb5e48d6b1011361f901a5c5de5c6c3bc
Author: fvaleri <fv...@users.noreply.github.com>
AuthorDate: Tue Jan 8 10:24:15 2019 +0100

    Big XML file split example (#2699)
    
    * bigxml-split example
    
    * update log pattern
    
    * small changes
    
    * fix table layout
    
    * minor changes
    
    * update final results
    
    * aligned ReactiveHelper to the master branch
---
 examples/README.adoc                               |   2 +
 examples/camel-example-bigxml-split/README.md      |  60 ++++++++++++
 examples/camel-example-bigxml-split/pom.xml        | 109 +++++++++++++++++++++
 .../org/apache/camel/example/bigxml/Record.java    |  55 +++++++++++
 .../camel/example/bigxml/StaxTokenizerTest.java    |  72 ++++++++++++++
 .../org/apache/camel/example/bigxml/TestUtils.java |  78 +++++++++++++++
 .../camel/example/bigxml/XmlTokenizerTest.java     |  70 +++++++++++++
 .../apache/camel/example/bigxml/package-info.java  |  23 +++++
 .../src/test/resources/log4j2.properties           |  27 +++++
 9 files changed, 496 insertions(+)
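
Note (illustration only, not part of the commit): the README added below contrasts loading a whole XML document into memory with iterating it in a streamed fashion. As a rough, Camel-free sketch of the streaming idea, the snippet below uses only the JDK's `javax.xml.stream` pull parser to count `<record>` elements one event at a time. The class name `StreamingRecordCount` and the method `countRecords` are hypothetical and do not appear in the commit or in Camel. The namespace is the one used by the example's test XML.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the commit: counts <record> elements with a pull parser.
public class StreamingRecordCount {

    // Namespace used by the example's test XML.
    static final String NS = "http://fvaleri.it/records";

    static int countRecords(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs and external entities are not needed for this payload, so switch them off.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        int count = 0;
        try {
            // Only the current parse event is inspected; no document tree is built.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "record".equals(reader.getLocalName())
                        && NS.equals(reader.getNamespaceURI())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<records xmlns=\"" + NS + "\">"
                + "<record><key>0</key><value>The quick brown fox jumps over the lazy dog</value></record>"
                + "<record><key>1</key><value>The quick brown fox jumps over the lazy dog</value></record>"
                + "</records>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("records: " + countRecords(in));
    }
}
```

The Camel routes in the commit delegate this kind of iteration to the StAX and XML tokenizers; the sketch only shows why a pull parser can walk a large file without holding it in memory.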

diff --git a/examples/README.adoc b/examples/README.adoc
index e25c6b8..e2f4a08 100644
--- a/examples/README.adoc
+++ b/examples/README.adoc
@@ -233,6 +233,8 @@ Number of Examples: 103 (2 deprecated)
 | link:camel-example-cxf-ws-security-signature/README.md[CXF using WS-Security Signature] (camel-example-cxf-ws-security-signature) | WebService | CXF example using WS-Security Signature Action
 
 | link:camel-example-spring-ws/README.md[Spring WebService] (camel-example-spring-ws) | WebService | An example showing how to work with Camel and Spring Web Services
+
+| link:camel-example-bigxml-split/README.md[Split Test] (camel-example-bigxml-split) | Testing | An example showing how to deal with big XML files in Camel
 |===
 // examples: END
 
diff --git a/examples/camel-example-bigxml-split/README.md b/examples/camel-example-bigxml-split/README.md
new file mode 100644
index 0000000..f5b35dd
--- /dev/null
+++ b/examples/camel-example-bigxml-split/README.md
@@ -0,0 +1,60 @@
+# Splitting big XML payloads
+
+### Introduction
+This example shows how to deal with big XML files in Camel.  
+
+The XPath tokenizer will load the entire XML content into memory, so it's not well suited for very big XML payloads.  
+Instead you can use the StAX or XML tokenizers to efficiently iterate the XML payload in a streamed fashion.  
+For more information please read the [official documentation](http://camel.apache.org/splitter.html).
+
+There are 2 tests:
+
+1. `StaxTokenizerTest`: requires JAXB and processes messages using a SAX ContentHandler
+2. `XmlTokenizerTest`: easier to use, but can't handle complex XML structures (e.g. nested naming clashes)
+
+The test XML contains a simple collection of records.
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<records xmlns="http://fvaleri.it/records">
+    <record>
+        <key>0</key>
+        <value>The quick brown fox jumps over the lazy dog</value>
+    </record>
+</records>
+```
+
+You can customize `numOfRecords` and `maxWaitTime` (in `TestUtils`) to run performance tests with different payload sizes.  
+Max JVM heap is restricted to 20 MB to show that it works with a very limited amount of memory (see `pom.xml`).
+
+There are also a number of optional runtime settings:
+- no cache enabled
+- no parallel processing
+- no mock endpoints with in-memory exchange store
+- Throughput Logging enabled at DEBUG level
+- JMX instrumentation disabled
+
+### Build and run
+The test XML file is built once beforehand using `@BeforeClass`.
+```sh
+mvn clean test
+```
+
+### Test results
+Tested on a MacBook Pro (2.8 GHz Intel Core i7, 16 GB 2133 MHz LPDDR3) with Java 1.8.0_181.
+
+tokenizer | numOfRecords | maxWaitTime (ms) | XML size (kB) | time (ms) 
+--- | --- | --- | --- | --- 
+StAX | 40000 | 5000 | 3543 | 3052
+XML | 40000 | 5000 | 3543 | 2756
+StAX | 1000000 | 20000 | 89735 | 11740
+XML | 1000000 | 20000 | 89735 | 11137
+StAX | 15000000 | 200000 | 1366102 | 132176
+XML | 15000000 | 200000 | 1366102 | 132549
+
+### Forum, Help, etc
+If you hit any problems, please let us know on the Camel Forums
+<http://camel.apache.org/discussion-forums.html>
+
+Please help us make Apache Camel better - we appreciate any feedback you may have. Enjoy!
+
+The Camel riders!
diff --git a/examples/camel-example-bigxml-split/pom.xml b/examples/camel-example-bigxml-split/pom.xml
new file mode 100644
index 0000000..55f696d
--- /dev/null
+++ b/examples/camel-example-bigxml-split/pom.xml
@@ -0,0 +1,109 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" 
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.camel.example</groupId>
+        <artifactId>examples</artifactId>
+        <version>3.0.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>camel-example-bigxml-split</artifactId>
+    <packaging>jar</packaging>
+    <name>Camel :: Example :: Big XML Split</name>
+    <description>How to deal with big XML files in Camel</description>
+
+    <properties>
+        <maven.compiler.source>1.8</maven.compiler.source>
+        <maven.compiler.target>1.8</maven.compiler.target>
+        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
+    </properties>
+
+    <dependencies>
+
+        <dependency>
+            <groupId>org.apache.camel</groupId>
+            <artifactId>camel-core</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.camel</groupId>
+            <artifactId>camel-stax</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.woodstox</groupId>
+            <artifactId>woodstox-core</artifactId>
+            <version>5.2.0</version>
+        </dependency>
+
+        <!-- logging -->
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-api</artifactId>
+            <scope>runtime</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-core</artifactId>
+            <scope>runtime</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-slf4j-impl</artifactId>
+            <scope>runtime</scope>
+        </dependency>
+
+        <!-- for testing -->
+        <dependency>
+            <groupId>org.apache.camel</groupId>
+            <artifactId>camel-test</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <scope>test</scope>
+        </dependency>
+
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-surefire-plugin</artifactId>
+                <configuration>
+                    <argLine>-Xmx20m -Dfile.encoding=${project.build.sourceEncoding}</argLine>
+                    <useSystemClassLoader>true</useSystemClassLoader>
+                    <rerunFailingTestsCount>0</rerunFailingTestsCount>
+                    <forkCount>1</forkCount>
+                    <forkedProcessTimeoutInSeconds>0</forkedProcessTimeoutInSeconds>
+                    <useFile>true</useFile>
+                    <failIfNoTests>false</failIfNoTests>
+                    <runOrder>alphabetical</runOrder>
+                </configuration>
+            </plugin>
+        </plugins>
+    </build>
+
+</project>
diff --git a/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/Record.java b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/Record.java
new file mode 100644
index 0000000..eab0628
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/Record.java
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.example.bigxml;
+
+import javax.xml.bind.annotation.XmlAccessType;
+import javax.xml.bind.annotation.XmlAccessorType;
+import javax.xml.bind.annotation.XmlElement;
+import javax.xml.bind.annotation.XmlType;
+
+@XmlAccessorType(XmlAccessType.FIELD)
+@XmlType(name = "record", propOrder = { "key", "value" })
+public class Record {
+
+    @XmlElement(required = true)
+    protected String key;
+
+    @XmlElement(required = true)
+    protected String value;
+
+    public String getKey() {
+        return key;
+    }
+
+    public void setKey(String key) {
+        this.key = key;
+    }
+
+    public String getValue() {
+        return value;
+    }
+
+    public void setValue(String value) {
+        this.value = value;
+    }
+
+    @Override
+    public String toString() {
+        return "{key='" + getKey() + "', value='" + getValue() + "'}";
+    }
+
+}
diff --git a/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/StaxTokenizerTest.java b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/StaxTokenizerTest.java
new file mode 100644
index 0000000..aab4aa6
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/StaxTokenizerTest.java
@@ -0,0 +1,72 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.example.bigxml;
+
+import org.apache.camel.CamelContext;
+import org.apache.camel.builder.NotifyBuilder;
+import org.apache.camel.builder.RouteBuilder;
+import org.apache.camel.test.junit4.CamelTestSupport;
+
+import static org.apache.camel.component.stax.StAXBuilder.stax;
+
+import java.util.concurrent.TimeUnit;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class StaxTokenizerTest extends CamelTestSupport {
+
+    @BeforeClass
+    public static void beforeClass() throws Exception {
+        TestUtils.buildTestXml();
+    }
+    
+    @Override
+    protected CamelContext createCamelContext() throws Exception {
+        CamelContext ctx = super.createCamelContext();
+        ctx.disableJMX();
+        return ctx;
+    }
+
+    @Override
+    protected int getShutdownTimeout() {
+        return 300;
+    }
+
+    @Test
+    public void test() throws Exception {
+        NotifyBuilder notify = new NotifyBuilder(context).whenDone(TestUtils.getNumOfRecords()).create();
+        boolean matches = notify.matches(TestUtils.getMaxWaitTime(), TimeUnit.MILLISECONDS);
+        log.info("Processed XML file with {} records", TestUtils.getNumOfRecords());
+        assertTrue("Test completed", matches);
+    }
+
+    @Override
+    protected RouteBuilder createRouteBuilder() throws Exception {
+        return new RouteBuilder() {
+            @Override
+            public void configure() throws Exception {
+                from("file:" + TestUtils.getBasePath() + "?readLock=changed&noop=true")
+                    .split(stax(Record.class)).streaming().stopOnException()
+                        //.log(LoggingLevel.TRACE, "org.apache.camel.example.bigxml", "${body}")
+                        .to("log:org.apache.camel.example.bigxml?level=DEBUG&groupInterval=100&groupDelay=100&groupActiveOnly=false")
+                    .end();
+            }
+        };
+    }
+
+}
diff --git a/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/TestUtils.java b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/TestUtils.java
new file mode 100644
index 0000000..37e1d3c
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/TestUtils.java
@@ -0,0 +1,78 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.example.bigxml;
+
+import java.io.File;
+import java.io.FileOutputStream;
+
+import javax.xml.stream.XMLOutputFactory;
+import javax.xml.stream.XMLStreamWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class TestUtils {
+
+    private static final Logger log = LoggerFactory.getLogger(TestUtils.class);
+    private final static String basePath = System.getProperty("user.dir") + "/target/data";
+    private final static int numOfRecords = 40000;
+    private final static int maxWaitTime = 5000;
+
+    public static String getBasePath() {
+        return basePath;
+    }
+
+    public static int getNumOfRecords() {
+        return numOfRecords;
+    }
+
+    public static int getMaxWaitTime() {
+        return maxWaitTime;
+    }
+
+    public static void buildTestXml() throws Exception {
+        new File(basePath).mkdir();
+        File f = new File(basePath + "/test.xml");
+        if (!f.exists()) {
+            log.info("Building test XML file...");
+            XMLOutputFactory xof = XMLOutputFactory.newInstance();
+            XMLStreamWriter xsw = xof.createXMLStreamWriter(new FileOutputStream(f), "UTF-8");
+            try {
+                xsw.writeStartDocument("UTF-8", "1.0");
+                xsw.writeStartElement("records");
+                xsw.writeAttribute("xmlns", "http://fvaleri.it/records");
+                for (int i = 0; i < numOfRecords; i++) {
+                    xsw.writeStartElement("record");
+                    xsw.writeStartElement("key");
+                    xsw.writeCharacters("" + i);
+                    xsw.writeEndElement();
+                    xsw.writeStartElement("value");
+                    xsw.writeCharacters("The quick brown fox jumps over the lazy dog");
+                    xsw.writeEndElement();
+                    xsw.writeEndElement();
+                }
+                xsw.writeEndElement();
+                xsw.writeEndDocument();
+            } finally {
+                xsw.flush();
+                xsw.close();
+                log.info("Test XML file ready (size: {} kB)", f.length() / 1024);
+            }
+        }
+    }
+
+}
\ No newline at end of file
diff --git a/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/XmlTokenizerTest.java b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/XmlTokenizerTest.java
new file mode 100644
index 0000000..18e4f4f
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/XmlTokenizerTest.java
@@ -0,0 +1,70 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.camel.example.bigxml;
+
+import org.apache.camel.CamelContext;
+import org.apache.camel.builder.NotifyBuilder;
+import org.apache.camel.builder.RouteBuilder;
+import org.apache.camel.test.junit4.CamelTestSupport;
+
+import java.util.concurrent.TimeUnit;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class XmlTokenizerTest extends CamelTestSupport {
+
+    @BeforeClass
+    public static void beforeClass() throws Exception {
+        TestUtils.buildTestXml();
+    }
+
+    @Override
+    protected CamelContext createCamelContext() throws Exception {
+        CamelContext ctx = super.createCamelContext();
+        ctx.disableJMX();
+        return ctx;
+    }
+
+    @Override
+    protected int getShutdownTimeout() {
+        return 300;
+    }
+
+    @Test
+    public void test() throws Exception {
+        NotifyBuilder notify = new NotifyBuilder(context).whenDone(TestUtils.getNumOfRecords()).create();
+        boolean matches = notify.matches(TestUtils.getMaxWaitTime(), TimeUnit.MILLISECONDS);
+        log.info("Processed XML file with {} records", TestUtils.getNumOfRecords());
+        assertTrue("Test completed", matches);
+    }
+
+    @Override
+    protected RouteBuilder createRouteBuilder() throws Exception {
+        return new RouteBuilder() {
+            @Override
+            public void configure() throws Exception {
+                from("file:" + TestUtils.getBasePath() + "?readLock=changed&noop=true")
+                    .split(body().tokenizeXML("record", "records")).streaming().stopOnException()
+                        //.log(LoggingLevel.TRACE, "org.apache.camel.example.bigxml", "${body}")
+                        .to("log:org.apache.camel.example.bigxml?level=DEBUG&groupInterval=100&groupDelay=100&groupActiveOnly=false")
+                    .end();
+            }
+        };
+    }
+
+}
diff --git a/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/package-info.java b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/package-info.java
new file mode 100644
index 0000000..712f654
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/java/org/apache/camel/example/bigxml/package-info.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+@XmlSchema(namespace = "http://fvaleri.it/records", xmlns = {
+        @XmlNs(prefix = "", namespaceURI = "http://fvaleri.it/records") }, elementFormDefault = XmlNsForm.QUALIFIED)
+package org.apache.camel.example.bigxml;
+
+import javax.xml.bind.annotation.XmlNs;
+import javax.xml.bind.annotation.XmlNsForm;
+import javax.xml.bind.annotation.XmlSchema;
diff --git a/examples/camel-example-bigxml-split/src/test/resources/log4j2.properties b/examples/camel-example-bigxml-split/src/test/resources/log4j2.properties
new file mode 100644
index 0000000..2c19b57
--- /dev/null
+++ b/examples/camel-example-bigxml-split/src/test/resources/log4j2.properties
@@ -0,0 +1,27 @@
+## ---------------------------------------------------------------------------
+## Licensed to the Apache Software Foundation (ASF) under one or more
+## contributor license agreements.  See the NOTICE file distributed with
+## this work for additional information regarding copyright ownership.
+## The ASF licenses this file to You under the Apache License, Version 2.0
+## (the "License"); you may not use this file except in compliance with
+## the License.  You may obtain a copy of the License at
+##
+##      http://www.apache.org/licenses/LICENSE-2.0
+##
+## Unless required by applicable law or agreed to in writing, software
+## distributed under the License is distributed on an "AS IS" BASIS,
+## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+## See the License for the specific language governing permissions and
+## limitations under the License.
+## ---------------------------------------------------------------------------
+
+appender.out.type = Console
+appender.out.name = out
+appender.out.layout.type = PatternLayout
+appender.out.layout.pattern = %d [%20.20t] %-5p %20.20c{1} - %m%n
+rootLogger.level = INFO
+rootLogger.appenderRef.out.ref = out
+
+loggers = mine
+logger.mine.name = org.apache.camel.example.bigxml
+logger.mine.level = INFO
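
Note (illustration only, not part of the commit): a variant of the test-file generator in `TestUtils.buildTestXml()` could be sketched as follows, assuming only JDK classes. It manages the output stream with try-with-resources, declares the namespace through `writeDefaultNamespace`, and reads the file size only after the writer and stream are closed. The class name `RecordsXmlWriter` and the method `write` are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical variant of the example's generator; names are illustrative only.
public class RecordsXmlWriter {

    static final String NS = "http://fvaleri.it/records";

    // Writes numOfRecords <record> elements to target and returns the file size in bytes.
    static long write(Path target, int numOfRecords) throws IOException, XMLStreamException {
        // XMLStreamWriter.close() does not close the underlying stream,
        // so the stream itself is managed by try-with-resources.
        try (OutputStream out = Files.newOutputStream(target)) {
            XMLStreamWriter xsw = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            xsw.writeStartDocument("UTF-8", "1.0");
            xsw.writeStartElement("records");
            xsw.writeDefaultNamespace(NS);
            for (int i = 0; i < numOfRecords; i++) {
                xsw.writeStartElement("record");
                xsw.writeStartElement("key");
                xsw.writeCharacters(Integer.toString(i));
                xsw.writeEndElement();
                xsw.writeStartElement("value");
                xsw.writeCharacters("The quick brown fox jumps over the lazy dog");
                xsw.writeEndElement();
                xsw.writeEndElement();
            }
            xsw.writeEndElement();
            xsw.writeEndDocument();
            xsw.flush();
            xsw.close();
        }
        // The size is only meaningful once everything has been flushed and closed.
        return Files.size(target);
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("records", ".xml");
        long bytes = write(target, 1000);
        System.out.println("Test XML file ready (size: " + bytes / 1024 + " kB)");
        Files.delete(target);
    }
}
```

The sketch writes to a temporary file rather than `target/data/test.xml` so that it can run outside the Maven build.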