Posted to commits@kudu.apache.org by gr...@apache.org on 2019/07/31 01:40:05 UTC

[kudu] branch master updated (58e0149 -> e65b6d7)

This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git.


    from 58e0149  KUDU-2881 Support create/drop range partition by command line
     new 51a0d06  [master] Change the timeout threshold for unit test case
     new e65b6d7  [examples] Add a complete Nifi quickstart example

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/quickstart.adoc                          |   12 +-
 examples/quickstart/nifi/README.adoc          |  165 ++++
 examples/quickstart/nifi/Random_User_Kudu.xml | 1002 +++++++++++++++++++++++++
 examples/quickstart/spark/README.adoc         |    2 +-
 src/kudu/master/sentry_authz_provider-test.cc |   10 +-
 5 files changed, 1180 insertions(+), 11 deletions(-)
 create mode 100644 examples/quickstart/nifi/README.adoc
 create mode 100644 examples/quickstart/nifi/Random_User_Kudu.xml


[kudu] 02/02: [examples] Add a complete Nifi quickstart example

Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit e65b6d77a1afec28e3fe8cf592d61a1e4ea8656c
Author: Grant Henke <gr...@apache.org>
AuthorDate: Sun Jul 14 18:20:24 2019 -0500

    [examples] Add a complete Nifi quickstart example
    
    This patch adds a brief example using Apache Nifi
    to ingest data into Apache Kudu.
    
    Change-Id: I71f3bc5898c15d7bc19cffb3a91b9efac3f6928b
    Reviewed-on: http://gerrit.cloudera.org:8080/13878
    Tested-by: Grant Henke <gr...@apache.org>
    Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
 docs/quickstart.adoc                          |   12 +-
 examples/quickstart/nifi/README.adoc          |  165 ++++
 examples/quickstart/nifi/Random_User_Kudu.xml | 1002 +++++++++++++++++++++++++
 examples/quickstart/spark/README.adoc         |    2 +-
 4 files changed, 1174 insertions(+), 7 deletions(-)

diff --git a/docs/quickstart.adoc b/docs/quickstart.adoc
index e02506b..46e06a9 100644
--- a/docs/quickstart.adoc
+++ b/docs/quickstart.adoc
@@ -30,7 +30,7 @@
 Follow these instructions to set up and run a local Kudu Cluster using Docker,
 and get started using Apache Kudu in minutes.
 
-Note: This is intended for demonstration purposes only and shouldn't
+NOTE: This is intended for demonstration purposes only and shouldn't
 be used for production or performance/scale testing.
 
 [[quickstart_vm]]
@@ -48,8 +48,8 @@ Clone the Apache Kudu repository using Git and change to the `kudu` directory:
 
 [source,bash]
 ----
-$ git clone https://github.com/apache/kudu
-$ cd kudu
+git clone https://github.com/apache/kudu
+cd kudu
 ----
 
 == Start the Quickstart Cluster
@@ -60,7 +60,7 @@ Set the `KUDU_QUICKSTART_IP` environment variable to your ip address:
 
 [source,bash]
 ----
-$ export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
+export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
 ----
 
 === Bring up the Cluster
@@ -75,7 +75,7 @@ you can specify the master addresses with `localhost:7051,localhost:7151,localho
 docker-compose -f docker/quickstart.yml up
 ----
 
-Note: You can include the `-d` flag to run the cluster in the background.
+NOTE: You can include the `-d` flag to run the cluster in the background.
 
 === View the Web-UI
 
@@ -106,7 +106,7 @@ export KUDU_USER_NAME=kudu
 kudu cluster ksck localhost:7051,localhost:7151,localhost:7251
 ----
 
-Note: Setting `KUDU_USER_NAME=kudu` simplifies using Kudu from various user
+NOTE: Setting `KUDU_USER_NAME=kudu` simplifies using Kudu from various user
 accounts in a non-secure environment.
 
 == Running a Brief Example
diff --git a/examples/quickstart/nifi/README.adoc b/examples/quickstart/nifi/README.adoc
new file mode 100644
index 0000000..3d4e168
--- /dev/null
+++ b/examples/quickstart/nifi/README.adoc
@@ -0,0 +1,165 @@
+= Apache NiFi Quickstart
+
+Below is a brief example using Apache NiFi to ingest data into Apache Kudu.
+
+== Start the Kudu Quickstart Environment
+
+See the Apache Kudu
+link:https://kudu.apache.org/docs/quickstart.html[quickstart documentation]
+to set up and run the Kudu quickstart environment.
+
+== Run Apache NiFi
+
+Use the following command to run the latest Apache NiFi Docker image:
+
+[source,bash]
+----
+docker run --name kudu-nifi --network="docker_default" -p 8080:8080 apache/nifi:latest
+----
+
+You can view the running NiFi instance at link:http://localhost:8080/nifi[localhost:8080/nifi].
+
+NOTE: `--network="docker_default"` is specified to connect the container to the
+same network as the quickstart cluster.
+
+NOTE: You can include the `-d` flag to run the container in the background.
+
+== Create the Kudu table
+
+Create the `random_user` Kudu table that matches the expected Schema.
+
+In order to do this without any dependencies on your host machine, we will
+use the `jshell` REPL in a Docker container to create the table using the
+Java API. First, set up the Docker container, download the jar, and run the REPL:
+
+[source,bash]
+----
+docker run -it --rm --network="docker_default" maven:latest /bin/bash
+# Download the kudu-client-tools jar which has the kudu-client and all the dependencies.
+mkdir jars
+mvn dependency:copy \
+    -Dartifact=org.apache.kudu:kudu-client-tools:1.10.0 \
+    -DoutputDirectory=jars
+# Run the jshell with the jar on the classpath.
+jshell --class-path jars/*
+----
+
+NOTE: `--network="docker_default"` is specified to connect the container to the
+same network as the quickstart cluster.
+
+Then, once in the `jshell` REPL, create the table using the Java API:
+
+[source,java]
+----
+import org.apache.kudu.client.CreateTableOptions
+import org.apache.kudu.client.KuduClient
+import org.apache.kudu.client.KuduClient.KuduClientBuilder
+import org.apache.kudu.ColumnSchema.ColumnSchemaBuilder
+import org.apache.kudu.Schema
+import org.apache.kudu.Type
+
+KuduClient client =
+  new KuduClientBuilder("kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251").build();
+
+if(client.tableExists("random_user")) {
+  client.deleteTable("random_user");
+}
+
+Schema schema = new Schema(Arrays.asList(
+  new ColumnSchemaBuilder("ssn", Type.STRING).key(true).build(),
+  new ColumnSchemaBuilder("firstName", Type.STRING).build(),
+  new ColumnSchemaBuilder("lastName", Type.STRING).build(),
+  new ColumnSchemaBuilder("email", Type.STRING).build())
+);
+CreateTableOptions tableOptions =
+  new CreateTableOptions().setNumReplicas(3).addHashPartitions(Arrays.asList("ssn"), 4);
+client.createTable("random_user", schema, tableOptions);
+----
+
+Once complete, you can use `CTRL + D` to exit the REPL and `exit` to exit the container.
+
+== Load the Dataflow Template
+
+The `Random_User_Kudu.xml` template downloads randomly generated user data from
+http://randomuser.me and then pushes the data into Kudu. The data is pulled in
+100 records at a time and then split into individual records. The incoming data
+is in JSON Format.
+
+Next, the user's social security number, first name, last name, and e-mail
+address are extracted from the JSON into FlowFile Attributes and the content is
+modified to become a new JSON document consisting of only 4 fields:
+`ssn`, `firstName`, `lastName`, and `email`. Finally, this smaller JSON is then pushed to
+Kudu as a single row, each field being a separate column in that row.
+
+To load the template follow the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Import_Template["Importing a Template" documentation]
+to load `Random_User_Kudu.xml`.
+
+Then follow the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#instantiating-a-template["Instantiating a Template" documentation]
+to add the `Random User Kudu` template to the canvas.
+
+Once the template is added to the canvas you need to start the JsonTreeReader
+controller service. You can do this via the PutKudu processor configuration
+or via the NiFi Flow configuration in the Operate panel. See the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Controller_Services_for_Dataflows["Controller Service" documentation]
+for more details.
+
+Now you can start individual processors by right-clicking each processor and selecting `Start`.
+You can also explore the configuration, queue contents, and more by right-clicking on each element.
+Alternatively you can use the Operate panel and start the entire flow at once.
+More about starting and stopping NiFi components can be read in the NiFi
+link:https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#starting-a-component["Starting a Component" documentation].
+
+== Shutdown NiFi
+
+Once you are done with the NiFi container, you can shut it down in a couple of ways.
+If you ran NiFi without the `-d` flag, you can use `ctrl + c` to stop the container.
+
+If you ran NiFi with the `-d` flag, you can use the following to
+gracefully shut down the container:
+
+[source,bash]
+----
+docker stop kudu-nifi
+----
+
+To permanently remove the container run the following:
+
+[source,bash]
+----
+docker rm kudu-nifi
+----
+
+== Next steps
+
+The above example showed how to ingest data into Kudu using Apache NiFi.
+Next explore the other quickstart guides to learn how to query or process
+the data using other tools.
+
+For example, the link:https://github.com/apache/kudu/tree/master/examples/quickstart/spark[Spark quickstart guide]
+will walk you through how to set up and query Kudu tables with the `spark-kudu`
+integration.
+
+If you have already run through the Spark quickstart the following is a brief
+example of the code to allow you to query the `random_user` table:
+
+[source,bash]
+----
+spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.10.0
+----
+
+[source,scala]
+----
+:paste
+val random_user = spark.read
+	.option("kudu.master", "localhost:7051,localhost:7151,localhost:7251")
+	.option("kudu.table", "random_user")
+	// We need to use leader_only because Kudu on Docker currently doesn't
+	// support Snapshot scans due to `--use_hybrid_clock=false`.
+	.option("kudu.scanLocality", "leader_only")
+	.format("kudu").load
+random_user.createOrReplaceTempView("random_user")
+spark.sql("SELECT count(*) FROM random_user").show()
+spark.sql("SELECT * FROM random_user LIMIT 5").show()
+----
diff --git a/examples/quickstart/nifi/Random_User_Kudu.xml b/examples/quickstart/nifi/Random_User_Kudu.xml
new file mode 100644
index 0000000..158992a
--- /dev/null
+++ b/examples/quickstart/nifi/Random_User_Kudu.xml
@@ -0,0 +1,1002 @@
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<template encoding-version="1.2">
+    <description>This template downloads randomly generated user data from
+http://randomuser.me and then pushes the data into Kudu. The data is pulled in
+100 records at a time and then split into individual records. The incoming data
+is in JSON Format.
+
+Next, the user's social security number, first name, last name, and e-mail
+address are extracted from the JSON into FlowFile Attributes and the content is
+modified to become a new JSON document consisting of only 4 fields:
+ssn, firstName, lastName, email. Finally, this smaller JSON is then pushed to
+Kudu as a single row, each value being a separate column in that row.</description>
+    <groupId>00304107-016c-1000-2e69-8f2347fbf5c3</groupId>
+    <name>Random User Kudu</name>
+    <snippet>
+        <connections>
+            <id>2ebb7ae0-bb19-386d-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>10000</backPressureObjectThreshold>
+            <bends>
+                <x>469.6021968790567</x>
+                <y>1017.9549013717346</y>
+            </bends>
+            <bends>
+                <x>469.6021968790567</x>
+                <y>1067.9549013717346</y>
+            </bends>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>failure</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>7400e70c-689c-353f-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>61b913f5-e84d-33c4-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>split</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>786748c8-7a7c-3dd4-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <bends>
+                <x>173.46475219726562</x>
+                <y>179.42988967895508</y>
+            </bends>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>0</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>Response</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>6ada961c-399a-30dd-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>91e420fd-87d5-39e6-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>d18d7c78-8767-35c5-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>failure</selectedRelationships>
+            <selectedRelationships>success</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>5946c6a3-44fa-3784-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <connections>
+            <id>c518dc9b-e66c-3664-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <backPressureDataSizeThreshold>0 MB</backPressureDataSizeThreshold>
+            <backPressureObjectThreshold>0</backPressureObjectThreshold>
+            <destination>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>5946c6a3-44fa-3784-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </destination>
+            <flowFileExpiration>0 sec</flowFileExpiration>
+            <labelIndex>1</labelIndex>
+            <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
+            <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
+            <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
+            <name></name>
+            <selectedRelationships>matched</selectedRelationships>
+            <source>
+                <groupId>3d044f5c-470e-393b-0000-000000000000</groupId>
+                <id>61b913f5-e84d-33c4-0000-000000000000</id>
+                <type>PROCESSOR</type>
+            </source>
+            <zIndex>0</zIndex>
+        </connections>
+        <controllerServices>
+            <id>d8092989-d6ef-3313-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <bundle>
+                <artifact>nifi-record-serialization-services-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <descriptors>
+                <entry>
+                    <key>schema-access-strategy</key>
+                    <value>
+                        <name>schema-access-strategy</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-registry</key>
+                    <value>
+                        <identifiesControllerService>org.apache.nifi.schemaregistry.services.SchemaRegistry</identifiesControllerService>
+                        <name>schema-registry</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-name</key>
+                    <value>
+                        <name>schema-name</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-version</key>
+                    <value>
+                        <name>schema-version</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-branch</key>
+                    <value>
+                        <name>schema-branch</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-text</key>
+                    <value>
+                        <name>schema-text</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>schema-inference-cache</key>
+                    <value>
+                        <identifiesControllerService>org.apache.nifi.serialization.RecordSchemaCacheService</identifiesControllerService>
+                        <name>schema-inference-cache</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Date Format</key>
+                    <value>
+                        <name>Date Format</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Time Format</key>
+                    <value>
+                        <name>Time Format</name>
+                    </value>
+                </entry>
+                <entry>
+                    <key>Timestamp Format</key>
+                    <value>
+                        <name>Timestamp Format</name>
+                    </value>
+                </entry>
+            </descriptors>
+            <name>JsonTreeReader</name>
+            <persistsState>false</persistsState>
+            <properties>
+                <entry>
+                    <key>schema-access-strategy</key>
+                </entry>
+                <entry>
+                    <key>schema-registry</key>
+                </entry>
+                <entry>
+                    <key>schema-name</key>
+                </entry>
+                <entry>
+                    <key>schema-version</key>
+                </entry>
+                <entry>
+                    <key>schema-branch</key>
+                </entry>
+                <entry>
+                    <key>schema-text</key>
+                </entry>
+                <entry>
+                    <key>schema-inference-cache</key>
+                </entry>
+                <entry>
+                    <key>Date Format</key>
+                </entry>
+                <entry>
+                    <key>Time Format</key>
+                </entry>
+                <entry>
+                    <key>Timestamp Format</key>
+                </entry>
+            </properties>
+            <state>ENABLED</state>
+            <type>org.apache.nifi.json.JsonTreeReader</type>
+        </controllerServices>
+        <processors>
+            <id>1f4acd0d-2480-38ea-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>5.00505561901673</x>
+                <y>268.45753564705933</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>JsonPath Expression</key>
+                        <value>
+                            <name>JsonPath Expression</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>
+                            <name>Null Value Representation</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>JsonPath Expression</key>
+                        <value>$.results[*]</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>empty string</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>SplitJson</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>original</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>split</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.SplitJson</type>
+        </processors>
+        <processors>
+            <id>5946c6a3-44fa-3784-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>5.00036773572856</x>
+                <y>744.0256629035371</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Attributes List</key>
+                        <value>
+                            <name>Attributes List</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>attributes-to-json-regex</key>
+                        <value>
+                            <name>attributes-to-json-regex</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Destination</key>
+                        <value>
+                            <name>Destination</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Include Core Attributes</key>
+                        <value>
+                            <name>Include Core Attributes</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value</key>
+                        <value>
+                            <name>Null Value</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Attributes List</key>
+                        <value>ssn, firstName, lastName, email</value>
+                    </entry>
+                    <entry>
+                        <key>attributes-to-json-regex</key>
+                    </entry>
+                    <entry>
+                        <key>Destination</key>
+                        <value>flowfile-content</value>
+                    </entry>
+                    <entry>
+                        <key>Include Core Attributes</key>
+                        <value>true</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value</key>
+                        <value>false</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>AttributesToJSON</name>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>success</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.AttributesToJSON</type>
+        </processors>
+        <processors>
+            <id>61b913f5-e84d-33c4-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>6.4349386870426315</x>
+                <y>504.31885574631224</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Destination</key>
+                        <value>
+                            <name>Destination</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Return Type</key>
+                        <value>
+                            <name>Return Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Path Not Found Behavior</key>
+                        <value>
+                            <name>Path Not Found Behavior</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>
+                            <name>Null Value Representation</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>email</key>
+                        <value>
+                            <name>email</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>firstName</key>
+                        <value>
+                            <name>firstName</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>lastName</key>
+                        <value>
+                            <name>lastName</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>ssn</key>
+                        <value>
+                            <name>ssn</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Destination</key>
+                        <value>flowfile-attribute</value>
+                    </entry>
+                    <entry>
+                        <key>Return Type</key>
+                        <value>auto-detect</value>
+                    </entry>
+                    <entry>
+                        <key>Path Not Found Behavior</key>
+                        <value>ignore</value>
+                    </entry>
+                    <entry>
+                        <key>Null Value Representation</key>
+                        <value>empty string</value>
+                    </entry>
+                    <entry>
+                        <key>email</key>
+                        <value>$.email</value>
+                    </entry>
+                    <entry>
+                        <key>firstName</key>
+                        <value>$.name.first</value>
+                    </entry>
+                    <entry>
+                        <key>lastName</key>
+                        <value>$.name.last</value>
+                    </entry>
+                    <entry>
+                        <key>ssn</key>
+                        <value>$.id.value</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>EvaluateJsonPath</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>matched</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>unmatched</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.EvaluateJsonPath</type>
+        </processors>
+        <processors>
+            <id>6ada961c-399a-30dd-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>0.0</x>
+                <y>0.0</y>
+            </position>
+            <bundle>
+                <artifact>nifi-standard-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>HTTP Method</key>
+                        <value>
+                            <name>HTTP Method</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Remote URL</key>
+                        <value>
+                            <name>Remote URL</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>SSL Context Service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
+                            <name>SSL Context Service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Connection Timeout</key>
+                        <value>
+                            <name>Connection Timeout</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Read Timeout</key>
+                        <value>
+                            <name>Read Timeout</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Include Date Header</key>
+                        <value>
+                            <name>Include Date Header</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Follow Redirects</key>
+                        <value>
+                            <name>Follow Redirects</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Attributes to Send</key>
+                        <value>
+                            <name>Attributes to Send</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Username</key>
+                        <value>
+                            <name>Basic Authentication Username</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Password</key>
+                        <value>
+                            <name>Basic Authentication Password</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>proxy-configuration-service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.proxy.ProxyConfigurationService</identifiesControllerService>
+                            <name>proxy-configuration-service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Host</key>
+                        <value>
+                            <name>Proxy Host</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Port</key>
+                        <value>
+                            <name>Proxy Port</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Proxy Type</key>
+                        <value>
+                            <name>Proxy Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-user</key>
+                        <value>
+                            <name>invokehttp-proxy-user</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-password</key>
+                        <value>
+                            <name>invokehttp-proxy-password</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Put Response Body In Attribute</key>
+                        <value>
+                            <name>Put Response Body In Attribute</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Max Length To Put In Attribute</key>
+                        <value>
+                            <name>Max Length To Put In Attribute</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Digest Authentication</key>
+                        <value>
+                            <name>Digest Authentication</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Always Output Response</key>
+                        <value>
+                            <name>Always Output Response</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Trusted Hostname</key>
+                        <value>
+                            <name>Trusted Hostname</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Add Response Headers to Request</key>
+                        <value>
+                            <name>Add Response Headers to Request</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Content-Type</key>
+                        <value>
+                            <name>Content-Type</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>send-message-body</key>
+                        <value>
+                            <name>send-message-body</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Use Chunked Encoding</key>
+                        <value>
+                            <name>Use Chunked Encoding</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Penalize on "No Retry"</key>
+                        <value>
+                            <name>Penalize on "No Retry"</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>use-etag</key>
+                        <value>
+                            <name>use-etag</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>etag-max-cache-size</key>
+                        <value>
+                            <name>etag-max-cache-size</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>HTTP Method</key>
+                        <value>GET</value>
+                    </entry>
+                    <entry>
+                        <key>Remote URL</key>
+                        <value>http://api.randomuser.me?nat=us&amp;results=100</value>
+                    </entry>
+                    <entry>
+                        <key>SSL Context Service</key>
+                    </entry>
+                    <entry>
+                        <key>Connection Timeout</key>
+                        <value>5 secs</value>
+                    </entry>
+                    <entry>
+                        <key>Read Timeout</key>
+                        <value>15 secs</value>
+                    </entry>
+                    <entry>
+                        <key>Include Date Header</key>
+                        <value>True</value>
+                    </entry>
+                    <entry>
+                        <key>Follow Redirects</key>
+                        <value>True</value>
+                    </entry>
+                    <entry>
+                        <key>Attributes to Send</key>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Username</key>
+                    </entry>
+                    <entry>
+                        <key>Basic Authentication Password</key>
+                    </entry>
+                    <entry>
+                        <key>proxy-configuration-service</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Host</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Port</key>
+                    </entry>
+                    <entry>
+                        <key>Proxy Type</key>
+                        <value>http</value>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-user</key>
+                    </entry>
+                    <entry>
+                        <key>invokehttp-proxy-password</key>
+                    </entry>
+                    <entry>
+                        <key>Put Response Body In Attribute</key>
+                    </entry>
+                    <entry>
+                        <key>Max Length To Put In Attribute</key>
+                        <value>256</value>
+                    </entry>
+                    <entry>
+                        <key>Digest Authentication</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Always Output Response</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Trusted Hostname</key>
+                    </entry>
+                    <entry>
+                        <key>Add Response Headers to Request</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Content-Type</key>
+                        <value>${mime.type}</value>
+                    </entry>
+                    <entry>
+                        <key>send-message-body</key>
+                        <value>true</value>
+                    </entry>
+                    <entry>
+                        <key>Use Chunked Encoding</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>Penalize on "No Retry"</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>use-etag</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>etag-max-cache-size</key>
+                        <value>10MB</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>10 seconds</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>Fetch User Data</name>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>No Retry</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Original</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>Response</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>Retry</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.standard.InvokeHTTP</type>
+        </processors>
+        <processors>
+            <id>d18d7c78-8767-35c5-0000-000000000000</id>
+            <parentGroupId>3d044f5c-470e-393b-0000-000000000000</parentGroupId>
+            <position>
+                <x>6.6021968790567485</x>
+                <y>977.9549013717346</y>
+            </position>
+            <bundle>
+                <artifact>nifi-kudu-nar</artifact>
+                <group>org.apache.nifi</group>
+                <version>1.9.2</version>
+            </bundle>
+            <config>
+                <bulletinLevel>WARN</bulletinLevel>
+                <comments></comments>
+                <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
+                <descriptors>
+                    <entry>
+                        <key>Kudu Masters</key>
+                        <value>
+                            <name>Kudu Masters</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Table Name</key>
+                        <value>
+                            <name>Table Name</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>kerberos-credentials-service</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.kerberos.KerberosCredentialsService</identifiesControllerService>
+                            <name>kerberos-credentials-service</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Skip head line</key>
+                        <value>
+                            <name>Skip head line</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>record-reader</key>
+                        <value>
+                            <identifiesControllerService>org.apache.nifi.serialization.RecordReaderFactory</identifiesControllerService>
+                            <name>record-reader</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Insert Operation</key>
+                        <value>
+                            <name>Insert Operation</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Flush Mode</key>
+                        <value>
+                            <name>Flush Mode</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>FlowFiles per Batch</key>
+                        <value>
+                            <name>FlowFiles per Batch</name>
+                        </value>
+                    </entry>
+                    <entry>
+                        <key>Batch Size</key>
+                        <value>
+                            <name>Batch Size</name>
+                        </value>
+                    </entry>
+                </descriptors>
+                <executionNode>ALL</executionNode>
+                <lossTolerant>false</lossTolerant>
+                <penaltyDuration>30 sec</penaltyDuration>
+                <properties>
+                    <entry>
+                        <key>Kudu Masters</key>
+                        <value>kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251</value>
+                    </entry>
+                    <entry>
+                        <key>Table Name</key>
+                        <value>random_user</value>
+                    </entry>
+                    <entry>
+                        <key>kerberos-credentials-service</key>
+                    </entry>
+                    <entry>
+                        <key>Skip head line</key>
+                        <value>false</value>
+                    </entry>
+                    <entry>
+                        <key>record-reader</key>
+                        <value>d8092989-d6ef-3313-0000-000000000000</value>
+                    </entry>
+                    <entry>
+                        <key>Insert Operation</key>
+                        <value>UPSERT</value>
+                    </entry>
+                    <entry>
+                        <key>Flush Mode</key>
+                        <value>AUTO_FLUSH_BACKGROUND</value>
+                    </entry>
+                    <entry>
+                        <key>FlowFiles per Batch</key>
+                        <value>1</value>
+                    </entry>
+                    <entry>
+                        <key>Batch Size</key>
+                        <value>100</value>
+                    </entry>
+                </properties>
+                <runDurationMillis>0</runDurationMillis>
+                <schedulingPeriod>0 sec</schedulingPeriod>
+                <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
+                <yieldDuration>1 sec</yieldDuration>
+            </config>
+            <executionNodeRestricted>false</executionNodeRestricted>
+            <name>PutKudu</name>
+            <relationships>
+                <autoTerminate>false</autoTerminate>
+                <name>failure</name>
+            </relationships>
+            <relationships>
+                <autoTerminate>true</autoTerminate>
+                <name>success</name>
+            </relationships>
+            <state>STOPPED</state>
+            <style/>
+            <type>org.apache.nifi.processors.kudu.PutKudu</type>
+        </processors>
+    </snippet>
+    <timestamp>07/18/2019 14:13:34 UTC</timestamp>
+</template>
diff --git a/examples/quickstart/spark/README.adoc b/examples/quickstart/spark/README.adoc
index 42953fe..b7ec637 100644
--- a/examples/quickstart/spark/README.adoc
+++ b/examples/quickstart/spark/README.adoc
@@ -3,7 +3,7 @@
 Below is a brief example using Apache Spark to load, query, and modify a real
 data set in Apache Kudu.
 
-== Start the Kudu Quickstart
+== Start the Kudu Quickstart Environment
 
 See the Apache Kudu
 link:https://kudu.apache.org/docs/quickstart.html[quickstart documentation]
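
The EvaluateJsonPath processor in the template above maps four JsonPath
expressions ($.email, $.name.first, $.name.last, $.id.value) to the FlowFile
attributes email, firstName, lastName, and ssn. The following is a minimal
sketch of that mapping, not NiFi's implementation: resolve_simple_path is a
hypothetical helper that handles only plain dotted paths, the sample record is
made up to match the paths, and treating missing paths and nulls as an empty
string is an assumption based on the "ignore" and "empty string" property
values in the template.

```python
import json

# JsonPath expressions copied from the EvaluateJsonPath properties in the template.
ATTRIBUTE_PATHS = {
    "email": "$.email",
    "firstName": "$.name.first",
    "lastName": "$.name.last",
    "ssn": "$.id.value",
}


def resolve_simple_path(document, path):
    """Resolve a dotted path such as '$.name.first' against parsed JSON.

    Hypothetical helper: only plain child-member steps are handled, so this
    is not a full JsonPath implementation. Returns None if a step is missing.
    """
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    current = document
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def extract_attributes(flowfile_content):
    """Return the attribute map the four paths would produce for one record."""
    record = json.loads(flowfile_content)
    attributes = {}
    for name, path in ATTRIBUTE_PATHS.items():
        value = resolve_simple_path(record, path)
        # Assumption: missing paths and JSON nulls both become an empty string.
        attributes[name] = "" if value is None else str(value)
    return attributes


# Hypothetical sample record; the values are made up for illustration.
SAMPLE = json.dumps({
    "email": "jane.doe@example.com",
    "name": {"title": "Ms", "first": "Jane", "last": "Doe"},
    "id": {"name": "SSN", "value": "000-00-0000"},
})

if __name__ == "__main__":
    print(extract_attributes(SAMPLE))
```

Because the processor's Destination is flowfile-attribute, the extracted values
travel with the FlowFile as attributes and the JSON content itself is left
unchanged for downstream processors.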


[kudu] 01/02: [master] Change the timeout threshold for unit test case

Posted by gr...@apache.org.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 51a0d066900fa1e9a35ab0f0fc5eff92099c8a88
Author: helifu <hz...@corp.netease.com>
AuthorDate: Wed Jul 24 17:38:55 2019 +0800

    [master] Change the timeout threshold for unit test case
    
    The previous patch used KUDU_ALLOW_SLOW_TEST to avoid failures in some
    environments. This patch takes a different approach: it fixes the send
    and receive timeouts at 2 seconds and lowers the asserted threshold to
    1900000 us so that the assertions hold.
    
    Change-Id: Ief5ec23c83022c2c5dbcc65a110ad8fe1f91f1a7
    Reviewed-on: http://gerrit.cloudera.org:8080/13909
    Tested-by: Kudu Jenkins
    Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
 src/kudu/master/sentry_authz_provider-test.cc | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/kudu/master/sentry_authz_provider-test.cc b/src/kudu/master/sentry_authz_provider-test.cc
index 2c5efed..a3a694c 100644
--- a/src/kudu/master/sentry_authz_provider-test.cc
+++ b/src/kudu/master/sentry_authz_provider-test.cc
@@ -1393,8 +1393,8 @@ TEST_F(TestSentryClientMetrics, Basic) {
   // Shorten the default timeout parameters: make timeout interval shorter.
   NO_FATALS(sentry_authz_provider_->Stop());
   FLAGS_sentry_service_rpc_addresses = sentry_->address().ToString();
-  FLAGS_sentry_service_send_timeout_seconds = AllowSlowTests() ? 5 : 2;
-  FLAGS_sentry_service_recv_timeout_seconds = AllowSlowTests() ? 5 : 2;
+  FLAGS_sentry_service_send_timeout_seconds = 2;
+  FLAGS_sentry_service_recv_timeout_seconds = 2;
   sentry_authz_provider_.reset(new SentryAuthzProvider(metric_entity_));
   ASSERT_OK(sentry_authz_provider_->Start());
 
@@ -1412,9 +1412,11 @@ TEST_F(TestSentryClientMetrics, Basic) {
   scoped_refptr<Histogram> hist(metric_entity_->FindOrCreateHistogram(
       &METRIC_sentry_client_task_execution_time_us));
   ASSERT_LT(0, hist->histogram()->MinValue());
-  ASSERT_LT(2000000, hist->histogram()->MaxValue());
+  // Use 1900000 rather than 2000000 to tolerate an unstable system clock
+  // and other OS scheduler anomalies.
+  ASSERT_LT(1900000, hist->histogram()->MaxValue());
   ASSERT_LE(5, hist->histogram()->TotalCount());
-  ASSERT_LT(2000000, hist->histogram()->TotalSum());
+  ASSERT_LT(1900000, hist->histogram()->TotalSum());
 }
 
 enum class ThreadsNumPolicy {