Posted to commits@flink.apache.org by se...@apache.org on 2015/08/10 16:37:28 UTC

[1/4] flink git commit: [FLINK-2453] [docs] Update README and setup docs to reflect requirement for Java 7+

Repository: flink
Updated Branches:
  refs/heads/master fdebcb83d -> 0ce757fd7


[FLINK-2453] [docs] Update README and setup docs to reflect requirement for Java 7+


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/e503ebcf
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/e503ebcf
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/e503ebcf

Branch: refs/heads/master
Commit: e503ebcfc2330544c7bebcdaead6b1c5292409a9
Parents: 368d7ae
Author: Stephan Ewen <se...@apache.org>
Authored: Thu Aug 6 20:29:24 2015 +0200
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Aug 10 16:36:36 2015 +0200

----------------------------------------------------------------------
 README.md                   | 2 +-
 docs/setup/cluster_setup.md | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/e503ebcf/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index d9250b8..3cf08c7 100644
--- a/README.md
+++ b/README.md
@@ -35,7 +35,7 @@ Prerequisites for building Flink:
 * Unix-like environment (We use Linux, Mac OS X, Cygwin)
 * git
 * Maven (at least version 3.0.4)
-* Java 6, 7 or 8 (Note that Oracle's JDK 6 library will fail to build Flink, but is able to run a pre-compiled package without problem)
+* Java 7 or 8
 
 ```
 git clone https://github.com/apache/flink.git

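A quick sanity check of the new prerequisite before building (a minimal sketch; any Java 7 or 8 JDK satisfies the requirement):

```
# Verify that a Java 7+ JDK is on the PATH before building Flink
java -version    # should report 1.7.x or 1.8.x
javac -version   # a full JDK (not just a JRE) is needed to compile
```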
http://git-wip-us.apache.org/repos/asf/flink/blob/e503ebcf/docs/setup/cluster_setup.md
----------------------------------------------------------------------
diff --git a/docs/setup/cluster_setup.md b/docs/setup/cluster_setup.md
index 232708b..474aace 100644
--- a/docs/setup/cluster_setup.md
+++ b/docs/setup/cluster_setup.md
@@ -40,7 +40,7 @@ and **Cygwin** (for Windows) and expects the cluster to consist of **one master
 node** and **one or more worker nodes**. Before you start to set up the system,
 make sure you have the following software installed **on each node**:
 
-- **Java 1.6.x** or higher,
+- **Java 1.7.x** or higher,
 - **ssh** (sshd must be running to use the Flink scripts that manage
   remote components)
 
@@ -65,9 +65,9 @@ The command should output something comparable to the following on every node of
 your cluster (depending on your Java version, there may be small differences):
 
 ~~~bash
-java version "1.6.0_22"
-Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
-Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)
+java version "1.7.0_55"
+Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
+Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
 ~~~
 
 To make sure the ssh daemon is running properly, you can use the command

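Since the Java check must pass **on each node**, it can be convenient to run it from the master over ssh. A sketch, assuming passwordless ssh is set up and the worker host names are listed one per line in a file such as *conf/slaves*:

~~~bash
# Print the Java version reported by every worker node
for host in $(cat conf/slaves); do
    echo "--- $host ---"
    ssh "$host" java -version
done
~~~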

[3/4] flink git commit: [FLINK-1177] [docs] Mark HDFS setup instructions as option in cluster setup docs

Posted by se...@apache.org.
[FLINK-1177] [docs] Mark HDFS setup instructions as option in cluster setup docs


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/27044b41
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/27044b41
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/27044b41

Branch: refs/heads/master
Commit: 27044b4107eac80a1f24afaf4b1a4a79af423d76
Parents: e503ebc
Author: Stephan Ewen <se...@apache.org>
Authored: Thu Aug 6 20:39:34 2015 +0200
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Aug 10 16:36:37 2015 +0200

----------------------------------------------------------------------
 docs/setup/cluster_setup.md | 200 ++++++++++++++++++++-------------------
 1 file changed, 103 insertions(+), 97 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/27044b41/docs/setup/cluster_setup.md
----------------------------------------------------------------------
diff --git a/docs/setup/cluster_setup.md b/docs/setup/cluster_setup.md
index 474aace..29df3de 100644
--- a/docs/setup/cluster_setup.md
+++ b/docs/setup/cluster_setup.md
@@ -158,14 +158,111 @@ echo "PermitUserEnvironment yes" >> /etc/ssh/sshd_config
 /etc/init.d/sshd restart
 ~~~
 
-## Hadoop Distributed Filesystem (HDFS) Setup
 
-The Flink system currently uses the Hadoop Distributed Filesystem (HDFS)
-to read and write data in a distributed fashion. It is possible to use
-Flink without HDFS or other distributed file systems.
+## Flink Setup
+
+Go to the [downloads page]({{site.baseurl}}/downloads.html) and get the ready-to-run
+package. Make sure to pick the Flink package **matching your Hadoop
+version**.
+
+After downloading the latest release, copy the archive to your master node and
+extract it:
+
+~~~bash
+tar xzf flink-*.tgz
+cd flink-*
+~~~
+
+### Configuring the Cluster
+
+After extracting the system files, you need to configure Flink for
+the cluster by editing *conf/flink-conf.yaml*.
+
+Set the `jobmanager.rpc.address` key to point to your master node. Furthermore,
+define the maximum amount of main memory the JVM is allowed to allocate on each
+node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys.
+
+The value is given in MB. If some worker nodes have more main memory that you
+want to allocate to the Flink system, you can overwrite the default value
+by setting the environment variable `FLINK_TM_HEAP` on the respective
+nodes.
+
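For illustration, the relevant entries in *conf/flink-conf.yaml* could look as follows (host name and heap sizes are placeholder values, not recommended defaults):

~~~
# host name (or IP address) of the master node
jobmanager.rpc.address: master-node

# JVM heap sizes in MB
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
~~~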
+Finally, you must provide a list of all nodes in your cluster that shall be used
+as worker nodes. To do so, similar to the HDFS configuration, edit the file
+*conf/slaves* and enter the IP/host name of each worker node. Each worker node
+will later run a TaskManager.
+
+Each entry must be separated by a new line, as in the following example:
+
+~~~
+192.168.0.100
+192.168.0.101
+.
+.
+.
+192.168.0.150
+~~~
+
+The Flink directory must be available on every worker under the same
+path. As with HDFS, you can use a shared NFS directory, or copy the
+entire Flink directory to every worker node.
+
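One way to replicate the directory is sketched below, assuming rsync is available, passwordless ssh is set up, and Flink lives under a hypothetical path such as */opt/flink* on every machine:

~~~bash
# Copy the Flink directory to each worker under the same path
for host in $(cat conf/slaves); do
    rsync -az /opt/flink/ "$host:/opt/flink/"
done
~~~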
+Please see the [configuration page](config.html) for details and additional
+configuration options.
+
+In particular, 
+
+ * the amount of available memory per TaskManager (`taskmanager.heap.mb`), 
+ * the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`),
+ * the total number of CPUs in the cluster (`parallelism.default`) and
+ * the temporary directories (`taskmanager.tmp.dirs`)
+
+are very important configuration values.
+
+
+### Starting Flink
+
+The following script starts a JobManager on the local node and connects via
+SSH to all worker nodes listed in the *slaves* file to start a
+TaskManager on each node. Once the script has completed, your Flink system is up
+and running, and the JobManager on the local node accepts jobs
+at the configured RPC port.
+
+Assuming that you are on the master node and inside the Flink directory:
+
+~~~bash
+bin/start-cluster.sh
+~~~
+
+To stop Flink, there is also a `stop-cluster.sh` script.
+
+
+### Starting Flink in streaming mode
+
+~~~bash
+bin/start-cluster-streaming.sh
+~~~
+
+The streaming mode changes the startup behavior of Flink: the system does not
+bring up the managed memory services with preallocated memory at startup.
+Flink streaming does not use the managed memory employed by the batch operators.
+By not starting these services with preallocated memory, streaming jobs can
+benefit from more heap space being available.
+
+Note that you can still start batch jobs in streaming mode. The memory manager
+will then allocate memory segments from the Java heap as needed.
+
+
+## Optional: Hadoop Distributed Filesystem (HDFS) Setup
 
-Make sure to have a running HDFS installation. The following instructions are
-just a general overview of some required settings. Please consult one of the
+**NOTE:** Flink does not require HDFS to run; HDFS is simply a typical choice of a distributed data
+store to read data from (in parallel) and write results to.
+If HDFS is already available on the cluster, or Flink is used purely with different storage
+techniques (e.g., Apache Kafka, JDBC, RabbitMQ, or other storage systems or message queues), this
+setup step is not needed.
+
+
+The following instructions give a general overview of the usually required settings. Please consult one of the
 many installation guides available online for more detailed instructions.
 
 __Note that the following instructions are based on Hadoop 1.2 and might differ 
@@ -271,95 +368,4 @@ like to point you to the [Hadoop Quick
 Start](http://wiki.apache.org/hadoop/QuickStart)
 guide.
 
-## Flink Setup
-
-Go to the [downloads page]({{site.baseurl}}/downloads.html) and get the ready-to-run
-package. Make sure to pick the Flink package **matching your Hadoop
-version**.
-
-After downloading the latest release, copy the archive to your master node and
-extract it:
-
-~~~bash
-tar xzf flink-*.tgz
-cd flink-*
-~~~
-
-### Configuring the Cluster
-
-After extracting the system files, you need to configure Flink for
-the cluster by editing *conf/flink-conf.yaml*.
-
-Set the `jobmanager.rpc.address` key to point to your master node. Furthermore,
-define the maximum amount of main memory the JVM is allowed to allocate on each
-node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys.
 
-The value is given in MB. If some worker nodes have more main memory that you
-want to allocate to the Flink system, you can overwrite the default value
-by setting the environment variable `FLINK_TM_HEAP` on the respective
-nodes.
-
-Finally, you must provide a list of all nodes in your cluster that shall be used
-as worker nodes. To do so, similar to the HDFS configuration, edit the file
-*conf/slaves* and enter the IP/host name of each worker node. Each worker node
-will later run a TaskManager.
-
-Each entry must be separated by a new line, as in the following example:
-
-~~~
-192.168.0.100
-192.168.0.101
-.
-.
-.
-192.168.0.150
-~~~
-
-The Flink directory must be available on every worker under the same
-path. As with HDFS, you can use a shared NFS directory, or copy the
-entire Flink directory to every worker node.
-
-Please see the [configuration page](config.html) for details and additional
-configuration options.
-
-In particular, 
-
- * the amount of available memory per TaskManager (`taskmanager.heap.mb`), 
- * the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`),
- * the total number of CPUs in the cluster (`parallelism.default`) and
- * the temporary directories (`taskmanager.tmp.dirs`)
-
-are very important configuration values.
-
-
-### Starting Flink
-
-The following script starts a JobManager on the local node and connects via
-SSH to all worker nodes listed in the *slaves* file to start a
-TaskManager on each node. Once the script has completed, your Flink system is up
-and running, and the JobManager on the local node accepts jobs
-at the configured RPC port.
-
-Assuming that you are on the master node and inside the Flink directory:
-
-~~~bash
-bin/start-cluster.sh
-~~~
-
-To stop Flink, there is also a `stop-cluster.sh` script.
-
-
-### Starting Flink in streaming mode
-
-~~~bash
-bin/start-cluster-streaming.sh
-~~~
-
-The streaming mode changes the startup behavior of Flink: the system does not
-bring up the managed memory services with preallocated memory at startup.
-Flink streaming does not use the managed memory employed by the batch operators.
-By not starting these services with preallocated memory, streaming jobs can
-benefit from more heap space being available.
-
-Note that you can still start batch jobs in streaming mode. The memory manager
-will then allocate memory segments from the Java heap as needed.


[4/4] flink git commit: [hotfix] Fix repository language statistics by adjusting .gitattributes

Posted by se...@apache.org.
[hotfix] Fix repository language statistics by adjusting .gitattributes


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/0ce757fd
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/0ce757fd
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/0ce757fd

Branch: refs/heads/master
Commit: 0ce757fd72446a60d5b79ea970dfb545406e3544
Parents: 27044b4
Author: Stephan Ewen <se...@apache.org>
Authored: Mon Aug 10 15:37:33 2015 +0200
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Aug 10 16:36:37 2015 +0200

----------------------------------------------------------------------
 .gitattributes | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/0ce757fd/.gitattributes
----------------------------------------------------------------------
diff --git a/.gitattributes b/.gitattributes
index 69b47b5..b68afa6 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1 +1,3 @@
 *.bat text eol=crlf
+flink-runtime-web/web-dashboard/web/* linguist-vendored
+

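Marking the generated dashboard files as vendored tells GitHub's Linguist to exclude them from the repository's language statistics. To check that the attribute is picked up, one can query git directly (the file name below is a hypothetical example):

~~~bash
# Show whether the linguist-vendored attribute applies to a generated file
git check-attr linguist-vendored -- flink-runtime-web/web-dashboard/web/index.html
# expected: "...: linguist-vendored: set"
~~~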

[2/4] flink git commit: [FLINK-705] [license] Add license reference for TPC-H sample data in tests.

Posted by se...@apache.org.
[FLINK-705] [license] Add license reference for TPC-H sample data in tests.


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/368d7aee
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/368d7aee
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/368d7aee

Branch: refs/heads/master
Commit: 368d7aee1ce3a4a2fd458628852a96db098ae635
Parents: fdebcb8
Author: Stephan Ewen <se...@apache.org>
Authored: Thu Aug 6 19:42:13 2015 +0200
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Aug 10 16:36:36 2015 +0200

----------------------------------------------------------------------
 .../flink/test/recordJobTests/TPCHQuery3ITCase.java   | 14 ++++++++++++++
 .../recordJobTests/TPCHQuery3WithUnionITCase.java     | 14 +++++++++++++-
 .../flink/test/recordJobTests/TPCHQuery9ITCase.java   | 14 +++++++++++++-
 .../test/recordJobTests/TPCHQueryAsterixITCase.java   | 14 +++++++++++++-
 4 files changed, 53 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/368d7aee/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3ITCase.java
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3ITCase.java b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3ITCase.java
index a0236c2..159c931 100644
--- a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3ITCase.java
+++ b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3ITCase.java
@@ -24,10 +24,24 @@ import org.apache.flink.api.common.Plan;
 import org.apache.flink.configuration.Configuration;
 import org.apache.flink.test.recordJobs.relational.TPCHQuery3;
 import org.apache.flink.test.util.RecordAPITestBase;
+
 import org.junit.runner.RunWith;
 import org.junit.runners.Parameterized;
 import org.junit.runners.Parameterized.Parameters;
 
+// -----------------------------------------------------------------------------
+//  --- NOTE ---
+//
+//  This class contains test data generated by tools from the
+//  Transaction Processing Performance Council (TPC), specifically the
+//  TPC-H benchmark's data generator.
+//
+//  Any form of use and redistribution must happen in accordance with the TPC-H
+//  Software License Agreement.
+// 
+//  For details, see http://www.tpc.org/tpch/dbgen/tpc-h%20license%20agreement.pdf
+// -----------------------------------------------------------------------------
+
 @RunWith(Parameterized.class)
 public class TPCHQuery3ITCase extends RecordAPITestBase {
 	

http://git-wip-us.apache.org/repos/asf/flink/blob/368d7aee/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3WithUnionITCase.java
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3WithUnionITCase.java b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3WithUnionITCase.java
index 3ade964..4c51ac7 100644
--- a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3WithUnionITCase.java
+++ b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery3WithUnionITCase.java
@@ -16,13 +16,25 @@
  * limitations under the License.
  */
 
-
 package org.apache.flink.test.recordJobTests;
 
 import org.apache.flink.api.common.Plan;
 import org.apache.flink.test.recordJobs.relational.TPCHQuery3Unioned;
 import org.apache.flink.test.util.RecordAPITestBase;
 
+// -----------------------------------------------------------------------------
+//  --- NOTE ---
+//
+//  This class contains test data generated by tools from the
+//  Transaction Processing Performance Council (TPC), specifically the
+//  TPC-H benchmark's data generator.
+//
+//  Any form of use and redistribution must happen in accordance with the TPC-H
+//  Software License Agreement.
+// 
+//  For details, see http://www.tpc.org/tpch/dbgen/tpc-h%20license%20agreement.pdf
+// -----------------------------------------------------------------------------
+
 public class TPCHQuery3WithUnionITCase extends RecordAPITestBase {
 	
 	private String orders1Path = null;

http://git-wip-us.apache.org/repos/asf/flink/blob/368d7aee/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery9ITCase.java
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery9ITCase.java b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery9ITCase.java
index f092400..ab99e22 100644
--- a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery9ITCase.java
+++ b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQuery9ITCase.java
@@ -16,13 +16,25 @@
  * limitations under the License.
  */
 
-
 package org.apache.flink.test.recordJobTests;
 
 import org.apache.flink.api.common.Plan;
 import org.apache.flink.test.recordJobs.relational.TPCHQuery9;
 import org.apache.flink.test.util.RecordAPITestBase;
 
+// -----------------------------------------------------------------------------
+//  --- NOTE ---
+//
+//  This class contains test data generated by tools from the
+//  Transaction Processing Performance Council (TPC), specifically the
+//  TPC-H benchmark's data generator.
+//
+//  Any form of use and redistribution must happen in accordance with the TPC-H
+//  Software License Agreement.
+// 
+//  For details, see http://www.tpc.org/tpch/dbgen/tpc-h%20license%20agreement.pdf
+// -----------------------------------------------------------------------------
+
 public class TPCHQuery9ITCase extends RecordAPITestBase {
 	
 	private String partInputPath;

http://git-wip-us.apache.org/repos/asf/flink/blob/368d7aee/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQueryAsterixITCase.java
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQueryAsterixITCase.java b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQueryAsterixITCase.java
index 2c53ee2..881bd2c 100644
--- a/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQueryAsterixITCase.java
+++ b/flink-tests/src/test/java/org/apache/flink/test/recordJobTests/TPCHQueryAsterixITCase.java
@@ -16,13 +16,25 @@
  * limitations under the License.
  */
 
-
 package org.apache.flink.test.recordJobTests;
 
 import org.apache.flink.api.common.Plan;
 import org.apache.flink.test.recordJobs.relational.TPCHQueryAsterix;
 import org.apache.flink.test.util.RecordAPITestBase;
 
+// -----------------------------------------------------------------------------
+//  --- NOTE ---
+//
+//  This class contains test data generated by tools from the
+//  Transaction Processing Performance Council (TPC), specifically the
+//  TPC-H benchmark's data generator.
+//
+//  Any form of use and redistribution must happen in accordance with the TPC-H
+//  Software License Agreement.
+// 
+//  For details, see http://www.tpc.org/tpch/dbgen/tpc-h%20license%20agreement.pdf
+// -----------------------------------------------------------------------------
+
 public class TPCHQueryAsterixITCase extends RecordAPITestBase {
 
 	private String ordersPath;