Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/04/21 09:14:26 UTC

[GitHub] [flink-training] NicoK commented on a change in pull request #1: [FLINK-17275] port core training exercises, descriptions, solutions and tests

NicoK commented on a change in pull request #1:
URL: https://github.com/apache/flink-training/pull/1#discussion_r412015506



##########
File path: README.md
##########
@@ -1,3 +1,257 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 # Flink Training Exercises
 
 Exercises that go along with the training content in the documentation.
+
+## Table of Contents
+
+[**Setup your Development Environment**](#setup-your-development-environment)
+
+1. [Software requirements](#software-requirements)
+1. [Clone and build the flink-training project](#clone-and-build-the-flink-training-project)
+1. [Import the flink-training project into your IDE](#import-the-flink-training-project-into-your-ide)
+1. [Download the data sets](#download-the-data-sets)
+
+[**Using the Taxi Data Streams**](#using-the-taxi-data-streams)
+
+1. [Schema of Taxi Ride Events](#schema-of-taxi-ride-events)
+1. [Generating Taxi Ride Data Streams in a Flink program](#generating-taxi-ride-data-streams-in-a-flink-program)
+
+[**How to do the Labs**](#how-to-do-the-labs)
+
+1. [Learn about the data](#learn-about-the-data)
+1. [Modify `ExerciseBase`](#modify-exercisebase)
+1. [Run and debug Flink programs in your IDE](#run-and-debug-flink-programs-in-your-ide)
+1. [Exercises, Tests, and Solutions](#exercises-tests-and-solutions)
+
+[**Labs**](LABS-OVERVIEW.md)
+
+## Setup your Development Environment
+
+The following instructions guide you through the process of setting up a development environment for the purpose of developing, debugging, and executing solutions to the Flink developer training exercises and examples.
+
+### Software requirements
+
+Flink supports Linux, OS X, and Windows as development environments for Flink programs and local execution. The following software is required for a Flink development setup and should be installed on your system:
+
+- a JDK for Java 8 or Java 11 (a JRE is not sufficient; other versions of Java are not supported)
+- Git
+- an IDE for Java (and/or Scala) development with Gradle support.
+  We recommend IntelliJ, but Eclipse or Visual Studio Code can also be used so long as you stick to Java. For Scala you will need to use IntelliJ (and its Scala plugin).
+
+> **:information_source: Note for Windows users:** Many of the examples of shell commands provided in the training instructions are for UNIX systems. To make things easier, you may find it worthwhile to set up Cygwin or WSL. For developing Flink jobs, Windows works reasonably well: you can run a Flink cluster on a single machine, submit jobs, run the web UI, and execute jobs in the IDE.
+
+### Clone and build the flink-training project
+
+This `flink-training` project contains exercises, tests, and reference solutions for the programming exercises. Clone the `flink-training` project from GitHub and build it.
+
+> **:information_source: Repository Layout:** This repository has several branches set up pointing to different Apache Flink versions, similarly to the [apache/flink](https://github.com/apache/flink) repository with:
+> - a release branch for each minor version of Apache Flink, e.g. `release-1.10`, and
+> - a `master` branch that points to the current Flink release (not `flink:master`!)
+>
+> If you want to work on a version other than the current Flink release, make sure to check out the appropriate branch.
+
+```bash
+git clone https://github.com/apache/flink-training.git
+cd flink-training
+./gradlew test shadowJar
+```
+
+If you haven’t done this before, at this point you’ll end up downloading all of the dependencies for this Flink training project. This usually takes a few minutes, depending on the speed of your internet connection.
+
+If all of the tests pass and the build is successful, you are off to a good start.
+
+<details>
+<summary><strong>Users in China: click here for instructions about using a local Maven mirror.</strong></summary>
+
+If you are in China, we recommend configuring the Maven repository to use a mirror. You can do this by uncommenting the appropriate line in our [`build.gradle`](build.gradle) like this:
+
+```groovy
+    repositories {
+        // for access from China, you may need to uncomment this line
+        maven { url 'http://maven.aliyun.com/nexus/content/groups/public/' }
+        mavenCentral()
+    }
+```
+</details>
+
+
+### Import the flink-training project into your IDE
+
+The project needs to be imported as a Gradle project into your IDE.
+
+Once that’s done you should be able to open [`RideCleansingTest`](ride-cleansing/src/test/java/org/apache/flink/training/exercises/ridecleansing/RideCleansingTest.java) and successfully run this test.
+
+> **:information_source: Note for Scala users:** You will need to use IntelliJ with the JetBrains Scala plugin, and you will need to add a Scala 2.12 SDK to the Global Libraries section of the Project Structure. IntelliJ will ask you for the latter when you open a Scala file.
+
+### Download the data sets
+
+You will also need to download the taxi data files used in this training by running the following commands:
+
+```bash
+wget http://training.ververica.com/trainingData/nycTaxiRides.gz

Review comment:
       Actually, I was planning to replace these data files with a data generator, so that we no longer have to worry about licensing and can remove this additional download step. I just created https://issues.apache.org/jira/browse/FLINK-17293 for that.
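
       To illustrate the idea, here is a rough sketch of what a generator could look like. It is plain Java with no Flink dependency. The class name `RideGeneratorSketch`, the `Ride` fields, and the value ranges are all hypothetical and not taken from the training code or from FLINK-17293. The point is only that a seeded generator yields reproducible events without any downloaded file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: a seeded, in-memory generator of taxi-ride-like events. */
final class RideGeneratorSketch {

    /** Illustrative event type; the field names are assumptions, not the training schema. */
    static final class Ride {
        public final long rideId;
        public final boolean isStart;
        public final long eventTimeMillis;
        public final float lon;
        public final float lat;

        Ride(long rideId, boolean isStart, long eventTimeMillis, float lon, float lat) {
            this.rideId = rideId;
            this.isStart = isStart;
            this.eventTimeMillis = eventTimeMillis;
            this.lon = lon;
            this.lat = lat;
        }
    }

    /**
     * Generates a START and an END event for each of {@code count} rides.
     * The same seed always yields the same events. Events are grouped per
     * ride, so the list is NOT globally ordered by event time.
     */
    static List<Ride> generate(int count, long seed) {
        Random rnd = new Random(seed);
        List<Ride> events = new ArrayList<>(2 * count);
        long startTime = 0L;
        for (long id = 1; id <= count; id++) {
            // Each ride starts between 1 and 21 seconds after the previous one.
            startTime += 1_000L + rnd.nextInt(20_000);
            // Each ride lasts between 1 and 31 minutes.
            long duration = 60_000L + rnd.nextInt(1_800_000);
            // Arbitrary bounding box chosen for illustration only.
            float lon = -74.05f + rnd.nextFloat() * 0.35f;
            float lat = 40.60f + rnd.nextFloat() * 0.30f;
            events.add(new Ride(id, true, startTime, lon, lat));
            events.add(new Ride(id, false, startTime + duration, lon, lat));
        }
        return events;
    }
}
```

       Calling `RideGeneratorSketch.generate(1000, 42L)` twice returns identical lists, which keeps tests deterministic. A real replacement would presumably wrap something like this in a Flink source and handle event-time ordering, which this sketch leaves out.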



