Posted to commits@beam.apache.org by GitBox <gi...@apache.org> on 2020/03/03 17:50:00 UTC

[GitHub] [beam] bipinupd opened a new pull request #11028: BEAM-2546 Beam IO for InfluxDB

bipinupd opened a new pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028
 
 
   **Please** add a meaningful description for your change here
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @iemejia emejia`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] bipinupd opened a new pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd opened a new pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028
 
 
   
   To test InfluxDBIO, start an InfluxDB container, publish the module to the local Maven repository, and run the integration test:
   ```bash
   docker run -p 8086:8086 -p 2003:2003 -p 8083:8083 -e INFLUXDB_GRAPHITE_ENABLED=true -e INFLUXDB_USER=superadmin -e INFLUXDB_USER_PASSWORD=supersecretpassword influxdb
   ./gradlew -Ppublishing -PdistMgmtSnapshotsUrl=~/.m2/repository/ -p sdks/java/io/influxdb publishToMavenLocal
   ./gradlew integrationTest -p sdks/java/io/influxdb -DintegrationTestPipelineOptions='[ "--influxDBURL=http://localhost:8086", "--influxDBUserName=superadmin",  "--influxDBPassword=supersecretpassword", "--databaseName=db1" ]' --tests org.apache.beam.sdk.io.influxdb.InfluxDBIOIT  -DintegrationTestRunner=direct
   ```
   
   R: @iemejia @mwalenia 


[GitHub] [beam] bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r407241372
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read from and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to InfluxDB datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes each element of a {@link
+ * PCollection} to the database; the element type T must implement {@code getLineProtocol()} from
+ * {@link LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List<List<Object>> rows = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (rows != null) {
+            for (List<Object> row : rows) {
+              size += row.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
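+    /**
+     * Splits this source into one source per configured metric; with zero or one metric, the
+     * source is returned unsplit.
+     *
+     * @param desiredElementsInABundle desired bundle size hint; not used by this implementation
+     * @param options the pipeline options
+     * @return the list of sources to read from
+     * @throws Exception if the split fails
+     */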
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
 
 Review comment:
   I have refactored the split function.
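
For readers following the thread, the per-metric splitting in the hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `MetricSource` is a hypothetical stand-in with no Beam dependency, not the `InfluxDBSource` class from the PR, and it mirrors just the branching visible in the quoted `split` method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the PR's InfluxDBSource; not the actual Beam class. */
final class MetricSource {
  private final List<String> metrics;

  MetricSource(List<String> metrics) {
    this.metrics = metrics;
  }

  List<String> metrics() {
    return metrics;
  }

  /** Returns one sub-source per metric when several are configured, else this source alone. */
  List<MetricSource> split() {
    List<MetricSource> sources = new ArrayList<>();
    if (metrics != null && metrics.size() > 1) {
      for (String metric : metrics) {
        // Each sub-source reads exactly one metric.
        sources.add(new MetricSource(Collections.singletonList(metric)));
      }
    } else {
      // Zero or one metric: nothing to split on, so return the source as-is.
      sources.add(this);
    }
    return sources;
  }
}
```

With this shape the result is never empty, so a trailing emptiness check like the one in the quoted hunk could not fail; parallelism is bounded by the number of metrics rather than by the size hint passed to `split`.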


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404221870
 
 

 ##########
 File path: settings.gradle
 ##########
 @@ -118,6 +119,7 @@ include ":sdks:java:io:thrift"
 include ":sdks:java:io:tika"
 include ":sdks:java:io:xml"
 include ":sdks:java:io:synthetic"
+include ":sdks:java:io:influxdb"
 
 Review comment:
   Remove this second occurrence.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404380078
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
 
 Review comment:
   Use the diamond operator here: `List<BoundedSource<String>> sources = new ArrayList<>();`
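
   For context, the suggestion relies on Java's diamond operator, which lets the compiler infer the constructor's type arguments from the declared type. A minimal standalone sketch follows; the class and method names are hypothetical and are not part of the PR:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the diamond operator; not code from the PR. */
final class DiamondOperatorSketch {

  /** Copies the given names into a new list created with the diamond operator. */
  static List<String> collectNames(String... names) {
    // The type argument <String> is inferred from the declared type on the left.
    List<String> result = new ArrayList<>();
    for (String name : names) {
      result.add(name);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(collectNames("cpu", "mem"));
  }
}
```

   The behavior is identical to spelling out `new ArrayList<String>()`; only the redundant type argument goes away.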


[GitHub] [beam] bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-612664265
 
 
   @iemejia and @mwalenia Thanks for reviewing and for the suggestions and helpful comments. This is my first contribution to the Beam project. I look forward to your guidance and support.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404356103
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
+        if (connection != null) {
+          connection.flush();
+          connection.close();
+          connection = null;
+        }
+      }
+
+      @Override
+      public void populateDisplayData(DisplayData.Builder builder) {
+        builder.delegate(Write.this);
+      }
+
+      private final Integer defaultNumberOfDuration = 1000;
 
 Review comment:
   Please remove these two attributes and set them by default in the `write()` method
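
   One way to read this suggestion is that the factory method should hand out a spec with the defaults already applied, so the `DoFn` no longer needs fallback fields or null checks. In the PR itself that would presumably mean calling the existing `setNoOfElementsToBatch(...)` and `setFlushDuration(...)` builder setters inside `write()`. The sketch below is a hand-written stand-in for the AutoValue class so that it compiles on its own; the class name is hypothetical, the batch default of 1000 is the value visible in the diff, and the flush duration of 100 is a placeholder because the diff excerpt does not show that value:

```java
/** Hypothetical stand-in for the AutoValue-generated Write spec; not code from the PR. */
final class WriteDefaultsSketch {
  // 1000 matches the default visible in the diff.
  static final int DEFAULT_ELEMENTS_TO_BATCH = 1000;
  // Placeholder value: the actual flush default is not shown in the excerpt.
  static final int DEFAULT_FLUSH_DURATION = 100;

  private final int noOfElementsToBatch;
  private final int flushDuration;

  private WriteDefaultsSketch(int noOfElementsToBatch, int flushDuration) {
    this.noOfElementsToBatch = noOfElementsToBatch;
    this.flushDuration = flushDuration;
  }

  /** Factory method: every spec starts out with the defaults applied. */
  static WriteDefaultsSketch write() {
    return new WriteDefaultsSketch(DEFAULT_ELEMENTS_TO_BATCH, DEFAULT_FLUSH_DURATION);
  }

  /** Returns a copy with a different batch size; the flush duration is kept. */
  WriteDefaultsSketch withNoOfElementsToBatch(int value) {
    return new WriteDefaultsSketch(value, flushDuration);
  }

  /** Returns a copy with a different flush duration; the batch size is kept. */
  WriteDefaultsSketch withFlushDuration(int value) {
    return new WriteDefaultsSketch(noOfElementsToBatch, value);
  }

  int noOfElementsToBatch() {
    return noOfElementsToBatch;
  }

  int flushDuration() {
    return flushDuration;
  }

  public static void main(String[] args) {
    // Only the flush duration is overridden; the batch size keeps its default.
    WriteDefaultsSketch spec = write().withFlushDuration(500);
    System.out.println(spec.noOfElementsToBatch() + " " + spec.flushDuration());
  }
}
```

   With the defaults living in the factory, the accessors never return null, so the writer can read `spec.noOfElementsToBatch()` and `spec.flushDuration()` directly.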


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404410842
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
 
 Review comment:
   Remove this log statement.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404226273
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
 
 Review comment:
   Remove `datasource` from this heading.


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615489412
 
 
   I don't understand what happened, but the branch looks really broken. One option is to run `git pull origin master --rebase` to move all your commits on top of the latest master; this will fix the issue, but you will have to work through a rebase that does not look easy. The other option is to recreate the branch/PR from scratch based on the latest master.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404223761
 
 

 ##########
 File path: sdks/java/io/influxdb/build.gradle
 ##########
 @@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature(automaticModuleName: 'org.apache.beam.sdk.io.influxdb')
+provideIntegrationTestingDependencies()
+enableJavaPerformanceTesting()
+
+description = "Apache Beam :: SDKs :: Java :: IO :: InfluxDB"
+ext.summary = "IO to read and write on JDBC datasource."
+
+dependencies {
+  compile library.java.vendored_guava_26_0_jre
+  compile project(path: ":sdks:java:core", configuration: "shadow")
+  compile library.java.slf4j_api
+  compile group: 'org.influxdb', name: 'influxdb-java', version: '2.15'
 
 Review comment:
   Prefer the inlined version `compile "org.influxdb:influxdb-java:$influxdb_version"`


[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176325
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404230813
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
 
 Review comment:
   Missing space and `.` (see also below).


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404396609
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
 
 Review comment:
   This method returns the number of elements, but it is intended to return the number of bytes used by the elements in the database. Is there any way to get such statistics? Otherwise I think this implementation is inconsistent, so it is probably better not to implement it. This value is used to calculate the desiredElementsInBundle value for the `split` method, but we are already ignoring that one, so it should be OK.
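
   To illustrate the difference between the two quantities, here is a minimal, self-contained sketch. It does not use the InfluxDB client or Beam; the `rows` list is a hypothetical stand-in for the values returned by a query, and `estimateBytes` is only one rough way to approximate a size (the UTF-8 length of each value's string form), not a statistic reported by the database.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class SizeEstimateSketch {
  /** Counts values, which is effectively what the reviewed method returns. */
  static long countValues(List<List<Object>> rows) {
    long count = 0;
    for (List<Object> row : rows) {
      count += row.size();
    }
    return count;
  }

  /** Rough byte estimate: UTF-8 length of each value's string form (an assumption). */
  static long estimateBytes(List<List<Object>> rows) {
    long bytes = 0;
    for (List<Object> row : rows) {
      for (Object value : row) {
        bytes += String.valueOf(value).getBytes(StandardCharsets.UTF_8).length;
      }
    }
    return bytes;
  }

  public static void main(String[] args) {
    // Hypothetical query result: two rows of (host, value).
    List<List<Object>> rows =
        Arrays.asList(
            Arrays.<Object>asList("cpu0", 0.5),
            Arrays.<Object>asList("cpu1", 12));
    System.out.println("values: " + countValues(rows));
    System.out.println("approx. bytes: " + estimateBytes(rows));
  }
}
```

   Note that an estimate like this still requires running the query, which may be expensive. If no cheap statistic is available, returning 0 to signal "unknown" is probably the simpler option.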

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404232037
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
 
 Review comment:
   Uppercase the query keywords, e.g. `SELECT * FROM cpu`.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404390537
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
 
 Review comment:
   Maybe also `connection.setRetentionPolicy(spec.retentionPolicy());`
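
   Since `retentionPolicy()` is declared `@Nullable` in `Read`, the call likely needs a null guard. A minimal sketch of that guard follows; `FakeConnection` is a hypothetical stand-in for the client connection (not the real `InfluxDB` interface), and the `"autogen"` default is assumed purely for illustration.

```java
final class RetentionPolicySketch {
  /** Hypothetical stand-in for the client connection; it only records what was set. */
  static final class FakeConnection {
    String database;
    String retentionPolicy = "autogen"; // assumed default, for illustration only

    void setDatabase(String database) {
      this.database = database;
    }

    void setRetentionPolicy(String retentionPolicy) {
      this.retentionPolicy = retentionPolicy;
    }
  }

  /** Applies the database, and the retention policy only when one was configured. */
  static FakeConnection configure(FakeConnection connection, String database, String rp) {
    connection.setDatabase(database);
    if (rp != null) {
      connection.setRetentionPolicy(rp);
    }
    return connection;
  }

  public static void main(String[] args) {
    System.out.println(configure(new FakeConnection(), "metrics", null).retentionPolicy);
    System.out.println(configure(new FakeConnection(), "metrics", "one_week").retentionPolicy);
  }
}
```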


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404385677
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
 
 Review comment:
   Use primitive `boolean` here instead of `Boolean`.
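
   One plausible motivation, shown as a self-contained sketch (the class and method names are hypothetical and not part of the PR): a boxed `Boolean` that was never set is `null`, and unboxing it where a `boolean` is expected throws a `NullPointerException`.

```java
final class BooleanFlagSketch {
  /** Mirrors reading a nullable Boolean property as a primitive: unboxing null throws. */
  static boolean unboxUnsafe(Boolean flag) {
    return flag; // NullPointerException when flag is null
  }

  /** Null-safe read that treats an unset flag as false. */
  static boolean unboxSafe(Boolean flag) {
    return Boolean.TRUE.equals(flag);
  }

  public static void main(String[] args) {
    System.out.println(unboxSafe(null));
    try {
      unboxUnsafe(null);
    } catch (NullPointerException e) {
      System.out.println("NullPointerException on unset flag");
    }
  }
}
```

   With a primitive `boolean` property the null case cannot occur, although the builder would then need an explicit default to be set before `build()` is called.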


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615292356
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404376801
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
 
 Review comment:
   Remove the `throws Exception` clause.
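
A minimal sketch of why the clause can be dropped, using hypothetical stand-in classes (`BaseSource`, `MetricSource`) rather than Beam's real types. In Java an overriding method may declare fewer checked exceptions than the method it overrides, so an override whose body raises no checked exception does not need to repeat `throws Exception`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NarrowedThrowsDemo {
  // Hypothetical stand-in for a base class whose method is declared to throw.
  abstract static class BaseSource {
    abstract List<String> split(long desiredBundleSize) throws Exception;
  }

  // Hypothetical subclass: the override omits the throws clause entirely.
  static class MetricSource extends BaseSource {
    private final List<String> metrics;

    MetricSource(List<String> metrics) {
      this.metrics = metrics;
    }

    @Override
    List<String> split(long desiredBundleSize) { // no "throws Exception"
      List<String> parts = new ArrayList<>();
      for (String metric : metrics) {
        parts.add("SELECT * FROM " + metric);
      }
      return parts;
    }
  }

  public static void main(String[] args) {
    // No try/catch is needed because MetricSource.split declares no checked exception.
    List<String> parts = new MetricSource(Arrays.asList("cpu", "mem")).split(1024L);
    System.out.println(parts); // [SELECT * FROM cpu, SELECT * FROM mem]
  }
}
```

Callers that hold the value as the base type would still have to handle `Exception`; only code that uses the subclass type directly benefits from the narrower signature.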

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404376936
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086", "username", "password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086", "username", "password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086", "username", "password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
 
 Review comment:
   Remove the `throws Exception` clause.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404212913
 
 

 ##########
 File path: sdks/java/io/influxdb/Readme.md
 ##########
 @@ -0,0 +1,6 @@
+To test the influxdb io plugin. Run the influxDB container.
 
 Review comment:
   Remove this file and move its contents into the `InfluxDBIOIT.java` javadoc


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404388021
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086", "username", "password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086", "username", "password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086", "username", "password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
 
 Review comment:
   `s/metric/metrics`
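
A minimal sketch of what the rename could look like, assuming a hypothetical plain-Java `ReadSpec` in place of the AutoValue-generated class from the diff. The accessor and its `with` method both take the plural name because the property holds a list of measurement names, and the default query mirrors the comma-joined form built in the diff:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MetricsNamingDemo {
  /** Hypothetical stand-in for the Read spec; not the generated AutoValue class. */
  static final class ReadSpec {
    private final List<String> metrics;

    ReadSpec(List<String> metrics) {
      this.metrics = Collections.unmodifiableList(metrics);
    }

    /** Plural accessor: the property is a list of measurement names. */
    List<String> metrics() {
      return metrics;
    }

    /** Returns a new spec that reads the given measurements. */
    ReadSpec withMetrics(List<String> newMetrics) {
      return new ReadSpec(newMetrics);
    }

    /** Builds the default query by joining all metrics with commas. */
    String queryToRun() {
      return "SELECT * FROM " + String.join(",", metrics);
    }
  }

  public static void main(String[] args) {
    ReadSpec spec = new ReadSpec(Arrays.asList("cpu", "mem"));
    System.out.println(spec.queryToRun()); // SELECT * FROM cpu,mem
    System.out.println(spec.withMetrics(Arrays.asList("disk")).metrics()); // [disk]
  }
}
```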


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404376980
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
 
 Review comment:
   Remove the `throws Exception` clause here.
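
A minimal sketch of why the clause looks unnecessary, not the PR's code. `Read.expand` in this file already closes an `InfluxDB` connection in a try-with-resources block without declaring any checked exception, which suggests that `close()` throws none. Assuming the teardown only closes the connection, it needs no `throws` clause. `Connection` and `WriterLifecycle` below are hypothetical stand-ins so the example compiles without Beam or influxdb-java.

```java
class TearDownDemo {
  // Hypothetical stand-in for a client whose close() declares no checked exception.
  interface Connection extends AutoCloseable {
    @Override
    void close(); // narrows AutoCloseable.close(), which declares "throws Exception"
  }

  // Hypothetical stand-in for a DoFn-style lifecycle holder.
  static class WriterLifecycle {
    private Connection connection;

    WriterLifecycle(Connection connection) {
      this.connection = connection;
    }

    // No "throws" clause: nothing in the body can raise a checked exception.
    public void tearDown() {
      if (connection != null) {
        connection.close();
        connection = null;
      }
    }

    boolean isOpen() {
      return connection != null;
    }
  }

  public static void main(String[] args) {
    WriterLifecycle lifecycle = new WriterLifecycle(() -> System.out.println("closed"));
    lifecycle.tearDown();
    lifecycle.tearDown(); // second call is a no-op because the field was cleared
    System.out.println(lifecycle.isOpen());
  }
}
```

If the real teardown does call something that throws a checked exception, the clause should be narrowed to that exception type rather than left as a blanket `Exception`.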

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404385611
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
 
 Review comment:
   Use the primitive `boolean` here instead of `Boolean`.
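
One reason the primitive may be preferable, sketched under assumptions: `getConnection` in this file takes `boolean` parameters, while the getters return `@Nullable Boolean`, so an unset flag would be auto-unboxed from `null` at the call site. `UnboxingDemo` and its methods are hypothetical stand-ins that only mirror that shape; they are not code from the PR.

```java
class UnboxingDemo {
  // Mirrors the shape of a helper that takes primitive flags.
  static String connect(boolean sslInvalidHostNameAllowed, boolean sslEnabled) {
    return (sslInvalidHostNameAllowed && sslEnabled) ? "unsafe-ssl client" : "default client";
  }

  // Returns true if passing the boxed flags to connect() fails during unboxing.
  static boolean failsOnUnboxing(Boolean sslInvalidHostNameAllowed, Boolean sslEnabled) {
    try {
      connect(sslInvalidHostNameAllowed, sslEnabled); // auto-unboxing happens here
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsOnUnboxing(null, true));  // flag never set
    System.out.println(failsOnUnboxing(false, true)); // flag set explicitly
    System.out.println(connect(true, true));
  }
}
```

With an AutoValue builder, a primitive property generally has to be set before `build()` is called, so switching to `boolean` would likely also mean supplying defaults in `read()` and `write()`.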


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r397962741
 
 

 ##########
 File path: .test-infra/jenkins/job_PerformanceTests_InfluxDBIO_IT.groovy
 ##########
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+import CommonJobProperties as common
+import Kubernetes
+
+String jobName = "beam_PerformanceTests_InfluxDBIO_IT"
+
+job(jobName) {
+  common.setTopLevelMainJobProperties(delegate)
+  common.enablePhraseTriggeringFromPullRequest(
+          delegate,
+          'Java InfluxDBIO Performance Test',
+          'Run Java InfluxDBIO Performance Test')
+
+  String namespace = common.getKubernetesNamespace(jobName)
+  String kubeconfigPath = common.getKubeconfigLocationForNamespace(namespace)
+  Kubernetes k8s = Kubernetes.create(delegate, kubeconfigPath, namespace)
+
+  k8s.apply(common.makePathAbsolute("src/.test-infra/kubernetes/influxdb/influxdb.yml"))
+  String influxDBHostName = "LOAD_BALANCER_IP"
+  k8s.loadBalancerIP("influxdb-load-balancer-service", influxDBHostName)
+  Map pipelineOptions = [
+          influxDBURL     : "http://\$${influxDBHostName}:8086",
+          influxDBUserName : "superadmin",
+          influxDBPassword : "supersecretpassword",
+          databaseName : "db1"
+  ]
+
+  steps {
+    gradle {
+      rootBuildScriptDir(common.checkoutDir)
+      common.setGradleSwitches(delegate)
+      switches("--info")
+      switches("-DintegrationTestPipelineOptions=\'${common.joinPipelineOptions(pipelineOptions)}\'")
+      switches("-DintegrationTestRunner=direct")
 
 Review comment:
   After looking a bit into the code, I see we have a weird mix of Dataflow and Direct in the runs. Do you think there is a way to make the runner selectable when running an IT test? (Note that I agree with the idea of running on Dataflow, but I understand that contributors may not have access to it, and that it is quicker to develop with the DirectRunner.)


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404384086
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
 
 Review comment:
   s/Stringn/String


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404377013
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
 
 Review comment:
   remove throws
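
   A minimal sketch of the point behind this comment, using a hypothetical stand-in class rather than the Beam DoFn API. The body of finishBundle() is not visible in the quoted hunk, so the flush logic below is an assumption made only for illustration. When nothing in the body can raise a checked exception, dropping the throws clause means direct callers and tests no longer need a try/catch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a batching writer; not part of Beam or influxdb-java. */
final class BufferingWriter {
  private final List<String> buffer = new ArrayList<>();
  private int flushed = 0;

  // Buffers one record until the bundle is finished.
  void write(String record) {
    buffer.add(record);
  }

  // No checked exception can escape this body, so no throws clause is declared.
  void finishBundle() {
    flushed += buffer.size();
    buffer.clear();
  }

  // Number of records flushed so far.
  int flushedCount() {
    return flushed;
  }
}
```

   With this shape a caller can invoke finishBundle() directly without wrapping it in exception handling.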


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404385733
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
 
 Review comment:
   boolean
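
   A short sketch of one concern that the primitive type would address, using hypothetical names rather than the PR's code. In the quoted diff the Read accessors return a nullable Boolean, while getConnection(...) takes primitive boolean parameters. An unset flag would therefore be unboxed from null at the call site and throw a NullPointerException. The orFalse helper is an assumed workaround, not something the PR contains.

```java
/** Hypothetical demo class; names are illustrative, not taken from InfluxDBIO. */
final class UnboxingDemo {
  // Takes primitives, like getConnection(..., boolean, boolean) in the quoted diff.
  static String connect(boolean sslInvalidHostNameAllowed, boolean sslEnabled) {
    return (sslInvalidHostNameAllowed && sslEnabled) ? "unsafe-client" : "default-client";
  }

  // Passing a null Boolean compiles, but unboxing it fails at run time.
  static boolean failsOnNull(Boolean flag) {
    try {
      connect(flag, true);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  // One defensive option: treat an unset (null) flag as false before unboxing.
  static boolean orFalse(Boolean flag) {
    return flag != null && flag;
  }
}
```

   Declaring the property as a primitive boolean and setting a default in the factory method would remove the null case entirely, which is one reading of the suggestion above.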


[GitHub] [beam] bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615523726
 
 
   Thanks @iemejia. I will try that; if it does not work, I will follow your second advice and tag you and mwalenia.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404382238
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
 
 Review comment:
   Remove javadoc

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r407241190
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
 
 Review comment:
   @iemejia Thanks for the suggestion. I have refactored the `getEstimatedSizeBytes` and `split` functions.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404360427
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
 
 Review comment:
   Can you remove this deprecated call, perhaps by creating an internal private method that does the same check via a `SHOW DATABASES` query?
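
   A minimal sketch of what such a helper could look like, for illustration only. The class name `ShowDatabasesCheck` and the method `databaseExists` are hypothetical and not part of the PR. The sketch assumes the rows come from a `SHOW DATABASES` query read through the same `getResults().get(0).getSeries().get(0).getValues()` chain the diff already uses, with the database name in the first column of each row. To stay runnable without a server, it takes the rows as a plain list instead of calling the InfluxDB client, and it assumes an exact, case-sensitive name match.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; the names here are illustrative and not taken from the PR. */
final class ShowDatabasesCheck {
  private ShowDatabasesCheck() {}

  /**
   * Returns true if {@code database} appears among the rows of a SHOW DATABASES result.
   * Assumption: each row carries the database name in its first column.
   */
  static boolean databaseExists(List<List<Object>> rows, String database) {
    if (rows == null || database == null) {
      return false;
    }
    for (List<Object> row : rows) {
      // Skip malformed rows instead of failing on them.
      if (row != null && !row.isEmpty() && database.equals(String.valueOf(row.get(0)))) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Stand-in for the values a SHOW DATABASES query would return from the client.
    List<List<Object>> rows =
        Arrays.asList(
            Collections.<Object>singletonList("_internal"),
            Collections.<Object>singletonList("metrics"));
    System.out.println(databaseExists(rows, "metrics"));
    System.out.println(databaseExists(rows, "missing"));
  }
}
```

   In `expand()`, the `checkArgument` could then call a private method like this with the query result instead of `connection.databaseExists(database())`.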


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615293438
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404400145
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
 
 Review comment:
   Actually, if you remove this method, Beam should be able to infer the proper `Coder`. Can you try removing it to see if it works?
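
For context on why the choice of coder matters here: `SerializableCoder` relies on Java serialization, while a string-specific coder writes little more than the UTF-8 bytes. The sketch below uses only the JDK (no Beam classes) to compare the two encodings for a single value. The class name `CoderSizeSketch`, its helper methods, and the sample row are hypothetical and not part of the PR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CoderSizeSketch {
  /** Encodes a value with plain Java serialization, the mechanism SerializableCoder relies on. */
  static byte[] javaSerialized(String value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      // Writing to an in-memory stream is not expected to fail.
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Encodes the same value as raw UTF-8 bytes, roughly the payload a string coder writes. */
  static byte[] utf8(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Hypothetical row, shaped like the List.toString() output the reader emits.
    String row = "[2020-03-03T17:50:00Z, cpu, 0.64]";
    System.out.println("Java serialization: " + javaSerialized(row).length + " bytes");
    System.out.println("UTF-8 only:         " + utf8(row).length + " bytes");
  }
}
```

Java serialization adds a stream header and type tag to every element, so the serialized form is always larger than the bare UTF-8 payload.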

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404381249
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
 
 Review comment:
   ```
               for (Object databaseName : databaseNames) {
                 List database = (List) databaseName;
   ```
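
A self-contained sketch of the same refactoring, using typed generics instead of raw `List` and `Iterator`. It assumes the values come back as a `List<List<Object>>`, as influxdb-java's `QueryResult.Series#getValues()` does; the class name `ForEachSketch`, the method names, and the sample rows are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ForEachSketch {
  /** Style used in the patch: explicit Iterator driving a while loop. */
  static int countWithIterator(List<List<Object>> rows) {
    int size = 0;
    Iterator<List<Object>> it = rows.iterator();
    while (it.hasNext()) {
      size += it.next().size();
    }
    return size;
  }

  /** Suggested style: enhanced for loop, no iterator variable and no cast. */
  static int countWithForEach(List<List<Object>> rows) {
    int size = 0;
    for (List<Object> row : rows) {
      size += row.size();
    }
    return size;
  }

  public static void main(String[] args) {
    // Two hypothetical result rows with three columns each.
    List<List<Object>> rows =
        Arrays.asList(
            Arrays.<Object>asList("2020-03-03T17:50:00Z", "cpu", 0.64),
            Arrays.<Object>asList("2020-03-03T17:51:00Z", "cpu", 0.71));
    System.out.println(countWithIterator(rows) == countWithForEach(rows));
  }
}
```

Both methods sum the number of values across all rows, so the change is purely stylistic.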
   


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176325
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404406106
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
+        if (connection != null) {
+          connection.flush();
+          connection.close();
+          connection = null;
+        }
+      }
+
+      @Override
+      public void populateDisplayData(DisplayData.Builder builder) {
+        builder.delegate(Write.this);
+      }
+
+      private final Integer defaultNumberOfDuration = 1000;
+      private final Integer defaultFlushDuration = 100;
+    }
+  }
+
+  public static OkHttpClient.Builder getUnsafeOkHttpClient() {
 
 Review comment:
   Make this method `private`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404387662
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
 
 Review comment:
   `s/setMetric/setMetrics`


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404218978
 
 

 ##########
 File path: sdks/java/io/influxdb/src/test/java/org/apache/beam/sdk/io/influxdb/InfluxDBIOIT.java
 ##########
 @@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import java.util.Arrays;
+import org.apache.beam.sdk.PipelineResult;
+import org.apache.beam.sdk.io.common.IOTestPipelineOptions;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * A test of {@link org.apache.beam.sdk.io.influxdb.InfluxDBIO} on an independent InfluxDB instance.
+ *
+ * <p>This test requires a running instance of InfluxDB. Pass in connection information using
+ * PipelineOptions:
+ *
+ * <pre>
+ *  ./gradlew integrationTest -p sdks/java/io/influxdb -DintegrationTestPipelineOptions='[
+ *  "--influxdburl=http://localhost:8086",
+ *  "--infuxDBDatabase=mypass",
+ *  "--username=username"
+ *  "--password=password"]'
+ *  --tests org.apache.beam.sdk.io.influxdb.InfluxDBIOIT
+ *  -DintegrationTestRunner=direct
+ * </pre>
+ */
+@RunWith(JUnit4.class)
+public class InfluxDBIOIT {
+
+  private static InfluxDBPipelineOptions options;
+
+  @Rule public final TestPipeline writePipeline = TestPipeline.create();
+  @Rule public final TestPipeline readPipeline = TestPipeline.create();
+
+  /** InfluxDBIO options. */
+  public interface InfluxDBPipelineOptions extends IOTestPipelineOptions {
+    @Description("InfluxDB host (host name/ip address)")
+    @Default.String("http://localhost:8086")
+    String getInfluxDBURL();
+
+    void setInfluxDBURL(String value);
+
+    @Description("Username for InfluxDB")
+    @Default.String("superadmin")
+    String getInfluxDBUserName();
+
+    void setInfluxDBUserName(String value);
+
+    @Description("Password for InfluxDB")
+    @Default.String("supersecretpassword")
+    String getInfluxDBPassword();
+
+    void setInfluxDBPassword(String value);
+
+    @Description("InfluxDB database name")
+    @Default.String("db0")
+    String getDatabaseName();
+
+    void setDatabaseName(String value);
+  }
+
+  @BeforeClass
+  public static void setUp() {
+    PipelineOptionsFactory.register(InfluxDBPipelineOptions.class);
+    options = TestPipeline.testingPipelineOptions().as(InfluxDBPipelineOptions.class);
+  }
+
+  @After
+  public void clear() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("DROP DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Before
+  public void initTest() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("CREATE DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Test
+  public void testWriteAndRead() {
+    final int noofElementsToReadAndWrite = 1000;
+    writePipeline
+        .apply("Generate data", Create.of(GenerateData.getMetric(noofElementsToReadAndWrite)))
+        .apply(
+            "Write data to InfluxDB",
+            InfluxDBIO.write()
+                .withConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+    writePipeline.run().waitUntilFinish();
+    PCollection<String> readVals =
 
 Review comment:
   `s/readVals/values`


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404232253
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
 
 Review comment:
   Remove ` datasource`


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404410295
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
 
 Review comment:
   Maybe also set the retention policy here when `spec.retentionPolicy() != null`.
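
A minimal sketch of the guard suggested above, assuming the client exposes a retention-policy setter next to `setDatabase`. `ClientSketch`, `RecordingClient` and `ReaderSetupSketch` are hypothetical stand-ins written for illustration, not the real InfluxDB client or code from this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the two client setters discussed in the review. */
interface ClientSketch {
  void setDatabase(String database);

  void setRetentionPolicy(String retentionPolicy);
}

/** Records calls so the guard logic can be checked without a running server. */
class RecordingClient implements ClientSketch {
  final List<String> calls = new ArrayList<>();

  @Override
  public void setDatabase(String database) {
    calls.add("database=" + database);
  }

  @Override
  public void setRetentionPolicy(String retentionPolicy) {
    calls.add("retentionPolicy=" + retentionPolicy);
  }
}

class ReaderSetupSketch {
  /** Applies each optional setting only when it was configured (non-null). */
  static void configure(ClientSketch client, String database, String retentionPolicy) {
    if (database != null) {
      client.setDatabase(database);
    }
    if (retentionPolicy != null) {
      client.setRetentionPolicy(retentionPolicy);
    }
  }
}
```

With this shape, a read configured without a retention policy leaves the client's default untouched instead of passing it `null`.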

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404390643
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
 
 Review comment:
   Maybe also `connection.setRetentionPolicy(spec.retentionPolicy());`


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404231393
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
 
 Review comment:
   Suggested wording: "Read with query example"


[GitHub] [beam] bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615472538
 
 
   > retest this please
   
   Hi @iemejia, thanks. I have retested and the tests are passing. For some reason, the rebase is pulling in all these other changes. When I check the diff of my branch (https://github.com/bipinupd/beam/tree/BEAM-2546) against apache:master, I still see only my changes. I would appreciate any suggestions on how to fix this.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404375365
 
 

 ##########
 File path: sdks/java/io/influxdb/src/test/java/org/apache/beam/sdk/io/influxdb/Model.java
 ##########
 @@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.influxdb.dto.Point;
+
+class Model implements LineProtocolConvertable, Serializable {
+  private String measurement;
+  private Map<String, String> tags;
+  private Map<String, Object> fields;
+  private Long time;
+  private TimeUnit timeUnit;
+
+  Model() {
+    tags = new HashMap();
 
 Review comment:
   Use `new HashMap<>();` (same on the line below)
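
A short sketch of what the suggestion amounts to, assuming a constructor shaped like the one quoted above. `ModelSketch` and its `addTag`/`addField`/getter methods are hypothetical names used for illustration only and are not the PR's test class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the test model; only the map initialisation matters here. */
class ModelSketch {
  private final Map<String, String> tags;
  private final Map<String, Object> fields;

  ModelSketch() {
    // The diamond operator lets the compiler infer the type arguments from the
    // field declarations, avoiding the unchecked warning a raw `new HashMap()` triggers.
    tags = new HashMap<>();
    fields = new HashMap<>();
  }

  ModelSketch addTag(String key, String value) {
    tags.put(key, value);
    return this;
  }

  ModelSketch addField(String key, Object value) {
    fields.put(key, value);
    return this;
  }

  Map<String, String> getTags() {
    return tags;
  }

  Map<String, Object> getFields() {
    return fields;
  }
}
```

Behaviour is unchanged compared with the raw type; the gain is compile-time type checking on the two maps.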


[GitHub] [beam] bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r407240851
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (read from a query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
 
 Review comment:
   close() is an abstract method in BoundedReader
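
One possible reading of this comment is that, because the reader has to implement close() anyway, that method is the natural place to release whatever the reader holds open. The sketch below illustrates that life cycle under stated assumptions: `FakeConnection` and `SketchReader` are hypothetical stand-ins written for this example, not Beam's `BoundedReader` and not the influxdb-java client, so the code runs without either dependency.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReaderCloseSketch {
  /** Hypothetical stand-in for a client connection; not the influxdb-java API. */
  static class FakeConnection implements AutoCloseable {
    boolean closed = false;

    List<String> query() {
      return Arrays.asList("row1", "row2");
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical reader following a start/advance/getCurrent/close life cycle. */
  static class SketchReader implements AutoCloseable {
    private final FakeConnection connection;
    private Iterator<String> cursor;
    private String current;

    SketchReader(FakeConnection connection) {
      this.connection = connection;
    }

    boolean start() {
      cursor = connection.query().iterator();
      return advance();
    }

    boolean advance() {
      // Guard against a missing cursor so an empty result does not throw.
      if (cursor != null && cursor.hasNext()) {
        current = cursor.next();
        return true;
      }
      current = null;
      return false;
    }

    String getCurrent() {
      if (current == null) {
        throw new NoSuchElementException();
      }
      return current;
    }

    @Override
    public void close() {
      // Release the resource the reader owns instead of leaving close() empty.
      connection.close();
    }
  }

  public static void main(String[] args) {
    FakeConnection connection = new FakeConnection();
    int count = 0;
    try (SketchReader reader = new SketchReader(connection)) {
      for (boolean more = reader.start(); more; more = reader.advance()) {
        System.out.println(reader.getCurrent());
        count++;
      }
    }
    System.out.println("rows=" + count + " closed=" + connection.closed);
  }
}
```

Whether the InfluxDB reader should keep its connection open until close() or finish all work in start(), as the diff does, is a design choice for the PR author; the sketch only shows the first option.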

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615292249
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404378613
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (read from a query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
 
 Review comment:
   s/withDatabase()/database
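
Read as a substitution on the error message of the last quoted line, the suggestion would make the check name the missing setting rather than the builder method. A minimal sketch of the resulting check follows. It assumes that reading of the comment; the local `checkArgument` helper is a stand-in for Guava's `Preconditions.checkArgument(boolean, Object)` so the example has no dependencies, and `validateDatabase` is a hypothetical name used only here.

```java
public class RequiredFieldMessageSketch {
  /** Minimal stand-in for Guava's Preconditions.checkArgument(boolean, Object). */
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  /** Hypothetical validation that names the setting, not the builder method. */
  static void validateDatabase(String database) {
    checkArgument(database != null && !database.isEmpty(), "database is required");
  }

  public static void main(String[] args) {
    validateDatabase("metrics"); // a non-empty name passes silently
    try {
      validateDatabase("");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```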


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404380343
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (read from a query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
+        if (connection != null) {
+          connection.flush();
+          connection.close();
+          connection = null;
+        }
+      }
+
+      @Override
+      public void populateDisplayData(DisplayData.Builder builder) {
+        builder.delegate(Write.this);
+      }
+
+      private final Integer defaultNumberOfDuration = 1000;
+      private final Integer defaultFlushDuration = 100;
+    }
+  }
+
+  public static OkHttpClient.Builder getUnsafeOkHttpClient() {
+    try {
+      // Create a trust manager that does not validate certificate chains
+      final TrustManager[] trustAllCerts =
+          new TrustManager[] {
+            new X509TrustManager() {
+              @Override
+              public void checkClientTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public void checkServerTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public java.security.cert.X509Certificate[] getAcceptedIssuers() {
+                return new java.security.cert.X509Certificate[] {};
+              }
+            }
+          };
+
+      // Install the all-trusting trust manager
+      final SSLContext sslContext = SSLContext.getInstance("SSL");
+      sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
+      // Create an ssl socket factory with our all-trusting manager
+      final SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
+
+      OkHttpClient.Builder builder = new OkHttpClient.Builder();
+      builder.sslSocketFactory(sslSocketFactory, (X509TrustManager) trustAllCerts[0]);
+      builder.hostnameVerifier(
 
 Review comment:
   `builder.hostnameVerifier((hostname, session) -> true);`
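
As a rough illustration of the suggestion above: `HostnameVerifier` has a single abstract method, so a lambda can stand in for an anonymous class. The sketch below assumes the truncated line was about to pass an anonymous `HostnameVerifier`; the class name `HostnameVerifierSketch` is hypothetical and not part of the PR. A verifier that accepts every host is unsafe outside of tests.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

final class HostnameVerifierSketch {
  // Anonymous-class form (assumed to be what the truncated diff line begins).
  static HostnameVerifier anonymousForm() {
    return new HostnameVerifier() {
      @Override
      public boolean verify(String hostname, SSLSession session) {
        return true; // accepts every host: unsafe outside of tests
      }
    };
  }

  // Lambda form from the review comment; equivalent behavior, less boilerplate.
  static HostnameVerifier lambdaForm() {
    return (hostname, session) -> true;
  }

  public static void main(String[] args) {
    // A null session is enough here because neither verifier inspects it.
    System.out.println(anonymousForm().verify("example.invalid", null));
    System.out.println(lambdaForm().verify("example.invalid", null));
  }
}
```

With the lambda form, the `HostnameVerifier` and `SSLSession` imports in the file under review would likely become unused.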

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404376571
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
 
 Review comment:
   Remove the `throws Exception` clause.
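
For context, Java lets an override declare fewer checked exceptions than the method it overrides, so the clause can be dropped when the body throws none. The sketch below uses hypothetical stand-in classes (`Source`, `ListSource`) rather than Beam's `BoundedSource`, and only illustrates the language rule.

```java
import java.util.Arrays;
import java.util.List;

final class NarrowedThrowsSketch {
  // Hypothetical stand-in for a base class whose method declares a checked exception.
  abstract static class Source {
    abstract long estimatedSize() throws Exception;
  }

  // The override omits the throws clause, which the language permits.
  static final class ListSource extends Source {
    private final List<String> rows;

    ListSource(List<String> rows) {
      this.rows = rows;
    }

    @Override
    long estimatedSize() {
      long total = 0;
      for (String row : rows) {
        total += row.length();
      }
      return total;
    }
  }

  public static void main(String[] args) {
    // No try/catch is needed because the static type is ListSource.
    long size = new ListSource(Arrays.asList("cpu", "mem")).estimatedSize();
    System.out.println(size);
  }
}
```

Callers that hold the base type still have to handle `Exception`; only callers that use the concrete type benefit from the narrower signature.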


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404385490
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
 
 Review comment:
   Use the primitive `boolean` instead of `Boolean`.
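
One likely motivation, sketched under assumptions: a `@Nullable Boolean` getter that is never set returns `null`, and passing it to a method with primitive `boolean` parameters (as `getConnection` takes in this file) fails on auto-unboxing. The class and method names below are hypothetical and only mirror that shape.

```java
final class BoxedFlagSketch {
  // Mirrors the shape of a helper that takes primitive flags.
  static String connect(boolean sslInvalidHostNameAllowed, boolean sslEnabled) {
    return (sslInvalidHostNameAllowed && sslEnabled) ? "unsafe-ssl" : "default";
  }

  // Returns true if passing the boxed flag to connect() fails on unboxing.
  static boolean failsOnUnboxing(Boolean flag) {
    try {
      connect(flag, true); // auto-unboxing of flag happens at this call
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsOnUnboxing(null));
    System.out.println(failsOnUnboxing(Boolean.TRUE));
  }
}
```

A primitive `boolean` property needs a default set in the builder, since AutoValue requires every non-nullable property to have a value at `build()` time.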


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r397962741
 
 

 ##########
 File path: .test-infra/jenkins/job_PerformanceTests_InfluxDBIO_IT.groovy
 ##########
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+import CommonJobProperties as common
+import Kubernetes
+
+String jobName = "beam_PerformanceTests_InfluxDBIO_IT"
+
+job(jobName) {
+  common.setTopLevelMainJobProperties(delegate)
+  common.enablePhraseTriggeringFromPullRequest(
+          delegate,
+          'Java InfluxDBIO Performance Test',
+          'Run Java InfluxDBIO Performance Test')
+
+  String namespace = common.getKubernetesNamespace(jobName)
+  String kubeconfigPath = common.getKubeconfigLocationForNamespace(namespace)
+  Kubernetes k8s = Kubernetes.create(delegate, kubeconfigPath, namespace)
+
+  k8s.apply(common.makePathAbsolute("src/.test-infra/kubernetes/influxdb/influxdb.yml"))
+  String influxDBHostName = "LOAD_BALANCER_IP"
+  k8s.loadBalancerIP("influxdb-load-balancer-service", influxDBHostName)
+  Map pipelineOptions = [
+          influxDBURL     : "http://\$${influxDBHostName}:8086",
+          influxDBUserName : "superadmin",
+          influxDBPassword : "supersecretpassword",
+          databaseName : "db1"
+  ]
+
+  steps {
+    gradle {
+      rootBuildScriptDir(common.checkoutDir)
+      common.setGradleSwitches(delegate)
+      switches("--info")
+      switches("-DintegrationTestPipelineOptions=\'${common.joinPipelineOptions(pipelineOptions)}\'")
+      switches("-DintegrationTestRunner=direct")
 
 Review comment:
   After looking a bit into the code, we have a weird mix of Dataflow/Direct in the runs. @mwalenia, do you think there is a way to make the runner selectable when running an IT test? (Note that I agree with the idea of running on Dataflow, but I understand that contributors may not have access to it and that it is quicker to develop with DirectRunner.)


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404229145
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
 
 Review comment:
   `JdbcIO` looks like a copy-paste leftover here; this should be the InfluxDB `DataSourceConfiguration` :) (check this in other parts too)


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404399019
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
 
 Review comment:
   `StringUtf8Coder.of()`
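
A rough illustration of why a dedicated string coder may be preferable. The rationale is an assumption, since the comment only names the replacement. `SerializableCoder` relies on Java serialization, which wraps every value in stream framing, whereas a UTF-8 encoding stores little more than the text bytes. The sketch below uses only the JDK (no Beam classes) and a hypothetical class `StringEncodingSizeDemo` to compare the two encodings of one short string.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo, not part of the PR: compares two byte encodings of a String. */
final class StringEncodingSizeDemo {

  /** Size of the value when written with Java serialization (ObjectOutputStream). */
  static int javaSerializedSize(String value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.size();
  }

  /** Size of the value when encoded as plain UTF-8 bytes. */
  static int utf8Size(String value) {
    return value.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String line = "cpu,host=a usage=0.5"; // illustrative line-protocol-like record
    System.out.println("java serialization: " + javaSerializedSize(line) + " bytes");
    System.out.println("utf-8:              " + utf8Size(line) + " bytes");
  }
}
```

For a source that emits many small strings, the per-element framing adds up, which is one reason to return `StringUtf8Coder.of()` from `getOutputCoder()`.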


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404227410
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
 
 Review comment:
   s/source/{@link #read()}


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404382511
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)}(durl, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allows you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<Stringn> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDb.write()
+ *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *   .withRetentionPolicy("autogen")
+ *   .withDatabase("metrics")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   withSslEnabled(true));
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
 
 Review comment:
   Remove

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-603750036
 
 
   Sorry for the delay and thanks for your patience @bipinupd, my review queue is finally advancing. I added @mwalenia so he can help me verify that the CI / IT part is consistent with common Beam patterns and that it works.
   I will do a review pass on the code and comment later today.


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404384582
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (url, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("SELECT * FROM cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
+        if (connection != null) {
+          connection.flush();
+          connection.close();
+          connection = null;
+        }
+      }
+
+      @Override
+      public void populateDisplayData(DisplayData.Builder builder) {
+        builder.delegate(Write.this);
+      }
+
+      private final Integer defaultNumberOfDuration = 1000;
+      private final Integer defaultFlushDuration = 100;
+    }
+  }
+
+  public static OkHttpClient.Builder getUnsafeOkHttpClient() {
+    try {
+      // Create a trust manager that does not validate certificate chains
+      final TrustManager[] trustAllCerts =
+          new TrustManager[] {
+            new X509TrustManager() {
+              @Override
+              public void checkClientTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public void checkServerTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public java.security.cert.X509Certificate[] getAcceptedIssuers() {
+                return new java.security.cert.X509Certificate[] {};
+              }
+            }
+          };
+
+      // Install the all-trusting trust manager
+      final SSLContext sslContext = SSLContext.getInstance("SSL");
+      sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
+      // Create an ssl socket factory with our all-trusting manager
+      final SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
+
+      OkHttpClient.Builder builder = new OkHttpClient.Builder();
+      builder.sslSocketFactory(sslSocketFactory, (X509TrustManager) trustAllCerts[0]);
+      builder.hostnameVerifier(
+          new HostnameVerifier() {
+            @Override
+            public boolean verify(String hostname, SSLSession session) {
+              return true;
+            }
+          });
+
+      return builder;
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+  }
+  /**
+   * A POJO describing a {}, either providing directly a or all properties allowing to create a {}.
 
 Review comment:
   A class describing ...?
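
A minimal sketch of how the Javadoc could read once the empty `{}` placeholders are replaced with a plain description. The class below is a simplified, hypothetical stand-in (plain strings, no AutoValue or `ValueProvider`), not the `DataSourceConfiguration` from the pull request, and the comment wording is only one possibility.

```java
import java.util.Objects;

/**
 * A class describing how to connect to an InfluxDB instance: the server URL plus the username and
 * password used to authenticate.
 *
 * <p>Hypothetical stand-in for illustration only; field types and method names are assumptions.
 */
final class DataSourceConfigurationSketch {
  private final String url;
  private final String username;
  private final String password;

  private DataSourceConfigurationSketch(String url, String username, String password) {
    this.url = url;
    this.username = username;
    this.password = password;
  }

  /** Creates a configuration from the three required connection properties. */
  static DataSourceConfigurationSketch create(String url, String username, String password) {
    return new DataSourceConfigurationSketch(
        Objects.requireNonNull(url, "url"),
        Objects.requireNonNull(username, "username"),
        Objects.requireNonNull(password, "password"));
  }

  String getUrl() {
    return url;
  }

  String getUsername() {
    return username;
  }

  String getPassword() {
    return password;
  }

  /** Leaves the password out, since this string may end up in display data or logs. */
  @Override
  public String toString() {
    return "DataSourceConfigurationSketch{url=" + url + ", username=" + username + "}";
  }
}
```

Keeping the password out of `toString()` is a design choice of this sketch: the diff above passes `dataSourceConfiguration().toString()` to `DisplayData.item(...)`, so whatever that method returns becomes visible in the pipeline's display data.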


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404220372
 
 

 ##########
 File path: sdks/java/io/influxdb/src/test/java/org/apache/beam/sdk/io/influxdb/InfluxDBIOIT.java
 ##########
 @@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import java.util.Arrays;
+import org.apache.beam.sdk.PipelineResult;
+import org.apache.beam.sdk.io.common.IOTestPipelineOptions;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * A test of {@link org.apache.beam.sdk.io.influxdb.InfluxDBIO} on an independent InfluxDB instance.
+ *
+ * <p>This test requires a running instance of InfluxDB. Pass in connection information using
+ * PipelineOptions:
+ *
+ * <pre>
+ *  ./gradlew integrationTest -p sdks/java/io/influxdb -DintegrationTestPipelineOptions='[
+ *  "--influxDBURL=http://localhost:8086",
+ *  "--databaseName=db0",
+ *  "--influxDBUserName=username",
+ *  "--influxDBPassword=password"]'
+ *  --tests org.apache.beam.sdk.io.influxdb.InfluxDBIOIT
+ *  -DintegrationTestRunner=direct
+ * </pre>
+ */
+@RunWith(JUnit4.class)
+public class InfluxDBIOIT {
+
+  private static InfluxDBPipelineOptions options;
+
+  @Rule public final TestPipeline writePipeline = TestPipeline.create();
+  @Rule public final TestPipeline readPipeline = TestPipeline.create();
+
+  /** InfluxDBIO options. */
+  public interface InfluxDBPipelineOptions extends IOTestPipelineOptions {
+    @Description("InfluxDB host (host name/ip address)")
+    @Default.String("http://localhost:8086")
+    String getInfluxDBURL();
+
+    void setInfluxDBURL(String value);
+
+    @Description("Username for InfluxDB")
+    @Default.String("superadmin")
+    String getInfluxDBUserName();
+
+    void setInfluxDBUserName(String value);
+
+    @Description("Password for InfluxDB")
+    @Default.String("supersecretpassword")
+    String getInfluxDBPassword();
+
+    void setInfluxDBPassword(String value);
+
+    @Description("InfluxDB database name")
+    @Default.String("db0")
+    String getDatabaseName();
+
+    void setDatabaseName(String value);
+  }
+
+  @BeforeClass
+  public static void setUp() {
+    PipelineOptionsFactory.register(InfluxDBPipelineOptions.class);
+    options = TestPipeline.testingPipelineOptions().as(InfluxDBPipelineOptions.class);
+  }
+
+  @After
+  public void clear() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("DROP DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Before
+  public void initTest() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("CREATE DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Test
+  public void testWriteAndRead() {
+    final int noofElementsToReadAndWrite = 1000;
+    writePipeline
+        .apply("Generate data", Create.of(GenerateData.getMetric(noofElementsToReadAndWrite)))
+        .apply(
+            "Write data to InfluxDB",
+            InfluxDBIO.write()
+                .withConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+    writePipeline.run().waitUntilFinish();
+    PCollection<String> readVals =
+        readPipeline.apply(
+            "Read all points in Influxdb",
+            InfluxDBIO.read()
+                .withDataSourceConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withQuery("SELECT * FROM \"test_m\"")
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+
+    PAssert.thatSingleton(readVals.apply("Count All", Count.globally()))
+        .isEqualTo((long) noofElementsToReadAndWrite);
+    PipelineResult readResult = readPipeline.run();
+    readResult.waitUntilFinish();
 
 Review comment:
   Inline this with the previous line, since the result is not used.
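
A minimal sketch of the suggested change, using hypothetical stand-in types (`InlineRunSketch.Pipeline` and `InlineRunSketch.Result` are not Beam classes; they only mimic the shape of `run()` followed by `waitUntilFinish()`). It contrasts storing the result in a local that is used once with chaining the call:

```java
/** Hypothetical stand-ins for a pipeline and its result; not Beam's real classes. */
final class InlineRunSketch {

  /** Minimal result handle that records whether the caller waited for completion. */
  static final class Result {
    private boolean finished;

    void waitUntilFinish() {
      finished = true;
    }

    boolean isFinished() {
      return finished;
    }
  }

  /** Minimal pipeline that hands out a fresh Result on every run. */
  static final class Pipeline {
    private Result last;

    Result run() {
      last = new Result();
      return last;
    }

    Result lastResult() {
      return last;
    }
  }

  /** Before: the result is stored in a local variable that is used exactly once. */
  static boolean runWithLocal(Pipeline pipeline) {
    Result readResult = pipeline.run();
    readResult.waitUntilFinish();
    return pipeline.lastResult().isFinished();
  }

  /** After: the call is chained, since nothing else needs the result. */
  static boolean runInlined(Pipeline pipeline) {
    pipeline.run().waitUntilFinish();
    return pipeline.lastResult().isFinished();
  }
}
```

Both variants behave the same; the chained form matches how the write pipeline is already run earlier in the same test.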

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615292249
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404220859
 
 

 ##########
 File path: sdks/java/io/influxdb/src/test/java/org/apache/beam/sdk/io/influxdb/InfluxDBIOIT.java
 ##########
 @@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import java.util.Arrays;
+import org.apache.beam.sdk.PipelineResult;
+import org.apache.beam.sdk.io.common.IOTestPipelineOptions;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * A test of {@link org.apache.beam.sdk.io.influxdb.InfluxDBIO} on an independent InfluxDB instance.
+ *
+ * <p>This test requires a running instance of InfluxDB. Pass in connection information using
+ * PipelineOptions:
+ *
+ * <pre>
+ *  ./gradlew integrationTest -p sdks/java/io/influxdb -DintegrationTestPipelineOptions='[
+ *  "--influxDBURL=http://localhost:8086",
+ *  "--databaseName=db0",
+ *  "--influxDBUserName=username",
+ *  "--influxDBPassword=password"]'
+ *  --tests org.apache.beam.sdk.io.influxdb.InfluxDBIOIT
+ *  -DintegrationTestRunner=direct
+ * </pre>
+ */
+@RunWith(JUnit4.class)
+public class InfluxDBIOIT {
+
+  private static InfluxDBPipelineOptions options;
+
+  @Rule public final TestPipeline writePipeline = TestPipeline.create();
+  @Rule public final TestPipeline readPipeline = TestPipeline.create();
+
+  /** InfluxDBIO options. */
+  public interface InfluxDBPipelineOptions extends IOTestPipelineOptions {
+    @Description("InfluxDB host (host name/ip address)")
+    @Default.String("http://localhost:8086")
+    String getInfluxDBURL();
+
+    void setInfluxDBURL(String value);
+
+    @Description("Username for InfluxDB")
+    @Default.String("superadmin")
+    String getInfluxDBUserName();
+
+    void setInfluxDBUserName(String value);
+
+    @Description("Password for InfluxDB")
+    @Default.String("supersecretpassword")
+    String getInfluxDBPassword();
+
+    void setInfluxDBPassword(String value);
+
+    @Description("InfluxDB database name")
+    @Default.String("db0")
+    String getDatabaseName();
+
+    void setDatabaseName(String value);
+  }
+
+  @BeforeClass
+  public static void setUp() {
+    PipelineOptionsFactory.register(InfluxDBPipelineOptions.class);
+    options = TestPipeline.testingPipelineOptions().as(InfluxDBPipelineOptions.class);
+  }
+
+  @After
+  public void clear() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("DROP DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Before
+  public void initTest() {
+    try (InfluxDB connection =
+        InfluxDBFactory.connect(
+            options.getInfluxDBURL(),
+            options.getInfluxDBUserName(),
+            options.getInfluxDBPassword())) {
+      connection.query(new Query("CREATE DATABASE \"" + options.getDatabaseName() + "\""));
+    }
+  }
+
+  @Test
+  public void testWriteAndRead() {
+    final int noofElementsToReadAndWrite = 1000;
+    writePipeline
+        .apply("Generate data", Create.of(GenerateData.getMetric(noofElementsToReadAndWrite)))
+        .apply(
+            "Write data to InfluxDB",
+            InfluxDBIO.write()
+                .withConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+    writePipeline.run().waitUntilFinish();
+    PCollection<String> readVals =
+        readPipeline.apply(
+            "Read all points in Influxdb",
+            InfluxDBIO.read()
+                .withDataSourceConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withQuery("SELECT * FROM \"test_m\"")
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+
+    PAssert.thatSingleton(readVals.apply("Count All", Count.globally()))
+        .isEqualTo((long) noofElementsToReadAndWrite);
+    PipelineResult readResult = readPipeline.run();
+    readResult.waitUntilFinish();
+  }
+
+  @Test
+  public void testWriteAndReadWithSingleMetric() {
+    final int noofElementsToReadAndWrite = 1000;
+    writePipeline
+        .apply("Generate data", Create.of(GenerateData.getMetric(noofElementsToReadAndWrite)))
+        .apply(
+            "Write data to InfluxDB",
+            InfluxDBIO.write()
+                .withConfiguration(
+                    InfluxDBIO.DataSourceConfiguration.create(
+                        options.getInfluxDBURL(),
+                        options.getInfluxDBUserName(),
+                        options.getInfluxDBPassword()))
+                .withDatabase(options.getDatabaseName())
+                .withSslInvalidHostNameAllowed(false)
+                .withSslEnabled(false));
+    writePipeline.run().waitUntilFinish();
+    PCollection<String> readVals =
 
 Review comment:
   s/readVals/values


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404378445
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true)
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
 
 Review comment:
   s/withConfiguration()/configuration


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404388315
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true)
+ *    );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
 
 Review comment:
   Can you please add an extra overload of this method: `withMetrics(String... metrics)`
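
One way the requested varargs method could look, sketched on a hypothetical stand-in class (`ReadSketch` is not the PR's `InfluxDBIO.Read`; it keeps only the metric setting and swaps the AutoValue builder for a plain constructor). The assumption here is that the varargs form simply delegates to the existing `List`-based method:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for InfluxDBIO.Read, reduced to the metric setting only. */
final class ReadSketch {
  private final List<String> metrics;

  private ReadSketch(List<String> metrics) {
    // Defensive copy so later changes to the caller's list do not leak in.
    this.metrics = Collections.unmodifiableList(new ArrayList<>(metrics));
  }

  static ReadSketch create() {
    return new ReadSketch(Collections.<String>emptyList());
  }

  /** Existing style from the diff: the caller passes a List. */
  ReadSketch withMetric(List<String> metric) {
    return new ReadSketch(metric);
  }

  /** Requested convenience method: varargs delegating to the List version. */
  ReadSketch withMetrics(String... metrics) {
    return withMetric(Arrays.asList(metrics));
  }

  List<String> metrics() {
    return metrics;
  }
}
```

With this shape a caller can write `ReadSketch.create().withMetrics("cpu", "mem")` instead of wrapping the names in `Arrays.asList(...)` first.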


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404388505
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (url, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
 
 Review comment:
   Please use the primitive `boolean` here instead of `Boolean`.


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615150628
 
 
   It seems some files are missing the Apache license headers; you can fix that by running `./gradlew :sdks:java:io:influxdb:check spotlessApply`. Also, the state of the PR is a bit messy and needs rebasing; you can do that with `git pull origin master --rebase`.
   Can you please do both, verify that the tests pass locally, and ping me when it is ready? I will be glad to do the review then.


[GitHub] [beam] iemejia closed pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia closed pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028
 
 
   


[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176377
 
 
   retest this please


[GitHub] [beam] bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r407240538
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/package-info.java
 ##########
 @@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Transforms for reading and writing from/to InfluxDB.
+ *
+ * @see org.apache.beam.sdk.io.influxdb.InfluxDBIO
+ */
+package org.apache.beam.sdk.io.influxdb;
 
 Review comment:
   I have added the tag to the InfluxDBIO class.


[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-594197974
 
 
   retest this please


[GitHub] [beam] bipinupd closed pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd closed pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028
 
 
   


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176235
 
 
   retest this please


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404402426
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (url, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("Select * from cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withDataSourceConfiguration(DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true));
+ * }</pre>
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
 
 Review comment:
   This method should return a list of sources where each source ideally corresponds to one partition/shard. The current implementation divides per metric. Do shards correspond 1 to 1 to metrics in InfluxDB? If yes, this is OK; if not, is there a way to access the partitions corresponding to a metric? (This is what we should ideally return here.)
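
   One possible shape for a shard-oriented split, sketched under assumptions: it presumes the shard time ranges have already been fetched (InfluxQL 1.x has a `SHOW SHARDS` statement that reports a start and end time per shard) and that one time-bounded query per range is an acceptable unit of work. `ShardSplitSketch`, `ShardRange`, and `queriesPerShard` are hypothetical names that are not part of the PR, and the Beam `BoundedSource` wiring is left out:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns shard time ranges into one bounded query per shard. */
final class ShardSplitSketch {

  /** Hypothetical holder for the [start, end) time range covered by one shard. */
  static final class ShardRange {
    final Instant start;
    final Instant end;

    ShardRange(Instant start, Instant end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Builds one InfluxQL query per shard range, bounded by RFC 3339 timestamps. */
  static List<String> queriesPerShard(String measurement, List<ShardRange> shards) {
    List<String> queries = new ArrayList<>();
    for (ShardRange shard : shards) {
      // Instant.toString() yields an ISO-8601 UTC timestamp such as 2020-03-02T00:00:00Z.
      queries.add(
          String.format(
              "SELECT * FROM \"%s\" WHERE time >= '%s' AND time < '%s'",
              measurement, shard.start, shard.end));
    }
    return queries;
  }
}
```

   Each resulting query could then back one source returned from `split`, so that the number of sources follows the number of shards rather than the number of metrics. Whether this matches how InfluxDB distributes data for a given deployment would still need to be checked.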


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404227752
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
 
 Review comment:
   remove `To configure the InfluxDB source, `


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404218067
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/package-info.java
 ##########
 @@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Transforms for reading and writing from/to InfluxDB.
+ *
+ * @see org.apache.beam.sdk.io.influxdb.InfluxDBIO
+ */
+package org.apache.beam.sdk.io.influxdb;
 
 Review comment:
   Please add `@Experimental(Kind.SOURCE_SINK)` annotation


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-613695575
 
 
   retest this please


[GitHub] [beam] bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
bipinupd commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-594081070
 
 
   Hi @iemejia, I decided to close the previous PR (https://github.com/apache/beam/pull/10604) and open a new one. Sorry for the inconvenience.


[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615292356
 
 
   retest this please


[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-615489475
 
 
   Oops, I accidentally closed it :)


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404404249
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("SELECT * FROM cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true)
+ *   );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (databaseNames != null) {
+            Iterator var4 = databaseNames.iterator();
+            while (var4.hasNext()) {
+              List database = (List) var4.next();
+              size += database.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
+    /**
+     * @param desiredElementsInABundle
+     * @param options
+     * @return
+     * @throws Exception
+     */
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No primary shard found");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> " + query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      return;
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
+
+      private final Write spec;
+      private InfluxDB connection;
+
+      InfluxWriterFn(Write write) {
+        this.spec = write;
+      }
+
+      @Setup
+      public void setup() throws Exception {
+        connection =
+            getConnection(
+                spec.dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled());
+        int flushDuration =
+            spec.flushDuration() != null ? spec.flushDuration() : defaultFlushDuration;
+        int noOfBatchPoints =
+            spec.noOfElementsToBatch() != null
+                ? spec.noOfElementsToBatch()
+                : defaultNumberOfDuration;
+        connection.enableBatch(
+            BatchOptions.DEFAULTS.actions(noOfBatchPoints).flushDuration(flushDuration));
+        connection.setDatabase(spec.database());
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        connection.write(c.element().toString());
+      }
+
+      @FinishBundle
+      public void finishBundle() throws Exception {
+        connection.flush();
+      }
+
+      @Teardown
+      public void tearDown() throws Exception {
+        if (connection != null) {
+          connection.flush();
+          connection.close();
+          connection = null;
+        }
+      }
+
+      @Override
+      public void populateDisplayData(DisplayData.Builder builder) {
+        builder.delegate(Write.this);
+      }
+
+      private final Integer defaultNumberOfDuration = 1000;
+      private final Integer defaultFlushDuration = 100;
+    }
+  }
+
+  public static OkHttpClient.Builder getUnsafeOkHttpClient() {
+    try {
+      // Create a trust manager that does not validate certificate chains
+      final TrustManager[] trustAllCerts =
+          new TrustManager[] {
+            new X509TrustManager() {
+              @Override
+              public void checkClientTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public void checkServerTrusted(
+                  java.security.cert.X509Certificate[] chain, String authType)
+                  throws CertificateException {}
+
+              @Override
+              public java.security.cert.X509Certificate[] getAcceptedIssuers() {
+                return new java.security.cert.X509Certificate[] {};
+              }
+            }
+          };
+
+      // Install the all-trusting trust manager
+      final SSLContext sslContext = SSLContext.getInstance("SSL");
+      sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
+      // Create an ssl socket factory with our all-trusting manager
+      final SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
+
+      OkHttpClient.Builder builder = new OkHttpClient.Builder();
+      builder.sslSocketFactory(sslSocketFactory, (X509TrustManager) trustAllCerts[0]);
+      builder.hostnameVerifier(
+          new HostnameVerifier() {
+            @Override
+            public boolean verify(String hostname, SSLSession session) {
+              return true;
+            }
+          });
+
+      return builder;
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+  }
+  /**
+   * A POJO describing the configuration (URL, username and password) used to connect to InfluxDB.
+   */
+  @AutoValue
+  public abstract static class DataSourceConfiguration implements Serializable {
+
+    @Nullable
+    abstract ValueProvider<String> getUrl();
 
 Review comment:
   Remove the `get` prefix from these accessors to be consistent with the style of the other uses of AutoValue.
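
   A minimal sketch of the requested naming, under some loud assumptions: `ConnectionConfig` is a hypothetical, hand-written stand-in for the AutoValue-generated class so that it compiles with only the JDK, and plain `String` replaces `ValueProvider<String>`. It only illustrates accessors named after the property (`url()`, `username()`, `password()`) rather than `getUrl()` and friends; it is not the PR's actual code.

```java
import java.util.Objects;

/** Hypothetical stand-in for an AutoValue configuration class; illustration only. */
final class ConnectionConfig {
  private final String url;
  private final String username;
  private final String password;

  private ConnectionConfig(String url, String username, String password) {
    this.url = Objects.requireNonNull(url, "url");
    this.username = Objects.requireNonNull(username, "username");
    this.password = Objects.requireNonNull(password, "password");
  }

  /** Static factory, mirroring the create(url, username, password) shape in the diff. */
  static ConnectionConfig create(String url, String username, String password) {
    return new ConnectionConfig(url, username, password);
  }

  // Accessors are named after the property itself, with no "get" prefix.
  String url() {
    return url;
  }

  String username() {
    return username;
  }

  String password() {
    return password;
  }

  public static void main(String[] args) {
    ConnectionConfig config =
        ConnectionConfig.create("https://localhost:8086", "username", "password");
    // Call sites read as config.url() instead of config.getUrl().
    System.out.println(config.url());
  }
}
```

   With AutoValue itself, the same effect would come from declaring the abstract accessors without the prefix and letting the generated subclass implement them.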


[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404374894
 
 

 ##########
 File path: sdks/java/io/influxdb/src/main/java/org/apache/beam/sdk/io/influxdb/InfluxDBIO.java
 ##########
 @@ -0,0 +1,709 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.influxdb;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.security.cert.CertificateException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
+import javax.net.ssl.HostnameVerifier;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.SSLSession;
+import javax.net.ssl.SSLSocketFactory;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.X509TrustManager;
+import okhttp3.OkHttpClient;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.influxdb.BatchOptions;
+import org.influxdb.InfluxDB;
+import org.influxdb.InfluxDBFactory;
+import org.influxdb.dto.Query;
+import org.influxdb.dto.QueryResult;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * IO to read and write to InfluxDB.
+ *
+ * <h3>Reading from InfluxDB datasource</h3>
+ *
+ * <p>InfluxDBIO source returns a bounded collection of {@code String} as a {@code
+ * PCollection<String>}.
+ *
+ * <p>To configure the InfluxDB source, you have to provide a {@link DataSourceConfiguration} using
+ * <br>
+ * {@link DataSourceConfiguration#create(String, String, String)} (URL, username and password).
+ * Optionally, {@link DataSourceConfiguration#withUsername(String)} and {@link
+ * DataSourceConfiguration#withPassword(String)} allow you to define the username and password.
+ *
+ * <p>For example:
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <p>For example (Read from query):
+ *
+ * <pre>{@code
+ * PCollection<String> collection = pipeline.apply(InfluxDBIO.read()
+ *   .withDataSourceConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *          "https://localhost:8086","username","password"))
+ *   .withDatabase("metrics")
+ *   .withQuery("SELECT * FROM cpu")
+ *   .withRetentionPolicy("autogen")
+ *   .withSslInvalidHostNameAllowed(true)
+ *   .withSslEnabled(true));
+ * }</pre>
+ *
+ * <h3>Writing to Influx datasource</h3>
+ *
+ * <p>InfluxDB sink supports writing records into a database. It writes a {@link PCollection} to the
+ * database by converting each T. The T should implement getLineProtocol() from {@link
+ * LineProtocolConvertable}.
+ *
+ * <p>Like the source, to configure the sink, you have to provide a {@link DataSourceConfiguration}.
+ *
+ * <pre>{@code
+ * pipeline
+ *   .apply(...)
+ *   .apply(InfluxDBIO.write()
+ *      .withConfiguration(InfluxDBIO.DataSourceConfiguration.create(
+ *            "https://localhost:8086","username","password"))
+ *      .withRetentionPolicy("autogen")
+ *      .withDatabase("metrics")
+ *      .withSslInvalidHostNameAllowed(true)
+ *      .withSslEnabled(true)
+ *   );
+ * }</pre>
+ *
+ * *
+ */
+@Experimental(Experimental.Kind.SOURCE_SINK)
+public class InfluxDBIO {
+  private static final Logger LOG = LoggerFactory.getLogger(InfluxDBIO.class);
+
+  public static Write write() {
+    return new AutoValue_InfluxDBIO_Write.Builder().build();
+  }
+
+  public static Read read() {
+    return new AutoValue_InfluxDBIO_Read.Builder().build();
+  }
+
+  @AutoValue
+  public abstract static class Read extends PTransform<PBegin, PCollection<String>> {
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String query();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    @Nullable
+    abstract List<String> metric();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Builder setQuery(String query);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setMetric(List<String> metric);
+
+      abstract Read build();
+    }
+
+    /** Reads from the InfluxDB instance indicated by the given configuration. */
+    public Read withDataSourceConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    /** Reads from the specified database. */
+    public Read withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+    /** Reads from the specified query. */
+    public Read withQuery(String query) {
+      return builder().setQuery(query).build();
+    }
+
+    public Read withMetric(List<String> metric) {
+      return builder().setMetric(metric).build();
+    }
+
+    public Read withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Read withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Read withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    @Override
+    public PCollection<String> expand(PBegin input) {
+      checkArgument(dataSourceConfiguration() != null, "withDataSourceConfiguration() is required");
+      checkArgument(
+          query() != null || database() != null, "withDatabase() or withQuery() is required");
+      if (database() != null) {
+        try (InfluxDB connection =
+            getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+          checkArgument(
+              connection.databaseExists(database()), "Database %s does not exist", database());
+        }
+      }
+      return input.apply(org.apache.beam.sdk.io.Read.from(new InfluxDBSource(this)));
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(DisplayData.item("query", query()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+    }
+  }
+
+  static class InfluxDBSource extends BoundedSource<String> {
+    private final Read spec;
+
+    InfluxDBSource(Read read) {
+      this.spec = read;
+    }
+
+    @Override
+    public long getEstimatedSizeBytes(PipelineOptions pipelineOptions) throws Exception {
+      int size = 0;
+      try (InfluxDB connection =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        connection.setDatabase(spec.database());
+        QueryResult queryResult = connection.query(new Query(getQueryToRun(spec), spec.database()));
+        if (queryResult != null) {
+          // Each row is one point; its size is the number of column values it carries.
+          List<List<Object>> rows =
+              queryResult.getResults().get(0).getSeries().get(0).getValues();
+          if (rows != null) {
+            for (List<Object> row : rows) {
+              size += row.size();
+            }
+          }
+        }
+      }
+      LOG.info("Estimated number of elements {} for database {}", size, spec.database());
+      return size;
+    }
+
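+    /**
+     * Splits this source into one sub-source per metric when more than one metric is
+     * configured; otherwise returns this source unchanged. {@code desiredElementsInABundle}
+     * is currently ignored.
+     */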
+    @Override
+    public List<? extends BoundedSource<String>> split(
+        long desiredElementsInABundle, PipelineOptions options) throws Exception {
+      List<BoundedSource<String>> sources = new ArrayList<BoundedSource<String>>();
+      if (spec.metric() != null && spec.metric().size() > 1) {
+        for (String metric : spec.metric()) {
+          sources.add(new InfluxDBSource(spec.withMetric(Arrays.asList(metric))));
+        }
+      } else {
+        sources.add(this);
+      }
+      checkArgument(!sources.isEmpty(), "No sources were produced by split()");
+      return sources;
+    }
+
+    @Override
+    public BoundedReader<String> createReader(PipelineOptions pipelineOptions) {
+      return new BoundedInfluxDbReader(this);
+    }
+
+    @Override
+    public void validate() {
+      spec.validate(null /* input */);
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      spec.populateDisplayData(builder);
+    }
+
+    @Override
+    public Coder<String> getOutputCoder() {
+      return SerializableCoder.of(String.class);
+    }
+  }
+
+  private static String getQueryToRun(Read spec) {
+    if (spec.query() == null) {
+      return "SELECT * FROM " + String.join(",", spec.metric());
+    }
+    return spec.query();
+  }
+
+  private static InfluxDB getConnection(
+      DataSourceConfiguration configuration,
+      boolean sslInvalidHostNameAllowed,
+      boolean sslEnabled) {
+    if (sslInvalidHostNameAllowed && sslEnabled) {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get(),
+          getUnsafeOkHttpClient());
+    } else {
+      return InfluxDBFactory.connect(
+          configuration.getUrl().get(),
+          configuration.getUsername().get(),
+          configuration.getPassword().get());
+    }
+  }
+
+  private static class BoundedInfluxDbReader extends BoundedSource.BoundedReader<String> {
+    private final InfluxDBIO.InfluxDBSource source;
+    private Iterator cursor;
+    private List current;
+
+    public BoundedInfluxDbReader(InfluxDBIO.InfluxDBSource source) {
+      this.source = source;
+    }
+
+    @Override
+    public boolean start() {
+      InfluxDBIO.Read spec = source.spec;
+      try (InfluxDB influxDB =
+          getConnection(
+              spec.dataSourceConfiguration(),
+              spec.sslInvalidHostNameAllowed(),
+              spec.sslEnabled())) {
+        if (spec.database() != null) {
+          influxDB.setDatabase(spec.database());
+        }
+        String query = getQueryToRun(spec);
+        LOG.debug("BoundedInfluxDbReader.start() ==> {}", query);
+
+        QueryResult queryResult = influxDB.query(new Query(query, spec.database()));
+
+        List databaseNames = queryResult.getResults().get(0).getSeries().get(0).getValues();
+
+        if (databaseNames != null) {
+          cursor = databaseNames.iterator();
+        }
+      }
+      return advance();
+    }
+
+    @Override
+    public boolean advance() {
+      if (cursor != null && cursor.hasNext()) {
+        current = (List) cursor.next();
+        return true;
+      } else {
+        return false;
+      }
+    }
+
+    @Override
+    public BoundedSource<String> getCurrentSource() {
+      return source;
+    }
+
+    @Override
+    public String getCurrent() throws NoSuchElementException {
+      return current.toString();
+    }
+
+    @Override
+    public void close() {
+      // Nothing to release: start() closes the connection via try-with-resources.
+    }
+  }
+
+  @AutoValue
+  public abstract static class Write extends PTransform<PCollection<String>, PDone> {
+
+    @Override
+    public PDone expand(PCollection<String> input) {
+      checkArgument(dataSourceConfiguration() != null, "withConfiguration() is required");
+      checkArgument(database() != null && !database().isEmpty(), "withDatabase() is required");
+      try (InfluxDB connection =
+          getConnection(dataSourceConfiguration(), sslInvalidHostNameAllowed(), sslEnabled())) {
+        checkArgument(
+            connection.databaseExists(database()), "Database %s does not exist", database());
+      }
+      input.apply(ParDo.of(new InfluxWriterFn(this)));
+      return PDone.in(input.getPipeline());
+    }
+
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      builder.addIfNotNull(
+          DisplayData.item("dataSourceConfiguration", dataSourceConfiguration().toString()));
+      builder.addIfNotNull(DisplayData.item("database", database()));
+      builder.addIfNotNull(DisplayData.item("retentionPolicy", retentionPolicy()));
+      builder.addIfNotNull(DisplayData.item("sslEnabled", sslEnabled()));
+      builder.addIfNotNull(
+          DisplayData.item("sslInvalidHostNameAllowed", sslInvalidHostNameAllowed()));
+      builder.addIfNotNull(DisplayData.item("noOfElementsToBatch", noOfElementsToBatch()));
+      builder.addIfNotNull(DisplayData.item("flushDuration", flushDuration()));
+    }
+
+    @Nullable
+    abstract String database();
+
+    @Nullable
+    abstract String retentionPolicy();
+
+    @Nullable
+    abstract Boolean sslInvalidHostNameAllowed();
+
+    @Nullable
+    abstract Boolean sslEnabled();
+
+    @Nullable
+    abstract Integer noOfElementsToBatch();
+
+    @Nullable
+    abstract Integer flushDuration();
+
+    @Nullable
+    abstract DataSourceConfiguration dataSourceConfiguration();
+
+    abstract Builder builder();
+
+    @AutoValue.Builder
+    abstract static class Builder {
+      abstract Builder setDataSourceConfiguration(DataSourceConfiguration configuration);
+
+      abstract Builder setDatabase(String database);
+
+      abstract Builder setSslInvalidHostNameAllowed(Boolean value);
+
+      abstract Builder setNoOfElementsToBatch(Integer noOfElementsToBatch);
+
+      abstract Builder setFlushDuration(Integer flushDuration);
+
+      abstract Builder setSslEnabled(Boolean sslEnabled);
+
+      abstract Builder setRetentionPolicy(String retentionPolicy);
+
+      abstract Write build();
+    }
+
+    public Write withConfiguration(DataSourceConfiguration configuration) {
+      checkArgument(configuration != null, "configuration can not be null");
+      return builder().setDataSourceConfiguration(configuration).build();
+    }
+
+    public Write withDatabase(String database) {
+      return builder().setDatabase(database).build();
+    }
+
+    public Write withSslEnabled(boolean sslEnabled) {
+      return builder().setSslEnabled(sslEnabled).build();
+    }
+
+    public Write withSslInvalidHostNameAllowed(Boolean value) {
+      return builder().setSslInvalidHostNameAllowed(value).build();
+    }
+
+    public Write withNoOfElementsToBatch(Integer noOfElementsToBatch) {
+      return builder().setNoOfElementsToBatch(noOfElementsToBatch).build();
+    }
+
+    public Write withFlushDuration(Integer flushDuration) {
+      return builder().setFlushDuration(flushDuration).build();
+    }
+
+    public Write withRetentionPolicy(String rp) {
+      return builder().setRetentionPolicy(rp).build();
+    }
+
+    private class InfluxWriterFn<T> extends DoFn<T, Void> {
 
 Review comment:
   Maybe it is simpler to just make it extend `DoFn<String, Void>` for the moment.
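
   A minimal sketch of the idea behind this suggestion, written as a plain-Java stand-in so it runs without Beam on the classpath. `StringBatchWriter`, `process`, and `finish` are hypothetical names, not Beam's `DoFn` API; the batch size plays the role of `noOfElementsToBatch`. With the element type fixed to `String`, the writer can buffer line-protocol strings directly and needs no type parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical stand-in for a writer function whose element type is fixed to String.
 * It is NOT Beam's DoFn; class and method names are illustrative only.
 */
final class StringBatchWriter {
  private final int batchSize;
  private final Consumer<List<String>> sink;
  private final List<String> buffer = new ArrayList<>();

  StringBatchWriter(int batchSize, Consumer<List<String>> sink) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be at least 1");
    }
    this.batchSize = batchSize;
    this.sink = sink;
  }

  /** Buffers one element and flushes as soon as the batch is full. */
  void process(String element) {
    buffer.add(element);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  /** Flushes whatever is left, as a bundle-finish hook would. */
  void finish() {
    flush();
  }

  private void flush() {
    if (!buffer.isEmpty()) {
      sink.accept(new ArrayList<>(buffer)); // hand the sink its own copy
      buffer.clear();
    }
  }
}
```

   In the real transform the sink would be the InfluxDB client write call; here it is any `Consumer<List<String>>`, which keeps the sketch testable in isolation.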

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-594197974
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [beam] iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r404225264
 
 

 ##########
 File path: sdks/java/io/influxdb/build.gradle
 ##########
 @@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature(automaticModuleName: 'org.apache.beam.sdk.io.influxdb')
+provideIntegrationTestingDependencies()
+enableJavaPerformanceTesting()
+
+description = "Apache Beam :: SDKs :: Java :: IO :: InfluxDB"
+ext.summary = "IO to read and write on InfluxDB."
+
 
 Review comment:
   Add `def influxdb_version = "2.15"` (2.17 may be the better choice now).

----------------------------------------------------------------

[GitHub] [beam] mwalenia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
mwalenia commented on a change in pull request #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#discussion_r397759342
 
 

 ##########
 File path: .test-infra/jenkins/job_PerformanceTests_InfluxDBIO_IT.groovy
 ##########
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+import CommonJobProperties as common
+import Kubernetes
+
+String jobName = "beam_PerformanceTests_InfluxDBIO_IT"
+
+job(jobName) {
+  common.setTopLevelMainJobProperties(delegate)
+  common.enablePhraseTriggeringFromPullRequest(
+          delegate,
+          'Java InfluxDBIO Performance Test',
+          'Run Java InfluxDBIO Performance Test')
+
+  String namespace = common.getKubernetesNamespace(jobName)
+  String kubeconfigPath = common.getKubeconfigLocationForNamespace(namespace)
+  Kubernetes k8s = Kubernetes.create(delegate, kubeconfigPath, namespace)
+
+  k8s.apply(common.makePathAbsolute("src/.test-infra/kubernetes/influxdb/influxdb.yml"))
+  String influxDBHostName = "LOAD_BALANCER_IP"
+  k8s.loadBalancerIP("influxdb-load-balancer-service", influxDBHostName)
+  Map pipelineOptions = [
+          influxDBURL     : "http://\$${influxDBHostName}:8086",
+          influxDBUserName : "superadmin",
+          influxDBPassword : "supersecretpassword",
+          databaseName : "db1"
+  ]
+
+  steps {
+    gradle {
+      rootBuildScriptDir(common.checkoutDir)
+      common.setGradleSwitches(delegate)
+      switches("--info")
+      switches("-DintegrationTestPipelineOptions=\'${common.joinPipelineOptions(pipelineOptions)}\'")
+      switches("-DintegrationTestRunner=direct")
 
 Review comment:
   I think this should run on Dataflow so that we test the integration of a real runner with a real service.
   Remember to add the pipeline options that the Dataflow runner requires.
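
   As a rough illustration of what the extra options might look like, here is a plain-Java sketch. `runner`, `project`, and `tempLocation` are standard Beam/Dataflow pipeline option names; the values passed in are placeholders, `PipelineOptionsSketch` is a hypothetical helper, and the `["--key=value",...]` output format is an assumption about what `common.joinPipelineOptions` produces.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper; it only illustrates how the options could be assembled. */
final class PipelineOptionsSketch {

  /** Builds the options a Dataflow run typically needs; values come from the caller. */
  static Map<String, String> dataflowOptions(String project, String tempLocation) {
    Map<String, String> options = new LinkedHashMap<>(); // keeps insertion order
    options.put("runner", "DataflowRunner");
    options.put("project", project);
    options.put("tempLocation", tempLocation);
    return options;
  }

  /** Joins options as ["--key=value",...]; this format is an assumption. */
  static String join(Map<String, String> options) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (Map.Entry<String, String> entry : options.entrySet()) {
      joiner.add("\"--" + entry.getKey() + "=" + entry.getValue() + "\"");
    }
    return joiner.toString();
  }
}
```

   In the Jenkins job itself, the equivalent change would be extra entries in the `pipelineOptions` map plus `-DintegrationTestRunner=dataflow` instead of `direct`.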

----------------------------------------------------------------

[GitHub] [beam] iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia commented on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176377
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [beam] iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB

Posted by GitBox <gi...@apache.org>.
iemejia removed a comment on issue #11028: BEAM-2546 Beam IO for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-595176235
 
 
   retest this please

----------------------------------------------------------------