Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/05/18 07:32:01 UTC

[GitHub] [flink] JingsongLi opened a new pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

JingsongLi opened a new pull request #12212:
URL: https://github.com/apache/flink/pull/12212


   
   ## What is the purpose of the change
   
   The filesystem connector should use the FLIP-122 format options style, for example:
   ```
   create table t (...) with (
     'connector'='filesystem',
     'path'='...',
     'format'='csv',
     'csv.field-delimiter'=';'
   )
   ```
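In the style above, the `format` option names the format, and options meant for that format carry the format name as a key prefix. A minimal sketch of how such options could be separated, assuming the table options arrive as a plain `Map<String, String>`; `FormatOptionsSketch`, `formatOptions`, and the `/tmp/t` path are hypothetical and not part of Flink:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Flink: extracts the options that belong to one format. */
final class FormatOptionsSketch {

    /**
     * Returns the options whose keys start with "<format>." with that prefix removed,
     * e.g. "csv.field-delimiter" becomes "field-delimiter" when 'format' is "csv".
     */
    static Map<String, String> formatOptions(Map<String, String> tableOptions) {
        String format = tableOptions.get("format");
        if (format == null) {
            throw new IllegalArgumentException("Missing required option 'format'");
        }
        String prefix = format + ".";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : tableOptions.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                // Drop the "<format>." prefix so the format only sees its own keys.
                result.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "filesystem");
        options.put("path", "/tmp/t"); // assumed path, for illustration only
        options.put("format", "csv");
        options.put("csv.field-delimiter", ";");
        System.out.println(formatOptions(options));
    }
}
```

Stripping the prefix this way would be consistent with a format declaring its own option keys without any `format.` prefix, but the actual mechanism in Flink may differ from this sketch.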
   
   ## Brief change log
   
   - `FileSystemFormatFactory` implements the FLIP-95 `Factory` interface
   - Update the formats to the new factory and option style
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes)
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   





[GitHub] [flink] leonardBang commented on a change in pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
leonardBang commented on a change in pull request #12212:
URL: https://github.com/apache/flink/pull/12212#discussion_r426454025



##########
File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemOptions.java
##########
@@ -29,6 +29,35 @@
  */
 public class FileSystemOptions {
 
+	public static final ConfigOption<String> PATH = key("path")
+			.stringType()
+			.noDefaultValue()
+			.withDescription("The path of a directory");
+
+	public static final ConfigOption<String> PARTITION_DEFAULT_NAME = key("partition.default-name")
+			.stringType()
+			.defaultValue("__DEFAULT_PARTITION__")
+			.withDescription("The default partition name in case the dynamic partition" +
+					" column value is null/empty string");
+
+	public static final ConfigOption<Long> SINK_ROLLING_POLICY_FILE_SIZE = key("sink.rolling-policy.file-size")
+			.longType()
+			.defaultValue(1024L * 1024L * 128L)
+			.withDescription("The maximum part file size before rolling (by default 128MB).");
+
+	public static final ConfigOption<Long> SINK_ROLLING_POLICY_TIME_INTERVAL = key("sink.rolling-policy.time.interval")

Review comment:
       ```suggestion
   	public static final ConfigOption<Long> SINK_ROLLING_POLICY_TIME_INTERVAL = key("sink.rolling-policy.time-interval")
   ```
   How about renaming this to `sink.rolling-policy.time-interval`, which is closer to FLIP-122's style?
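For context, the two rolling-policy options in this hunk bound how large a part file may grow and, presumably, how long it may stay open. A minimal sketch of how such a policy could be evaluated, assuming the interval is given in milliseconds; `RollingPolicySketch` and the 30-minute interval are hypothetical, and only the 128MB size default comes from the diff:

```java
/** Hypothetical sketch, not Flink's implementation: decides when a part file should roll. */
final class RollingPolicySketch {

    private final long maxFileSizeBytes;
    private final long maxOpenMillis;

    RollingPolicySketch(long maxFileSizeBytes, long maxOpenMillis) {
        this.maxFileSizeBytes = maxFileSizeBytes;
        this.maxOpenMillis = maxOpenMillis;
    }

    /** Rolls once the file reaches the size limit or has been open for the whole interval. */
    boolean shouldRoll(long currentSizeBytes, long openedAtMillis, long nowMillis) {
        return currentSizeBytes >= maxFileSizeBytes
                || nowMillis - openedAtMillis >= maxOpenMillis;
    }

    public static void main(String[] args) {
        // 128MB matches the default in the diff; the 30-minute interval is an assumption.
        RollingPolicySketch policy =
                new RollingPolicySketch(1024L * 1024L * 128L, 30L * 60L * 1000L);
        long openedAt = System.currentTimeMillis();
        System.out.println(policy.shouldRoll(1024L, openedAt, openedAt + 1000L));
    }
}
```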

##########
File path: flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetFileSystemFormatFactory.java
##########
@@ -23,137 +23,104 @@
 import org.apache.flink.api.common.serialization.BulkWriter;
 import org.apache.flink.api.common.serialization.Encoder;
 import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ReadableConfig;
 import org.apache.flink.core.fs.FileInputSplit;
 import org.apache.flink.core.fs.Path;
 import org.apache.flink.formats.parquet.row.ParquetRowDataBuilder;
 import org.apache.flink.formats.parquet.utils.SerializableConfiguration;
 import org.apache.flink.formats.parquet.vector.ParquetColumnarRowSplitReader;
 import org.apache.flink.formats.parquet.vector.ParquetSplitReaderUtil;
 import org.apache.flink.table.data.RowData;
-import org.apache.flink.table.descriptors.DescriptorProperties;
 import org.apache.flink.table.factories.FileSystemFormatFactory;
 import org.apache.flink.table.types.DataType;
 import org.apache.flink.table.types.logical.LogicalType;
 import org.apache.flink.table.types.logical.RowType;
 import org.apache.flink.table.utils.PartitionPathUtils;
 
 import org.apache.hadoop.conf.Configuration;
-import org.apache.parquet.hadoop.ParquetOutputFormat;
 
 import java.io.IOException;
 import java.util.Arrays;
-import java.util.HashMap;
+import java.util.HashSet;
 import java.util.LinkedHashMap;
 import java.util.List;
-import java.util.Map;
 import java.util.Optional;
+import java.util.Properties;
+import java.util.Set;
 
 import static org.apache.flink.configuration.ConfigOptions.key;
 import static org.apache.flink.table.data.vector.VectorizedColumnBatch.DEFAULT_SIZE;
-import static org.apache.flink.table.descriptors.FormatDescriptorValidator.FORMAT;
 import static org.apache.flink.table.filesystem.RowPartitionComputer.restorePartValueFromType;
 
 /**
  * Parquet {@link FileSystemFormatFactory} for file system.
  */
 public class ParquetFileSystemFormatFactory implements FileSystemFormatFactory {
 
-	public static final ConfigOption<Boolean> UTC_TIMEZONE = key("format.utc-timezone")
+	public static final String IDENTIFIER = "parquet";
+
+	public static final ConfigOption<Boolean> UTC_TIMEZONE = key("utc-timezone")
 			.booleanType()
 			.defaultValue(false)
 			.withDescription("Use UTC timezone or local timezone to the conversion between epoch" +
 					" time and LocalDateTime. Hive 0.x/1.x/2.x use local timezone. But Hive 3.x" +
 					" use UTC timezone");
 
-	/**
-	 * Prefix for parquet-related properties, besides format, start with "parquet".
-	 * See more in {@link ParquetOutputFormat}.
-	 * - parquet.compression
-	 * - parquet.block.size
-	 * - parquet.page.size
-	 * - parquet.dictionary.page.size
-	 * - parquet.writer.max-padding
-	 * - parquet.enable.dictionary
-	 * - parquet.validation
-	 * - parquet.writer.version
-	 * ...
-	 */
-	public static final String PARQUET_PROPERTIES = "format.parquet";
-
 	@Override
-	public Map<String, String> requiredContext() {
-		Map<String, String> context = new HashMap<>();
-		context.put(FORMAT, "parquet");
-		return context;
+	public String factoryIdentifier() {
+		return IDENTIFIER;
 	}
 
 	@Override
-	public List<String> supportedProperties() {
-		return Arrays.asList(
-				UTC_TIMEZONE.key(),
-				PARQUET_PROPERTIES + ".*"
-		);
+	public Set<ConfigOption<?>> requiredOptions() {
+		return new HashSet<>();
 	}
 
-	private static boolean isUtcTimestamp(DescriptorProperties properties) {
-		return properties.getOptionalBoolean(UTC_TIMEZONE.key())
-				.orElse(UTC_TIMEZONE.defaultValue());
+	@Override
+	public Set<ConfigOption<?>> optionalOptions() {
+		Set<ConfigOption<?>> options = new HashSet<>();
+		options.add(UTC_TIMEZONE);
+		// support "parquet.*"

Review comment:
       Could we list all supported options here? If so, please do the same for ORC.







[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1817",
       "triggerID" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   * 7a972db935a0bcc39d7979358962538fd8790a9c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806) 
   * 8db3c2de5558ea95e7083b97f60fdcac60ef8179 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1817) 
   





[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee UNKNOWN
   





[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630001771


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 8db3c2de5558ea95e7083b97f60fdcac60ef8179 (Fri Oct 16 10:50:16 UTC 2020)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>





[GitHub] [flink] JingsongLi merged pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
JingsongLi merged pull request #12212:
URL: https://github.com/apache/flink/pull/12212


   





[GitHub] [flink] JingsongLi commented on a change in pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on a change in pull request #12212:
URL: https://github.com/apache/flink/pull/12212#discussion_r426499466



##########
File path: flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetFileSystemFormatFactory.java
##########
+	@Override
+	public Set<ConfigOption<?>> optionalOptions() {
+		Set<ConfigOption<?>> options = new HashSet<>();
+		options.add(UTC_TIMEZONE);
+		// support "parquet.*"

Review comment:
       As far as I know, that is hard, because we don't know how many options Parquet and ORC support.
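The trade-off discussed here can be sketched as follows: strict validation rejects every key that is not explicitly declared, while a prefix pass-through accepts anything under `parquet.` without enumerating it. This is a minimal illustration over plain maps, assuming full keys such as `parquet.compression`; `OptionValidationSketch` and `unknownKeys` are hypothetical and do not reflect how Flink validates options:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch, not Flink's validation logic. */
final class OptionValidationSketch {

    /**
     * Returns the keys that are neither declared nor covered by the pass-through prefix.
     * A null prefix means strict validation: only declared keys are accepted.
     */
    static Set<String> unknownKeys(
            Map<String, String> options, Set<String> declaredKeys, String passThroughPrefix) {
        Set<String> unknown = new TreeSet<>();
        for (String key : options.keySet()) {
            boolean declared = declaredKeys.contains(key);
            boolean passedThrough = passThroughPrefix != null && key.startsWith(passThroughPrefix);
            if (!declared && !passedThrough) {
                unknown.add(key);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("parquet.utc-timezone", "true");
        options.put("parquet.compression", "SNAPPY"); // forwarded to Parquet, not declared here
        Set<String> declared = new HashSet<>(Arrays.asList("parquet.utc-timezone"));

        System.out.println("strict:  " + unknownKeys(options, declared, null));
        System.out.println("lenient: " + unknownKeys(options, declared, "parquet."));
    }
}
```

The lenient variant trades early detection of misspelled keys for not having to track every option that Parquet or ORC understands.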







[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   * 7a972db935a0bcc39d7979358962538fd8790a9c Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806) 
   





[GitHub] [flink] leonardBang commented on a change in pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
leonardBang commented on a change in pull request #12212:
URL: https://github.com/apache/flink/pull/12212#discussion_r426514957



##########
File path: flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetFileSystemFormatFactory.java
##########
+	@Override
+	public Set<ConfigOption<?>> optionalOptions() {
+		Set<ConfigOption<?>> options = new HashSet<>();
+		options.add(UTC_TIMEZONE);
+		// support "parquet.*"

Review comment:
       Okay, that's fine with me.







[GitHub] [flink] flinkbot commented on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630001771


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 0ec02542ed4721376e60ea71090cbb335885e6b0 (Mon May 18 07:36:56 UTC 2020)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   * 7a972db935a0bcc39d7979358962538fd8790a9c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806) 
   * 8db3c2de5558ea95e7083b97f60fdcac60ef8179 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   * 7a972db935a0bcc39d7979358962538fd8790a9c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1817",
       "triggerID" : "8db3c2de5558ea95e7083b97f60fdcac60ef8179",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7a972db935a0bcc39d7979358962538fd8790a9c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806) 
   * 8db3c2de5558ea95e7083b97f60fdcac60ef8179 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1817) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801",
       "triggerID" : "14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806",
       "triggerID" : "7a972db935a0bcc39d7979358962538fd8790a9c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 14ab5f81d7f9d7324e3a0f7ab51d47bf9242abee Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1801) 
   * 7a972db935a0bcc39d7979358962538fd8790a9c Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1806) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] flinkbot edited a comment on pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #12212:
URL: https://github.com/apache/flink/pull/12212#issuecomment-630019032


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698",
       "triggerID" : "0ec02542ed4721376e60ea71090cbb335885e6b0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ec02542ed4721376e60ea71090cbb335885e6b0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=1698) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>



[GitHub] [flink] JingsongLi commented on a change in pull request #12212: [FLINK-17626][fs-connector] Fs connector should use FLIP-122 format options style

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on a change in pull request #12212:
URL: https://github.com/apache/flink/pull/12212#discussion_r426494943



##########
File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/FileSystemOptions.java
##########
@@ -29,6 +29,35 @@
  */
 public class FileSystemOptions {
 
+	public static final ConfigOption<String> PATH = key("path")
+			.stringType()
+			.noDefaultValue()
+			.withDescription("The path of a directory");
+
+	public static final ConfigOption<String> PARTITION_DEFAULT_NAME = key("partition.default-name")
+			.stringType()
+			.defaultValue("__DEFAULT_PARTITION__")
+			.withDescription("The default partition name in case the dynamic partition" +
+					" column value is null/empty string");
+
+	public static final ConfigOption<Long> SINK_ROLLING_POLICY_FILE_SIZE = key("sink.rolling-policy.file-size")
+			.longType()
+			.defaultValue(1024L * 1024L * 128L)
+			.withDescription("The maximum part file size before rolling (by default 128MB).");
+
+	public static final ConfigOption<Long> SINK_ROLLING_POLICY_TIME_INTERVAL = key("sink.rolling-policy.time.interval")

Review comment:
       Good catch, I think yes.
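
For readers following the options in the hunk above, here is a minimal sketch of how such keys and defaults might be resolved from a table's `with (...)` properties. It uses a plain `Map` rather than Flink's `ConfigOption` API. The class name `FileSystemOptionsSketch` and its helper methods are hypothetical, and treating `path` as required is an assumption based only on its `noDefaultValue()` declaration. The keys and default values are copied from the diff.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for option lookup; this is not Flink's ConfigOption API. */
public class FileSystemOptionsSketch {

    // Keys and defaults copied from the diff hunk under review.
    static final String PATH_KEY = "path";
    static final String PARTITION_DEFAULT_NAME_KEY = "partition.default-name";
    static final String PARTITION_DEFAULT_NAME_DEFAULT = "__DEFAULT_PARTITION__";
    static final String ROLLING_FILE_SIZE_KEY = "sink.rolling-policy.file-size";
    static final long ROLLING_FILE_SIZE_DEFAULT = 1024L * 1024L * 128L; // 128 MB in bytes

    /** Assumption: an option without a default is treated as required here. */
    static String path(Map<String, String> props) {
        String value = props.get(PATH_KEY);
        if (value == null) {
            throw new IllegalArgumentException("Missing required option '" + PATH_KEY + "'");
        }
        return value;
    }

    /** Falls back to the default partition name when the key is absent. */
    static String partitionDefaultName(Map<String, String> props) {
        return props.getOrDefault(PARTITION_DEFAULT_NAME_KEY, PARTITION_DEFAULT_NAME_DEFAULT);
    }

    /** Parses the rolling file size in bytes, or returns the 128 MB default. */
    static long rollingFileSize(Map<String, String> props) {
        String raw = props.get(ROLLING_FILE_SIZE_KEY);
        return raw == null ? ROLLING_FILE_SIZE_DEFAULT : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("path", "/tmp/t"); // illustrative value only
        System.out.println(path(props));
        System.out.println(partitionDefaultName(props));
        System.out.println(rollingFileSize(props));
    }
}
```

In the PR itself, the typed `ConfigOption` declarations carry the key, type, default, and description together, which a hand-rolled lookup like this one does not.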



