Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/12/22 00:52:15 UTC

[GitHub] [flink] liuml07 opened a new pull request, #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

liuml07 opened a new pull request, #21549:
URL: https://github.com/apache/flink/pull/21549

   ## What is the purpose of the change
   
   The file system SQL connector itself is included in Flink and does not require an additional dependency. However, if a user uses the filesystem connector for [local execution]({{< ref "docs/dev/dataset/local_execution" >}}),
   e.g. running a Flink job in the IDE, they will need to add the dependency explicitly. Otherwise, the job fails with a validation exception: `Cannot discover a connector using option: 'connector'='filesystem'`. This is confusing and should be documented.
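   
   For context, here is a minimal sketch of the kind of Maven dependency this change documents (the `${flink.version}` property is a placeholder; the exact snippet in the docs may differ):
   
   ```xml
   <!-- Sketch only: makes the filesystem connector discoverable during local execution,
        e.g. when running the job from the IDE. The provided scope keeps it out of the
        job's uber JAR, since the Flink distribution already ships it in /lib. -->
   <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-connector-files</artifactId>
       <version>${flink.version}</version>
       <scope>provided</scope>
   </dependency>
   ```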
   
   ## Brief change log
   
   The scope of the files connector dependency should be `provided`, because it should not be packaged into the job JAR file.
   So we did not use the `sql_download_table` shortcode like `{{< sql_download_table "files" >}}`. Also, that shortcode includes text saying the dependencies are required for the SQL Client with SQL JAR bundles, which is not applicable to the files connector as it is already shipped in the `/lib` directory.
   
   
   ## Verifying this change
   
   This is a documentation-only change; I have verified that it renders correctly locally.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
   




[GitHub] [flink] flinkbot commented on pull request #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #21549:
URL: https://github.com/apache/flink/pull/21549#issuecomment-1362266024

   ## CI report:
   
   * 3d0a250bc8391007493b1031b98a5acd4b87eae4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>




[GitHub] [flink] liuml07 commented on pull request #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

Posted by GitBox <gi...@apache.org>.
liuml07 commented on PR #21549:
URL: https://github.com/apache/flink/pull/21549#issuecomment-1362394086

   CC: @twalthr @fapaul 




[GitHub] [flink] liuml07 commented on pull request #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

Posted by GitBox <gi...@apache.org>.
liuml07 commented on PR #21549:
URL: https://github.com/apache/flink/pull/21549#issuecomment-1362264819

   [Attached screenshot "Screenshot 2022-12-21 at 4.52.34 PM.png"; the upload did not complete, so no image URL is available.]
   




[GitHub] [flink] liuml07 commented on pull request #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

Posted by GitBox <gi...@apache.org>.
liuml07 commented on PR #21549:
URL: https://github.com/apache/flink/pull/21549#issuecomment-1362629142

   Yeah, documenting centrally sounds good.
   
   Maybe I'm limited by how I build jobs with connectors: I shade all connectors (and their dependencies) into the uber job JAR. For other connectors (e.g. Kafka), I add the dependency to the Flink job following the Maven snippet on each connector's doc page, and that works for both local execution (IDE) and remote deployment. The filesystem connector is a bit special because it is part of the Flink distribution (so there is no need to shade it) but is not usable for local execution out of the box. Adding the dependency with "provided" scope for this connector solves my problem; I have not found that any other connector's dependency needs to change for local execution. A sketch of the contrast is below.
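   
   To make the contrast concrete, here is a hedged sketch of the two cases from my build (the `${flink.version}` property and the use of it for both artifacts are assumptions about my setup, not part of the Flink docs):
   
   ```xml
   <!-- Kafka connector: default (compile) scope, shaded into the uber job JAR,
        used both in the IDE and on the cluster. -->
   <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-connector-kafka</artifactId>
       <version>${flink.version}</version>
   </dependency>
   
   <!-- Files connector: provided scope, visible for local execution in the IDE,
        but excluded from the uber JAR because the Flink distribution ships it in /lib. -->
   <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-connector-files</artifactId>
       <version>${flink.version}</version>
       <scope>provided</scope>
   </dependency>
   ```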
   
   I'm thinking about where a good central place would be. There is a [short guide](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/configuration/advanced/#hadoop-dependencies) for setting Hadoop dependencies for local execution. Do you think it's a good idea to add a new section to the [Connectors and Formats](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/configuration/connector/) page or the [Advanced Configuration Topics](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/configuration/advanced/) page?
   
   




[GitHub] [flink] liuml07 commented on a diff in pull request #21549: [FLINK-30479][doc] Document flink-connector-files for local execution

Posted by GitBox <gi...@apache.org>.
liuml07 commented on code in PR #21549:
URL: https://github.com/apache/flink/pull/21549#discussion_r1054985251


##########
docs/content/docs/connectors/table/filesystem.md:
##########
@@ -33,6 +33,18 @@ The file system connector itself is included in Flink and does not require an ad
 The corresponding jar can be found in the Flink distribution inside the `/lib` directory.
 A corresponding format needs to be specified for reading and writing rows from and to a file system.
 
+NOTE: If you use the filesystem connector for [local execution]({{< ref "docs/dev/dataset/local_execution" >}}),
+e.g. running a Flink job in your IDE, you will need to add the following dependency.
+
+```xml

Review Comment:
   In the PR description, we mentioned why this does not use the `sql_download_table` shortcode like `{{< sql_download_table "files" >}}`.


