Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 21:45:45 UTC

[GitHub] [beam] damccorm opened a new issue, #21165: Support to submit Jobs using HBaseIO to DataflowRunner without local access to HBase Cluster

damccorm opened a new issue, #21165:
URL: https://github.com/apache/beam/issues/21165

   *****Context*****
   
   As of today, HBaseIO contacts the HBase cluster while building the execution graph, for example to validate that the table exists.
   
   https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L237
   
   In some scenarios, Dataflow jobs are launched from systems that do not have network access to the HBase cluster during graph construction; the cluster is reachable only at execution time on Google Cloud. Because HBaseIO currently requires this local access, a job can be launched only from a system that has network access to the HBase cluster.
   
   *****Requirement*****
   
    Modify HBaseIO to accept a flag (say, hasLocalAccess). If the flag is set to false, defer validation, split calculation, and similar logic to job execution time rather than job construction time.
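
To make the request concrete, here is a minimal sketch of the deferral idea in plain Java. It does not use Beam or the HBase client, and it is not how HBaseIO is implemented. The class `DeferredValidationSketch`, its methods, and the `tableExists` predicate (a stand-in for a real cluster call) are all hypothetical names chosen for illustration; only the flag name `hasLocalAccess` comes from the proposal above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration only; not part of Beam or HBaseIO. */
final class DeferredValidationSketch {
  private final String tableId;
  private final boolean hasLocalAccess;
  // Assumption: stands in for a network call that asks the cluster whether the table exists.
  private final Predicate<String> tableExists;
  // Records which phase performed (or skipped) the check, for demonstration.
  final List<String> log = new ArrayList<>();

  DeferredValidationSketch(
      String tableId, boolean hasLocalAccess, Predicate<String> tableExists) {
    this.tableId = tableId;
    this.hasLocalAccess = hasLocalAccess;
    this.tableExists = tableExists;
  }

  /** Graph construction time: contact the cluster only if the launcher can reach it. */
  void validateAtConstruction() {
    if (!hasLocalAccess) {
      log.add("construction: skipped");
      return;
    }
    checkTable("construction");
  }

  /** Execution time (on a worker): run the check that was deferred earlier. */
  void setupAtExecution() {
    if (hasLocalAccess) {
      log.add("execution: already validated");
      return;
    }
    checkTable("execution");
  }

  private void checkTable(String phase) {
    if (!tableExists.test(tableId)) {
      throw new IllegalArgumentException("Table " + tableId + " does not exist");
    }
    log.add(phase + ": validated");
  }
}
```

With `hasLocalAccess` set to false, `validateAtConstruction()` never invokes the predicate, so the launching system needs no route to the cluster; a missing table then surfaces as a failure at execution time instead of at submission time, which is the trade-off this proposal accepts.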
   
    
   
   Imported from Jira [BEAM-13141](https://issues.apache.org/jira/browse/BEAM-13141). Original Jira may contain additional context.
   Reported by: prathapreddy22.

