Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/05 05:51:38 UTC

[GitHub] [hudi] kartik18 commented on issue #5211: [SUPPORT] Glob pattern to pick specific subfolders not working while reading in Spark

kartik18 commented on issue #5211:
URL: https://github.com/apache/hudi/issues/5211#issuecomment-1088294929

   @nsivabalan How would I read Hudi files from multiple locations? Given the folder structure below, I tried to use a glob/regex pattern on the path, but that didn't work out.
   
   Just like CSV, where we can pass multiple paths to spark.read.csv(*paths), is there something similar in Hudi?
   
   Folder
   |_ cluster=abc_$folder_   // this is a file, not a directory
   |_ cluster=abc
   |    |_ dt=2022-01-01
   |         |_ A1.parquet
   |         |_ A2.parquet
   |_ cluster=efg_$folder_
   |_ cluster=efg
        |_ dt=2022-01-01
             |_ B1.parquet
             |_ B2.parquet
   
   Given this folder structure, how would I read the Hudi files?
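   
   To make the question concrete, here is a rough PySpark sketch of what I am attempting. The bucket name and base path are placeholders, and I don't know whether the glob form or the list-of-paths form is actually supported by the Hudi datasource, which is essentially what I am asking:
   
       from pyspark.sql import SparkSession
   
       spark = SparkSession.builder.appName("hudi-multi-path-read").getOrCreate()
   
       # Placeholder base path pointing at the "Folder" structure above
       base = "s3://my-bucket/Folder"
   
       # Attempt 1: glob over both partition levels (cluster=* and dt=*),
       # with a trailing * for the data files inside each leaf partition
       df_glob = spark.read.format("hudi").load(f"{base}/cluster=*/dt=*/*")
   
       # Attempt 2: pass the partition paths explicitly, the way spark.read.csv
       # accepts a list of paths
       paths = [
           f"{base}/cluster=abc/dt=2022-01-01",
           f"{base}/cluster=efg/dt=2022-01-01",
       ]
       df_paths = spark.read.format("hudi").load(paths)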


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org