You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/07/29 17:54:33 UTC

[GitHub] [hive] bmaidics opened a new pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

bmaidics opened a new pull request #1330:
URL: https://github.com/apache/hive/pull/1330


   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r475177713



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));
+          boolean isRawFormat  = !CollectionUtils.isEqualCollection(reader.getSchema().getFieldNames(), ALL_ACID_ROW_NAMES);
+          int fileFormat = isRawFormat ? 0 : 2;

Review comment:
       Please use enum on the Java side, but on thrift side stick to strings/ints whatever. Again, when upgrading HMS API enum in the thrift can cause issues




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467012433



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));
+          boolean isRawFormat  = !CollectionUtils.isEqualCollection(reader.getSchema().getFieldNames(), ALL_ACID_ROW_NAMES);
+          int fileFormat = isRawFormat ? 0 : 2;
+          response.addToFileListData(createFileStatus(fStatus, p.getSd().getLocation(), fileFormat));
+        }
+        success = true;
+      } catch (Exception e) {
+        ex = e;
+        LOG.error("Failed to list files with error: " + e.getMessage());
+        throw new MetaException(e.getMessage());
+      } finally {
+        endFunction("get_file_list", success, ex);
+      }
+      return response;
+    }
+
+    private ByteBuffer createFileStatus(FileStatus fileStatus, String basePath, int fileFormat) {

Review comment:
       Noted, I'll add javadoc here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467058365



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {

Review comment:
       This make sense as well. I'll add an enum: FileMetadataType, which this GetFileListResponse (I'll probably rename) will contain as a field.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] github-actions[bot] closed pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #1330:
URL: https://github.com/apache/hive/pull/1330


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r475177465



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {
+  1: optional list<binary> fileListData,
+  2: optional i32 fbVersionNumber

Review comment:
       AFAIK we try to keep every field optional in the HMS APIs so we will be able to change them if needed. I specifically asked Barna to do this. Please Vihang correct me if I was mistaken. Thanks, Peter 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] vihangk1 commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
vihangk1 commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r468832097



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,

Review comment:
       I think there is a trade-off here. On larger tables with lots of partitions, doing multiple RPCs to the metastore for fetching the file-metadata one at a time not only is less efficient, it is likely that the ValidWriteIdList is updated for the table during the time and the cache hit ratio could go down. You are right about large data sent over network. In my experience the file-metadata which we are sending here is few hundred bytes per partition and the its not very large even for few thousands of the partition. If use a partitionNames list here in the request, clients can always do batching like requesting 1000 partitions at a time which would be more efficient.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467005926



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {

Review comment:
       I agree with the naming. The existing getFileMetadata (AFAIK) returns file metadata based on fileId. It would not be good for our use-case (replacing recursive listing during getAcidState). 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467009533



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -240,6 +247,20 @@
   private static ZooKeeperHiveHelper zooKeeperHelper = null;
   private static String msHost = null;
 
+  static final String OPERATION_FIELD_NAME = "operation";

Review comment:
       Noted, I'll add some comments




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] vihangk1 commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
vihangk1 commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r466575514



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {

Review comment:
       This naming of this struct could be more generic (may be a better name could be GetFileMetadataRequest). Also did you consider reusing existing/extending getFileMetadata HMS API?

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));

Review comment:
       Does this assume that the request is always for a ORC table?

##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {

Review comment:
       I think it will be useful to have a separate struct defined for the FileMetadata which also includes a type field. For instance, I can see this being useful for various engines and the FileMetadata format could be different for each engine. For instance, engine1 may require FileStatus, BlockInformation while engine2 only is interested in FileStatus, FilemodificationTime, while there is another file-metadata type for ACID state which includes some ACID specific information like fileformat that you used below.

##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -2454,10 +2467,14 @@ PartitionsResponse get_partitions_req(1:PartitionsRequest req)
   // partition keys in new_part should be the same as those in old partition.
   void rename_partition(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4:Partition new_part)
                        throws (1:InvalidOperationException o1, 2:MetaException o2)
-  
+
   RenamePartitionResponse rename_partition_req(1:RenamePartitionRequest req)
                        throws (1:InvalidOperationException o1, 2:MetaException o2)
 
+  // Returns a file list using FlatBuffers as serialization
+  GetFileListResponse get_file_list(1:GetFileListRequest req)
+    throws(1:MetaException o1)

Review comment:
       Seems like if the partition doesn't exist you might need to throw a NoSuchObjectFoundException as well.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));
+          boolean isRawFormat  = !CollectionUtils.isEqualCollection(reader.getSchema().getFieldNames(), ALL_ACID_ROW_NAMES);
+          int fileFormat = isRawFormat ? 0 : 2;

Review comment:
       I think fileFormat having 0 or 2 looks very arbitary. We should atleast define the valid values of this field and what each value means.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));
+          boolean isRawFormat  = !CollectionUtils.isEqualCollection(reader.getSchema().getFieldNames(), ALL_ACID_ROW_NAMES);
+          int fileFormat = isRawFormat ? 0 : 2;
+          response.addToFileListData(createFileStatus(fStatus, p.getSd().getLocation(), fileFormat));
+        }
+        success = true;
+      } catch (Exception e) {
+        ex = e;
+        LOG.error("Failed to list files with error: " + e.getMessage());
+        throw new MetaException(e.getMessage());
+      } finally {
+        endFunction("get_file_list", success, ex);
+      }
+      return response;
+    }
+
+    private ByteBuffer createFileStatus(FileStatus fileStatus, String basePath, int fileFormat) {

Review comment:
       Please add a java doc here. Also, would be good to attribute the source where this code is from with a link to the file.

##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,

Review comment:
       Does this mean the file list is retrieved on 1 partition at a time? It seems like we can have bulk retrieval for a given list of partition names?

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -240,6 +247,20 @@
   private static ZooKeeperHiveHelper zooKeeperHelper = null;
   private static String msHost = null;
 
+  static final String OPERATION_FIELD_NAME = "operation";

Review comment:
       Some documentation of these constants will help improve the readability.

##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {
+  1: optional list<binary> fileListData,
+  2: optional i32 fbVersionNumber

Review comment:
       It would be good to have a comment here saying what the version number represents. Also, fbVersionNumber sounds like its a flatbuffer specific version number? Is that correct? Why does this field need to be optional?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r462178379



##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
##########
@@ -191,6 +191,20 @@ void dropCatalog(String catName)
   List<String> getTables(String dbName, String tablePattern)
       throws MetaException, TException, UnknownDBException;
 
+
+  /**
+   * Get the files from the filesystem that the specified table/partition contains

Review comment:
       nit: javadoc complains if there is no '.' at the end of the first sentence :)

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/fb/FbFileStatus.java
##########
@@ -0,0 +1,51 @@
+// automatically generated by the FlatBuffers compiler, do not modify

Review comment:
       Rat check did not complain?

##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,18 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {
+  1: required list<binary> fileListData

Review comment:
       Should be optional, so if we decide to change something then we can change easily.
   
   Maybe based on @vihangk1 's suggestion, we should add a version number here for the FB data. So if we change content of the binary later it could be done easily

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -75,11 +75,10 @@
 import com.google.common.base.Suppliers;
 import com.google.common.collect.Lists;
 
+import com.google.flatbuffers.FlatBufferBuilder;
 import org.apache.commons.cli.OptionBuilder;
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.*;

Review comment:
       nit: Please avoid '*' imports

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5685,63 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);

Review comment:
       Uhhh... this will be costly - any way to avoid it?

##########
File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetFileList.java
##########
@@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.client;
+
+import org.apache.hadoop.fs.*;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.annotation.MetastoreCheckinTest;
+import org.apache.hadoop.hive.metastore.api.*;
+import org.apache.hadoop.hive.metastore.client.builder.DatabaseBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.PartitionBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.TableBuilder;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.hive.metastore.fb.FbFileStatus;
+import org.apache.hadoop.hive.metastore.minihms.AbstractMetaStoreService;
+import org.junit.*;

Review comment:
       nit: avoid wildcard imports please

##########
File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetFileList.java
##########
@@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.client;
+
+import org.apache.hadoop.fs.*;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.annotation.MetastoreCheckinTest;
+import org.apache.hadoop.hive.metastore.api.*;
+import org.apache.hadoop.hive.metastore.client.builder.DatabaseBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.PartitionBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.TableBuilder;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.hive.metastore.fb.FbFileStatus;
+import org.apache.hadoop.hive.metastore.minihms.AbstractMetaStoreService;
+import org.junit.*;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import java.nio.ByteBuffer;
+import java.nio.file.Files;
+import java.util.*;

Review comment:
       nit: avoid wildcard imports please

##########
File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetFileList.java
##########
@@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.client;
+
+import org.apache.hadoop.fs.*;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.annotation.MetastoreCheckinTest;
+import org.apache.hadoop.hive.metastore.api.*;
+import org.apache.hadoop.hive.metastore.client.builder.DatabaseBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.PartitionBuilder;
+import org.apache.hadoop.hive.metastore.client.builder.TableBuilder;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.hive.metastore.fb.FbFileStatus;
+import org.apache.hadoop.hive.metastore.minihms.AbstractMetaStoreService;
+import org.junit.*;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import java.nio.ByteBuffer;
+import java.nio.file.Files;
+import java.util.*;
+
+/**
+ * Tests for listing files.
+ */
+@RunWith(Parameterized.class)
+@Category(MetastoreCheckinTest.class)
+public class TestGetFileList extends MetaStoreClientTest {
+  private AbstractMetaStoreService metaStore;
+  private IMetaStoreClient client;
+  private FileSystem fs;
+
+  private static String hiveWarehouse;
+
+  protected static final String DB_NAME = "test_getfile_db";
+  protected static final String TABLE_NAME = "test_getfile_table";
+  protected static final String FILE_NAME = "sample";
+  protected static final String PARTITION1 = "partcol1";
+  protected static final String PARTITION2 = "partcol2";
+  protected static final int NUM_PARTITIONS = 2;
+  protected static final int NUM_FILES_IN_PARTITION  = 5;
+
+
+  public TestGetFileList(String name, AbstractMetaStoreService metaStore) {
+    this.metaStore = metaStore;
+  }
+
+  @BeforeClass
+  public static void startMetaStores() {
+    Map<ConfVars, String> msConf = new HashMap<>();
+    Map<String, String> extraConf = new HashMap<>();
+    extraConf.put(ConfVars.HIVE_IN_TEST.getVarname(), "true");
+    startMetaStores(msConf, extraConf);
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Get new client
+    client = metaStore.getClient();
+
+    // Clean up the database
+    client.dropDatabase(DB_NAME, true, true, true);
+    metaStore.cleanWarehouseDirs();
+
+    //Create files in local filesystem
+    fs = new LocalFileSystem();
+    java.nio.file.Path tempDirWithPrefix = Files.createTempDirectory("GetFileTest_");
+    hiveWarehouse = tempDirWithPrefix.toString();
+    fs.initialize(tempDirWithPrefix.toUri(), metaStore.getConf());
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    try {
+        client.close();
+    } finally {
+      client = null;
+    }
+  }
+
+  protected AbstractMetaStoreService getMetaStore() {
+    return metaStore;
+  }
+
+  protected IMetaStoreClient getClient() {
+    return client;
+  }
+
+  protected void setClient(IMetaStoreClient client) {
+    this.client = client;
+  }
+
+  @Test
+  public void testGetFileListResultCountOnePartition() throws Exception {
+    createDatabaseOnePartition();
+    List<String> partVals = new ArrayList<>();
+    partVals.add("a1");
+    GetFileListResponse resp = getClient().getFileList(null, DB_NAME, TABLE_NAME, partVals,  "");
+    Assert.assertEquals(NUM_FILES_IN_PARTITION, resp.getFileListDataSize());
+  }
+
+  @Test
+  public void testGetFileListResultOnePartition() throws Exception {
+    createDatabaseOnePartition();
+    List<String> partVals = new ArrayList<>();
+    partVals.add("a1");
+    GetFileListResponse resp = getClient().getFileList(null, DB_NAME, TABLE_NAME, partVals,  "");
+    List<String> expectedPaths = new ArrayList<>();
+    for (int i = 0; i < NUM_FILES_IN_PARTITION; i++) {
+      expectedPaths.add("/" + FILE_NAME + i);
+    }
+
+    List<String> actualPaths = new ArrayList<>();
+    for (ByteBuffer buffer: resp.getFileListData()) {
+      FbFileStatus fileStatus = FbFileStatus.getRootAsFbFileStatus(buffer);
+      actualPaths.add(fileStatus.relativePath());
+    }
+    Collections.sort(actualPaths);
+    Assert.assertArrayEquals(expectedPaths.toArray(), actualPaths.toArray());
+  }
+
+  @Test
+  public void testGetFileListResultMultiPartition() throws Exception {
+    createDatabaseMultiPartition();
+    List<String> partVals = new ArrayList<>();
+    partVals.add("a1");
+    partVals.add("1");
+    GetFileListResponse resp = getClient().getFileList(null, DB_NAME, TABLE_NAME, partVals,  "");
+    List<String> expectedPaths = new ArrayList<>();
+    for (int i = 0; i < NUM_FILES_IN_PARTITION; i++) {
+      expectedPaths.add("/" + FILE_NAME + i);
+    }
+
+    List<String> actualPaths = new ArrayList<>();
+    for (ByteBuffer buffer: resp.getFileListData()) {
+      FbFileStatus fileStatus = FbFileStatus.getRootAsFbFileStatus(buffer);
+      actualPaths.add(fileStatus.relativePath());
+    }
+    Collections.sort(actualPaths);
+    Assert.assertArrayEquals(expectedPaths.toArray(), actualPaths.toArray());
+  }
+
+  @Test
+  public void testGetFileListNoResult() throws Exception {
+    createDatabaseOnePartition();
+    List<String> partVals = new ArrayList<>();
+    partVals.add("a2");
+    Assert.assertThrows(MetaException.class,
+            () -> getClient().getFileList(null, DB_NAME, TABLE_NAME, partVals,  ""));
+  }
+
+
+
+  private void createDatabaseOnePartition() throws Exception {
+    String path = hiveWarehouse;
+
+    Database db = new DatabaseBuilder().
+            setName(DB_NAME).
+            setLocation(path).

Review comment:
       Why do you need to set the location?
   Weren't default location working for this test?

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5685,63 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);

Review comment:
       Will be solved by the cache




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r466993241



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,

Review comment:
       Yes, it can retrieve for 1 partition at a time. It can handle multi-level partitions as well. Do you think it would make sense to support returning file metadata for a list of partition values? It can lead to a huge amount of data sent over the network and the cache can be filled easily with just a few entries.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467008407



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -2454,10 +2467,14 @@ PartitionsResponse get_partitions_req(1:PartitionsRequest req)
   // partition keys in new_part should be the same as those in old partition.
   void rename_partition(1:string db_name, 2:string tbl_name, 3:list<string> part_vals, 4:Partition new_part)
                        throws (1:InvalidOperationException o1, 2:MetaException o2)
-  
+
   RenamePartitionResponse rename_partition_req(1:RenamePartitionRequest req)
                        throws (1:InvalidOperationException o1, 2:MetaException o2)
 
+  // Returns a file list using FlatBuffers as serialization
+  GetFileListResponse get_file_list(1:GetFileListRequest req)
+    throws(1:MetaException o1)

Review comment:
       You're right, I forgot to add it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r462284542



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,18 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {
+  1: required list<binary> fileListData

Review comment:
       Made it optional and added version number




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467010382



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));

Review comment:
       Currently yes. But I think it makes sense to check if it's ORC first. If not, just leave the fileFormat field empty. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] github-actions[bot] commented on pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1330:
URL: https://github.com/apache/hive/pull/1330#issuecomment-727669192


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467011860



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -5685,6 +5706,67 @@ private void alter_table_core(String catName, String dbname, String name, Table
       }
     }
 
+    @Override
+    public GetFileListResponse get_file_list(GetFileListRequest req) throws MetaException {
+      String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf);
+      String dbName = req.getDbName();
+      String tblName = req.getTableName();
+      List<String> partitions = req.getPartVals();
+      // Will be used later, when cache is introduced
+      String validWriteIdList = req.getValidWriteIdList();
+
+      startFunction("get_file_list", ": " + TableName.getQualified(catName, dbName, tblName)
+              + ", partitions: " + partitions.toString());
+
+
+      GetFileListResponse response = new GetFileListResponse();
+
+      boolean success = false;
+      Exception ex = null;
+      try {
+        Partition p =  getMS().getPartition(catName, dbName, tblName, partitions);
+        Path path = new Path(p.getSd().getLocation());
+
+        FileSystem fs = path.getFileSystem(conf);
+        RemoteIterator<LocatedFileStatus> itr = fs.listFiles(path, true);
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Reader reader = OrcFile.createReader(fStatus.getPath(), OrcFile.readerOptions(fs.getConf()));
+          boolean isRawFormat  = !CollectionUtils.isEqualCollection(reader.getSchema().getFieldNames(), ALL_ACID_ROW_NAMES);
+          int fileFormat = isRawFormat ? 0 : 2;

Review comment:
       My first thought was that is should be a boolean. But after some discussion, if we want to add new versions, or fileFormats (like we release ACID v3), it'd make sense not to store it in boolean in the first place. But you're right, for more readable code, we can use enums.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] bmaidics commented on a change in pull request #1330: HIVE-23890: Create HMS endpoint for querying file lists using FlatBuf…

Posted by GitBox <gi...@apache.org>.
bmaidics commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r467007215



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,
+  6: optional string validWriteIdList
+}
+
+struct GetFileListResponse {
+  1: optional list<binary> fileListData,
+  2: optional i32 fbVersionNumber

Review comment:
       I agree. Here I wanted to express, that the version represents the fields in the flatbuffer file. In the first version, we'd like to store relative_path, length, last_modification_time, and file_format, but in future versions, we might want to add more fields. But you're right, this naming is confusing. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org