Posted to github@arrow.apache.org by "zeroshade (via GitHub)" <gi...@apache.org> on 2023/09/19 15:34:54 UTC

[GitHub] [arrow] zeroshade commented on a diff in pull request #37786: GH-35775: [Go][Parquet] Allow key value file metadata to be written after writing row groups

zeroshade commented on code in PR #37786:
URL: https://github.com/apache/arrow/pull/37786#discussion_r1330329526


##########
go/parquet/file/file_writer.go:
##########
@@ -30,21 +30,19 @@ import (
 
 // Writer is the primary interface for writing a parquet file
 type Writer struct {
-	sink           utils.WriteCloserTell
-	open           bool
-	props          *parquet.WriterProperties
-	rowGroups      int
-	nrows          int
-	metadata       metadata.FileMetaDataBuilder
-	fileEncryptor  encryption.FileEncryptor
-	rowGroupWriter *rowGroupWriter
+	sink                    utils.WriteCloserTell
+	open                    bool
+	props                   *parquet.WriterProperties
+	rowGroups               int
+	nrows                   int
+	metadata                metadata.FileMetaDataBuilder
+	fileEncryptor           encryption.FileEncryptor
+	rowGroupWriter          *rowGroupWriter
+	initialKeyValueMetadata metadata.KeyValueMetadata

Review Comment:
   Since `initialKeyValueMetadata` is only used to initialize the `FileMetaDataBuilder`, we don't actually need to keep it on the `Writer` once initialization is done.
   
   We could probably change how `WriteOption` and the configuration work by introducing a config struct like `type config struct { props *parquet.WriterProperties; initialKVMetadata metadata.KeyValueMetadata }` and having each `WriteOption` modify that struct, which is then used to initialize the writer. This would let us remove the `initialKeyValueMetadata` member from the `Writer` entirely.
   
   It's not necessary (things work just fine currently and I think this is good) but it's an idea.
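
   A rough sketch of what that could look like, to make the idea concrete. Everything here is a simplified stand-in rather than the real code: `WriterProperties`, `KeyValueMetadata`, and `fileMetaDataBuilder` are placeholder types for the `parquet` and `metadata` ones, and the option names `WithWriterProps` and `WithWriteMetadata` are only illustrative.

```go
package main

// Placeholder types standing in for parquet.WriterProperties,
// metadata.KeyValueMetadata and metadata.FileMetaDataBuilder (assumptions).
type WriterProperties struct{ CreatedBy string }
type KeyValueMetadata map[string]string
type fileMetaDataBuilder struct{ kv KeyValueMetadata }

// config only exists while the writer is being constructed.
type config struct {
	props             *WriterProperties
	initialKVMetadata KeyValueMetadata
}

// WriteOption mutates the config instead of the Writer itself.
type WriteOption func(*config)

// WithWriterProps sets the writer properties (illustrative name).
func WithWriterProps(props *WriterProperties) WriteOption {
	return func(c *config) { c.props = props }
}

// WithWriteMetadata sets the initial key-value metadata (illustrative name).
func WithWriteMetadata(meta KeyValueMetadata) WriteOption {
	return func(c *config) { c.initialKVMetadata = meta }
}

// Writer no longer carries an initialKeyValueMetadata field.
type Writer struct {
	props    *WriterProperties
	metadata *fileMetaDataBuilder
}

// NewWriter applies the options to a temporary config, then hands the
// initial metadata straight to the metadata builder.
func NewWriter(opts ...WriteOption) *Writer {
	cfg := config{props: &WriterProperties{CreatedBy: "default"}}
	for _, opt := range opts {
		opt(&cfg)
	}
	return &Writer{
		props:    cfg.props,
		metadata: &fileMetaDataBuilder{kv: cfg.initialKVMetadata},
	}
}
```

   With this shape the initial metadata lives only in the temporary `config` and in the builder, so the `Writer` struct stays the same size as before this change.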
