You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Yida Wu (Code Review)" <ge...@cloudera.org> on 2020/11/13 18:10:07 UTC

[Impala-ASF-CR] IMPALA-9867: Add Support for Spilling to S3: Milestone 1

Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
......................................................................


Patch Set 19:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc
File be/src/runtime/io/hdfs-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@38
PS19, Line 38:   if (expected_local_) {
> Can we remove "expected_local_" altogether from this class?
I think I can remove the whole file for now, the HdfsFileWriter is not necessarily used by any class.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@68
PS19, Line 68:     range->SetOffset(written_bytes_);
> written_bytes_ is updated when 'UpdateWrittenSize' is called in next line. 
The entire file deleted.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@77
PS19, Line 77:   DCHECK(hdfs_file_ != nullptr);
> Should we also acquire lock_ here?
The entire file deleted.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-system.cc
File be/src/runtime/io/local-file-system.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-system.cc@101
PS19, Line 101:   if (bytes_read < length) {
> I am not sure if this is always the case? Do we always allocate buffer the 
Yes, currently we assume it can read the same length of data as the buffer for normal cases, since the file size is always integer times of the buffer size. But it might not be a good implementation, I will leave a comment in case someday we need to support the other cases.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.h
File be/src/runtime/io/local-file-writer.h:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.h@48
PS19, Line 48:   Status WritePadding(WriteRange* range, int64_t padding_size);
> Since we are adding padding to the file, how do we determine how much to re
The offset and length of the read ranges are from the write ranges, so if the parameters of the write ranges are set correctly, no padding data of a file could be read.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc
File be/src/runtime/io/local-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc@51
PS19, Line 51:   range->SetOffset(written_bytes_);
> Similar comment as before. I don't think `written_bytes_` has the updated v
The written_bytes_ is the current offset of a file, so the logic here is to set the current offset, which is the actual offset, to the write range. I may change the name to make it easier to understand.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc@87
PS19, Line 87: Status LocalFileWriter::WriteOne(WriteRange* write_range) {
> Maybe add a comment, explaining why the lock_ is not acquired in this funct
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 19
Gerrit-Owner: Yida Wu <wy...@gmail.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wy...@gmail.com>
Gerrit-Comment-Date: Fri, 13 Nov 2020 18:10:07 +0000
Gerrit-HasComments: Yes