You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Riza Suminto (Code Review)" <ge...@cloudera.org> on 2021/09/16 22:38:08 UTC

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17853


Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupeStream is set up with attach_on_read = true,
BufferedTupeStream::NextReadPage() expect that the page buffer is
attached to the output row batch before advancing the read iterator.
However, BufferedTupleStream::GetNextInternal() will not attach the page
buffer of a completely read page if it is a read-write page.
BufferedTupeStream::UnpinStream() need to defer advancing the read page
if this situation happens.

Testing:
- Add be test SimpleTupleStreamTest.DeferAdvancingUnattachedReadPage
  that simulate the corner case.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 80 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17853/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

(7 comments)

Looks great! Thanks a lot for the refactoring which makes the new code enjoyable to read.

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1388
PS2, Line 1388: eservation to alloc
> This is now addressed by:
Done


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1397
PS2, Line 1397: 
> I believe this is now addressed by TestUnpinAfterFullStreamRead with attach
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@573
PS3, Line 573: StreamStateTest
cool class!


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@575
PS3, Line 575: defer
nit. missing s.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1455
PS3, Line 1455: removed from stream.pages_ 
Can we assert on this assertion here:

if (!read_write && attach_on_read && stream.pinned()) ASSERT_TRUE(stream.pages_ > 0);


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1484
PS3, Line 1484: num_pinned_pages = 0;
              :       } else {
              :         num_pinned_pages = 0;
num_pinned_pages = 0 can be moved out of the IF.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2220
PS3, Line 2220: read-only stream
Maybe add a comment in the code to explain why these two combos are not tested:  (false, false, true) and (false, true, true).



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:22:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
            : attach the page buffer of a fully read page if it is a read-write page.
            : 
> The attachment that is NOT invoked is at https://github.com/apache/impala/b
I describe the scenario that lead to the assertion failure in patch set 4.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@573
PS3, Line 573: StreamStateTest
> cool class!
Thanks!


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@575
PS3, Line 575: defer
> nit. missing s.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1450
PS3, Line 1450: 
> nit: this can be simplified to be
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1455
PS3, Line 1455: 
> Can we assert on this assertion here:
Added the assertion, but rephrasing it a bit.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1484
PS3, Line 1484:   num_pinned_pages = 1;
              :           ASSERT_TRUE(stream.pages_.back().is_pinned());
              :         }
> num_pinned_pages = 0 can be moved out of the IF.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1490
PS3, Line 1490:       }
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2220
PS3, Line 2220: nAfterFullStream
> Maybe add a comment in the code to explain why these two combos are not tes
Those combos are not tested because, as far as I know, read-only stream can not return to writing state after it gets into reading state.
I add an explanation message in the first DCHECK at StreamStateTest::TestUnpinAfterFullStreamRead().


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2213
PS3, Line 2213: }
              : 
              : TEST_F(StreamStateTest, UnpinFullyExhaustedReadPageOnReadWriteStreamNoAttachRefill) {
              :   TestUnpinAfterFullStreamRead(true, false, true);
              : }
              : 
              : TEST_F(StreamStateTest, UnpinFullyExhaustedReadPageOnReadWriteStreamNoAttachNoRefill) {
              :   TestUnpinAfterFullStreamRead(true, false, false);
              : }
              : 
> nit: I'd suggest spliting these into 6 tests so it'd be easier to locate th
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@724
PS3, Line 724:           && (num_pages_ <= 2 || !read_it_.read_page_->attached_to_output_batch)) {
> nit: we can merge these two if-clauses into one.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@730
PS3, Line 730:         //    reservation if the stream ended up with only 1 read/write page after
> Could you update the comments to mention the new case?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:57:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 4: Code-Review+1

Looks great!


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:38:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupleStream is a read-write stream and set up with
attach_on_read = true, BufferedTupleStream::NextReadPage() expects that
the page buffer is attached to the output row batch before advancing the
read iterator. However, BufferedTupleStream::GetNextInternal() will not
attach the page buffer of a fully read page if it is a read-write page.

Consider the following scenario:
1. Only 1 page left in stream. This is a read-write page.
2. GetNext() has fully read the page, but does NOT attach the buffer to
   output row batch because it is a read-write page.
3. Stream writer insert more rows, but the read-write page can not fit
   any more rows. Therefore, new pages are created.
4. Stream writer call UnpinStream().
5. UnpinStream() call NextReadPage(), which in turn will fail the
   assertion "read_iter->read_page_->attached_to_output_batch".

BufferedTupleStream::UnpinStream() need to defer advancing the read page
if this situation happens.

This patch adds BE test StreamStateTest.UnpinFullyExhaustedReadPage that
simulates the corner case. This patch also moves BE test
DeferAdvancingReadPage and ShortDebugString into class StreamStateTest
to reduce friend class declaration in buffered-tuple-stream.h

Testing:
- Run and pass BE test StreamStateTest.UnpinFullyExhaustedReadPage.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Reviewed-on: http://gerrit.cloudera.org:8080/17853
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 244 insertions(+), 19 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
            : attach the page buffer of a completely read page if it is a read-write
            : page.
Isn't this done at https://github.com/apache/impala/blob/35b21083b1866b7056e3810ae5a8daf7bc77ddda/be/src/runtime/buffered-tuple-stream.cc#L812 ? Or do you mean the case when the writer unpins the stream before the reader goes to this line?


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1450
PS3, Line 1450:     bool attached = !read_write && attach_on_read ? true : false;
nit: this can be simplified to be

 bool attached = !read_write && attach_on_read;


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2213
PS3, Line 2213:   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, true, false, true);
              :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, true, false, false);
              : 
              :   // Test read-write stream with attach_on_read.
              :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, true, true, true);
              :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, true, true, false);
              : 
              :   // Test read-only stream both with and without attach_on_read.
              :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, false, false, false);
              :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, false, true, false);
nit: I'd suggest spliting these into 6 tests so it'd be easier to locate the failed case if any of them fail.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@724
PS3, Line 724:         if (num_pages_ <= 2 || !read_it_.read_page_->attached_to_output_batch) {
nit: we can merge these two if-clauses into one.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@730
PS3, Line 730:           // (see IMPALA-10584).
Could you update the comments to mention the new case?



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:06:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Qifan Chen, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17853

to look at the new patch set (#4).

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupleStream is a read-write stream and set up with
attach_on_read = true, BufferedTupleStream::NextReadPage() expects that
the page buffer is attached to the output row batch before advancing the
read iterator. However, BufferedTupleStream::GetNextInternal() will not
attach the page buffer of a fully read page if it is a read-write page.

Consider the following scenario:
1. Only 1 page left in stream. This is a read-write page.
2. GetNext() has fully read the page, but does NOT attach the buffer to
   output row batch because it is a read-write page.
3. Stream writer insert more rows, but the read-write page can not fit
   any more rows. Therefore, new pages are created.
4. Stream writer call UnpinStream().
5. UnpinStream() call NextReadPage(), which in turn will fail the
   assertion "read_iter->read_page_->attached_to_output_batch".

BufferedTupleStream::UnpinStream() need to defer advancing the read page
if this situation happens.

This patch adds BE test StreamStateTest.UnpinFullyExhaustedReadPage that
simulates the corner case. This patch also moves BE test
DeferAdvancingReadPage and ShortDebugString into class StreamStateTest
to reduce friend class declaration in buffered-tuple-stream.h

Testing:
- Run and pass BE test StreamStateTest.UnpinFullyExhaustedReadPage.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 244 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17853/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9466/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Sep 2021 22:59:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 18:48:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Qifan Chen, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17853

to look at the new patch set (#3).

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupleStream is a read-write stream and set up with
attach_on_read = true, BufferedTupleStream::NextReadPage() expects that
the page buffer is attached to the output row batch before advancing the
read iterator. However, BufferedTupleStream::GetNextInternal() will not
attach the page buffer of a completely read page if it is a read-write
page. BufferedTupleStream::UnpinStream() need to defer advancing the
read page if this situation happens.

This patch adds BE test StreamStateTest.UnpinFullyExhaustedReadPage that
simulates the corner case. This patch also moves BE test
DeferAdvancingReadPage and ShortDebugString into class StreamStateTest
to reduce friend class declaration in buffered-tuple-stream.h

Testing:
- Run and pass BE test StreamStateTest.UnpinFullyExhaustedReadPage.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 214 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17853/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
            : attach the page buffer of a completely read page if it is a read-write
            : page.
> Isn't this done at https://github.com/apache/impala/blob/35b21083b1866b7056
The attachment that is NOT invoked is at https://github.com/apache/impala/blob/35b21083b1866b7056e3810ae5a8daf7bc77ddda/be/src/runtime/buffered-tuple-stream.cc#L888
Notice the condition 'if (!has_read_write_page())' right above it.



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 15:42:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

(3 comments)

Thank you for the feedback!
After expanding the test case, I found that fix from patch set 2 is still incorrect.
Delayed advancement of read page should only allowed in read-write stream.
Patch set 3 add checks for 'has_write_iterator_' to make sure that the stream is indeed a read-write stream.

http://gerrit.cloudera.org:8080/#/c/17853/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/2//COMMIT_MSG@10
PS2, Line 10: Stream
> nit. expects
Done


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1388
PS2, Line 1388: eservation to alloc
> May need to add the following additional test cases 
This is now addressed by:
1. TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, false, true, false);
2. TestUnpinAfterFullStreamRead(num_rows, buffer_size, max_num_pages, true, false, ...);


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1397
PS2, Line 1397: 
> May need to test #pages == 1.
I believe this is now addressed by TestUnpinAfterFullStreamRead with attach_on_read=true and refill_before_unpin=false.
With this argument, the stream only have 1 page left before UnpinStream() is called.



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 17:40:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

The fix makes sense to me. Thanks for digging into this! Carrying Qifan's +1 and mine to +2.

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
            : attach the page buffer of a fully read page if it is a read-write page.
            : 
> I describe the scenario that lead to the assertion failure in patch set 4.
I see. Thanks for adding more details!



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 12:32:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 12:33:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 2:

(3 comments)

Looks good!. 

Some additional test cases may be needed to assure the new logic is sound.

http://gerrit.cloudera.org:8080/#/c/17853/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/2//COMMIT_MSG@10
PS2, Line 10: expect
nit. expects


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1388
PS2, Line 1388: PrepareForReadWrite
May need to add the following additional test cases 

1. PrepareForRead(true, ...) for read only use case;
2. PrepareForReadWrite(false, ...) that will not delay the advance of the 1st read page.


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1397
PS2, Line 1397: 2
May need to test #pages == 1.



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Sep 2021 16:06:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17853

to look at the new patch set (#2).

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupleStream is set up with attach_on_read = true,
BufferedTupleStream::NextReadPage() expect that the page buffer is
attached to the output row batch before advancing the read iterator.
However, BufferedTupleStream::GetNextInternal() will not attach the page
buffer of a completely read page if it is a read-write page.
BufferedTupleStream::UnpinStream() need to defer advancing the read page
if this situation happens.

Testing:
- Add be test SimpleTupleStreamTest.DeferAdvancingUnattachedReadPage
  that simulate the corner case.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 80 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17853/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9477/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 17:49:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7484/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 12:33:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1490
PS3, Line 1490:     VerifyStreamState(&stream, num_pages, num_pinned_pages, num_unpinned_pages, buffer_size);
line too long (93 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 17:28:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is attached
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9486/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:10:13 +0000
Gerrit-HasComments: No