You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Dan Burkert (Code Review)" <ge...@cloudera.org> on 2018/01/05 20:36:25 UTC

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Dan Burkert has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8951


Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Since we appear to have poor coverage of the update-heavy workload
required to produce this bug, I opted to structure the test as an
integration test, instead of a targeted unit test.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-test.cc
M src/kudu/tablet/rowset_info.h
3 files changed, 194 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

PS4: 
> We talked about this a little bit in person: I think it'd be good to produc
Done


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@92
PS4, Line 92:     b.AddColumn("val_e")->Type(KuduColumnSchema::STRING)->NotNull();
> unique_ptr in new code.
Done


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@116
PS4, Line 116:     }
> For simplicity's sake, perhaps inline the definition of MakeRow here?
Done


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@155
PS4, Line 155: };
> Nit: use the shorter NO_FATALS macro.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 06 Jan 2018 01:54:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 11 Jan 2018 19:22:46 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#6).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

This commit contains two repro tests: one targeted unit-test which
reproduces the overflow quickly and deterministically, and an
integration test that gives us more coverage of the update-heavy write
workloads required to hit this bug. The integration test has mixed
success on triggering the overflow, depending on how fast the machine
it's running on is (particularly disk throughput), and thus how fast the
MM can do compactions. On my laptop it triggered the overflow nearly
100% of the time, but on dist-test it triggered nearly 0% of the time.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-itest.cc
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
5 files changed, 260 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/6
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#5).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

This commit contains two repro tests: one targeted unit-test which
reproduces the overflow quickly and deterministically, and an
integration test that gives us more coverage of the update-heavy write
workloads required to hit this bug. The integration test has mixed
success on triggering the overflow, depending on how fast the machine
it's running on is (particularly disk throughput), and thus how fast the
MM can do compactions. On my laptop it triggered the overflow nearly
100% of the time, but on dist-test it triggered nearly 0% of the time.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-test.cc
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
5 files changed, 259 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@46
PS1, Line 46: How many rounds will be performed
rounds of what?


http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@145
PS1, Line 145: HeavyUpdateCompactionTest
any way to add verification that not only did it not overflow but that the data is correct?


http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@175
PS1, Line 175: HeavyUpdateCompactionTest
docs


http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@175
PS1, Line 175: void
why have this last?



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 20:50:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@145
PS1, Line 145: HeavyUpdateCompactionTest
> What are you thinking? As a repro for KUDU2231 this test is extremely timin
I was thinking of doing it after the fact, out of the core write cycles



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 21:48:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/integration-tests/heavy-update-compaction-itest.cc
File src/kudu/integration-tests/heavy-update-compaction-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/integration-tests/heavy-update-compaction-itest.cc@206
PS6, Line 206:   // Scan the updated rows and ensure the final values are present.
             :   KuduScanner scanner(table_.get());
             :   ASSERT_OK(scanner.SetFaultTolerant());
             :   ASSERT_OK(scanner.AddConjunctPredicate(
             :         table_->NewComparisonPredicate(
             :           "key", KuduPredicate::LESS, KuduValue::FromInt(FLAGS_rows))));
             : 
             :   // Walking the updates can take a long time.
             :   scanner.SetTimeoutMillis(120 * 1000);
             : 
             :   LOG_TIMING(INFO, "scanning") {
             :     ASSERT_OK(scanner.Open());
             :     vector<KuduRowResult> rows;
             :     size_t final_values_offset = 0;
             :     while (scanner.HasMoreRows()) {
             :       ASSERT_OK(scanner.NextBatch(&rows));
             :       for (const auto & row : rows) {
             :         for (int idx = 1; idx <= 5; idx++) {
             :           ASSERT_GT(final_values.size(), final_values_offset);
             :           Slice actual_val;
             :           ASSERT_OK(row.GetString(idx, &actual_val));
             :           EXPECT_EQ(actual_val, final_values[final_values_offset++]);
             :         }
             :       }
             :     }
             :   }
Thanks for doing this. I'm surprised that it might take so long. I guess this is one of the cases where skipping updates is going to make a huge difference.


http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/tablet/rowset_info.h@21
PS6, Line 21: 
extra line



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 19:02:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#7).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Testing: included is a targeted unit-test which reproduces the overflow
quickly and deterministically. I also reproduced the issue using an
integration test, however that test exposed other issues which need to
be addressed before it can land (KUDU-2253). I'll be working on that in
a follow-up commit.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
3 files changed, 24 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/7
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#2).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Since we appear to have poor coverage of the update-heavy workload
required to produce this bug, I opted to structure the test as an
integration test, instead of a targeted unit test.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-test.cc
M src/kudu/tablet/rowset_info.h
3 files changed, 192 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6:

Without profiling it I don't have a great explanation for why TSAN is particularly slow on this workload. It's definitely slower on many workloads though, so increasing timeout in TSAN is probably reasonable if you don't want to take the time to profile it.


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 23:56:54 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 7:

I've backed out the integration test, since it exposed other problems which need to be addressed in a follow up commit(s).


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 11 Jan 2018 18:51:23 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/integration-tests/heavy-update-compaction-itest.cc
File src/kudu/integration-tests/heavy-update-compaction-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8951/6/src/kudu/integration-tests/heavy-update-compaction-itest.cc@206
PS6, Line 206:   // Scan the updated rows and ensure the final values are present.
             :   KuduScanner scanner(table_.get());
             :   ASSERT_OK(scanner.SetFaultTolerant());
             :   ASSERT_OK(scanner.AddConjunctPredicate(
             :         table_->NewComparisonPredicate(
             :           "key", KuduPredicate::LESS, KuduValue::FromInt(FLAGS_rows))));
             : 
             :   // Walking the updates can take a long time.
             :   scanner.SetTimeoutMillis(120 * 1000);
             : 
             :   LOG_TIMING(INFO, "scanning") {
             :     ASSERT_OK(scanner.Open());
             :     vector<KuduRowResult> rows;
             :     size_t final_values_offset = 0;
             :     while (scanner.HasMoreRows()) {
             :       ASSERT_OK(scanner.NextBatch(&rows));
             :       for (const auto & row : rows) {
             :         for (int idx = 1; idx <= 5; idx++) {
             :           ASSERT_GT(final_values.size(), final_values_offset);
             :           Slice actual_val;
             :           ASSERT_OK(row.GetString(idx, &actual_val));
             :           EXPECT_EQ(actual_val, final_values[final_values_offset++]);
             :         }
             :       }
             :     }
             :   }
> Thanks for doing this. I'm surprised that it might take so long. I guess th
i've also seen ASSERT_OK and EXPECT_* etc be slow in gtest, since it actually collects all the successes and not just failures. We have ASSERT_OK_FAST which is a faster version of ASSERT_OK but I don't think there is any such for _EQ. We could just use CHECK instead though and it might be faster?



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 19:50:06 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

PS4: 
We talked about this a little bit in person: I think it'd be good to produce a second, smaller test that's targeted for this particular regression. Hopefully such a test could work without actually writing 2 GB to disk, so it could really be a simple unit test. The problem is that the generality of this test makes it hard to actually ensure that the bug doesn't regress.


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@92
PS4, Line 92:     gscoped_ptr<KuduTableCreator> table_creator(client_->NewTableCreator());
unique_ptr in new code.


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@116
PS4, Line 116:   void MakeRow(int64_t key, KuduPartialRow* row);
For simplicity's sake, perhaps inline the definition of MakeRow here?


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@155
PS4, Line 155:   ASSERT_NO_FATAL_FAILURE(CreateTable());
Nit: use the shorter NO_FATALS macro.



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 21:55:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 21:34:30 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#3).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Since we appear to have poor coverage of the update-heavy workload
required to produce this bug, I opted to structure the test as an
integration test, instead of a targeted unit test.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-test.cc
M src/kudu/tablet/rowset_info.h
3 files changed, 192 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6:

OK I think I've satisfied everyone's requests WRT tests, however it now appears that the IT is taking more than 120 seconds to scan 500 rows in TSAN builds... so I'll be looking into that.  May just have to bump the timeout, but maybe I can figure out why it's taking that long.


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 18:59:36 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@53
PS1, Line 53: using std::vector;
> warning: using decl 'vector' is unused [misc-unused-using-decls]
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 20:46:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#4).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Since we appear to have poor coverage of the update-heavy workload
required to produce this bug, I opted to structure the test as an
integration test, instead of a targeted unit test.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-test.cc
M src/kudu/tablet/rowset_info.h
3 files changed, 192 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@46
PS1, Line 46: How many rounds will be performed
> rounds of what?
Done


http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@145
PS1, Line 145: HeavyUpdateCompactionTest
> any way to add verification that not only did it not overflow but that the 
What are you thinking? As a repro for KUDU2231 this test is extremely timing dependent, so I think scanning in between every round is going to make it less effective.


http://gerrit.cloudera.org:8080/#/c/8951/1/src/kudu/integration-tests/heavy-update-compaction-test.cc@175
PS1, Line 175: void
> why have this last?
moved.  Docs are on the decl.



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 21:02:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 6:

I changed the asserts and expects to checks, and it didn't have much of an effect.  The slowness appears to be specific to TSAN.  I'm able to repro the scan timing out (120s+) on a fast machine (va1022), whereas the scans take 2s and 7s with release and debug builds, respectively. Any thoughts on why TSAN may be slow in particular? Next I'm going to verify that the scans are actually completing given a much longer timeout.


-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Jan 2018 20:07:09 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/CMakeLists.txt@77
PS5, Line 77: ADD_KUDU_TEST(heavy-update-compaction-test RUN_SERIAL true)
> nit: can we name this '-itest' since it's an integration level test?
Done


http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/heavy-update-compaction-test.cc@139
PS5, Line 139:   const char* const kTableName = "heavy-update-compaction-test";
> style strikes me as a bit odd to have the constant interspersed with functi
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 06 Jan 2018 06:14:19 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/CMakeLists.txt@77
PS5, Line 77: ADD_KUDU_TEST(heavy-update-compaction-test RUN_SERIAL true)
nit: can we name this '-itest' since it's an integration level test?


http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

http://gerrit.cloudera.org:8080/#/c/8951/5/src/kudu/integration-tests/heavy-update-compaction-test.cc@139
PS5, Line 139:   const char* const kTableName = "heavy-update-compaction-test";
style strikes me as a bit odd to have the constant interspersed with functions



-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 06 Jan 2018 02:44:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

Testing: included is a targeted unit-test which reproduces the overflow
quickly and deterministically. I also reproduced the issue using an
integration test, however that test exposed other issues which need to
be addressed before it can land (KUDU-2253). I'll be working on that in
a follow-up commit.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Reviewed-on: http://gerrit.cloudera.org:8080/8951
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
3 files changed, 24 insertions(+), 6 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  David Ribeiro Alves: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 8
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>