You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2020/02/06 01:20:15 UTC

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15168


Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................

[util] remove timeout parameter for cloud::InstanceDetector

With this patch, the detector relies solely on the timeouts reported
by metadata fetchers and doesn't set the timeout on the auto-detection
process itself.  This change simplifies the semantics of the cloud
instance detection.

In addition, and this patch fixes intermittent failures recently
seen in the following test scenarios:
  * InstanceDetectorTest.Basic
  * BuiltinNtpWithMiniChronydTest.CloudInstanceNtpServer

Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
---
M src/kudu/clock/ntp-test.cc
M src/kudu/util/cloud/instance_detector-test.cc
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/cloud/instance_detector.h
M src/kudu/util/cloud/instance_metadata.cc
M src/kudu/util/cloud/instance_metadata.h
6 files changed, 73 insertions(+), 46 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/15168/1
-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc
File src/kudu/util/cloud/instance_detector.cc:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc@65
PS1, Line 65:   // Add an extra delay in addition to the declared timeout for fetching
            :   // instance metadata. The idea is to have the limit on the time to wait
            :   // in case if instance metadata fetcher could not fulfill its promise to
            :   // return valid result or an error within the init_timeout() interval.
Yeah but is this actually possible? Or a hypothetical concern?

BTW, is it actually safe to WaitUntil? Let's suppose one detector is very slow and we end up timing out on L96. If we return a bad Status, there's a very good chance that the caller to Detect() will pass that Status back up the call stack, destroying objects along the way (or at the end when the server is destroyed). In doing so, they may destroy the InstanceDetector itself even though there's an outstanding detector thread running in GetInstanceInfo. That thread will then write to uninitialized memory.


http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_metadata.h
File src/kudu/util/cloud/instance_metadata.h:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_metadata.h@78
PS1, Line 78: class InstanceMetadataBase : public InstanceMetadata {
Any chance you'd be interested in combining this with InstanceMetadata? The interface vs. implementation divide is necessary in Java, but I think it just makes code harder to follow in C++ (unless it's part of a library).



-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 01:36:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Adar Dembo, Volodymyr Verovkin, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15168

to look at the new patch set (#2).

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................

[util] remove timeout parameter for cloud::InstanceDetector

With this patch, the detector relies solely on the timeouts reported
by metadata fetchers and doesn't set the timeout on the auto-detection
process itself.  This change simplifies the semantics of the cloud
instance detection.

In addition, and this patch fixes intermittent failures recently seen
in the following test scenarios when running on non-cloud hosts:
  * InstanceDetectorTest.Basic
  * BuiltinNtpWithMiniChronydTest.CloudInstanceNtpServer

Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
---
M src/kudu/clock/ntp-test.cc
M src/kudu/util/cloud/instance_detector-test.cc
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/cloud/instance_detector.h
M src/kudu/util/cloud/instance_metadata.cc
M src/kudu/util/cloud/instance_metadata.h
6 files changed, 79 insertions(+), 98 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/15168/2
-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc
File src/kudu/util/cloud/instance_detector.cc:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc@65
PS1, Line 65:   // Add an extra delay in addition to the declared timeout for fetching
            :   // instance metadata. The idea is to have the limit on the time to wait
            :   // in case if instance metadata fetcher could not fulfill its promise to
            :   // return valid result or an error within the init_timeout() interval.
> Yeah but is this actually possible? Or a hypothetical concern?
It's more a hypothetical concern at this point since we know that CURL works pretty reliably w.r.t. sticking to the specified timeouts modulo various scheduler anomalies.

As I understand, the destructor will wait for all threads to join, so there should be no issue with writing into uninitialized memory.



-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 01:54:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc
File src/kudu/util/cloud/instance_detector.cc:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc@65
PS1, Line 65:   // Add an extra delay in addition to the declared timeout for fetching
            :   // instance metadata. The idea is to have the limit on the time to wait
            :   // in case if instance metadata fetcher could not fulfill its promise to
            :   // return valid result or an error within the init_timeout() interval.
> It's more a hypothetical concern at this point since we know that CURL work
Good point; I missed that the destructor joins on the threads.



-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 02:15:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc
File src/kudu/util/cloud/instance_detector.cc:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_detector.cc@65
PS1, Line 65:   // Add an extra delay in addition to the declared timeout for fetching
            :   // instance metadata. The idea is to have the limit on the time to wait
            :   // in case if instance metadata fetcher could not fulfill its promise to
            :   // return valid result or an error within the init_timeout() interval.
> Good point; I missed that the destructor joins on the threads.
Thank you for the feedback!  I was hesitant to use Wait() instead of WaitUntil(), but now I understand it was just paranoia.  I'm going to get rid of these ugly timeouts.



-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 02:34:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 04:17:07 +0000
Gerrit-HasComments: No

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_metadata.h
File src/kudu/util/cloud/instance_metadata.h:

http://gerrit.cloudera.org:8080/#/c/15168/1/src/kudu/util/cloud/instance_metadata.h@78
PS1, Line 78: class InstanceMetadataBase : public InstanceMetadata {
> Any chance you'd be interested in combining this with InstanceMetadata? The
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Feb 2020 03:26:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] remove timeout parameter for cloud::InstanceDetector

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15168 )

Change subject: [util] remove timeout parameter for cloud::InstanceDetector
......................................................................

[util] remove timeout parameter for cloud::InstanceDetector

With this patch, the detector relies solely on the timeouts reported
by metadata fetchers and doesn't set the timeout on the auto-detection
process itself.  This change simplifies the semantics of the cloud
instance detection.

In addition, and this patch fixes intermittent failures recently seen
in the following test scenarios when running on non-cloud hosts:
  * InstanceDetectorTest.Basic
  * BuiltinNtpWithMiniChronydTest.CloudInstanceNtpServer

Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Reviewed-on: http://gerrit.cloudera.org:8080/15168
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/clock/ntp-test.cc
M src/kudu/util/cloud/instance_detector-test.cc
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/cloud/instance_detector.h
M src/kudu/util/cloud/instance_metadata.cc
M src/kudu/util/cloud/instance_metadata.h
6 files changed, 79 insertions(+), 98 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/15168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6f7cac9273e884abda22b9e9a7d37fa3e307c2d4
Gerrit-Change-Number: 15168
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Volodymyr Verovkin <ve...@cloudera.com>