You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Lars Volker (Code Review)" <ge...@cloudera.org> on 2017/04/07 11:36:25 UTC

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6588

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time. If two impalads start up within the same second, there is a
chance for a name conflict. When sending a signal to all of them, one
process can overwrite the minidump of another one. This is an upstream
issue and is tracked in Breakpad-681. I confirmed my suspicion by
tentatively making an own output folder for each running instance of
impalad and was then unable to reproduce the issue. However, it is a
more clear solution to fix the underlying issue than to change the
folder locations for minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 8 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6588/3//COMMIT_MSG
Commit Message:

PS3, Line 11: If two impalads start up within the same second
> May be add this briefly to the commit message if you think that helps.
Done


http://gerrit.cloudera.org:8080/#/c/6588/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

Line 84:                                  max_minidumps=self.MAX_MINIDUMPS)
> indentation?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 7: Code-Review+2

Rebased, carrying Mike's +2.

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Michael Brown, Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6588

to look at the new patch set (#5).

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time with second granularity. If two impalads start up within the
same second, there is a chance for a name conflict. The one second delay
between starting impalads in start-impala-cluster.py is not sufficient:

I0407 22:34:52.018563 28473 minidump.cc:245] Setting minidump size limit
to 20971520.
I0407 22:34:52.997046 28749 minidump.cc:245] Setting minidump size limit
to 20971520.

When sending a signal to all of them, one process can overwrite the
minidump of another one. This is an upstream issue and is tracked in
Breakpad-681. I further confirmed my suspicion by tentatively making an
own output folder for each running instance of impalad and was then
unable to reproduce the issue. However, it is a more clear solution to
fix the underlying issue than to change the folder locations for
minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 26 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6588/3//COMMIT_MSG
Commit Message:

PS3, Line 11: If two impalads start up within the same second
> Thank you for your comment. I dug around more and updated the JIRA with my 
May be add this briefly to the commit message if you think that helps.


http://gerrit.cloudera.org:8080/#/c/6588/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

Line 84:                                  max_minidumps=self.MAX_MINIDUMPS)
indentation?


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 7:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/448/

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/445/

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time with second granularity. If two impalads start up within the
same second, there is a chance for a name conflict. The one second delay
between starting impalads in start-impala-cluster.py is not sufficient:

I0407 22:34:52.018563 28473 minidump.cc:245] Setting minidump size limit
to 20971520.
I0407 22:34:52.997046 28749 minidump.cc:245] Setting minidump size limit
to 20971520.

When sending a signal to all of them, one process can overwrite the
minidump of another one. This is an upstream issue and is tracked in
Breakpad-681. I further confirmed my suspicion by tentatively making an
own output folder for each running instance of impalad and was then
unable to reproduce the issue. However, it is a more clear solution to
fix the underlying issue than to change the folder locations for
minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Reviewed-on: http://gerrit.cloudera.org:8080/6588
Reviewed-by: Lars Volker <lv...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 22 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Lars Volker: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6588/3//COMMIT_MSG
Commit Message:

PS3, Line 11: If two impalads start up within the same second
Are you sure about this? The startup script has a sleep(1) between each each start command.


  # Start each impalad instance and optionally redirect the output to a log file.
  for i in range(cluster_size):
    if i == 0:
      # The first impalad always logs to impalad.INFO
      service_name = "impalad"
    else:
      service_name = "impalad_node%s" % i
      # Sleep between instance startup: simultaneous starts hurt the minikdc
      # Yes, this is a hack, but it's easier than modifying the minikdc...
      # TODO: is this really necessary?
      sleep(1)

    print "Starting Impala Daemon logging to %s/%s.INFO" % (options.log_dir,
        service_name)

I'm not aware of how breakpad determines the ID of the minidump but just want to be sure that we are fixing the right thing. Otherwise the patch lgtm.


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time. If two impalads start up within the same second, there is a
chance for a name conflict. When sending a signal to all of them, one
process can overwrite the minidump of another one. This is an upstream
issue and is tracked in Breakpad-681. I confirmed my suspicion by
tentatively making an own output folder for each running instance of
impalad and was then unable to reproduce the issue. However, it is a
more clear solution to fix the underlying issue than to change the
folder locations for minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 8 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 2:

(1 comment)

Thank you for the review. Please see PS2.

http://gerrit.cloudera.org:8080/#/c/6588/2/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS2, Line 156:     self.assert_num_logfile_entries(1)
             :     # IMPALA-3794 / Breakpad-681: Weak minidump ID generation can lead to name conflicts,
             :     # so that one process overwrites the minidump of others. See IMPALA-3794 for more
             :     # information.
             :     # TODO: Change this here and elsewhere in this file to expect 'cluster_size' minidumps
             :     # once Breakpad-681 has been fixed.
             :     assert self.count_minidumps('impalad') >= 1
             :     assert self.count_minidumps('statestored') == 1
             :     assert self.count_minidumps('catalogd') == 1
> Any interest in making this a private method that is called L183, L201, etc
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6588/3//COMMIT_MSG
Commit Message:

PS3, Line 11: If two impalads start up within the same second
> Are you sure about this? The startup script has a sleep(1) between each eac
Thank you for your comment. I dug around more and updated the JIRA with my findings. In short, daemons with conflicting file names actually seem to initialize breakpad within the same second, despite this delay.

As another workaround we could therefore increase the latency here to 2 seconds, making it much less likely to hit this error. However, the current workaround should do the trick until the upstream issue is fixed.


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 3:

> (1 comment)

That's a good point. I'll have a look.

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 4: Code-Review+2

Carrying forward Mike's +2.

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6588/2/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS2, Line 156:     self.assert_num_logfile_entries(1)
             :     # IMPALA-3794 / Breakpad-681: Weak minidump ID generation can lead to name conflicts,
             :     # so that one process overwrites the minidump of others. See IMPALA-3794 for more
             :     # information.
             :     # TODO: Change this here and elsewhere in this file to expect 'cluster_size' minidumps
             :     # once Breakpad-681 has been fixed.
             :     assert self.count_minidumps('impalad') >= 1
             :     assert self.count_minidumps('statestored') == 1
             :     assert self.count_minidumps('catalogd') == 1
Any interest in making this a private method that is called L183, L201, etc.? You reduce maintenance, and now your L157-161 note applies in the one place it matters.


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 7: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 6: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/445/

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time. If two impalads start up within the same second, there is a
chance for a name conflict. When sending a signal to all of them, one
process can overwrite the minidump of another one. This is an upstream
issue and is tracked in Breakpad-681. I confirmed my suspicion by
tentatively making an own output folder for each running instance of
impalad and was then unable to reproduce the issue. However, it is a
more clear solution to fix the underlying issue than to change the
folder locations for minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 22 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/444/

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Michael Brown, Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6588

to look at the new patch set (#4).

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time with second granularity. If two impalads start up within the
same second, there is a chance for a name conflict. The one second delay
between starting impalads in start-impala-cluster.py is not sufficient:

I0407 22:34:52.018563 28473 minidump.cc:245] Setting minidump size limit
to 20971520.
I0407 22:34:52.997046 28749 minidump.cc:245] Setting minidump size limit
to 20971520.

When sending a signal to all of them, one process can overwrite the
minidump of another one. This is an upstream issue and is tracked in
Breakpad-681. I further confirmed my suspicion by tentatively making an
own output folder for each running instance of impalad and was then
unable to reproduce the issue. However, it is a more clear solution to
fix the underlying issue than to change the folder locations for
minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 27 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6588/4/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS4, Line 83:     self.start_cluster_with_args(minidump_path=self.tmp_dir,
            :         max_minidumps=self.MAX_MINIDUMPS)
Sorry, but this isn't correct for PEP-008.

It was fine how it was, or you have to start the first param on the second line as well, like:

  self.start_cluster_with_args(
      minidump_path=self.tmp_dir,
      max_minidumps=self.MAX_MINIDUMPS)


PS4, Line 99:       # time.sleep(1)
??


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 6:

(2 comments)

I restored PS3, apologies for the confusion. Please have another look at PS6.

http://gerrit.cloudera.org:8080/#/c/6588/4/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS4, Line 83:     self.start_cluster_with_args(minidump_path=self.tmp_dir,
            :                                  max_mini
> Sorry, but this isn't correct for PEP-008.
Done


PS4, Line 99:     signal is SIGUSR1
> ??
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-3794: Workaround for Breakpad ID conflicts

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Michael Brown, Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6588

to look at the new patch set (#6).

Change subject: IMPALA-3794: Workaround for Breakpad ID conflicts
......................................................................

IMPALA-3794: Workaround for Breakpad ID conflicts

Breakpad determines the ID of the minidump file to be written in case of
a crash during startup of the process randomly, seeded with the current
system time with second granularity. If two impalads start up within the
same second, there is a chance for a name conflict. The one second delay
between starting impalads in start-impala-cluster.py is not sufficient:

I0407 22:34:52.018563 28473 minidump.cc:245] Setting minidump size limit
to 20971520.
I0407 22:34:52.997046 28749 minidump.cc:245] Setting minidump size limit
to 20971520.

When sending a signal to all of them, one process can overwrite the
minidump of another one. This is an upstream issue and is tracked in
Breakpad-681. I further confirmed my suspicion by tentatively making an
own output folder for each running instance of impalad and was then
unable to reproduce the issue. However, it is a more clear solution to
fix the underlying issue than to change the folder locations for
minidumps in impala.

Until this is fixed upstream, we can make sure that we see at least one
minidump for the group of impalads in the test cluster. It is not a
product defect, since we don't support running multiple impalads on a
single host, let alone starting them all at once.

To test this I ran the following loop for about an hour on my dev
machine without hitting the issue:

while [ $? -eq 0 ]; do impala-py.test
tests/custom_cluster/test_breakpad.py --exploration_strategy=exhaustive
-k test_minidump_relative_path -x -s; done

Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 22 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/6588/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ae589f6eb5cbbfb860943214edc0e6415eeb862
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>