Posted to reviews@impala.apache.org by "Attila Jeges (Code Review)" <ge...@cloudera.org> on 2016/10/27 14:08:01 UTC

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Attila Jeges has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4842

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................

IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies the
time in seconds catalogd should spend on retrying to establish a
connection to HMS the first time on startup before giving up and
exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
---
M be/src/catalog/catalog.cc
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
A tests/experiments/test_catalog_hms_failures.py
8 files changed, 218 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/4842/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................


Patch Set 4:

(9 comments)

Looks pretty close to me.

http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

PS4, Line 105: if (numClients > 0) {
             :       metaStoreClientPool_.addClients(1, initialCnxnTimeoutSec);
             :       metaStoreClientPool_.addClients(numClients - 1, 0);
             :     }
I think you can make this logic a method of MetaStoreClientPool - it's already called from MSCP's constructor, so there's a chance to share one method between both places.
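
A minimal sketch of what such a shared method might look like. The class name SketchClientPool and the method initClients() are hypothetical stand-ins, not the actual MetaStoreClientPool API, and client creation is faked by recording the timeout each client was given. Treating a timeout of 0 as "try once, no retries" is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MetaStoreClientPool; all names are illustrative only.
class SketchClientPool {
  // Records the connection timeout (in seconds) each client was created with.
  private final List<Integer> clientTimeoutsSec = new ArrayList<>();

  // Adds 'numClients' clients, each given 'cnxnTimeoutSec' seconds to connect.
  void addClients(int numClients, int cnxnTimeoutSec) {
    for (int i = 0; i < numClients; ++i) clientTimeoutsSec.add(cnxnTimeoutSec);
  }

  // Shared entry point for the constructor and for Catalog: only the first
  // client waits for the metastore to come up. The remaining clients get a
  // timeout of 0 (assumed here to mean a single attempt without retries).
  void initClients(int numClients, int initialCnxnTimeoutSec) {
    if (numClients <= 0) return;
    addClients(1, initialCnxnTimeoutSec);
    addClients(numClients - 1, 0);
  }

  List<Integer> timeouts() { return clientTimeoutsSec; }
}
```

With this shape, both call sites reduce to a single initClients(numClients, initialCnxnTimeoutSec) call, and the "first client is special" rule lives in one place.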


http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 110:   private static final int META_STORE_CLIENT_POOL_SIZE = 10;
is this the max size, or the initial size, or both?


http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
File fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java:

PS4, Line 76: private
Add a comment to this method describing the behaviour.


PS4, Line 83: IMetaStoreClient hiveClient = null;
What's the point of this variable - why not assign to hiveClient_ directly?


PS4, Line 99: catch (InterruptedException ignore) {}
on this path you might end up not sleeping for long enough if you get interrupted. Instead, suggest you have something roughly like:

  long delayUntil = System.currentTimeMillis() + retryDelayMillis;
  if (delayUntil > endTimeMillis) throw...
  while (delayUntil > System.currentTimeMillis()) {
    try {
      Thread.sleep(delayUntil - System.currentTimeMillis());
    } catch (...) { }
  }
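
A self-contained version of the loop above, as a sketch only. The helper name sleepBeforeRetry(), its signature, and the choice of IllegalStateException are assumptions, since the snippet above leaves the thrown exception unspecified. Two details go beyond the snippet: the remaining time is computed once per iteration so Thread.sleep() never receives a negative argument, and the interrupt flag is restored after the delay.

```java
// Hypothetical helper; the name, signature and exception type are assumptions.
class RetrySleep {
  // Sleeps for 'retryDelayMillis', resuming after interrupts, unless the delay
  // would run past 'endTimeMillis', in which case it throws without sleeping.
  static void sleepBeforeRetry(long retryDelayMillis, long endTimeMillis) {
    long delayUntil = System.currentTimeMillis() + retryDelayMillis;
    if (delayUntil > endTimeMillis) {
      throw new IllegalStateException("Retry deadline would be exceeded");
    }
    boolean interrupted = false;
    while (true) {
      // Read the clock once per iteration so that Thread.sleep() is never
      // handed a negative value (which would throw IllegalArgumentException).
      long remaining = delayUntil - System.currentTimeMillis();
      if (remaining <= 0) break;
      try {
        Thread.sleep(remaining);
      } catch (InterruptedException e) {
        interrupted = true;  // Remember the interrupt and keep sleeping.
      }
    }
    // Restore the interrupt flag for callers that rely on it.
    if (interrupted) Thread.currentThread().interrupt();
  }
}
```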


PS4, Line 161: initialCnxnTimeoutSec
this is not the 'initial' timeout in the same sense that it is elsewhere, I think. Elsewhere we use it to mean "first client", but here it's used to mean "first connection". I think it's better just to call it cnxnTimeoutSec here.


http://gerrit.cloudera.org:8080/#/c/4842/4/tests/experiments/test_catalog_hms_failures.py
File tests/experiments/test_catalog_hms_failures.py:

PS4, Line 81: 10
just from experience, suggest you make this 30s. Timeouts are surprisingly flaky in EC2-based build infrastructure.


Line 114: 
Perhaps wait 5s or so here to be sure that the catalog is in the 'trying to connect' phase of its startup.


Line 122
How about a test that confirms that if the HMS does not start within initial_hms_cnxn_timeout_s, then the catalogd fails?


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has uploaded a new patch set (#5).

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................

IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies the
time in seconds catalogd should spend on retrying to establish a
connection to HMS the first time on startup before giving up and
exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
A tests/experiments/test_catalog_hms_failures.py
10 files changed, 298 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/4842/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has posted comments on this change.

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4842/3/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

PS3, Line 41: spend on trying to establish connection to HMS the "
            :     "first time on startup
> "wait to establish an initial connection to the HMS"
Done


PS3, Line 75: init_first
> init_first seems redundant, here and everywhere else.
Done


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

Line 67:   protected final MetaStoreClientPool metaStoreClientPool_ = new MetaStoreClientPool(0, 0);
> long line
Done


PS3, Line 103: public void initMetaStoreClientPool(long initFirstClientTimeoutSeconds) {
             :     metaStoreClientPool_.addClients(META_STORE_CLIENT_POOL_SIZE,
             :         initFirstClientTimeoutSeconds);
             :   }
> Why not call this in the c'tor any more? (if timeout is < 0, add no clients
Added an overloaded constructor that takes two parameters: the number of clients and the initialization timeout.


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS3, Line 163: super
> don't think you need super., do you?
Removed initMetaStoreClientPool()


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
File fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java:

PS3, Line 156: Add 'numClients' to the client pool.
             :    * 'initFirstClientTimeoutSeconds' specifies the time (in seconds) spent on trying to
             :    * initialize the first MetaStore client before giving up.
             :    */
> it might be clearer to call this, initially, as addClients(1, 120), then ca
Done


http://gerrit.cloudera.org:8080/#/c/4842/3/tests/experiments/test_catalog_hms_failures.py
File tests/experiments/test_catalog_hms_failures.py:

PS3, Line 61: metadatra
> metadata
Done


PS3, Line 109: statestored.service.wait_for_live_subscriber
> if this fails, I think the test can leave the HMS not running. That will af
In teardown_class() L48 we always call run_hive_server() to make sure that HMS is running even if a test case fails.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4842/3/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

PS3, Line 41: spend on trying to establish connection to HMS the "
            :     "first time on startup
"wait to establish an initial connection to the HMS"


PS3, Line 75: init_first
init_first seems redundant, here and everywhere else.

How about "initial_hms_cnxn_timeout_s"


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

Line 67:   protected final MetaStoreClientPool metaStoreClientPool_ = new MetaStoreClientPool(0, 0);
long line


PS3, Line 103: public void initMetaStoreClientPool(long initFirstClientTimeoutSeconds) {
             :     metaStoreClientPool_.addClients(META_STORE_CLIENT_POOL_SIZE,
             :         initFirstClientTimeoutSeconds);
             :   }
Why not call this in the c'tor any more? (if timeout is < 0, add no clients. Or maybe pass META_STORE_CLIENT_POOL_SIZE as a parameter to the c'tor).


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS3, Line 163: super
don't think you need super., do you?


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
File fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java:

PS3, Line 156: Add 'numClients' to the client pool.
             :    * 'initFirstClientTimeoutSeconds' specifies the time (in seconds) spent on trying to
             :    * initialize the first MetaStore client before giving up.
             :    */
it might be clearer to call this, initially, as addClients(1, 120), then call addClients(9, 10) (or whatever the second timeout should be). The current API is a little confusing.


http://gerrit.cloudera.org:8080/#/c/4842/3/tests/experiments/test_catalog_hms_failures.py
File tests/experiments/test_catalog_hms_failures.py:

PS3, Line 61: metadatra
metadata


PS3, Line 109: statestored.service.wait_for_live_subscriber
if this fails, I think the test can leave the HMS not running. That will affect other tests.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has uploaded a new patch set (#4).

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................

IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies the
time in seconds catalogd should spend on retrying to establish a
connection to HMS the first time on startup before giving up and
exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
---
M be/src/catalog/catalog.cc
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
A tests/experiments/test_catalog_hms_failures.py
8 files changed, 222 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/4842/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has uploaded a new patch set (#6).

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................

IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies the
time in seconds catalogd should spend on retrying to establish a
connection to HMS the first time on startup before giving up and
exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
A tests/experiments/test_catalog_hms_failures.py
10 files changed, 300 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/4842/6
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has uploaded a new patch set (#4).

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................

IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies the
time in seconds catalogd should spend on retrying to establish a
connection to HMS the first time on startup before giving up and
exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
---
M be/src/catalog/catalog.cc
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
A tests/experiments/test_catalog_hms_failures.py
8 files changed, 221 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/4842/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

Posted by "Attila Jeges (Code Review)" <ge...@cloudera.org>.
Attila Jeges has posted comments on this change.

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
......................................................................


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

PS4, Line 105: if (numClients > 0) {
             :       metaStoreClientPool_.addClients(1, initialCnxnTimeoutSec);
             :       metaStoreClientPool_.addClients(numClients - 1, 0);
             :     }
> I think you can make this logic a method of MetaStoreClientPool - it's alre
Done


http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 110:   private static final int META_STORE_CLIENT_POOL_SIZE = 10;
> is this the max size, or the initial size, or both?
Initial pool size. Changed the name of the field to reflect this.


http://gerrit.cloudera.org:8080/#/c/4842/4/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
File fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java:

PS4, Line 76: private
> Add a comment to this method describing the behaviour.
Done


PS4, Line 83: IMetaStoreClient hiveClient = null;
> What's the point of this variable - why not assign to hiveClient_ directly?
hiveClient_ is final, the compiler complains if I try to set it in a loop.
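
For illustration, a minimal sketch of that pattern: retry into a local variable, then assign the final field exactly once after the loop. SketchClient, its Supplier-based constructor and the attempt counter are hypothetical and stand in for the real client creation code, which retries against a timeout rather than a fixed number of attempts.

```java
import java.util.function.Supplier;

// Hypothetical wrapper; 'connect' stands in for creating a real metastore client.
class SketchClient {
  private final String hiveClient_;  // final: may be assigned exactly once

  SketchClient(Supplier<String> connect, int maxAttempts) {
    // Assigning hiveClient_ inside the loop is rejected by javac, which reports
    // that the variable might be assigned in a loop. Retry into a local instead.
    String hiveClient = null;
    RuntimeException lastError = null;
    for (int attempt = 0; attempt < maxAttempts && hiveClient == null; ++attempt) {
      try {
        hiveClient = connect.get();
      } catch (RuntimeException e) {
        lastError = e;  // Keep the most recent failure for the final error.
      }
    }
    if (hiveClient == null) {
      throw new IllegalStateException("Could not connect", lastError);
    }
    hiveClient_ = hiveClient;  // The single assignment to the final field.
  }

  String client() { return hiveClient_; }
}
```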


PS4, Line 99: catch (InterruptedException ignore) {}
> on this path you might end up not sleeping for long enough if you get inter
Done


PS4, Line 161: initialCnxnTimeoutSec
> this is not the 'initial' timeout in the same sense that it is elsewhere, I
Done


http://gerrit.cloudera.org:8080/#/c/4842/4/tests/experiments/test_catalog_hms_failures.py
File tests/experiments/test_catalog_hms_failures.py:

PS4, Line 81: 10
> just from experience, suggest you make this 30s. Timeouts are surprisingly 
Done


Line 114: 
> Perhaps wait 5s or so here to be sure that the catalog is in the 'trying to
Done


Line 122
> How about a test that confirms that if the HMS does not start within initia
Done


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes
