Posted to issues@drill.apache.org by "Timothy Farkas (JIRA)" <ji...@apache.org> on 2018/05/02 15:58:00 UTC

[jira] [Updated] (DRILL-6380) Mongo db storage plugin tests can hang on jenkins.

     [ https://issues.apache.org/jira/browse/DRILL-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Farkas updated DRILL-6380:
----------------------------------
    Description: 
When running on our Jenkins server, the MongoDB tests hang because the config servers take up to 5 seconds to process each request (see *Error 1*). As a result, the tests never finish within a reasonable amount of time. According to reports online, people run into this issue when mixing MongoDB versions, but that is not happening in our tests. A possible cause is *Error 2*, which seems to indicate that the MongoDB config servers are not completely initialized, since the config servers should have a lockping document when starting up.

*Error 1*

{code}
[mongod output] 2018-05-01T23:38:47.468-0700 I COMMAND  [replSetDistLockPinger] command config.lockpings command: findAndModify { findAndModify: "lockpings", query: { _id: "ConfigServer" }, update: { $set: { ping: new Date(1525243123413) } }, upsert: true, writeConcern: { w: "majority", wtimeout: 15000 } } planSummary: IDHACK update: { $set: { ping: new Date(1525243123413) } } keysExamined:0 docsExamined:0 nMatched:0 nModified:0 upsert:1 keysInserted:2 numYields:0 reslen:198 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_query 4055ms
[mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: LockStateChangeFailed: findAndModify query predicate didn't match any lock document
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] lock 'balancer' successfully forced
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] distributed lock 'balancer' acquired, ts : 5ae95cd5d1023488104e6282
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS balancer thread is recovering
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS balancer thread is recovered
[mongod output] 2018-05-01T23:38:48.056-0700 I NETWORK  [thread2] connection accepted from 127.0.0.1:50244 #10 (7 connections now open)
{code}

*Error 2*

{code}
[mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND  [conn7] command config.settings command: find { find: "settings", filter: { _id: "chunksize" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 4988ms
{code}
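
To make it easier to see how often the config servers are this slow across a whole Jenkins console log, the log entries can be filtered by their trailing duration. The snippet below is only a rough sketch and not part of the Drill test suite: the function name slow_commands, the regular expressions, and the 1000 ms threshold are assumptions chosen for illustration, and the pattern only targets the line layout seen in the excerpts above (timestamp, severity, component, [thread], message ending in "<N>ms").

```python
import re

# Layout seen in the excerpts: timestamp, severity, component, [thread], message.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\S+)\s+(?P<sev>[A-Z])\s+(?P<comp>\S+)\s+"
    r"\[(?P<ctx>[^\]]+)\]\s+(?P<msg>.*)$"
)
# COMMAND entries end with the elapsed time, e.g. "... protocol:op_query 4055ms".
DURATION_RE = re.compile(r"\s(\d+)ms$")


def slow_commands(lines, threshold_ms=1000):
    """Return (timestamp, thread, namespace, millis) for slow COMMAND entries.

    threshold_ms is an arbitrary cut-off picked for this sketch.
    """
    hits = []
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None or match.group("comp") != "COMMAND":
            continue  # not a log line, or not a command entry
        duration = DURATION_RE.search(match.group("msg"))
        if duration is None:
            continue  # command entry without a trailing duration
        millis = int(duration.group(1))
        if millis >= threshold_ms:
            # The message starts with "command <namespace> command: ..."
            namespace = match.group("msg").split()[1]
            hits.append((match.group("ts"), match.group("ctx"), namespace, millis))
    return hits


# Two lines copied from the excerpts above: one warning and one slow find.
SAMPLE = [
    "[mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: LockStateChangeFailed: findAndModify query predicate didn't match any lock document",
    '[mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND  [conn7] command config.settings command: find { find: "settings", filter: { _id: "chunksize" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 4988ms',
]

for ts, thread, namespace, millis in slow_commands(SAMPLE):
    print(f"{ts} [{thread}] {namespace} took {millis} ms")
```

Run against the full console output (for example, the lines of a saved log file), this lists every command entry at or above the threshold, which should show whether the slowness is limited to a few config collections or affects every request.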

  was:
When running on our Jenkins server, the MongoDB tests hang because the config servers take up to 5 seconds to process each request (see *Error 2*). As a result, the tests never finish within a reasonable amount of time. According to reports online, people run into this issue when mixing MongoDB versions, but that is not happening in our tests. A possible cause is *Error 1*, which seems to indicate that the MongoDB config servers are not completely initialized, since the config servers should have a lockping document when starting up.

*Error 1*

{code}
[mongod output] 2018-05-01T23:38:47.468-0700 I COMMAND  [replSetDistLockPinger] command config.lockpings command: findAndModify { findAndModify: "lockpings", query: { _id: "ConfigServer" }, update: { $set: { ping: new Date(1525243123413) } }, upsert: true, writeConcern: { w: "majority", wtimeout: 15000 } } planSummary: IDHACK update: { $set: { ping: new Date(1525243123413) } } keysExamined:0 docsExamined:0 nMatched:0 nModified:0 upsert:1 keysInserted:2 numYields:0 reslen:198 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_query 4055ms
[mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: LockStateChangeFailed: findAndModify query predicate didn't match any lock document
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] lock 'balancer' successfully forced
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] distributed lock 'balancer' acquired, ts : 5ae95cd5d1023488104e6282
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS balancer thread is recovering
[mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS balancer thread is recovered
[mongod output] 2018-05-01T23:38:48.056-0700 I NETWORK  [thread2] connection accepted from 127.0.0.1:50244 #10 (7 connections now open)
{code}

*Error 2*

{code}
[mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND  [conn7] command config.settings command: find { find: "settings", filter: { _id: "chunksize" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 4988ms
{code}


> Mongo db storage plugin tests can hang on jenkins.
> --------------------------------------------------
>
>                 Key: DRILL-6380
>                 URL: https://issues.apache.org/jira/browse/DRILL-6380
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)