Posted to dev@tephra.apache.org by F21 <f2...@gmail.com> on 2016/09/27 00:15:54 UTC

Tephra unreliable in CI environments

Hi all,

I have created a docker image containing HBase 1.2.3 and Phoenix 4.8.0. 
See: https://github.com/Boostport/hbase-phoenix-all-in-one

When running tests against the image on my machine, tephra works perfectly.

However, tephra seems to be unreliable in CI environments. It seems that 
the tx service is not discovered:

RuntimeException: java.lang.Exception: Thrift error for 
org.apache.tephra.distributed.TransactionServiceClient$2@3e9e291e: 
Unable to discover tx service. -> Exception: Thrift error for 
org.apache.tephra.distributed.TransactionServiceClient$2@3e9e291e: 
Unable to discover tx service. -> TException: Unable to discover tx service.

Here's a build on wercker which shows tephra failing: 
https://app.wercker.com/boostport/avatica/runs/build/57e9b5d170a35501008402b4?step=57e9b5f72c15ad000127a534

I also tried travis, but the same issue occurs: 
https://travis-ci.org/Boostport/avatica/builds/162952367

Since I am unable to ssh into those docker containers on wercker or 
travis, it is hard to debug what's causing tephra to fail. I am hoping 
that the issue is related to TEPHRA-179 (and a few other JIRAs related 
to it), which I reported a few weeks ago and which has since been fixed.

Has anyone else run into similar problems? I would love to hear your 
thoughts.

Cheers,

Francis


Re: Tephra unreliable in CI environments

Posted by F21 <f2...@gmail.com>.
Just a quick update on this one. I upgraded the image to Phoenix 
4.8.1-rc0, but still have the same problem.

Initially, I thought it might be due to lack of memory, but travis 
provides 7.5GB of memory and 2 cores in their standard environment, so 
that should be more than enough for testing.



Re: Tephra unreliable in CI environments

Posted by Poorna Chandra <po...@gmail.com>.
Cool, thanks for the update!

Poorna.


Re: Tephra unreliable in CI environments

Posted by F21 <f2...@gmail.com>.
Hi Poorna,

I was able to resolve this with James' help in PHOENIX-3259 by deleting 
the guava jar in the HBase distribution and replacing it with the 
guava-13 jar.
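
For anyone wanting to script the same swap while building an image, here is a minimal sketch. It assumes the HBase jars live in a single lib directory and that the older jar matches the pattern guava-*.jar; the function name, file names and directory layout are hypothetical, and the demo runs against throwaway directories rather than a real install.

```python
import glob
import os
import shutil
import tempfile


def replace_guava(hbase_lib_dir, guava13_jar):
    """Delete every guava-*.jar in hbase_lib_dir, then copy guava13_jar in.

    guava13_jar must live outside hbase_lib_dir, otherwise it would be
    deleted before it can be copied.
    """
    removed = []
    for old in sorted(glob.glob(os.path.join(hbase_lib_dir, "guava-*.jar"))):
        os.remove(old)
        removed.append(os.path.basename(old))
    shutil.copy(guava13_jar, hbase_lib_dir)
    return removed


def demo():
    """Exercise replace_guava on temporary directories (hypothetical names)."""
    with tempfile.TemporaryDirectory() as root:
        lib = os.path.join(root, "hbase", "lib")
        os.makedirs(lib)
        # Empty placeholder files stand in for the real jars.
        open(os.path.join(lib, "guava-12.0.1.jar"), "w").close()
        new_jar = os.path.join(root, "guava-13.0.1.jar")
        open(new_jar, "w").close()
        removed = replace_guava(lib, new_jar)
        return removed, sorted(os.listdir(lib))
```

Removing the older jar outright, rather than reordering the classpath, means the result no longer depends on the order in which jars happen to be listed.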

Thanks,
Francis



Re: Tephra unreliable in CI environments

Posted by Poorna Chandra <po...@gmail.com>.
Hi Francis,

(Apologies for the delayed response; I was out of office for the past few
weeks.)

Do you see a guava-13 jar in the classpath of Tephra? It should be in the
<tephra-home>/lib directory. You can define the CLASSPATH environment
variable, pointing to the path of the guava-13 jar, before starting Tephra.
This will put the guava-13 jar earlier in Tephra's classpath.
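
As an illustration of the suggestion above, here is a small sketch that builds a CLASSPATH value with the guava-13 jar placed first. The jar path is a hypothetical example, and the sketch only constructs the string; whether a given startup script honors CLASSPATH is an assumption to check against that script.

```python
import os

# Hypothetical location; substitute the real <tephra-home>/lib path.
GUAVA_13_JAR = "/opt/tephra/lib/guava-13.0.1.jar"


def prepend_to_classpath(jar_path, current=""):
    """Return a classpath string with jar_path first and no duplicate of it."""
    entries = [e for e in current.split(os.pathsep) if e and e != jar_path]
    return os.pathsep.join([jar_path] + entries)


if __name__ == "__main__":
    # Print the value to export before starting the service.
    print(prepend_to_classpath(GUAVA_13_JAR, os.environ.get("CLASSPATH", "")))
```

The printed value can then be exported as CLASSPATH in the shell or container entrypoint that launches Tephra.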

Thanks,
Poorna.



Re: Tephra unreliable in CI environments

Posted by F21 <f2...@gmail.com>.
Hi Poorna,

Thanks, that narrows down the problem. I was spinning up a few VMs with 
various versions of Ubuntu and their kernels, but that didn't shed any 
light on the problem.

You mentioned that adding the guava-13 jar to the classpath before the 
guava-12 jar would be a workaround. I am using Phoenix 4.8.1-rc0, so 
what would be the best way to do this?

Cheers,
Francis



Re: Tephra unreliable in CI environments

Posted by Poorna Chandra <po...@apache.org>.
Hi Francis,

This is due to guava-12 vs guava-13 incompatibility [1]. Tephra depends on
guava-13 and HBase depends on guava-12. Depending on how the OS orders the
jars in the classpath, sometimes guava-12 may appear earlier in the
classpath. This leads to the NoSuchMethodError exception. We are planning
on removing Tephra's dependency on guava-13 in the next release. Until then,
a workaround is to put the guava-13 jar on the classpath before the guava-12
jar.
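
To make the ordering effect concrete, here is a toy model (not Tephra or JVM code) of first-match class resolution. The jar names and the method sets are illustrative assumptions; in particular, the sketch assumes that only the newer jar provides the addListener method named in the stack trace.

```python
# Toy model of classpath resolution: the first entry that contains a class
# wins, and later entries are never consulted for that class.

SERVICE = "com.google.common.util.concurrent.Service"

# Assumption for illustration: only the newer jar offers addListener.
JAR_CONTENTS = {
    "guava-12.0.1.jar": {SERVICE: {"start", "stop"}},
    "guava-13.0.1.jar": {SERVICE: {"start", "stop", "addListener"}},
}


def resolve(classpath, jar_contents, class_name):
    """Return the first jar on the classpath that provides class_name."""
    for jar in classpath:
        if class_name in jar_contents.get(jar, {}):
            return jar
    return None


def has_method(classpath, class_name, method):
    """Report whether the class that wins resolution offers the method."""
    jar = resolve(classpath, JAR_CONTENTS, class_name)
    return jar is not None and method in JAR_CONTENTS[jar][class_name]
```

With guava-12 listed first, has_method reports that addListener is missing even though guava-13 is also on the classpath, which mirrors the NoSuchMethodError; swapping the two entries makes the method visible.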

Thanks,
Poorna.

[1] - https://issues.apache.org/jira/browse/TEPHRA-181


On Tue, Sep 27, 2016 at 10:09 PM, F21 <f2...@gmail.com> wrote:

> Hi Poorna,
>
> That would be very helpful! Unfortunately, I ran into the same issue, where
> the image no longer works correctly on my dev environment but works properly
> on travis.
>
> I am now receiving this error:
>
> java.lang.NoSuchMethodError: co.cask.tephra.TransactionManager.addListener(Lcom/google/common/util/concurrent/Service$Listener;Ljava/util/concurrent/Executor;)V
>          at co.cask.tephra.distributed.TransactionService$1.leader(TransactionService.java:83)
>          at org.apache.twill.internal.zookeeper.LeaderElection.becomeLeader(LeaderElection.java:229)
>          at org.apache.twill.internal.zookeeper.LeaderElection.access$1800(LeaderElection.java:53)
>          at org.apache.twill.internal.zookeeper.LeaderElection$5.onSuccess(LeaderElection.java:207)
>          at org.apache.twill.internal.zookeeper.LeaderElection$5.onSuccess(LeaderElection.java:186)
>          at com.google.common.util.concurrent.Futures$5.run(Futures.java:768)
>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>          at java.lang.Thread.run(Thread.java:745)
>
> This is the exact same problem I encountered and posted on the list about
> previously:
> https://mail-archives.apache.org/mod_mbox/phoenix-user/201603.mbox/%3C56FC727E.30906@gmail.com%3E
>
> It's puzzling that an identical image will behave differently on different
> systems. Does tephra use any kernel functionalities directly?
>
> Cheers,
> Francis
>
>
> On 28/09/2016 12:04 PM, Poorna Chandra wrote:
>
>> Hi Francis,
>>
>> Tephra startup script redirects all logs to a file by default. To help
>> debug such issues, you could update the Tephra startup script [1] to log
>> everything to stdout instead of a file when running inside docker. We are
>> also planning on adding a --foreground option to Tephra startup script, in
>> which case the logs will be written to stdout directly. This will help in
>> debugging in future.
>>
>> Thanks,
>> Poorna.
>>
>> [1] - https://github.com/apache/incubator-tephra/blob/master/bin/tephra#L175
>>
>>
>> On Tue, Sep 27, 2016 at 12:59 AM, F21 <f2...@gmail.com> wrote:
>>
>>> I was able to get it to run reliably with Phoenix 4.8.1-rc0. Another part
>>> of the equation was forcing travis to use their Ubuntu 14.04 environment
>>> rather than the default 12.04 environment. I am assuming 12.04 had an
>>> older kernel which prevented docker images from working correctly.

Re: Tephra unreliable in CI environments

Posted by F21 <f2...@gmail.com>.
Hi Poorna,

That would be very helpful! Unfortunately, I ran into the same issue 
where the image no longer works correctly in my dev environment, but works 
properly on travis.

I am now receiving this error:

java.lang.NoSuchMethodError: co.cask.tephra.TransactionManager.addListener(Lcom/google/common/util/concurrent/Service$Listener;Ljava/util/concurrent/Executor;)V
          at co.cask.tephra.distributed.TransactionService$1.leader(TransactionService.java:83)
          at org.apache.twill.internal.zookeeper.LeaderElection.becomeLeader(LeaderElection.java:229)
          at org.apache.twill.internal.zookeeper.LeaderElection.access$1800(LeaderElection.java:53)
          at org.apache.twill.internal.zookeeper.LeaderElection$5.onSuccess(LeaderElection.java:207)
          at org.apache.twill.internal.zookeeper.LeaderElection$5.onSuccess(LeaderElection.java:186)
          at com.google.common.util.concurrent.Futures$5.run(Futures.java:768)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:745)

This is exactly the same problem I encountered and posted about on the 
list previously: 
https://mail-archives.apache.org/mod_mbox/phoenix-user/201603.mbox/%3C56FC727E.30906@gmail.com%3E

It's puzzling that an identical image behaves differently on 
different systems. Does tephra use any kernel functionality directly?
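
A NoSuchMethodError generally means the JVM loaded a version of a class 
that differs from the one the caller was compiled against, so one thing 
worth checking is whether more than one jar in the image bundles the 
classes named in the trace. The trace also names co.cask.tephra classes 
while the earlier client error names org.apache.tephra, so it may be worth 
checking which Tephra jars are present as well. Below is a minimal sketch 
of such a check, not something taken from Tephra or Phoenix: the function 
name jars_containing is made up for illustration, and the directories to 
scan are an assumption that depends on how the image is laid out.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, lib_dirs):
    """Return, in scan order, the paths of all jars under `lib_dirs` that bundle `class_name`.

    `class_name` is a fully qualified Java class name; inner classes keep
    their `$`, e.g. "com.google.common.util.concurrent.Service$Listener".
    """
    # A class com.foo.Bar is stored in a jar as the entry com/foo/Bar.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for lib_dir in lib_dirs:
        for jar in sorted(Path(lib_dir).rglob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if entry in zf.namelist():
                        hits.append(str(jar))
            except zipfile.BadZipFile:
                # Skip files that are named *.jar but are not valid archives.
                continue
    return hits
```

For example, calling jars_containing("co.cask.tephra.TransactionManager", 
[some_lib_dir]) and jars_containing("com.google.common.util.concurrent.Service", 
[some_lib_dir]) inside the container, where some_lib_dir stands for 
wherever the HBase and Phoenix jars live, would list every jar that ships 
those classes. More than one hit for the same class would suggest a 
classpath ordering difference between environments rather than anything 
kernel related, though this is only a guess.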

Cheers,
Francis



Re: Tephra unreliable in CI environments

Posted by Poorna Chandra <po...@apache.org>.
Hi Francis,

The Tephra startup script redirects all logs to a file by default. To help
debug such issues, you could update the Tephra startup script [1] to log
everything to stdout instead of a file when running inside docker. We are
also planning to add a --foreground option to the Tephra startup script, in
which case the logs will be written to stdout directly. This will help with
debugging in the future.
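
If editing the startup script is not convenient, a different approach
(not the one described above) is to leave the file logging alone and run a
small helper in the container that relays the log file to stdout, so that
the CI system captures it. The following is only a sketch under that
assumption: the helper names are made up, the log path has to be whatever
the startup script actually writes to, and log rotation or truncation is
not handled.

```python
import sys
import time


def relay_new_lines(path, offset, out=sys.stdout):
    """Copy whatever was appended to `path` since byte `offset` to `out`.

    Returns the new offset so the caller can poll again later.
    """
    try:
        with open(path, "rb") as f:
            f.seek(offset)
            chunk = f.read()
    except FileNotFoundError:
        # The service may not have created its log file yet.
        return offset
    if chunk:
        out.write(chunk.decode("utf-8", errors="replace"))
        out.flush()
    return offset + len(chunk)


def follow(path, poll_seconds=1.0):
    """Relay the log forever; meant to run as a background helper in the container."""
    offset = 0
    while True:
        offset = relay_new_lines(path, offset)
        time.sleep(poll_seconds)
```

Calling follow() with the path of the Tephra log file from the container's
entrypoint would then make the service output visible in the docker logs
on wercker or travis without ssh access.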

Thanks,
Poorna.

[1] - https://github.com/apache/incubator-tephra/blob/master/bin/tephra#L175



Re: Tephra unreliable in CI environments

Posted by F21 <f2...@gmail.com>.
I was able to get it to run reliably with Phoenix 4.8.1-rc0. Another 
part of the equation was forcing travis to use their Ubuntu 14.04 
environment rather than the default 12.04 environment. I am assuming 
12.04 had an older kernel which prevented docker images from working 
correctly.
