Posted to dev@apex.apache.org by Priyanka Gugale <pr...@apache.org> on 2015/12/18 13:33:10 UTC

Running CassandraOutputOperator with newer library

Hi,

I am facing classpath issues when I try to use a newer version (e.g. 2.1.x or
3.0.x) of the Cassandra library with CassandraOutputOperator.

From the DataStax documentation, it looks like we need to include a few more
libraries to set up the Java environment (reference:
http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/settingUpJavaProgEnv_r.html
).
When I included the required libraries, my ClassNotFound issue was resolved,
but then I started getting missing-method errors.
It looks like the Apex platform also uses the Guava library, and its version is
lower than 14.0.1. I couldn't find where it comes from, though. (Note: I was
just trying to run unit tests here, not an application.)

Has anyone faced this issue before?

-Priyanka

Re: Running CassandraOutputOperator with newer library

Posted by Priyanka Gugale <pr...@datatorrent.com>.
Tim,

Could you run the tests in Malhar by just changing the protocol version to 2.0
and using Cassandra driver library 2.1.8?

-Priyanka

On Sat, Dec 19, 2015 at 12:01 PM, Priyanka Gugale <pr...@apache.org> wrote:

> Hi,
>
> Thanks for the inputs. I will try out the version change and also check
> whether I can do anything about the dependencies.
>
> Thomas,
> I am not sure where the conflict is either, but judging from the error there
> is a conflict somewhere. I will follow Gaurav's and Tim's suggestions and
> try to figure out what exactly is causing it.
>
> Thanks,
> Priyanka
>

Re: Running CassandraOutputOperator with newer library

Posted by Priyanka Gugale <pr...@apache.org>.
Hi,

Thanks for the inputs. I will try out the version change and also check
whether I can do anything about the dependencies.

Thomas,
I am not sure where the conflict is either, but judging from the error there
is a conflict somewhere. I will follow Gaurav's and Tim's suggestions and
try to figure out what exactly is causing it.

Thanks,
Priyanka


On Sat, Dec 19, 2015 at 12:54 AM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> Priyanka,
>
> Guava is coming from Hadoop. Execute "mvn dependency:tree" under apex/api
> and you should see something like this:
>
>  org.apache.apex:apex-api:jar:3.3.0-incubating-SNAPSHOT
> [INFO] +- org.apache.hadoop:hadoop-common:jar:2.2.0:compile
> [INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:2.2.0:compile
> [INFO] |  +- com.google.guava:guava:jar:11.0.2:compile
> [INFO] |  |  \- com.google.code.findbugs:jsr305:jar:1.3.9:compile
>
> Thanks
> - Gaurav

Re: Running CassandraOutputOperator with newer library

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Priyanka,

Guava is coming from Hadoop. Execute "mvn dependency:tree" under apex/api and you should see something like this:

 org.apache.apex:apex-api:jar:3.3.0-incubating-SNAPSHOT
[INFO] +- org.apache.hadoop:hadoop-common:jar:2.2.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:2.2.0:compile
[INFO] |  +- com.google.guava:guava:jar:11.0.2:compile
[INFO] |  |  \- com.google.code.findbugs:jsr305:jar:1.3.9:compile
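
As a complement to the dependency tree, a runtime check can show which jar a
class was actually loaded from when the tests run. The following is only a
minimal sketch using standard JDK calls; the class name JarLocator and its
method locationOf are made up for illustration and are not part of Apex or
Malhar.

```java
import java.security.CodeSource;

final class JarLocator {
    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader may not report a location.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. a Guava class such as
        // com.google.common.base.Optional, to see which jar wins on the class path.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + locationOf(Class.forName(name)));
    }
}
```

Run with the same class path as the failing unit tests, this would print the
location of whichever Guava jar is picked up first.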

Thanks
- Gaurav

> On Dec 18, 2015, at 7:20 AM, Thomas Weise <th...@datatorrent.com> wrote:
> 
> It isn't clear where the conflict is, so shading may or may not be a
> solution.
> 
> If your app with the Cassandra operator needs a newer version of Guava,
> include that newer version in the app package. The container's class path
> will have the dependencies packaged with the app before anything from Hadoop.
> 
> If the newer version of Guava isn't backward compatible, then things could
> get complicated. If the conflict is between the Hadoop stuff and your app
> (including Cassandra), shading won't help because those are dependencies
> that we use as is. In this case, it may be necessary to run the Cassandra
> code in a fully isolated class loader. If the conflict is between Apex code
> and the app, then shading may help.
> 
> Thomas


Re: Running CassandraOutputOperator with newer library

Posted by Thomas Weise <th...@datatorrent.com>.
It isn't clear where the conflict is, so shading may or may not be a
solution.

If your app with the Cassandra operator needs a newer version of Guava,
include that newer version in the app package. The container's class path
will have the dependencies packaged with the app before anything from Hadoop.

If the newer version of Guava isn't backward compatible, then things could
get complicated. If the conflict is between the Hadoop stuff and your app
(including Cassandra), shading won't help because those are dependencies
that we use as is. In this case, it may be necessary to run the Cassandra
code in a fully isolated class loader. If the conflict is between Apex code
and the app, then shading may help.
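
To illustrate what a fully isolated class loader means here, the following is
a minimal sketch using only the JDK. It is not how Apex loads operators; the
names IsolatedLoaderSketch, isolatedLoader and canLoad are hypothetical. A
URLClassLoader created with a null parent sees only the jars it is given plus
the JDK's own classes, so a Guava copy on the application class path would be
invisible to code loaded through it.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class IsolatedLoaderSketch {
    /** Builds a loader that sees only the given jars plus JDK bootstrap classes. */
    static URLClassLoader isolatedLoader(URL... jars) {
        // A null parent cuts the loader off from the application class path.
        return new URLClassLoader(jars, null);
    }

    /** Reports whether the loader can resolve the named class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // No jars are passed here; a real setup would pass the Cassandra driver
        // jar and the Guava jar it needs.
        try (URLClassLoader loader = isolatedLoader()) {
            System.out.println("JDK class visible: " + canLoad(loader, "java.lang.String"));
            System.out.println("App class visible: "
                + canLoad(loader, IsolatedLoaderSketch.class.getName()));
        }
    }
}
```

Objects created through such a loader can only be shared with the rest of the
app via types that both sides load from a common parent, which is part of why
this route gets complicated.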

Thomas






On Fri, Dec 18, 2015 at 5:43 AM, Timothy Farkas <ti...@datatorrent.com> wrote:

> Hi Priyanka,
>
> I haven't seen this issue yet, but I am not using the newest version of
> the Cassandra driver. This is the version I am using:
>
>
>         <dependency>
>           <groupId>com.datastax.cassandra</groupId>
>           <artifactId>cassandra-driver-core</artifactId>
>           <version>2.1.8</version>
>         </dependency>
>
> I did run into an issue where I was getting strange Cassandra exceptions,
> so I set the protocol to version 2.0 in the CassandraStore class like this
> and the issue was resolved:
>
>
>       cluster = Cluster.builder()
>           .addContactPoint(node)
>           .withProtocolVersion(ProtocolVersion.V2)
>           .withCredentials(userName, password)
>           .build();
>
> If you are still seeing missing-method errors caused by differing Guava
> versions, it may be necessary to shade the version of Guava used by Apex.
> Shading renames all the packages in a library by prepending a prefix, so that
> the classes in the Guava version used by Apex would look like this:
> org.apache.apex.com.google.guavapackage.GuavaClass. This prevents version
> clashes, and Flink uses the technique for that reason. Perhaps we
> can restart that discussion in a separate thread.
>
> A temporary workaround could be to exclude Guava from the Apex dependencies
> in your pom; that way the newer version of Guava will be used.
>
> Thanks,
> Tim

Re: Running CassandraOutputOperator with newer library

Posted by Timothy Farkas <ti...@datatorrent.com>.
Hi Priyanka,

I haven't seen this issue yet, but I am not using the newest version of
the Cassandra driver. This is the version I am using:


        <dependency>
          <groupId>com.datastax.cassandra</groupId>
          <artifactId>cassandra-driver-core</artifactId>
          <version>2.1.8</version>
        </dependency>

I did run into an issue where I was getting strange Cassandra exceptions,
so I set the protocol to version 2.0 in the CassandraStore class like this
and the issue was resolved:


      cluster = Cluster.builder()
          .addContactPoint(node)
          .withProtocolVersion(ProtocolVersion.V2)
          .withCredentials(userName, password)
          .build();

If you are still seeing missing-method errors caused by differing Guava
versions, it may be necessary to shade the version of Guava used by Apex.
Shading renames all the packages in a library by prepending a prefix, so that
the classes in the Guava version used by Apex would look like this:
org.apache.apex.com.google.guavapackage.GuavaClass. This prevents version
clashes, and Flink uses the technique for that reason. Perhaps we
can restart that discussion in a separate thread.
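
The renaming that shading applies to class names can be sketched in a few
lines of Java. This only illustrates the naming rule; it is not a shading tool.
The class RelocationSketch, the method relocate and the prefix
org.apache.apex.shaded are assumptions made up for this example, while
com.google.common is Guava's real root package.

```java
final class RelocationSketch {
    /** Rewrites a class name the way a package relocation would: swap the matching package prefix. */
    static String relocate(String className, String pattern, String shadedPattern) {
        if (className.equals(pattern) || className.startsWith(pattern + ".")) {
            return shadedPattern + className.substring(pattern.length());
        }
        // Classes outside the relocated package keep their original name.
        return className;
    }

    public static void main(String[] args) {
        // Hypothetical shaded prefix; the thread does not settle on an actual name.
        String shaded = "org.apache.apex.shaded.com.google.common";
        System.out.println(relocate("com.google.common.base.Optional", "com.google.common", shaded));
        System.out.println(relocate("com.datastax.driver.core.Cluster", "com.google.common", shaded));
    }
}
```

Because the relocated Guava classes live under a different package name, they
can no longer collide with whichever Guava version the app or the Cassandra
driver brings along.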

A temporary workaround could be to exclude Guava from the Apex dependencies
in your pom; that way the newer version of Guava will be used.

Thanks,
Tim

On Fri, Dec 18, 2015 at 4:33 AM, Priyanka Gugale <pr...@apache.org> wrote:

> Hi,
>
> I am facing classpath issues when I try to use a newer version (e.g. 2.1.x or
> 3.0.x) of the Cassandra library with CassandraOutputOperator.
>
> From the DataStax documentation, it looks like we need to include a few more
> libraries to set up the Java environment (reference:
>
> http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/settingUpJavaProgEnv_r.html
> ).
> When I included the required libraries, my ClassNotFound issue was resolved,
> but then I started getting missing-method errors.
> It looks like the Apex platform also uses the Guava library, and its version is
> lower than 14.0.1. I couldn't find where it comes from, though. (Note: I was
> just trying to run unit tests here, not an application.)
>
> Has anyone faced this issue before?
>
> -Priyanka
>