Posted to dev@spark.apache.org by Luciano Resende <lu...@gmail.com> on 2015/10/21 22:16:33 UTC

Bringing up JDBC Tests to trunk

I have started looking into PR-8101 [1] and what is required to merge it
into trunk, which will also unblock me on SPARK-10521 [2].

So here is the minimal plan I was thinking about:

- pin the Docker image versions so we are sure we are always testing
against the same images
- pre-pull the required images on the Jenkins executors so tests are not
delayed or timed out waiting for Docker images to download (see the sketch
after this list)
- create a profile to run the JDBC tests
- create daily jobs for running the JDBC tests
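
Concretely, something along these lines on each executor (the image names,
tags, and profile name are placeholders for illustration, not final
choices):

    # Pre-pull pinned images so test runs never block on a download
    docker pull mysql:5.7.9
    docker pull postgres:9.4.5

    # Run the Docker JDBC tests only when the profile is enabled
    build/mvn -Pdocker-integration-tests test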


In parallel, I learned that Alan Chin from my team is working with the
AmpLab team to expand the build capacity for Spark, so I will use some of
the nodes he is preparing to test/run these builds for now.

Please let me know if there is anything else needed around this.


[1] https://github.com/apache/spark/pull/8101
[2] https://issues.apache.org/jira/browse/SPARK-10521

-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: Bringing up JDBC Tests to trunk

Posted by Josh Rosen <jo...@databricks.com>.
Can you write a script to download and install the JDBC driver to the local
Maven repository if it's not already present? If we had that, we could just
invoke it as part of dev/run-tests.
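
A minimal sketch of what I have in mind (the coordinates, version, and
download URL are placeholders; the real driver JAR has to come from IBM):

    #!/usr/bin/env bash
    # Install the DB2 JDBC driver into the local Maven repo if missing.
    VERSION="x.y.z"   # placeholder
    JAR="$HOME/.m2/repository/com/ibm/db2/db2jcc4/$VERSION/db2jcc4-$VERSION.jar"
    if [ ! -f "$JAR" ]; then
      curl -fLo /tmp/db2jcc4.jar "$DB2_DRIVER_URL"   # URL supplied by the caller
      mvn install:install-file -Dfile=/tmp/db2jcc4.jar \
        -DgroupId=com.ibm.db2 -DartifactId=db2jcc4 \
        -Dversion="$VERSION" -Dpackaging=jar
    fi

dev/run-tests could then invoke the script before the Docker JDBC suite
runs, and it would be a no-op on machines that already have the driver.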

On Thu, Dec 3, 2015 at 5:55 PM Luciano Resende <lu...@gmail.com> wrote:

>
>
> On Mon, Nov 30, 2015 at 1:53 PM, Josh Rosen <jo...@databricks.com>
> wrote:
>
>> The JDBC drivers are currently being pulled in as test-scope dependencies
>> of the `sql/core` module:
>> https://github.com/apache/spark/blob/f2fbfa444f6e8d27953ec2d1c0b3abd603c963f9/sql/core/pom.xml#L91
>>
>> In SBT, these wind up on the Docker JDBC tests' classpath as a transitive
>> dependency of the `spark-sql` test JAR. However, what we *should* be
>> doing is adding them as explicit test dependencies of the
>> `docker-integration-tests` subproject, since Maven handles transitive test
>> JAR dependencies differently than SBT (see
>> https://github.com/apache/spark/pull/9876#issuecomment-158593498 for
>> some discussion). If you choose to make that fix as part of your PR, be
>> sure to move the version handling to the root POM's <dependencyManagement>
>> section so that the versions in both modules stay in sync. We might also
>> be able to simply move the JDBC driver dependencies to
>> docker-integration-tests' POM if it turns out that they're not used
>> anywhere else (that's my hunch).
>>
>>
>
> So, the issue I am having now is that the DB2 JDBC driver is not available
> in any public Maven repository, so the plan I am going with is:
>
> - Before running the DB2 Docker tests, the client machine needs to
> download the JDBC driver and install it into its local Maven repository
> (or the SBT equivalent), with instructions provided in either the README
> or the POM file
>
> - We would need help with installing the DB2 JDBC driver on the Jenkins
> slave machines
>
> - We could also create a new profile for the DB2 Docker tests, so that
> these tests run only when that profile is enabled
>
> I could probably think of other options, but they would all sound pretty
> hacky...
>
> Thoughts? Suggestions?
>
> --
> Luciano Resende
> http://people.apache.org/~lresende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>

Re: Bringing up JDBC Tests to trunk

Posted by Luciano Resende <lu...@gmail.com>.
On Mon, Nov 30, 2015 at 1:53 PM, Josh Rosen <jo...@databricks.com>
wrote:

> The JDBC drivers are currently being pulled in as test-scope dependencies
> of the `sql/core` module:
> https://github.com/apache/spark/blob/f2fbfa444f6e8d27953ec2d1c0b3abd603c963f9/sql/core/pom.xml#L91
>
> In SBT, these wind up on the Docker JDBC tests' classpath as a transitive
> dependency of the `spark-sql` test JAR. However, what we *should* be
> doing is adding them as explicit test dependencies of the
> `docker-integration-tests` subproject, since Maven handles transitive test
> JAR dependencies differently than SBT (see
> https://github.com/apache/spark/pull/9876#issuecomment-158593498 for some
> discussion). If you choose to make that fix as part of your PR, be sure to
> move the version handling to the root POM's <dependencyManagement> section
> so that the versions in both modules stay in sync. We might also be able
> to simply move the JDBC driver dependencies to docker-integration-tests'
> POM if it turns out that they're not used anywhere else (that's my hunch).
>
>

So, the issue I am having now is that the DB2 JDBC driver is not available
in any public Maven repository, so the plan I am going with is:

- Before running the DB2 Docker tests, the client machine needs to download
the JDBC driver and install it into its local Maven repository (or the SBT
equivalent), with instructions provided in either the README or the POM
file (see the sketch after this list)

- We would need help with installing the DB2 JDBC driver on the Jenkins
slave machines

- We could also create a new profile for the DB2 Docker tests, so that
these tests run only when that profile is enabled
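
For the first item, the README instruction would boil down to a one-liner
along these lines (file name and coordinates are placeholders; the JAR
itself has to be downloaded from IBM first):

    mvn install:install-file -Dfile=db2jcc4.jar \
      -DgroupId=com.ibm.db2 -DartifactId=db2jcc4 \
      -Dversion=x.y.z -Dpackaging=jar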

I could probably think of other options, but they would all sound pretty
hacky...

Thoughts? Suggestions?

-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: Bringing up JDBC Tests to trunk

Posted by Jacek Laskowski <ja...@japila.pl>.
On Mon, Nov 30, 2015 at 10:53 PM, Josh Rosen <jo...@databricks.com> wrote:

> In SBT, these wind up on the Docker JDBC tests' classpath as a transitive
> dependency of the `spark-sql` test JAR. However, what we should be doing is
> adding them as explicit test dependencies of the `docker-integration-tests`
> subproject, since Maven handles transitive test JAR dependencies differently
> than SBT (see
> https://github.com/apache/spark/pull/9876#issuecomment-158593498 for some
> discussion). If you choose to make that fix as part of your PR, be sure to
> move the version handling to the root POM's <dependencyManagement> section
> so that the versions in both modules stay in sync. We might also be able
> to simply move the JDBC driver dependencies to docker-integration-tests'
> POM if it turns out that they're not used anywhere else (that's my hunch).

Hi Josh,

Could you elaborate a little more on what is actually required and how to
verify the requested changes (or at least what is currently failing, so
that once it no longer fails we know we have a solution)? The little magic
word "sbt" got me thinking I could help here and there. :-)

If there's a JIRA task for it, let me know. Thanks!

Jacek



Re: Bringing up JDBC Tests to trunk

Posted by Josh Rosen <jo...@databricks.com>.
The JDBC drivers are currently being pulled in as test-scope dependencies
of the `sql/core` module:
https://github.com/apache/spark/blob/f2fbfa444f6e8d27953ec2d1c0b3abd603c963f9/sql/core/pom.xml#L91

In SBT, these wind up on the Docker JDBC tests' classpath as a transitive
dependency of the `spark-sql` test JAR. However, what we *should* be doing
is adding them as explicit test dependencies of the
`docker-integration-tests` subproject, since Maven handles transitive test
JAR dependencies differently than SBT (see
https://github.com/apache/spark/pull/9876#issuecomment-158593498 for some
discussion). If you choose to make that fix as part of your PR, be sure to
move the version handling to the root POM's <dependencyManagement> section
so that the versions in both modules stay in sync. We might also be able
to simply move the JDBC driver dependencies to docker-integration-tests'
POM if it turns out that they're not used anywhere else (that's my hunch).
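
For illustration, roughly what I mean, using MySQL as the example (the
coordinates and version here are placeholders, not what we would
necessarily ship):

    <!-- root pom.xml: pin the driver version once -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>mysql</groupId>
          <artifactId>mysql-connector-java</artifactId>
          <version>5.1.34</version>
          <scope>test</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <!-- docker-integration-tests/pom.xml: version inherited from the root -->
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <scope>test</scope>
    </dependency>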

On Sun, Nov 22, 2015 at 6:49 PM, Luciano Resende <lu...@gmail.com>
wrote:

> Hey Josh,
>
> Thanks for helping bring this up. I have just pushed a WIP PR that gets
> the DB2 tests running on Docker, and I have a question about how the JDBC
> drivers are actually set up for the other data sources (MySQL and
> PostgreSQL): are they installed directly on the Jenkins slaves? I didn't
> see the JARs or anything specific in the POM or other files...
>
>
> Thanks
>
> On Wed, Oct 21, 2015 at 1:26 PM, Josh Rosen <ro...@gmail.com> wrote:
>
>> Hey Luciano,
>>
>> This sounds like a reasonable plan to me. One of my colleagues has
>> written some Dockerized MySQL testing utilities, so I'll take a peek at
>> those to see if there are any specifics of their solution that we should
>> adapt for Spark.
>>
>> On Wed, Oct 21, 2015 at 1:16 PM, Luciano Resende <lu...@gmail.com>
>> wrote:
>>
>>> I have started looking into PR-8101 [1] and what is required to merge it
>>> into trunk, which will also unblock me on SPARK-10521 [2].
>>>
>>> So here is the minimal plan I was thinking about:
>>>
>>> - pin the Docker image versions so we are sure we are always testing
>>> against the same images
>>> - pre-pull the required images on the Jenkins executors so tests are not
>>> delayed or timed out waiting for Docker images to download
>>> - create a profile to run the JDBC tests
>>> - create daily jobs for running the JDBC tests
>>>
>>>
>>> In parallel, I learned that Alan Chin from my team is working with the
>>> AmpLab team to expand the build capacity for Spark, so I will use some of
>>> the nodes he is preparing to test/run these builds for now.
>>>
>>> Please let me know if there is anything else needed around this.
>>>
>>>
>>> [1] https://github.com/apache/spark/pull/8101
>>> [2] https://issues.apache.org/jira/browse/SPARK-10521
>>>
>>> --
>>> Luciano Resende
>>> http://people.apache.org/~lresende
>>> http://twitter.com/lresende1975
>>> http://lresende.blogspot.com/
>>>
>>
>>
>
>
> --
> Luciano Resende
> http://people.apache.org/~lresende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>

Re: Bringing up JDBC Tests to trunk

Posted by Luciano Resende <lu...@gmail.com>.
Hey Josh,

Thanks for helping bring this up. I have just pushed a WIP PR that gets the
DB2 tests running on Docker, and I have a question about how the JDBC
drivers are actually set up for the other data sources (MySQL and
PostgreSQL): are they installed directly on the Jenkins slaves? I didn't
see the JARs or anything specific in the POM or other files...


Thanks

On Wed, Oct 21, 2015 at 1:26 PM, Josh Rosen <ro...@gmail.com> wrote:

> Hey Luciano,
>
> This sounds like a reasonable plan to me. One of my colleagues has written
> some Dockerized MySQL testing utilities, so I'll take a peek at those to
> see if there are any specifics of their solution that we should adapt for
> Spark.
>
> On Wed, Oct 21, 2015 at 1:16 PM, Luciano Resende <lu...@gmail.com>
> wrote:
>
>> I have started looking into PR-8101 [1] and what is required to merge it
>> into trunk, which will also unblock me on SPARK-10521 [2].
>>
>> So here is the minimal plan I was thinking about:
>>
>> - pin the Docker image versions so we are sure we are always testing
>> against the same images
>> - pre-pull the required images on the Jenkins executors so tests are not
>> delayed or timed out waiting for Docker images to download
>> - create a profile to run the JDBC tests
>> - create daily jobs for running the JDBC tests
>>
>>
>> In parallel, I learned that Alan Chin from my team is working with the
>> AmpLab team to expand the build capacity for Spark, so I will use some of
>> the nodes he is preparing to test/run these builds for now.
>>
>> Please let me know if there is anything else needed around this.
>>
>>
>> [1] https://github.com/apache/spark/pull/8101
>> [2] https://issues.apache.org/jira/browse/SPARK-10521
>>
>> --
>> Luciano Resende
>> http://people.apache.org/~lresende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
>>
>
>


-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: Bringing up JDBC Tests to trunk

Posted by Josh Rosen <ro...@gmail.com>.
Hey Luciano,

This sounds like a reasonable plan to me. One of my colleagues has written
some Dockerized MySQL testing utilities, so I'll take a peek at those to
see if there are any specifics of their solution that we should adapt for
Spark.

On Wed, Oct 21, 2015 at 1:16 PM, Luciano Resende <lu...@gmail.com>
wrote:

> I have started looking into PR-8101 [1] and what is required to merge it
> into trunk, which will also unblock me on SPARK-10521 [2].
>
> So here is the minimal plan I was thinking about:
>
> - pin the Docker image versions so we are sure we are always testing
> against the same images
> - pre-pull the required images on the Jenkins executors so tests are not
> delayed or timed out waiting for Docker images to download
> - create a profile to run the JDBC tests
> - create daily jobs for running the JDBC tests
>
>
> In parallel, I learned that Alan Chin from my team is working with the
> AmpLab team to expand the build capacity for Spark, so I will use some of
> the nodes he is preparing to test/run these builds for now.
>
> Please let me know if there is anything else needed around this.
>
>
> [1] https://github.com/apache/spark/pull/8101
> [2] https://issues.apache.org/jira/browse/SPARK-10521
>
> --
> Luciano Resende
> http://people.apache.org/~lresende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>