You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Pablo Estrada <pa...@google.com> on 2019/07/31 01:13:14 UTC

[DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Hello all,
I found some test utilities that we use to write unit tests for transforms
that read/write to/from BigQuery. These are all the
non-(*IT.java/*Test.java) classes in [1].

I believe that users may want to write tests for their own pipelines that
may rely on complex DynamicDestination logic (imagine streaming, or side
inputs for on-the-fly schema computation, or other tricky issues).

I think it makes sense to move these classes to
org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
Thoughts?

-P.

[1]
https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Reuven Lax <re...@google.com>.
The only concern is that we somewhat regularly have changed those classes,
and I think we didn't want to have to support them as a stable interface.

On Thu, Aug 1, 2019 at 5:23 PM Pablo Estrada <pa...@google.com> wrote:

> I've started making the change in https://github.com/apache/beam/pull/9206,
> but I'm having some trouble with Api Surface tests. Some odd NPE.
> I'll try to debug further and see.
> Best
> -P.
>
> On Wed, Jul 31, 2019 at 7:46 PM Kenneth Knowles <ke...@apache.org> wrote:
>
>> Publishing the "-tests" jar does not work for this purpose. The "test"
>> classifier means "these are tests". The classifier does not mean "this is
>> test related stuff". This is because "test" scope does not have transitive
>> dependencies resolved in the same way and does not work for shipping a
>> library to be used, at least with maven. (technically, you can make up any
>> classifier and it is the scoping of the deps that breaks things). To
>> publish test-related code for re-use as a library, you put it in the main/
>> section of an artifact, which you could name for testing if you like.
>>
>> The reason I know this in such detail is that Beam has done it the wrong
>> way multiple times.
>>
>> Kenn
>>
>> On Wed, Jul 31, 2019 at 1:12 AM Ryan Skraba <ry...@skraba.com> wrote:
>>
>>> Hello!  No objection to the move :/  But what do you think about
>>> publishing the test jar created in google-cloud-platform to be reused
>>> without moving the code to the main artifact jar?
>>>
>>> I admit that I'm familiar with this technique with maven, and not at
>>> all with gradle, but it's described here:
>>> https://maven.apache.org/guides/mini/guide-attached-tests.html  It
>>> relies on the classifier to distinguish between test classes and main
>>> classes.  Is this possible and/or easy with gradle?
>>>
>>> I've used this in the past to create "re-usable test artifacts" that
>>> remain independent from "real" code but tightly associated with the
>>> main artifact.
>>>
>>> I noticed that elasticsearch publishes main artifacts with `-test` in
>>> the artifact name as an alternative strategy (i.e. a separate
>>> project).
>>>
>>> All my best, Ryan
>>>
>>> On Wed, Jul 31, 2019 at 5:55 AM Mikhail Gryzykhin
>>> <gr...@gmail.com> wrote:
>>> >
>>> > +1
>>> > It is completely worth it.
>>> >
>>> > On Tue, Jul 30, 2019 at 8:50 PM Rui Wang <ru...@google.com> wrote:
>>> >>
>>> >> +1.
>>> >>
>>> >> I did something similar before: move TestBoundedTable to BeamSQL main
>>> to allow another module tests use it.
>>> >>
>>> >>
>>> >> -Rui
>>> >>
>>> >> On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com>
>>> wrote:
>>> >>>
>>> >>> Hello all,
>>> >>> I found some test utilities that we use to write unit tests for
>>> transforms that read/write to/from BigQuery. These are all the
>>> non-(*IT.java/*Test.java) classes in [1].
>>> >>>
>>> >>> I believe that users may want to write tests for their own pipelines
>>> that may rely on complex DynamicDestination logic (imagine streaming, or
>>> side inputs for on-the-fly schema computation, or other tricky issues).
>>> >>>
>>> >>> I think it makes sense to move these classes to
>>> org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
>>> Thoughts?
>>> >>>
>>> >>> -P.
>>> >>>
>>> >>> [1]
>>> https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery
>>>
>>

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Pablo Estrada <pa...@google.com>.
I've started making the change in https://github.com/apache/beam/pull/9206,
but I'm having some trouble with Api Surface tests. Some odd NPE.
I'll try to debug further and see.
Best
-P.

On Wed, Jul 31, 2019 at 7:46 PM Kenneth Knowles <ke...@apache.org> wrote:

> Publishing the "-tests" jar does not work for this purpose. The "test"
> classifier means "these are tests". The classifier does not mean "this is
> test related stuff". This is because "test" scope does not have transitive
> dependencies resolved in the same way and does not work for shipping a
> library to be used, at least with maven. (technically, you can make up any
> classifier and it is the scoping of the deps that breaks things). To
> publish test-related code for re-use as a library, you put it in the main/
> section of an artifact, which you could name for testing if you like.
>
> The reason I know this in such detail is that Beam has done it the wrong
> way multiple times.
>
> Kenn
>
> On Wed, Jul 31, 2019 at 1:12 AM Ryan Skraba <ry...@skraba.com> wrote:
>
>> Hello!  No objection to the move :/  But what do you think about
>> publishing the test jar created in google-cloud-platform to be reused
>> without moving the code to the main artifact jar?
>>
>> I admit that I'm familiar with this technique with maven, and not at
>> all with gradle, but it's described here:
>> https://maven.apache.org/guides/mini/guide-attached-tests.html  It
>> relies on the classifier to distinguish between test classes and main
>> classes.  Is this possible and/or easy with gradle?
>>
>> I've used this in the past to create "re-usable test artifacts" that
>> remain independent from "real" code but tightly associated with the
>> main artifact.
>>
>> I noticed that elasticsearch publishes main artifacts with `-test` in
>> the artifact name as an alternative strategy (i.e. a separate
>> project).
>>
>> All my best, Ryan
>>
>> On Wed, Jul 31, 2019 at 5:55 AM Mikhail Gryzykhin
>> <gr...@gmail.com> wrote:
>> >
>> > +1
>> > It is completely worth it.
>> >
>> > On Tue, Jul 30, 2019 at 8:50 PM Rui Wang <ru...@google.com> wrote:
>> >>
>> >> +1.
>> >>
>> >> I did something similar before: move TestBoundedTable to BeamSQL main
>> to allow another module tests use it.
>> >>
>> >>
>> >> -Rui
>> >>
>> >> On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com>
>> wrote:
>> >>>
>> >>> Hello all,
>> >>> I found some test utilities that we use to write unit tests for
>> transforms that read/write to/from BigQuery. These are all the
>> non-(*IT.java/*Test.java) classes in [1].
>> >>>
>> >>> I believe that users may want to write tests for their own pipelines
>> that may rely on complex DynamicDestination logic (imagine streaming, or
>> side inputs for on-the-fly schema computation, or other tricky issues).
>> >>>
>> >>> I think it makes sense to move these classes to
>> org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
>> Thoughts?
>> >>>
>> >>> -P.
>> >>>
>> >>> [1]
>> https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery
>>
>

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Kenneth Knowles <ke...@apache.org>.
Publishing the "-tests" jar does not work for this purpose. The "test"
classifier means "these are tests". The classifier does not mean "this is
test related stuff". This is because "test" scope does not have transitive
dependencies resolved in the same way and does not work for shipping a
library to be used, at least with maven. (technically, you can make up any
classifier and it is the scoping of the deps that breaks things). To
publish test-related code for re-use as a library, you put it in the main/
section of an artifact, which you could name for testing if you like.

The reason I know this in such detail is that Beam has done it the wrong
way multiple times.

Kenn

On Wed, Jul 31, 2019 at 1:12 AM Ryan Skraba <ry...@skraba.com> wrote:

> Hello!  No objection to the move :/  But what do you think about
> publishing the test jar created in google-cloud-platform to be reused
> without moving the code to the main artifact jar?
>
> I admit that I'm familiar with this technique with maven, and not at
> all with gradle, but it's described here:
> https://maven.apache.org/guides/mini/guide-attached-tests.html  It
> relies on the classifier to distinguish between test classes and main
> classes.  Is this possible and/or easy with gradle?
>
> I've used this in the past to create "re-usable test artifacts" that
> remain independent from "real" code but tightly associated with the
> main artifact.
>
> I noticed that elasticsearch publishes main artifacts with `-test` in
> the artifact name as an alternative strategy (i.e. a separate
> project).
>
> All my best, Ryan
>
> On Wed, Jul 31, 2019 at 5:55 AM Mikhail Gryzykhin
> <gr...@gmail.com> wrote:
> >
> > +1
> > It is completely worth it.
> >
> > On Tue, Jul 30, 2019 at 8:50 PM Rui Wang <ru...@google.com> wrote:
> >>
> >> +1.
> >>
> >> I did something similar before: move TestBoundedTable to BeamSQL main
> to allow another module tests use it.
> >>
> >>
> >> -Rui
> >>
> >> On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com>
> wrote:
> >>>
> >>> Hello all,
> >>> I found some test utilities that we use to write unit tests for
> transforms that read/write to/from BigQuery. These are all the
> non-(*IT.java/*Test.java) classes in [1].
> >>>
> >>> I believe that users may want to write tests for their own pipelines
> that may rely on complex DynamicDestination logic (imagine streaming, or
> side inputs for on-the-fly schema computation, or other tricky issues).
> >>>
> >>> I think it makes sense to move these classes to
> org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
> Thoughts?
> >>>
> >>> -P.
> >>>
> >>> [1]
> https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery
>

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Ryan Skraba <ry...@skraba.com>.
Hello!  No objection to the move :/  But what do you think about
publishing the test jar created in google-cloud-platform to be reused
without moving the code to the main artifact jar?

I admit that I'm familiar with this technique with maven, and not at
all with gradle, but it's described here:
https://maven.apache.org/guides/mini/guide-attached-tests.html  It
relies on the classifier to distinguish between test classes and main
classes.  Is this possible and/or easy with gradle?

I've used this in the past to create "re-usable test artifacts" that
remain independent from "real" code but tightly associated with the
main artifact.

I noticed that elasticsearch publishes main artifacts with `-test` in
the artifact name as an alternative strategy (i.e. a separate
project).

All my best, Ryan

On Wed, Jul 31, 2019 at 5:55 AM Mikhail Gryzykhin
<gr...@gmail.com> wrote:
>
> +1
> It is completely worth it.
>
> On Tue, Jul 30, 2019 at 8:50 PM Rui Wang <ru...@google.com> wrote:
>>
>> +1.
>>
>> I did something similar before: move TestBoundedTable to BeamSQL main to allow another module tests use it.
>>
>>
>> -Rui
>>
>> On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com> wrote:
>>>
>>> Hello all,
>>> I found some test utilities that we use to write unit tests for transforms that read/write to/from BigQuery. These are all the non-(*IT.java/*Test.java) classes in [1].
>>>
>>> I believe that users may want to write tests for their own pipelines that may rely on complex DynamicDestination logic (imagine streaming, or side inputs for on-the-fly schema computation, or other tricky issues).
>>>
>>> I think it makes sense to move these classes to org.apache.beam.io.gcp.bigquery.testing, and publish them in the release. Thoughts?
>>>
>>> -P.
>>>
>>> [1] https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Mikhail Gryzykhin <gr...@gmail.com>.
+1
It is completely worth it.

On Tue, Jul 30, 2019 at 8:50 PM Rui Wang <ru...@google.com> wrote:

> +1.
>
> I did something similar before: move TestBoundedTable to BeamSQL main to
> allow another module tests use it.
>
>
> -Rui
>
> On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com> wrote:
>
>> Hello all,
>> I found some test utilities that we use to write unit tests for
>> transforms that read/write to/from BigQuery. These are all the
>> non-(*IT.java/*Test.java) classes in [1].
>>
>> I believe that users may want to write tests for their own pipelines that
>> may rely on complex DynamicDestination logic (imagine streaming, or side
>> inputs for on-the-fly schema computation, or other tricky issues).
>>
>> I think it makes sense to move these classes to
>> org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
>> Thoughts?
>>
>> -P.
>>
>> [1]
>> https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery
>>
>

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

Posted by Rui Wang <ru...@google.com>.
+1.

I did something similar before: move TestBoundedTable to BeamSQL main to
allow another module tests use it.


-Rui

On Tue, Jul 30, 2019 at 6:13 PM Pablo Estrada <pa...@google.com> wrote:

> Hello all,
> I found some test utilities that we use to write unit tests for transforms
> that read/write to/from BigQuery. These are all the
> non-(*IT.java/*Test.java) classes in [1].
>
> I believe that users may want to write tests for their own pipelines that
> may rely on complex DynamicDestination logic (imagine streaming, or side
> inputs for on-the-fly schema computation, or other tricky issues).
>
> I think it makes sense to move these classes to
> org.apache.beam.io.gcp.bigquery.testing, and publish them in the release.
> Thoughts?
>
> -P.
>
> [1]
> https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery
>