You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Udi Meiri <eh...@google.com> on 2019/11/07 02:12:21 UTC

Cython unit test suites running without Cythonized sources

I opened this bug today after commenting
<https://github.com/apache/beam/pull/9056#discussion_r343260172> on Chad's
type hints PR.
https://issues.apache.org/jira/browse/BEAM-8572?filter=-1

I am 95% sure that our Precommit tests are using tarballs that are built
without Cython (including the Cython tasks).

I'm NOT currently working on fixing this. One solution might be to add an
additional task (sdistCython) and tell gradle that sdist and the new task
should not run concurrently.
Does anyone want to take this up?

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
>
> IIUC, isolated_build=True and the removal of setup.py invocation in the
> current virtualenv should eliminate any Cython output files in the repo,
> and no need for run_tox_cleanup.sh?
>

Correct, that script is deleted in this commit:
https://github.com/apache/beam/pull/10038/commits/c6dab09abf9f4091f0dbf7eac2964c5be0665763

Re: Cython unit test suites running without Cythonized sources

Posted by Udi Meiri <eh...@google.com>.
The `changedir = {envsitepackagesdir}` setting is definitely something I
haven't thought of.
It solves the shadowing issue without needing to split tests and packages
from one another. (though I still think it's unnecessary to include tests
in the published package)

IIUC, isolated_build=True and the removal of setup.py invocation in the
current virtualenv should eliminate any Cython output files in the repo,
and no need for run_tox_cleanup.sh?


On Wed, Dec 11, 2019 at 9:38 AM Chad Dombrova <ch...@gmail.com> wrote:

> Hi Udi,
>
>> Sorry I didn't realize you already had a solution for the shadowing issue
>> and BEAM-8572.
>>
>
> No worries at all.  I haven't had much time to invest into that PR lately
> (most of it I did at home on my own time), but I did get past most of the
> major issues.  You've been working on so many of the same problems I was
> trying to solve there, and so far you've been coming to the same
> conclusions independently (e.g. removing pytest-runner and
> setup_requires).   It's great to have that validation, and it's helped
> reduce the scope of my PR.  Moving forward, I would love to team up on
> this.  Happy to answer any questions you have about the approach I took.
>
> -chad
>
>
>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
Hi Udi,

> Sorry I didn't realize you already had a solution for the shadowing issue
> and BEAM-8572.
>

No worries at all.  I haven't had much time to invest into that PR lately
(most of it I did at home on my own time), but I did get past most of the
major issues.  You've been working on so many of the same problems I was
trying to solve there, and so far you've been coming to the same
conclusions independently (e.g. removing pytest-runner and
setup_requires).   It's great to have that validation, and it's helped
reduce the scope of my PR.  Moving forward, I would love to team up on
this.  Happy to answer any questions you have about the approach I took.

-chad

Re: Cython unit test suites running without Cythonized sources

Posted by Udi Meiri <eh...@google.com>.
Sorry I didn't realize you already had a solution for the shadowing issue
and BEAM-8572.

On Tue, Dec 10, 2019 at 6:21 PM Chad Dombrova <ch...@gmail.com> wrote:

> Hi Udi, I know you're aware of my PR
> <https://github.com/apache/beam/pull/10038>, but I really encourage you
> to look into pep517 and pep518.  They are the new solution for all of this
> -- declaring build dependencies and creating isolated out-of-source builds
> e.g. using tox.  Another thing I added to my PR is something which asserts
> that cython is *or is not* enabled based on an env var set in the tox
> config.  In other words, we're currently skipping tests when cython is not
> installed or when code is not cythonized, but we're not asserting whether
> we *expect* that code to be cythonized or not.  Also there are a few bugs
> where cython tests don't run at all because we're identifying the wrong
> module name to check for cythonization ("apache_beam.coders" which is a
> package).
>
> -chad
>
>
> On Tue, Dec 10, 2019 at 6:03 PM Udi Meiri <eh...@google.com> wrote:
>
>> To follow up, since I'm trying to run cython-based tests using pytest:
>> - tox does in fact correctly install apache-beam with cythonized modules
>> in its virtualenv.
>> - Since our tests are under apache_beam/, local sources shadow those in
>> the installed apache_beam package.
>> - The original issue I raised (BEAM-8572
>> <https://issues.apache.org/jira/browse/BEAM-8572>) was due to bad usage
>> of utils.check_compiled().
>>
>> So I'm going to add an additional "python setup.py build_ext --inplace"
>> to pyXX-cython-pytest envs.
>> If we want to remove this additional step we'll probably have to move
>> tests under a separate directory that is not part of the apache_beam
>> package.
>>
>> See also:
>> https://github.com/tox-dev/tox/issues/514
>> https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
>>
>> On Mon, Nov 11, 2019 at 3:21 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you for spending time on this to clarify it for all of us! Much
>>> appreciated.
>>>
>>> On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova <ch...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>>
>>>>> The sdist step creates a package that should be installed into each
>>>>> tox environment. If the tox environment has cython when this apache
>>>>> beam package is installed, it should be used. Nose (or whatever)
>>>>> should then run the tests.
>>>>>
>>>> I spent some time this weekend trying to understand the Beam python
>>>> build process, and here’s an overview of what I’ve learned:
>>>>
>>>>    - the :sdks:python:sdist gradle task creates the source tarball (no
>>>>    surprises there)
>>>>       - the protobuf stubs are generated during this process
>>>>    - the sdist is provided to tox, which installs it into the the
>>>>    virtualenv for that task
>>>>    - for *-cython tasks, tox installs the cython dep and, as Ahmet
>>>>    asserted, python setup.py nosetests performs the cythonization.
>>>>       - this cythonized build overrides the one installed by tox
>>>>
>>>> Here’s what I learned about the current status of tests wrt cython:
>>>>
>>>>    - cython tox tasks *are* using cython (good!)
>>>>    - non-cython tox tasks *are not* using cython (good!)
>>>>    - none of the GCP or integration tests are using cython (bad?)
>>>>
>>>> This is intentional with the recent change to drop base only tests.
>>> Otherwise we would not have coverage for non-cythonized tests.
>>>
>>>>
>>>>    - This is because the build is only cythonized when python setup.py
>>>>       nosetests is used in conjunction with tox (tox installs cython, python
>>>>       setup.py nosetests compiles it).
>>>>       - GCP tests don't install cython.  ITs don't use tox.
>>>>
>>>> To confirm my understanding of this, I created a PR [1] to assert that
>>>> a cythonized or pure-python build is being used.  A cythonized build is
>>>> expected by default on linux systems unless a special flag is provided to
>>>> inform the test otherwise.  It appears as though the portable tests passed
>>>> (i.e. used cython), but I forgot to add the assertion for those; like the
>>>> other ITs they are not using cython.
>>>>
>>>> *Questions:*
>>>>
>>>>    - Is the lack of cython for ITs expected and/or desired?
>>>>    - Why aren't ITs using tox?  It's quite possible to pass arguments
>>>>    into tox to control it's behavior.  For example, it seems reasonable that
>>>>    run_integration_test.sh could be inside tox
>>>>
>>>>
>>> For ITs the primary execution will happen on a remote system. As long as
>>> those remote workers have cython installed the tests will run in a
>>> cythonized environment. This is true for any IT tests, that runs in a
>>> container based on the Dockerfile we have in beam (which installs cython),
>>> and also true for legacy Dataflow environments that are not yet using this
>>> Beam provided Dockerfile.
>>>
>>>
>>>>
>>>> *Next Steps:*There has been some movement in the python community to
>>>> solve problems around build dependencies [2] and toolchains [3].  I hope to
>>>> have a proposal for how to simplify this process soon.
>>>>
>>> Thank you!
>>>
>>> [1] https://github.com/apache/beam/pull/10058
>>>> [2] https://www.python.org/dev/peps/pep-0517/
>>>> [3] https://www.python.org/dev/peps/pep-0518/
>>>>
>>>> -chad
>>>>
>>>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
Hi Udi, I know you're aware of my PR
<https://github.com/apache/beam/pull/10038>, but I really encourage you to
look into pep517 and pep518.  They are the new solution for all of this --
declaring build dependencies and creating isolated out-of-source builds
e.g. using tox.  Another thing I added to my PR is something which asserts
that cython is *or is not* enabled based on an env var set in the tox
config.  In other words, we're currently skipping tests when cython is not
installed or when code is not cythonized, but we're not asserting whether
we *expect* that code to be cythonized or not.  Also there are a few bugs
where cython tests don't run at all because we're identifying the wrong
module name to check for cythonization ("apache_beam.coders" which is a
package).

-chad


On Tue, Dec 10, 2019 at 6:03 PM Udi Meiri <eh...@google.com> wrote:

> To follow up, since I'm trying to run cython-based tests using pytest:
> - tox does in fact correctly install apache-beam with cythonized modules
> in its virtualenv.
> - Since our tests are under apache_beam/, local sources shadow those in
> the installed apache_beam package.
> - The original issue I raised (BEAM-8572
> <https://issues.apache.org/jira/browse/BEAM-8572>) was due to bad usage
> of utils.check_compiled().
>
> So I'm going to add an additional "python setup.py build_ext --inplace" to
> pyXX-cython-pytest envs.
> If we want to remove this additional step we'll probably have to move
> tests under a separate directory that is not part of the apache_beam
> package.
>
> See also:
> https://github.com/tox-dev/tox/issues/514
> https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
>
> On Mon, Nov 11, 2019 at 3:21 PM Ahmet Altay <al...@google.com> wrote:
>
>> Thank you for spending time on this to clarify it for all of us! Much
>> appreciated.
>>
>> On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova <ch...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>>
>>>> The sdist step creates a package that should be installed into each
>>>> tox environment. If the tox environment has cython when this apache
>>>> beam package is installed, it should be used. Nose (or whatever)
>>>> should then run the tests.
>>>>
>>> I spent some time this weekend trying to understand the Beam python
>>> build process, and here’s an overview of what I’ve learned:
>>>
>>>    - the :sdks:python:sdist gradle task creates the source tarball (no
>>>    surprises there)
>>>       - the protobuf stubs are generated during this process
>>>    - the sdist is provided to tox, which installs it into the the
>>>    virtualenv for that task
>>>    - for *-cython tasks, tox installs the cython dep and, as Ahmet
>>>    asserted, python setup.py nosetests performs the cythonization.
>>>       - this cythonized build overrides the one installed by tox
>>>
>>> Here’s what I learned about the current status of tests wrt cython:
>>>
>>>    - cython tox tasks *are* using cython (good!)
>>>    - non-cython tox tasks *are not* using cython (good!)
>>>    - none of the GCP or integration tests are using cython (bad?)
>>>
>>> This is intentional with the recent change to drop base only tests.
>> Otherwise we would not have coverage for non-cythonized tests.
>>
>>>
>>>    - This is because the build is only cythonized when python setup.py
>>>       nosetests is used in conjunction with tox (tox installs cython, python
>>>       setup.py nosetests compiles it).
>>>       - GCP tests don't install cython.  ITs don't use tox.
>>>
>>> To confirm my understanding of this, I created a PR [1] to assert that a
>>> cythonized or pure-python build is being used.  A cythonized build is
>>> expected by default on linux systems unless a special flag is provided to
>>> inform the test otherwise.  It appears as though the portable tests passed
>>> (i.e. used cython), but I forgot to add the assertion for those; like the
>>> other ITs they are not using cython.
>>>
>>> *Questions:*
>>>
>>>    - Is the lack of cython for ITs expected and/or desired?
>>>    - Why aren't ITs using tox?  It's quite possible to pass arguments
>>>    into tox to control it's behavior.  For example, it seems reasonable that
>>>    run_integration_test.sh could be inside tox
>>>
>>>
>> For ITs the primary execution will happen on a remote system. As long as
>> those remote workers have cython installed the tests will run in a
>> cythonized environment. This is true for any IT tests, that runs in a
>> container based on the Dockerfile we have in beam (which installs cython),
>> and also true for legacy Dataflow environments that are not yet using this
>> Beam provided Dockerfile.
>>
>>
>>>
>>> *Next Steps:*There has been some movement in the python community to
>>> solve problems around build dependencies [2] and toolchains [3].  I hope to
>>> have a proposal for how to simplify this process soon.
>>>
>> Thank you!
>>
>> [1] https://github.com/apache/beam/pull/10058
>>> [2] https://www.python.org/dev/peps/pep-0517/
>>> [3] https://www.python.org/dev/peps/pep-0518/
>>>
>>> -chad
>>>
>>

Re: Cython unit test suites running without Cythonized sources

Posted by Udi Meiri <eh...@google.com>.
To follow up, since I'm trying to run cython-based tests using pytest:
- tox does in fact correctly install apache-beam with cythonized modules in
its virtualenv.
- Since our tests are under apache_beam/, local sources shadow those in the
installed apache_beam package.
- The original issue I raised (BEAM-8572
<https://issues.apache.org/jira/browse/BEAM-8572>) was due to bad usage
of utils.check_compiled().

So I'm going to add an additional "python setup.py build_ext --inplace" to
pyXX-cython-pytest envs.
If we want to remove this additional step we'll probably have to move tests
under a separate directory that is not part of the apache_beam package.

See also:
https://github.com/tox-dev/tox/issues/514
https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure

On Mon, Nov 11, 2019 at 3:21 PM Ahmet Altay <al...@google.com> wrote:

> Thank you for spending time on this to clarify it for all of us! Much
> appreciated.
>
> On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova <ch...@gmail.com> wrote:
>
>> Hi all,
>>
>>
>>> The sdist step creates a package that should be installed into each
>>> tox environment. If the tox environment has cython when this apache
>>> beam package is installed, it should be used. Nose (or whatever)
>>> should then run the tests.
>>>
>> I spent some time this weekend trying to understand the Beam python build
>> process, and here’s an overview of what I’ve learned:
>>
>>    - the :sdks:python:sdist gradle task creates the source tarball (no
>>    surprises there)
>>       - the protobuf stubs are generated during this process
>>    - the sdist is provided to tox, which installs it into the the
>>    virtualenv for that task
>>    - for *-cython tasks, tox installs the cython dep and, as Ahmet
>>    asserted, python setup.py nosetests performs the cythonization.
>>       - this cythonized build overrides the one installed by tox
>>
>> Here’s what I learned about the current status of tests wrt cython:
>>
>>    - cython tox tasks *are* using cython (good!)
>>    - non-cython tox tasks *are not* using cython (good!)
>>    - none of the GCP or integration tests are using cython (bad?)
>>
>> This is intentional with the recent change to drop base only tests.
> Otherwise we would not have coverage for non-cythonized tests.
>
>>
>>    - This is because the build is only cythonized when python setup.py
>>       nosetests is used in conjunction with tox (tox installs cython, python
>>       setup.py nosetests compiles it).
>>       - GCP tests don't install cython.  ITs don't use tox.
>>
>> To confirm my understanding of this, I created a PR [1] to assert that a
>> cythonized or pure-python build is being used.  A cythonized build is
>> expected by default on linux systems unless a special flag is provided to
>> inform the test otherwise.  It appears as though the portable tests passed
>> (i.e. used cython), but I forgot to add the assertion for those; like the
>> other ITs they are not using cython.
>>
>> *Questions:*
>>
>>    - Is the lack of cython for ITs expected and/or desired?
>>    - Why aren't ITs using tox?  It's quite possible to pass arguments
>>    into tox to control it's behavior.  For example, it seems reasonable that
>>    run_integration_test.sh could be inside tox
>>
>>
> For ITs the primary execution will happen on a remote system. As long as
> those remote workers have cython installed the tests will run in a
> cythonized environment. This is true for any IT tests, that runs in a
> container based on the Dockerfile we have in beam (which installs cython),
> and also true for legacy Dataflow environments that are not yet using this
> Beam provided Dockerfile.
>
>
>>
>> *Next Steps:*There has been some movement in the python community to
>> solve problems around build dependencies [2] and toolchains [3].  I hope to
>> have a proposal for how to simplify this process soon.
>>
> Thank you!
>
> [1] https://github.com/apache/beam/pull/10058
>> [2] https://www.python.org/dev/peps/pep-0517/
>> [3] https://www.python.org/dev/peps/pep-0518/
>>
>> -chad
>>
>

Re: Cython unit test suites running without Cythonized sources

Posted by Ahmet Altay <al...@google.com>.
Thank you for spending time on this to clarify it for all of us! Much
appreciated.

On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova <ch...@gmail.com> wrote:

> Hi all,
>
>
>> The sdist step creates a package that should be installed into each
>> tox environment. If the tox environment has cython when this apache
>> beam package is installed, it should be used. Nose (or whatever)
>> should then run the tests.
>>
> I spent some time this weekend trying to understand the Beam python build
> process, and here’s an overview of what I’ve learned:
>
>    - the :sdks:python:sdist gradle task creates the source tarball (no
>    surprises there)
>       - the protobuf stubs are generated during this process
>    - the sdist is provided to tox, which installs it into the the
>    virtualenv for that task
>    - for *-cython tasks, tox installs the cython dep and, as Ahmet
>    asserted, python setup.py nosetests performs the cythonization.
>       - this cythonized build overrides the one installed by tox
>
> Here’s what I learned about the current status of tests wrt cython:
>
>    - cython tox tasks *are* using cython (good!)
>    - non-cython tox tasks *are not* using cython (good!)
>    - none of the GCP or integration tests are using cython (bad?)
>
> This is intentional with the recent change to drop base only tests.
Otherwise we would not have coverage for non-cythonized tests.

>
>    - This is because the build is only cythonized when python setup.py
>       nosetests is used in conjunction with tox (tox installs cython, python
>       setup.py nosetests compiles it).
>       - GCP tests don't install cython.  ITs don't use tox.
>
> To confirm my understanding of this, I created a PR [1] to assert that a
> cythonized or pure-python build is being used.  A cythonized build is
> expected by default on linux systems unless a special flag is provided to
> inform the test otherwise.  It appears as though the portable tests passed
> (i.e. used cython), but I forgot to add the assertion for those; like the
> other ITs they are not using cython.
>
> *Questions:*
>
>    - Is the lack of cython for ITs expected and/or desired?
>    - Why aren't ITs using tox?  It's quite possible to pass arguments
>    into tox to control it's behavior.  For example, it seems reasonable that
>    run_integration_test.sh could be inside tox
>
>
For ITs the primary execution will happen on a remote system. As long as
those remote workers have cython installed the tests will run in a
cythonized environment. This is true for any IT tests, that runs in a
container based on the Dockerfile we have in beam (which installs cython),
and also true for legacy Dataflow environments that are not yet using this
Beam provided Dockerfile.


>
> *Next Steps:*There has been some movement in the python community to
> solve problems around build dependencies [2] and toolchains [3].  I hope to
> have a proposal for how to simplify this process soon.
>
Thank you!

[1] https://github.com/apache/beam/pull/10058
> [2] https://www.python.org/dev/peps/pep-0517/
> [3] https://www.python.org/dev/peps/pep-0518/
>
> -chad
>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
Hi all,


> The sdist step creates a package that should be installed into each
> tox environment. If the tox environment has cython when this apache
> beam package is installed, it should be used. Nose (or whatever)
> should then run the tests.
>
I spent some time this weekend trying to understand the Beam python build
process, and here’s an overview of what I’ve learned:

   - the :sdks:python:sdist gradle task creates the source tarball (no
   surprises there)
      - the protobuf stubs are generated during this process
   - the sdist is provided to tox, which installs it into the the
   virtualenv for that task
   - for *-cython tasks, tox installs the cython dep and, as Ahmet
   asserted, python setup.py nosetests performs the cythonization.
      - this cythonized build overrides the one installed by tox

Here’s what I learned about the current status of tests wrt cython:

   - cython tox tasks *are* using cython (good!)
   - non-cython tox tasks *are not* using cython (good!)
   - none of the GCP or integration tests are using cython (bad?)
      - This is because the build is only cythonized when python setup.py
      nosetests is used in conjunction with tox (tox installs cython, python
      setup.py nosetests compiles it).
      - GCP tests don't install cython.  ITs don't use tox.

To confirm my understanding of this, I created a PR [1] to assert that a
cythonized or pure-python build is being used.  A cythonized build is
expected by default on linux systems unless a special flag is provided to
inform the test otherwise.  It appears as though the portable tests passed
(i.e. used cython), but I forgot to add the assertion for those; like the
other ITs they are not using cython.

*Questions:*

   - Is the lack of cython for ITs expected and/or desired?
   - Why aren't ITs using tox?  It's quite possible to pass arguments into
   tox to control it's behavior.  For example, it seems reasonable that
   run_integration_test.sh could be inside tox


*Next Steps:*There has been some movement in the python community to solve
problems around build dependencies [2] and toolchains [3].  I hope to have
a proposal for how to simplify this process soon.

[1] https://github.com/apache/beam/pull/10058
[2] https://www.python.org/dev/peps/pep-0517/
[3] https://www.python.org/dev/peps/pep-0518/

-chad

Re: Cython unit test suites running without Cythonized sources

Posted by Robert Bradshaw <ro...@google.com>.
On Thu, Nov 7, 2019 at 6:25 PM Chad Dombrova <ch...@gmail.com> wrote:
>
> Hi,
> Answers inline below,
>
>>> It's unclear from the nose source[1] whether it's calling build_py and build_ext, or just build_ext.  It's also unclear whether the result of that build is actually used.  When python setup.py nosetests runs, it runs inside of a virtualenv created by tox, and tox has already installed beam into that venv.  It seems unlikely to me that build_ext or build_py is going to install over top of the beam package installed by tox, but who knows, it may end up first on the path.  Also recall that there is an sdist gradle task running before tox that creates a tarball that is passed to run_tox.sh which passes it along to tox --installpkg flag[2] so that tox won't build beam itself.
>>
>>
>> I believe the build step is executed as expected and during installation results in cythonized package to be installed. This could be verified by, in a new virtual environment creating a source distribution, installing cython, then installing the source distribution. Resulting installation does have the .so files. This is done before running nosetests.
>
>
> Even if it *is* working, I think it's a pretty poor design that we build it once in sdist and then rebuild it again with nose.  It's very obfuscated and brittle, hence we're still debating the probable outcome.  We should choose one place to build and that should either be the sdist gradle task or tox, not the test command.

While "building" is a bit of an odd concept in Python, these steps
have different roles.

The sdist step creates a package that should be installed into each
tox environment. If the tox environment has cython when this apache
beam package is installed, it should be used. Nose (or whatever)
should then run the tests.

I agree this could be cleaned up. I personally don't gradle to execute
any of this stuff so don't remember how it's set up.

>>> We should designate a single place that always does the build.  I thought that was supposed to be the gradle sdist task, but invoking nose via `python setup.py` means that we're introducing the possibility that a build is occurring which may or may not be used, depending on the entirely unclear dependencies of the setup commands, and the entirely unclear relationship of that build output to the tox venv.  As a first step of clarity, we could stop invoking nose using `python setup.py nosetests`, and instead use `nosetests` (and in the future `pytest`).  I think the reason for `python setup.py nosetest` is to ensure the test requirements are installed,
>>
>>
>> I believe the reason we are invokign nosetest this way is related to the beam testing plugin. It is configure in setup.py. The behavior is documented here: https://nose.readthedocs.io/en/latest/api/commands.html
>
>
> It is possible to register a custom plugin without using setup.py: https://nose.readthedocs.io/en/latest/plugins/writing.html#registering-a-plugin-without-setuptools
>
> Since we're on the verge of switching to pytest, perhaps we should investigate switching that over to to not use setup.py instead of chasing our tails with nose.
>
>>>
>>> but we could shift those to a separate requirements file or use extra_requires and tox can ensure that they're installed.  I find these two options to be pretty common patterns [3].
>>
>>
>> We do use extras is tox already. GCP tests work this way by installing additional GCP package. In my opinion, letting tox to setup the virtual environment either from the package or from setup.py is a better option than using requirements file. Otherwise we would need a way to keep setup.py and requirements file in sync.
>
>
> Oh yeah, I see that the tests already are an extra package.  Well, that'll make it that much easier to stop using `python setup.py nosetests`.
>
> -chad
>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
Hi,
Answers inline below,

It's unclear from the nose source[1] whether it's calling build_py
>> and build_ext, or just build_ext.  It's also unclear whether the result of
>> that build is actually used.  When python setup.py nosetests runs, it runs
>> inside of a virtualenv created by tox, and tox has already installed beam
>> into that venv.  It seems unlikely to me that build_ext or build_py is
>> going to install over top of the beam package installed by tox, but who
>> knows, it may end up first on the path.  Also recall that there is an sdist
>> gradle task running before tox that creates a tarball that is passed to
>> run_tox.sh which passes it along to tox --installpkg flag[2] so that tox
>> won't build beam itself.
>>
>
> I believe the build step is executed as expected and during installation
> results in cythonized package to be installed. This could be verified by,
> in a new virtual environment creating a source distribution, installing
> cython, then installing the source distribution. Resulting installation
> does have the .so files. This is done before running nosetests.
>

Even if it *is* working, I think it's a pretty poor design that we build it
once in sdist and then rebuild it again with nose.  It's very obfuscated
and brittle, hence we're still debating the probable outcome.  We should
choose one place to build and that should either be the sdist gradle task
or tox, not the test command.


> We should designate a single place that always does the build.  I thought
>> that was supposed to be the gradle sdist task, but invoking nose via
>> `python setup.py` means that we're introducing the possibility that a build
>> is occurring which may or may not be used, depending on the entirely
>> unclear dependencies of the setup commands, and the entirely unclear
>> relationship of that build output to the tox venv.  As a first step of
>> clarity, we could stop invoking nose using `python setup.py nosetests`, and
>> instead use `nosetests` (and in the future `pytest`).  I think the reason
>> for `python setup.py nosetest` is to ensure the test requirements are
>> installed,
>>
>
> I believe the reason we are invokign nosetest this way is related to the
> beam testing plugin. It is configure in setup.py. The behavior is
> documented here: https://nose.readthedocs.io/en/latest/api/commands.html
>

It is possible to register a custom plugin without using setup.py:
https://nose.readthedocs.io/en/latest/plugins/writing.html#registering-a-plugin-without-setuptools

Since we're on the verge of switching to pytest, perhaps we should
investigate switching that over to to not use setup.py instead of chasing
our tails with nose.


> but we could shift those to a separate requirements file or use
>> extra_requires and tox can ensure that they're installed.  I find these two
>> options to be pretty common patterns [3].
>>
>
> We do use extras is tox already. GCP tests work this way by installing
> additional GCP package. In my opinion, letting tox to setup the virtual
> environment either from the package or from setup.py is a better option
> than using requirements file. Otherwise we would need a way to keep
> setup.py and requirements file in sync.
>

Oh yeah, I see that the tests already are an extra package.  Well, that'll
make it that much easier to stop using `python setup.py nosetests`.

-chad

Re: Cython unit test suites running without Cythonized sources

Posted by Ahmet Altay <al...@google.com>.
On Thu, Nov 7, 2019 at 1:37 PM Chad Dombrova <ch...@gmail.com> wrote:

>
> On Thu, Nov 7, 2019 at 11:31 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> Does python setup.py nosetests invoke build_ext (or, more generally,
>> build)?
>
>
> It's unclear from the nose source[1] whether it's calling build_py
> and build_ext, or just build_ext.  It's also unclear whether the result of
> that build is actually used.  When python setup.py nosetests runs, it runs
> inside of a virtualenv created by tox, and tox has already installed beam
> into that venv.  It seems unlikely to me that build_ext or build_py is
> going to install over top of the beam package installed by tox, but who
> knows, it may end up first on the path.  Also recall that there is an sdist
> gradle task running before tox that creates a tarball that is passed to
> run_tox.sh which passes it along to tox --installpkg flag[2] so that tox
> won't build beam itself.
>

I believe the build step is executed as expected and during installation
results in cythonized package to be installed. This could be verified by,
in a new virtual environment creating a source distribution, installing
cython, then installing the source distribution. Resulting installation
does have the .so files. This is done before running nosetests.


> We should designate a single place that always does the build.  I thought
> that was supposed to be the gradle sdist task, but invoking nose via
> `python setup.py` means that we're introducing the possibility that a build
> is occurring which may or may not be used, depending on the entirely
> unclear dependencies of the setup commands, and the entirely unclear
> relationship of that build output to the tox venv.  As a first step of
> clarity, we could stop invoking nose using `python setup.py nosetests`, and
> instead use `nosetests` (and in the future `pytest`).  I think the reason
> for `python setup.py nosetest` is to ensure the test requirements are
> installed,
>

I believe the reason we are invokign nosetest this way is related to the
beam testing plugin. It is configure in setup.py. The behavior is
documented here: https://nose.readthedocs.io/en/latest/api/commands.html


> but we could shift those to a separate requirements file or use
> extra_requires and tox can ensure that they're installed.  I find these two
> options to be pretty common patterns [3].
>

We do use extras is tox already. GCP tests work this way by installing
additional GCP package. In my opinion, letting tox to setup the virtual
environment either from the package or from setup.py is a better option
than using requirements file. Otherwise we would need a way to keep
setup.py and requirements file in sync. This is also doable.


>
> [1] https://github.com/nose-devs/nose/blob/master/nose/commands.py#L113
> [2]
> https://tox.readthedocs.io/en/latest/config.html#cmdoption-tox-installpkg
> [3]
> https://stackoverflow.com/questions/29870629/pip-install-test-dependencies-for-tox-from-setup-py
>
>
>
>> It's possible cython is present, but the build step is not
>> invoked which would explain the skip for slow_coders_test. The correct
>> test is being used in
>>
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/fast_coders_test.py#L34
>>
>> On Thu, Nov 7, 2019 at 11:20 AM Ahmet Altay <al...@google.com> wrote:
>> >
>> > I believe tox is correctly installing cython and executes "python
>> setup.py nosetests" which triggers cythonzation path inside setup.py. Some
>> indications that cython is installed and used is the following log entries
>> (from a recent precommit cron job [1])
>> > - [ 1/12] Cythonizing apache_beam/coders/coder_impl.py
>> > - Errors with cython.cast in the stack traces.
>> > - Tests skipped with: test_using_slow_impl
>> (apache_beam.coders.slow_coders_test.SlowCoders) ... SKIP: Found cython,
>> cannot test non-compiled implementation.
>> >
>> > At the same time there are log entries as following:
>> > - test_using_fast_impl (apache_beam.coders.fast_coders_test.FastCoders)
>> ... SKIP: Cython is not installed
>> >
>> > It might be an issue with what these tests are suing to check whether
>> they are cythonized or not. We seem to have at least 2 different versions
>> of this check [2][3]. Maybe we need to converge on one (former?).
>> >
>> > Ahmet
>> >
>> > [1]
>> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PreCommit_Python_Cron/2008/consoleFull
>> > [2]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/slow_coders_test.py#L32
>> > [3]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/utils.py#L33
>> >
>> >
>> >
>> > On Wed, Nov 6, 2019 at 6:19 PM Chad Dombrova <ch...@gmail.com> wrote:
>> >>
>> >> Another potential solution would be to _not_ use the sdist task to
>> build the tarball and let tox do it.  Tox should install cython on
>> supported platforms before running sdist itself (which it does by default
>> unless you explicitly provide it with a tarball, which we are doing).  This
>> has the added benefit of one less virtualenv.  Right now we create a
>> virtualenv to build the sdist tarball, then we create another virtualenv to
>> run tox, then tox creates a virtualenv to run the task.   It's unclear (to
>> me) whether the tarball is rebuilt for each tox task or if it's reused.
>> >
>> >
>> > I do not know if not passing the tarball will solve the issue. I tried
>> this and ran into the same problem.
>> >
>> > I agree that we can get rid of setup virtualenv task if it is not
>> adding value.
>> >
>> >>
>> >>
>> >> -chad
>> >>
>> >>
>> >> On Wed, Nov 6, 2019 at 6:13 PM Udi Meiri <eh...@google.com> wrote:
>> >>>
>> >>> I opened this bug today after commenting on Chad's type hints PR.
>> >>> https://issues.apache.org/jira/browse/BEAM-8572?filter=-1
>> >
>> >
>> > Thank you for filing an issue.
>> >
>> >>>
>> >>>
>> >>>
>> >>> I am 95% sure that our Precommit tests are using tarballs that are
>> built without Cython (including the Cython tasks).
>> >>>
>> >>> I'm NOT currently working on fixing this. One solution might be to
>> add an additional task (sdistCython) and tell gradle that sdist and the new
>> task should not run concurrently.
>> >>> Does anyone want to take this up?
>>
>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
On Thu, Nov 7, 2019 at 11:31 AM Robert Bradshaw <ro...@google.com> wrote:

> Does python setup.py nosetests invoke build_ext (or, more generally,
> build)?


It's unclear from the nose source[1] whether it's calling build_py
and build_ext, or just build_ext.  It's also unclear whether the result of
that build is actually used.  When python setup.py nosetests runs, it runs
inside of a virtualenv created by tox, and tox has already installed beam
into that venv.  It seems unlikely to me that build_ext or build_py is
going to install over top of the beam package installed by tox, but who
knows, it may end up first on the path.  Also recall that there is an sdist
gradle task running before tox that creates a tarball that is passed to
run_tox.sh which passes it along to tox --installpkg flag[2] so that tox
won't build beam itself.

We should designate a single place that always does the build.  I thought
that was supposed to be the gradle sdist task, but invoking nose via
`python setup.py` means that we're introducing the possibility that a build
is occurring which may or may not be used, depending on the entirely
unclear dependencies of the setup commands, and the entirely unclear
relationship of that build output to the tox venv.  As a first step of
clarity, we could stop invoking nose using `python setup.py nosetests`, and
instead use `nosetests` (and in the future `pytest`).  I think the reason
for `python setup.py nosetest` is to ensure the test requirements are
installed, but we could shift those to a separate requirements file or use
extra_requires and tox can ensure that they're installed.  I find these two
options to be pretty common patterns [3].

[1] https://github.com/nose-devs/nose/blob/master/nose/commands.py#L113
[2]
https://tox.readthedocs.io/en/latest/config.html#cmdoption-tox-installpkg
[3]
https://stackoverflow.com/questions/29870629/pip-install-test-dependencies-for-tox-from-setup-py



> It's possible cython is present, but the build step is not
> invoked which would explain the skip for slow_coders_test. The correct
> test is being used in
>
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/fast_coders_test.py#L34
>
> On Thu, Nov 7, 2019 at 11:20 AM Ahmet Altay <al...@google.com> wrote:
> >
> > I believe tox is correctly installing cython and executes "python
> setup.py nosetests" which triggers cythonzation path inside setup.py. Some
> indications that cython is installed and used is the following log entries
> (from a recent precommit cron job [1])
> > - [ 1/12] Cythonizing apache_beam/coders/coder_impl.py
> > - Errors with cython.cast in the stack traces.
> > - Tests skipped with: test_using_slow_impl
> (apache_beam.coders.slow_coders_test.SlowCoders) ... SKIP: Found cython,
> cannot test non-compiled implementation.
> >
> > At the same time there are log entries as following:
> > - test_using_fast_impl (apache_beam.coders.fast_coders_test.FastCoders)
> ... SKIP: Cython is not installed
> >
> > It might be an issue with what these tests are suing to check whether
> they are cythonized or not. We seem to have at least 2 different versions
> of this check [2][3]. Maybe we need to converge on one (former?).
> >
> > Ahmet
> >
> > [1]
> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PreCommit_Python_Cron/2008/consoleFull
> > [2]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/slow_coders_test.py#L32
> > [3]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/utils.py#L33
> >
> >
> >
> > On Wed, Nov 6, 2019 at 6:19 PM Chad Dombrova <ch...@gmail.com> wrote:
> >>
> >> Another potential solution would be to _not_ use the sdist task to
> build the tarball and let tox do it.  Tox should install cython on
> supported platforms before running sdist itself (which it does by default
> unless you explicitly provide it with a tarball, which we are doing).  This
> has the added benefit of one less virtualenv.  Right now we create a
> virtualenv to build the sdist tarball, then we create another virtualenv to
> run tox, then tox creates a virtualenv to run the task.   It's unclear (to
> me) whether the tarball is rebuilt for each tox task or if it's reused.
> >
> >
> > I do not know if not passing the tarball will solve the issue. I tried
> this and ran into the same problem.
> >
> > I agree that we can get rid of setup virtualenv task if it is not adding
> value.
> >
> >>
> >>
> >> -chad
> >>
> >>
> >> On Wed, Nov 6, 2019 at 6:13 PM Udi Meiri <eh...@google.com> wrote:
> >>>
> >>> I opened this bug today after commenting on Chad's type hints PR.
> >>> https://issues.apache.org/jira/browse/BEAM-8572?filter=-1
> >
> >
> > Thank you for filing an issue.
> >
> >>>
> >>>
> >>>
> >>> I am 95% sure that our Precommit tests are using tarballs that are
> built without Cython (including the Cython tasks).
> >>>
> >>> I'm NOT currently working on fixing this. One solution might be to add
> an additional task (sdistCython) and tell gradle that sdist and the new
> task should not run concurrently.
> >>> Does anyone want to take this up?
>

Re: Cython unit test suites running without Cythonized sources

Posted by Robert Bradshaw <ro...@google.com>.
Does python setup.py nosetests invoke build_ext (or, more generally,
build)? It's possible cython is present, but the build step is not
invoked which would explain the skip for slow_coders_test. The correct
test is being used in
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/fast_coders_test.py#L34

On Thu, Nov 7, 2019 at 11:20 AM Ahmet Altay <al...@google.com> wrote:
>
> I believe tox is correctly installing cython and executes "python setup.py nosetests" which triggers cythonzation path inside setup.py. Some indications that cython is installed and used is the following log entries (from a recent precommit cron job [1])
> - [ 1/12] Cythonizing apache_beam/coders/coder_impl.py
> - Errors with cython.cast in the stack traces.
> - Tests skipped with: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders) ... SKIP: Found cython, cannot test non-compiled implementation.
>
> At the same time there are log entries as following:
> - test_using_fast_impl (apache_beam.coders.fast_coders_test.FastCoders) ... SKIP: Cython is not installed
>
> It might be an issue with what these tests are suing to check whether they are cythonized or not. We seem to have at least 2 different versions of this check [2][3]. Maybe we need to converge on one (former?).
>
> Ahmet
>
> [1] https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PreCommit_Python_Cron/2008/consoleFull
> [2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/slow_coders_test.py#L32
> [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/utils.py#L33
>
>
>
> On Wed, Nov 6, 2019 at 6:19 PM Chad Dombrova <ch...@gmail.com> wrote:
>>
>> Another potential solution would be to _not_ use the sdist task to build the tarball and let tox do it.  Tox should install cython on supported platforms before running sdist itself (which it does by default unless you explicitly provide it with a tarball, which we are doing).  This has the added benefit of one less virtualenv.  Right now we create a virtualenv to build the sdist tarball, then we create another virtualenv to run tox, then tox creates a virtualenv to run the task.   It's unclear (to me) whether the tarball is rebuilt for each tox task or if it's reused.
>
>
> I do not know if not passing the tarball will solve the issue. I tried this and ran into the same problem.
>
> I agree that we can get rid of setup virtualenv task if it is not adding value.
>
>>
>>
>> -chad
>>
>>
>> On Wed, Nov 6, 2019 at 6:13 PM Udi Meiri <eh...@google.com> wrote:
>>>
>>> I opened this bug today after commenting on Chad's type hints PR.
>>> https://issues.apache.org/jira/browse/BEAM-8572?filter=-1
>
>
> Thank you for filing an issue.
>
>>>
>>>
>>>
>>> I am 95% sure that our Precommit tests are using tarballs that are built without Cython (including the Cython tasks).
>>>
>>> I'm NOT currently working on fixing this. One solution might be to add an additional task (sdistCython) and tell gradle that sdist and the new task should not run concurrently.
>>> Does anyone want to take this up?

Re: Cython unit test suites running without Cythonized sources

Posted by Ahmet Altay <al...@google.com>.
I believe tox is correctly installing cython and executes "python setup.py
nosetests" which triggers cythonzation path inside setup.py. Some
indications that cython is installed and used is the following log entries
(from a recent precommit cron job [1])
- [ 1/12] Cythonizing apache_beam/coders/coder_impl.py
- Errors with cython.cast in the stack traces.
- Tests skipped with: test_using_slow_impl
(apache_beam.coders.slow_coders_test.SlowCoders) ... SKIP: Found cython,
cannot test non-compiled implementation.

At the same time there are log entries as following:
- test_using_fast_impl (apache_beam.coders.fast_coders_test.FastCoders) ...
SKIP: Cython is not installed

It might be an issue with what these tests are suing to check whether they
are cythonized or not. We seem to have at least 2 different versions of
this check [2][3]. Maybe we need to converge on one (former?).

Ahmet

[1]
https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PreCommit_Python_Cron/2008/consoleFull
[2]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/slow_coders_test.py#L32
[3]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/utils.py#L33



On Wed, Nov 6, 2019 at 6:19 PM Chad Dombrova <ch...@gmail.com> wrote:

> Another potential solution would be to _not_ use the sdist task to build
> the tarball and let tox do it.  Tox should install cython on supported
> platforms before running sdist itself (which it does by default unless you
> explicitly provide it with a tarball, which we are doing).  This has the
> added benefit of one less virtualenv.  Right now we create a virtualenv to
> build the sdist tarball, then we create another virtualenv to run tox, then
> tox creates a virtualenv to run the task.   It's unclear (to me) whether
> the tarball is rebuilt for each tox task or if it's reused.
>

I do not know if not passing the tarball will solve the issue. I tried this
and ran into the same problem.

I agree that we can get rid of setup virtualenv task if it is not adding
value.


>
> -chad
>
>
> On Wed, Nov 6, 2019 at 6:13 PM Udi Meiri <eh...@google.com> wrote:
>
>> I opened this bug today after commenting
>> <https://github.com/apache/beam/pull/9056#discussion_r343260172> on
>> Chad's type hints PR.
>> https://issues.apache.org/jira/browse/BEAM-8572?filter=-1
>>
>
Thank you for filing an issue.


>
>>
>> I am 95% sure that our Precommit tests are using tarballs that are built
>> without Cython (including the Cython tasks).
>>
>> I'm NOT currently working on fixing this. One solution might be to add an
>> additional task (sdistCython) and tell gradle that sdist and the new task
>> should not run concurrently.
>> Does anyone want to take this up?
>>
>

Re: Cython unit test suites running without Cythonized sources

Posted by Chad Dombrova <ch...@gmail.com>.
Another potential solution would be to _not_ use the sdist task to build
the tarball and let tox do it.  Tox should install cython on supported
platforms before running sdist itself (which it does by default unless you
explicitly provide it with a tarball, which we are doing).  This has the
added benefit of one less virtualenv.  Right now we create a virtualenv to
build the sdist tarball, then we create another virtualenv to run tox, then
tox creates a virtualenv to run the task.   It's unclear (to me) whether
the tarball is rebuilt for each tox task or if it's reused.

-chad


On Wed, Nov 6, 2019 at 6:13 PM Udi Meiri <eh...@google.com> wrote:

> I opened this bug today after commenting
> <https://github.com/apache/beam/pull/9056#discussion_r343260172> on
> Chad's type hints PR.
> https://issues.apache.org/jira/browse/BEAM-8572?filter=-1
>
> I am 95% sure that our Precommit tests are using tarballs that are built
> without Cython (including the Cython tasks).
>
> I'm NOT currently working on fixing this. One solution might be to add an
> additional task (sdistCython) and tell gradle that sdist and the new task
> should not run concurrently.
> Does anyone want to take this up?
>