Posted to pylucene-dev@lucene.apache.org by Andi Vajda <va...@apache.org> on 2021/02/28 01:36:48 UTC

Re: jcc - Output unbuilt package

  Hi Phil,

On Sun, 28 Feb 2021, Phil wrote:

> I currently use jcc to wrap a Java library for use in Python - it works
> great.
>
> The project I'm working on is moving its package management from
> traditional pip installs to Guix:
> https://guix.gnu.org/
>
> Guix handles python packages pretty well, and I have jcc running nicely on
> there.
>
> The problem I have is that Guix expects as an input a Python source
> repo, but the jcc outputs a binary wheel.

I'm not sure what you mean by "binary wheel", I'm not familiar with that 
format. Yes, JCC's __main__.py documents that
  --bdist generate a binary distutils-based distribution
          or a setuptools-based .egg
  --wheel generate wheel using setuptools (requires wheel
          package: pip install wheel)
  --build generate the wrapper and compile it
  --compile recompile the (previously generated) module

I did not write the bdist nor the wheel support, they were contributed and I 
don't know that --wheel makes a binary wheel, specifically.
Note that you have binaries in whatever you distribute, if you consider that 
the JAR files or the .class files are binaries. They are required (.class 
files are) for JCC to operate as it uses reflection to do its job.

I see that --build or --compile causes JCC to ask python to compile the egg 
it produces. Without these flags, I think it'll just produce the .cpp files, 
and with --python, the python wrappers (also C++ code).

If you don't invoke JCC with --wheel, --bdist, --build or --compile, you get 
just source files (not counting .jar).

I just tried that on PyLucene (the project I originally wrote JCC for) and 
no compilation happens without (some of) these flags being set.
   On my Mac, with python3, the command line looks like:
$ python3 -m jcc --shared --arch x86_64
    --jar <repeated many times>
    --package <repeated many times> <individual classes as needed> etc...
    --module <repeated a few times> --mapping ... --sequence ...
    --exclude ...
    --resources ...
    --python lucene
    --version 8.6.1
    --files 10   (10 or 11 .cpp files are generated)
--> no binaries made
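
The rule described above (no compilation unless one of --wheel, --bdist, --build or --compile is passed) can be sketched as a small check. This is only an illustration of the statement in this message, not code taken from JCC; the helper name triggers_build is hypothetical, and the flag set is limited to the four flags named here.

```python
# Flags that, per this message, cause JCC to compile what it generates.
# Assumption: only the four flags named in the message are considered.
BUILD_FLAGS = {"--wheel", "--bdist", "--build", "--compile"}


def triggers_build(argv):
    """Return True if any argument is one of the build-triggering flags."""
    return any(arg in BUILD_FLAGS for arg in argv)


# Arguments modelled on the PyLucene command line shown above.
source_only = ["--shared", "--python", "lucene", "--version", "8.6.1",
               "--files", "10"]
print(triggers_build(source_only))
print(triggers_build(source_only + ["--wheel"]))
```

The first call reports that nothing would be compiled; appending --wheel flips the result.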

> What I'd like is for jcc to put together a source package, complete with
> jars, C/C++, and python wrapper and a setup.py that is called to
> generate the wheel.  But to stop short of generating the wheel for me.
> I've had a look at the generated build directory - I could see the C/C++
> source and jars, but there didn't seem to be a setup.py to trigger an install?

The installation of the python extension built by JCC is triggered by 
passing --install to JCC's invocation.

Maybe what you actually want is to implement 'sdist' support for JCC ?
(again, not familiar with wheels, so I may not be making sense here).

> Such a directory could then be fed into Guix, which would happily build 
> the package and install it using the standard setup.py provided.

As long as GUIX knows how to drive a C++ compiler and linker, build python 
extensions (and knows how to build the libjcc shared library), you should 
be fine.

> As a crude workaround I can try to unzip the contents of the produced 
> wheel and stick the contents into a repo, with a new setup.py, and 
> MANIFEST.in file that would simply copy across the jars and the previously 
> built C/C++ library.  However having the setup.py building the library and 
> installing it is a more elegant solution.
>
> Any ideas?

Not much beyond what I just wrote. I don't understand enough about the 
problem you're trying to solve, nor much about current Python extension 
deployment practices; I'm stuck in the days of plain setuptools.
I'm happy to integrate a patch/contribution from you if it makes sense to 
me.

Andi..

Re: jcc - Output unbuilt package

Posted by Andi Vajda <va...@apache.org>.
  Hi Phil,

On Mon, 1 Mar 2021, Phil wrote:

> Great - I've attached a one-line change that outputs the missing info to
> stdout.  I haven't added a command line switch at this stage as there is
> no functional change to the output - just an extra line of logging.

Yes, a flag is not necessary for this change.

> Note this is necessary because although the program already outputs the
> setup args which are also required, the Extension class doesn't render
> its args as text on printing the setup args, so I also need to output
> the inputs to the Extension so that I can capture all the inputs to setup().
>
> This would be hugely useful for me if we could include even this simple change.

I incorporated your change and committed it as rev 1887063.

> I will look at automating the full generation of a source package, and
> adding an optional switch to control this - but this is a larger undertaking,
> so I'll have to add it to the to-do list.  I'll let you know when I have
> had a chance to implement this.

Great, thank you !

Andi..

Re: jcc - Output unbuilt package

Posted by Phil <ph...@beadling.co.uk>.
Hi Andi,

Great - I've attached a one-line change that outputs the missing info to
stdout.  I haven't added a command line switch at this stage as there is
no functional change to the output - just an extra line of logging.

Note this is necessary because although the program already outputs the
setup args which are also required, the Extension class doesn't render
its args as text on printing the setup args, so I also need to output
the inputs to the Extension so that I can capture all the inputs to setup().

This would be hugely useful for me if we could include even this simple change.

I will look at automating the full generation of a source package, and
adding an optional switch to control this - but this is a larger undertaking,
so I'll have to add it to the to-do list.  I'll let you know when I have
had a chance to implement this.

Cheers,
Phil.





Re: jcc - Output unbuilt package

Posted by Andi Vajda <va...@apache.org>.
  Hi Phil,

Excellent !

Yes, I'd accept a patch that makes your feature available via a new command 
line flag to JCC (don't forget to document it in __main__.py).

It's ok to only send in a patch for the python3 version (under jcc3). The 
python2 version (under jcc2) is maintenance only, don't bother.

Thanks !

Andi..

On Sun, 28 Feb 2021, Phil wrote:

>
> Thanks very much for the reply Andi.
>
> I've done a bit more digging and the short answer is you can't do what I
> wanted with the current JCC..... but, it turns out it was fairly
> straightforward to tweak JCC to provide me with all the details I needed.
>
> I successfully made this work today with Guix with the project
> that previously used JCC to build a wheel.
>
> Do you accept patches onto the subversion source trunk? - I could
> prepare a very simple, optional extension to dump out the data I needed.
> If I have time I also could write something which would fully automate
> the entire process in the future.
>
> Some more details below.
>
> Andi Vajda writes:
>
>> I did not write the bdist nor the wheel support, they were contributed
>> and I don't know that --wheel makes a binary wheel, specifically.
>> Note that you have binaries in whatever you distribute, if you
>> consider that the JAR files or the .class files are binaries. They are
>> required (.class files are) for JCC to operate as it uses reflection
>> to do its job.
>
> Yes the --wheel switch implies that the C++ build will be performed.
>
> I agree the JAR/.class files will always be binaries, but it is
> possible to distribute a package which contains only the JARs as
> binaries, along with python API and unbuilt C/C++, with a setup.py,
> describing the build process.
>
> This delays the calling of setuptools, and thus the building of the C++ to
> package install time rather than wheel creation time - and allows us to
> create a regular setup.py file.
>
>>
>> If you don't invoke JCC with --wheel, --bdist, --build or --compile,
>> you get just source files (not counting .jar).
>
> Almost - you get the C++, but no setup.py or python wrapper generated.
> At least this was the case for me.
>
> However, if you build the wheel the build directories do contain
> everything you need apart from the setup.py - you just need to lift the
> right files out and put them into the right directory structure.
>
> To create a setup.py you need both the setup and extension arguments
> generated by JCC.  Normally these are fed straight into setuptools by
> JCC, but we can leak them out instead.
>
> JCC already outputs the setup arguments to stdout, so the only change I
> had to make to the source was to also output the extension args
> dictionary to stdout too, towards the end of:
> https://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/jcc3/python.py
>
> extensions = [Extension('.'.join([moduleName, extname]), **args)]
>
> print("extensions args = %s" % args) # I added this
>
>
> Then at the bottom the setuptools args are already output to stdout:
>
> print("setup args = %s" % args)
>
> setup(**args)
>
>
> My setup.py then contains a call to setuptools.setup() providing the same
> **args structure generated by the script (I'd link it here to demonstrate this,
> but the repo is behind a firewall, alas).
>
> This plus a tiny bit of tweaking and boilerplate gives me exactly what I want.
>
>
>> Maybe what you actually want is to implement 'sdist' support for JCC ?
>> (again, not familiar with wheels, so I may not be making sense here).
>
> I thought this too - but I couldn't get it to work.
>
> I tried passing '--extra-setup-arg sdist' into JCC, but this got the
> same result as not passing --build or --wheel as discussed above.  Only
> C++ is generated.
>
>> As long as GUIX knows how to drive a C++ compiler and linker, build
>> python extensions (and knows how to build the libjcc shared library),
>> you should be fine.
>
> Yep this all works perfectly in Guix once we have a regular python repo
> containing the source with the setup.py as described.
>
> The only downside is the whole thing is rather manual now - I'd like to
> tweak JCC to make this more streamlined - let me know if a patch
> interests you?
>
>

Re: jcc - Output unbuilt package

Posted by Phil <ph...@beadling.co.uk>.
Thanks very much for the reply Andi.

I've done a bit more digging and the short answer is you can't do what I
wanted with the current JCC..... but, it turns out it was fairly
straightforward to tweak JCC to provide me with all the details I needed.

I successfully made this work today with Guix with the project
that previously used JCC to build a wheel.

Do you accept patches onto the subversion source trunk? - I could
prepare a very simple, optional extension to dump out the data I needed.
If I have time I also could write something which would fully automate
the entire process in the future.

Some more details below.

Andi Vajda writes:

> I did not write the bdist nor the wheel support, they were contributed
> and I don't know that --wheel makes a binary wheel, specifically.
> Note that you have binaries in whatever you distribute, if you
> consider that the JAR files or the .class files are binaries. They are
> required (.class files are) for JCC to operate as it uses reflection
> to do its job.

Yes the --wheel switch implies that the C++ build will be performed.

I agree the JAR/.class files will always be binaries, but it is
possible to distribute a package which contains only the JARs as
binaries, along with python API and unbuilt C/C++, with a setup.py,
describing the build process.

This delays the calling of setuptools, and thus the building of the C++ to
package install time rather than wheel creation time - and allows us to
create a regular setup.py file.

>
> If you don't invoke JCC with --wheel, --bdist, --build or --compile,
> you get just source files (not counting .jar).

Almost - you get the C++, but no setup.py or python wrapper generated.
At least this was the case for me.

However, if you build the wheel the build directories do contain
everything you need apart from the setup.py - you just need to lift the
right files out and put them into the right directory structure.

To create a setup.py you need both the setup and extension arguments
generated by JCC.  Normally these are fed straight into setuptools by
JCC, but we can leak them out instead.

JCC already outputs the setup arguments to stdout, so the only change I
had to make to the source was to also output the extension args
dictionary to stdout too, towards the end of:
https://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/jcc3/python.py

extensions = [Extension('.'.join([moduleName, extname]), **args)]

print("extensions args = %s" % args) # I added this


Then at the bottom the setuptools args are already output to stdout:

print("setup args = %s" % args)

setup(**args)
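
As a rough illustration of how the extra logging line could be consumed, here is a minimal sketch that turns a captured "extensions args = ..." line back into a dictionary. It assumes the printed dict contains only plain Python literals (strings, lists, tuples, None). The helper name parse_args_line and the sample values are made up for the example and are not real JCC output. The "setup args" line cannot be read back the same way when it embeds Extension objects, since those do not print as literals.

```python
import ast


def parse_args_line(line, label="extensions args"):
    """Parse the dict printed after '<label> = ' on one captured stdout line."""
    prefix = label + " = "
    if not line.startswith(prefix):
        raise ValueError("line does not start with %r" % prefix)
    # literal_eval only accepts Python literals, so nothing gets executed.
    return ast.literal_eval(line[len(prefix):])


# Invented sample line; a real run would print different keys and paths.
sample = ("extensions args = {'sources': ['build/_demo/__wrap__.cpp'], "
          "'include_dirs': ['build/_demo'], "
          "'define_macros': [('PYTHON', None)]}")

ext_args = parse_args_line(sample)
print(sorted(ext_args))
```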


My setup.py then contains a call to setuptools.setup() providing the same
**args structure generated by the script (I'd link it here to demonstrate this,
but the repo is behind a firewall, alas).

This plus a tiny bit of tweaking and boilerplate gives me exactly what I want.
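
Since the actual setup.py is not available, the following is only a hedged sketch of what the "boilerplate" step might look like: it renders the text of a setup.py from two dictionaries captured from JCC's stdout. The function name render_setup_py, the module name and all sample values are hypothetical. It assumes any ext_modules entry among the captured setup arguments has to be dropped and rebuilt from the extension arguments; other entries are passed through unchanged and may still need tweaking by hand.

```python
from pprint import pformat

# Template for the generated file; the dicts are pasted in as literals.
TEMPLATE = '''\
from setuptools import setup, Extension

# Values captured from JCC's stdout and pasted in.
EXTENSION_NAME = {ext_name!r}
EXTENSION_ARGS = {ext_args}
SETUP_ARGS = {setup_args}

setup(ext_modules=[Extension(EXTENSION_NAME, **EXTENSION_ARGS)], **SETUP_ARGS)
'''


def render_setup_py(ext_name, ext_args, setup_args):
    """Return setup.py source that rebuilds the Extension at install time."""
    # Extension objects do not survive printing, so drop them and rebuild.
    plain = {k: v for k, v in setup_args.items() if k != "ext_modules"}
    return TEMPLATE.format(ext_name=ext_name,
                           ext_args=pformat(ext_args),
                           setup_args=pformat(plain))


# Invented values for illustration only.
source = render_setup_py(
    "demo._demo",
    {"sources": ["build/_demo/__wrap__.cpp"]},
    {"name": "demo", "version": "1.0", "ext_modules": ["<unprintable>"]},
)
print(source)
```

With a file like this in the repo, the C++ is compiled when the package is installed rather than when the wheel is created, which is the behaviour described above.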


> Maybe what you actually want is to implement 'sdist' support for JCC ?
> (again, not familiar with wheels, so I may not be making sense here).

I thought this too - but I couldn't get it to work.

I tried passing '--extra-setup-arg sdist' into JCC, but this got the
same result as not passing --build or --wheel as discussed above.  Only
C++ is generated.

> As long as GUIX knows how to drive a C++ compiler and linker, build
> python extensions (and knows how to build the libjcc shared library),
> you should be fine.

Yep this all works perfectly in Guix once we have a regular python repo
containing the source with the setup.py as described.

The only downside is the whole thing is rather manual now - I'd like to
tweak JCC to make this more streamlined - let me know if a patch
interests you?