Posted to pylucene-dev@lucene.apache.org by Junjie Wei <jw...@nyu.edu> on 2017/03/15 21:25:16 UTC
pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Hi,
When I tried to build pylucene-6.4.1 in Cygwin on Windows, "$ make
build" exited with errors complaining that some jar files could not be
opened. It seems this is because some of the jars under lucene-java-6.4.1
are symbolic links with a size of 1k instead of real files. Here is the
list I found with the find command:
$ find ./lucene-java-6.4.1/ -name *.jar -size 1k
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/icu/lib/icu4j-56.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-fsa-2.1.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-polish-2.1.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-stemming-2.1.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/phonetic/lib/commons-codec-1.10.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/Tagger-2.3.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/uimaj-core-2.3.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/WhitespaceTokenizer-2.3.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/antlr4-runtime-4.5.1-1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-5.1.jar
./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-commons-5.1.jar
After I downloaded lucene-java-6.4.1 from
https://archive.apache.org/dist/lucene/java/6.4.1/ and replaced the
directory with it, everything went fine.
Is this an issue in the release, or did I miss something before building?
Thanks,
Junjie
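A possible way to list the affected files by link status rather than by size is sketched below. It assumes the extracted entries are visible to the shell as symbolic links; list_dangling_jars is a hypothetical helper name, not something shipped with PyLucene. Quoting '*.jar' keeps the shell from expanding the pattern before find sees it.

```shell
# Hypothetical helper: print every *.jar symlink under DIR whose target is missing.
list_dangling_jars() {
    dir=${1:-.}
    # -type l selects symbolic links; `test -e` follows the link,
    # so it fails exactly when the link target does not exist.
    find "$dir" -type l -name '*.jar' ! -exec test -e {} \; -print
}

# Example call (the directory name is taken from the report above):
#   list_dangling_jars ./lucene-java-6.4.1
```

Links whose targets exist are not reported, so the output should contain only the entries that make the build fail.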
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Andi Vajda <va...@apache.org>.
On Fri, 17 Mar 2017, Ruediger Meier wrote:
> On Friday 17 March 2017, Andi Vajda wrote:
>> Now, several people, including yourself, have proposed
>> python 3 ports. I still have to figure a way to package this all up
>> into a release that works with both. I need some time to integrate
>> the three python 3 ports,
>
> FYI I have the other two ports also imported into my github repo which
> makes it easy to compare again.
>
> $ git ls-remote https://github.com/rudimeier/jcc |cut -f2
>
> refs/heads/master <<< my final one, works for py2 and py3
> refs/heads/py3-old-orig <<< old svn, pylucene/branches/python_3
> refs/heads/py3-tommykoch <<< from https://gist.github.com/tommykoch
> refs/tags/v2.23
Thank you. I got started on this today and I'm now starting to look at the
three ports. So far, I've got jcc split into two parts (still one module,
one egg) to work with both python2 and python3 but keeping the code
separate. It's too much of a mess to keep both versions together in the same
file and I don't expect the python2 version to change too much since jcc has
been quite stable...
Andi..
>
>
> cu,
> Rudi
>
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Ruediger Meier <sw...@gmx.de>.
On Friday 17 March 2017, Andi Vajda wrote:
> Now, several people, including yourself, have proposed
> python 3 ports. I still have to figure a way to package this all up
> into a release that works with both. I need some time to integrate
> the three python 3 ports,
FYI I have the other two ports also imported into my github repo which
makes it easy to compare again.
$ git ls-remote https://github.com/rudimeier/jcc |cut -f2
refs/heads/master <<< my final one, works for py2 and py3
refs/heads/py3-old-orig <<< old svn, pylucene/branches/python_3
refs/heads/py3-tommykoch <<< from https://gist.github.com/tommykoch
refs/tags/v2.23
cu,
Rudi
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Andi Vajda <va...@apache.org>.
> On Mar 16, 2017, at 20:34, Ruediger Meier <sw...@gmx.de> wrote:
>
>> On Friday 17 March 2017, Andi Vajda wrote:
>>> On Thu, 16 Mar 2017, Ruediger Meier wrote:
>>>> On Thursday 16 March 2017, Andi Vajda wrote:
>>>> Indeed, this is a bug of mine.
>>>> What would you prefer:
>>>> - include the actual .jar files in the distribution archive
>>>> (tell tar to follow the symlinks when I build the PyLucene
>>>> distribution) - or exclude the symlinks (tell tar to exclude
>>>> symlinks); your running build would then use ivy to fetch them
>>>
>>> Usually my opinion is that tarballs should have the least possible
>>> dependencies. But in this case where all the deps are hosted on the
>>> same source (apache.org) I would not include it but download on
>>> build time (if user has not downloaded it manually already).
>>
>> +1, I'm leaning towards not including these .jar files as well.
>> It saves about 20Mb on the pylucene distribution tar file and they
>> can be obtained from ivy anyway.
>>
>>> Maybe we could even enhance the Makefile to automatically find an
>>> already installed lucene or download the latest minor version. IMO
>>> it makes no sense that pylucene users by default always use a
>>> non-bugfixed outdated lucene. And I saw on this mailing list how
>>> difficult it can be to get enough votes for a pylucene minor
>>> update.
>>
>> There is no such thing as a bugfixed Lucene. Each Lucene release has
>> new bug fixes but also new bugs, such is software development. Lucene
>> also breaks things on a regular basis in spite of being quite careful
>> about backwards compatibility, thus PyLucene unit tests have to be
>> checked for each release.
>>
>> The problem you're referring to would not be much of an issue if it
>> was easier to garner votes for a PyLucene release. A new release
>> would happen in lock step with each Lucene release, as was the case
>> in the past, a few years ago. There is a Lucene 6.5 release being
>> talked about and I intend to release a PyLucene 6.5 shortly
>> thereafter.
>
> Well, I was speaking about the minor maintenance updates like 6.4.2 but
> you know surely better about the quality of lucene updates.
>
>>> The same goes for the jcc python package which the user has to
>>> install manually anyways. We don't need to ship it with pylucene. I
>>> guess jcc would be far more famous if it would be hosted decoupled
>>> of pylucene. IMO jcc is a really amazing good working thing.
>>> pylucene is just a nice example how easy you can use java libs via
>>> python.
>>
>> Thank you for the kind words. JCC is already available without
>> PyLucene from Python's PyPI: https://pypi.python.org/pypi/JCC/2.23
>> JCC gets released on PyPI at the same time as the main Apache
>> PyLucene release.
>>
>> I agree that PyLucene is just an example of JCC usage but it's the
>> main one and PyLucene has been driving the features of JCC.
>
> Yep, jcc only exists because of pylucene. And good that pylucene's
> development and user base guarantees that jcc will be well maintained
> in future too. On the other hand pylucene may be some kind of show
> stopper for jcc. Why wasn't the old experimental jcc/py3 port released
> quickly on PyPI 7 years ago?
Because it was an experimental branch that was never finished.
> Is there any chance to get the recent
> jcc/py3 port released soon even pylucene still cares for stable py2
> only?
I don't think PyLucene cares either way. I have not had enough time in a long while to do a releasable version of jcc with python 3 support. Now, several people, including yourself, have proposed python 3 ports. I still have to figure out a way to package this all up into a release that works with both.
I need some time to integrate the three python 3 ports, update it to do proper string conversions and package it in a way that it works both with python 2 and 3 (can be different sets of sources, with possible overlaps, but in the same source egg).
Andi..
> I mean releasing jcc for py3 cannot break any existing project.
> No need to wait for the right time to test it more carefully.
>
> Cheers,
> Rudi
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Ruediger Meier <sw...@gmx.de>.
On Friday 17 March 2017, Andi Vajda wrote:
> On Thu, 16 Mar 2017, Ruediger Meier wrote:
> > On Thursday 16 March 2017, Andi Vajda wrote:
> >> Indeed, this is a bug of mine.
> >> What would you prefer:
> >> - include the actual .jar files in the distribution archive
> >> (tell tar to follow the symlinks when I build the PyLucene
> >> distribution) - or exclude the symlinks (tell tar to exclude
> >> symlinks); your running build would then use ivy to fetch them
> >
> > Usually my opinion is that tarballs should have the least possible
> > dependencies. But in this case where all the deps are hosted on the
> > same source (apache.org) I would not include it but download on
> > build time (if user has not downloaded it manually already).
>
> +1, I'm leaning towards not including these .jar files as well.
> It saves about 20Mb on the pylucene distribution tar file and they
> can be obtained from ivy anyway.
>
> > Maybe we could even enhance the Makefile to automatically find an
> > already installed lucene or download the latest minor version. IMO
> > it makes no sense that pylucene users by default always use a
> > non-bugfixed outdated lucene. And I saw on this mailing list how
> > difficult it can be to get enough votes for a pylucene minor
> > update.
>
> There is no such thing as a bugfixed Lucene. Each Lucene release has
> new bug fixes but also new bugs, such is software development. Lucene
> also breaks things on a regular basis in spite of being quite careful
> about backwards compatibility, thus PyLucene unit tests have to be
> checked for each release.
>
> The problem you're referring to would not be much of an issue if it
> was easier to garner votes for a PyLucene release. A new release
> would happen in lock step with each Lucene release, as was the case
> in the past, a few years ago. There is a Lucene 6.5 release being
> talked about and I intend to release a PyLucene 6.5 shortly
> thereafter.
Well, I was speaking about the minor maintenance updates like 6.4.2 but
you know surely better about the quality of lucene updates.
> > The same goes for the jcc python package which the user has to
> > install manually anyways. We don't need to ship it with pylucene. I
> > guess jcc would be far more famous if it would be hosted decoupled
> > of pylucene. IMO jcc is a really amazing good working thing.
> > pylucene is just a nice example how easy you can use java libs via
> > python.
>
> Thank you for the kind words. JCC is already available without
> PyLucene from Python's PyPI: https://pypi.python.org/pypi/JCC/2.23
> JCC gets released on PyPI at the same time as the main Apache
> PyLucene release.
>
> I agree that PyLucene is just an example of JCC usage but it's the
> main one and PyLucene has been driving the features of JCC.
Yep, jcc only exists because of pylucene. And it is good that pylucene's
development and user base guarantee that jcc will be well maintained
in the future too. On the other hand, pylucene may be some kind of show
stopper for jcc. Why wasn't the old experimental jcc/py3 port released
quickly on PyPI 7 years ago? Is there any chance of getting the recent
jcc/py3 port released soon, even if pylucene still cares only about
stable py2? I mean, releasing jcc for py3 cannot break any existing
project. There is no need to wait for the right time to test it more
carefully.
Cheers,
Rudi
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Andi Vajda <va...@apache.org>.
On Thu, 16 Mar 2017, Ruediger Meier wrote:
> On Thursday 16 March 2017, Andi Vajda wrote:
>
>> Indeed, this is a bug of mine.
>> What would you prefer:
>> - include the actual .jar files in the distribution archive (tell
>> tar to follow the symlinks when I build the PyLucene distribution) -
>> or exclude the symlinks (tell tar to exclude symlinks); your running
>> build would then use ivy to fetch them
>
> Usually my opinion is that tarballs should have the least possible
> dependencies. But in this case where all the deps are hosted on the
> same source (apache.org) I would not include it but download on build
> time (if user has not downloaded it manually already).
+1, I'm leaning towards not including these .jar files as well.
It saves about 20 MB on the pylucene distribution tar file and they can be
obtained from ivy anyway.
> Maybe we could even enhance the Makefile to automatically find an
> already installed lucene or download the latest minor version. IMO it
> makes no sense that pylucene users by default always use a non-bugfixed
> outdated lucene. And I saw on this mailing list how difficult it can be
> to get enough votes for a pylucene minor update.
There is no such thing as a bugfixed Lucene. Each Lucene release has new bug
fixes but also new bugs, such is software development. Lucene also breaks
things on a regular basis in spite of being quite careful about backwards
compatibility, thus PyLucene unit tests have to be checked for each release.
The problem you're referring to would not be much of an issue if it was
easier to garner votes for a PyLucene release. A new release would happen in
lock step with each Lucene release, as was the case in the past, a few years
ago. There is a Lucene 6.5 release being talked about and I intend to
release a PyLucene 6.5 shortly thereafter.
> The same goes for the jcc python package which the user has to install
> manually anyways. We don't need to ship it with pylucene. I guess jcc
> would be far more famous if it would be hosted decoupled of pylucene.
> IMO jcc is a really amazing good working thing. pylucene is just a nice
> example how easy you can use java libs via python.
Thank you for the kind words. JCC is already available without PyLucene from
Python's PyPI: https://pypi.python.org/pypi/JCC/2.23
JCC gets released on PyPI at the same time as the main Apache PyLucene release.
I agree that PyLucene is just an example of JCC usage but it's the main one
and PyLucene has been driving the features of JCC.
Andi..
>
> cheers,
> Rudi
>
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Ruediger Meier <sw...@gmx.de>.
On Thursday 16 March 2017, Andi Vajda wrote:
> Indeed, this is a bug of mine.
> What would you prefer:
> - include the actual .jar files in the distribution archive (tell
> tar to follow the symlinks when I build the PyLucene distribution) -
> or exclude the symlinks (tell tar to exclude symlinks); your running
> build would then use ivy to fetch them
Usually my opinion is that tarballs should have the fewest possible
dependencies. But in this case, where all the deps are hosted on the
same source (apache.org), I would not include them but download them at
build time (if the user has not already downloaded them manually).
Maybe we could even enhance the Makefile to automatically find an
already installed lucene or download the latest minor version. IMO it
makes no sense that pylucene users by default always use a non-bugfixed
outdated lucene. And I saw on this mailing list how difficult it can be
to get enough votes for a pylucene minor update.
The same goes for the jcc python package, which the user has to install
manually anyway. We don't need to ship it with pylucene. I guess jcc
would be far better known if it were hosted separately from pylucene.
IMO jcc is a really amazing tool that works well. pylucene is just a nice
example of how easily you can use java libs via python.
cheers,
Rudi
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Andi Vajda <va...@apache.org>.
On Wed, 15 Mar 2017, Ruediger Meier wrote:
> On Wednesday 15 March 2017, Junjie Wei wrote:
>> Hi,
>>
>> When I was trying to build pylucene-6.4.1 in Cygwin on Windows, the
>> "$ make build" exit with errors complaining that some jar files
>> cannot be open. It seems because some of the jars under
>> lucene-java-6.4.1 are symbolic links with size of 1k instead of
>> concrete ones. Here is a list that I located them with find command:
>>
>> $ find ./lucene-java-6.4.1/ -name *.jar -size 1k
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/icu/lib/icu4j-56.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-fsa-2.1.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-polish-2.1.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-stemming-2.1.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/phonetic/lib/commons-codec-1.10.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/Tagger-2.3.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/uimaj-core-2.3.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/WhitespaceTokenizer-2.3.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/antlr4-runtime-4.5.1-1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-5.1.jar
>> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-commons-5.1.jar
>>
>>
>> After downloaded and replaced lucene-java-6.4.1 from
>> https://archive.apache.org/dist/lucene/java/6.4.1/, things went all
>> good.
>>
>> Is it an issue in the release, or I have missed something before
>> built?
>
> Yes, this is a minor but annoying issue in this release. There are some
> dead links packaged, pointing to Andi's home directory, like this one:
>
> ./lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-stemming-2.1.1.jar -> /Users/vajda/.ivy2/cache/org.carrot2/morfologik-stemming/bundles/morfologik-stemming-2.1.1.jar
>
> Maybe the "make release/distrib" target has a bug, or these links were
> committed to svn by mistake.
>
> BTW this is no real issue on a real POSIX system. Cygwin seems to make
> this worse as it has to emulate symlinks somehow. I guess instead of
> downloading lucene manually you could have fixed it by just removing
> all the bad links.
Indeed, this is a bug of mine.
What would you prefer:
- include the actual .jar files in the distribution archive (tell tar to
follow the symlinks when I build the PyLucene distribution)
- or exclude the symlinks (tell tar to exclude symlinks); your
running build would then use ivy to fetch them
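
The two options above could look roughly like the sketch below. The tar options used (-h to follow symlinks, -T to read member names from a list) are standard, but how the PyLucene release target actually invokes tar is not shown in this thread, so both function names and the archive layout are assumptions for illustration only.

```shell
# Hypothetical helper for option 1: archive SRC_DIR and store the files the
# symlinks point to instead of the links themselves (-h dereferences links).
# The link targets must exist at packaging time, otherwise tar reports an error.
make_tar_following_symlinks() {
    archive=$1
    src_dir=$2
    tar -chzf "$archive" "$src_dir"
}

# Hypothetical helper for option 2: archive only the regular files under
# SRC_DIR, so no symlink ends up in the archive. Empty directories are
# dropped too, and file names containing newlines are not supported.
make_tar_without_symlinks() {
    archive=$1
    src_dir=$2
    find "$src_dir" -type f -print | tar -czf "$archive" -T -
}

# Example calls (archive and directory names are placeholders):
#   make_tar_following_symlinks pylucene-with-jars.tar.gz pylucene-6.4.1
#   make_tar_without_symlinks   pylucene-no-links.tar.gz  pylucene-6.4.1
```

The first variant produces the larger archive; the second leaves the jars to be fetched at build time.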
Andi..
>
> cu,
> Rudi
>
>
Re: pylucene-6.4.1: Missing/Can't unzip jars Under lucene-java-6.4.1 Directory
Posted by Ruediger Meier <sw...@gmx.de>.
On Wednesday 15 March 2017, Junjie Wei wrote:
> Hi,
>
> When I was trying to build pylucene-6.4.1 in Cygwin on Windows, the
> "$ make build" exit with errors complaining that some jar files
> cannot be open. It seems because some of the jars under
> lucene-java-6.4.1 are symbolic links with size of 1k instead of
> concrete ones. Here is a list that I located them with find command:
>
> $ find ./lucene-java-6.4.1/ -name *.jar -size 1k
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/icu/lib/icu4j-56.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-fsa-2.1.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-polish-2.1.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-stemming-2.1.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/phonetic/lib/commons-codec-1.10.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/Tagger-2.3.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/uimaj-core-2.3.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/analysis/uima/lib/WhitespaceTokenizer-2.3.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/antlr4-runtime-4.5.1-1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-5.1.jar
> ./test/pylucene-6.4.1/lucene-java-6.4.1/lucene/expressions/lib/asm-commons-5.1.jar
>
>
> After downloaded and replaced lucene-java-6.4.1 from
> https://archive.apache.org/dist/lucene/java/6.4.1/, things went all
> good.
>
> Is it an issue in the release, or I have missed something before
> built?
Yes, this is a minor but annoying issue in this release. There are some
dead links packaged, pointing to Andi's home directory, like this one:
./lucene-java-6.4.1/lucene/analysis/morfologik/lib/morfologik-stemming-2.1.1.jar -> /Users/vajda/.ivy2/cache/org.carrot2/morfologik-stemming/bundles/morfologik-stemming-2.1.1.jar
Maybe the "make release/distrib" target has a bug, or these links were
committed to svn by mistake.
BTW this is no real issue on a real POSIX system. Cygwin seems to make
this worse as it has to emulate symlinks somehow. I guess instead of
downloading lucene manually you could have fixed it by just removing
all the bad links.
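
Removing the bad links might look like the sketch below. It assumes the dead entries show up as symbolic links in the shell; remove_dangling_links is a hypothetical helper name. Links whose targets exist are left alone.

```shell
# Hypothetical helper: delete symlinks under DIR whose targets no longer exist.
remove_dangling_links() {
    dir=${1:-.}
    # `test -e` follows the link and fails when the target is missing.
    # Each dangling link is printed before it is removed.
    find "$dir" -type l ! -exec test -e {} \; -print -exec rm -f {} \;
}

# Example call (directory name taken from the report above):
#   remove_dangling_links ./lucene-java-6.4.1
```

The printed list doubles as a record of which jars have to come from somewhere else afterwards.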
cu,
Rudi