Posted to pylucene-dev@lucene.apache.org by Bill Janssen <ja...@parc.com> on 2010/07/20 04:40:23 UTC

API changes between 2.9.2 and 2.9.3

Looks like the combination of JCC 2.6 and Lucene 2.9.3 has made some
significant API changes.  This is what I get with 2.9.3:

% python /u/python/uplib/indexing.py search /local/demo-repo/index picasso
[...]
hits are <Hits: org.apache.lucene.search.Hits@f6f3dc> (0 hits)
Traceback (most recent call last):
  File "/u/python/uplib/indexing.py", line 930, in <module>
    search(sys.argv[2], sys.argv[3:])
  File "/u/python/uplib/indexing.py", line 897, in search
    print c.search(' '.join(searchterms))
  File "/u/python/uplib/indexing.py", line 687, in search
    for hit in hits:
lucene.JavaError: java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
    Java stacktrace:
java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
	at org.apache.lucene.search.Hits.hitDoc(Hits.java:215)
	at org.apache.lucene.search.Hits.doc(Hits.java:168)

In other words, Hits are now something I can take the length of, but
cannot enumerate?  Have we switched to TopDocs already?
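
One way to sidestep the broken enumeration is to walk the hits by index. The following is only a sketch: it assumes the deprecated Hits object still exposes length(), doc(n) and score(n) as in the Lucene 2.x Java API, and FakeHits is a hypothetical stand-in so the snippet runs without PyLucene installed.

```python
def iter_hits(hits):
    """Yield (document, score) pairs from a Hits-like object by index."""
    # Asking for the length first means an empty result never calls doc(0).
    for i in range(hits.length()):
        yield hits.doc(i), hits.score(i)


class FakeHits(object):
    """Hypothetical stand-in for org.apache.lucene.search.Hits."""

    def __init__(self, docs, scores):
        self._docs = docs
        self._scores = scores

    def length(self):
        return len(self._docs)

    def doc(self, n):
        if not 0 <= n < len(self._docs):
            raise IndexError("Not a valid hit number: %d" % n)
        return self._docs[n]

    def score(self, n):
        return self._scores[n]
```

With a real searcher the call would look like `for doc, score in iter_hits(hits):`, which has not been tested here against 2.9.3.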


% python /u/python/uplib/indexing.py search /local/demo-repo/index Apple
[...]
hits are <Hits: org.apache.lucene.search.Hits@f4b0dc> (12)
Traceback (most recent call last):
  File "/u/python/uplib/indexing.py", line 930, in <module>
    search(sys.argv[2], sys.argv[3:])
  File "/u/python/uplib/indexing.py", line 897, in search
    print c.search(' '.join(searchterms))
  File "/u/python/uplib/indexing.py", line 688, in search
    doc = Hit.cast_(hit).getDocument()
TypeError: Document<stored/uncompressed,indexed<id:01182-38-8512-609> stored/uncompressed,indexed<uplibdate:20070620> stored/uncompressed,indexed<uplibtype:whole> stored/uncompressed,indexed<categories:_(all)_>>
%

Bill

RE: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Wed, 21 Jul 2010, Thomas Koch wrote:

> But I understand now that as long as you remove deprecated code from 2.9 it
> *should* work with 2.9 and 3.0 as well! Right?

Correct.

> e.g.
> <method>Hits search(Query query)
>   Is now deprecated as
> "Hits will be removed in Lucene 3.0"
>
> 2.9 already supports
> <method>TopDocs search(Query, Filter, int)
> Which one should use instead.
>
> The problem here is that - as far as I understand - you can make it work
> with 2.9 and 3.0 - but then you lose backward compatibility with any 2.x
> version before 2.9.... The point is that you may then end up forcing your
> users (admins) to install a newer version of PyLucene - which people may not
> want to do...

It's all about trade-offs. If your application runs on 2.x and you don't 
want to reinstall, don't upgrade. If you have found bugs that are fixed in 
3.x or want to take advantage of new features or improvements, upgrade.

Andi..

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
> I'm going back to 2.9.2 :-).

For some reason, 2.9.2 installs JCC 2.4.1.  Is that right?  Shouldn't it
be 2.5.1?

Bill

holmes : /tmp/pylucene-2.9.2-1/jcc 99 % sudo python setup.py install
running install
running bdist_egg
running egg_info
writing JCC.egg-info/PKG-INFO
writing top-level names to JCC.egg-info/top_level.txt
writing dependency_links to JCC.egg-info/dependency_links.txt
reading manifest file 'JCC.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'JCC.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.5-i386/egg
running install_lib
running build_py
copying jcc/config.py -> build/lib.macosx-10.5-i386-2.5/jcc
copying jcc/classes/org/apache/jcc/PythonVM.class -> build/lib.macosx-10.5-i386-2.5/jcc/classes/org/apache/jcc
copying jcc/classes/org/apache/jcc/PythonException.class -> build/lib.macosx-10.5-i386-2.5/jcc/classes/org/apache/jcc
running build_ext
creating build/bdist.macosx-10.5-i386
creating build/bdist.macosx-10.5-i386/egg
creating build/bdist.macosx-10.5-i386/egg/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/__init__.py -> build/bdist.macosx-10.5-i386/egg/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/__main__.py -> build/bdist.macosx-10.5-i386/egg/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/_jcc.so -> build/bdist.macosx-10.5-i386/egg/jcc
creating build/bdist.macosx-10.5-i386/egg/jcc/classes
creating build/bdist.macosx-10.5-i386/egg/jcc/classes/org
creating build/bdist.macosx-10.5-i386/egg/jcc/classes/org/apache
creating build/bdist.macosx-10.5-i386/egg/jcc/classes/org/apache/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/classes/org/apache/jcc/PythonException.class -> build/bdist.macosx-10.5-i386/egg/jcc/classes/org/apache/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/classes/org/apache/jcc/PythonVM.class -> build/bdist.macosx-10.5-i386/egg/jcc/classes/org/apache/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/config.py -> build/bdist.macosx-10.5-i386/egg/jcc
copying build/lib.macosx-10.5-i386-2.5/jcc/cpp.py -> build/bdist.macosx-10.5-i386/egg/jcc
creating build/bdist.macosx-10.5-i386/egg/jcc/patches
copying build/lib.macosx-10.5-i386-2.5/jcc/patches/patch.4195 -> build/bdist.macosx-10.5-i386/egg/jcc/patches
copying build/lib.macosx-10.5-i386-2.5/jcc/patches/patch.43.0.6c11 -> build/bdist.macosx-10.5-i386/egg/jcc/patches
copying build/lib.macosx-10.5-i386-2.5/jcc/patches/patch.43.0.6c7 -> build/bdist.macosx-10.5-i386/egg/jcc/patches
copying build/lib.macosx-10.5-i386-2.5/jcc/python.py -> build/bdist.macosx-10.5-i386/egg/jcc
creating build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/functions.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/functions.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JArray.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JArray.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/jcc.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JCCEnv.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JCCEnv.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/jccfuncs.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JObject.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/JObject.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/macros.h -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/jcc/sources/types.cpp -> build/bdist.macosx-10.5-i386/egg/jcc/sources
copying build/lib.macosx-10.5-i386-2.5/libjcc.dylib -> build/bdist.macosx-10.5-i386/egg
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/__main__.py to __main__.pyc
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/config.py to config.pyc
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/cpp.py to cpp.pyc
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/python.py to python.pyc
creating stub loader for jcc/_jcc.so
byte-compiling build/bdist.macosx-10.5-i386/egg/jcc/_jcc.py to _jcc.pyc
creating build/bdist.macosx-10.5-i386/egg/EGG-INFO
copying JCC.egg-info/PKG-INFO -> build/bdist.macosx-10.5-i386/egg/EGG-INFO
copying JCC.egg-info/SOURCES.txt -> build/bdist.macosx-10.5-i386/egg/EGG-INFO
copying JCC.egg-info/dependency_links.txt -> build/bdist.macosx-10.5-i386/egg/EGG-INFO
copying JCC.egg-info/not-zip-safe -> build/bdist.macosx-10.5-i386/egg/EGG-INFO
copying JCC.egg-info/top_level.txt -> build/bdist.macosx-10.5-i386/egg/EGG-INFO
writing build/bdist.macosx-10.5-i386/egg/EGG-INFO/native_libs.txt
creating dist
creating 'dist/JCC-2.4.1-py2.5-macosx-10.5-i386.egg' and adding 'build/bdist.macosx-10.5-i386/egg' to it
removing 'build/bdist.macosx-10.5-i386/egg' (and everything under it)
Processing JCC-2.4.1-py2.5-macosx-10.5-i386.egg
removing '/Library/Python/2.5/site-packages/JCC-2.4.1-py2.5-macosx-10.5-i386.egg' (and everything under it)
creating /Library/Python/2.5/site-packages/JCC-2.4.1-py2.5-macosx-10.5-i386.egg
Extracting JCC-2.4.1-py2.5-macosx-10.5-i386.egg to /Library/Python/2.5/site-packages
JCC 2.4.1 is already the active version in easy-install.pth

Installed /Library/Python/2.5/site-packages/JCC-2.4.1-py2.5-macosx-10.5-i386.egg
Processing dependencies for JCC==2.4.1
Finished processing dependencies for JCC==2.4.1
holmes : /tmp/pylucene-2.9.2-1/jcc 100 % 

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 21, 2010, at 19:33, Bill Janssen <ja...@parc.com> wrote:

> What's crashing with PyLucene 2.9.3 is this code:
>
>     for field in x.getFields():
>
> where "x" is an instance of org.apache.lucene.document.Document.  I can
> print x and it looks OK, but an attempt to iterate over the list of
> fields seems broken.  Is this another iterator change?

Yep, that could very well be. Use .iterator() on it.
I'm surprised that it crashes instead of complaining that it's not
iterable. Maybe my fix (to the incorrect Iterable assumption) is
incomplete...

Andi..

>
> Bill
>
> Thread 14 Crashed:
> 0   libjvm.dylib                      0x0198a1cb 0x18e7000 + 668107
> 1   libjvm.dylib                      0x01af5c47 JNI_CreateJavaVM_Impl + 96759
> 2   libjcc.dylib                      0x007a168d JCCEnv::callObjectMethod(_jobject*, _jmethodID*, ...) const + 73
> 3   libjcc.dylib                      0x007a1254 JCCEnv::iterator(_jobject*) const + 34
> 4   _lucene.so                        0x013c0e83 _object* get_iterator<java::util::t_List>(java::util::t_List*) + 59
> 5   org.python.python                 0x00121dfd PyObject_GetIter + 107
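
The ".iterator()" suggestion can be sketched as a small generator that drives a Java-style iterator by hand. This assumes JCC exposes hasNext() and next() as ordinary methods on the wrapped java.util.Iterator; FakeJavaIterator is a hypothetical stand-in so the snippet runs without PyLucene.

```python
def iterate_java(iterator):
    """Yield each element of a java.util.Iterator-style object."""
    # Drive the iterator explicitly instead of relying on Python's for-loop
    # protocol, which is the part that appears to be broken here.
    while iterator.hasNext():
        yield iterator.next()


class FakeJavaIterator(object):
    """Hypothetical stand-in for a wrapped java.util.Iterator."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._items)

    def next(self):
        item = self._items[self._pos]
        self._pos += 1
        return item
```

The failing loop would then presumably become `for field in iterate_java(x.getFields().iterator()):`, which is untested against 2.9.3.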

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 22, 2010, at 17:52, Bill Janssen <ja...@parc.com> wrote:

> Andi Vajda <va...@apache.org> wrote:
>
>>
>> On Jul 22, 2010, at 2:09, Bill Janssen <ja...@parc.com> wrote:
>>
>>> Andi Vajda <va...@apache.org> wrote:
>>>
>>>> Porting your stuff to 3.0 is thus highly recommended instead
>>>> of complaining about broken (my bad) long-deprecated APIs.
>>>
>>> Hey, take 2.9.3 down, and announce no further pylucene support for 2.x,
>>> and I'll stop talking about it.
>>
>> The value in 2.9.3 is really just in the Lucene fixes since 2.9.2. If
>> you want them without the new JCC which is tripping you up, take a
>> 2.9.2 build tree and change the Lucene svn url near the top of the
>> Makefile to point at the 2.9.3 sources. This should "just work" (tm).
>
> Another fix is to edit the common-build.xml file in the Lucene subtree
> to remove the 1.4 restriction.  That lets it build with Java 5 and  
> that
> adds the Iterable interface, and things work as they did, even with  
> jcc 2.6.

Even better. Still, none of the Lucene 2.9 code uses any of the Java
1.5 features directly, which is why Lucene 3.0 is a better choice still.

Andi..


>
> Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Andi Vajda <va...@apache.org> wrote:

> 
> On Jul 22, 2010, at 2:09, Bill Janssen <ja...@parc.com> wrote:
> 
> > Andi Vajda <va...@apache.org> wrote:
> >
> >> Porting your stuff to 3.0 is thus highly recommended instead
> >> of complaining about broken (my bad) long-deprecated APIs.
> >
> > Hey, take 2.9.3 down, and announce no further pylucene support for 2.x,
> > and I'll stop talking about it.
> 
> The value in 2.9.3 is really just in the Lucene fixes since 2.9.2. If
> you want them without the new JCC which is tripping you up, take a
> 2.9.2 build tree and change the Lucene svn url near the top of the
> Makefile to point at the 2.9.3 sources. This should "just work" (tm).

Another fix is to edit the common-build.xml file in the Lucene subtree
to remove the 1.4 restriction.  That lets it build with Java 5 and that
adds the Iterable interface, and things work as they did, even with jcc 2.6.

Bill
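
The common-build.xml edit can be sketched as a text rewrite. The message does not say exactly what was changed, so this assumes it amounts to raising the javac.source and javac.target properties (quoted elsewhere in this thread) from 1.4 to 1.5; raise_javac_level is a hypothetical helper, not part of any build tooling.

```python
import re


def raise_javac_level(build_xml_text, old="1.4", new="1.5"):
    """Return the build file text with javac.source/target raised to `new`."""
    pattern = re.compile(
        r'(<property\s+name="javac\.(?:source|target)"\s+value=")'
        + re.escape(old)
        + r'(")'
    )
    # A function replacement avoids ambiguity between group references
    # and the digits of the new version string.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), build_xml_text)


sample = (
    '  <property name="javac.source" value="1.4"/>\n'
    '  <property name="javac.target" value="1.4"/>\n'
)
patched = raise_javac_level(sample)
```

Other properties in the file are left alone, since only the two javac.* names match the pattern.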

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 22, 2010, at 2:09, Bill Janssen <ja...@parc.com> wrote:

> Andi Vajda <va...@apache.org> wrote:
>
>> Porting your stuff to 3.0 is thus highly recommended instead
>> of complaining about broken (my bad) long-deprecated APIs.
>
> Hey, take 2.9.3 down, and announce no further pylucene support for 2.x,
> and I'll stop talking about it.

The value in 2.9.3 is really just in the Lucene fixes since 2.9.2. If  
you want them without the new JCC which is tripping you up, take a  
2.9.2 build tree and change the Lucene svn url near the top of the  
Makefile to point at the 2.9.3 sources. This should "just work" (tm).

Andi..

>
> Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Andi Vajda <va...@apache.org> wrote:

> Porting your stuff to 3.0 is thus highly recommended instead
> of complaining about broken (my bad) long-deprecated APIs.

Hey, take 2.9.3 down, and announce no further pylucene support for 2.x,
and I'll stop talking about it.

Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 21, 2010, at 23:10, Bill Janssen <ja...@parc.com> wrote:

> Andi Vajda <va...@apache.org> wrote:
>
>>
>> On Jul 21, 2010, at 19:59, Bill Janssen <ja...@parc.com> wrote:
>>
>>> Bill Janssen <ja...@parc.com> wrote:
>>>
>>>> What's crashing with PyLucene 2.9.3 is this code:
>>>>
>>>>    for field in x.getFields():
>>>>
>>>> where "x" is an instance of org.apache.lucene.document.Document.  I can
>>>> print x and it looks OK, but an attempt to iterate over the list of
>>>> fields seems broken.  Is this another iterator change?
>>>
>>> I see that I also can't iterate over x.getFields().listIterator(),
>>> presumably because, in the Java 1.4 that Lucene 2.9.x uses,
>>> java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
>>> ridiculous.
>>
>> Not ridiculous, impossible, since java.lang.Iterable appeared in Java
>> 1.5 and Lucene 2.x claims Java 1.4 compatibility.
>
> No, I mean it's ridiculous that I can't subscript or iterate, in Python,
> a value of java.util.List.  You need something in jcc to support Java
> 1.4 sequence types

Yes, it can be done with --sequence

> as well as the new code for 1.5 sequence types.

With 1.5, Iterable works like a charm.

> I
> presume that there will be a 2.9.4 in the future, right?

Maybe, maybe not. It depends on what Lucene Java does. I sure hope  
that the 2.9 release series is winding down. There are just too many  
release branches at the moment.

> I looked at the jcc code a bit.  In jcc/python.py, this bit
>
>    if env.java_version >= '1.5':
>        iterable = findClass('java/lang/Iterable')
>        iterator = findClass('java/util/Iterator')
>    else:
>        iterable = iterator = None
>
> could perhaps become
>
>    if env.java_version >= '1.5':
>        iterable = findClass('java/lang/Iterable')
>        iterator = findClass('java/util/Iterator')
>    else:
>        iterable = findClass('java/lang/Object')
>        iterator = findClass('java/util/Iterator')
>
> Not sure.  Probably more is required.
>
>>> Certainly java.util.List should be a sequence of some sort.
>>
>> It can be if you declare it via the --sequence jcc command line flag.
>
> So a change to the Makefile would be in order, for the 2.x branch.
>
> Is there an auto-downcasting switch for jcc?  That is, it would be nice,
> for sequences and mappings, if the "get" method would automatically cast
> the retrieved value to the Pythonic representation of the most specific
> type.

If that get() method is declared with that specific type, I expect it
to be used. Otherwise, this is what the 3.0 release is about (among
many other things): extensive use of Java 1.5, type parameters and
generics. Porting your stuff to 3.0 is thus highly recommended instead
of complaining about broken (my bad) long-deprecated APIs.

Andi..

>
> Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Andi Vajda <va...@apache.org> wrote:

> 
> On Jul 21, 2010, at 19:59, Bill Janssen <ja...@parc.com> wrote:
> 
> > Bill Janssen <ja...@parc.com> wrote:
> >
> >> What's crashing with PyLucene 2.9.3 is this code:
> >>
> >>     for field in x.getFields():
> >>
> >> where "x" is an instance of org.apache.lucene.document.Document.  I can
> >> print x and it looks OK, but an attempt to iterate over the list of
> >> fields seems broken.  Is this another iterator change?
> >
> > I see that I also can't iterate over x.getFields().listIterator(),
> > presumably because, in the Java 1.4 that Lucene 2.9.x uses,
> > java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
> > ridiculous.
> 
> Not ridiculous, impossible, since java.lang.Iterable appeared in Java
> 1.5 and Lucene 2.x claims Java 1.4 compatibility.

No, I mean it's ridiculous that I can't subscript or iterate, in Python,
a value of java.util.List.  You need something in jcc to support Java
1.4 sequence types as well as the new code for 1.5 sequence types.  I
presume that there will be a 2.9.4 in the future, right?

I looked at the jcc code a bit.  In jcc/python.py, this bit

    if env.java_version >= '1.5':
        iterable = findClass('java/lang/Iterable')
        iterator = findClass('java/util/Iterator')
    else:
        iterable = iterator = None

could perhaps become

    if env.java_version >= '1.5':
        iterable = findClass('java/lang/Iterable')
        iterator = findClass('java/util/Iterator')
    else:
        iterable = findClass('java/lang/Object')
        iterator = findClass('java/util/Iterator')

Not sure.  Probably more is required.

> > Certainly java.util.List should be a sequence of some sort.
> 
> It can be if you declare it via the --sequence jcc command line flag.

So a change to the Makefile would be in order, for the 2.x branch.

Is there an auto-downcasting switch for jcc?  That is, it would be nice,
for sequences and mappings, if the "get" method would automatically cast
the retrieved value to the Pythonic representation of the most specific
type.

Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 21, 2010, at 19:59, Bill Janssen <ja...@parc.com> wrote:

> Bill Janssen <ja...@parc.com> wrote:
>
>> What's crashing with PyLucene 2.9.3 is this code:
>>
>>     for field in x.getFields():
>>
>> where "x" is an instance of org.apache.lucene.document.Document.  I can
>> print x and it looks OK, but an attempt to iterate over the list of
>> fields seems broken.  Is this another iterator change?
>
> I see that I also can't iterate over x.getFields().listIterator(),
> presumably because, in the Java 1.4 that Lucene 2.9.x uses,
> java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
> ridiculous.

Not ridiculous, impossible, since java.lang.Iterable appeared in Java  
1.5 and Lucene 2.x claims Java 1.4 compatibility.

> Certainly java.util.List should be a sequence of some sort.

It can be if you declare it via the --sequence jcc command line flag.

Andi..

>
> Bill
>
>>
>> Bill
>>
>> Thread 14 Crashed:
>> 0   libjvm.dylib                      0x0198a1cb 0x18e7000 + 668107
>> 1   libjvm.dylib                      0x01af5c47 JNI_CreateJavaVM_Impl + 96759
>> 2   libjcc.dylib                      0x007a168d JCCEnv::callObjectMethod(_jobject*, _jmethodID*, ...) const + 73
>> 3   libjcc.dylib                      0x007a1254 JCCEnv::iterator(_jobject*) const + 34
>> 4   _lucene.so                        0x013c0e83 _object* get_iterator<java::util::t_List>(java::util::t_List*) + 59
>> 5   org.python.python                 0x00121dfd PyObject_GetIter + 107
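
Short of rebuilding with --sequence, the same effect can be approximated on the Python side. This is a sketch of the idea, not of what JCC generates: JavaListView is a hypothetical adapter that assumes the wrapped list exposes size() and get(int), and FakeJavaList stands in for a wrapped java.util.List.

```python
class JavaListView(object):
    """Present a java.util.List-style object as a read-only Python sequence."""

    def __init__(self, java_list):
        self._list = java_list

    def __len__(self):
        return self._list.size()

    def __getitem__(self, index):
        # Only integer indices are handled; slices are out of scope here.
        n = len(self)
        if index < 0:
            index += n
        if not 0 <= index < n:
            # Raising IndexError also lets a for-loop stop at the end.
            raise IndexError(index)
        return self._list.get(index)


class FakeJavaList(object):
    """Hypothetical stand-in for a wrapped java.util.List."""

    def __init__(self, items):
        self._items = list(items)

    def size(self):
        return len(self._items)

    def get(self, index):
        return self._items[index]
```

Because the view defines __len__ and __getitem__, both subscripting and plain for-loops work on it without any iterator support from the wrapper.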

Re: API changes between 2.9.2 and 2.9.3

Posted by Christian Heimes <ch...@cheimes.de>.
> Presumably that's no longer the case with JCC 2.6.  Probably should be
> updated to whatever the current version does.  Or perhaps versioned and
> checked into the source tree.

It could be related to the --no-generics addition. Have you tried to
recompile PyLucene with the option "JCCFLAGS="?

Christian

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 21, 2010, at 20:38, Bill Janssen <ja...@parc.com> wrote:

> Bill Janssen <ja...@parc.com> wrote:
>
>> Bill Janssen <ja...@parc.com> wrote:
>>
>>> What's crashing with PyLucene 2.9.3 is this code:
>>>
>>>     for field in x.getFields():
>>>
>>> where "x" is an instance of org.apache.lucene.document.Document.  I can
>>> print x and it looks OK, but an attempt to iterate over the list of
>>> fields seems broken.  Is this another iterator change?
>>
>> I see that I also can't iterate over x.getFields().listIterator(),
>> presumably because, in the Java 1.4 that Lucene 2.9.x uses,
>> java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
>> ridiculous.  Certainly java.util.List should be a sequence of some sort.
>
> I looked into this a bit further.  The common-build.xml file for Lucene
> 2.9.x specifies
>
>  <property name="javac.source" value="1.4"/>
>  <property name="javac.target" value="1.4"/>
>
> and in 1.4 the java.lang.Iterable interface from Java 1.5 doesn't exist.
>
> The docs for JCC still say this:
>
> ``When generating wrappers for Python, JCC attempts to detect which
> classes can be made iterable:
>
>  * When a class declares to implement java.util.Iterator or something
>    compatible with it, JCC makes it iterable from Python.
>
>  * When a Java class declares a method called iterator() with no
>    arguments returning a type compatible with java.util.Iterator, this
>    class is made iterable from Python.

Yes, that's incorrect with JCC 2.6.
I need to fix the docs.

Andi..

>
>  * When a Java class declares a method called next() with no arguments
>    returning an object type, this class is made iterable. Its next()
>    method is assumed to terminate iteration by returning null.''
>
> Presumably that's no longer the case with JCC 2.6.  Probably should be
> updated to whatever the current version does.  Or perhaps versioned and
> checked into the source tree.
>
> Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Bill Janssen <ja...@parc.com> wrote:

> Bill Janssen <ja...@parc.com> wrote:
> 
> > What's crashing with PyLucene 2.9.3 is this code:
> > 
> >      for field in x.getFields():
> > 
> > where "x" is an instance of org.apache.lucene.document.Document.  I can
> > print x and it looks OK, but an attempt to iterate over the list of
> > fields seems broken.  Is this another iterator change?
> 
> I see that I also can't iterate over x.getFields().listIterator(),
> presumably because, in the Java 1.4 that Lucene 2.9.x uses,
> java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
> ridiculous.  Certainly java.util.List should be a sequence of some sort.

I looked into this a bit further.  The common-build.xml file for Lucene
2.9.x specifies

  <property name="javac.source" value="1.4"/>
  <property name="javac.target" value="1.4"/>

and in 1.4 the java.lang.Iterable interface from Java 1.5 doesn't exist.

The docs for JCC still say this:

``When generating wrappers for Python, JCC attempts to detect which
classes can be made iterable:

  * When a class declares to implement java.util.Iterator or something
    compatible with it, JCC makes it iterable from Python.

  * When a Java class declares a method called iterator() with no
    arguments returning a type compatible with java.util.Iterator, this
    class is made iterable from Python.

  * When a Java class declares a method called next() with no arguments
    returning an object type, this class is made iterable. Its next()
    method is assumed to terminate iteration by returning null.''
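
The third rule in the quoted docs (a next() method that ends iteration by returning null) can be illustrated in plain Python. This only mirrors the documented behavior and says nothing about how JCC implements it; FakeEnumeration is a hypothetical stand-in for a wrapped Java object, and Java null is assumed to surface as None.

```python
def iterate_until_null(obj):
    """Yield values from obj.next() until it returns None (Java null)."""
    while True:
        item = obj.next()
        if item is None:
            return
        yield item


class FakeEnumeration(object):
    """Hypothetical stand-in whose next() returns None when exhausted."""

    def __init__(self, items):
        self._items = list(items)

    def next(self):
        if self._items:
            return self._items.pop(0)
        return None
```

A sequence that legitimately contains null elements would be cut short under this rule, which is one reason to prefer an explicit hasNext() where it exists.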

Presumably that's no longer the case with JCC 2.6.  Probably should be
updated to whatever the current version does.  Or perhaps versioned and
checked into the source tree.

Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Bill Janssen <ja...@parc.com> wrote:

> What's crashing with PyLucene 2.9.3 is this code:
> 
>      for field in x.getFields():
> 
> where "x" is an instance of org.apache.lucene.document.Document.  I can
> print x and it looks OK, but an attempt to iterate over the list of
> fields seems broken.  Is this another iterator change?

I see that I also can't iterate over x.getFields().listIterator(),
presumably because, in the Java 1.4 that Lucene 2.9.x uses,
java.util.Iterator doesn't "implement" java.lang.Iterable.  A tad
ridiculous.  Certainly java.util.List should be a sequence of some sort.

Bill

> 
> Bill
> 
> Thread 14 Crashed:
> 0   libjvm.dylib                  	0x0198a1cb 0x18e7000 + 668107
> 1   libjvm.dylib                  	0x01af5c47 JNI_CreateJavaVM_Impl + 96759
> 2   libjcc.dylib                  	0x007a168d JCCEnv::callObjectMethod(_jobject*, _jmethodID*, ...) const + 73
> 3   libjcc.dylib                  	0x007a1254 JCCEnv::iterator(_jobject*) const + 34
> 4   _lucene.so                    	0x013c0e83 _object* get_iterator<java::util::t_List>(java::util::t_List*) + 59
> 5   org.python.python             	0x00121dfd PyObject_GetIter + 107

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
What's crashing with PyLucene 2.9.3 is this code:

     for field in x.getFields():

where "x" is an instance of org.apache.lucene.document.Document.  I can
print x and it looks OK, but an attempt to iterate over the list of
fields seems broken.  Is this another iterator change?

Bill

Thread 14 Crashed:
0   libjvm.dylib                  	0x0198a1cb 0x18e7000 + 668107
1   libjvm.dylib                  	0x01af5c47 JNI_CreateJavaVM_Impl + 96759
2   libjcc.dylib                  	0x007a168d JCCEnv::callObjectMethod(_jobject*, _jmethodID*, ...) const + 73
3   libjcc.dylib                  	0x007a1254 JCCEnv::iterator(_jobject*) const + 34
4   _lucene.so                    	0x013c0e83 _object* get_iterator<java::util::t_List>(java::util::t_List*) + 59
5   org.python.python             	0x00121dfd PyObject_GetIter + 107

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Thomas Koch <ko...@orbiteam.de> wrote:

> > ...
> > I realize that PyLucene doesn't make that easy because it doesn't warn
> > about deprecated API use.
> > 
> [Thomas Koch] Well this is a general drawback in Python as an interpreted
> language I guess - wrong interfaces are only detected at runtime and are
> thus harder to test (unless you describe the interfaces and use tools such
> as pylint...)
> I wouldn't expect PyLucene to provide direct support here.
> 
> > One thing I could add to JCC is a command line flag to _not_ wrap any
> > deprecated APIs. With that applied to PyLucene, one could then find all
> > errors they'd be hitting when upgrading to 3.x. That being said, I don't
> > see the difference between this and just upgrading to 3.x and looking for
> > the very same errors since, by definition, 3.0 == 2.9 - deprecations. This
> > explains why I haven't implemented this feature so far.
> > 
> > Andi..
> > 
> [Thomas Koch] Thanks for the explanation - that makes it more clear to me
> now.
> 
> The question remains if it's feasible to support 2.x *and* 3.x  - as Bill
> mentioned "... I'd like to make it work on both." - me too.  I did fear that
> this makes things much more complicated and you end up with code "if
> int(lucene.VERSION.split('.')[0]) > 2: ... else ..." - we did that some time ago
> during GCJ and JCC based versions of PyLucene, but at that time it was
> merely a matter of different imports and init stuff (initVM).
> 
> But I understand now that as long as you remove deprecated code from 2.9 it
> *should* work with 2.9 and 3.0 as well! Right?
> 
> e.g.
> <method>Hits search(Query query)
>    Is now deprecated as 
> "Hits will be removed in Lucene 3.0" 
> 
> 2.9 already supports
> <method>TopDocs search(Query, Filter, int) 
> Which one should use instead.
> 
> The problem here is that - as far as I understand - you can make it work
> with 2.9 and 3.0 - but then you lose backward compatibility with any 2.x
> version before 2.9.... The point is that you may then end up forcing your
> users (admins) to install a newer version of PyLucene - which people may not
> want to do...

I changed my code to this:

try:
    from lucene import TopDocs
except ImportError:
    _have_topdocs = False
else:
    _have_topdocs = True

[...]

    if _have_topdocs:
        topdocs = s.search(parsed_query, count or 1000000)
        for hit in topdocs.scoreDocs:
            doc = s.doc(hit.doc)
            score = hit.score
            rval.append((doc.get("id"), score,))
    else:
        hits = s.search(parsed_query)
        for hit in hits:
            doc = Hit.cast_(hit).getDocument()
            score = Hit.cast_(hit).getScore()
            rval.append((doc.get("id"), score,))

Unfortunately, 2.9.3 now coredumps on me (OS X 10.5.8, system python 2.5):

Exception Type:  EXC_BAD_ACCESS (SIGBUS)
Exception Codes: KERN_PROTECTION_FAILURE at 0x0000000000000000
Crashed Thread:  14

VM state:not at safepoint (normal execution)
VM Mutex/Monitor currently owned by a thread: None

Heap
 def new generation   total 4544K, used 2441K [0x0d5a0000, 0x0da80000, 0x0fd00000)
  eden space 4096K,  48% used [0x0d5a0000, 0x0d7926a8, 0x0d9a0000)
  from space 448K, 100% used [0x0da10000, 0x0da80000, 0x0da80000)
  to   space 448K,   0% used [0x0d9a0000, 0x0d9a0000, 0x0da10000)
 tenured generation   total 60544K, used 722K [0x0fd00000, 0x13820000, 0x2d5a0000)
   the space 60544K,   1% used [0x0fd00000, 0x0fdb49c0, 0x0fdb4a00, 0x13820000)
 compacting perm gen  total 8192K, used 2246K [0x2d5a0000, 0x2dda0000, 0x315a0000)
   the space 8192K,  27% used [0x2d5a0000, 0x2d7d1ba8, 0x2d7d1c00, 0x2dda0000)
    ro space 8192K,  63% used [0x315a0000, 0x31abcf60, 0x31abd000, 0x31da0000)
    rw space 12288K,  43% used [0x31da0000, 0x322d35a8, 0x322d3600, 0x329a0000)

Virtual Machine arguments:
 JVM args: -Xms64m -Xmx512m -Xss100m -Djava.awt.headless=true
 Java command: <unknown>
 launcher type: generic

Thread 14 Crashed:
0   libjvm.dylib                  	0x019b81cb 0x1915000 + 668107
1   libjvm.dylib                  	0x01b23c47 JNI_CreateJavaVM_Impl + 96759
2   libjcc.dylib                  	0x0073368d JCCEnv::callObjectMethod(_jobject*, _jmethodID*, ...) const + 73
3   libjcc.dylib                  	0x00733254 JCCEnv::iterator(_jobject*) const + 34
4   _lucene.so                    	0x013c0e65 _object* get_iterator<java::util::t_List>(java::util::t_List*) + 59
5   org.python.python             	0x00121dfd PyObject_GetIter + 107
6   org.python.python             	0x0018edbd PyEval_EvalFrameEx + 15227
7   org.python.python             	0x00191173 PyEval_EvalCodeEx + 1638

I'm going back to 2.9.2 :-).

Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Aric Coady <ar...@gmail.com>.
On Jul 21, 2010, at 12:18 AM, Thomas Koch wrote:
> The question remains if it's feasible to support 2.x *and* 3.x  - as Bill
> mentioned "... I'd like to make it work on both." - me too.  I did fear that
> this makes things much more complicated and you end up with code "if
> int(lucene.VERSION.split('.')[0]) > 2: ... else ..." - we did that some time ago
> during GCJ and JCC based versions of PyLucene, but at that time it was
> merely a matter of different imports and init stuff (initVM).
> 
> But I understand now that as long as you remove deprecated code from 2.9 it
> *should* work with 2.9 and 3.0 as well! Right?

It's certainly possible, but there are some gotchas.  I've been maintaining 2.4, 2.9, and 3.0 for my project (http://code.google.com/p/lupyne/), and just recently dropped 2.4 support.

The conditional checks that are still left involve the Python* overrides.  There are several in 2.9 that still wrap the deprecated method or class, and of course they're missing in 3.0.  The ones I remember are PythonHitCollector, PythonFilter.bits, and PythonTokenFilter iteration.
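
A minimal sketch of that kind of conditional, written as feature detection rather than a version comparison. The helper name first_available is made up for illustration, and PythonCollector in the usage note is an assumption about what the replacement wrapper is called; only PythonHitCollector is named above.

```python
def first_available(module, names):
    """Return the first attribute of `module` listed in `names`, or None.

    `module` would normally be the imported lucene module; any object
    with attributes works, which keeps the helper easy to test.
    """
    for name in names:
        candidate = getattr(module, name, None)
        if candidate is not None:
            return candidate
    return None
```

With PyLucene loaded this would be used as something like first_available(lucene, ['PythonCollector', 'PythonHitCollector']), so the same source picks whichever wrapper the installed release still provides.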


RE: API changes between 2.9.2 and 2.9.3

Posted by Thomas Koch <ko...@orbiteam.de>.
> ...
> I realize that PyLucene doesn't make that easy because it doesn't warn
> about deprecated API use.
> 
[Thomas Koch] Well, this is a general drawback of Python as an interpreted
language, I guess - wrong interfaces are only detected at runtime and are
thus harder to test (unless you describe the interfaces and use tools such
as pylint...).
I wouldn't expect PyLucene to provide direct support here.

> One thing I could add to JCC is a command line flag to _not_ wrap any
> deprecated APIs. With that applied to PyLucene, one could then find all
> errors they'd be hitting when upgrading to 3.x. That being said, I don't
> see
> the difference between this and just upgrading to 3.x and looking for
> the
> very same errors since, by definition, 3.0 == 2.9 - deprecations. This
> explains why I haven't implemented this feature so far.
> 
> Andi..
> 
[Thomas Koch] Thanks for the explanation - that makes it more clear to me
now.

The question remains if it's feasible to support 2.x *and* 3.x  - as Bill
mentioned "... I'd like to make it work on both." - me too.  I did fear that
this makes things much more complicated and you end up with code "if
lucene.VERSION.split('.')[0]>2: ... else ..." - we did that some time ago
during GCJ and JCC based versions of PyLucene, but at that time it was
merely a matter of different imports and init stuff (initVM).

But I understand now that as long as you remove deprecated code from 2.9 it
*should* work with 2.9 and 3.0 as well! Right?

e.g.
<method>Hits search(Query query)
is now deprecated, with the note
"Hits will be removed in Lucene 3.0".

2.9 already supports
<method>TopDocs search(Query, Filter, int)
which one should use instead.

The problem here is that - as far as I understand - you can make it work
with 2.9 and 3.0 - but then you lose backward compatibility with any 2.x
version before 2.9.... The point is that you may then end up forcing your
users (admins) to install a newer version of PyLucene - which people may not
want to do...
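
A note on the version test quoted above: split('.')[0] yields a string, and in Python 2 comparing a string with the integer 2 does not raise an error but also does not compare the numbers, so the result does not depend on the version. A minimal sketch of a safer check follows; the helper names lucene_major and uses_3x_api are hypothetical and exist only for illustration.

```python
def lucene_major(version_string):
    """Return the major version as an int, e.g. '2.9.3' gives 2."""
    # Convert before comparing; the raw split() result is a string.
    return int(version_string.split('.')[0])


def uses_3x_api(version_string):
    """True when the given version string belongs to the 3.x (or later) series."""
    return lucene_major(version_string) >= 3
```

With PyLucene imported, the call would be uses_3x_api(lucene.VERSION).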


Regards
Thomas



RE: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Tue, 20 Jul 2010, Thomas Koch wrote:

>> Porting your stuff to Lucene 3.0 is recommended...
>>
> [Thomas Koch] That's what I'm supposed to do next: port our PyLucene code to
> some "up-to-date" release - our codebase is still on PyLucene 2.6 and I
> expect it to break with the 3.x release ...
>
> With that in mind: is it still worth looking at/supporting PyLucene 2.x or
> do you suggest to switch to 3.x directly?

PyLucene is coordinating its releases and support with Lucene Java. As long 
as Lucene Java is making 2.x releases, I see no reason to not do so as well 
for PyLucene 2.x.

> I understand that Lucene 2.x is Java1.4 compatible and Lucene 3.x makes use
> of Java5 generics. Is it still API compatible or a complete new branch?

Lucene Java has a policy to only break API compatibility when making major 
releases as when going from 2.x to 3.x. They sure did so for 3.0 and the 
Lucene Java 2.9.x release series is the one where both APIs are supported 
but with the obsolete 2.x APIs marked as deprecated.

In Lucene Java 3.0, they've removed all the APIs thus deprecated in 2.x. The 
point they're making, and rightly so, is that before upgrading to 3.x, be 
sure that you're not using any deprecated APIs by using the Java compiler 
deprecation warnings as a canary. Only then should you upgrade to 3.x.
I realize that PyLucene doesn't make that easy because it doesn't warn about 
deprecated API use.

One thing I could add to JCC is a command line flag to _not_ wrap any 
deprecated APIs. With that applied to PyLucene, one could then find all 
errors they'd be hitting when upgrading to 3.x. That being said, I don't see 
the difference between this and just upgrading to 3.x and looking for the 
very same errors since, by definition, 3.0 == 2.9 - deprecations. This 
explains why I haven't implemented this feature so far.

Andi..

>
> Regards
> Thomas
>
>
>
>

RE: API changes between 2.9.2 and 2.9.3

Posted by Thomas Koch <ko...@orbiteam.de>.
> Porting your stuff to Lucene 3.0 is recommended...
> 
[Thomas Koch] That's what I'm supposed to do next: port our PyLucene code to
some "up-to-date" release - our codebase is still on PyLucene 2.6 and I
expect it to break with the 3.x release ...

With that in mind: is it still worth looking at/supporting PyLucene 2.x or
do you suggest to switch to 3.x directly?

I understand that Lucene 2.x is Java1.4 compatible and Lucene 3.x makes use
of Java5 generics. Is it still API compatible or a complete new branch?

Regards
Thomas





Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Andi Vajda <va...@apache.org> wrote:

> On Jul 20, 2010, at 18:14, Bill Janssen <ja...@parc.com> wrote:
> 
> > Andi Vajda <va...@apache.org> wrote:
> >
> >>
> >> On Jul 20, 2010, at 4:40, Bill Janssen <ja...@parc.com> wrote:
> >>
> >>> Looks like the combination of JCC 2.6 and Lucene 2.9.3 have made
> >>> some
> >>> significant API changes.  This is what I get with 2.9.3:
> >>>
> >>> % python /u/python/uplib/indexing.py search /local/demo-repo/index
> >>> picasso
> >>> [...]
> >>> hits are <Hits: org.apache.lucene.search.Hits@f6f3dc> (0 hits)
> >>> Traceback (most recent call last):
> >>> File "/u/python/uplib/indexing.py", line 930, in <module>
> >>>   search(sys.argv[2], sys.argv[3:])
> >>> File "/u/python/uplib/indexing.py", line 897, in search
> >>>   print c.search(' '.join(searchterms))
> >>> File "/u/python/uplib/indexing.py", line 687, in search
> >>>   for hit in hits:
> >>> lucene.JavaError: java.lang.IndexOutOfBoundsException: Not a valid
> >>> hit number: 0
> >>>   Java stacktrace:
> >>> java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
> >>>   at org.apache.lucene.search.Hits.hitDoc(Hits.java:215)
> >>>   at org.apache.lucene.search.Hits.doc(Hits.java:168)
> >>>
> >>> In other words, Hits are now something I can take the length of, but
> >>> cannot enumerate?  Have we switched to TopDocs already?
> >>
> >> There was a bug in jcc that assumed an Iterable out of Hits because
> >> of
> >> its iterator() method. But Hits doesn't actually implement Iterable
> >> (it's a Java 1.5 thing and Lucene 2.x is Java 1.4 compatible) it only
> >> mimics it. You can call hits.iterator() for the same effect. See the
> >> tests and samples that use this class.
> >
> > So a bug fix in jcc 2.6 breaks working PyLucene code?  I see what
> > you're
> > saying, but it hardly seems like a good release policy for a micro
> > release bump, 2.9.2 to 2.9.3.  And it's probably worth mentioning in
> > the
> > PyLucene 2.9.3 change log.  I can't be the only one using "search
> > (query)".
> 
> All true. The multiplication of branches, parallel releases of Lucene,
> PyLucene and JCC got a bit out of hand. At least now, the JCCs are the
> same.
> 
> Then, tongue in cheek, this being a volunteer effort of the community,
> had you participated in trying the release out before it got too late,
> this mishap could have been averted. Complaining after the fact is
> only half as good :-)

I completely agree, no worries there.

> Of course, samples and tests using Hits broke after the fix. It didn't
> occur to me to make a bigger note of it. Sorry.
> 
> > Is this the change noted in the jcc 2.5.1 -> 2.5.2 CHANGES as
> > "fixed bug with not heeding type parameter for --sequence get method"?
> 
> No, it's noted in the 2.5 -> 2.6 changes, last item.
> 
> > Oh, well.  Do you suppose I can wrap Lucene 2.9.3 with jcc 2.5.1?  I'm
> > interested in the memory leak fixes in 2.9.3.
> 
> Of course you can. Some PyLucene unit tests might fail, though.

I'll try that, then.

> Porting your stuff to Lucene 3.0 is recommended...

Yep.  Though I'd like to make it work on both.

Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 20, 2010, at 18:14, Bill Janssen <ja...@parc.com> wrote:

> Andi Vajda <va...@apache.org> wrote:
>
>>
>> On Jul 20, 2010, at 4:40, Bill Janssen <ja...@parc.com> wrote:
>>
>>> Looks like the combination of JCC 2.6 and Lucene 2.9.3 have made  
>>> some
>>> significant API changes.  This is what I get with 2.9.3:
>>>
>>> % python /u/python/uplib/indexing.py search /local/demo-repo/index
>>> picasso
>>> [...]
>>> hits are <Hits: org.apache.lucene.search.Hits@f6f3dc> (0 hits)
>>> Traceback (most recent call last):
>>> File "/u/python/uplib/indexing.py", line 930, in <module>
>>>   search(sys.argv[2], sys.argv[3:])
>>> File "/u/python/uplib/indexing.py", line 897, in search
>>>   print c.search(' '.join(searchterms))
>>> File "/u/python/uplib/indexing.py", line 687, in search
>>>   for hit in hits:
>>> lucene.JavaError: java.lang.IndexOutOfBoundsException: Not a valid
>>> hit number: 0
>>>   Java stacktrace:
>>> java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
>>>   at org.apache.lucene.search.Hits.hitDoc(Hits.java:215)
>>>   at org.apache.lucene.search.Hits.doc(Hits.java:168)
>>>
>>> In other words, Hits are now something I can take the length of, but
>>> cannot enumerate?  Have we switched to TopDocs already?
>>
>> There was a bug in jcc that assumed an Iterable out of Hits because  
>> of
>> its iterator() method. But Hits doesn't actually implement Iterable
>> (it's a Java 1.5 thing and Lucene 2.x is Java 1.4 compatible) it only
>> mimics it. You can call hits.iterator() for the same effect. See the
>> tests and samples that use this class.
>
> So a bug fix in jcc 2.6 breaks working PyLucene code?  I see what  
> you're
> saying, but it hardly seems like a good release policy for a micro
> release bump, 2.9.2 to 2.9.3.  And it's probably worth mentioning in  
> the
> PyLucene 2.9.3 change log.  I can't be the only one using "search 
> (query)".

All true. The multiplication of branches, parallel releases of Lucene,  
PyLucene and JCC got a bit out of hand. At least now, the JCCs are the  
same.

Then, tongue in cheek, this being a volunteer effort of the community,  
had you participated in trying the release out before it got too late,  
this mishap could have been averted. Complaining after the fact is  
only half as good :-)

Of course, samples and tests using Hits broke after the fix. It didn't  
occur to me to make a bigger note of it. Sorry.

> Is this the change noted in the jcc 2.5.1 -> 2.5.2 CHANGES as
> "fixed bug with not heeding type parameter for --sequence get method"?

No, it's noted in the 2.5 -> 2.6 changes, last item.

> Oh, well.  Do you suppose I can wrap Lucene 2.9.3 with jcc 2.5.1?  I'm
> interested in the memory leak fixes in 2.9.3.

Of course you can. Some PyLucene unit tests might fail, though.

Porting your stuff to Lucene 3.0 is recommended...

Andi..

>
> Bill

Re: API changes between 2.9.2 and 2.9.3

Posted by Bill Janssen <ja...@parc.com>.
Andi Vajda <va...@apache.org> wrote:

> 
> On Jul 20, 2010, at 4:40, Bill Janssen <ja...@parc.com> wrote:
> 
> > Looks like the combination of JCC 2.6 and Lucene 2.9.3 have made some
> > significant API changes.  This is what I get with 2.9.3:
> >
> > % python /u/python/uplib/indexing.py search /local/demo-repo/index
> > picasso
> > [...]
> > hits are <Hits: org.apache.lucene.search.Hits@f6f3dc> (0 hits)
> > Traceback (most recent call last):
> >  File "/u/python/uplib/indexing.py", line 930, in <module>
> >    search(sys.argv[2], sys.argv[3:])
> >  File "/u/python/uplib/indexing.py", line 897, in search
> >    print c.search(' '.join(searchterms))
> >  File "/u/python/uplib/indexing.py", line 687, in search
> >    for hit in hits:
> > lucene.JavaError: java.lang.IndexOutOfBoundsException: Not a valid
> > hit number: 0
> >    Java stacktrace:
> > java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
> >    at org.apache.lucene.search.Hits.hitDoc(Hits.java:215)
> >    at org.apache.lucene.search.Hits.doc(Hits.java:168)
> >
> > In other words, Hits are now something I can take the length of, but
> > cannot enumerate?  Have we switched to TopDocs already?
> 
> There was a bug in jcc that assumed an Iterable out of Hits because of
> its iterator() method. But Hits doesn't actually implement Iterable
> (it's a Java 1.5 thing and Lucene 2.x is Java 1.4 compatible) it only
> mimics it. You can call hits.iterator() for the same effect. See the
> tests and samples that use this class.

So a bug fix in jcc 2.6 breaks working PyLucene code?  I see what you're
saying, but it hardly seems like a good release policy for a micro
release bump, 2.9.2 to 2.9.3.  And it's probably worth mentioning in the
PyLucene 2.9.3 change log.  I can't be the only one using "search(query)".

Is this the change noted in the jcc 2.5.1 -> 2.5.2 CHANGES as
"fixed bug with not heeding type parameter for --sequence get method"?

Oh, well.  Do you suppose I can wrap Lucene 2.9.3 with jcc 2.5.1?  I'm
interested in the memory leak fixes in 2.9.3.

Bill


Re: API changes between 2.9.2 and 2.9.3

Posted by Andi Vajda <va...@apache.org>.
On Jul 20, 2010, at 4:40, Bill Janssen <ja...@parc.com> wrote:

> Looks like the combination of JCC 2.6 and Lucene 2.9.3 have made some
> significant API changes.  This is what I get with 2.9.3:
>
> % python /u/python/uplib/indexing.py search /local/demo-repo/index  
> picasso
> [...]
> hits are <Hits: org.apache.lucene.search.Hits@f6f3dc> (0 hits)
> Traceback (most recent call last):
>  File "/u/python/uplib/indexing.py", line 930, in <module>
>    search(sys.argv[2], sys.argv[3:])
>  File "/u/python/uplib/indexing.py", line 897, in search
>    print c.search(' '.join(searchterms))
>  File "/u/python/uplib/indexing.py", line 687, in search
>    for hit in hits:
> lucene.JavaError: java.lang.IndexOutOfBoundsException: Not a valid  
> hit number: 0
>    Java stacktrace:
> java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
>    at org.apache.lucene.search.Hits.hitDoc(Hits.java:215)
>    at org.apache.lucene.search.Hits.doc(Hits.java:168)
>
> In other words, Hits are now something I can take the length of, but
> cannot enumerate?  Have we switched to TopDocs already?

There was a bug in jcc that assumed an Iterable out of Hits because of  
its iterator() method. But Hits doesn't actually implement Iterable
(it's a Java 1.5 thing and Lucene 2.x is Java 1.4 compatible); it only
mimics it. You can call hits.iterator() for the same effect. See the
tests and samples that use this class.

Yes, Hits is deprecated and moving to TopDocs is recommended as well.
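
For reference, a minimal sketch of what a TopDocs-based loop could look like. It is an illustration, not the code from indexing.py: the function name search_documents and the default limit of 50 are made up, and it assumes the searcher exposes the Lucene search(Query, Filter, int) method and that PyLucene accepts None for the Filter argument.

```python
def search_documents(searcher, query, limit=50):
    """Return up to `limit` matching documents, best match first.

    `searcher` is expected to behave like a Lucene IndexSearcher and
    `query` like a Lucene Query; both are passed in by the caller.
    """
    # search(Query, Filter, int) returns a TopDocs; None means "no filter".
    top_docs = searcher.search(query, None, limit)
    documents = []
    for score_doc in top_docs.scoreDocs:
        # ScoreDoc.doc is the document number; doc() loads the Document.
        documents.append(searcher.doc(score_doc.doc))
    return documents
```

An empty result is just an empty scoreDocs array, so the zero-hit case falls out of the loop without any special handling.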

Andi..

>
>
> % python /u/python/uplib/indexing.py search /local/demo-repo/index  
> Apple
> [...]
> hits are <Hits: org.apache.lucene.search.Hits@f4b0dc> (12)
> Traceback (most recent call last):
>  File "/u/python/uplib/indexing.py", line 930, in <module>
>    search(sys.argv[2], sys.argv[3:])
>  File "/u/python/uplib/indexing.py", line 897, in search
>    print c.search(' '.join(searchterms))
>  File "/u/python/uplib/indexing.py", line 688, in search
>    doc = Hit.cast_(hit).getDocument()
> TypeError: Document<stored/uncompressed,indexed<id: 
> 01182-38-8512-609> stored/uncompressed,indexed<uplibdate:20070620>  
> stored/uncompressed,indexed<uplibtype:whole> stored/ 
> uncompressed,indexed<categories:_(all)_>>
> %
>
> Bill