Posted to pylucene-dev@lucene.apache.org by Bill Janssen <ja...@parc.com> on 2009/02/22 23:19:10 UTC
how to instantiate a Set?
I'm probably missing something incredibly obvious here...
I'm trying to call MoreLikeThis.setStopWords(Set words). I've got a
list of stop words in Python, but I can't figure out how to turn that
into a Java Set. I tried "lucene.HashSet(set(words))",
"lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
forth, without much luck.
Bill
Re: how to instantiate a Set?
Posted by Andi Vajda <va...@apache.org>.
On Feb 23, 2009, at 8:42, Bill Janssen <ja...@parc.com> wrote:
>>>> a = JavaSet(set(['foo', 'bar', 'baz']))
>
> How about letting me initialize JavaSet with a sequence, too?
>
>>>> a = JavaSet(['foo', 'bar', 'baz'])
>>>>
>
Well, sure, but the point of JavaSet is to expose a set you own and
control to Java. If you just want to create a set for Java, the Arrays
route works just as well and produces a faster set, since its values
are held in Java.
Andi..
> Bill
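One way to reconcile the two positions, sketched here as an assumption and not as anything PyLucene ships: a small helper that passes an existing set through untouched, so the caller keeps ownership of it, and copies any other iterable into a new set. The name as_owned_set is hypothetical.

```python
def as_owned_set(items):
    """Return items itself if it is already a set, else a new set built from it."""
    if isinstance(items, set):
        # Same object: later changes made by the caller stay visible.
        return items
    # Sequence or other iterable: copy its elements into a fresh set.
    return set(items)


stop_words = set(['foo', 'bar'])
same = as_owned_set(stop_words)
stop_words.add('baz')  # the caller still controls the set

from_list = as_owned_set(['foo', 'bar', 'foo'])
```

With a helper like this, JavaSet(as_owned_set(words)) would accept either a set or a sequence while keeping the "set you own" behaviour for the set case.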
Re: how to instantiate a Set?
Posted by Bill Janssen <ja...@parc.com>.
>>> a = JavaSet(set(['foo', 'bar', 'baz']))
How about letting me initialize JavaSet with a sequence, too?
>>> a = JavaSet(['foo', 'bar', 'baz'])
Bill
Re: how to instantiate a Set?
Posted by Andi Vajda <va...@apache.org>.
On Sun, 22 Feb 2009, Andi Vajda wrote:
>
> On Sun, 22 Feb 2009, Bill Janssen wrote:
>
>> I'm probably missing something incredibly obvious here...
>>
>> I'm trying to call MoreLikeThis.setStopWords(Set words). I've got a
>> list of stop words in Python, but I can't figure out how to turn that
>> into a Java Set. I tried "lucene.HashSet(set(words))",
>> "lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
>> forth, without much luck.
>
> PyLucene doesn't wrap the java.util.Arrays class that fills in the Java gap
> between arrays and collections. That should be considered an oversight of
> mine. I should add it to the JCC invocation in PyLucene's Makefile. Then you
> would be able to pass your JArray instance to the Arrays.asList() method to
> make a List and finally feed that to a HashSet.
>
> Another alternative is to implement a Python extension of the Java Set
> interface. Guess what? That is already part of PyLucene. The PythonSet class
> is the extension point for implementing a Java Set in Python and that is part
> of the PyLucene distribution.
>
> I even have such a Python implementation of a Java Set, called JavaSet.py,
> ready here but it's not currently shipping with PyLucene, another oversight
> of mine. I should add it to the distribution.
>
> Until then, here it is below. It takes a Python set instance as its
> constructor argument and implements the complete Java Set interface. This
> example also illustrates a Python implementation of the Java Iterator
> interface.
I added a collections.py module to the PyLucene distribution.
To use it:
>>> from lucene.collections import JavaSet
>>> from lucene import initVM, CLASSPATH
>>> initVM(CLASSPATH)
>>> a = JavaSet(set(['foo', 'bar', 'baz']))
I also added some missing proxies for the mapping and sequence protocols so
that JavaSet can be iterated and used with the 'in' operator from Python.
Andi..
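The proxies mentioned above are not shown in the thread. As a rough, pure-Python sketch of the idea (not PyLucene's actual implementation), the classes below map Python's iteration, 'in', and len() protocols onto Java-style iterator(), contains(), and size() methods. JavaStyleSet, _ListIterator, and PythonProtocolProxy are hypothetical stand-ins so that the example runs without a JVM.

```python
class _ListIterator(object):
    """Java-style iterator: hasNext() / next() over a snapshot list."""

    def __init__(self, items):
        self._items = list(items)
        self._index = 0

    def hasNext(self):
        return self._index < len(self._items)

    def next(self):
        value = self._items[self._index]
        self._index += 1
        return value


class JavaStyleSet(object):
    """Hypothetical stand-in for a wrapped java.util.Set (no JVM needed)."""

    def __init__(self, items):
        # Drop duplicates while keeping first-seen order.
        self._items = list(dict.fromkeys(items))

    def size(self):
        return len(self._items)

    def contains(self, obj):
        return obj in self._items

    def iterator(self):
        return _ListIterator(self._items)


class PythonProtocolProxy(object):
    """Adapts Java-style methods to Python's iteration and 'in' protocols."""

    def __init__(self, java_set):
        self._java_set = java_set

    def __iter__(self):
        # Drive the Java-style iterator and yield its elements one by one.
        it = self._java_set.iterator()
        while it.hasNext():
            yield it.next()

    def __contains__(self, obj):
        return self._java_set.contains(obj)

    def __len__(self):
        return self._java_set.size()


proxy = PythonProtocolProxy(JavaStyleSet(['foo', 'bar', 'baz', 'foo']))
```

Here list(proxy), 'bar' in proxy, and len(proxy) all go through the Java-style methods, which is the behaviour the added proxies are described as providing for JavaSet.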
Re: how to instantiate a Set?
Posted by Andi Vajda <va...@apache.org>.
On Sun, 22 Feb 2009, Bill Janssen wrote:
> I'm probably missing something incredibly obvious here...
>
> I'm trying to call MoreLikeThis.setStopWords(Set words). I've got a
> list of stop words in Python, but I can't figure out how to turn that
> into a Java Set. I tried "lucene.HashSet(set(words))",
> "lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
> forth, without much luck.
PyLucene doesn't wrap the java.util.Arrays class that fills in the Java gap
between arrays and collections. That should be considered an oversight of
mine. I should add it to the JCC invocation in PyLucene's Makefile. Then you
would be able to pass your JArray instance to the Arrays.asList() method to
make a List and finally feed that to a HashSet.
Another alternative is to implement a Python extension of the Java Set
interface. Guess what? That is already part of PyLucene. The PythonSet
class is the extension point for implementing a Java Set in Python and that
is part of the PyLucene distribution.
I even have such a Python implementation of a Java Set, called JavaSet.py,
ready here but it's not currently shipping with PyLucene, another oversight
of mine. I should add it to the distribution.
Until then, here it is below. It takes a Python set instance as its
constructor argument and implements the complete Java Set interface. This
example also illustrates a Python implementation of the Java Iterator
interface.
Please let me know if this works for you.
Thanks!
Andi..
----------------------------------------------------------------
from lucene import PythonSet, PythonIterator, JavaError


class JavaSet(PythonSet):
    """Exposes a Python set to Java through the java.util.Set interface."""

    def __init__(self, _set):
        super(JavaSet, self).__init__()
        self._set = _set

    def add(self, obj):
        # Like Java's Set.add(): return True only if the set changed.
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def addAll(self, collection):
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size

    def clear(self):
        self._set.clear()

    def contains(self, obj):
        return obj in self._set

    def containsAll(self, collection):
        for obj in collection:
            if obj not in self._set:
                return False
        return True

    def equals(self, collection):
        if type(self) is type(collection):
            return self._set == collection._set
        return False

    def isEmpty(self):
        return len(self._set) == 0

    def iterator(self):
        # _self is the iterator instance; self is the enclosing JavaSet.
        class _iterator(PythonIterator):
            def __init__(_self):
                super(_iterator, _self).__init__()
                _self._iterator = iter(self._set)

            def hasNext(_self):
                # Fetch one element ahead and hold it in _next
                # until next() is called.
                if hasattr(_self, '_next'):
                    return True
                try:
                    _self._next = next(_self._iterator)
                    return True
                except StopIteration:
                    return False

            def next(_self):
                # Hand out the held element first, if there is one.
                if hasattr(_self, '_next'):
                    value = _self._next
                    del _self._next
                else:
                    value = next(_self._iterator)
                return value

        return _iterator()

    def remove(self, obj):
        try:
            self._set.remove(obj)
            return True
        except KeyError:
            return False

    def removeAll(self, collection):
        result = False
        for obj in collection:
            try:
                self._set.remove(obj)
                result = True
            except KeyError:
                pass
        return result

    def retainAll(self, collection):
        result = False
        # Iterate over a copy because the set is modified in the loop.
        for obj in list(self._set):
            if obj not in collection:
                self._set.remove(obj)
                result = True
        return result

    def size(self):
        return len(self._set)

    def toArray(self):
        return list(self._set)
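
The iterator in the listing above bridges two protocols: Java code asks hasNext() before calling next(), while a Python iterator only signals exhaustion by raising StopIteration. Below is a minimal standalone sketch of that one-element lookahead, written without the PythonIterator base class so it can run without PyLucene. The class name LookaheadIterator and the _MISSING sentinel are illustrative choices, not part of PyLucene.

```python
class LookaheadIterator(object):
    _MISSING = object()  # sentinel: no element is currently buffered

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._buffered = self._MISSING

    def hasNext(self):
        # An element fetched by an earlier hasNext() is still waiting.
        if self._buffered is not self._MISSING:
            return True
        try:
            self._buffered = next(self._iterator)
            return True
        except StopIteration:
            return False

    def next(self):
        # Hand out the buffered element first, then fall back to the iterator.
        if self._buffered is not self._MISSING:
            value, self._buffered = self._buffered, self._MISSING
            return value
        return next(self._iterator)


it = LookaheadIterator(['foo', 'bar'])
seen = []
while it.hasNext():
    it.hasNext()  # calling hasNext() twice must not skip an element
    seen.append(it.next())
```

The sentinel plays the role of the hasattr()/del pair in the listing: both record whether an element has been fetched ahead of time, so repeated hasNext() calls never consume more than one element.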
Re: how to instantiate a Set?
Posted by Andi Vajda <va...@apache.org>.
On Sun, 22 Feb 2009, Bill Janssen wrote:
> I'm probably missing something incredibly obvious here...
>
> I'm trying to call MoreLikeThis.setStopWords(Set words). I've got a
> list of stop words in Python, but I can't figure out how to turn that
> into a Java Set. I tried "lucene.HashSet(set(words))",
> "lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
> forth, without much luck.
I just added the Arrays class to the build by adding java.util.Arrays to the
jcc invocation in PyLucene's Makefile and now the following just works:
>>> a=Arrays.asList(JArray('string')(('foo', 'bar', 'baz')))
>>> a
<List: [foo, bar, baz]>
>>> HashSet(a)
<HashSet: [foo, baz, bar]>
I should have that checked in shortly.
You then get to decide: use the "makes me cringe" Arrays class or the
pythonic JavaSet.py/PythonSet.java combo :)
Andi..