Posted to pylucene-dev@lucene.apache.org by "Michael McCandless (Created) (JIRA)" <ji...@apache.org> on 2011/11/22 18:20:40 UTC
[jira] [Created] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Add PythonReusableAnalyzerBase, so we can create analyzers in Python
--------------------------------------------------------------------
Key: PYLUCENE-12
URL: https://issues.apache.org/jira/browse/PYLUCENE-12
Project: PyLucene
Issue Type: Improvement
Reporter: Michael McCandless
Lucene now has a useful helper class, ReusableAnalyzerBase; you subclass it and override one method to create an analyzer that provides a reusableTokenStream implementation.
I think we should expose it in Python; the patch is simple.
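A rough, self-contained sketch of the pattern described above, written in plain Python 3 rather than against the real Lucene or PyLucene classes. Every name here (ReusableAnalyzerSketch, create_components, reusable_token_stream, LowercaseWhitespaceComponents) is a stand-in invented for illustration; only the idea of overriding one factory method while the base class caches and reuses what it returns is taken from the description.

```python
import io
import threading


class ReusableAnalyzerSketch:
    """Hypothetical stand-in for the helper: build components once per thread, then reuse them."""

    def __init__(self):
        self._local = threading.local()

    def create_components(self, field_name, reader):
        # The single method a subclass is expected to override.
        raise NotImplementedError

    def reusable_token_stream(self, field_name, reader):
        components = getattr(self._local, "components", None)
        if components is None:
            # First call on this thread: ask the subclass to build the chain.
            components = self.create_components(field_name, reader)
            self._local.components = components
        else:
            # Later calls: point the cached chain at the new input.
            components.reset(reader)
        return components


class LowercaseWhitespaceComponents:
    """Toy 'tokenizer plus filter' chain: split on whitespace, lowercase."""

    def __init__(self, reader):
        self.reader = reader

    def reset(self, reader):
        self.reader = reader

    def tokens(self):
        return [token.lower() for token in self.reader.read().split()]


class MyAnalyzer(ReusableAnalyzerSketch):
    def __init__(self):
        super().__init__()
        self.created = 0  # counts how often the chain is actually built

    def create_components(self, field_name, reader):
        self.created += 1
        return LowercaseWhitespaceComponents(reader)


analyzer = MyAnalyzer()
first = analyzer.reusable_token_stream("body", io.StringIO("Hello World"))
first_tokens = first.tokens()
second = analyzer.reusable_token_stream("body", io.StringIO("Second CALL"))
second_tokens = second.tokens()
```

The second call returns the same components object as the first, so the subclass's factory method runs only once per thread in this sketch.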
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Andi Vajda (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161404#comment-13161404 ]
Andi Vajda commented on PYLUCENE-12:
------------------------------------
you say: "I know we document that you must call super (http://lucene.apache.org/pylucene/jcc/documentation/readme.html#extensions), but, can we make this throw an exception instead of SEGV, to be more friendly? Or is that hard...? "
It's not hard, just costly: the wrapped pointer would have to be checked everywhere it is used. It's like checking whether initVM() or
attachCurrentThread() has been called. For those, it took a while to find a way that didn't involve checking all the time.
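To illustrate the trade-off, here is a small pure-Python sketch, not JCC code. The Wrapper class and its _handle attribute are hypothetical stand-ins for an extension type that holds a native pointer; the point is only that a friendly error requires a check in every method that touches the handle.

```python
class Wrapper:
    """Hypothetical stand-in for an extension type wrapping a native object."""

    _handle = None  # only set when Wrapper.__init__ actually runs

    def __init__(self):
        self._handle = object()

    def _checked_handle(self):
        # The per-use check: every method that needs the handle pays for this test.
        if self._handle is None:
            raise RuntimeError(
                type(self).__name__ + ".__init__ did not call the base class __init__"
            )
        return self._handle

    def use(self):
        return self._checked_handle() is not None


class Good(Wrapper):
    def __init__(self):
        super().__init__()


class Forgetful(Wrapper):
    def __init__(self):
        pass  # forgets super().__init__(), so the handle is never set


good_result = Good().use()
try:
    Forgetful().use()
    forgetful_error = None
except RuntimeError as exc:
    forgetful_error = str(exc)
```

Without the check in _checked_handle, the forgetful subclass would hand an unset handle to whatever uses it, which in native code is where a crash would come from.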
> Add PythonReusableAnalyzerBase, so we can create analyzers in Python
> --------------------------------------------------------------------
>
> Key: PYLUCENE-12
> URL: https://issues.apache.org/jira/browse/PYLUCENE-12
> Project: PyLucene
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: PYLUCENE-12.patch, PYLUCENE-12.patch
>
>
> Lucene now has a useful helper class, ReusableAnalyzerBase; you subclass it and override one method, to create an analyzer that provides reusableTokenStream impl.
> I think we should expose it in Python... patch is simple.
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157322#comment-13157322 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
One small fix to the patch: we also must add this:
@Override
public native Reader initReader(Reader reader);
So that the Python defined analyzer can provide a CharReader/Filter as well.
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase,
so we can create analyzers in Python
Posted by Michael McCandless <lu...@mikemccandless.com>.
On Sun, Dec 4, 2011 at 10:50 AM, Andi Vajda <va...@apache.org> wrote:
>> Just to be certain: how can I validate I truly succeeded in shared
>> linking for the lucene extension...? I'm on linux... when I run "nm"
>> on the _lucene.so, what should I look for to confirm I "succeeded"...?
>
> Use ldd (with the right flag) on _lucene.so, it should depend on libjcc.so if built shared.
Hmm indeed I did link shared.
OK! My bad... I had failed to "make clean" last time. Once I did
that, I now see the exception details. So it looks like not linking
shared was my problem.
Thanks Andi!
Mike McCandless
http://blog.mikemccandless.com
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so we can create analyzers in Python
Posted by Andi Vajda <va...@apache.org>.
On Dec 4, 2011, at 6:52, Michael McCandless <lu...@mikemccandless.com> wrote:
> On Sun, Dec 4, 2011 at 9:25 AM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>> On Sat, Dec 3, 2011 at 5:10 PM, Andi Vajda <va...@apache.org> wrote:
>>>
>>> On Fri, 2 Dec 2011, Michael McCandless (Commented) (JIRA) wrote:
>>>
>>>> RE the exception inside createComponents... strange! Your exception
>>>> indeed has all the details (ie, shows the original traceback, from the
>>>> createComponents method).
>>>>
>>>> Yet, when I do exactly that change (stick the x in, then run the test case
>>>> directly, I get this:
>>>
>>> Did you build your lucene module with --shared (and did you build jcc with
>>> shared enabled, the default normally). It occurred to me that exception
>>> reporting is a bit weaker in non shared mode because the PythonException
>>> java class is not present. Just a thought...
>>
>> Hmm, I believe I built jcc with the defaults (shared), but indeed I
>> did not build the lucene extension shared... I'll try to build shared
>> and see if that fixes the exception reporting! If so, maybe we should
>> note this limitation of non-shared...
>
> Hmm I went and built the lucene extension shared (added --shared to
> the command line passed to the jcc module, in the top-level Makefile) but
> I still don't get the traceback inside Python... spooky.
>
> Just to be certain: how can I validate I truly succeeded in shared
> linking for the lucene extension...? I'm on linux... when I run "nm"
> on the _lucene.so, what should I look for to confirm I "succeeded"...?
Use ldd (with the right flag) on _lucene.so, it should depend on libjcc.so if built shared.
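As a sketch of that check, the snippet below runs ldd on a file and looks for a dependency whose name starts with libjcc. The helper names (ldd_output, depends_on) are made up for illustration, no particular ldd flag is assumed, and the only assumption about the output is the usual "name => path (address)" line shape.

```python
import subprocess


def ldd_output(path):
    """Run ldd on a shared object and return its textual output."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return result.stdout


def depends_on(output, library="libjcc"):
    """Return True if any dependency line in ldd-style output names `library`."""
    for line in output.splitlines():
        # Dependency lines look like "name => path (address)"; keep the name part.
        name = line.split("=>")[0].strip()
        if name.startswith(library):
            return True
    return False


# Usage (path is whatever your build produced):
#     print(depends_on(ldd_output("_lucene.so")))
```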
Andi..
>
> Mike McCandless
>
> http://blog.mikemccandless.com
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase,
so we can create analyzers in Python
Posted by Michael McCandless <lu...@mikemccandless.com>.
On Sun, Dec 4, 2011 at 9:25 AM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> On Sat, Dec 3, 2011 at 5:10 PM, Andi Vajda <va...@apache.org> wrote:
>>
>> On Fri, 2 Dec 2011, Michael McCandless (Commented) (JIRA) wrote:
>>
>>> RE the exception inside createComponents... strange! Your exception
>>> indeed has all the details (ie, shows the original traceback, from the
>>> createComponents method).
>>>
>>> Yet, when I do exactly that change (stick the x in, then run the test case
>>> directly, I get this:
>>
>> Did you build your lucene module with --shared (and did you build jcc with
>> shared enabled, the default normally). It occurred to me that exception
>> reporting is a bit weaker in non shared mode because the PythonException
>> java class is not present. Just a thought...
>
> Hmm, I believe I built jcc with the defaults (shared), but indeed I
> did not build the lucene extension shared... I'll try to build shared
> and see if that fixes the exception reporting! If so, maybe we should
> note this limitation of non-shared...
Hmm I went and built the lucene extension shared (added --shared to
the command line passed to the jcc module, in the top-level Makefile) but
I still don't get the traceback inside Python... spooky.
Just to be certain: how can I validate I truly succeeded in shared
linking for the lucene extension...? I'm on linux... when I run "nm"
on the _lucene.so, what should I look for to confirm I "succeeded"...?
Mike McCandless
http://blog.mikemccandless.com
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase,
so we can create analyzers in Python
Posted by Michael McCandless <lu...@mikemccandless.com>.
On Sat, Dec 3, 2011 at 5:10 PM, Andi Vajda <va...@apache.org> wrote:
>
> On Fri, 2 Dec 2011, Michael McCandless (Commented) (JIRA) wrote:
>
>> RE the exception inside createComponents... strange! Your exception
>> indeed has all the details (ie, shows the original traceback, from the
>> createComponents method).
>>
>> Yet, when I do exactly that change (stick the x in, then run the test case
>> directly, I get this:
>
> Did you build your lucene module with --shared (and did you build jcc with
> shared enabled, the default normally). It occurred to me that exception
> reporting is a bit weaker in non shared mode because the PythonException
> java class is not present. Just a thought...
Hmm, I believe I built jcc with the defaults (shared), but indeed I
did not build the lucene extension shared... I'll try to build shared
and see if that fixes the exception reporting! If so, maybe we should
note this limitation of non-shared...
Mike McCandless
http://blog.mikemccandless.com
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase,
so we can create analyzers in Python
Posted by Andi Vajda <va...@apache.org>.
On Fri, 2 Dec 2011, Michael McCandless (Commented) (JIRA) wrote:
> RE the exception inside createComponents... strange! Your exception
> indeed has all the details (ie, shows the original traceback, from the
> createComponents method).
>
> Yet, when I do exactly that change (stick the x in, then run the test case
> directly, I get this:
Did you build your lucene module with --shared (and did you build jcc with
shared enabled, the default normally). It occurred to me that exception
reporting is a bit weaker in non shared mode because the PythonException
java class is not present. Just a thought...
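The difference between the two reports can be pictured with a plain-Python sketch. This is not how JCC is implemented; report_type_only and report_full are hypothetical helpers that only contrast passing along the exception's type name with passing along the formatted traceback.

```python
import traceback


def create_components():
    # Deliberate NameError, like the stray "x" added in the test.
    return xfirst  # noqa: F821


def report_type_only(func):
    """Weaker report: only the exception's type name survives."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__


def report_full(func):
    """Richer report: the formatted traceback, including the message and frames."""
    try:
        func()
    except Exception:
        return traceback.format_exc()


short_report = report_type_only(create_components)
full_report = report_full(create_components)
```

The short report says only that a NameError happened; the full one also names the missing variable and the function where it was raised.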
Andi..
>
>
> ======================================================================
> ERROR: testReusable (__main__.ReusableAnalyzerBaseTestCase)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "test/test_ReusableAnalyzerBase.py", line 36, in testReusable
> stream = method("test", reader)
> JavaError: java.lang.RuntimeException: NameError
> Java stacktrace:
> java.lang.RuntimeException: NameError
> at org.apache.pylucene.analysis.PythonReusableAnalyzerBase.createComponents(Native Method)
> at org.apache.lucene.analysis.ReusableAnalyzerBase.reusableTokenStream(ReusableAnalyzerBase.java:73)
>
>
> Ie, for some reason, I don't get the traceback from the createComponents method; all I see is that a NameError had happened, not what name in particular, and what lines of Python source.
>
> I'm on Linux, Python 64 bit, Java 1.6.0_21... I wonder if I somehow compiled things incorrectly? Odd.
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161718#comment-13161718 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
RE the exception inside createComponents... strange! Your exception indeed has all the details (ie, shows the original traceback, from the createComponents method).
Yet, when I make exactly that change (stick the x in), then run the test case directly, I get this:
======================================================================
ERROR: testReusable (__main__.ReusableAnalyzerBaseTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_ReusableAnalyzerBase.py", line 36, in testReusable
    stream = method("test", reader)
JavaError: java.lang.RuntimeException: NameError
Java stacktrace:
java.lang.RuntimeException: NameError
    at org.apache.pylucene.analysis.PythonReusableAnalyzerBase.createComponents(Native Method)
    at org.apache.lucene.analysis.ReusableAnalyzerBase.reusableTokenStream(ReusableAnalyzerBase.java:73)
I.e., for some reason I don't get the traceback from the createComponents method; all I see is that a NameError happened, not which name in particular or which lines of Python source.
I'm on Linux, Python 64 bit, Java 1.6.0_21... I wonder if I somehow compiled things incorrectly? Odd.
[jira] [Updated] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated PYLUCENE-12:
---------------------------------------
Attachment: PYLUCENE-12.patch
New patch (just fixes my indentation screwup from last one).
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Andi Vajda (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161406#comment-13161406 ]
Andi Vajda commented on PYLUCENE-12:
------------------------------------
About the lack of information in the stacktrace, I added a random x into the createComponents method and I'm getting this:
{noformat}
Traceback (most recent call last):
  File "test/test_ReusableAnalyzerBase.py", line 36, in testReusable
    stream = method("test", reader)
JavaError: org.apache.jcc.PythonException: global name 'xfirst' is not defined
Traceback (most recent call last):
  File "test/test_ReusableAnalyzerBase.py", line 24, in createComponents
    last = StopFilter(Version.LUCENE_CURRENT, xfirst, StopAnalyzer.ENGLISH_STOP_WORDS_SET)
NameError: global name 'xfirst' is not defined
Java stacktrace:
org.apache.jcc.PythonException: global name 'xfirst' is not defined
Traceback (most recent call last):
  File "test/test_ReusableAnalyzerBase.py", line 24, in createComponents
    last = StopFilter(Version.LUCENE_CURRENT, xfirst, StopAnalyzer.ENGLISH_STOP_WORDS_SET)
NameError: global name 'xfirst' is not defined
    at org.apache.pylucene.analysis.PythonReusableAnalyzerBase.createComponents(Native Method)
    at org.apache.lucene.analysis.ReusableAnalyzerBase.reusableTokenStream(ReusableAnalyzerBase.java:73)
{noformat}
Seems like plenty of detail to me. What do you think is missing?
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155389#comment-13155389 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
I noticed one unfriendliness here: if I modify the MyAnalyzer class (in test_ReusableAnalyzerBase.py), adding an empty ctor (def __init__) that fails to call super's __init__, then I get a SEGV.
I know we document that you must call super (http://lucene.apache.org/pylucene/jcc/documentation/readme.html#extensions), but, can we make this throw an exception instead of SEGV, to be more friendly? Or is that hard...?
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155397#comment-13155397 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
Hmm, one more unfriendliness: if the createComponents method throws an exception (eg put xxx in there so you hit a NameError), you get back an exception like this:
{noformat}
======================================================================
ERROR: testReusable (__main__.ReusableAnalyzerBaseTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_ReusableAnalyzerBase.py", line 37, in testReusable
    stream = method("test", reader)
JavaError: java.lang.RuntimeException: NameError
Java stacktrace:
java.lang.RuntimeException: NameError
    at org.apache.pylucene.analysis.PythonReusableAnalyzerBase.createComponents(Native Method)
    at org.apache.lucene.analysis.ReusableAnalyzerBase.reusableTokenStream(ReusableAnalyzerBase.java:73)
{noformat}
Somehow this is missing the details (exception cause and traceback) of the Python source that caused the exception. Can we fix this? If it's tricky I can open a new issue...
Re: [jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase,
so we can create analyzers in Python
Posted by Andi Vajda <va...@apache.org>.
On Fri, 2 Dec 2011, Michael McCandless (Commented) (JIRA) wrote:
> Sorry, could you also add this method to PythonReusableAnalyzerBase.java (I missed it in my first patch):
>
> @Override
> public native Reader initReader(Reader reader);
Done in rev 1209756.
> Separately: how do we turn on Jira's markup like {noformat} and comment previews here ;)
I have no idea.
I find JIRA mail irritating anyway (it takes a whole iPhone screen to
display nothing of use, like a 5 line long useless URL, for example).
Andi..
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161715#comment-13161715 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
Sorry, could you also add this method to PythonReusableAnalyzerBase.java (I missed it in my first patch):
@Override
public native Reader initReader(Reader reader);
Separately: how do we turn on Jira's markup like {noformat} and comment previews here ;)
[jira] [Commented] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161716#comment-13161716 ]
Michael McCandless commented on PYLUCENE-12:
--------------------------------------------
Re not SEGVing if you fail to call super ... OK, if we can't find a non-costly way to do it, let's not!
[jira] [Resolved] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Andi Vajda (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andi Vajda resolved PYLUCENE-12.
--------------------------------
Resolution: Fixed
rev 1209356, thanks Mike !
[jira] [Updated] (PYLUCENE-12) Add PythonReusableAnalyzerBase, so
we can create analyzers in Python
Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PYLUCENE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated PYLUCENE-12:
---------------------------------------
Attachment: PYLUCENE-12.patch
Patch w/ basic test.