You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pylucene-dev@lucene.apache.org by Andi Vajda <va...@apache.org> on 2014/07/04 19:17:58 UTC

Re: JCC Project Extensions

> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
> 
> Hi Andi,
> 
> First of all, absolutely fantastic work on the JCC project, we love it - Second of all, apologies for contacting you out of the blue via email, but I wasn't sure of the best place to contact you to ask a few questions (technical-level, not support-level).  If it isn't acceptable to email please let me know and I can move the discussion somewhere else.

Yes, please include pylucene-dev@lucene.apache.org (subscription required. send mail pylucene-dev-subscribe@ and follow the instructions in the reply).

> A bit of background first of all :- 
> 
> We've been integrating JCC for the past two weeks on a sizeable Python/Java/C++ project (where Python is a controller/glue layer) having migrated from a previous solution to integrate across the languages, and for the most part JCC has been a lifesaver.  
> 
> Technically we've only hit one major issue so far and that has been the translation of exceptions through the layers.  Our Java application does a lot of double-dispatch (visitation) type of execution and we're calling outwards to Python to implement the concrete visitation logic.  
> 
> In the case of an exception being thrown it results in JCC collating a traceback and captured exception at every level of scope, resulting in a giant JavaError exception which has lost it's original PythonException context.
> 
> I was actually able to fix this by attempting to capture the full context via PyErr_Fetch() when the first python object is thrown and storing it within PythonException, and then ensuring that this is carried through each layer appropriately - This seems to work very well, although admittedly I haven't considered regressive errors or compatibility yet, but just wanted to proof-of-concept it first of all then speak with a JCC maintainer.
> 
> The net result is being able to catch errors in a pipeline similar to the following:
> 
> Python
>    -> Java 
>        -> Python
>          -> Raise MyException
>        <- Throw PythonException (containing MyException)
>    <- Inject MyException
> Catch MyException
> 
> My main questions are to ask your thoughts about :-
> 
> - What are your thoughts about through-layer exceptions and a feature improvement like this?

I agree that losing information during exception throwing is not good. If you've got an improvement in the form of a patch, do not hesitate in sending it in.

> - JCC is (I think) currently at home with pylucene, are there are plans to making it separate?

No such plans at the moment.

> - Are you happy if I create a public github project and add the patched code in there for review?

You're welcome to fork the code if you'd like but you don't have to if all you want is sending a patch. Your call.

Thank you for the kind words !

Andi..

> 
> Hope to hear from you!
> 
> Cheers,
> Lee
> 
> 
> 
> TL;DR;
> 
> We would like to :-
> 
> - Improve JCC's through-layer exception support.
> - Establish a github project for JCC alone.
> 
> -- 
> Lee Skillen
> 
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
> 
> Office:  +44 (0)28 95 817888
> Mobile:  +44 (0)78 41 425152
> Web:     www.vulcanft.com 

Re: JCC Project Extensions

Posted by Andi Vajda <va...@apache.org>.
> On Jul 14, 2014, at 16:21, Lee Skillen <ls...@vulcanft.com> wrote:
> 
>> On 10 July 2014 14:25, Andi Vajda <va...@apache.org> wrote:
>> 
>>> On Jul 10, 2014, at 12:23, Lee Skillen <ls...@vulcanft.com> wrote:
>>> 
>>> Hey,
>>> 
>>>>> On 9 July 2014 18:38, Andi Vajda <va...@apache.org> wrote:
>>>>> 
>>>>> On Wed, 9 Jul 2014, Andi Vajda wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Wed, 9 Jul 2014, Lee Skillen wrote:
>>>>>> 
>>>>>> Hey Andi,
>>>>>> 
>>>>>>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Hi Lee,
>>>>>>> 
>>>>>>> 
>>>>>>>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>>>>>>> 
>>>>>>>> Andi, thanks for the reply - I've created a github mirror of the
>>>>>>>> pylucene
>>>>>>>> project for our own use (which I intend to keep synced with your SVN
>>>>>>>> repository as its official upstream), located at:
>>>>>>>> 
>>>>>>>> https://github.com/lskillen/pylucene
>>>>>>>> 
>>>>>>>> As suggested I have formatted (and attached) a patch of the unfinished
>>>>>>>> code
>>>>>>>> that we're using for the through-layer exceptions.  Alternatively the
>>>>>>>> diff
>>>>>>>> can be inspected via github by diff'ing between the new
>>>>>>>> feature-thru-exception branch that I have created and the master
>>>>>>>> branch, as
>>>>>>>> such:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>>>>>> 
>>>>>>>> Although we've run the test suite without issues I realise there may
>>>>>>>> still
>>>>>>>> be functionality/style/logical issues with the code.  I also suspect
>>>>>>>> that
>>>>>>>> there may not be specific test cases that target regression failures
>>>>>>>> for
>>>>>>>> exceptions (yet), so confidence isn't high!  All comments are welcome
>>>>>>>> and I
>>>>>>>> realise this will likely require further changes and a repeatable test
>>>>>>>> case
>>>>>>>> before it is acceptable, but that's fine.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> I took a look at your patch and used the main idea to rewrite it so
>>>>>>> that:
>>>>>>> - ref handling is correct (hopefully): you had it backwards when
>>>>>>> calling
>>>>>>>   Py_Restore(), it steals what you currently must own.
>>>>>>> - the Python error state is now saved as one tuple of (type, value,
>>>>>>> tb) on
>>>>>>>   PythonException, not as three strings and three fake Objects
>>>>>>> (longs).
>>>>>>> - ref handling is correct via a finalize() method DECREF'ing the saved
>>>>>>>   error state tuple on PythonException when that exception is
>>>>>>> collected.
>>>>>>> - getMessage() still works as before but the traceback is extracted on
>>>>>>>   demand, not at throw time as was done before.
>>>>>>> 
>>>>>>> The previous implementation was done so that no Python cross-VM
>>>>>>> reference had to be tracked, all the error data was string'ified at error
>>>>>>> time.
>>>>>>> Your change now requires that the error be kept 'alive' accross layers
>>>>>>> and refs must be tracked to avoid leaks.
>>>>>> 
>>>>>> Great feedback, thank you - I suspected that the reference handling
>>>>>> wasn't quite there (first foray into CPython internals), and I also
>>>>>> really wanted to utilise a tuple and get rid of the strings but wasn't
>>>>>> sure what the standard was for extending a class such as
>>>>>> PythonException.  I actually did a quick attempt at running everything
>>>>>> under debug mode to inspect for leaks, but suffice to say that my
>>>>>> usual toolbox for debugging and tracking memory leaks
>>>>>> (gdb/clang/valgrind) didn't work too well - Had some minor success
>>>>>> with the excellent objgraph Python package, but would need to spend
>>>>>> more time on it.
>>>>>> 
>>>>>>> 
>>>>>>> I did _not_ test this as I don't have a system currently running that
>>>>>>> uses JCC in reverse like this. Since you're actively using the feature,
>>>>>>> please test the attached patch (against trunk) and report back with
>>>>>>> comments, bugs, fixes, etc...
>>>>>> 
>>>>>> Not a problem!  I applied your modified patch against trunk, rebuilt
>>>>>> and re-ran our application.  The good news is that everything is
>>>>>> working really well, and the only bug that I could see was within the
>>>>>> RegisterNatives() call within registerNatives() for PythonException
>>>>>> (size was still 2, changed it to calculate the size of the struct).
>>>>>> I'll re-attach the patch with the changes for you to review.
>>>>> 
>>>>> 
>>>>> Woops, I missed that - even though I looked for it. Oh well. Thanks.
>>>> 
>>>> 
>>>> I attached a new patch with your fix. I also removed the 'clear()' method
>>>> since it's no longer necessary: once the PythonException is constructed, the
>>>> error state in the Python VM is cleared because of PyErr_Fetch() being
>>>> called during saveErrorState().
>>> 
>>> Changes are looking great - Applied against trunk again here and all
>>> working well.  I've also added a new (simple) test case within the
>>> test directory under the pylucene root (test_PythonException.py) that
>>> checks to see if the through-layer exceptions are working,
>>> realistically it should probably be checking the traceback as well but
>>> I think this would suffice to show that the exceptions are propagating
>>> properly.  On a side note, I did have an issue building pylucene
>>> because  java/org/apache/pylucene/store/PythonIndexOutput.java refused
>>> to build, as it was referencing a BufferedIndexOutput that seems to
>>> have been removed by LUCENE-5678 (I just removed it in my local trunk
>>> but didn't want to add that to the patch).
>> 
>> Yes, that's fixed in the upcoming release 4.9.0 available to be voted on. See thread on this list started a few days ago for more details.
>> 
>> I should try and integrate your testcase next...
>> Thanks !
>> 
>> Andi..
> 
> Great - I created PYLUCENE-30 to help track it further (hope that's acceptable).

Excellent, thanks !

> Let me know if you need any other help or additional input!

I'm on vacation at the moment so things are moving slowly...

Andi..

> 
>>> 
>>> Cheers,
>>> Lee
>>> 
>>>> 
>>>> Andi..
>>>> 
>>>> 
>>>>> 
>>>>>> The only other comments I have are that:
>>>>>> 
>>>>>> (1) This was still an issue with my patch, but I don't know if there
>>>>>> is anyone out there relying upon the exact format of the string that
>>>>>> gets returned by getMessage(), as it is probably going to be different
>>>>>> now that it is calculated at a different point - In saying that I
>>>>>> imagine you would only be doing that if you were specifically trying
>>>>>> to handle an exception from the Python layer and were trying to
>>>>>> re-create it using the string, which wouldn't be an issue now.
>>>>> 
>>>>> 
>>>>> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().
>>>>> 
>>>>>> (2) I also noticed that your patch doesn't have the #if
>>>>>> defined(PYTHON) part to protect certain parts that rely on things like
>>>>>> getPythonExceptionClass() - Not sure if that was intentional or not
>>>>>> (guessing it might be one of those things that are always defined
>>>>>> anyway), but just wanted to highlight it.
>>>>> 
>>>>> 
>>>>> Not needed, all of functions.cpp depends on python amyway.
>>>>> 
>>>>>> Apart from that, it looks great - In order to assist with replication
>>>>>> of the through-exception issue on your side I also whipped up a (very)
>>>>>> quick example and am also attaching this to the email.  Utilises a
>>>>>> noddy Makefile to get the job done, but running "make" or "make test"
>>>>>> will generate the java/jcc packages and then run the test.  Running
>>>>>> "make uninstall" will remove the jcc package, and "make clean" will
>>>>>> tidy up the build artifacts.  Standard stuff!
>>>>>> 
>>>>>> Hope that helps!
>>>>> 
>>>>> 
>>>>> Excellent, thanks !
>>>>> 
>>>>> Andi..
>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Lee
>>>>>> 
>>>>>>> Thanks !
>>>>>>> 
>>>>>>> Andi..
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Lee
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Andi,
>>>>>>>>> 
>>>>>>>>> First of all, absolutely fantastic work on the JCC project, we love it
>>>>>>>>> -
>>>>>>>>> Second of all, apologies for contacting you out of the blue via email,
>>>>>>>>> but
>>>>>>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>>>>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>>>>>>> please let me know and I can move the discussion somewhere else.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>>>>>>> required. send mail pylucene-dev-subscribe@ and follow the
>>>>>>>>> instructions
>>>>>>>>> in the reply).
>>>>>>>>> 
>>>>>>>>> A bit of background first of all :-
>>>>>>>>> 
>>>>>>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>>>>>>> Python/Java/C++ project (where Python is a controller/glue layer)
>>>>>>>>> having
>>>>>>>>> migrated from a previous solution to integrate across the languages,
>>>>>>>>> and
>>>>>>>>> for the most part JCC has been a lifesaver.
>>>>>>>>> 
>>>>>>>>> Technically we've only hit one major issue so far and that has been
>>>>>>>>> the
>>>>>>>>> translation of exceptions through the layers.  Our Java application
>>>>>>>>> does a
>>>>>>>>> lot of double-dispatch (visitation) type of execution and we're
>>>>>>>>> calling
>>>>>>>>> outwards to Python to implement the concrete visitation logic.
>>>>>>>>> 
>>>>>>>>> In the case of an exception being thrown it results in JCC collating a
>>>>>>>>> traceback and captured exception at every level of scope, resulting in
>>>>>>>>> a
>>>>>>>>> giant JavaError exception which has lost it's original PythonException
>>>>>>>>> context.
>>>>>>>>> 
>>>>>>>>> I was actually able to fix this by attempting to capture the full
>>>>>>>>> context
>>>>>>>>> via PyErr_Fetch() when the first python object is thrown and storing
>>>>>>>>> it
>>>>>>>>> within PythonException, and then ensuring that this is carried through
>>>>>>>>> each
>>>>>>>>> layer appropriately - This seems to work very well, although
>>>>>>>>> admittedly I
>>>>>>>>> haven't considered regressive errors or compatibility yet, but just
>>>>>>>>> wanted
>>>>>>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>>>>>> 
>>>>>>>>> The net result is being able to catch errors in a pipeline similar to
>>>>>>>>> the
>>>>>>>>> following:
>>>>>>>>> 
>>>>>>>>> Python
>>>>>>>>>  -> Java
>>>>>>>>>      -> Python
>>>>>>>>>        -> Raise MyException
>>>>>>>>>      <- Throw PythonException (containing MyException)
>>>>>>>>>  <- Inject MyException
>>>>>>>>> Catch MyException
>>>>>>>>> 
>>>>>>>>> My main questions are to ask your thoughts about :-
>>>>>>>>> 
>>>>>>>>> - What are your thoughts about through-layer exceptions and a feature
>>>>>>>>> improvement like this?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I agree that losing information during exception throwing is not good.
>>>>>>>>> If
>>>>>>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>>>>>>> sending it in.
>>>>>>>>> 
>>>>>>>>> - JCC is (I think) currently at home with pylucene, are there are
>>>>>>>>> plans to
>>>>>>>>> making it separate?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> No such plans at the moment.
>>>>>>>>> 
>>>>>>>>> - Are you happy if I create a public github project and add the
>>>>>>>>> patched
>>>>>>>>> code in there for review?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> You're welcome to fork the code if you'd like but you don't have to if
>>>>>>>>> all
>>>>>>>>> you want is sending a patch. Your call.
>>>>>>>>> 
>>>>>>>>> Thank you for the kind words !
>>>>>>>>> 
>>>>>>>>> Andi..
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hope to hear from you!
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Lee
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> TL;DR;
>>>>>>>>> 
>>>>>>>>> We would like to :-
>>>>>>>>> 
>>>>>>>>> - Improve JCC's through-layer exception support.
>>>>>>>>> - Establish a github project for JCC alone.
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Lee Skillen
>>>>>>>>> 
>>>>>>>>> Vulcan Financial Technologies
>>>>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>>>> 
>>>>>>>>> Office:  +44 (0)28 95 817888
>>>>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>>>>> Web:     www.vulcanft.com
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Lee Skillen
>>>>>>>> 
>>>>>>>> Vulcan Financial Technologies
>>>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>>> 
>>>>>>>> Office:  +44 (0)28 95 817888
>>>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>>>> Web:     www.vulcanft.com
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Lee Skillen
>>>>>> 
>>>>>> Vulcan Financial Technologies
>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>> 
>>>>>> Office:  +44 (0)28 95 817888
>>>>>> Web:     www.vulcanft.com
>>> 
>>> 
>>> 
>>> --
>>> Lee Skillen
>>> 
>>> Vulcan Financial Technologies
>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>> 
>>> Office:  +44 (0)28 95 817888
>>> Web:     www.vulcanft.com
>>> <feature-thru-exception-3.patch>
> 
> 
> 
> -- 
> Lee Skillen
> 
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
> 
> Office:  +44 (0)28 95 817888
> Web:     www.vulcanft.com

Re: JCC Project Extensions

Posted by Lee Skillen <ls...@vulcanft.com>.
On 10 July 2014 14:25, Andi Vajda <va...@apache.org> wrote:
>
>> On Jul 10, 2014, at 12:23, Lee Skillen <ls...@vulcanft.com> wrote:
>>
>> Hey,
>>
>>> On 9 July 2014 18:38, Andi Vajda <va...@apache.org> wrote:
>>>
>>>> On Wed, 9 Jul 2014, Andi Vajda wrote:
>>>>
>>>>
>>>>
>>>>> On Wed, 9 Jul 2014, Lee Skillen wrote:
>>>>>
>>>>> Hey Andi,
>>>>>
>>>>>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Lee,
>>>>>>
>>>>>>
>>>>>>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>>>>>>
>>>>>>> Andi, thanks for the reply - I've created a github mirror of the
>>>>>>> pylucene
>>>>>>> project for our own use (which I intend to keep synced with your SVN
>>>>>>> repository as its official upstream), located at:
>>>>>>>
>>>>>>> https://github.com/lskillen/pylucene
>>>>>>>
>>>>>>> As suggested I have formatted (and attached) a patch of the unfinished
>>>>>>> code
>>>>>>> that we're using for the through-layer exceptions.  Alternatively the
>>>>>>> diff
>>>>>>> can be inspected via github by diff'ing between the new
>>>>>>> feature-thru-exception branch that I have created and the master
>>>>>>> branch, as
>>>>>>> such:
>>>>>>>
>>>>>>>
>>>>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>>>>>
>>>>>>> Although we've run the test suite without issues I realise there may
>>>>>>> still
>>>>>>> be functionality/style/logical issues with the code.  I also suspect
>>>>>>> that
>>>>>>> there may not be specific test cases that target regression failures
>>>>>>> for
>>>>>>> exceptions (yet), so confidence isn't high!  All comments are welcome
>>>>>>> and I
>>>>>>> realise this will likely require further changes and a repeatable test
>>>>>>> case
>>>>>>> before it is acceptable, but that's fine.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I took a look at your patch and used the main idea to rewrite it so
>>>>>> that:
>>>>>>  - ref handling is correct (hopefully): you had it backwards when
>>>>>> calling
>>>>>>    Py_Restore(), it steals what you currently must own.
>>>>>>  - the Python error state is now saved as one tuple of (type, value,
>>>>>> tb) on
>>>>>>    PythonException, not as three strings and three fake Objects
>>>>>> (longs).
>>>>>>  - ref handling is correct via a finalize() method DECREF'ing the saved
>>>>>>    error state tuple on PythonException when that exception is
>>>>>> collected.
>>>>>>  - getMessage() still works as before but the traceback is extracted on
>>>>>>    demand, not at throw time as was done before.
>>>>>>
>>>>>> The previous implementation was done so that no Python cross-VM
>>>>>> reference had to be tracked, all the error data was string'ified at error
>>>>>> time.
>>>>>> Your change now requires that the error be kept 'alive' accross layers
>>>>>> and refs must be tracked to avoid leaks.
>>>>>
>>>>> Great feedback, thank you - I suspected that the reference handling
>>>>> wasn't quite there (first foray into CPython internals), and I also
>>>>> really wanted to utilise a tuple and get rid of the strings but wasn't
>>>>> sure what the standard was for extending a class such as
>>>>> PythonException.  I actually did a quick attempt at running everything
>>>>> under debug mode to inspect for leaks, but suffice to say that my
>>>>> usual toolbox for debugging and tracking memory leaks
>>>>> (gdb/clang/valgrind) didn't work too well - Had some minor success
>>>>> with the excellent objgraph Python package, but would need to spend
>>>>> more time on it.
>>>>>
>>>>>>
>>>>>> I did _not_ test this as I don't have a system currently running that
>>>>>> uses JCC in reverse like this. Since you're actively using the feature,
>>>>>> please test the attached patch (against trunk) and report back with
>>>>>> comments, bugs, fixes, etc...
>>>>>
>>>>> Not a problem!  I applied your modified patch against trunk, rebuilt
>>>>> and re-ran our application.  The good news is that everything is
>>>>> working really well, and the only bug that I could see was within the
>>>>> RegisterNatives() call within registerNatives() for PythonException
>>>>> (size was still 2, changed it to calculate the size of the struct).
>>>>> I'll re-attach the patch with the changes for you to review.
>>>>
>>>>
>>>> Woops, I missed that - even though I looked for it. Oh well. Thanks.
>>>
>>>
>>> I attached a new patch with your fix. I also removed the 'clear()' method
>>> since it's no longer necessary: once the PythonException is constructed, the
>>> error state in the Python VM is cleared because of PyErr_Fetch() being
>>> called during saveErrorState().
>>
>> Changes are looking great - Applied against trunk again here and all
>> working well.  I've also added a new (simple) test case within the
>> test directory under the pylucene root (test_PythonException.py) that
>> checks to see if the through-layer exceptions are working,
>> realistically it should probably be checking the traceback as well but
>> I think this would suffice to show that the exceptions are propagating
>> properly.  On a side note, I did have an issue building pylucene
>> because  java/org/apache/pylucene/store/PythonIndexOutput.java refused
>> to build, as it was referencing a BufferedIndexOutput that seems to
>> have been removed by LUCENE-5678 (I just removed it in my local trunk
>> but didn't want to add that to the patch).
>
> Yes, that's fixed in the upcoming release 4.9.0 available to be voted on. See thread on this list started a few days ago for more details.
>
> I should try and integrate your testcase next...
> Thanks !
>
> Andi..
>

Great - I created PYLUCENE-30 to help track it further (hope that's acceptable).

Let me know if you need any other help or additional input!

>>
>> Cheers,
>> Lee
>>
>>>
>>> Andi..
>>>
>>>
>>>>
>>>>> The only other comments I have are that:
>>>>>
>>>>> (1) This was still an issue with my patch, but I don't know if there
>>>>> is anyone out there relying upon the exact format of the string that
>>>>> gets returned by getMessage(), as it is probably going to be different
>>>>> now that it is calculated at a different point - In saying that I
>>>>> imagine you would only be doing that if you were specifically trying
>>>>> to handle an exception from the Python layer and were trying to
>>>>> re-create it using the string, which wouldn't be an issue now.
>>>>
>>>>
>>>> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().
>>>>
>>>>> (2) I also noticed that your patch doesn't have the #if
>>>>> defined(PYTHON) part to protect certain parts that rely on things like
>>>>> getPythonExceptionClass() - Not sure if that was intentional or not
>>>>> (guessing it might be one of those things that are always defined
>>>>> anyway), but just wanted to highlight it.
>>>>
>>>>
>>>> Not needed, all of functions.cpp depends on python amyway.
>>>>
>>>>> Apart from that, it looks great - In order to assist with replication
>>>>> of the through-exception issue on your side I also whipped up a (very)
>>>>> quick example and am also attaching this to the email.  Utilises a
>>>>> noddy Makefile to get the job done, but running "make" or "make test"
>>>>> will generate the java/jcc packages and then run the test.  Running
>>>>> "make uninstall" will remove the jcc package, and "make clean" will
>>>>> tidy up the build artifacts.  Standard stuff!
>>>>>
>>>>> Hope that helps!
>>>>
>>>>
>>>> Excellent, thanks !
>>>>
>>>> Andi..
>>>>
>>>>>
>>>>> Thanks,
>>>>> Lee
>>>>>
>>>>>> Thanks !
>>>>>>
>>>>>> Andi..
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Lee
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>>>>>
>>>>>>>> Hi Andi,
>>>>>>>>
>>>>>>>> First of all, absolutely fantastic work on the JCC project, we love it
>>>>>>>> -
>>>>>>>> Second of all, apologies for contacting you out of the blue via email,
>>>>>>>> but
>>>>>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>>>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>>>>>> please let me know and I can move the discussion somewhere else.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>>>>>> required. send mail pylucene-dev-subscribe@ and follow the
>>>>>>>> instructions
>>>>>>>> in the reply).
>>>>>>>>
>>>>>>>> A bit of background first of all :-
>>>>>>>>
>>>>>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>>>>>> Python/Java/C++ project (where Python is a controller/glue layer)
>>>>>>>> having
>>>>>>>> migrated from a previous solution to integrate across the languages,
>>>>>>>> and
>>>>>>>> for the most part JCC has been a lifesaver.
>>>>>>>>
>>>>>>>> Technically we've only hit one major issue so far and that has been
>>>>>>>> the
>>>>>>>> translation of exceptions through the layers.  Our Java application
>>>>>>>> does a
>>>>>>>> lot of double-dispatch (visitation) type of execution and we're
>>>>>>>> calling
>>>>>>>> outwards to Python to implement the concrete visitation logic.
>>>>>>>>
>>>>>>>> In the case of an exception being thrown it results in JCC collating a
>>>>>>>> traceback and captured exception at every level of scope, resulting in
>>>>>>>> a
>>>>>>>> giant JavaError exception which has lost it's original PythonException
>>>>>>>> context.
>>>>>>>>
>>>>>>>> I was actually able to fix this by attempting to capture the full
>>>>>>>> context
>>>>>>>> via PyErr_Fetch() when the first python object is thrown and storing
>>>>>>>> it
>>>>>>>> within PythonException, and then ensuring that this is carried through
>>>>>>>> each
>>>>>>>> layer appropriately - This seems to work very well, although
>>>>>>>> admittedly I
>>>>>>>> haven't considered regressive errors or compatibility yet, but just
>>>>>>>> wanted
>>>>>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>>>>>
>>>>>>>> The net result is being able to catch errors in a pipeline similar to
>>>>>>>> the
>>>>>>>> following:
>>>>>>>>
>>>>>>>> Python
>>>>>>>>   -> Java
>>>>>>>>       -> Python
>>>>>>>>         -> Raise MyException
>>>>>>>>       <- Throw PythonException (containing MyException)
>>>>>>>>   <- Inject MyException
>>>>>>>> Catch MyException
>>>>>>>>
>>>>>>>> My main questions are to ask your thoughts about :-
>>>>>>>>
>>>>>>>> - What are your thoughts about through-layer exceptions and a feature
>>>>>>>> improvement like this?
>>>>>>>>
>>>>>>>>
>>>>>>>> I agree that losing information during exception throwing is not good.
>>>>>>>> If
>>>>>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>>>>>> sending it in.
>>>>>>>>
>>>>>>>> - JCC is (I think) currently at home with pylucene, are there are
>>>>>>>> plans to
>>>>>>>> making it separate?
>>>>>>>>
>>>>>>>>
>>>>>>>> No such plans at the moment.
>>>>>>>>
>>>>>>>> - Are you happy if I create a public github project and add the
>>>>>>>> patched
>>>>>>>> code in there for review?
>>>>>>>>
>>>>>>>>
>>>>>>>> You're welcome to fork the code if you'd like but you don't have to if
>>>>>>>> all
>>>>>>>> you want is sending a patch. Your call.
>>>>>>>>
>>>>>>>> Thank you for the kind words !
>>>>>>>>
>>>>>>>> Andi..
>>>>>>>>
>>>>>>>>
>>>>>>>> Hope to hear from you!
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Lee
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> TL;DR;
>>>>>>>>
>>>>>>>> We would like to :-
>>>>>>>>
>>>>>>>> - Improve JCC's through-layer exception support.
>>>>>>>> - Establish a github project for JCC alone.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Lee Skillen
>>>>>>>>
>>>>>>>> Vulcan Financial Technologies
>>>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>>>
>>>>>>>> Office:  +44 (0)28 95 817888
>>>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>>>> Web:     www.vulcanft.com
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Lee Skillen
>>>>>>>
>>>>>>> Vulcan Financial Technologies
>>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>>
>>>>>>> Office:  +44 (0)28 95 817888
>>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>>> Web:     www.vulcanft.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Lee Skillen
>>>>>
>>>>> Vulcan Financial Technologies
>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>
>>>>> Office:  +44 (0)28 95 817888
>>>>> Web:     www.vulcanft.com
>>
>>
>>
>> --
>> Lee Skillen
>>
>> Vulcan Financial Technologies
>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>
>> Office:  +44 (0)28 95 817888
>> Web:     www.vulcanft.com
>> <feature-thru-exception-3.patch>



-- 
Lee Skillen

Vulcan Financial Technologies
1st Floor, 47 Malone Road, Belfast, BT9 6RY

Office:  +44 (0)28 95 817888
Web:     www.vulcanft.com

Re: JCC Project Extensions

Posted by Andi Vajda <va...@apache.org>.
> On Jul 10, 2014, at 12:23, Lee Skillen <ls...@vulcanft.com> wrote:
> 
> Hey,
> 
>> On 9 July 2014 18:38, Andi Vajda <va...@apache.org> wrote:
>> 
>>> On Wed, 9 Jul 2014, Andi Vajda wrote:
>>> 
>>> 
>>> 
>>>> On Wed, 9 Jul 2014, Lee Skillen wrote:
>>>> 
>>>> Hey Andi,
>>>> 
>>>>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Lee,
>>>>> 
>>>>> 
>>>>>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>>>>> 
>>>>>> Andi, thanks for the reply - I've created a github mirror of the
>>>>>> pylucene
>>>>>> project for our own use (which I intend to keep synced with your SVN
>>>>>> repository as its official upstream), located at:
>>>>>> 
>>>>>> https://github.com/lskillen/pylucene
>>>>>> 
>>>>>> As suggested I have formatted (and attached) a patch of the unfinished
>>>>>> code
>>>>>> that we're using for the through-layer exceptions.  Alternatively the
>>>>>> diff
>>>>>> can be inspected via github by diff'ing between the new
>>>>>> feature-thru-exception branch that I have created and the master
>>>>>> branch, as
>>>>>> such:
>>>>>> 
>>>>>> 
>>>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>>>> 
>>>>>> Although we've run the test suite without issues I realise there may
>>>>>> still
>>>>>> be functionality/style/logical issues with the code.  I also suspect
>>>>>> that
>>>>>> there may not be specific test cases that target regression failures
>>>>>> for
>>>>>> exceptions (yet), so confidence isn't high!  All comments are welcome
>>>>>> and I
>>>>>> realise this will likely require further changes and a repeatable test
>>>>>> case
>>>>>> before it is acceptable, but that's fine.
>>>>> 
>>>>> 
>>>>> 
>>>>> I took a look at your patch and used the main idea to rewrite it so
>>>>> that:
>>>>>  - ref handling is correct (hopefully): you had it backwards when
>>>>> calling
>>>>>    Py_Restore(), it steals what you currently must own.
>>>>>  - the Python error state is now saved as one tuple of (type, value,
>>>>> tb) on
>>>>>    PythonException, not as three strings and three fake Objects
>>>>> (longs).
>>>>>  - ref handling is correct via a finalize() method DECREF'ing the saved
>>>>>    error state tuple on PythonException when that exception is
>>>>> collected.
>>>>>  - getMessage() still works as before but the traceback is extracted on
>>>>>    demand, not at throw time as was done before.
>>>>> 
>>>>> The previous implementation was done so that no Python cross-VM
>>>>> reference had to be tracked, all the error data was string'ified at error
>>>>> time.
>>>>> Your change now requires that the error be kept 'alive' accross layers
>>>>> and refs must be tracked to avoid leaks.
>>>> 
>>>> Great feedback, thank you - I suspected that the reference handling
>>>> wasn't quite there (first foray into CPython internals), and I also
>>>> really wanted to utilise a tuple and get rid of the strings but wasn't
>>>> sure what the standard was for extending a class such as
>>>> PythonException.  I actually did a quick attempt at running everything
>>>> under debug mode to inspect for leaks, but suffice to say that my
>>>> usual toolbox for debugging and tracking memory leaks
>>>> (gdb/clang/valgrind) didn't work too well - Had some minor success
>>>> with the excellent objgraph Python package, but would need to spend
>>>> more time on it.
>>>> 
>>>>> 
>>>>> I did _not_ test this as I don't have a system currently running that
>>>>> uses JCC in reverse like this. Since you're actively using the feature,
>>>>> please test the attached patch (against trunk) and report back with
>>>>> comments, bugs, fixes, etc...
>>>> 
>>>> Not a problem!  I applied your modified patch against trunk, rebuilt
>>>> and re-ran our application.  The good news is that everything is
>>>> working really well, and the only bug that I could see was within the
>>>> RegisterNatives() call within registerNatives() for PythonException
>>>> (size was still 2, changed it to calculate the size of the struct).
>>>> I'll re-attach the patch with the changes for you to review.
>>> 
>>> 
>>> Woops, I missed that - even though I looked for it. Oh well. Thanks.
>> 
>> 
>> I attached a new patch with your fix. I also removed the 'clear()' method
>> since it's no longer necessary: once the PythonException is constructed, the
>> error state in the Python VM is cleared because of PyErr_Fetch() being
>> called during saveErrorState().
> 
> Changes are looking great - Applied against trunk again here and all
> working well.  I've also added a new (simple) test case within the
> test directory under the pylucene root (test_PythonException.py) that
> checks to see if the through-layer exceptions are working,
> realistically it should probably be checking the traceback as well but
> I think this would suffice to show that the exceptions are propagating
> properly.  On a side note, I did have an issue building pylucene
> because  java/org/apache/pylucene/store/PythonIndexOutput.java refused
> to build, as it was referencing a BufferedIndexOutput that seems to
> have been removed by LUCENE-5678 (I just removed it in my local trunk
> but didn't want to add that to the patch).

Yes, that's fixed in the upcoming release 4.9.0 available to be voted on. See thread on this list started a few days ago for more details.

I should try and integrate your testcase next...
Thanks !

Andi..

> 
> Cheers,
> Lee
> 
>> 
>> Andi..
>> 
>> 
>>> 
>>>> The only other comments I have are that:
>>>> 
>>>> (1) This was still an issue with my patch, but I don't know if there
>>>> is anyone out there relying upon the exact format of the string that
>>>> gets returned by getMessage(), as it is probably going to be different
>>>> now that it is calculated at a different point - In saying that I
>>>> imagine you would only be doing that if you were specifically trying
>>>> to handle an exception from the Python layer and were trying to
>>>> re-create it using the string, which wouldn't be an issue now.
>>> 
>>> 
>>> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().
>>> 
>>>> (2) I also noticed that your patch doesn't have the #if
>>>> defined(PYTHON) part to protect certain parts that rely on things like
>>>> getPythonExceptionClass() - Not sure if that was intentional or not
>>>> (guessing it might be one of those things that are always defined
>>>> anyway), but just wanted to highlight it.
>>> 
>>> 
>>> Not needed, all of functions.cpp depends on python amyway.
>>> 
>>>> Apart from that, it looks great - In order to assist with replication
>>>> of the through-exception issue on your side I also whipped up a (very)
>>>> quick example and am also attaching this to the email.  Utilises a
>>>> noddy Makefile to get the job done, but running "make" or "make test"
>>>> will generate the java/jcc packages and then run the test.  Running
>>>> "make uninstall" will remove the jcc package, and "make clean" will
>>>> tidy up the build artifacts.  Standard stuff!
>>>> 
>>>> Hope that helps!
>>> 
>>> 
>>> Excellent, thanks !
>>> 
>>> Andi..
>>> 
>>>> 
>>>> Thanks,
>>>> Lee
>>>> 
>>>>> Thanks !
>>>>> 
>>>>> Andi..
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Lee
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>>>> 
>>>>>>> Hi Andi,
>>>>>>> 
>>>>>>> First of all, absolutely fantastic work on the JCC project, we love it
>>>>>>> -
>>>>>>> Second of all, apologies for contacting you out of the blue via email,
>>>>>>> but
>>>>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>>>>> please let me know and I can move the discussion somewhere else.
>>>>>>> 
>>>>>>> 
>>>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>>>>> required. send mail pylucene-dev-subscribe@ and follow the
>>>>>>> instructions
>>>>>>> in the reply).
>>>>>>> 
>>>>>>> A bit of background first of all :-
>>>>>>> 
>>>>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>>>>> Python/Java/C++ project (where Python is a controller/glue layer)
>>>>>>> having
>>>>>>> migrated from a previous solution to integrate across the languages,
>>>>>>> and
>>>>>>> for the most part JCC has been a lifesaver.
>>>>>>> 
>>>>>>> Technically we've only hit one major issue so far and that has been
>>>>>>> the
>>>>>>> translation of exceptions through the layers.  Our Java application
>>>>>>> does a
>>>>>>> lot of double-dispatch (visitation) type of execution and we're
>>>>>>> calling
>>>>>>> outwards to Python to implement the concrete visitation logic.
>>>>>>> 
>>>>>>> In the case of an exception being thrown it results in JCC collating a
>>>>>>> traceback and captured exception at every level of scope, resulting in
>>>>>>> a
>>>>>>> giant JavaError exception which has lost it's original PythonException
>>>>>>> context.
>>>>>>> 
>>>>>>> I was actually able to fix this by attempting to capture the full
>>>>>>> context
>>>>>>> via PyErr_Fetch() when the first python object is thrown and storing
>>>>>>> it
>>>>>>> within PythonException, and then ensuring that this is carried through
>>>>>>> each
>>>>>>> layer appropriately - This seems to work very well, although
>>>>>>> admittedly I
>>>>>>> haven't considered regressive errors or compatibility yet, but just
>>>>>>> wanted
>>>>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>>>> 
>>>>>>> The net result is being able to catch errors in a pipeline similar to
>>>>>>> the
>>>>>>> following:
>>>>>>> 
>>>>>>> Python
>>>>>>>   -> Java
>>>>>>>       -> Python
>>>>>>>         -> Raise MyException
>>>>>>>       <- Throw PythonException (containing MyException)
>>>>>>>   <- Inject MyException
>>>>>>> Catch MyException
>>>>>>> 
>>>>>>> My main questions are to ask your thoughts about :-
>>>>>>> 
>>>>>>> - What are your thoughts about through-layer exceptions and a feature
>>>>>>> improvement like this?
>>>>>>> 
>>>>>>> 
>>>>>>> I agree that losing information during exception throwing is not good.
>>>>>>> If
>>>>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>>>>> sending it in.
>>>>>>> 
>>>>>>> - JCC is (I think) currently at home with pylucene, are there are
>>>>>>> plans to
>>>>>>> making it separate?
>>>>>>> 
>>>>>>> 
>>>>>>> No such plans at the moment.
>>>>>>> 
>>>>>>> - Are you happy if I create a public github project and add the
>>>>>>> patched
>>>>>>> code in there for review?
>>>>>>> 
>>>>>>> 
>>>>>>> You're welcome to fork the code if you'd like but you don't have to if
>>>>>>> all
>>>>>>> you want is sending a patch. Your call.
>>>>>>> 
>>>>>>> Thank you for the kind words !
>>>>>>> 
>>>>>>> Andi..
>>>>>>> 
>>>>>>> 
>>>>>>> Hope to hear from you!
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Lee
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> TL;DR;
>>>>>>> 
>>>>>>> We would like to :-
>>>>>>> 
>>>>>>> - Improve JCC's through-layer exception support.
>>>>>>> - Establish a github project for JCC alone.
>>>>>>> 
>>>>>>> --
>>>>>>> Lee Skillen
>>>>>>> 
>>>>>>> Vulcan Financial Technologies
>>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>> 
>>>>>>> Office:  +44 (0)28 95 817888
>>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>>> Web:     www.vulcanft.com
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Lee Skillen
>>>>>> 
>>>>>> Vulcan Financial Technologies
>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>> 
>>>>>> Office:  +44 (0)28 95 817888
>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>> Web:     www.vulcanft.com
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lee Skillen
>>>> 
>>>> Vulcan Financial Technologies
>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>> 
>>>> Office:  +44 (0)28 95 817888
>>>> Web:     www.vulcanft.com
> 
> 
> 
> -- 
> Lee Skillen
> 
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
> 
> Office:  +44 (0)28 95 817888
> Web:     www.vulcanft.com
> <feature-thru-exception-3.patch>

Re: JCC Project Extensions

Posted by Lee Skillen <ls...@vulcanft.com>.
Hey,

On 9 July 2014 18:38, Andi Vajda <va...@apache.org> wrote:
>
> On Wed, 9 Jul 2014, Andi Vajda wrote:
>
>>
>>
>> On Wed, 9 Jul 2014, Lee Skillen wrote:
>>
>>> Hey Andi,
>>>
>>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>>>
>>>>
>>>>
>>>>  Hi Lee,
>>>>
>>>>
>>>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>>>
>>>>> Andi, thanks for the reply - I've created a github mirror of the
>>>>> pylucene
>>>>> project for our own use (which I intend to keep synced with your SVN
>>>>> repository as its official upstream), located at:
>>>>>
>>>>> https://github.com/lskillen/pylucene
>>>>>
>>>>> As suggested I have formatted (and attached) a patch of the unfinished
>>>>> code
>>>>> that we're using for the through-layer exceptions.  Alternatively the
>>>>> diff
>>>>> can be inspected via github by diff'ing between the new
>>>>> feature-thru-exception branch that I have created and the master
>>>>> branch, as
>>>>> such:
>>>>>
>>>>>
>>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>>>
>>>>> Although we've run the test suite without issues I realise there may
>>>>> still
>>>>> be functionality/style/logical issues with the code.  I also suspect
>>>>> that
>>>>> there may not be specific test cases that target regression failures
>>>>> for
>>>>> exceptions (yet), so confidence isn't high!  All comments are welcome
>>>>> and I
>>>>> realise this will likely require further changes and a repeatable test
>>>>> case
>>>>> before it is acceptable, but that's fine.
>>>>
>>>>
>>>>
>>>> I took a look at your patch and used the main idea to rewrite it so
>>>> that:
>>>>   - ref handling is correct (hopefully): you had it backwards when
>>>> calling
>>>>     Py_Restore(), it steals what you currently must own.
>>>>   - the Python error state is now saved as one tuple of (type, value,
>>>> tb) on
>>>>     PythonException, not as three strings and three fake Objects
>>>> (longs).
>>>>   - ref handling is correct via a finalize() method DECREF'ing the saved
>>>>     error state tuple on PythonException when that exception is
>>>> collected.
>>>>   - getMessage() still works as before but the traceback is extracted on
>>>>     demand, not at throw time as was done before.
>>>>
>>>> The previous implementation was done so that no Python cross-VM
>>>> reference had to be tracked, all the error data was string'ified at error
>>>> time.
>>>> Your change now requires that the error be kept 'alive' accross layers
>>>> and refs must be tracked to avoid leaks.
>>>>
>>>
>>> Great feedback, thank you - I suspected that the reference handling
>>> wasn't quite there (first foray into CPython internals), and I also
>>> really wanted to utilise a tuple and get rid of the strings but wasn't
>>> sure what the standard was for extending a class such as
>>> PythonException.  I actually did a quick attempt at running everything
>>> under debug mode to inspect for leaks, but suffice to say that my
>>> usual toolbox for debugging and tracking memory leaks
>>> (gdb/clang/valgrind) didn't work too well - Had some minor success
>>> with the excellent objgraph Python package, but would need to spend
>>> more time on it.
>>>
>>>>
>>>> I did _not_ test this as I don't have a system currently running that
>>>> uses JCC in reverse like this. Since you're actively using the feature,
>>>> please test the attached patch (against trunk) and report back with
>>>> comments, bugs, fixes, etc...
>>>>
>>>
>>> Not a problem!  I applied your modified patch against trunk, rebuilt
>>> and re-ran our application.  The good news is that everything is
>>> working really well, and the only bug that I could see was within the
>>> RegisterNatives() call within registerNatives() for PythonException
>>> (size was still 2, changed it to calculate the size of the struct).
>>> I'll re-attach the patch with the changes for you to review.
>>
>>
>> Woops, I missed that - even though I looked for it. Oh well. Thanks.
>
>
> I attached a new patch with your fix. I also removed the 'clear()' method
> since it's no longer necessary: once the PythonException is constructed, the
> error state in the Python VM is cleared because of PyErr_Fetch() being
> called during saveErrorState().

Changes are looking great - Applied against trunk again here and all
working well.  I've also added a new (simple) test case within the
test directory under the pylucene root (test_PythonException.py) that
checks to see if the through-layer exceptions are working,
realistically it should probably be checking the traceback as well but
I think this would suffice to show that the exceptions are propagating
properly.  On a side note, I did have an issue building pylucene
because  java/org/apache/pylucene/store/PythonIndexOutput.java refused
to build, as it was referencing a BufferedIndexOutput that seems to
have been removed by LUCENE-5678 (I just removed it in my local trunk
but didn't want to add that to the patch).

Cheers,
Lee

>
> Andi..
>
>
>>
>>> The only other comments I have are that:
>>>
>>> (1) This was still an issue with my patch, but I don't know if there
>>> is anyone out there relying upon the exact format of the string that
>>> gets returned by getMessage(), as it is probably going to be different
>>> now that it is calculated at a different point - In saying that I
>>> imagine you would only be doing that if you were specifically trying
>>> to handle an exception from the Python layer and were trying to
>>> re-create it using the string, which wouldn't be an issue now.
>>
>>
>> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().
>>
>>> (2) I also noticed that your patch doesn't have the #if
>>> defined(PYTHON) part to protect certain parts that rely on things like
>>> getPythonExceptionClass() - Not sure if that was intentional or not
>>> (guessing it might be one of those things that are always defined
>>> anyway), but just wanted to highlight it.
>>
>>
>> Not needed, all of functions.cpp depends on python amyway.
>>
>>> Apart from that, it looks great - In order to assist with replication
>>> of the through-exception issue on your side I also whipped up a (very)
>>> quick example and am also attaching this to the email.  Utilises a
>>> noddy Makefile to get the job done, but running "make" or "make test"
>>> will generate the java/jcc packages and then run the test.  Running
>>> "make uninstall" will remove the jcc package, and "make clean" will
>>> tidy up the build artifacts.  Standard stuff!
>>>
>>> Hope that helps!
>>
>>
>> Excellent, thanks !
>>
>> Andi..
>>
>>>
>>> Thanks,
>>> Lee
>>>
>>>> Thanks !
>>>>
>>>> Andi..
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Lee
>>>>>
>>>>>
>>>>>
>>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>>>
>>>>>>
>>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>>>
>>>>>> Hi Andi,
>>>>>>
>>>>>> First of all, absolutely fantastic work on the JCC project, we love it
>>>>>> -
>>>>>> Second of all, apologies for contacting you out of the blue via email,
>>>>>> but
>>>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>>>> please let me know and I can move the discussion somewhere else.
>>>>>>
>>>>>>
>>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>>>> required. send mail pylucene-dev-subscribe@ and follow the
>>>>>> instructions
>>>>>> in the reply).
>>>>>>
>>>>>> A bit of background first of all :-
>>>>>>
>>>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>>>> Python/Java/C++ project (where Python is a controller/glue layer)
>>>>>> having
>>>>>> migrated from a previous solution to integrate across the languages,
>>>>>> and
>>>>>> for the most part JCC has been a lifesaver.
>>>>>>
>>>>>> Technically we've only hit one major issue so far and that has been
>>>>>> the
>>>>>> translation of exceptions through the layers.  Our Java application
>>>>>> does a
>>>>>> lot of double-dispatch (visitation) type of execution and we're
>>>>>> calling
>>>>>> outwards to Python to implement the concrete visitation logic.
>>>>>>
>>>>>> In the case of an exception being thrown it results in JCC collating a
>>>>>> traceback and captured exception at every level of scope, resulting in
>>>>>> a
>>>>>> giant JavaError exception which has lost it's original PythonException
>>>>>> context.
>>>>>>
>>>>>> I was actually able to fix this by attempting to capture the full
>>>>>> context
>>>>>> via PyErr_Fetch() when the first python object is thrown and storing
>>>>>> it
>>>>>> within PythonException, and then ensuring that this is carried through
>>>>>> each
>>>>>> layer appropriately - This seems to work very well, although
>>>>>> admittedly I
>>>>>> haven't considered regressive errors or compatibility yet, but just
>>>>>> wanted
>>>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>>>
>>>>>> The net result is being able to catch errors in a pipeline similar to
>>>>>> the
>>>>>> following:
>>>>>>
>>>>>> Python
>>>>>>    -> Java
>>>>>>        -> Python
>>>>>>          -> Raise MyException
>>>>>>        <- Throw PythonException (containing MyException)
>>>>>>    <- Inject MyException
>>>>>> Catch MyException
>>>>>>
>>>>>> My main questions are to ask your thoughts about :-
>>>>>>
>>>>>> - What are your thoughts about through-layer exceptions and a feature
>>>>>> improvement like this?
>>>>>>
>>>>>>
>>>>>> I agree that losing information during exception throwing is not good.
>>>>>> If
>>>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>>>> sending it in.
>>>>>>
>>>>>> - JCC is (I think) currently at home with pylucene, are there are
>>>>>> plans to
>>>>>> making it separate?
>>>>>>
>>>>>>
>>>>>> No such plans at the moment.
>>>>>>
>>>>>> - Are you happy if I create a public github project and add the
>>>>>> patched
>>>>>> code in there for review?
>>>>>>
>>>>>>
>>>>>> You're welcome to fork the code if you'd like but you don't have to if
>>>>>> all
>>>>>> you want is sending a patch. Your call.
>>>>>>
>>>>>> Thank you for the kind words !
>>>>>>
>>>>>> Andi..
>>>>>>
>>>>>>
>>>>>> Hope to hear from you!
>>>>>>
>>>>>> Cheers,
>>>>>> Lee
>>>>>>
>>>>>>
>>>>>>
>>>>>> TL;DR;
>>>>>>
>>>>>> We would like to :-
>>>>>>
>>>>>> - Improve JCC's through-layer exception support.
>>>>>> - Establish a github project for JCC alone.
>>>>>>
>>>>>> --
>>>>>> Lee Skillen
>>>>>>
>>>>>> Vulcan Financial Technologies
>>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>>
>>>>>> Office:  +44 (0)28 95 817888
>>>>>> Mobile:  +44 (0)78 41 425152
>>>>>> Web:     www.vulcanft.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Lee Skillen
>>>>>
>>>>> Vulcan Financial Technologies
>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>>
>>>>> Office:  +44 (0)28 95 817888
>>>>> Mobile:  +44 (0)78 41 425152
>>>>> Web:     www.vulcanft.com
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Lee Skillen
>>>
>>> Vulcan Financial Technologies
>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>
>>> Office:  +44 (0)28 95 817888
>>> Web:     www.vulcanft.com
>>>
>



-- 
Lee Skillen

Vulcan Financial Technologies
1st Floor, 47 Malone Road, Belfast, BT9 6RY

Office:  +44 (0)28 95 817888
Web:     www.vulcanft.com

Re: JCC Project Extensions

Posted by Andi Vajda <va...@apache.org>.
On Wed, 9 Jul 2014, Andi Vajda wrote:

>
>
> On Wed, 9 Jul 2014, Lee Skillen wrote:
>
>> Hey Andi,
>> 
>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>> 
>>>
>>>  Hi Lee,
>>> 
>>> 
>>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>> 
>>>> Andi, thanks for the reply - I've created a github mirror of the pylucene
>>>> project for our own use (which I intend to keep synced with your SVN
>>>> repository as its official upstream), located at:
>>>> 
>>>> https://github.com/lskillen/pylucene
>>>> 
>>>> As suggested I have formatted (and attached) a patch of the unfinished 
>>>> code
>>>> that we're using for the through-layer exceptions.  Alternatively the 
>>>> diff
>>>> can be inspected via github by diff'ing between the new
>>>> feature-thru-exception branch that I have created and the master branch, 
>>>> as
>>>> such:
>>>> 
>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>> 
>>>> Although we've run the test suite without issues I realise there may 
>>>> still
>>>> be functionality/style/logical issues with the code.  I also suspect that
>>>> there may not be specific test cases that target regression failures for
>>>> exceptions (yet), so confidence isn't high!  All comments are welcome and 
>>>> I
>>>> realise this will likely require further changes and a repeatable test 
>>>> case
>>>> before it is acceptable, but that's fine.
>>> 
>>> 
>>> I took a look at your patch and used the main idea to rewrite it so that:
>>>   - ref handling is correct (hopefully): you had it backwards when calling
>>>     Py_Restore(), it steals what you currently must own.
>>>   - the Python error state is now saved as one tuple of (type, value, tb) 
>>> on
>>>     PythonException, not as three strings and three fake Objects (longs).
>>>   - ref handling is correct via a finalize() method DECREF'ing the saved
>>>     error state tuple on PythonException when that exception is collected.
>>>   - getMessage() still works as before but the traceback is extracted on
>>>     demand, not at throw time as was done before.
>>> 
>>> The previous implementation was done so that no Python cross-VM reference 
>>> had to be tracked, all the error data was string'ified at error time.
>>> Your change now requires that the error be kept 'alive' accross layers and 
>>> refs must be tracked to avoid leaks.
>>> 
>> 
>> Great feedback, thank you - I suspected that the reference handling
>> wasn't quite there (first foray into CPython internals), and I also
>> really wanted to utilise a tuple and get rid of the strings but wasn't
>> sure what the standard was for extending a class such as
>> PythonException.  I actually did a quick attempt at running everything
>> under debug mode to inspect for leaks, but suffice to say that my
>> usual toolbox for debugging and tracking memory leaks
>> (gdb/clang/valgrind) didn't work too well - Had some minor success
>> with the excellent objgraph Python package, but would need to spend
>> more time on it.
>> 
>>> 
>>> I did _not_ test this as I don't have a system currently running that uses 
>>> JCC in reverse like this. Since you're actively using the feature, please 
>>> test the attached patch (against trunk) and report back with comments, 
>>> bugs, fixes, etc...
>>> 
>> 
>> Not a problem!  I applied your modified patch against trunk, rebuilt
>> and re-ran our application.  The good news is that everything is
>> working really well, and the only bug that I could see was within the
>> RegisterNatives() call within registerNatives() for PythonException
>> (size was still 2, changed it to calculate the size of the struct).
>> I'll re-attach the patch with the changes for you to review.
>
> Woops, I missed that - even though I looked for it. Oh well. Thanks.

I attached a new patch with your fix. I also removed the 'clear()' method 
since it's no longer necessary: once the PythonException is constructed, the 
error state in the Python VM is cleared because of PyErr_Fetch() being 
called during saveErrorState().

Andi..

>
>> The only other comments I have are that:
>> 
>> (1) This was still an issue with my patch, but I don't know if there
>> is anyone out there relying upon the exact format of the string that
>> gets returned by getMessage(), as it is probably going to be different
>> now that it is calculated at a different point - In saying that I
>> imagine you would only be doing that if you were specifically trying
>> to handle an exception from the Python layer and were trying to
>> re-create it using the string, which wouldn't be an issue now.
>
> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().
>
>> (2) I also noticed that your patch doesn't have the #if
>> defined(PYTHON) part to protect certain parts that rely on things like
>> getPythonExceptionClass() - Not sure if that was intentional or not
>> (guessing it might be one of those things that are always defined
>> anyway), but just wanted to highlight it.
>
> Not needed, all of functions.cpp depends on python amyway.
>
>> Apart from that, it looks great - In order to assist with replication
>> of the through-exception issue on your side I also whipped up a (very)
>> quick example and am also attaching this to the email.  Utilises a
>> noddy Makefile to get the job done, but running "make" or "make test"
>> will generate the java/jcc packages and then run the test.  Running
>> "make uninstall" will remove the jcc package, and "make clean" will
>> tidy up the build artifacts.  Standard stuff!
>> 
>> Hope that helps!
>
> Excellent, thanks !
>
> Andi..
>
>> 
>> Thanks,
>> Lee
>> 
>>> Thanks !
>>> 
>>> Andi..
>>> 
>>> 
>>>> 
>>>> Thanks,
>>>> Lee
>>>> 
>>>> 
>>>> 
>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>> 
>>>>> 
>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>> 
>>>>> Hi Andi,
>>>>> 
>>>>> First of all, absolutely fantastic work on the JCC project, we love it -
>>>>> Second of all, apologies for contacting you out of the blue via email, 
>>>>> but
>>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>>> please let me know and I can move the discussion somewhere else.
>>>>> 
>>>>> 
>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>>> required. send mail pylucene-dev-subscribe@ and follow the instructions
>>>>> in the reply).
>>>>> 
>>>>> A bit of background first of all :-
>>>>> 
>>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>>> Python/Java/C++ project (where Python is a controller/glue layer) having
>>>>> migrated from a previous solution to integrate across the languages, and
>>>>> for the most part JCC has been a lifesaver.
>>>>> 
>>>>> Technically we've only hit one major issue so far and that has been the
>>>>> translation of exceptions through the layers.  Our Java application does 
>>>>> a
>>>>> lot of double-dispatch (visitation) type of execution and we're calling
>>>>> outwards to Python to implement the concrete visitation logic.
>>>>> 
>>>>> In the case of an exception being thrown it results in JCC collating a
>>>>> traceback and captured exception at every level of scope, resulting in a
>>>>> giant JavaError exception which has lost it's original PythonException
>>>>> context.
>>>>> 
>>>>> I was actually able to fix this by attempting to capture the full 
>>>>> context
>>>>> via PyErr_Fetch() when the first python object is thrown and storing it
>>>>> within PythonException, and then ensuring that this is carried through 
>>>>> each
>>>>> layer appropriately - This seems to work very well, although admittedly 
>>>>> I
>>>>> haven't considered regressive errors or compatibility yet, but just 
>>>>> wanted
>>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>> 
>>>>> The net result is being able to catch errors in a pipeline similar to 
>>>>> the
>>>>> following:
>>>>> 
>>>>> Python
>>>>>    -> Java
>>>>>        -> Python
>>>>>          -> Raise MyException
>>>>>        <- Throw PythonException (containing MyException)
>>>>>    <- Inject MyException
>>>>> Catch MyException
>>>>> 
>>>>> My main questions are to ask your thoughts about :-
>>>>> 
>>>>> - What are your thoughts about through-layer exceptions and a feature
>>>>> improvement like this?
>>>>> 
>>>>> 
>>>>> I agree that losing information during exception throwing is not good. 
>>>>> If
>>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>>> sending it in.
>>>>> 
>>>>> - JCC is (I think) currently at home with pylucene, are there are plans 
>>>>> to
>>>>> making it separate?
>>>>> 
>>>>> 
>>>>> No such plans at the moment.
>>>>> 
>>>>> - Are you happy if I create a public github project and add the patched
>>>>> code in there for review?
>>>>> 
>>>>> 
>>>>> You're welcome to fork the code if you'd like but you don't have to if 
>>>>> all
>>>>> you want is sending a patch. Your call.
>>>>> 
>>>>> Thank you for the kind words !
>>>>> 
>>>>> Andi..
>>>>> 
>>>>> 
>>>>> Hope to hear from you!
>>>>> 
>>>>> Cheers,
>>>>> Lee
>>>>> 
>>>>> 
>>>>> 
>>>>> TL;DR;
>>>>> 
>>>>> We would like to :-
>>>>> 
>>>>> - Improve JCC's through-layer exception support.
>>>>> - Establish a github project for JCC alone.
>>>>> 
>>>>> --
>>>>> Lee Skillen
>>>>> 
>>>>> Vulcan Financial Technologies
>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>> 
>>>>> Office:  +44 (0)28 95 817888
>>>>> Mobile:  +44 (0)78 41 425152
>>>>> Web:     www.vulcanft.com
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lee Skillen
>>>> 
>>>> Vulcan Financial Technologies
>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>> 
>>>> Office:  +44 (0)28 95 817888
>>>> Mobile:  +44 (0)78 41 425152
>>>> Web:     www.vulcanft.com
>> 
>> 
>> 
>> 
>> -- 
>> Lee Skillen
>> 
>> Vulcan Financial Technologies
>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>> 
>> Office:  +44 (0)28 95 817888
>> Web:     www.vulcanft.com
>> 
>

Re: JCC Project Extensions

Posted by Andi Vajda <va...@apache.org>.

On Wed, 9 Jul 2014, Lee Skillen wrote:

> Hey Andi,
>
> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>>
>>
>>  Hi Lee,
>>
>>
>> On Tue, 8 Jul 2014, Lee Skillen wrote:
>>
>>> Andi, thanks for the reply - I've created a github mirror of the pylucene
>>> project for our own use (which I intend to keep synced with your SVN
>>> repository as its official upstream), located at:
>>>
>>> https://github.com/lskillen/pylucene
>>>
>>> As suggested I have formatted (and attached) a patch of the unfinished code
>>> that we're using for the through-layer exceptions.  Alternatively the diff
>>> can be inspected via github by diff'ing between the new
>>> feature-thru-exception branch that I have created and the master branch, as
>>> such:
>>>
>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>>
>>> Although we've run the test suite without issues I realise there may still
>>> be functionality/style/logical issues with the code.  I also suspect that
>>> there may not be specific test cases that target regression failures for
>>> exceptions (yet), so confidence isn't high!  All comments are welcome and I
>>> realise this will likely require further changes and a repeatable test case
>>> before it is acceptable, but that's fine.
>>
>>
>> I took a look at your patch and used the main idea to rewrite it so that:
>>   - ref handling is correct (hopefully): you had it backwards when calling
>>     Py_Restore(), it steals what you currently must own.
>>   - the Python error state is now saved as one tuple of (type, value, tb) on
>>     PythonException, not as three strings and three fake Objects (longs).
>>   - ref handling is correct via a finalize() method DECREF'ing the saved
>>     error state tuple on PythonException when that exception is collected.
>>   - getMessage() still works as before but the traceback is extracted on
>>     demand, not at throw time as was done before.
>>
>> The previous implementation was done so that no Python cross-VM reference had to be tracked, all the error data was string'ified at error time.
>> Your change now requires that the error be kept 'alive' accross layers and refs must be tracked to avoid leaks.
>>
>
> Great feedback, thank you - I suspected that the reference handling
> wasn't quite there (first foray into CPython internals), and I also
> really wanted to utilise a tuple and get rid of the strings but wasn't
> sure what the standard was for extending a class such as
> PythonException.  I actually did a quick attempt at running everything
> under debug mode to inspect for leaks, but suffice to say that my
> usual toolbox for debugging and tracking memory leaks
> (gdb/clang/valgrind) didn't work too well - Had some minor success
> with the excellent objgraph Python package, but would need to spend
> more time on it.
>
>>
>> I did _not_ test this as I don't have a system currently running that uses JCC in reverse like this. Since you're actively using the feature, please test the attached patch (against trunk) and report back with comments, bugs, fixes, etc...
>>
>
> Not a problem!  I applied your modified patch against trunk, rebuilt
> and re-ran our application.  The good news is that everything is
> working really well, and the only bug that I could see was within the
> RegisterNatives() call within registerNatives() for PythonException
> (size was still 2, changed it to calculate the size of the struct).
> I'll re-attach the patch with the changes for you to review.

Woops, I missed that - even though I looked for it. Oh well. Thanks.

> The only other comments I have are that:
>
> (1) This was still an issue with my patch, but I don't know if there
> is anyone out there relying upon the exact format of the string that
> gets returned by getMessage(), as it is probably going to be different
> now that it is calculated at a different point - In saying that I
> imagine you would only be doing that if you were specifically trying
> to handle an exception from the Python layer and were trying to
> re-create it using the string, which wouldn't be an issue now.

Yeah, I'm just concatenating the strings. I'm already using PyErr_Print().

> (2) I also noticed that your patch doesn't have the #if
> defined(PYTHON) part to protect certain parts that rely on things like
> getPythonExceptionClass() - Not sure if that was intentional or not
> (guessing it might be one of those things that are always defined
> anyway), but just wanted to highlight it.

Not needed, all of functions.cpp depends on python amyway.

> Apart from that, it looks great - In order to assist with replication
> of the through-exception issue on your side I also whipped up a (very)
> quick example and am also attaching this to the email.  Utilises a
> noddy Makefile to get the job done, but running "make" or "make test"
> will generate the java/jcc packages and then run the test.  Running
> "make uninstall" will remove the jcc package, and "make clean" will
> tidy up the build artifacts.  Standard stuff!
>
> Hope that helps!

Excellent, thanks !

Andi..

>
> Thanks,
> Lee
>
>> Thanks !
>>
>> Andi..
>>
>>
>>>
>>> Thanks,
>>> Lee
>>>
>>>
>>>
>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>>
>>>>
>>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>>
>>>> Hi Andi,
>>>>
>>>> First of all, absolutely fantastic work on the JCC project, we love it -
>>>> Second of all, apologies for contacting you out of the blue via email, but
>>>> I wasn't sure of the best place to contact you to ask a few questions
>>>> (technical-level, not support-level).  If it isn't acceptable to email
>>>> please let me know and I can move the discussion somewhere else.
>>>>
>>>>
>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>>> required. send mail pylucene-dev-subscribe@ and follow the instructions
>>>> in the reply).
>>>>
>>>> A bit of background first of all :-
>>>>
>>>> We've been integrating JCC for the past two weeks on a sizeable
>>>> Python/Java/C++ project (where Python is a controller/glue layer) having
>>>> migrated from a previous solution to integrate across the languages, and
>>>> for the most part JCC has been a lifesaver.
>>>>
>>>> Technically we've only hit one major issue so far and that has been the
>>>> translation of exceptions through the layers.  Our Java application does a
>>>> lot of double-dispatch (visitation) type of execution and we're calling
>>>> outwards to Python to implement the concrete visitation logic.
>>>>
>>>> In the case of an exception being thrown it results in JCC collating a
>>>> traceback and captured exception at every level of scope, resulting in a
>>>> giant JavaError exception which has lost it's original PythonException
>>>> context.
>>>>
>>>> I was actually able to fix this by attempting to capture the full context
>>>> via PyErr_Fetch() when the first python object is thrown and storing it
>>>> within PythonException, and then ensuring that this is carried through each
>>>> layer appropriately - This seems to work very well, although admittedly I
>>>> haven't considered regressive errors or compatibility yet, but just wanted
>>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>>
>>>> The net result is being able to catch errors in a pipeline similar to the
>>>> following:
>>>>
>>>> Python
>>>>    -> Java
>>>>        -> Python
>>>>          -> Raise MyException
>>>>        <- Throw PythonException (containing MyException)
>>>>    <- Inject MyException
>>>> Catch MyException
>>>>
>>>> My main questions are to ask your thoughts about :-
>>>>
>>>> - What are your thoughts about through-layer exceptions and a feature
>>>> improvement like this?
>>>>
>>>>
>>>> I agree that losing information during exception throwing is not good. If
>>>> you've got an improvement in the form of a patch, do not hesitate in
>>>> sending it in.
>>>>
>>>> - JCC is (I think) currently at home with pylucene, are there are plans to
>>>> making it separate?
>>>>
>>>>
>>>> No such plans at the moment.
>>>>
>>>> - Are you happy if I create a public github project and add the patched
>>>> code in there for review?
>>>>
>>>>
>>>> You're welcome to fork the code if you'd like but you don't have to if all
>>>> you want is sending a patch. Your call.
>>>>
>>>> Thank you for the kind words !
>>>>
>>>> Andi..
>>>>
>>>>
>>>> Hope to hear from you!
>>>>
>>>> Cheers,
>>>> Lee
>>>>
>>>>
>>>>
>>>> TL;DR;
>>>>
>>>> We would like to :-
>>>>
>>>> - Improve JCC's through-layer exception support.
>>>> - Establish a github project for JCC alone.
>>>>
>>>> --
>>>> Lee Skillen
>>>>
>>>> Vulcan Financial Technologies
>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>>
>>>> Office:  +44 (0)28 95 817888
>>>> Mobile:  +44 (0)78 41 425152
>>>> Web:     www.vulcanft.com
>>>>
>>>>
>>>
>>>
>>> --
>>> Lee Skillen
>>>
>>> Vulcan Financial Technologies
>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>
>>> Office:  +44 (0)28 95 817888
>>> Mobile:  +44 (0)78 41 425152
>>> Web:     www.vulcanft.com
>
>
>
>
> -- 
> Lee Skillen
>
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>
> Office:  +44 (0)28 95 817888
> Web:     www.vulcanft.com
>

Re: JCC Project Extensions

Posted by Lee Skillen <ls...@vulcanft.com>.
Hey Andi,

On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote:
>
>
>  Hi Lee,
>
>
> On Tue, 8 Jul 2014, Lee Skillen wrote:
>
>> Andi, thanks for the reply - I've created a github mirror of the pylucene
>> project for our own use (which I intend to keep synced with your SVN
>> repository as its official upstream), located at:
>>
>> https://github.com/lskillen/pylucene
>>
>> As suggested I have formatted (and attached) a patch of the unfinished code
>> that we're using for the through-layer exceptions.  Alternatively the diff
>> can be inspected via github by diff'ing between the new
>> feature-thru-exception branch that I have created and the master branch, as
>> such:
>>
>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>>
>> Although we've run the test suite without issues I realise there may still
>> be functionality/style/logical issues with the code.  I also suspect that
>> there may not be specific test cases that target regression failures for
>> exceptions (yet), so confidence isn't high!  All comments are welcome and I
>> realise this will likely require further changes and a repeatable test case
>> before it is acceptable, but that's fine.
>
>
> I took a look at your patch and used the main idea to rewrite it so that:
>   - ref handling is correct (hopefully): you had it backwards when calling
>     Py_Restore(), it steals what you currently must own.
>   - the Python error state is now saved as one tuple of (type, value, tb) on
>     PythonException, not as three strings and three fake Objects (longs).
>   - ref handling is correct via a finalize() method DECREF'ing the saved
>     error state tuple on PythonException when that exception is collected.
>   - getMessage() still works as before but the traceback is extracted on
>     demand, not at throw time as was done before.
>
> The previous implementation was done so that no Python cross-VM reference had to be tracked, all the error data was string'ified at error time.
> Your change now requires that the error be kept 'alive' accross layers and refs must be tracked to avoid leaks.
>

Great feedback, thank you - I suspected that the reference handling
wasn't quite there (first foray into CPython internals), and I also
really wanted to utilise a tuple and get rid of the strings but wasn't
sure what the standard was for extending a class such as
PythonException.  I actually did a quick attempt at running everything
under debug mode to inspect for leaks, but suffice to say that my
usual toolbox for debugging and tracking memory leaks
(gdb/clang/valgrind) didn't work too well - Had some minor success
with the excellent objgraph Python package, but would need to spend
more time on it.

>
> I did _not_ test this as I don't have a system currently running that uses JCC in reverse like this. Since you're actively using the feature, please test the attached patch (against trunk) and report back with comments, bugs, fixes, etc...
>

Not a problem!  I applied your modified patch against trunk, rebuilt
and re-ran our application.  The good news is that everything is
working really well, and the only bug that I could see was within the
RegisterNatives() call within registerNatives() for PythonException
(size was still 2, changed it to calculate the size of the struct).
I'll re-attach the patch with the changes for you to review.

The only other comments I have are that:

(1) This was still an issue with my patch, but I don't know if there
is anyone out there relying upon the exact format of the string that
gets returned by getMessage(), as it is probably going to be different
now that it is calculated at a different point - In saying that I
imagine you would only be doing that if you were specifically trying
to handle an exception from the Python layer and were trying to
re-create it using the string, which wouldn't be an issue now.

(2) I also noticed that your patch doesn't have the #if
defined(PYTHON) part to protect certain parts that rely on things like
getPythonExceptionClass() - Not sure if that was intentional or not
(guessing it might be one of those things that are always defined
anyway), but just wanted to highlight it.

Apart from that, it looks great - In order to assist with replication
of the through-exception issue on your side I also whipped up a (very)
quick example and am also attaching this to the email.  Utilises a
noddy Makefile to get the job done, but running "make" or "make test"
will generate the java/jcc packages and then run the test.  Running
"make uninstall" will remove the jcc package, and "make clean" will
tidy up the build artifacts.  Standard stuff!

Hope that helps!

Thanks,
Lee

> Thanks !
>
> Andi..
>
>
>>
>> Thanks,
>> Lee
>>
>>
>>
>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>>
>>>
>>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>>
>>> Hi Andi,
>>>
>>> First of all, absolutely fantastic work on the JCC project, we love it -
>>> Second of all, apologies for contacting you out of the blue via email, but
>>> I wasn't sure of the best place to contact you to ask a few questions
>>> (technical-level, not support-level).  If it isn't acceptable to email
>>> please let me know and I can move the discussion somewhere else.
>>>
>>>
>>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>>> required. send mail pylucene-dev-subscribe@ and follow the instructions
>>> in the reply).
>>>
>>> A bit of background first of all :-
>>>
>>> We've been integrating JCC for the past two weeks on a sizeable
>>> Python/Java/C++ project (where Python is a controller/glue layer) having
>>> migrated from a previous solution to integrate across the languages, and
>>> for the most part JCC has been a lifesaver.
>>>
>>> Technically we've only hit one major issue so far and that has been the
>>> translation of exceptions through the layers.  Our Java application does a
>>> lot of double-dispatch (visitation) type of execution and we're calling
>>> outwards to Python to implement the concrete visitation logic.
>>>
>>> In the case of an exception being thrown it results in JCC collating a
>>> traceback and captured exception at every level of scope, resulting in a
>>> giant JavaError exception which has lost it's original PythonException
>>> context.
>>>
>>> I was actually able to fix this by attempting to capture the full context
>>> via PyErr_Fetch() when the first python object is thrown and storing it
>>> within PythonException, and then ensuring that this is carried through each
>>> layer appropriately - This seems to work very well, although admittedly I
>>> haven't considered regressive errors or compatibility yet, but just wanted
>>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>>
>>> The net result is being able to catch errors in a pipeline similar to the
>>> following:
>>>
>>> Python
>>>    -> Java
>>>        -> Python
>>>          -> Raise MyException
>>>        <- Throw PythonException (containing MyException)
>>>    <- Inject MyException
>>> Catch MyException
>>>
>>> My main questions are to ask your thoughts about :-
>>>
>>> - What are your thoughts about through-layer exceptions and a feature
>>> improvement like this?
>>>
>>>
>>> I agree that losing information during exception throwing is not good. If
>>> you've got an improvement in the form of a patch, do not hesitate in
>>> sending it in.
>>>
>>> - JCC is (I think) currently at home with pylucene, are there are plans to
>>> making it separate?
>>>
>>>
>>> No such plans at the moment.
>>>
>>> - Are you happy if I create a public github project and add the patched
>>> code in there for review?
>>>
>>>
>>> You're welcome to fork the code if you'd like but you don't have to if all
>>> you want is sending a patch. Your call.
>>>
>>> Thank you for the kind words !
>>>
>>> Andi..
>>>
>>>
>>> Hope to hear from you!
>>>
>>> Cheers,
>>> Lee
>>>
>>>
>>>
>>> TL;DR;
>>>
>>> We would like to :-
>>>
>>> - Improve JCC's through-layer exception support.
>>> - Establish a github project for JCC alone.
>>>
>>> --
>>> Lee Skillen
>>>
>>> Vulcan Financial Technologies
>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>>
>>> Office:  +44 (0)28 95 817888
>>> Mobile:  +44 (0)78 41 425152
>>> Web:     www.vulcanft.com
>>>
>>>
>>
>>
>> --
>> Lee Skillen
>>
>> Vulcan Financial Technologies
>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>
>> Office:  +44 (0)28 95 817888
>> Mobile:  +44 (0)78 41 425152
>> Web:     www.vulcanft.com




-- 
Lee Skillen

Vulcan Financial Technologies
1st Floor, 47 Malone Road, Belfast, BT9 6RY

Office:  +44 (0)28 95 817888
Web:     www.vulcanft.com

Re: JCC Project Extensions

Posted by Andi Vajda <va...@apache.org>.
  Hi Lee,

On Tue, 8 Jul 2014, Lee Skillen wrote:

> Andi, thanks for the reply - I've created a github mirror of the pylucene
> project for our own use (which I intend to keep synced with your SVN
> repository as its official upstream), located at:
>
> https://github.com/lskillen/pylucene
>
> As suggested I have formatted (and attached) a patch of the unfinished code
> that we're using for the through-layer exceptions.  Alternatively the diff
> can be inspected via github by diff'ing between the new
> feature-thru-exception branch that I have created and the master branch, as
> such:
>
> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1
>
> Although we've run the test suite without issues I realise there may still
> be functionality/style/logical issues with the code.  I also suspect that
> there may not be specific test cases that target regression failures for
> exceptions (yet), so confidence isn't high!  All comments are welcome and I
> realise this will likely require further changes and a repeatable test case
> before it is acceptable, but that's fine.

I took a look at your patch and used the main idea to rewrite it so that:
   - ref handling is correct (hopefully): you had it backwards when calling
     Py_Restore(), it steals what you currently must own.
   - the Python error state is now saved as one tuple of (type, value, tb) on
     PythonException, not as three strings and three fake Objects (longs).
   - ref handling is correct via a finalize() method DECREF'ing the saved
     error state tuple on PythonException when that exception is collected.
   - getMessage() still works as before but the traceback is extracted on
     demand, not at throw time as was done before.

The previous implementation was done so that no Python cross-VM reference 
had to be tracked, all the error data was string'ified at error time.
Your change now requires that the error be kept 'alive' accross layers and 
refs must be tracked to avoid leaks.

I did _not_ test this as I don't have a system currently running that uses 
JCC in reverse like this. Since you're actively using the feature, please 
test the attached patch (against trunk) and report back with comments, bugs, 
fixes, etc...

Thanks !

Andi..

>
> Thanks,
> Lee
>
>
>
> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:
>
>>
>> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>>
>> Hi Andi,
>>
>> First of all, absolutely fantastic work on the JCC project, we love it -
>> Second of all, apologies for contacting you out of the blue via email, but
>> I wasn't sure of the best place to contact you to ask a few questions
>> (technical-level, not support-level).  If it isn't acceptable to email
>> please let me know and I can move the discussion somewhere else.
>>
>>
>> Yes, please include pylucene-dev@lucene.apache.org (subscription
>> required. send mail pylucene-dev-subscribe@ and follow the instructions
>> in the reply).
>>
>> A bit of background first of all :-
>>
>> We've been integrating JCC for the past two weeks on a sizeable
>> Python/Java/C++ project (where Python is a controller/glue layer) having
>> migrated from a previous solution to integrate across the languages, and
>> for the most part JCC has been a lifesaver.
>>
>> Technically we've only hit one major issue so far and that has been the
>> translation of exceptions through the layers.  Our Java application does a
>> lot of double-dispatch (visitation) type of execution and we're calling
>> outwards to Python to implement the concrete visitation logic.
>>
>> In the case of an exception being thrown it results in JCC collating a
>> traceback and captured exception at every level of scope, resulting in a
>> giant JavaError exception which has lost it's original PythonException
>> context.
>>
>> I was actually able to fix this by attempting to capture the full context
>> via PyErr_Fetch() when the first python object is thrown and storing it
>> within PythonException, and then ensuring that this is carried through each
>> layer appropriately - This seems to work very well, although admittedly I
>> haven't considered regressive errors or compatibility yet, but just wanted
>> to proof-of-concept it first of all then speak with a JCC maintainer.
>>
>> The net result is being able to catch errors in a pipeline similar to the
>> following:
>>
>> Python
>>    -> Java
>>        -> Python
>>          -> Raise MyException
>>        <- Throw PythonException (containing MyException)
>>    <- Inject MyException
>> Catch MyException
>>
>> My main questions are to ask your thoughts about :-
>>
>> - What are your thoughts about through-layer exceptions and a feature
>> improvement like this?
>>
>>
>> I agree that losing information during exception throwing is not good. If
>> you've got an improvement in the form of a patch, do not hesitate in
>> sending it in.
>>
>> - JCC is (I think) currently at home with pylucene, are there are plans to
>> making it separate?
>>
>>
>> No such plans at the moment.
>>
>> - Are you happy if I create a public github project and add the patched
>> code in there for review?
>>
>>
>> You're welcome to fork the code if you'd like but you don't have to if all
>> you want is sending a patch. Your call.
>>
>> Thank you for the kind words !
>>
>> Andi..
>>
>>
>> Hope to hear from you!
>>
>> Cheers,
>> Lee
>>
>>
>>
>> TL;DR;
>>
>> We would like to :-
>>
>> - Improve JCC's through-layer exception support.
>> - Establish a github project for JCC alone.
>>
>> --
>> Lee Skillen
>>
>> Vulcan Financial Technologies
>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>>
>> Office:  +44 (0)28 95 817888
>> Mobile:  +44 (0)78 41 425152
>> Web:     www.vulcanft.com
>>
>>
>
>
> -- 
> Lee Skillen
>
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>
> Office:  +44 (0)28 95 817888
> Mobile:  +44 (0)78 41 425152
> Web:     www.vulcanft.com
>

Re: JCC Project Extensions

Posted by Lee Skillen <ls...@vulcanft.com>.
Hey,

Andi, thanks for the reply - I've created a github mirror of the pylucene
project for our own use (which I intend to keep synced with your SVN
repository as its official upstream), located at:

https://github.com/lskillen/pylucene

As suggested I have formatted (and attached) a patch of the unfinished code
that we're using for the through-layer exceptions.  Alternatively the diff
can be inspected via github by diff'ing between the new
feature-thru-exception branch that I have created and the master branch, as
such:

https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1

Although we've run the test suite without issues I realise there may still
be functionality/style/logical issues with the code.  I also suspect that
there may not be specific test cases that target regression failures for
exceptions (yet), so confidence isn't high!  All comments are welcome and I
realise this will likely require further changes and a repeatable test case
before it is acceptable, but that's fine.

Thanks,
Lee



On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote:

>
> On Jul 4, 2014, at 18:33, Lee Skillen <ls...@vulcanft.com> wrote:
>
> Hi Andi,
>
> First of all, absolutely fantastic work on the JCC project, we love it -
> Second of all, apologies for contacting you out of the blue via email, but
> I wasn't sure of the best place to contact you to ask a few questions
> (technical-level, not support-level).  If it isn't acceptable to email
> please let me know and I can move the discussion somewhere else.
>
>
> Yes, please include pylucene-dev@lucene.apache.org (subscription
> required. send mail pylucene-dev-subscribe@ and follow the instructions
> in the reply).
>
> A bit of background first of all :-
>
> We've been integrating JCC for the past two weeks on a sizeable
> Python/Java/C++ project (where Python is a controller/glue layer) having
> migrated from a previous solution to integrate across the languages, and
> for the most part JCC has been a lifesaver.
>
> Technically we've only hit one major issue so far and that has been the
> translation of exceptions through the layers.  Our Java application does a
> lot of double-dispatch (visitation) type of execution and we're calling
> outwards to Python to implement the concrete visitation logic.
>
> In the case of an exception being thrown it results in JCC collating a
> traceback and captured exception at every level of scope, resulting in a
> giant JavaError exception which has lost it's original PythonException
> context.
>
> I was actually able to fix this by attempting to capture the full context
> via PyErr_Fetch() when the first python object is thrown and storing it
> within PythonException, and then ensuring that this is carried through each
> layer appropriately - This seems to work very well, although admittedly I
> haven't considered regressive errors or compatibility yet, but just wanted
> to proof-of-concept it first of all then speak with a JCC maintainer.
>
> The net result is being able to catch errors in a pipeline similar to the
> following:
>
> Python
>    -> Java
>        -> Python
>          -> Raise MyException
>        <- Throw PythonException (containing MyException)
>    <- Inject MyException
> Catch MyException
>
> My main questions are to ask your thoughts about :-
>
> - What are your thoughts about through-layer exceptions and a feature
> improvement like this?
>
>
> I agree that losing information during exception throwing is not good. If
> you've got an improvement in the form of a patch, do not hesitate in
> sending it in.
>
> - JCC is (I think) currently at home with pylucene, are there are plans to
> making it separate?
>
>
> No such plans at the moment.
>
> - Are you happy if I create a public github project and add the patched
> code in there for review?
>
>
> You're welcome to fork the code if you'd like but you don't have to if all
> you want is sending a patch. Your call.
>
> Thank you for the kind words !
>
> Andi..
>
>
> Hope to hear from you!
>
> Cheers,
> Lee
>
>
>
> TL;DR;
>
> We would like to :-
>
> - Improve JCC's through-layer exception support.
> - Establish a github project for JCC alone.
>
> --
> Lee Skillen
>
> Vulcan Financial Technologies
> 1st Floor, 47 Malone Road, Belfast, BT9 6RY
>
> Office:  +44 (0)28 95 817888
> Mobile:  +44 (0)78 41 425152
> Web:     www.vulcanft.com
>
>


-- 
Lee Skillen

Vulcan Financial Technologies
1st Floor, 47 Malone Road, Belfast, BT9 6RY

Office:  +44 (0)28 95 817888
Mobile:  +44 (0)78 41 425152
Web:     www.vulcanft.com