Posted to pylucene-dev@lucene.apache.org by Andi Vajda <va...@apache.org> on 2017/03/19 20:42:01 UTC

python 3 support is checked into trunk

I just now checked in support for Python 3 (3.5+), built and tested on
Mac OS X 10.12 only, with Python 3.6. Linux support should be next. I have no
access to Windows anymore and thus can't test support there.

I manually integrated/merged/changed/fixed the patches proposed by Rüdiger
Meier and Thomas Koch earlier as well as the much older experimental branch
I checked in many years ago.

I refactored the JCC egg's structure, splitting it into two parts, one for
Python 2 support (jcc2, _jcc2, helpers2) and one for Python 3 (jcc3, _jcc3,
helpers3). That way, less conditional code is required and Python 3 support can
proceed unshackled.
There still is only one egg for JCC, called jcc.
MANIFEST.in was updated accordingly (but is still untested).

Similarly, the PyLucene tests got split into test2 and test3.

This is still work in progress but it seems to work well on Mac OS X, so far.
All tests pass.

I intend to work next on support for Linux.
Someone with access to Windows, please help test/fix/finish support for
Python 3 on Windows, both with the MSVC and Mingw compilers.
I have no access to Windows anymore. 
I expect it to mostly work, but with caveats when building libjcc3.dll with
the version of setuptools included in Python 3.6.

Thank you all for your contributions and patience so far !

Andi..

Re: python 3 support is checked into trunk

Posted by Andi Vajda <va...@apache.org>.
On Mon, 20 Mar 2017, Ruediger Meier wrote:

> On Sunday 19 March 2017, Andi Vajda wrote:
>> I just now checked in support for Python 3 (3.5+),
>
> Thanks a lot!
>
>> built and tested
>> on Mac OS X 10.12 only, with Python 3.6.
>
> FYI I've tested a HelloWorld.jar using your svn trunk on travis build
> farm for OSX 10.11 and xcode 7.3, 8, 8.1, 8.2:
> https://travis-ci.org/rudimeier/jcc/builds/212820385
>
>> Linux support should be next.
>
> It would work already on Linux with this patch
> https://github.com/rudimeier/jcc/commit/2ccf7e4b828c678577fc0ace24bdb4680ede207a

I changed linux2 to linux but still need to see what kind of setuptools 
patching/monkeypatching hackery is needed to produce a usable libjcc.so.

Thanks !

Andi..

>
> plus fixing -ljcc and -lpython soname.
>
> BTW support for python >=3.3 would be nice and very easy. We just need
> one define like this in macros.h:
>
> #if PY_VERSION_HEX < 0x03050000
> #define Py_DecodeLocale(_arg_, _size_) _Py_char2wchar(_arg_, _size_)
> #endif
>
> ... as you see here a huge Ubuntu 14.04 build matrix
> https://travis-ci.org/rudimeier/jcc/builds/212830434
>
>> Someone with access to Windows, please help test/fix/finish support
>> for Python 3 on Windows, both with the MSVC and Mingw compilers. I
>> have no access to Windows anymore.
>
> I know already about one MSVC issue:
> https://github.com/rudimeier/jcc/issues/1
>
> probably fixed by
> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
> (But this fix is also not tested yet.)
>
> cu,
> Rudi
>

Re: python 3 support is checked into trunk

Posted by Andi Vajda <va...@apache.org>.
On Mon, 20 Mar 2017, Andi Vajda wrote:

>
>> On Mar 20, 2017, at 10:12, Ruediger Meier <sw...@gmx.de> wrote:
>>
>> On Monday 20 March 2017, Andi Vajda wrote:
>>>>> On Mar 20, 2017, at 05:16, Ruediger Meier <sw...@gmx.de> wrote:
>>>>> On Monday 20 March 2017, Andi Vajda wrote:
>>>>>
>>>>> On Mon, 20 Mar 2017, Ruediger Meier wrote:
>>>>>>> Someone with access to Windows, please help test/fix/finish
>>>>>>> support for Python 3 on Windows, both with the MSVC and Mingw
>>>>>>> compilers. I have no access to Windows anymore.
>>>>>>
>>>>>> I know already about one MSVC issue:
>>>>>> https://github.com/rudimeier/jcc/issues/1
>>>>>>
>>>>>> probably fixed by
>>>>>> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
>>>>>> (But this fix is also not tested yet.)
>>>>>
>>>>> I changed strhash to use Py_hash_t.
>>>>
>>>> This is now wrong and I could reproduce a segfault on OSX 10.11,
>>>> xcode 8. The buffersize "hexdig + 1" has to match the type we are
>>>> printing. We can't calculate the size from Py_hash_t but print
>>>> ulong.
>>>
>>> Ah yes, good point. Sorry for the mess up.
>>>
>>>> Most safely and without precision loss we could do it like the
>>>> patch below.
>>>>
>>>> Notes:
>>>> 1. "static const" was required to actually fix MSVC's VLA issue.
>>>
>>> Yes, but that's not reentrant, thus we need to switch back to a
>>> constant size for the array, like [20] we had before, or [40] now.
>>
>> Not reentrant? static const should be as good as a #define IMO.
>
> Ah you mean static const for the size, not for the array. That would work.
>
>> If in doubt,
>> you could avoid the variable and use "sizeof(hash) * 2" in both places
>> where we need it.
>>
>>>> 2. The macro PRIxMAX is the same as "%jx". I've chosen the macro
>>>> because it should be compatible with Visual Studio >=2013 while "%jx"
>>>> would need Visual Studio >=2015. Moreover, with an incompatible
>>>> compiler the macro would give an error at compile time, whereas
>>>> "%jx" would just crash at runtime.
>>>
>>> What's wrong with %lx ?
>>
>> %lx is for long but Py_hash_t can be longer.
>
> Can it ? Py_hash_t is defined to be the same as Py_ssize_t. What's its size ?
>
>> %jx/PRIxMAX is for uintmax_t,
>> the widest unsigned integer type.
>
> Is that bigger than unsigned long ?
>
> Ok, I think the time has come to see if this function can be removed 
> altogether... let me see...

Ah, no, this is used quite a bit. It's got to stay and work.

Andi..

>
> Andi..
>
>>
>> cu,
>> Rudi
>
>

Re: python 3 support is checked into trunk

Posted by Andi Vajda <va...@apache.org>.
> On Mar 20, 2017, at 10:12, Ruediger Meier <sw...@gmx.de> wrote:
> 
> On Monday 20 March 2017, Andi Vajda wrote:
>>>> On Mar 20, 2017, at 05:16, Ruediger Meier <sw...@gmx.de> wrote:
>>>> On Monday 20 March 2017, Andi Vajda wrote:
>>>> 
>>>> On Mon, 20 Mar 2017, Ruediger Meier wrote:
>>>>>> Someone with access to Windows, please help test/fix/finish
>>>>>> support for Python 3 on Windows, both with the MSVC and Mingw
>>>>>> compilers. I have no access to Windows anymore.
>>>>> 
>>>>> I know already about one MSVC issue:
>>>>> https://github.com/rudimeier/jcc/issues/1
>>>>> 
>>>>> probably fixed by
>>>>> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
>>>>> (But this fix is also not tested yet.)
>>>> 
>>>> I changed strhash to use Py_hash_t.
>>> 
>>> This is now wrong and I could reproduce a segfault on OSX 10.11,
>>> xcode 8. The buffersize "hexdig + 1" has to match the type we are
>>> printing. We can't calculate the size from Py_hash_t but print
>>> ulong.
>> 
>> Ah yes, good point. Sorry for the mess up.
>> 
>>> Most safely and without precision loss we could do it like the
>>> patch below.
>>> 
>>> Notes:
>>> 1. "static const" was required to actually fix MSVC's VLA issue.
>> 
>> Yes, but that's not reentrant, thus we need to switch back to a
>> constant size for the array, like [20] we had before, or [40] now.
> 
> Not reentrant? static const should be as good as a #define IMO.

Ah you mean static const for the size, not for the array. That would work.

> If in doubt,
> you could avoid the variable and use "sizeof(hash) * 2" in both places
> where we need it.
> 
>>> 2. The macro PRIxMAX is the same as "%jx". I've chosen the macro
>>> because it should be compatible with Visual Studio >=2013 while "%jx"
>>> would need Visual Studio >=2015. Moreover, with an incompatible
>>> compiler the macro would give an error at compile time, whereas
>>> "%jx" would just crash at runtime.
>>
>> What's wrong with %lx ?
> 
> %lx is for long but Py_hash_t can be longer.

Can it ? Py_hash_t is defined to be the same as Py_ssize_t. What's its size ?

> %jx/PRIxMAX is for uintmax_t,
> the widest unsigned integer type.

Is that bigger than unsigned long ?

Ok, I think the time has come to see if this function can be removed altogether... let me see...

Andi..

> 
> cu,
> Rudi


Re: python 3 support is checked into trunk

Posted by Ruediger Meier <sw...@gmx.de>.
On Monday 20 March 2017, Andi Vajda wrote:
> > On Mar 20, 2017, at 05:16, Ruediger Meier <sw...@gmx.de> wrote:
> >> On Monday 20 March 2017, Andi Vajda wrote:
> >>
> >> On Mon, 20 Mar 2017, Ruediger Meier wrote:
> >>>> Someone with access to Windows, please help test/fix/finish
> >>>> support for Python 3 on Windows, both with the MSVC and Mingw
> >>>> compilers. I have no access to Windows anymore.
> >>>
> >>> I know already about one MSVC issue:
> >>> https://github.com/rudimeier/jcc/issues/1
> >>>
> >>> probably fixed by
> >>> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
> >>> (But this fix is also not tested yet.)
> >>
> >> I changed strhash to use Py_hash_t.
> >
> > This is now wrong and I could reproduce a segfault on OSX 10.11,
> > xcode 8. The buffersize "hexdig + 1" has to match the type we are
> > printing. We can't calculate the size from Py_hash_t but print
> > ulong.
>
> Ah yes, good point. Sorry for the mess up.
>
> > Most safely and without precision loss we could do it like the
> > patch below.
> >
> > Notes:
> > 1. "static const" was required to actually fix MSVC's VLA issue.
>
> Yes, but that's not reentrant, thus we need to switch back to a
> constant size for the array, like [20] we had before, or [40] now.

Not reentrant? static const should be as good as a #define IMO. If in doubt,
you could avoid the variable and use "sizeof(hash) * 2" in both places
where we need it.
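
The two alternatives discussed here can be compared side by side. This is only
a minimal sketch, not the JCC code: hex_buffer_size() is a hypothetical helper,
and it assumes a C++11 compiler (for static_assert). In C++ a static const
integer initialized from a constant expression is itself a constant expression,
so neither array below is a variable-length array, and since the constant is
never written to there is no shared mutable state that would hurt reentrancy.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of JCC): returns the size of a buffer
// large enough for a uintmax_t printed as hex plus the terminating NUL.
size_t hex_buffer_size()
{
    uintmax_t hash = 0;

    // Option 1: a named constant. It is read-only and known at compile
    // time, so the array bound is an ordinary constant expression.
    static const size_t hexdig = sizeof(hash) * 2;
    char with_constant[hexdig + 1];

    // Option 2: no variable at all, the sizeof expression is used directly.
    char with_sizeof[sizeof(hash) * 2 + 1];

    // Both arrays have a fixed size that the compiler can check.
    static_assert(sizeof(with_constant) == sizeof(with_sizeof),
                  "both spellings must give the same buffer size");

    (void) hash;
    (void) with_constant;
    return sizeof(with_sizeof);
}
```

Note that this holds for C++ only; in C the static const variant would still
declare a variable-length array.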

> > 2. The macro PRIxMAX is the same as "%jx". I've chosen the macro
> > because it should be compatible with Visual Studio >=2013 while "%jx"
> > would need Visual Studio >=2015. Moreover, with an incompatible
> > compiler the macro would give an error at compile time, whereas
> > "%jx" would just crash at runtime.
>
> What's wrong with %lx ?

%lx is for long but Py_hash_t can be longer. %jx/PRIxMAX is for uintmax_t,
the widest unsigned integer type.
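
To make the width question concrete: on LP64 platforms (64-bit Linux and
macOS) long is 64 bits wide, the same as a 64-bit Py_ssize_t, but on 64-bit
Windows (LLP64) long stays at 32 bits while pointer-sized types are 64 bits.
The sketch below only illustrates that comparison. hash_type is a hypothetical
stand-in for Py_hash_t, chosen here as ptrdiff_t on the assumption that it has
the same width as Py_ssize_t, and the function names are made up for this
example.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical stand-in for Py_hash_t (assumed to be as wide as ptrdiff_t).
typedef ptrdiff_t hash_type;

// True when a cast to unsigned long, as used with "%lx", keeps every bit.
bool long_holds_hash()
{
    return sizeof(unsigned long) >= sizeof(hash_type);
}

// True when a cast to uintmax_t, as used with PRIxMAX, keeps every bit.
bool uintmax_holds_hash()
{
    return sizeof(uintmax_t) >= sizeof(hash_type);
}

// Print the widths so the result can be checked on a given platform.
void print_widths()
{
    printf("unsigned long: %zu bytes\n", sizeof(unsigned long));
    printf("hash_type:     %zu bytes\n", sizeof(hash_type));
    printf("uintmax_t:     %zu bytes\n", sizeof(uintmax_t));
}
```

long_holds_hash() is expected to be true on LP64 systems and false on 64-bit
Windows, which is the case where "%lx" would truncate the hash.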

cu,
Rudi

Re: python 3 support is checked into trunk

Posted by Andi Vajda <va...@apache.org>.
> On Mar 20, 2017, at 05:16, Ruediger Meier <sw...@gmx.de> wrote:
> 
>> On Monday 20 March 2017, Andi Vajda wrote:
>> On Mon, 20 Mar 2017, Ruediger Meier wrote:
> 
>>>> Someone with access to Windows, please help test/fix/finish
>>>> support for Python 3 on Windows, both with the MSVC and Mingw
>>>> compilers. I have no access to Windows anymore.
>>> 
>>> I know already about one MSVC issue:
>>> https://github.com/rudimeier/jcc/issues/1
>>> 
>>> probably fixed by
>>> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
>>> (But this fix is also not tested yet.)
>> 
>> I changed strhash to use Py_hash_t.
> 
> This is now wrong and I could reproduce a segfault on OSX 10.11, xcode 
> 8. The buffersize "hexdig + 1" has to match the type we are printing. 
> We can't calculate the size from Py_hash_t but print ulong.

Ah yes, good point. Sorry for the mess up.

> 
> Most safely and without precision loss we could do it like the patch 
> below.
> 
> Notes:
> 1. "static const" was required to actually fix MSVC's VLA issue.

Yes, but that's not reentrant, thus we need to switch back to a constant size for the array, like [20] we had before, or [40] now.

> 2. The macro PRIxMAX is the same as "%jx". I've chosen the macro
> because it should be compatible with Visual Studio >=2013 while "%jx"
> would need Visual Studio >=2015. Moreover, with an incompatible
> compiler the macro would give an error at compile time, whereas
> "%jx" would just crash at runtime.

What's wrong with %lx ?

Andi..

> 
> 
> --------
> diff --git a/jcc3/sources/jcc.cpp b/jcc3/sources/jcc.cpp
> index 8c12f00..90baa8b 100644
> --- a/jcc3/sources/jcc.cpp
> +++ b/jcc3/sources/jcc.cpp
> @@ -15,6 +15,7 @@
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> +#include <inttypes.h>
> #include <jni.h>
> 
> #ifdef linux
> @@ -194,11 +195,11 @@ static PyObject *t_jccenv_isShared(PyObject *self)
> 
> static PyObject *t_jccenv_strhash(PyObject *self, PyObject *arg)
> {
> -    Py_hash_t hash = PyObject_Hash(arg);
> -    size_t hexdig = sizeof(Py_hash_t) * 2;
> +    uintmax_t hash = (uintmax_t) PyObject_Hash(arg);
> +    static const size_t hexdig = sizeof(hash) * 2;
>     char buffer[hexdig + 1];
> 
> -    sprintf(buffer, "%0*lx", (int) hexdig, (unsigned long) hash);
> +    sprintf(buffer, "%0*"PRIxMAX, (int) hexdig, hash);
>     return PyUnicode_FromStringAndSize(buffer, hexdig);
> }
> --------------
> 
> cu,
> Rudi


Re: python 3 support is checked into trunk

Posted by Ruediger Meier <sw...@gmx.de>.
On Monday 20 March 2017, Andi Vajda wrote:
> On Mon, 20 Mar 2017, Ruediger Meier wrote:

> >> Someone with access to Windows, please help test/fix/finish
> >> support for Python 3 on Windows, both with the MSVC and Mingw
> >> compilers. I have no access to Windows anymore.
> >
> > I know already about one MSVC issue:
> > https://github.com/rudimeier/jcc/issues/1
> >
> > probably fixed by
> > https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
> > (But this fix is also not tested yet.)
>
> I changed strhash to use Py_hash_t.

This is now wrong and I could reproduce a segfault on OSX 10.11, xcode 
8. The buffersize "hexdig + 1" has to match the type we are printing. 
We can't calculate the size from Py_hash_t but print ulong.

Most safely and without precision loss we could do it like the patch 
below.

Notes:
1. "static const" was required to actually fix MSVC's VLA issue.
2. The macro PRIxMAX is the same as "%jx". I've chosen the macro
because it should be compatible with Visual Studio >=2013 while "%jx"
would need Visual Studio >=2015. Moreover, with an incompatible
compiler the macro would give an error at compile time, whereas
"%jx" would just crash at runtime.


--------
diff --git a/jcc3/sources/jcc.cpp b/jcc3/sources/jcc.cpp
index 8c12f00..90baa8b 100644
--- a/jcc3/sources/jcc.cpp
+++ b/jcc3/sources/jcc.cpp
@@ -15,6 +15,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <inttypes.h>
 #include <jni.h>

 #ifdef linux
@@ -194,11 +195,11 @@ static PyObject *t_jccenv_isShared(PyObject *self)

 static PyObject *t_jccenv_strhash(PyObject *self, PyObject *arg)
 {
-    Py_hash_t hash = PyObject_Hash(arg);
-    size_t hexdig = sizeof(Py_hash_t) * 2;
+    uintmax_t hash = (uintmax_t) PyObject_Hash(arg);
+    static const size_t hexdig = sizeof(hash) * 2;
     char buffer[hexdig + 1];

-    sprintf(buffer, "%0*lx", (int) hexdig, (unsigned long) hash);
+    sprintf(buffer, "%0*"PRIxMAX, (int) hexdig, hash);
     return PyUnicode_FromStringAndSize(buffer, hexdig);
 }
--------------
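
For readers without a Python build tree at hand, the idea of the patch can be
tried in isolation. The following is a minimal sketch and not the code that
went into JCC: hash_type is a hypothetical stand-in for Py_hash_t (assumed
here to be as wide as ptrdiff_t), hash_to_hex() is a made-up name, and
snprintf replaces sprintf as an extra safety net. The point it illustrates is
that the buffer size and the printed type are both derived from the same
variable, so they cannot drift apart.

```cpp
#include <assert.h>
#include <inttypes.h>   // PRIxMAX, uintmax_t
#include <stddef.h>
#include <stdio.h>
#include <string>

// Hypothetical stand-in for Py_hash_t (assumed to be as wide as ptrdiff_t).
typedef ptrdiff_t hash_type;

// Format a hash as zero-padded hex, sizing the buffer from the type
// that is actually passed to the formatting call.
std::string hash_to_hex(hash_type value)
{
    const uintmax_t hash = (uintmax_t) value;
    static const size_t hexdig = sizeof(hash) * 2;  // two hex digits per byte
    char buffer[hexdig + 1];                        // +1 for the NUL

    // The space before PRIxMAX keeps C++11 compilers from reading the
    // macro name as a user-defined literal suffix.
    const int written = snprintf(buffer, sizeof(buffer),
                                 "%0*" PRIxMAX, (int) hexdig, hash);

    assert(written == (int) hexdig);
    (void) written;
    return std::string(buffer, hexdig);
}
```

For example, hash_to_hex(0) yields a string of sizeof(uintmax_t) * 2 zeros,
and hash_to_hex(-1) yields the same number of 'f' characters.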

cu,
Rudi

Re: python 3 support is checked into trunk

Posted by Andi Vajda <va...@apache.org>.
On Mon, 20 Mar 2017, Ruediger Meier wrote:

> On Sunday 19 March 2017, Andi Vajda wrote:
>> I just now checked in support for Python 3 (3.5+),
>
> Thanks a lot!
>
>> built and tested
>> on Mac OS X 10.12 only, with Python 3.6.
>
> FYI I've tested a HelloWorld.jar using your svn trunk on travis build
> farm for OSX 10.11 and xcode 7.3, 8, 8.1, 8.2:
> https://travis-ci.org/rudimeier/jcc/builds/212820385
>
>> Linux support should be next.
>
> It would work already on Linux with this patch
> https://github.com/rudimeier/jcc/commit/2ccf7e4b828c678577fc0ace24bdb4680ede207a
>
> plus fixing -ljcc and -lpython soname.
>
> BTW support for python >=3.3 would be nice and very easy. We just need
> one define like this in macros.h:
>
> #if PY_VERSION_HEX < 0x03050000
> #define Py_DecodeLocale(_arg_, _size_) _Py_char2wchar(_arg_, _size_)
> #endif

Added to jcc.cpp since that's the only place this is used.

> ... as you see here a huge Ubuntu 14.04 build matrix
> https://travis-ci.org/rudimeier/jcc/builds/212830434
>
>> Someone with access to Windows, please help test/fix/finish support
>> for Python 3 on Windows, both with the MSVC and Mingw compilers. I
>> have no access to Windows anymore.
>
> I know already about one MSVC issue:
> https://github.com/rudimeier/jcc/issues/1
>
> probably fixed by
> https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
> (But this fix is also not tested yet.)

I changed strhash to use Py_hash_t.

Andi..

>
> cu,
> Rudi
>

Re: python 3 support is checked into trunk

Posted by Ruediger Meier <sw...@gmx.de>.
On Sunday 19 March 2017, Andi Vajda wrote:
> I just now checked in support for Python 3 (3.5+), 

Thanks a lot!

> built and tested 
> on Mac OS X 10.12 only, with Python 3.6.

FYI I've tested a HelloWorld.jar using your svn trunk on travis build 
farm for OSX 10.11 and xcode 7.3, 8, 8.1, 8.2:
https://travis-ci.org/rudimeier/jcc/builds/212820385

> Linux support should be next.

It would work already on Linux with this patch
https://github.com/rudimeier/jcc/commit/2ccf7e4b828c678577fc0ace24bdb4680ede207a

plus fixing -ljcc and -lpython soname.

BTW support for python >=3.3 would be nice and very easy. We just need 
one define like this in macros.h:

#if PY_VERSION_HEX < 0x03050000
#define Py_DecodeLocale(_arg_, _size_) _Py_char2wchar(_arg_, _size_)
#endif

... as you see here a huge Ubuntu 14.04 build matrix
https://travis-ci.org/rudimeier/jcc/builds/212830434
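
As background for the 0x03050000 comparison above: PY_VERSION_HEX packs the
version into 32 bits, one byte each for major, minor and micro, then a nibble
for the release level (0xA alpha, 0xB beta, 0xC candidate, 0xF final) and a
nibble for the serial. The sketch below mirrors that layout without needing
Python.h. version_hex() and needs_decode_locale_shim() are hypothetical
helpers written for this illustration, and a C++11 compiler is assumed.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper mirroring the layout of PY_VERSION_HEX.
constexpr uint32_t version_hex(unsigned major, unsigned minor,
                               unsigned micro = 0, unsigned level = 0,
                               unsigned serial = 0)
{
    return (major << 24) | (minor << 16) | (micro << 8) | (level << 4) | serial;
}

// True for versions older than 3.5, i.e. those for which the proposed
// #define would map Py_DecodeLocale onto _Py_char2wchar.
constexpr bool needs_decode_locale_shim(uint32_t hexversion)
{
    return hexversion < 0x03050000;
}

// 3.4.3 final is older than 3.5, so it would take the fallback.
static_assert(needs_decode_locale_shim(version_hex(3, 4, 3, 0xF, 0)),
              "3.4.3 predates Py_DecodeLocale");
```

Because the major and minor bytes are the most significant ones, every 3.5
pre-release already compares as not less than 0x03050000.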

> Someone with access to Windows, please help test/fix/finish support
> for Python 3 on Windows, both with the MSVC and Mingw compilers. I
> have no access to Windows anymore.

I know already about one MSVC issue:
https://github.com/rudimeier/jcc/issues/1

probably fixed by 
https://github.com/rudimeier/jcc/commit/764ed0dc1f77c68e4a6998688d2e8340704fd237
(But this fix is also not tested yet.)

cu,
Rudi