Posted to commits@lucene.apache.org by gs...@apache.org on 2011/08/11 14:37:55 UTC

svn commit: r1156598 - in /lucene/cms/trunk/content/lucene/pylucene/jcc: ./ install.mdtext readme.mdtext

Author: gsingers
Date: Thu Aug 11 12:37:55 2011
New Revision: 1156598

URL: http://svn.apache.org/viewvc?rev=1156598&view=rev
Log:
pylucene

Added:
    lucene/cms/trunk/content/lucene/pylucene/jcc/
    lucene/cms/trunk/content/lucene/pylucene/jcc/install.mdtext
    lucene/cms/trunk/content/lucene/pylucene/jcc/readme.mdtext

Added: lucene/cms/trunk/content/lucene/pylucene/jcc/install.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/trunk/content/lucene/pylucene/jcc/install.mdtext?rev=1156598&view=auto
==============================================================================
--- lucene/cms/trunk/content/lucene/pylucene/jcc/install.mdtext (added)
+++ lucene/cms/trunk/content/lucene/pylucene/jcc/install.mdtext Thu Aug 11 12:37:55 2011
@@ -0,0 +1,216 @@
+
+##Getting JCC's Source Code
+
+JCC's source code is included with PyLucene's. If you've already
+downloaded the PyLucene source code, JCC's source is in
+the <code>jcc</code> subdirectory.
+
+
+To get only the JCC source code from SVN, use:<br/>
+<code>$ svn co
+http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc jcc</code>
+
+##Building JCC
+
+JCC is a Python extension written in Python and C++. It requires a
+Java Runtime Environment to operate as it uses Java's reflection
+APIs to do its work. It is built and installed
+via <code>distutils</code>
+or <a href="http://pypi.python.org/pypi/setuptools">setuptools</a>.
+
+
+- Edit <code>setup.py</code> and verify that the values
+of <code>INCLUDES</code>, <code>CFLAGS</code>,
+<code>DEBUG_CFLAGS</code>, <code>LFLAGS</code>
+and <code>JAVAC</code> are correct for your system. These values
+are also compiled into JCC's <code>config.py</code>
+file and are used by JCC when
+invoking <code>distutils</code> or <code>setuptools</code> to
+compile the extensions it generates code for.
+- At the command line, enter:
+<source>
+$ python setup.py build
+$ sudo python setup.py install
+</source>
+
+
+##Requirements
+
+JCC requires a Java Development Kit to be present. It uses the Java
+Native Interface (JNI) and its Invocation API, and
+expects <code>&lt;jni.h&gt;</code> and the Java libraries to be
+present at build time and at runtime.
+
+
+JCC requires a C++ compiler. A recent C++ compiler for your
+platform should work.
+
+
+##Shared Mode: Support for the <code>--shared</code> Flag
+
+JCC includes a small runtime that keeps track of the Java VM and of
+Java objects escaping it. Because there can be only one Java VM
+embedded in a given process at a time, the JCC runtime must be
+compiled as a shared library when more than one JCC-built Python
+extension is going to be imported into a given Python process.
+
+
+Shared mode depends on <code>setuptools</code>' capability of
+building plain shared libraries (as opposed to shared libraries for
+Python extensions).
+
+
+Currently, shared mode is supported with <code>setuptools
+0.6c7</code> and above out of the box on Mac OS X and Windows. On
+Linux, a patch to <code>setuptools</code> needs to be applied
+first. This patch is included in the JCC source distribution in
+the <code>jcc/patches</code> directory, <code>patch.43</code>. This
+patch was submitted to the <code>setuptools</code> project
+via <a href="http://bugs.python.org/setuptools/issue43">issue
+43</a>.
+
+
+The <code>shared mode disabled</code> error reported during the
+build of JCC on Linux contains the exact instructions on how to
+patch the <code>setuptools</code> installation
+with <code>patch.43</code> on your system.
+
+
+Shared mode is also required when embedding Python in a Java VM as
+JCC's runtime shared library is used by the JVM to load JCC and
+bootstrap the Python VM via the JNI.
+
+
+When shared mode is not enabled or not supported, or
+when <code>distutils</code> is used instead
+of <code>setuptools</code>, static mode is used. The JCC
+runtime code is then statically linked with each JCC-built Python
+extension and only one such extension can be used in a given Python
+process at a time.
+
+
+As <code>setuptools</code> improves its shared library building
+capability, more operating systems are expected to be supported in
+shared mode in the future.
+
+
+Shared mode can be forced off by building JCC with
+the <code>NO_SHARED</code> environment variable set.
+
+
+There are two defaults to consider here:
+
+
+- Is JCC built with shared mode support or not?
+
+    - By default, on Mac OS X and Windows, this is the case.
+
+    - By default, on Linux, this is the case
+      if <code>setuptools</code> is patched.
+
+    - On other operating systems, shared mode support is off by
+      default (not supported) because shared mode depends on
+      <code>setuptools</code>' capability of building a regular
+      shared library, which is still an experimental feature.
+
+- Is a JCC-built Python extension built with shared mode?<br/>
+    By default, no; shared mode is enabled only with
+    the <code>--shared</code> command line argument.
+
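The two defaults described above can be sketched as a small decision function. This is only an illustration of the prose, not code taken from JCC's `setup.py`: the function names are hypothetical, the platform strings are assumed to be `sys.platform` values, and the `setuptools` version check (0.6c7 or later) is left out for brevity.

```python
def jcc_shared_mode_supported(platform, setuptools_patched=False,
                              use_setuptools=True, no_shared=False):
    """Return True if JCC itself would be built with shared mode support.

    Hypothetical helper mirroring the documentation, not JCC's own logic.
    """
    if no_shared or not use_setuptools:
        # NO_SHARED is set, or distutils is used: static mode.
        return False
    if platform in ("darwin", "win32"):
        # Mac OS X and Windows: supported out of the box.
        return True
    if platform.startswith("linux"):
        # Linux: only once setuptools has been patched with patch.43.
        return setuptools_patched
    # Any other operating system: not supported by default.
    return False


def extension_uses_shared_mode(jcc_supports_shared, shared_flag=False):
    """A JCC-built extension uses shared mode only when --shared is passed."""
    return jcc_supports_shared and shared_flag
```

For instance, `jcc_shared_mode_supported("linux2")` is `False` until `setuptools_patched=True` is passed, and an extension built without `--shared` stays in static mode even when JCC supports shared mode.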
+
+
+##Notes for Mac OS X
+
+On Mac OS X, Java is installed by Apple's setup as a framework. The
+values in <code>setup.py</code> for <code>INCLUDES</code>
+and <code>LFLAGS</code> for <code>darwin</code> should be correct
+and ready to use.
+
+
+  However, if you intend to use the 'system' Python from a Java VM
+  on Mac OS X -- Python embedded in Java --
+  you will need to add the flags <code>"-framework", "Python"</code>
+  to the <code>LFLAGS</code> value.
+
+
+##Notes for Linux
+
+JCC has been built and tested on a variety of Linux distributions,
+32- and 64-bit. Getting the Java configuration correct is important
+and is done differently on every distribution.<br/>
+For example:
+
+
+- On Ubuntu, to install Java 5, these commands may be used:
+<source>
+      $ sudo apt-get install sun-java5-jdk
+      $ sudo update-java-alternatives -s java-1.5.0-sun
+</source>
+The sample flags for Linux in JCC's <code>setup.py</code> should be
+close to correct.
+
+- On Gentoo, the <code>java-config</code> utility should be used to
+locate, and possibly change, the default Java installation.
+The sample flags for Linux in JCC's <code>setup.py</code> should
+be changed to reflect the root of the Java installation which may
+be obtained via:
+<source>
+      $ java-config -O
+</source>
+
+
+
+See the earlier section about <a href="#shared">Shared Mode</a> for
+shared mode support on Linux.
+
+
+##Notes for Solaris
+
+At this time, JCC has been built and tested only on Solaris 11 with Sun
+Studio C++ 12, Java 1.6 and Python 2.4.
+
+
+Because JCC is written in C++, Python's <code>distutils</code> must
+be nudged a bit to invoke the correct compiler. Sun Studio's C
+compiler is called <code>cc</code> while its C++ compiler is
+called <code>CC</code>. To build JCC, use the following shell
+command to ensure that the C++ compiler is used:
+
+<source>
+$ CC=CC python setup.py build
+</source>
+
+Shared mode is not currently implemented for
+Solaris; <code>setuptools</code> needs to be taught how to build
+plain shared libraries on Solaris first.
+
+
+##Notes for Windows
+
+At this time, JCC has been built and tested on Win2k and WinXP with
+a variety of Python and Java versions.
+
+
+- Adding the Python directory to <code>PATH</code> is recommended.
+
+- Adding the Java directories containing the necessary DLLs to
+<code>PATH</code> is a must.
+
+- Adding the directory containing <code>javac.exe</code>
+to <code>PATH</code> is required for shared mode (enabled by
+default if <code>setuptools >= 0.6c7</code> is found to be
+installed).
+
+
+
+##Notes for Python 2.3
+
+To use JCC with Python 2.3, <code>setuptools</code> is required:
+
+
+- Download <a href="http://pypi.python.org/pypi/setuptools">setuptools</a>.
+
+- Edit the downloaded <code>setuptools</code> egg file to use
+python2.3 instead of python2.4.
+
+- At the command line, run:<br/>
+<code>$ sudo sh setuptools-0.6c7-py2.4.egg</code>

Added: lucene/cms/trunk/content/lucene/pylucene/jcc/readme.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/trunk/content/lucene/pylucene/jcc/readme.mdtext?rev=1156598&view=auto
==============================================================================
--- lucene/cms/trunk/content/lucene/pylucene/jcc/readme.mdtext (added)
+++ lucene/cms/trunk/content/lucene/pylucene/jcc/readme.mdtext Thu Aug 11 12:37:55 2011
@@ -0,0 +1,844 @@
+
+##Warning
+*Before calling any PyLucene API that requires the Java VM, start it by
+calling <code>initVM(classpath, ...)</code>. More about this function
+<a href="#api">here</a>.*
+
+##Installing JCC
+
+JCC is a Python extension written in Python and C++. It requires a
+Java Runtime Environment (JRE) to operate as it uses Java's
+reflection APIs to do its work. It is built and installed
+via <code>distutils</code> or <code>setuptools</code>.
+
+
+See <a href="site:jcc/documentation/install">installation</a> for more
+information and operating system specific notes.
+
+##Invoking JCC
+
+JCC is installed as a package and how to invoke it depends on the
+Python version used:
+
+
+- Python 2.7: <code>python -m jcc</code>
+- Python 2.6: <code>python -m jcc.__main__</code>
+- Python 2.5: <code>python -m jcc</code>
+- Python 2.4:
+
+    - no setuptools: <code>python </code><em><code>site-packages</code></em><code>/jcc/__init__.py</code>
+    - with setuptools: <code>python </code><em><code>site-packages</code></em>/<em><code>jcc egg directory</code></em><code>/jcc/__init__.py</code>
+
+
+- Python 2.3: <code>python </code><em><code>site-packages</code></em>/<em><code>jcc egg directory</code></em><code>/jcc/__init__.py</code>
+
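A build script that has to work across these Python versions could pick the invocation programmatically. The sketch below is an assumption-laden illustration: `jcc_invocation` is a hypothetical helper, `jcc_dir` stands in for the installed `jcc` package directory (which differs with and without `setuptools`), and treating versions newer than 2.7 like 2.7 is a guess that the list above does not cover.

```python
import sys


def jcc_invocation(version=None, jcc_dir="site-packages/jcc"):
    """Return the argv prefix for running JCC under a given Python version.

    version is a (major, minor) tuple; it defaults to the running interpreter.
    jcc_dir is a placeholder for the installed jcc package directory.
    """
    major_minor = tuple(version or sys.version_info[:2])
    if major_minor == (2, 6):
        # Python 2.6 needs the __main__ module spelled out.
        return ["python", "-m", "jcc.__main__"]
    if major_minor >= (2, 5):
        # Python 2.5 and 2.7 (later versions are an assumption).
        return ["python", "-m", "jcc"]
    # Python 2.3 and 2.4: run the package's __init__.py directly.
    return ["python", jcc_dir + "/__init__.py"]
```

The returned list can be extended with JCC's own arguments and handed to `subprocess.call()`.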
+##Generating C++ and Python wrappers with JCC
+
+JCC started as a C++ code generator for hiding the gory details of
+accessing methods and fields on Java classes via
+Java's <a href="http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/invocation.html">Native Invocation Interface</a>.
+These C++ wrappers make it possible to access a Java object as if it
+were a regular C++ object, very much like GCJ's
+<a href="http://gcc.gnu.org/onlinedocs/gcj/About-CNI.html">CNI
+interface</a>.
+
+
+It then became apparent that JCC could also generate the C++
+wrappers for making these classes available to Python. Every class
+that gets thus wrapped becomes a
+<a href="http://docs.python.org/ext/defining-new-types.html">CPython
+type</a>.
+
+
+JCC generates wrappers for all public classes that are requested by
+name on the command line or via the <code>--jar</code> command line
+argument. It generates wrapper methods for all public methods and
+fields on these classes whose return type and parameter types are
+found in one of the following ways:
+
+
+- the type is one of the requested classes
+
+- the type is a superclass or an implemented interface of one of the
+requested classes
+
+- the type is available from one of the packages listed via the
+<code>--package</code> command line argument
+
+
+
+Overloaded methods are supported and are selected at runtime on the
+basis of the type and number of arguments passed in.
+
+
+JCC does not generate wrappers for methods or fields which don't
+satisfy these requirements. Thus, JCC can avoid generating code for
+runaway transitive closures of type dependencies.
+
+
+JCC generates property accessors for a property
+called <em><code>field</code></em> when it finds Java methods
+named <code>set</code><em><code>Field</code></em><code>(value)</code>,
+<code>get</code><em><code>Field</code></em><code>()</code> or
+<code>is</code><em><code>Field</code></em><code>()</code>.
+
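The accessor naming rule above can be illustrated with a few lines of Python. This is a guess at the convention, not JCC's implementation: `property_name` is a hypothetical helper, and lower-casing only the first character after the prefix is an assumption (the text above does not say how names such as `getURL` are treated).

```python
def property_name(method_name):
    """Return the property name implied by a Java accessor name, or None.

    setField / getField / isField all map to the property name 'field'.
    """
    for prefix in ("set", "get", "is"):
        rest = method_name[len(prefix):]
        if method_name.startswith(prefix) and rest[:1].isupper():
            # Drop the prefix and lower-case the first remaining character.
            return rest[0].lower() + rest[1:]
    # Not an accessor by this naming convention.
    return None
```

Under these assumptions `property_name("getField")` and `property_name("setField")` both give `'field'`, while a name such as `"getter"` is not treated as an accessor.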
+
+The C++ wrappers are declared in a C++ namespace structure that
+mirrors the Java classes' Java packages. The Python types are
+declared in a flat namespace at the top level of the resulting
+Python extension module.
+
+
+JCC's command-line arguments are best illustrated via the PyLucene
+example:
+
+<source>
+$ python -m jcc           # run JCC to wrap
+--jar lucene.jar      # all public classes in the lucene jar file
+--jar analyzers.jar   # and the lucene analyzers contrib package
+--jar snowball.jar    # and the snowball contrib package
+--jar highlighter.jar # and the highlighter contrib package
+--jar regex.jar       # and the regex search contrib package
+--jar queries.jar     # and the queries contrib package
+--jar extensions.jar  # and the Python extensions package
+--package java.lang   # including all dependencies found in the 
+                    # java.lang package
+--package java.util   # and the java.util package
+--package java.io     # and the java.io package
+java.lang.System    # and to explicitly wrap java.lang.System
+java.lang.Runtime   # as well as java.lang.Runtime
+java.lang.Boolean   # and java.lang.Boolean
+java.lang.Byte      # and java.lang.Byte
+java.lang.Character # and java.lang.Character
+java.lang.Integer   # and java.lang.Integer
+java.lang.Short     # and java.lang.Short
+java.lang.Long      # and java.lang.Long
+java.lang.Double    # and java.lang.Double
+java.lang.Float     # and java.lang.Float
+java.text.SimpleDateFormat
+                    # and java.text.SimpleDateFormat
+java.io.StringReader
+                    # and java.io.StringReader
+java.io.InputStreamReader
+                    # and java.io.InputStreamReader
+java.io.FileInputStream
+                    # and java.io.FileInputStream
+java.util.Arrays    # and java.util.Arrays
+--exclude org.apache.lucene.queryParser.Token
+                    # while explicitly not wrapping
+                    # org.apache.lucene.queryParser.Token
+--exclude org.apache.lucene.queryParser.TokenMgrError
+                    # nor org.apache.lucene.queryParser.TokenMgrError
+--exclude org.apache.lucene.queryParser.ParseException
+                    # nor org.apache.lucene.queryParser.ParseException
+--python lucene       # generating Python wrappers into a module
+                    # called lucene
+--version 2.4.0       # giving the Python extension egg version 2.4.0
+--mapping org.apache.lucene.document.Document 
+        'get:(Ljava/lang/String;)Ljava/lang/String;' 
+                    # asking for a Python mapping protocol wrapper
+                    # for get access on the Document class by
+                    # calling its get method
+--mapping java.util.Properties 
+        'getProperty:(Ljava/lang/String;)Ljava/lang/String;'
+                    # asking for a Python mapping protocol wrapper
+                    # for get access on the Properties class by
+                    # calling its getProperty method
+--sequence org.apache.lucene.search.Hits
+         'length:()I' 
+         'doc:(I)Lorg/apache/lucene/document/Document;'
+                    # asking for a Python sequence protocol wrapper
+                    # for length and get access on the Hits class by
+                    # calling its length and doc methods
+--files 2             # generating all C++ classes into about 2 .cpp
+                    # files
+--build               # and finally compiling the generated C++ code
+                    # into a Python egg via setuptools - when
+                    # installed - or a regular Python extension via
+                    # distutils or setuptools otherwise 
+--module collections.py
+                    # copying the collections.py module into the egg
+--install             # installing it into Python's site-packages
+                    # directory.
+</source>
+
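The method specifications given to `--mapping` and `--sequence` in the example above consist of a method name, a colon, and a JNI method signature. The following sketch decodes such a specification into readable Java type names; it is a reading aid only, `parse_method_spec` is a hypothetical helper and not part of JCC.

```python
import re

# JNI descriptors for the primitive types and void.
_PRIMITIVES = {"Z": "boolean", "B": "byte", "C": "char", "S": "short",
               "I": "int", "J": "long", "F": "float", "D": "double",
               "V": "void"}

# One type descriptor: optional array brackets, then a primitive or Lclass;
_TYPE = re.compile(r"\[*(?:[ZBCSIJFDV]|L[^;]+;)")


def _readable(descriptor):
    """Turn one JNI type descriptor, e.g. '[I', into 'int[]'."""
    dims = len(descriptor) - len(descriptor.lstrip("["))
    base = descriptor[dims:]
    if base.startswith("L"):
        # 'Ljava/lang/String;' -> 'java.lang.String'
        name = base[1:-1].replace("/", ".")
    else:
        name = _PRIMITIVES[base]
    return name + "[]" * dims


def parse_method_spec(spec):
    """Split 'name:(params)return' into (name, [param types], return type)."""
    name, signature = spec.split(":", 1)
    match = re.match(r"\((.*)\)(.+)$", signature)
    if match is None:
        raise ValueError("not a JNI method signature: %r" % signature)
    params, result = match.groups()
    return (name, [_readable(t) for t in _TYPE.findall(params)],
            _readable(result))
```

For example, `parse_method_spec('doc:(I)Lorg/apache/lucene/document/Document;')` describes a method `doc` taking one `int` and returning `org.apache.lucene.document.Document`.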
+There are limits to both how many files can fit on the command line
+and how large a C++ file the C++ compiler can handle. By default,
+JCC generates one large C++ file containing the source code for all
+wrapper classes.
+
+
+Using the <code>--files</code> command line argument, this behaviour
+can be tuned to work around various limits.<br/>
+For example:
+
+
+- to break up the large wrapper class file into about 2 files:<br/>
+<code>--files 2</code>
+
+- to break up the large wrapper class file into about 10 files:<br/>
+<code>--files 10</code>
+
+- to generate one C++ file per Java class wrapped:<br/>
+<code>--files separate</code>
+
+
+The <code>--prefix</code> and <code>--root</code> arguments are
+passed through to <code>distutils</code>' <code>setup()</code>.
+
+##Classpath considerations
+
+When generating wrappers for Python, the JAR files passed to JCC
+via <code>--jar</code> are copied into the resulting Python extension
+egg as resources and added to the extension
+module's <code>CLASSPATH</code> variable. Classes or JAR files that
+are required by the classes contained in the argument JAR files need
+to be made findable via JCC's <code>--classpath</code> command line
+argument. At runtime, these need to be appended to the
+extension's <code>CLASSPATH</code> variable before starting the VM
+with <code>initVM(CLASSPATH)</code>.
+
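Appending required entries to the extension's `CLASSPATH` before starting the VM might look like the sketch below. It assumes that `CLASSPATH` is a single string whose entries are separated by the platform's path separator (`os.pathsep`), as Java classpaths are; `extend_classpath` is a hypothetical helper and the jar path in the usage comment is a placeholder.

```python
import os


def extend_classpath(module_classpath, *extra_entries):
    """Append extra directories or jar files to a classpath string."""
    entries = [module_classpath] if module_classpath else []
    entries.extend(extra_entries)
    # Java classpath entries are joined with ':' on Unix and ';' on Windows.
    return os.pathsep.join(entries)


# Usage with a JCC-built extension such as PyLucene (not run here):
#   import lucene
#   lucene.initVM(extend_classpath(lucene.CLASSPATH, "/path/to/dependency.jar"))
```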
+
+To have such required jar files also automatically copied into the
+resulting Python extension egg and added to the classpath at build
+time and at runtime, use the <code>--include</code> option. This
+option works like the <code>--jar</code> option except that no
+wrappers are generated for the classes contained in them unless
+they're explicitly named on the command line.
+
+
+When more than one JCC-built extension module is going to be used in
+the same Python VM and these extension modules share Java classes,
+only one extension module should be generated with wrappers for these
+shared classes. The other extension modules must be built by importing
+the one with the shared classes by using the <code>--import</code>
+command line parameter. This ensures that only one copy of the
+wrappers for the shared classes is generated and that they are
+compatible among all extension modules sharing them.
+
+
+##Using <code>distutils</code> vs <code>setuptools</code>
+
+By default, when building a Python extension,
+if <code>setuptools</code> is found to be installed, it is used
+over <code>distutils</code>. If you want to force the use
+of <code>distutils</code> over <code>setuptools</code>, use
+the <code>--use-distutils</code> command line argument.
+
+
+##Distributing an egg
+
+The <code>--bdist</code> option can be used to ask JCC to
+invoke <code>distutils</code> with <code>bdist</code>
+or <code>setuptools</code>
+with <code>bdist_egg</code>. If <code>setuptools</code> is used,
+the resulting egg has to be installed with the
+<a href="http://peak.telecommunity.com/DevCenter/EasyInstall"><code>easy_install</code></a>
+installer which is normally part of a Python installation that
+includes <code>setuptools</code>.
+
+
+##JCC's runtime API functions
+
+JCC includes a small runtime component that is compiled into any
+Python extension it produces.
+
+
+This runtime component makes it possible to manage the Java VM from
+Python. Because a Java VM can be configured with a myriad of
+options, it is not automatically started when the resulting Python
+extension module is loaded into the Python interpreter.
+
+
+Instead, the <code>initVM()</code> function must be called from the
+main thread before using any of the wrapped classes. It takes the
+following keyword arguments:
+
+
+- 
+<code>classpath</code><br/>
+A string containing one or more directories or jar files for the
+Java VM to search for classes. Every Python extension produced by
+JCC exports a <code>CLASSPATH</code> variable that is hardcoded to
+the jar files that it was produced from. A copy of each jar file
+is installed as a resource file with the extension when JCC is
+invoked with the <code>--install</code> command line argument. 
+This parameter is optional and defaults to the
+<code>CLASSPATH</code> string exported by the module
+<code>initVM</code> is imported from.
+<source>
+  >>> import lucene
+  >>> lucene.initVM(classpath=lucene.CLASSPATH)
+</source>
+
+- 
+<code>initialheap</code><br/>
+The initial amount of Java heap to start the Java VM with. This
+argument is a string that follows the same syntax as the
+similar <code>-Xms</code> java command line argument.
+<source>
+  >>> import lucene
+  >>> lucene.initVM(initialheap='32m')
+  >>> lucene.Runtime.getRuntime().totalMemory()
+  33357824L
+</source>
+
+- 
+<code>maxheap</code><br/>
+The maximum amount of Java heap that could become available to the
+Java VM. This argument is a string that follows the same syntax as
+the similar <code>-Xmx</code> java command line argument.
+
+- 
+<code>maxstack</code><br/>
+The maximum amount of stack space that is available to the Java
+VM. This argument is a string that follows the same syntax as the
+similar <code>-Xss</code> java command line argument.
+
+- 
+<code>vmargs</code><br/>
+A string of comma-separated additional options to pass to the VM
+startup routine. These are passed through as-is. For example:
+<source>
+  >>> import lucene
+  >>> lucene.initVM(vmargs='-Xcheck:jni,-verbose:jni,-verbose:gc')
+</source>
+
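When the VM options come from configuration rather than literals, the keyword arguments listed above can be assembled first and passed on in one call. This is a minimal sketch: `init_vm_kwargs` is a hypothetical helper, and the only facts it relies on are the keyword names above and that `vmargs` is a single comma-separated string.

```python
def init_vm_kwargs(classpath=None, initialheap=None, maxheap=None,
                   maxstack=None, vmargs=()):
    """Collect initVM() keyword arguments, dropping the ones left unset."""
    kwargs = {"classpath": classpath, "initialheap": initialheap,
              "maxheap": maxheap, "maxstack": maxstack}
    # Leave unset options out so that initVM() applies its own defaults.
    kwargs = dict((k, v) for k, v in kwargs.items() if v is not None)
    if vmargs:
        # initVM() expects the extra VM options as one comma-separated string.
        kwargs["vmargs"] = ",".join(vmargs)
    return kwargs


# Usage with a JCC-built extension such as PyLucene (not run here):
#   import lucene
#   lucene.initVM(**init_vm_kwargs(maxheap="512m",
#                                  vmargs=["-Xcheck:jni", "-verbose:gc"]))
```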
+
+
+The <code>initVM()</code> and <code>getVMEnv()</code> functions
+return a JCCEnv object that has a few utility methods on it:
+
+
+- 
+<code>attachCurrentThread(name, asDaemon)</code><br/>
+Before a thread created in Python or elsewhere but not in the Java
+VM can be used with the Java VM, this method needs to be
+invoked. The two arguments it takes are optional and
+self-explanatory.
+
+- 
+<code>detachCurrentThread()</code><br/>
+The opposite of <code>attachCurrentThread()</code>. This method
+should be used with extreme caution, as Python's and the Java VM's
+garbage collectors may use a thread that was detached too early,
+causing a system crash. The utility of this method seems dubious at
+the moment.
+
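The attach requirement can be wrapped once so that worker threads created from Python never forget it. The sketch below is an illustration under stated assumptions: `run_in_attached_thread` is a hypothetical helper, and `FakeEnv` is a stand-in for the real `JCCEnv` object so that the example runs without a Java VM. In line with the caution above, the thread is never detached.

```python
import threading


def run_in_attached_thread(vm_env, target, *args):
    """Run target(*args) in a new Python thread attached to the Java VM."""
    def runner():
        # Attach before touching any wrapped Java class in this thread.
        vm_env.attachCurrentThread()
        target(*args)

    thread = threading.Thread(target=runner)
    thread.start()
    return thread


class FakeEnv(object):
    """Stand-in for the JCCEnv object; records which threads attached."""

    def __init__(self):
        self.attached = []

    def attachCurrentThread(self):
        self.attached.append(threading.current_thread().name)
```

With a real extension, the environment would come from `getVMEnv()` or from the value returned by `initVM()`, for example `run_in_attached_thread(lucene.getVMEnv(), search, query)`, where `search` and `query` are placeholders for application code.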
+
+
+There are several differences between JNI's <code>findClass()</code>
+and Java's <code>Class.forName()</code>:
+
+
+- 
+for <code>findClass()</code>, <code>className</code> is a
+'/'-separated string of names
+
+- 
+the class loaders are different: <code>findClass()</code> may find
+classes that <code>Class.forName()</code> won't.
+
+
+
+For example:
+
+<source>
+>>> from lucene import *
+>>> initVM(CLASSPATH)
+>>> findClass('org/apache/lucene/document/Document')
+&lt;Class: class org.apache.lucene.document.Document&gt;
+>>> Class.forName('org.apache.lucene.document.Document')
+Traceback (most recent call last):
+File "&lt;stdin&gt;", line 1, in &lt;module&gt;
+lucene.JavaError: java.lang.ClassNotFoundException:
+                org/apache/lucene/document/Document
+>>> Class.forName('java.lang.Object')
+&lt;Class: class java.lang.Object&gt;
+</source>
+
+##Type casting and instance checks
+
+Many Java APIs are declared to return types that are less specific
+than the types actually returned. In Java 1.5, this is worked around
+with type parameters. JCC generates code to heed type parameters
+unless the <code>--no-generics</code> command line parameter is
+used. See the next section for details on Java generics support.
+
+
+In C++, casting the object into its actual type is supported via the
+regular C casting operator.
+
+
+In Python each wrapped class has a class method
+called <code>cast_</code> that implements the same functionality.
+
+
+Similarly, each wrapped class has a class method
+called <code>instance_</code> that tests whether the wrapped java
+instance is of the given type. For example:
+
+<source>
+if BooleanQuery.instance_(query):
+  booleanQuery = BooleanQuery.cast_(query)
+  # only use the cast result when the instance check succeeded
+  print booleanQuery.getClauses()
+</source>
+
+##Handling generic classes
+
+Java 1.5 added support for parameterized types. JCC generates code
+to heed type parameters unless the <code>--no-generics</code>
+command line parameter is used. Java type parameters are erased at
+compile time: at runtime, the same class is used for all its
+parameterizations. Similarly, JCC wrapper objects all use the same
+class but store type parameterizations on instances and make them
+accessible as a tuple via the <code>parameters_</code> property.
+
+
+For example, an <code>ArrayList&lt;Document&gt;</code> instance
+has <code>(&lt;type 'Document'&gt;,)</code>
+for <code>parameters_</code> and its <code>get()</code> method uses
+that type parameter to wrap its return values.
+
+
+To allocate an instance of a generic Java class with specific type
+parameters use the <code>of_()</code> method. This method accepts
+one or more Python wrapper classes to use as type parameters. For
+example, <code>java.util.ArrayList&lt;E&gt;</code> is declared to
+accept one type parameter. Its wrapper's <code>of_()</code> method
+hence accepts one parameter, a Python class, to use as type
+parameter for the return type of its <code>get()</code> method, among
+others: 
+
+<source>
+>>> a = ArrayList().of_(Document)
+>>> a
+&lt;ArrayList: []&gt;
+>>> a.parameters_
+(&lt;type 'Document'&gt;,)
+>>> a.add(Document())
+True
+>>> a.get(0)
+&lt;Document: Document&lt;&gt;&gt;
+</source>
+
+The use of type parameters is, of course, optional. A generic Java
+class can still be used as before, without type parameters.
+Downcasting from <code>Object</code> is then necessary:  
+
+<source>
+>>> a = ArrayList()
+>>> a
+&lt;ArrayList: []&gt;
+>>> a.parameters_
+(None,)
+>>> a.add(Document())
+True
+>>> a.get(0)
+&lt;Object: Document&lt;&gt;&gt;
+>>> Document.cast_(a.get(0))
+&lt;Document: Document&lt;&gt;&gt;
+</source>
+
+##Handling arrays
+
+Java arrays are wrapped with a C++ JArray
+template. The <code>[]</code> operator is available for read
+access. This template, <code>JArray&lt;T&gt;</code>, accommodates all
+Java primitive types, <code>jstring</code>, <code>jobject</code> and
+wrapper class arrays.
+
+
+Java arrays are returned to Python in a <code>JArray</code> wrapper
+instance that implements the Python sequence protocol. It is
+possible to change an array's elements but not to change an array's
+size.
+
+
+To convert a char array to a Python string use
+a <code>''.join(array)</code> construct.
+
+
+Any Java method expecting an array can be called with the corresponding
+sequence object from Python.
+
+
+To instantiate a Java array from Python, use one of the following
+forms:
+
+<source>
+>>> array = JArray('int')(size)
+# the resulting Java int array is initialized with zeroes
+
+>>> array = JArray('int')(sequence)
+# the sequence must only contain ints
+# the resulting Java int array contains the ints in the sequence
+</source>
+
+Instead of <code>'int'</code>, you may also use one
+of <code>'object'</code>, <code>'string'</code>, <code>'bool'</code>,
+<code>'byte'</code>, <code>'char'</code>, <code>'double'</code>,
+<code>'float'</code>, <code>'long'</code> and <code>'short'</code>
+to create an array of the corresponding type.
+
+
+Because there is only one wrapper class for object arrays,
+the <code>JArray('object')</code> type's constructor takes a second
+argument denoting the class of the object elements. This argument is
+optional and defaults to <code>Object</code>.
+
+
+As with the <code>Object</code> types, the <code>JArray</code> types
+also include a <code>cast_</code> method. This method becomes useful
+when the array returned to Python is wrapped as a
+plain <code>Object</code>. This is the case, for example, with
+nested arrays since there is no distinct Python type for every
+different Java object array class: all Java object arrays are
+wrapped by <code>JArray('object')</code>. For example:
+
+<source>
+# cast obj to an array of ints
+>>> JArray('int').cast_(obj)
+# cast obj to an array of Document
+>>> JArray('object').cast_(obj, Document)
+</source>
+
+In both cases, the Java type of <code>obj</code> must be compatible
+with the array type it is being cast to.
+
+<source>
+# using nested array:
+
+>>> d = JArray('object')(1, Document)
+>>> d[0] = Document()
+>>> d
+JArray&lt;object&gt;[&lt;Document: Document&lt;&gt;&gt;]
+>>> d[0]
+&lt;Document: Document&lt;&gt;&gt;
+>>> a = JArray('object')(2)
+>>> a[0] = d
+>>> a[1] = JArray('int')([0, 1, 2])
+>>> a
+JArray&lt;object&gt;[&lt;Object: [Lorg.apache.lucene.document.Document;@694f12&gt;, &lt;Object: [I@234265&gt;]
+>>> a[0]
+&lt;Object: [Lorg.apache.lucene.document.Document;@694f12&gt;
+>>> a[1]
+&lt;Object: [I@234265&gt;
+>>> JArray('object').cast_(a[0])[0]
+&lt;Object: Document&lt;&gt;&gt;
+>>> JArray('object').cast_(a[0], Document)[0]
+&lt;Document: Document&lt;&gt;&gt;
+>>> JArray('int').cast_(a[1])
+JArray&lt;int&gt;[0, 1, 2]
+>>> JArray('int').cast_(a[1])[0]
+0
+</source>
+
+To verify that a Java object is of a given array type, use
+the <code>instance_()</code> method available on the array
+type. This is not the same as verifying that it is assignable with
+elements of a given type. For example, using the arrays created
+above:
+
+<source>
+# is d array of Object ? are d's elements of type Object ?
+>>> JArray('object').instance_(d)
+True
+
+# can it receive Object instances ?
+>>> JArray('object').assignable_(d)
+False
+
+# is it array of Document ? are d's elements of type Document ?
+>>> JArray('object').instance_(d, Document)
+True
+
+# is it array of Class ? are d's elements of type Class ?
+>>> JArray('object').instance_(d, Class)
+False
+
+# can it receive Document instances ?
+>>> JArray('object').assignable_(d, Document)
+True
+</source>
+
+##Exception reporting
+
+Exceptions that occur in the Java VM and that escape to C++ are
+reported as a <code>javaError</code> C++ exception. When using
+Python wrappers, the C++ exceptions are handled and reported with
+Python exceptions. When using C++ only, failure to handle the
+exception in your C++ code will cause the process to crash.
+
+
+Exceptions that occur in the Java VM and that escape to the Python
+VM are reported with a <code>JavaError</code> Python exception
+object. The <code>getJavaException()</code> method can be called
+on <code>JavaError</code> objects to obtain the original Java
+exception object wrapped as any other Java object. This Java object
+can be used to obtain a Java stack trace for the error, for example.
+
+
+Exceptions that occur in the Python VM and that escape to the Java
+VM, as for example can happen in Python extensions (see topic below)
+are reported to the Java VM as a <code>RuntimeException</code> or as
+a <code>PythonException</code> when using shared
+mode. See <a href="site:jcc/documentation/install">installation
+instructions</a> for more information about shared mode.
+
+
+##Writing Java class extensions in Python
+
+JCC makes it relatively easy to extend a Java class from
+Python. This is done via an intermediary class written in Java that
+implements a special method called <code>pythonExtension()</code>
+and that declares a number of native methods that are to be
+implemented by the actual Python extension.
+
+
+When JCC sees these special extension java classes it generates the
+C++ code implementing the native methods they declare. These native
+methods call the corresponding Python method implementations passing
+in parameters and returning the result to the Java VM caller.
+
+
+For example, to implement a Lucene analyzer in Python, one would
+first implement such an extension class in Java:
+
+<source>
+package org.apache.pylucene.analysis;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.TokenStream;
+import java.io.Reader;
+
+public class PythonAnalyzer extends Analyzer {
+    private long pythonObject;
+
+    public PythonAnalyzer()
+    {
+    }
+
+    public void pythonExtension(long pythonObject)
+    {
+        this.pythonObject = pythonObject;
+    }
+    public long pythonExtension()
+    {
+        return this.pythonObject;
+    }
+
+    public void finalize()
+        throws Throwable
+    {
+        pythonDecRef();
+    }
+
+    public native void pythonDecRef();
+    public native TokenStream tokenStream(String fieldName, Reader reader);
+}
+</source>
+
+The <code>pythonExtension()</code> methods are what make this class
+recognized as an extension class by JCC. They should be included
+verbatim as above along with the declaration of
+the <code>pythonObject</code> instance variable.
+
+
+The implementation of the native <code>pythonDecRef()</code> method
+is generated by JCC and is necessary because it seems
+that <code>finalize()</code> cannot itself be native. Since an
+extension class wraps the Python instance object it's going to be
+calling methods on, its ref count needs to be decremented when this
+Java wrapper class disappears. A declaration
+for <code>pythonDecRef()</code> and a <code>finalize()</code>
+implementation should always be included verbatim as above.
+
+
+Really, the only non-boilerplate user input is the constructor of the
+class and the other native methods, <code>tokenStream()</code> in
+the example above.
+
+
+The corresponding Python class(es) are implemented as follows:
+
+<source>
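+# assumes a PyLucene 3.x build, where these names are exported by
+# the flat lucene module
+from lucene import (PythonAnalyzer, PythonTokenStream,
+                    PositionIncrementAttribute, TermAttribute,
+                    OffsetAttribute)
+
+class _analyzer(PythonAnalyzer):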
+  def tokenStream(_self, fieldName, reader):
+      class _tokenStream(PythonTokenStream):
+          def __init__(self_):
+              super(_tokenStream, self_).__init__()
+              self_.TOKENS = ["1", "2", "3", "4", "5"]
+              self_.INCREMENTS = [1, 2, 1, 0, 1]
+              self_.i = 0
+              self_.posIncrAtt = self_.addAttribute(PositionIncrementAttribute.class_)
+              self_.termAtt = self_.addAttribute(TermAttribute.class_)
+              self_.offsetAtt = self_.addAttribute(OffsetAttribute.class_)
+          def incrementToken(self_):
+              if self_.i == len(self_.TOKENS):
+                  return False
+              self_.termAtt.setTermBuffer(self_.TOKENS[self_.i])
+              self_.offsetAtt.setOffset(self_.i, self_.i)
+              self_.posIncrAtt.setPositionIncrement(self_.INCREMENTS[self_.i])
+              self_.i += 1
+              return True
+          def end(self_):
+              pass
+          def reset(self_):
+              pass
+          def close(self_):
+              pass
+      return _tokenStream()
+</source>
+
+When an <code>__init__()</code> is declared, <code>super()</code>
+must be called or else the Java wrapper class will not know about
+the Python instance it needs to invoke.
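+
+This requirement can be pictured with a pure-Python analogy. In the
+sketch below, <code>WrapperBase</code> is a hypothetical stand-in for
+a JCC-generated extension class, not JCC's actual code. Its
+<code>__init__()</code> plays the role of the step where the wrapper
+records which Python instance to call back into, so a subclass that
+skips <code>super()</code> leaves that link unset.
+

```python
class WrapperBase(object):
    """Hypothetical stand-in for a JCC-generated extension class."""

    python_object = None  # stays unset until __init__() runs

    def __init__(self):
        # Stand-in for the step where the Java side is told which
        # Python instance it should invoke.
        self.python_object = self

    def is_linked(self):
        return self.python_object is self


class GoodExtension(WrapperBase):
    def __init__(self):
        super(GoodExtension, self).__init__()  # link established
        self.tokens = ["1", "2", "3"]


class BrokenExtension(WrapperBase):
    def __init__(self):
        # super() is never called, so the link is never established
        self.tokens = ["1", "2", "3"]
```

+
+Here <code>GoodExtension().is_linked()</code> is true while
+<code>BrokenExtension().is_linked()</code> is false, even though both
+objects were constructed without any error being raised.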
+
+
+When a Java extension class declares native methods for which there
+are public or protected equivalents available on the parent class,
+JCC generates code that makes it possible to
+call <code>super()</code> on these methods from Python as well.
+
+
+There are a number of extension examples available in PyLucene's test
+<a href="http://svn.apache.org/viewcvs.cgi/lucene/pylucene/trunk/test">suite</a>
+and <a href="site:documentation/readme">samples</a>.
+
+##Embedding a Python VM in a Java VM
+
+Using the same techniques used when writing a Python extension of a
+Java class, JCC may also be used to embed a Python VM in a Java VM.
+The steps and constraints are as follows:
+
+
+- 
+JCC must be built in shared mode.  See
+<a href="site:jcc/documentation/install">installation
+instructions</a> for more information about shared mode.
+Note that for this use on Mac OS X, JCC must also be built
+with the link flags <code>"-framework", "Python"</code> in
+the <code>LFLAGS</code> value.
+
+- 
+As described in the previous section, define one or more Java
+classes to be "extended" from Python to provide the
+implementations of the native methods declared on them. Instances
+of these classes implement the bridges into the Python VM from
+Java.
+
+- 
+The <code>org.apache.jcc.PythonVM</code> Java class is going to be
+used from the Java VM's main thread to initialize the embedded
+Python VM. This class is installed inside the JCC egg under the
+<code>jcc/classes</code> directory and the full path to this
+directory must be on the Java <code>CLASSPATH</code>.
+
+- 
+The JCC egg directory contains the JCC shared runtime library. This
+is not the JCC Python extension shared library but a library
+called <code>libjcc.dylib</code> on Mac OS X,
+<code>libjcc.so</code> on Linux or <code>jcc.dll</code> on Windows.
+This directory must be added to the Java VM's shared library path
+via the <code>-Djava.library.path</code> command line parameter.
+
+- 
+In the Java VM's main thread, initialize the Python VM by
+calling its static <code>start()</code> method passing it a
+Python program name string and optional start-up arguments
+in a string array that will be made accessible in Python via
+<code>sys.argv</code>.  Note that the program name string is
+purely informational, and is not used by the
+<code>start()</code> code other than to initialize that
+Python variable.  This method returns the singleton PythonVM
+instance to be used in this Java VM. <code>start()</code>
+may be called multiple times; it will always return the same
+singleton instance.  This instance may also be retrieved at any
+later time via the static <code>get()</code> method defined
+on the <code>org.apache.jcc.PythonVM</code> class.
+
+- 
+Any Java VM thread that is going to be calling into the Python VM
+should start by acquiring a reference to the Python thread state
+object by calling the <code>acquireThreadState()</code> method on the
+Python VM instance. It should then release the Python thread state
+before terminating by calling <code>releaseThreadState()</code>.
+Calling these methods is optional but strongly recommended as it
+ensures that Python is not creating and throwing away a thread
+state every time the Python VM is entered and exited from a given
+Java VM thread.
+
+- 
+Any Java VM thread may instantiate a Python object for which an
+extension class was defined in Java as described in the previous
+section by calling the <code>instantiate()</code> method on the 
+PythonVM instance. This method takes two string parameters, the
+name of the Python module and the name of the Python class to
+import and instantiate from it. The <code>__init__()</code>
+constructor on this class must be callable without any parameters
+and, if defined, must call <code>super()</code> in order to
+initialize the Java side. The <code>instantiate()</code> method is
+declared to return <code>java.lang.Object</code> but the return
+value is actually an instance of the Java extension class used and
+must be downcast to it.
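+
+The two path requirements in the steps above can be sketched as a
+small helper that assembles the <code>java</code> command line. This
+is an illustration under assumptions: <code>java_command</code>, its
+parameters and the paths passed to it are hypothetical, and the
+helper presumes the egg layout described above (runtime library in
+the egg directory, <code>PythonVM</code> under
+<code>jcc/classes</code>).
+

```python
import os


def java_command(jcc_egg_dir, main_class, extra_classpath=()):
    """Build an argument list for a JVM that embeds a Python VM.

    jcc_egg_dir is assumed to be the JCC egg directory, main_class the
    name of the Java class whose main() starts the Python VM.
    """
    # org.apache.jcc.PythonVM lives under jcc/classes inside the egg
    classes_dir = os.path.join(jcc_egg_dir, "jcc", "classes")
    classpath = os.pathsep.join([classes_dir] + list(extra_classpath))
    return [
        "java",
        # the egg directory holds the JCC shared runtime library
        "-Djava.library.path=" + jcc_egg_dir,
        "-cp", classpath,
        main_class,
    ]
```

+
+The returned list can be handed to <code>subprocess.call()</code> or
+joined into a shell command. Any jar files holding the application's
+own extension classes would go in <code>extra_classpath</code>.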
+
+##Pythonic protocols
+
+When generating wrappers for Python, JCC attempts to detect which
+classes can be made iterable:
+
+
+- 
+When a class declares that it
+implements <code>java.lang.Iterable</code>, JCC makes it iterable
+from Python.
+
+- 
+When a Java class declares a method called <code>next()</code>
+with no arguments returning an object type, this class is made
+iterable. Its <code>next()</code> method is assumed to terminate
+iteration by returning <code>null</code>.
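+
+The second rule can be pictured in pure Python. The adapter below is
+only an analogy for the convention, not the code JCC generates, and
+both class names are hypothetical. It turns an object whose
+<code>next()</code> returns <code>None</code> (the Python view of
+Java's <code>null</code>) at the end into a regular Python iterator.
+

```python
class NullTerminatedIterator(object):
    """Adapt an object whose next() returns None when exhausted."""

    def __init__(self, source):
        self._source = source

    def __iter__(self):
        return self

    def __next__(self):
        value = self._source.next()
        if value is None:  # a Java null arrives as None
            raise StopIteration
        return value

    next = __next__  # Python 2 spelling of the same method


class Countdown(object):
    """Hypothetical stand-in for a wrapped Java class with next()."""

    def __init__(self, start):
        self._n = start

    def next(self):
        if self._n == 0:
            return None  # signals the end of the iteration
        self._n -= 1
        return self._n + 1
```

+
+With this adapter, <code>list(NullTerminatedIterator(Countdown(3)))</code>
+yields <code>[3, 2, 1]</code>. One consequence of the convention is
+that a legitimate <code>null</code> element cannot be told apart from
+the end of the iteration.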
+
+
+
+JCC generates a Python mapping get method for a class when requested
+to do so via the <code>--mapping</code> command line option which
+takes two arguments, the class to generate the mapping get for and
+the Java method to use. The method is specified with its name
+followed by ':' and its Java
+<a href="http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/types.html#wp16432">signature</a>.
+
+
+For example, <code>System.getProperties()['java.class.path']</code> is
+made possible by:
+
+<source>
+--mapping java.util.Properties 
+        'getProperty:(Ljava/lang/String;)Ljava/lang/String;'
+                    # asking for a Python mapping protocol wrapper
+                    # for get access on the Properties class by
+                    # calling its getProperty method
+</source>
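+
+Writing JNI signatures by hand is error-prone, so a small helper can
+assemble the <code>name:signature</code> argument from ordinary Java
+type names. The function names below are hypothetical and not part
+of JCC. The type encodings follow the JNI specification linked
+above. Nested classes, which use <code>$</code> in their binary
+names, are not handled by this sketch.
+

```python
PRIMITIVES = {
    "boolean": "Z", "byte": "B", "char": "C", "short": "S",
    "int": "I", "long": "J", "float": "F", "double": "D", "void": "V",
}


def jni_type(java_type):
    """Encode one Java type name, e.g. 'int' or 'java.lang.String[]'."""
    if java_type.endswith("[]"):
        # each array dimension adds one leading '['
        return "[" + jni_type(java_type[:-2])
    if java_type in PRIMITIVES:
        return PRIMITIVES[java_type]
    # class names are written L<package/with/slashes>;
    return "L" + java_type.replace(".", "/") + ";"


def method_spec(name, param_types, return_type):
    """Build the 'name:signature' string for one Java method."""
    params = "".join(jni_type(t) for t in param_types)
    return "%s:(%s)%s" % (name, params, jni_type(return_type))
```

+
+For instance, <code>method_spec("getProperty", ["java.lang.String"],
+"java.lang.String")</code> produces the
+<code>getProperty:(Ljava/lang/String;)Ljava/lang/String;</code>
+string used in the example above.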
+
+JCC generates Python sequence length and get methods for a class
+when requested to do so via the <code>--sequence</code> command line
+option which takes three arguments, the class to generate the
+sequence length and get for and the two Java methods to use. The
+methods are specified with their name followed by ':' and their Java
+<a href="http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/types.html#wp16432">signature</a>. For example:
+
+<source>
+for i in xrange(len(hits)):
+    doc = hits[i]
+    ...
+</source>
+
+is made possible by:
+
+<source>
+--sequence org.apache.lucene.search.Hits
+         'length:()I' 
+         'doc:(I)Lorg/apache/lucene/document/Document;'
+</source>
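+
+What the <code>--sequence</code> option asks for can be pictured in
+pure Python: <code>len()</code> is routed to one method and indexing
+to another. The sketch below is an analogy rather than JCC's
+generated code, and <code>SequenceView</code> and
+<code>FakeHits</code> are hypothetical names.
+

```python
class SequenceView(object):
    """Expose a length method and a get method through len() and []."""

    def __init__(self, target, length_method, get_method):
        # look the two methods up by name, much as the option names them
        self._length = getattr(target, length_method)
        self._get = getattr(target, get_method)

    def __len__(self):
        return self._length()

    def __getitem__(self, index):
        # raising IndexError past the end also lets plain for loops stop
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self._get(index)


class FakeHits(object):
    """Hypothetical stand-in for a wrapped Hits object."""

    def __init__(self, docs):
        self._docs = list(docs)

    def length(self):
        return len(self._docs)

    def doc(self, i):
        return self._docs[i]
```

+
+With <code>hits = SequenceView(FakeHits(["a", "b"]), "length",
+"doc")</code>, both <code>len(hits)</code> and <code>hits[i]</code>
+work as in the loop shown earlier.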
+