Posted to user@cassandra.apache.org by Maciej Bryński <ma...@brynski.pl> on 2016/11/03 06:15:30 UTC

Problem with Jython UDF

Hi,
I have the following problem with a Jython UDF.

1) I'm using the Cassandra 3.9 deb packages on Ubuntu 14.04, running
Oracle Java 1.8.0_101-b13.

2) I added the Jython jar (version 2.7.0) to /usr/share/cassandra/lib.
This makes it possible to create Python functions.

3) I want to test a function.

cqlsh:e> CREATE FUNCTION IF NOT EXISTS test123 (input bigint) CALLED ON
NULL INPUT RETURNS text LANGUAGE python AS 'return "123"';

This worked, but running a SELECT with the UDF raises an exception:
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1264, in perform_simple_statement
    result = future.result()
  File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result
    raise self._final_exception
FunctionFailure: Error from server: code=1400 [User Defined Function failure] message="execution of 'e.test123[bigint]' failed: java.security.AccessControlException: access denied: ("java.lang.RuntimePermission" "accessClassInPackage.org.python.jline.console")

4) I tried modifying /etc/java-8-oracle/security/java.policy by adding:

grant codeBase "file:/usr/share/cassandra/lib/*" {
        permission java.security.AllPermission;
};

Still no improvement.

Any ideas on how to run Python UDFs in Cassandra?

Regards,
-- 
Maciek Bryński

Re: Problem with Jython UDF

Posted by Robert Stupp <sn...@snazy.de>.
C* uses a hard-coded security manager that is only effective when called from a UDF. I’ve experimented with using Java’s security manager for the whole C* process, but a) the performance impact was too high and b) it had too many side effects across the whole code base. TL;DR: it was too complicated ;)
There’s no way around the sandbox.

Scala is “enabled” via a JSR-223 provider - i.e. the same restrictions apply. There’s no special handling for Scala.

—
Robert Stupp
@snazy



Re: Problem with Jython UDF

Posted by Maciej Bryński <ma...@brynski.pl> on 5 Nov 2016, 20:36 (GMT+01:00).
Robert,
Thank you for the answer.

Do you know if it is possible to replace the current security manager
configuration with my own?
I still want to try running Jython :)

One more question. You wrote about limiting languages to Java and
JavaScript.
What about Scala?

M.



-- 
Maciek Bryński

Re: Problem with Jython UDF

Posted by Robert Stupp <sn...@snazy.de> on 2016-11-05 20:20 GMT+01:00.
Maciek,

I fear that Python (or rather, Jython) UDFs have not worked since C* 3.0.

Back in C* 2.2.x, the idea was to allow “all” JSR-223 languages for UDFs, i.e. basically every language listed in the lib/jsr223 directory.
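
To make the JSR-223 part concrete, here is a minimal sketch of how any JVM program can ask which script engines are visible on its classpath through the standard javax.script API. This is not Cassandra's own code; the class and method names (ScriptEngineProbe, languages, hasEngine) are made up for illustration, and the assumption that Jython registers itself under the name "python" should be checked against the Jython jar in use.

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: lists the JSR-223 engines
// that the current JVM classpath exposes.
public class ScriptEngineProbe {

    // Returns one "language (engine)" entry per discovered engine factory.
    static List<String> languages() {
        List<String> found = new ArrayList<>();
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            found.add(f.getLanguageName() + " (" + f.getEngineName() + ")");
        }
        return found;
    }

    // getEngineByName returns null when no provider registers that name.
    static boolean hasEngine(String name) {
        return new ScriptEngineManager().getEngineByName(name) != null;
    }

    public static void main(String[] args) {
        // The output depends on the JVM version and on the jars on the classpath.
        System.out.println("Engines: " + languages());
        // Assumption: Jython registers under the name "python".
        System.out.println("python available: " + hasEngine("python"));
    }
}
```

An engine showing up here only means a provider jar is visible to the JVM. It says nothing about whether code run through that engine gets past the UDF sandbox.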

UDFs in 2.2.x were not “sandboxed”, i.e. they had unrestricted access to files, network, classes, etc., so users could actually execute “evil” code on the nodes by creating and executing a UDF. This is definitely something nobody wants to allow in production (e.g. a UDF body like Runtime.getRuntime().exec("rm -rf /")).

Therefore we added a so-called “sandbox” in C* 3.0.0, which restricts access to classes and even to specific functions. Additionally, runtime quotas (heap usage and CPU time consumption) are checked. This is pretty straightforward for Java UDFs. Unfortunately, it is not straightforward for JavaScript UDFs. Frankly, it is difficult, and honestly it’s annoying to secure all the possible runtime characteristics via JSR-223.

I strongly recommend using Java UDFs, for several reasons:
* performance - Java UDFs get compiled to bytecode and are subject to HotSpot optimizations
* security - Java bytecode is inspected and rejected if a UDF calls an “evil” function. JSR-223 code (including JavaScript!) is not inspected, so we have to rely on the (limited) security checks in, for example, Nashorn. See also CASSANDRA-9954 - improving both performance and security for Java UDFs
* maintenance - Java code (or rather, bytecode) is well defined, whereas JavaScript (i.e. the Nashorn implementation) changes.

IMHO your “best” option is to switch to Java UDFs.
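
As a rough sketch of what that switch could look like for the test function in this thread: the body of a Java UDF is a Java method body, so the declaration would presumably change to LANGUAGE java with a body of 'return "123";'. The class below only wraps that body in an ordinary method so it can be compiled and run locally. The class name Test123Sketch is hypothetical, and the mapping of bigint to java.lang.Long and text to java.lang.String is an assumption to verify against the CQL documentation for your version.

```java
// Hypothetical local wrapper, not something Cassandra generates.
public class Test123Sketch {

    // Assumed CQL declaration, mirroring the original test123:
    //   CREATE FUNCTION IF NOT EXISTS test123 (input bigint)
    //   CALLED ON NULL INPUT RETURNS text
    //   LANGUAGE java AS 'return "123";';

    // The UDF body as a plain method. The argument is boxed (Long) because
    // CALLED ON NULL INPUT means the body also runs for null arguments.
    static String test123(Long input) {
        return "123";
    }

    public static void main(String[] args) {
        System.out.println(test123(null));
        System.out.println(test123(42L));
    }
}
```

Checking the body like this only exercises the Java logic. Whether the function is accepted still depends on the server-side checks applied when it is created in C*.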

TL;DR: Python, and probably all scripting languages except JavaScript, have not worked since 3.0.

Robert

PS: Honestly, in hindsight it was maybe a mistake to allow “all” JSR-223 languages, so I’ve opened https://issues.apache.org/jira/browse/CASSANDRA-12883.

—
Robert Stupp
@snazy
