Posted to dev@zeppelin.apache.org by doanduyhai <gi...@git.apache.org> on 2015/07/22 12:09:19 UTC
[GitHub] incubator-zeppelin pull request: Cassandra Interpreter
GitHub user doanduyhai opened a pull request:
https://github.com/apache/incubator-zeppelin/pull/162
Cassandra Interpreter
This is a Cassandra interpreter for Zeppelin.
I tried to make the code as clean & modular as possible. The code coverage is quite high with 75 tests (unit tests + integration tests).
Below are the features of the interpreter:
- support for single-line and multi-line comments
- one CQL statement can span multiple lines
- a magic `@prefix` system to pass in runtime parameters to queries
- support for preparing statements beforehand and injecting bound values into prepared statements
- parallel execution of paragraphs
- the last statement is displayed as tabular data if it is a SELECT statement. For non-SELECT statements, execution statistics are returned
- simple syntax validation by the interpreter; CQL syntax validation is delegated to Cassandra
- support for Zeppelin dynamic forms with the mustache syntax `{{input_name=default value}}` or `{{select_name=val1 | val2 | ... | valN}}`
For all the details about the features of this interpreter, please read the doc **[here]**:
[here]: https://docs.google.com/document/d/1krRrpZ3jKx_EOnALp30R1aAL8_tqCiu3W9oz5og0hDg/pub
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/doanduyhai/incubator-zeppelin CassandraInterpreter
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-zeppelin/pull/162.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #162
----
commit 859de21c4269be8adf976b970a99e1fa9579bcb0
Author: DuyHai DOAN <do...@gmail.com>
Date: 2015-07-19T11:20:54Z
Cassandra Interpreter
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-126111502
OK, I tested in a VM running 64-bit Ubuntu, using the above **steps** to build, and it worked.
![workingbuildinsidevm](https://cloud.githubusercontent.com/assets/1532977/8971289/5cfdaf0c-364f-11e5-9d28-722df1eed357.png)
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125896425
How did you build the interpreter, @Leemoonsoo? Can you give me the `mvn` command you used so that I can reproduce it? I think it's some kind of dependency clash.
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125970888
I just tried on an Ubuntu laptop following the above tests. Same result: it worked.
I'm going to create a VM and try to reproduce the problem there, if possible.
Can you post the complete exception stack trace somewhere, @Leemoonsoo?
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/incubator-zeppelin/pull/162
---
[GitHub] incubator-zeppelin pull request: Cassandra Interpreter
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125110445
It was because the Maven dependency on the **Scala** library was set to _provided_. I fixed it in the `pom.xml`; it should work now.
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-127051724
@doanduyhai Apologies for the late response.
Below is the full stack trace.
```
INFO [2015-08-03 03:10:30,322] ({Thread-0} RemoteInterpreterServer.java[run]:97) - Starting remote interpreter server on port 54081
INFO [2015-08-03 03:10:30,780] ({pool-1-thread-3} CassandraInterpreter.java[<clinit>]:154) - Bootstrapping Cassandra Interpreter
INFO [2015-08-03 03:10:30,788] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:152) - Instantiate interpreter org.apache.zeppelin.cassandra.CassandraInterpreter
INFO [2015-08-03 03:10:30,899] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:132) - Job remoteInterpretJob_1438539030896 started by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter1313211444
INFO [2015-08-03 03:10:30,900] ({pool-2-thread-2} CassandraInterpreter.java[open]:268) - Bootstrapping Cassandra Java Driver to connect to localhost,on port 9042
ERROR [2015-08-03 03:10:31,024] ({pool-2-thread-2} Job.java[run]:183) - Job failed
java.lang.NoClassDefFoundError: com/google/common/collect/ImmutableMap
at com.datastax.driver.core.ProtocolVersion.<clinit>(ProtocolVersion.java:78)
at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.ImmutableMap
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 16 more
INFO [2015-08-03 03:10:31,046] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:138) - Job remoteInterpretJob_1438539030896 finished by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter1313211444
```
I have realized that when I run Zeppelin through `zeppelin-daemon.sh`, it works fine. The stack trace only appears when I run it inside my IDE.
I think it's good to go. +1 for merge!
Thanks for the really great Cassandra interpreter implementation!
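One quick way to compare the two launch modes is to check whether a Guava jar is present in the directory each one loads from. The following is a minimal sketch, assuming the interpreter's jars live under `$ZEPPELIN_HOME/interpreter/cassandra`. `find_guava_jars` is a hypothetical helper, not part of Zeppelin, and it matches by file name only, so it will not detect Guava classes bundled inside another jar.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list Guava jars (matched by file name) under a directory.
# Prints nothing if the directory does not exist or holds no Guava jar.
find_guava_jars() {
  local dir="$1"
  find "$dir" -type f -name 'guava-*.jar' 2>/dev/null | sort
}

# Example: inspect the Cassandra interpreter directory when ZEPPELIN_HOME is set.
# The directory layout is an assumption, not something this thread confirms for every build.
if [ -n "${ZEPPELIN_HOME:-}" ]; then
  find_guava_jars "$ZEPPELIN_HOME/interpreter/cassandra"
fi
```

An empty result for the directory used by the IDE run configuration would be consistent with the `ClassNotFoundException` for `com.google.common.collect.ImmutableMap` shown above.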
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125895799
Now I get a different class-not-found exception:
```
%cassandra
describe tables
java.lang.ClassNotFoundException: com.google.common.collect.ImmutableMap
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at com.datastax.driver.core.ProtocolVersion.<clinit>(ProtocolVersion.java:78)
at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
```
If I run the same paragraph again, I get a different exception in the log file:
```
java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.ProtocolVersion
at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
```
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by seufagner <gi...@git.apache.org>.
Github user seufagner commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200913876
It works now! Thanks.
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-127503824
Merging it if there is no more discussion.
---
[GitHub] incubator-zeppelin pull request: Cassandra Interpreter
Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125078941
Thanks for the great contribution!
I successfully built this branch, but if I try to run something, I get the following error:
```
INFO [2015-07-27 12:51:35,351] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:132) - Job remoteInterpretJob_1437969095343 started by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter40023061
INFO [2015-07-27 12:51:35,351] ({pool-2-thread-2} CassandraInterpreter.java[open]:268) - Bootstrapping Cassandra Java Driver to connect to localhost,on port 9042
ERROR [2015-07-27 12:51:35,352] ({pool-2-thread-2} Job.java[run]:183) - Job failed
java.lang.NoClassDefFoundError: scala/collection/mutable/StringBuilder
at org.apache.zeppelin.cassandra.JavaDriverConfig.getCompressionProtocol(JavaDriverConfig.scala:335)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:271)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
INFO [2015-07-27 12:51:35,353] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:138) - Job remoteInterpretJob_1437969095343 finished by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter40023061
```
Could you help me to make it work?
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125958145
OK, so I checked out and built Zeppelin on a **clean** MacBook Pro (Zeppelin had never been installed or built on it before) and it works.
Steps:
1. `git clone https://github.com/apache/incubator-zeppelin.git`
2. `cd incubator-zeppelin`
3. `git fetch origin pull/162/head:CheckPull`. **162** is this pull request's ID, **CheckPull** is the name of the local branch to create for this pull request
4. `mvn -Pspark-1.4 -DskipTests -Dspark.version=1.4.1 -Phadoop-2.6 -Pyarn package`. I added **-DskipTests** because some front-end tests were failing when trying to open a browser ...
5. Start Cassandra locally and ensure I can bind to its port (127.0.0.1:9042)
6. Start Zeppelin `bin/zeppelin-daemon.sh start`
7. Create a notebook and paste
<pre>
%cassandra
describe tables
</pre>
Then I got a syntax error (expected):
![image](https://cloud.githubusercontent.com/assets/1532977/8958524/9fb01938-3605-11e5-89be-f48592255792.png)
After adding a semi-colon to fix the syntax error, it worked:
![image](https://cloud.githubusercontent.com/assets/1532977/8958535/b8b3d41a-3605-11e5-9d2b-97511e2612b3.png)
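The command-line steps above (1 to 4 and 6) can be collected into a small script. This is only a sketch: the `DRY_RUN` variable and the `run` and `build_pr_162` functions are illustrative names, not part of Zeppelin. The `git checkout CheckPull` line does not appear in the steps above; it is added on the assumption that the fetched branch has to be checked out before building. Starting Cassandra (step 5) and the notebook step are left out.

```shell
#!/usr/bin/env bash
# DRY_RUN and run() are illustrative additions: by default the commands are
# only printed; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around steps 1-4 and 6; stops at the first failing command.
build_pr_162() {
  run git clone https://github.com/apache/incubator-zeppelin.git &&
  run cd incubator-zeppelin &&
  run git fetch origin pull/162/head:CheckPull &&
  # Assumption: check out the fetched branch before building (not listed in the steps).
  run git checkout CheckPull &&
  run mvn -Pspark-1.4 -DskipTests -Dspark.version=1.4.1 -Phadoop-2.6 -Pyarn package &&
  run bin/zeppelin-daemon.sh start
}
```

Calling `build_pr_162` prints the commands in order; `DRY_RUN=0 build_pr_162` runs them in the current shell.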
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200882148
@seufagner This is because the `zeppelin-0.5.6-incubating-bin-all.tgz` build is using an old version of Guava. Just put a Guava jar of version >= 16.0.1 (for example `guava-16.0.1.jar`) inside the folder `$ZEPPELIN_HOME/interpreter/cassandra` and it will work.
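The workaround above can be sketched as a small shell function. `install_guava` is a hypothetical helper, and the jar is assumed to have been downloaded already; the function only copies it into the interpreter folder named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a Guava jar into the Cassandra interpreter folder.
# Usage: install_guava /path/to/guava-16.0.1.jar [zeppelin_home]
# If zeppelin_home is omitted, $ZEPPELIN_HOME is used.
install_guava() {
  local jar="$1"
  local home="${2:-${ZEPPELIN_HOME:-}}"
  if [ ! -f "$jar" ]; then
    echo "Guava jar not found: $jar" >&2
    return 1
  fi
  if [ -z "$home" ] || [ ! -d "$home/interpreter/cassandra" ]; then
    echo "Cassandra interpreter folder not found under: $home" >&2
    return 1
  fi
  cp "$jar" "$home/interpreter/cassandra/"
}
```

For example, `install_guava ~/Downloads/guava-16.0.1.jar` with `ZEPPELIN_HOME` set. Restarting Zeppelin afterwards is probably needed so that the interpreter process picks up the new jar.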
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by seufagner <gi...@git.apache.org>.
Github user seufagner commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200502995
I got the binary version (`zeppelin-0.5.6-incubating-bin-all.tgz`) and I still get the same error.
---
[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...
Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125898630
My build command was `mvn -Pspark-1.4 -Dspark.version=1.4.1 -Phadoop-2.6 -Pyarn package`
---