Posted to dev@zeppelin.apache.org by doanduyhai <gi...@git.apache.org> on 2015/07/22 12:09:19 UTC

[GitHub] incubator-zeppelin pull request: Cassandra Interpreter

GitHub user doanduyhai opened a pull request:

    https://github.com/apache/incubator-zeppelin/pull/162

    Cassandra Interpreter

    This is a Cassandra interpreter for Zeppelin.
    
    I tried to make the code as clean & modular as possible. The code coverage is quite high with 75 tests (unit tests + integration tests).
    
    Below are the features of the interpreter:
    
    - support single-line and multi-line comments
    - one CQL statement can span many lines
    - a magic `@prefix` system to pass in runtime parameters to queries
    - support for preparing statements beforehand and injecting bound values into prepared statements
    - parallel execution of paragraphs
    - the last statement is displayed as tabular data if it is a SELECT statement. For non SELECT statements, execution statistics are returned
    - simple syntax validation by the interpreter; CQL syntax validation is delegated to Cassandra
    - support for Zeppelin dynamic form with the mustache syntax {{input_name=default value}} or {{select_name=val1 | val2 | ... | valN}}
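To make the dynamic form syntax in the last bullet concrete, here is a minimal Python sketch of how such placeholders could be recognised and filled in. It is not Zeppelin's parser: the regular expression, the function names `parse_forms` and `render`, the `users` table, and the choice of the first option as the fallback for a select are all assumptions made for illustration.

```python
import re

# Matches {{name=value}}; an illustrative pattern, not the one Zeppelin uses.
FORM_PATTERN = re.compile(r"\{\{\s*(\w+)\s*=\s*([^{}]*?)\s*\}\}")


def parse_forms(paragraph):
    """Return (name, kind, values) for every placeholder in the paragraph."""
    forms = []
    for match in FORM_PATTERN.finditer(paragraph):
        name, raw = match.group(1), match.group(2)
        if "|" in raw:
            # "val1 | val2 | ... | valN" is treated as a select form.
            options = [option.strip() for option in raw.split("|")]
            forms.append((name, "select", options))
        else:
            # Anything else is treated as a text input with a default value.
            forms.append((name, "input", [raw]))
    return forms


def render(paragraph, values):
    """Replace each placeholder with the supplied value, else its default.

    For a select, the first option is the fallback (an assumption).
    """
    def substitute(match):
        name, raw = match.group(1), match.group(2)
        default = raw.split("|")[0].strip()
        return values.get(name, default)

    return FORM_PATTERN.sub(substitute, paragraph)


# Hypothetical query against a hypothetical "users" table.
query = "SELECT * FROM users WHERE country='{{country=FR | US | DE}}' LIMIT {{limit=10}};"
print(parse_forms(query))
print(render(query, {"limit": "5"}))
```

Calling `render` with an empty dictionary falls back to the defaults written in the placeholders, which mirrors the `name=default value` form described in the bullet.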
    
    For all the details about the features of this interpreter, please read the doc **[here]**: 
    
    [here]: https://docs.google.com/document/d/1krRrpZ3jKx_EOnALp30R1aAL8_tqCiu3W9oz5og0hDg/pub

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/doanduyhai/incubator-zeppelin CassandraInterpreter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-zeppelin/pull/162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #162
    
----
commit 859de21c4269be8adf976b970a99e1fa9579bcb0
Author: DuyHai DOAN <do...@gmail.com>
Date:   2015-07-19T11:20:54Z

    Cassandra Interpreter

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-126111502
  
    OK, tested in a 64-bit Ubuntu VM using the above **steps** to build; it worked.
    
    ![workingbuildinsidevm](https://cloud.githubusercontent.com/assets/1532977/8971289/5cfdaf0c-364f-11e5-9d28-722df1eed357.png)




[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125896425
  
    How did you build the interpreter, @Leemoonsoo? Can you give me the "mvn" command you used so that I can reproduce it? I think it's some kind of dependency class problem.



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125970888
  
    Just tried on an Ubuntu laptop following the above steps; same result, it worked.
    
    I'm going to create a VM and try to reproduce the problem there, if possible.
    
    Can you post the complete exception stack trace somewhere, @Leemoonsoo?



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-zeppelin/pull/162



[GitHub] incubator-zeppelin pull request: Cassandra Interpreter

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125110445
  
    It was because the Maven dependency on the **Scala** library was set to _provided_. I fixed it in the `pom.xml`; it should work now.
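For context, a Maven dependency with `provided` scope is available at compile time but is expected to be supplied by the runtime environment, so it is not packaged with the artifact. That would be consistent with the `NoClassDefFoundError` for `scala/collection/mutable/StringBuilder` reported in this thread. The Python sketch below lists the `provided` dependencies declared directly in a POM. The sample POM, its version numbers and the function name are hypothetical and are not taken from the actual `pom.xml` of this pull request.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM files.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical POM fragment for illustration only.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.10.4</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.datastax.cassandra</groupId>
      <artifactId>cassandra-driver-core</artifactId>
      <version>2.1.7.1</version>
    </dependency>
  </dependencies>
</project>"""


def provided_dependencies(pom_text):
    """Return (groupId, artifactId) for each dependency with provided scope."""
    root = ET.fromstring(pom_text)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        # Maven's default scope is "compile" when no <scope> is given.
        scope = dep.findtext("m:scope", default="compile", namespaces=NS)
        if scope.strip() == "provided":
            found.append((
                dep.findtext("m:groupId", namespaces=NS),
                dep.findtext("m:artifactId", namespaces=NS),
            ))
    return found


print(provided_dependencies(SAMPLE_POM))
```

The sketch only inspects dependencies written directly in the given file; scopes inherited from a parent POM or set through `dependencyManagement` would not be seen.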



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-127051724
  
    @doanduyhai Apologies for the late response.
    
    Below is the full stack trace.
    ```
     INFO [2015-08-03 03:10:30,322] ({Thread-0} RemoteInterpreterServer.java[run]:97) - Starting remote interpreter server on port 54081
     INFO [2015-08-03 03:10:30,780] ({pool-1-thread-3} CassandraInterpreter.java[<clinit>]:154) - Bootstrapping Cassandra Interpreter
     INFO [2015-08-03 03:10:30,788] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:152) - Instantiate interpreter org.apache.zeppelin.cassandra.CassandraInterpreter
     INFO [2015-08-03 03:10:30,899] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:132) - Job remoteInterpretJob_1438539030896 started by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter1313211444
     INFO [2015-08-03 03:10:30,900] ({pool-2-thread-2} CassandraInterpreter.java[open]:268) - Bootstrapping Cassandra Java Driver to connect to localhost,on port 9042
    ERROR [2015-08-03 03:10:31,024] ({pool-2-thread-2} Job.java[run]:183) - Job failed
    java.lang.NoClassDefFoundError: com/google/common/collect/ImmutableMap
    	at com.datastax.driver.core.ProtocolVersion.<clinit>(ProtocolVersion.java:78)
    	at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
    	at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
    	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    	at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.ClassNotFoundException: com.google.common.collect.ImmutableMap
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    	... 16 more
     INFO [2015-08-03 03:10:31,046] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:138) - Job remoteInterpretJob_1438539030896 finished by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter1313211444
    ```
    
    I have realized that when I run Zeppelin through zeppelin-daemon.sh, it works fine. The stack trace is only seen when I'm running it inside my IDE.
    
    I think it's good to go. +1 for merge!
    
    Thanks for the really great Cassandra interpreter implementation!
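The difference between the IDE run and `zeppelin-daemon.sh` described above is likely a classpath difference: a class such as `com.google.common.collect.ImmutableMap` has to come from some jar visible to the interpreter process. As a rough aid, the Python sketch below reports which jars in a directory contain a given class. The function name and the directory passed to it are assumptions; this is not a Zeppelin tool.

```python
import zipfile
from pathlib import Path


def jars_containing(directory, class_name):
    """Return the names of jars in `directory` that contain `class_name`.

    `class_name` is a fully qualified Java class name, for example
    "com.google.common.collect.ImmutableMap".
    """
    # Inside a jar, a class is stored as a path ending in ".class".
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(directory).glob("*.jar")):
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                hits.append(jar.name)
    return hits


# Example call (the directory is hypothetical):
# jars_containing("interpreter/cassandra", "com.google.common.collect.ImmutableMap")
```

An empty result for the directory the interpreter loads from would suggest the class is simply not packaged there; more than one hit would suggest checking which version is picked up first.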



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125895799
  
    Now I have a different class-not-found exception:
    
    ```
    %cassandra
    describe tables
    
    java.lang.ClassNotFoundException: com.google.common.collect.ImmutableMap
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    	at com.datastax.driver.core.ProtocolVersion.<clinit>(ProtocolVersion.java:78)
    	at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
    	at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
    	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    	at java.lang.Thread.run(Thread.java:745)
    ```
    
    If I run the same paragraph again, I get a different exception in the log file:
    
    ```
    java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.ProtocolVersion
    	at org.apache.zeppelin.cassandra.JavaDriverConfig.getProtocolVersion(JavaDriverConfig.scala:209)
    	at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:273)
    	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    	at java.lang.Thread.run(Thread.java:745)
    ```
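The two traces above are related. On the JVM, once a class's static initializer has failed, later attempts to use that class raise `NoClassDefFoundError: Could not initialize class ...` instead of repeating the original cause. Here the first trace shows `ProtocolVersion.<clinit>` failing because `ImmutableMap` could not be found, and the second shows the follow-up error. The Python sketch below separates the two kinds of message in a log excerpt. The function name, the reason labels and the shortened sample log are illustrative assumptions.

```python
import re

# Captures the exception type, the optional "Could not initialize class "
# prefix, and the class name (slash- or dot-separated).
PATTERN = re.compile(
    r"(ClassNotFoundException|NoClassDefFoundError): "
    r"(Could not initialize class )?([\w./$]+)"
)


def classify(log_text):
    """Return unique (class_name, reason) pairs found in the log text."""
    results = []
    for match in PATTERN.finditer(log_text):
        class_name = match.group(3).replace("/", ".")
        if match.group(2):
            reason = "initializer failed earlier"
        else:
            reason = "not found on classpath"
        if (class_name, reason) not in results:
            results.append((class_name, reason))
    return results


# Shortened excerpt built from the messages quoted in this thread.
SAMPLE_LOG = """\
java.lang.NoClassDefFoundError: com/google/common/collect/ImmutableMap
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.ImmutableMap
java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.ProtocolVersion
"""

for class_name, reason in classify(SAMPLE_LOG):
    print(f"{class_name}: {reason}")
```

When both kinds appear, the "not found on classpath" entry is usually the one to fix; the "initializer failed earlier" entry tends to disappear once the missing class is available and the interpreter process is restarted.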



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by seufagner <gi...@git.apache.org>.
Github user seufagner commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200913876
  
    It works now! Thanks.



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-127503824
  
    Merging it if there is no further discussion.



[GitHub] incubator-zeppelin pull request: Cassandra Interpreter

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125078941
  
    Thanks for the great contribution!
    
    I successfully built this branch, but if I try to run something, I get the following error:
    ```
     INFO [2015-07-27 12:51:35,351] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:132) - Job remoteInterpretJob_1437969095343 started by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter40023061
     INFO [2015-07-27 12:51:35,351] ({pool-2-thread-2} CassandraInterpreter.java[open]:268) - Bootstrapping Cassandra Java Driver to connect to localhost,on port 9042
    ERROR [2015-07-27 12:51:35,352] ({pool-2-thread-2} Job.java[run]:183) - Job failed
    java.lang.NoClassDefFoundError: scala/collection/mutable/StringBuilder
    	at org.apache.zeppelin.cassandra.JavaDriverConfig.getCompressionProtocol(JavaDriverConfig.scala:335)
    	at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:271)
    	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:146)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    	at java.lang.Thread.run(Thread.java:745)
     INFO [2015-07-27 12:51:35,353] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:138) - Job remoteInterpretJob_1437969095343 finished by scheduler org.apache.zeppelin.cassandra.CassandraInterpreter40023061
    ```
    Could you help me to make it work?



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125958145
  
    OK, so I checked out and built Zeppelin on a **clean** MacBook Pro (Zeppelin was never installed or built on it before) and it works.
    
    Steps:
    
    1. `git clone https://github.com/apache/incubator-zeppelin.git`
    2. `cd incubator-zeppelin`
    3. `git fetch origin pull/162/head:CheckPull`. **162** is this pull request ID, **CheckPull** is the name of the local branch to create for this pull request
    4. `mvn -Pspark-1.4 -DskipTests -Dspark.version=1.4.1 -Phadoop-2.6 -Pyarn package`. I added **-DskipTests** because some front-end tests were failing when trying to open a browser ...
    5. Start Cassandra locally and ensure I can connect to its port (127.0.0.1:9042)
    6. Start Zeppelin: `bin/zeppelin-daemon.sh start`
    7. Create a notebook and paste:
    
    <pre>
            %cassandra
            describe tables
    </pre>
      
    Then I got a syntax error (expected):
    ![image](https://cloud.githubusercontent.com/assets/1532977/8958524/9fb01938-3605-11e5-89be-f48592255792.png)
    
    After adding a semi-colon to fix the syntax error, it worked:
    
    ![image](https://cloud.githubusercontent.com/assets/1532977/8958535/b8b3d41a-3605-11e5-9d2b-97511e2612b3.png)




[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by doanduyhai <gi...@git.apache.org>.
Github user doanduyhai commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200882148
  
    @seufagner This is because the zeppelin-0.5.6-incubating-bin-all.tgz build is using an old version of Guava. Just put Guava version >= 16.0.1 jars (for example `guava-16.0.1.jar`) inside the folder `$ZEPPELIN_HOME/interpreter/cassandra` and it will work.
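As a quick way to check the condition described above, the Python sketch below looks at jar file names and reports whether a Guava jar of version 16.0.1 or later is present. It relies only on the `guava-<version>.jar` naming shown in the comment. The function names are hypothetical, and the check does not detect which jar the JVM would actually load if several Guava versions are present.

```python
import re
from pathlib import Path

# Matches names such as "guava-16.0.1.jar"; an optional "-suffix" is ignored.
GUAVA_JAR = re.compile(r"^guava-(\d+(?:\.\d+)*)(?:-[\w.]+)?\.jar$")


def guava_versions(jar_names):
    """Return the Guava versions found in jar_names as tuples of ints."""
    versions = []
    for name in jar_names:
        match = GUAVA_JAR.match(name)
        if match:
            versions.append(tuple(int(part) for part in match.group(1).split(".")))
    return versions


def has_required_guava(jar_names, minimum=(16, 0, 1)):
    """True if any listed Guava jar is at least the minimum version."""
    return any(version >= minimum for version in guava_versions(jar_names))


def check_interpreter_dir(directory):
    """Apply the check to the jar files found directly in `directory`."""
    names = [path.name for path in Path(directory).glob("*.jar")]
    return has_required_guava(names)


print(has_required_guava(["guava-15.0.jar"]))
print(has_required_guava(["guava-16.0.1.jar"]))
```

`check_interpreter_dir` could be pointed at the `interpreter/cassandra` folder under the Zeppelin installation directory mentioned in the comment above.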



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by seufagner <gi...@git.apache.org>.
Github user seufagner commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-200502995
  
    I downloaded the binary version (zeppelin-0.5.6-incubating-bin-all.tgz) and I still get the same error.



[GitHub] incubator-zeppelin pull request: ZEPPELIN-179: Cassandra Interpret...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/162#issuecomment-125898630
  
    My build command was `mvn -Pspark-1.4 -Dspark.version=1.4.1 -Phadoop-2.6 -Pyarn package`

