Posted to dev@bigtop.apache.org by "Sean Mackrory (JIRA)" <ji...@apache.org> on 2014/01/14 22:41:19 UTC

[jira] [Commented] (BIGTOP-1181) Add pyspark to spark package

    [ https://issues.apache.org/jira/browse/BIGTOP-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871238#comment-13871238 ] 

Sean Mackrory commented on BIGTOP-1181:
---------------------------------------

pyspark is a Python shell for Spark. A couple of quick examples that I tested:
{code}sc.parallelize([1,2,3]).sum(){code}
And assuming you have a dictionary file in HDFS at /usr/share/dict/words:
{code}sc.textFile("/usr/share/dict/words").filter(lambda w: w.startswith("spar")).take(5){code}
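For anyone who wants to rerun both checks in one go, here is a minimal sketch that wraps them in a helper. The names starts_with and run_examples are hypothetical (they are not part of the patch or of Spark), and it assumes sc is the SparkContext that the pyspark shell predefines and that the path points at a newline-separated word list Spark can read.

```python
def starts_with(prefix):
    """Return a predicate that is true for strings beginning with prefix."""
    return lambda word: word.startswith(prefix)


def run_examples(sc, words_path, prefix="spar", limit=5):
    """Run both smoke tests against an existing SparkContext.

    sc: the SparkContext (predefined as `sc` inside the pyspark shell).
    words_path: assumed location of a word list, one word per line.
    """
    # Distribute a small list and sum it on the cluster.
    total = sc.parallelize([1, 2, 3]).sum()
    # Read the word list, keep words with the given prefix, fetch a few.
    matches = sc.textFile(words_path).filter(starts_with(prefix)).take(limit)
    return total, matches
```

Inside the shell this would be called as run_examples(sc, "/usr/share/dict/words"). Nothing in the helper imports pyspark itself, so it only does real work when handed a live context.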


> Add pyspark to spark package
> ----------------------------
>
>                 Key: BIGTOP-1181
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1181
>             Project: Bigtop
>          Issue Type: Bug
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: 0001-BIGTOP-1181.-Add-pyspark-to-spark-package.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)