Posted to user@spark.apache.org by Diana Carroll <dc...@cloudera.com> on 2014/01/09 17:15:04 UTC
hadoop files in Python
Hello! I'm exploring custom input formats, which it seems I can use
in Scala via sc.newAPIHadoopFile or sc.newAPIHadoopRDD.
My question is: is it possible to do this in Python? The Python API
doesn't have (AFAICT) the sc.hadoop* or sc.newAPIHadoop* functions.
Thanks,
Diana
Re: hadoop files in Python
Posted by Josh Rosen <ro...@gmail.com>.
There's an open pull request to add support for additional Hadoop file
formats to PySpark: https://github.com/apache/incubator-spark/pull/263
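
For anyone finding this thread later: as far as I know, later PySpark releases (1.1 onward) expose newAPIHadoopFile on the SparkContext, taking the Hadoop class names as strings. Below is a minimal sketch under that assumption; it does not work on the PySpark version discussed in this thread. The helper name load_with_new_api and the example path are hypothetical, and the stock TextInputFormat stands in for a custom input format, which would also need to be on the Spark classpath.

```python
# Hadoop classes are referenced by fully qualified name, as strings.
# TextInputFormat is a stand-in here; swap in a custom InputFormat class name.
INPUT_FORMAT = "org.apache.hadoop.mapreduce.lib.input.TextInputFormat"
KEY_CLASS = "org.apache.hadoop.io.LongWritable"
VALUE_CLASS = "org.apache.hadoop.io.Text"


def load_with_new_api(sc, path, input_format=INPUT_FORMAT,
                      key_class=KEY_CLASS, value_class=VALUE_CLASS):
    """Return an RDD of (key, value) pairs read via a new-API InputFormat.

    Assumption: `sc` is a pyspark.SparkContext from a release that
    provides newAPIHadoopFile (hypothetical helper, not part of PySpark).
    """
    return sc.newAPIHadoopFile(path, input_format, key_class, value_class)


# Usage sketch (requires a PySpark installation; the path is hypothetical):
#   from pyspark import SparkContext
#   sc = SparkContext("local", "hadoop-input-demo")
#   rdd = load_with_new_api(sc, "hdfs:///path/to/data")
#   print(rdd.take(5))
```

The helper takes the context as an argument so the same call can be tried against whatever SparkContext you already have in the shell.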