Posted to user@spark.apache.org by "Nick R. Katsipoulakis" <ka...@cs.pitt.edu> on 2014/07/10 21:13:25 UTC

Use of the SparkContext.hadoopRDD function in Scala code

Hello,

I want to run an MLlib task using the Scala API that creates a hadoopRDD from my
CustomInputFormat. According to the Spark API:

def hadoopRDD[K, V](
    conf: JobConf,
    inputFormatClass: Class[_ <: org.apache.hadoop.mapred.InputFormat[K, V]],
    keyClass: Class[K],
    valueClass: Class[V],
    minSplits: Int): RDD[(K, V)]

(RDD: http://spark.apache.org/docs/0.6.2/api/core/spark/RDD.html)


I need to provide a JobConf object for my Hadoop installation. However, I do not
know how I can define that object inside my code. How should I do that?
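
To make it more concrete, this is roughly what I am trying to write. CustomInputFormat
is my own Java class, and the package name, fs.defaultFS value, input path, and
key/value types below are just placeholders for my setup:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
import org.apache.spark.{SparkConf, SparkContext}
import mypkg.CustomInputFormat  // my own Java InputFormat (placeholder package)

object CustomInputFormatJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CustomInputFormat test"))

    // Build the JobConf the same way a plain MapReduce job would; new JobConf()
    // picks up any *-site.xml files on the classpath, and values can be set explicitly.
    val jobConf = new JobConf()
    jobConf.set("fs.defaultFS", "hdfs://namenode:8020")       // placeholder for my cluster
    FileInputFormat.setInputPaths(jobConf, "/path/to/input")  // placeholder input path

    // Assuming CustomInputFormat extends org.apache.hadoop.mapred.InputFormat[LongWritable, Text]
    val records = sc.hadoopRDD(
      jobConf,
      classOf[CustomInputFormat],
      classOf[LongWritable],
      classOf[Text],
      2 /* minSplits */)

    println(records.count())
    sc.stop()
  }
}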

Also, can I just import my Java classes for my CustomInputFormat and compile the
code (even though they are in Java)?
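
For context, my project is currently laid out like this (names are placeholders),
with the existing Java InputFormat next to the new Scala driver:

my-mllib-job/
  src/main/java/mypkg/CustomInputFormat.java
  src/main/scala/mypkg/CustomInputFormatJob.scala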

Thank you,
Nick