Posted to user@spark.apache.org by Mohit Anchlia <mo...@gmail.com> on 2015/08/13 19:49:28 UTC
Spark RuntimeException hadoop output format
I have this call trying to save to HDFS 2.6:
wordCounts.saveAsNewAPIHadoopFiles("prefix", "txt");
but I am getting the following:
java.lang.RuntimeException: class scala.runtime.Nothing$ not
org.apache.hadoop.mapreduce.OutputFormat
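This error usually means the key, value, and OutputFormat classes were never supplied: the two-argument overload leaves them to Scala's type inference, which resolves to Nothing. A minimal sketch of the fully specified call, untested here; the path is illustrative, and the Text/TextOutputFormat choices mirror the snippets later in this thread rather than a confirmed fix:

```java
// Sketch only: supply key, value, and output-format classes explicitly so
// Spark does not infer scala.runtime.Nothing$ for them. The path below is
// an assumption for illustration.
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

wordCounts.saveAsNewAPIHadoopFiles(
    "hdfs:///tmp/out",       // prefix: full path plus file-name stem
    "txt",                   // suffix
    Text.class,              // key class
    Text.class,              // value class
    TextOutputFormat.class); // new-API (mapreduce) output format
```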
Re: Spark RuntimeException hadoop output format
Posted by Ted Yu <yu...@gmail.com>.
First you create the file:
final File outputFile = new File(outputPath);
Then you write to it:
Files.append(counts + "\n", outputFile, Charset.defaultCharset());
Cheers
On Fri, Aug 14, 2015 at 4:38 PM, Mohit Anchlia <mo...@gmail.com>
wrote:
Re: Spark RuntimeException hadoop output format
Posted by Mohit Anchlia <mo...@gmail.com>.
I thought prefix meant the output path? What's the purpose of prefix and
where do I specify the path if not in prefix?
On Fri, Aug 14, 2015 at 4:36 PM, Ted Yu <yu...@gmail.com> wrote:
Re: Spark RuntimeException hadoop output format
Posted by Ted Yu <yu...@gmail.com>.
Please take a look at JavaPairDStream.scala:
def saveAsHadoopFiles[F <: OutputFormat[_, _]](
    prefix: String,
    suffix: String,
    keyClass: Class[_],
    valueClass: Class[_],
    outputFormatClass: Class[F]) {
Did you intend to use outputPath as prefix ?
Cheers
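To make the prefix/suffix semantics concrete: Spark Streaming builds each batch's output location by appending the batch time in milliseconds and the suffix to the prefix, so the prefix should carry the full output path. A small, self-contained sketch of that naming rule (it mirrors Spark's internal rddToFileName helper; the class name here is made up for illustration):

```java
// Self-contained sketch of how Spark Streaming names per-batch output from
// (prefix, suffix): prefix + "-" + batchTimeMs + "." + suffix.
public class BatchFileName {
    static String rddToFileName(String prefix, String suffix, long batchTimeMs) {
        // An empty suffix drops the trailing "." entirely
        if (suffix == null || suffix.isEmpty()) {
            return prefix + "-" + batchTimeMs;
        }
        return prefix + "-" + batchTimeMs + "." + suffix;
    }

    public static void main(String[] args) {
        // With prefix "/tmp/out" this reproduces the listing shown later in
        // the thread, e.g. /tmp/out-1439495124000.txt
        System.out.println(rddToFileName("/tmp/out", "txt", 1439495124000L));
    }
}
```

This is why passing "prefix" literally, as in the original post, scatters output under the application's working directory instead of the intended location.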
On Fri, Aug 14, 2015 at 1:36 PM, Mohit Anchlia <mo...@gmail.com>
wrote:
Re: Spark RuntimeException hadoop output format
Posted by Mohit Anchlia <mo...@gmail.com>.
Spark 1.3
Code:
wordCounts.foreachRDD(new Function2<JavaPairRDD<String, Integer>, Time, Void>() {
    @Override
    public Void call(JavaPairRDD<String, Integer> rdd, Time time) throws IOException {
        String counts = "Counts at time " + time + " " + rdd.collect();
        System.out.println(counts);
        System.out.println("Appending to " + outputFile.getAbsolutePath());
        Files.append(counts + "\n", outputFile, Charset.defaultCharset());
        return null;
    }
});

wordCounts.saveAsHadoopFiles(outputPath, "txt", Text.class, Text.class,
    TextOutputFormat.class);
What do I need to check in the namenode? I see 0-byte files like this:
drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45
/tmp/out-1439495124000.txt
drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45
/tmp/out-1439495125000.txt
drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45
/tmp/out-1439495126000.txt
drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45
/tmp/out-1439495127000.txt
drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45
/tmp/out-1439495128000.txt
However, I also wrote data to a local file on the local file system for
verification and I see the data:
$ ls -ltr !$
ls -ltr /tmp/out
-rw-r--r-- 1 yarn yarn 5230 Aug 13 15:45 /tmp/out
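One thing worth noting about the HDFS listing above: the drwxr-xr-x entries are directories, not files. saveAsHadoopFiles writes one directory per batch, with the records in part-NNNNN files inside it, and a directory listing reports size 0 for a directory regardless of what it contains. A local-filesystem sketch of the same effect:

```java
// Sketch: a directory's listed "size" says nothing about the data inside
// it; the records live in part-NNNNN files within the per-batch directory.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirSizeDemo {
    public static void main(String[] args) throws IOException {
        // Mimic one per-batch output directory with a single part file
        Path dir = Files.createTempDirectory("out-1439495124000.txt");
        Files.write(dir.resolve("part-00000"), "word\t1\n".getBytes());
        // The part file holds the data even though a listing of the parent
        // shows the directory itself at size 0
        System.out.println(Files.size(dir.resolve("part-00000")));
    }
}
```

So checking inside one of the per-batch directories (e.g. with `hdfs dfs -ls` on it) is the next step; if the part files in there are themselves empty, the RDD for that batch interval likely contained no records.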
On Fri, Aug 14, 2015 at 6:15 AM, Ted Yu <yu...@gmail.com> wrote:
Re: Spark RuntimeException hadoop output format
Posted by Ted Yu <yu...@gmail.com>.
Which Spark release are you using?
Can you show us a snippet of your code?
Have you checked the namenode log?
Thanks
> On Aug 13, 2015, at 10:21 PM, Mohit Anchlia <mo...@gmail.com> wrote:
Re: Spark RuntimeException hadoop output format
Posted by Mohit Anchlia <mo...@gmail.com>.
I was able to get this working by using an alternative method; however, I only see 0-byte files in Hadoop. I've verified that the output does exist in the logs, but it's missing from HDFS.
On Thu, Aug 13, 2015 at 10:49 AM, Mohit Anchlia <mo...@gmail.com>
wrote: