Posted to user@hadoop.apache.org by Chengi Liu <ch...@gmail.com> on 2013/06/04 20:09:38 UTC

Reducer to output only json

Hi,

I have the following reducer class:

public static class TokenCounterReducer
    extends Reducer<Text, Text, Text, Text> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {

        // String[] fields = s.split("\t", -1)
        JSONObject jsn = new JSONObject();
        int sum = 0;
        for (Text value : values) {
            // Each value looks like "<comma-separated targets>\t<source>".
            String[] vals = value.toString().split("\t");
            String[] targetNodes = vals[0].split(",", -1);
            // Each iteration overwrites the same two keys in jsn.
            jsn.put("source", vals[1]);
            jsn.put("target", targetNodes);
            // sum += value.get();
        }
        // Nothing is written yet; this is the part I am asking about.
        // context.write(key, new Text(sum));
    }
}

I want to save that JSON to HDFS.

This was trivial in Hadoop Streaming, but how do I do it in Hadoop Java?
Thanks

Re: Reducer to output only json

Posted by Mohammad Tariq <do...@gmail.com>.
If you need to save the JSON as it is, you could implement OutputFormat
to create your own custom output format, which lets you write the data
however you wish.
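
As a rough illustration, the core of such a format is a record writer that
drops the key and writes only the value, one record per line. The sketch
below shows just that write logic with plain JDK classes so it can be run on
its own. ValueOnlyLineWriter is a hypothetical name, not a Hadoop class; in a
real job this logic would presumably sit inside a subclass of Hadoop's
RecordWriter returned by the custom OutputFormat, writing to an HDFS stream
rather than an arbitrary OutputStream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the write logic of a custom record writer. */
final class ValueOnlyLineWriter implements AutoCloseable {
    private final OutputStream out;

    ValueOnlyLineWriter(OutputStream out) {
        this.out = out;
    }

    /** Ignores the key and writes the value as UTF-8 followed by a newline. */
    void write(String key, String value) {
        try {
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the underlying stream. */
    @Override
    public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Written this way, the output file contains only the JSON text, with no key
and no separator in front of it.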

Warm Regards,
Tariq
cloudfront.blogspot.com


On Tue, Jun 4, 2013 at 11:39 PM, Chengi Liu <ch...@gmail.com> wrote:

> Hi,
>
>  I have the following reducer class
>
> public static class TokenCounterReducer
>     extends Reducer<Text, Text, Text, Text> {
>     public void reduce(Text key, Iterable<Text> values, Context context)
>         throws IOException, InterruptedException {
>
>     //String[] fields = s.split("\t", -1)
>     JSONObject jsn = new JSONObject();
>         int sum = 0;
>         for (Text value : values) {
>         String[] vals = value.toString().split("\t");
>         String[] targetNodes = vals[0].toString().split(",",-1);
>         jsn.put("source",vals[1] );
>         jsn.put("target",targetNodes);
>             //sum += value.get();
>         }
>        // context.write(key, new Text(sum));
>     }
> }
>
> I want to save that json to hdfs?
>
> It was very trivial in hadoop streaming.. but how do i do it in hadoop
> java?
> Thanks
>

Re: Reducer to output only json

Posted by Shahab Yunus <sh...@gmail.com>.
Chengi,

You can also see this for pointers:
http://java.dzone.com/articles/hadoop-practice

Regards,
Shahab


On Tue, Jun 4, 2013 at 4:15 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Yes...This should do the trick.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Wed, Jun 5, 2013 at 1:38 AM, Niels Basjes <Ni...@basjes.nl> wrote:
>
>> Have you tried something like this (i do not have a pc here to check this
>> code)
>>
>> context.write(NullWritable.get(), new Text(jsn.toString()));
>> On Jun 4, 2013 8:10 PM, "Chengi Liu" <ch...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>  I have the following reducer class
>>>
>>> public static class TokenCounterReducer
>>>     extends Reducer<Text, Text, Text, Text> {
>>>     public void reduce(Text key, Iterable<Text> values, Context context)
>>>         throws IOException, InterruptedException {
>>>
>>>     //String[] fields = s.split("\t", -1)
>>>     JSONObject jsn = new JSONObject();
>>>         int sum = 0;
>>>         for (Text value : values) {
>>>         String[] vals = value.toString().split("\t");
>>>         String[] targetNodes = vals[0].toString().split(",",-1);
>>>         jsn.put("source",vals[1] );
>>>         jsn.put("target",targetNodes);
>>>             //sum += value.get();
>>>         }
>>>        // context.write(key, new Text(sum));
>>>     }
>>> }
>>>
>>> I want to save that json to hdfs?
>>>
>>> It was very trivial in hadoop streaming.. but how do i do it in hadoop
>>> java?
>>> Thanks
>>>
>>
>

Re: Reducer to output only json

Posted by Mohammad Tariq <do...@gmail.com>.
Yes...This should do the trick.

Warm Regards,
Tariq
cloudfront.blogspot.com


On Wed, Jun 5, 2013 at 1:38 AM, Niels Basjes <Ni...@basjes.nl> wrote:

> Have you tried something like this (i do not have a pc here to check this
> code)
>
> context.write(NullWritable.get(), new Text(jsn.toString()));
> On Jun 4, 2013 8:10 PM, "Chengi Liu" <ch...@gmail.com> wrote:
>
>> Hi,
>>
>>  I have the following reducer class
>>
>> public static class TokenCounterReducer
>>     extends Reducer<Text, Text, Text, Text> {
>>     public void reduce(Text key, Iterable<Text> values, Context context)
>>         throws IOException, InterruptedException {
>>
>>     //String[] fields = s.split("\t", -1)
>>     JSONObject jsn = new JSONObject();
>>         int sum = 0;
>>         for (Text value : values) {
>>         String[] vals = value.toString().split("\t");
>>         String[] targetNodes = vals[0].toString().split(",",-1);
>>         jsn.put("source",vals[1] );
>>         jsn.put("target",targetNodes);
>>             //sum += value.get();
>>         }
>>        // context.write(key, new Text(sum));
>>     }
>> }
>>
>> I want to save that json to hdfs?
>>
>> It was very trivial in hadoop streaming.. but how do i do it in hadoop
>> java?
>> Thanks
>>
>

Re: Reducer to output only json

Posted by Niels Basjes <Ni...@basjes.nl>.
Have you tried something like this? (I do not have a PC here to check this
code.)

context.write(NullWritable.get(), new Text(jsn.toString()));
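
For a call like this to compile, the reducer's output key type would have to
be NullWritable (that is, Reducer<Text, Text, NullWritable, Text>), and the
key is obtained with NullWritable.get(). With the default TextOutputFormat a
NullWritable key is left out of the output, so each line holds only the JSON
text. Below is a minimal sketch of building that line. The thread does not
say which JSON library is in use, so this assumes none and assembles the
string with the JDK only; EdgeJson, quote and toJsonLine are hypothetical
names, and emitting one object per input value (instead of reusing one
JSONObject across the loop) is my assumption about the intended output.

```java
/** Hypothetical helper: turns one reducer value into a single-line JSON string. */
final class EdgeJson {
    private EdgeJson() {}

    /** Wraps s in double quotes and escapes the characters JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Input layout taken from the thread: "<t1>,<t2>,...<TAB><source>". */
    static String toJsonLine(String value) {
        String[] vals = value.split("\t", -1);
        if (vals.length < 2) {
            throw new IllegalArgumentException("expected targets<TAB>source: " + value);
        }
        String[] targets = vals[0].split(",", -1);
        StringBuilder sb = new StringBuilder();
        sb.append("{\"source\":").append(quote(vals[1])).append(",\"target\":[");
        for (int i = 0; i < targets.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(targets[i]));
        }
        return sb.append("]}").toString();
    }
}
```

Inside the reduce loop this could then be used as
context.write(NullWritable.get(), new Text(EdgeJson.toJsonLine(value.toString())));
with job.setOutputKeyClass(NullWritable.class) set in the driver.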
On Jun 4, 2013 8:10 PM, "Chengi Liu" <ch...@gmail.com> wrote:

> Hi,
>
>  I have the following reducer class
>
> public static class TokenCounterReducer
>     extends Reducer<Text, Text, Text, Text> {
>     public void reduce(Text key, Iterable<Text> values, Context context)
>         throws IOException, InterruptedException {
>
>     //String[] fields = s.split("\t", -1)
>     JSONObject jsn = new JSONObject();
>         int sum = 0;
>         for (Text value : values) {
>         String[] vals = value.toString().split("\t");
>         String[] targetNodes = vals[0].toString().split(",",-1);
>         jsn.put("source",vals[1] );
>         jsn.put("target",targetNodes);
>             //sum += value.get();
>         }
>        // context.write(key, new Text(sum));
>     }
> }
>
> I want to save that json to hdfs?
>
> It was very trivial in hadoop streaming.. but how do i do it in hadoop
> java?
> Thanks
>
