Posted to hdfs-user@hadoop.apache.org by dyuti a <ha...@gmail.com> on 2012/11/30 13:21:38 UTC

Mapper outputs an empty file

Hi All,
I am trying XML processing in Hadoop and used the code below inside the map
method. It produces an empty output file (no reducer class is used). Is there
anything wrong?

//code used inside map method
public void map(LongWritable key, Text value1, Context context)
        throws IOException, InterruptedException {
    String xmlString = value1.toString();
    SAXBuilder builder = new SAXBuilder();
    Reader in = new StringReader(xmlString);
    String value = "";
    try {
        Document doc = builder.build(in);
        Element rootNode = doc.getRootElement();
        List<Element> list = rootNode.getChildren("staff");
        for (int i = 0; i < list.size(); i++) {
            Element node = (Element) list.get(i);
            String tag1 = node.getChildText("firstname");
            String tag2 = node.getChildText("lastname");
            String tag3 = node.getChildText("nickname");
            String tag4 = node.getChildText("salary");

            value = tag1 + "," + tag2 + "," + tag3 + "," + tag4;
            context.write(NullWritable.get(), new Text(value));
        }
    } followed by catch statements....................

//xml input file
<?xml version="1.0" encoding="UTF-8"?>
<company>
<staff>
<firstname>yong</firstname>
<lastname>mook kim</lastname>
<nickname>mkyong</nickname>
<salary>100000</salary>
</staff>
<staff>
<firstname>low121</firstname>
<lastname>yin fong1</lastname>
<nickname>fong fong1</nickname>
<salary>2000001</salary>
</staff>
</company>

Thanks for your help!

Regards,
dti

Re: Mapper outputs an empty file

Posted by Harsh J <ha...@cloudera.com>.
The lack of conditional logic suggests that an empty file should never
occur for a _successful_ parse.

So the question boils down to whether parsing succeeds. What exactly is your
RecordReader/InputFormat here? TextInputFormat reads files line by line and
is not suited to parsing whole XML documents, which is what your code relies
on: it assumes that a single KV pair passed to the mapper contains the entire
document for the parser to run on.

If your catching logic is catching and logging exceptions, I suggest
taking a look at the Mapper's task logs to see your actual error here.
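
To make the line-by-line point concrete, here is a minimal sketch that parses the sample document once as a whole and once one line at a time. It is an illustration under stated assumptions, not the code from this thread: it swaps in the JDK's built-in DOM parser for JDOM so that it runs without extra jars, and the class and method names (LineVersusDocument, staffRows, parseLineByLine) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; uses the JDK DOM parser instead of JDOM (assumption).
final class LineVersusDocument {

    static final String SAMPLE = String.join("\n",
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
            "<company>",
            "<staff>",
            "<firstname>yong</firstname>",
            "<lastname>mook kim</lastname>",
            "<nickname>mkyong</nickname>",
            "<salary>100000</salary>",
            "</staff>",
            "<staff>",
            "<firstname>low121</firstname>",
            "<lastname>yin fong1</lastname>",
            "<nickname>fong fong1</nickname>",
            "<salary>2000001</salary>",
            "</staff>",
            "</company>");

    /** Returns one CSV row per staff element; throws if xml is not well-formed. */
    static List<String> staffRows(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // avoid stderr noise on bad input
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Note: this matches staff elements at any depth below the root.
            NodeList staff = doc.getDocumentElement().getElementsByTagName("staff");
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < staff.getLength(); i++) {
                Element node = (Element) staff.item(i);
                rows.add(text(node, "firstname") + "," + text(node, "lastname") + ","
                        + text(node, "nickname") + "," + text(node, "salary"));
            }
            return rows;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML: " + e.getMessage(), e);
        }
    }

    /** Text of the first descendant named tag, or null if there is none. */
    static String text(Element parent, String tag) {
        NodeList hits = parent.getElementsByTagName(tag);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    /** Parses every line separately; returns {rows emitted, lines that failed to parse}. */
    static int[] parseLineByLine(String xml) {
        int rows = 0;
        int failures = 0;
        for (String line : xml.split("\n")) {
            try {
                rows += staffRows(line).size();
            } catch (IllegalArgumentException e) {
                failures++; // counted here; a swallowed exception would hide this
            }
        }
        return new int[] {rows, failures};
    }

    public static void main(String[] args) {
        System.out.println("whole document: " + staffRows(SAMPLE));
        int[] result = parseLineByLine(SAMPLE);
        System.out.println("line by line: rows=" + result[0] + ", failed lines=" + result[1]);
    }
}
```

Parsed as a whole, the sample yields one row per staff element. Parsed line by line, no row is emitted: lines such as <company> or </staff> are rejected by the parser, while a line such as <firstname>yong</firstname> is well-formed by itself but contains no staff element.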




-- 
Harsh J

Re: Mapper outputs an empty file

Posted by dyuti a <ha...@gmail.com>.
Hey Harsh/Bertrand,
Thank you so much. The problem was with packaging the JDOM jars inside the
jar file. It is fixed now.

Regards,
dti
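
A packaging problem like this can be checked directly by asking the class loader whether the parser class is visible. The following is a minimal sketch, not something from this thread: the class name ClasspathCheck and the method isLoadable are hypothetical, and because the thread does not say which JDOM version was used, it probes both the JDOM 1.x name (org.jdom.input.SAXBuilder) and the JDOM 2 name (org.jdom2.input.SAXBuilder).

```java
// Hypothetical helper for checking whether a class is visible at runtime.
final class ClasspathCheck {

    /** Returns true if className can be found by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: one of these is the SAXBuilder the mapper imports.
        String[] candidates = {"org.jdom.input.SAXBuilder", "org.jdom2.input.SAXBuilder"};
        for (String name : candidates) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running the same check from inside the task, for example at the start of the mapper, would show whether the jar is visible where the map method actually executes, which can differ from the machine that submits the job.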


On Fri, Nov 30, 2012 at 6:36 PM, Bertrand Dechoux <de...@gmail.com> wrote:

> You should write unit tests (MRUnit) and debug if that is not enough.
> I would assume that you are reading your file line by line. A single
> line is generally not a valid XML document, so an exception is thrown and
> then caught, but without any logs or counters.
>
> Regards
>
> Bertrand
>
>
>
>
> --
> Bertrand Dechoux
>

Re: Mapper outputs an empty file

Posted by Bertrand Dechoux <de...@gmail.com>.
You should write unit tests (MRUnit) and debug if that is not enough.
I would assume that you are reading your file line by line. A single line
is generally not a valid XML document, so an exception is thrown and then
caught, but without any logs or counters.

Regards

Bertrand



-- 
Bertrand Dechoux
