Posted to common-user@hadoop.apache.org by "Korb, Michael [USA]" <Ko...@bah.com> on 2011/02/07 16:51:40 UTC

Hadoop XML Error

I am running two instances of Hadoop on a cluster and want to copy all the data from hadoop1 to the updated hadoop2. From hadoop2, I am running the command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/" where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of hadoop2. I get the following error:

11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
[Fatal Error] :1:215: XML document structures must start and end within the same entity.
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: invalid xml directory content
	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
	at org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
	at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
	at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
	... 9 more

I am fairly certain that none of the XML files are malformed or corrupted. This thread (http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html) discusses a similar problem caused by file permissions but doesn't seem to offer a solution. Any help would be appreciated.
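
One way to see what the parser is choking on is to fetch the raw listing directly. The [Fatal Error] points at character 215 of line 1 of the response, so the first few hundred bytes should show where it gets cut off. A minimal sketch, assuming the 0.20-era listPaths servlet that hftp reads from (the ugi query parameter is a guess for this setup):

curl -sv "http://mc00001:50070/listPaths/?ugi=hdfs" | head -c 500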

Thanks,
Mike

Re: Hadoop XML Error

Posted by Ted Dunning <td...@maprtech.com>.
This is due to the security API not being available.  You are crossing from
a cluster with security to one without, and that is causing confusion.
Presumably your client assumes the API is available and your hadoop library
doesn't provide it.

Check your classpath very carefully, looking for version assumptions and
confusion.
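
For instance, something along these lines would show what the launcher is
actually loading (a sketch; the classpath subcommand should be in the
0.20-era bin/hadoop, but if your copy lacks it, echo $CLASSPATH from inside
the script instead):

which hadoop
hadoop classpath | tr ':' '\n' | grep -i hadoop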

On Mon, Feb 7, 2011 at 11:43 AM, Korb, Michael [USA]
<Ko...@bah.com> wrote:

> We're migrating from CDH3b3 to a recent build of 0.20-append published by
> Ryan Rawson. This isn't something covered by normal upgrade scripts. I've
> tried several commands with different protocols and port numbers, but now
> keep getting the same error:
>
> 11/02/07 14:35:06 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 14:35:06 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> Has anyone seen this before? What might be causing it?
>
> Thanks,
> Mike
>
>
> ________________________________________
> From: Xavier Stevens [xstevens@mozilla.com]
> Sent: Monday, February 07, 2011 1:47 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> You don't need to distcp to upgrade a cluster.  You just need to go
> through the upgrade process.  Bumping from 0.20.2 to 0.20.3, you might
> not even need to do anything other than stop the cluster processes and
> then restart them using the 0.20.3 install.
>
> Here's a link to the upgrade and rollback docs:
>
> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Upgrade+and+Rollback
>
>
> -Xavier
>
>
> On 2/7/11 10:22 AM, Korb, Michael [USA] wrote:
> > Xavier,
> >
> > Yes, I'm trying to upgrade from 0.20.2 to 0.20.3. Both are running on the
> same cluster. I'm trying to distcp everything from the 0.20.2 instance over
> to the 0.20.3 instance, without any luck yet.
> >
> > Mike
> > ________________________________________
> > From: Xavier Stevens [xstevens@mozilla.com]
> > Sent: Monday, February 07, 2011 1:20 PM
> > To: common-user@hadoop.apache.org
> > Subject: Re: Hadoop XML Error
> >
> > Mike,
> >
> > Are you just trying to upgrade then?  I've never heard of anyone trying
> > to run two versions of hadoop on the same cluster.  I don't think
> > that's even possible, but maybe someone else knows.
> >
> > -Xavier
> >
> >
> > On 2/7/11 10:03 AM, Korb, Michael [USA] wrote:
> >> Xavier,
> >>
> >> Both instances of Hadoop are running on the same cluster. I tried the
> command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/
> hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on
> mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting
> this:
> >>
> >> 11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> >> 11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
> >> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
> >>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
> >>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
> >>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> >>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> >>
> >> Thanks,
> >> Mike
> >> ________________________________________
> >> From: Xavier Stevens [xstevens@mozilla.com]
> >> Sent: Monday, February 07, 2011 12:56 PM
> >> To: common-user@hadoop.apache.org
> >> Subject: Re: Hadoop XML Error
> >>
> >> Mike,
> >>
> >> I've seen this when a directory has been removed or gone missing since
> >> distcp started stating the source files.  You'll probably want to
> >> make sure that no code or person is messing with the filesystem during
> >> your copy.  I would make sure you only have one version of hadoop
> >> installed on your destination cluster.  Also you should use hdfs as the
> >> destination protocol and run the command as the hdfs user if you're
> >> using hadoop security.
> >>
> >> Example (Running on destination cluster):
> >>
> >> sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
> >> hftp://mc00001:50070/ hdfs://mc00000:8020/
> >>
> >>  Cheers,
> >>
> >>
> >> -Xavier
> >>
> >>
> >> On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
> >>> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the
> DistCp Guide but I think I know the problem. I'm trying to run the command
> on the destination cluster, but when I call hadoop, I think the path is set
> to run the hadoop1 executable. So I tried going to the hadoop2 install and
> running it with "./hadoop distcp -update hftp://mc00001:50070/
> hdfs://mc00000:55310/" but now I get this error:
> >>>
> >>> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> >>> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
> >>> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
> >>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
> >>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
> >>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> >>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> >>>
> >>>
> >>> ________________________________________
> >>> From: Sonal Goyal [sonalgoyal4@gmail.com]
> >>> Sent: Monday, February 07, 2011 12:11 PM
> >>> To: common-user@hadoop.apache.org
> >>> Subject: Re: Hadoop XML Error
> >>>
> >>> Mike,
> >>>
> >>> This error is not related to the XML files you are trying to copy being
> >>> malformed; it means that, for some reason, the source or destination
> >>> listing cannot be retrieved or parsed. Are you trying to copy between
> >>> different versions of clusters? As far as I know, your destination should
> >>> be writable, and distcp should be run from the destination cluster. See
> >>> more here:
> >>> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
> >>>
> >>> Let us know how it goes.
> >>>
> >>> Thanks and Regards,
> >>> Sonal
> >>> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
> >>> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
> >>> Nube Technologies <http://www.nubetech.co>
> >>>
> >>> <http://in.linkedin.com/in/sonalgoyal>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Korb_Michael@bah.com> wrote:
> >>>
> >>>> I am running two instances of Hadoop on a cluster and want to copy all
> the
> >>>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running
> the
> >>>> command "hadoop distcp -update hftp://mc00001:50070/
> hftp://mc00000:50070/"
> >>>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode
> of
> >>>> hadoop2. I get the following error:
> >>>>
> >>>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> >>>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> >>>> [Fatal Error] :1:215: XML document structures must start and end
> within the
> >>>> same entity.
> >>>> With failures, global counters are inaccurate; consider running with
> -i
> >>>> Copy failed: java.io.IOException: invalid xml directory content
> >>>>        at
> >>>>
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
> >>>>        at
> >>>>
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
> >>>>        at
> >>>>
> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
> >>>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
> >>>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
> >>>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
> >>>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> >>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> >>>> Caused by: org.xml.sax.SAXParseException: XML document structures must
> >>>> start and end within the same entity.
> >>>>        at
> >>>>
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
> >>>>        at
> >>>>
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
> >>>>        ... 9 more
> >>>>
> >>>> I am fairly certain that none of the XML files are malformed or
> corrupted.
> >>>> This thread (
> >>>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
> >>>> discusses a similar problem caused by file permissions but doesn't
> seem to
> >>>> offer a solution. Any help would be appreciated.
> >>>>
> >>>> Thanks,
> >>>> Mike
> >>>>
>

RE: Hadoop XML Error

Posted by "Korb, Michael [USA]" <Ko...@bah.com>.
We're migrating from CDH3b3 to a recent build of 0.20-append published by Ryan Rawson. This isn't something covered by normal upgrade scripts. I've tried several commands with different protocols and port numbers, but now keep getting the same error:

11/02/07 14:35:06 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
11/02/07 14:35:06 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
	at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
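
A quick way to check whether the JobConf actually on the classpath defines
getCredentials() at all (the jar path here is a placeholder for this
install):

javap -classpath /path/to/hadoop-core.jar org.apache.hadoop.mapred.JobConf | grep -i credentials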

Has anyone seen this before? What might be causing it?

Thanks,
Mike


________________________________________
From: Xavier Stevens [xstevens@mozilla.com]
Sent: Monday, February 07, 2011 1:47 PM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop XML Error

You don't need to distcp to upgrade a cluster.  You just need to go
through the upgrade process.  Bumping from 0.20.2 to 0.20.3, you might
not even need to do anything other than stop the cluster processes and
then restart them using the 0.20.3 install.

Here's a link to the upgrade and rollback docs:
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Upgrade+and+Rollback


-Xavier


On 2/7/11 10:22 AM, Korb, Michael [USA] wrote:
> Xavier,
>
> Yes, I'm trying to upgrade from 0.20.2 to 0.20.3. Both are running on the same cluster. I'm trying to distcp everything from the 0.20.2 instance over to the 0.20.3 instance, without any luck yet.
>
> Mike
> ________________________________________
> From: Xavier Stevens [xstevens@mozilla.com]
> Sent: Monday, February 07, 2011 1:20 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> Are you just trying to upgrade then?  I've never heard of anyone trying
> to run two versions of hadoop on the same cluster.  I don't think
> that's even possible, but maybe someone else knows.
>
> -Xavier
>
>
> On 2/7/11 10:03 AM, Korb, Michael [USA] wrote:
>> Xavier,
>>
>> Both instances of Hadoop are running on the same cluster. I tried the command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting this:
>>
>> 11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>> Thanks,
>> Mike
>> ________________________________________
>> From: Xavier Stevens [xstevens@mozilla.com]
>> Sent: Monday, February 07, 2011 12:56 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: Hadoop XML Error
>>
>> Mike,
>>
>> I've seen this when a directory has been removed or gone missing since
>> distcp started stating the source files.  You'll probably want to
>> make sure that no code or person is messing with the filesystem during
>> your copy.  I would make sure you only have one version of hadoop
>> installed on your destination cluster.  Also you should use hdfs as the
>> destination protocol and run the command as the hdfs user if you're
>> using hadoop security.
>>
>> Example (Running on destination cluster):
>>
>> sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
>> hftp://mc00001:50070/ hdfs://mc00000:8020/
>>
>>  Cheers,
>>
>>
>> -Xavier
>>
>>
>> On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
>>> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>>>
>>> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
>>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>>
>>>
>>> ________________________________________
>>> From: Sonal Goyal [sonalgoyal4@gmail.com]
>>> Sent: Monday, February 07, 2011 12:11 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: Re: Hadoop XML Error
>>>
>>> Mike,
>>>
>>> This error is not related to the XML files you are trying to copy being
>>> malformed; it means that, for some reason, the source or destination listing
>>> cannot be retrieved or parsed. Are you trying to copy between different
>>> versions of clusters? As far as I know, your destination should be writable,
>>> and distcp should be run from the destination cluster. See more here:
>>> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>>>
>>> Let us know how it goes.
>>>
>>> Thanks and Regards,
>>> Sonal
>>> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
>>> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
>>> Nube Technologies <http://www.nubetech.co>
>>>
>>> <http://in.linkedin.com/in/sonalgoyal>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>>>
>>>> I am running two instances of Hadoop on a cluster and want to copy all the
>>>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>>>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>>>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>>>> hadoop2. I get the following error:
>>>>
>>>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>>>> [Fatal Error] :1:215: XML document structures must start and end within the
>>>> same entity.
>>>> With failures, global counters are inaccurate; consider running with -i
>>>> Copy failed: java.io.IOException: invalid xml directory content
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>>>> start and end within the same entity.
>>>>        at
>>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>>>        ... 9 more
>>>>
>>>> I am fairly certain that none of the XML files are malformed or corrupted.
>>>> This thread (
>>>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>>>> discusses a similar problem caused by file permissions but doesn't seem to
>>>> offer a solution. Any help would be appreciated.
>>>>
>>>> Thanks,
>>>> Mike
>>>>

Re: Hadoop XML Error

Posted by Xavier Stevens <xs...@mozilla.com>.
You don't need to distcp to upgrade a cluster.  You just need to go
through the upgrade process.  Bumping from 0.20.2 to 0.20.3, you might
not even need to do anything other than stop the cluster processes and
then restart them using the 0.20.3 install.

Here's a link to the upgrade and rollback docs:
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Upgrade+and+Rollback
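
In outline it is something like the following (a sketch based on those
docs; script locations depend on your install):

bin/stop-all.sh                              # stop the old version
bin/start-dfs.sh -upgrade                    # bring HDFS up with the new version
bin/hadoop dfsadmin -upgradeProgress status  # wait until the upgrade completes
bin/hadoop dfsadmin -finalizeUpgrade         # finalize once you're satisfied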


-Xavier


On 2/7/11 10:22 AM, Korb, Michael [USA] wrote:
> Xavier,
>
> Yes, I'm trying to upgrade from 0.20.2 to 0.20.3. Both are running on the same cluster. I'm trying to distcp everything from the 0.20.2 instance over to the 0.20.3 instance, without any luck yet.
>
> Mike
> ________________________________________
> From: Xavier Stevens [xstevens@mozilla.com]
> Sent: Monday, February 07, 2011 1:20 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> Are you just trying to upgrade then?  I've never heard of anyone trying
> to run two versions of hadoop on the same cluster.  I don't think
> that's even possible, but maybe someone else knows.
>
> -Xavier
>
>
> On 2/7/11 10:03 AM, Korb, Michael [USA] wrote:
>> Xavier,
>>
>> Both instances of Hadoop are running on the same cluster. I tried the command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting this:
>>
>> 11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>> Thanks,
>> Mike
>> ________________________________________
>> From: Xavier Stevens [xstevens@mozilla.com]
>> Sent: Monday, February 07, 2011 12:56 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: Hadoop XML Error
>>
>> Mike,
>>
>> I've seen this when a directory has been removed or gone missing since
>> distcp started stating the source files.  You'll probably want to
>> make sure that no code or person is messing with the filesystem during
>> your copy.  I would make sure you only have one version of hadoop
>> installed on your destination cluster.  Also you should use hdfs as the
>> destination protocol and run the command as the hdfs user if you're
>> using hadoop security.
>>
>> Example (Running on destination cluster):
>>
>> sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
>> hftp://mc00001:50070/ hdfs://mc00000:8020/
>>
>>  Cheers,
>>
>>
>> -Xavier
>>
>>
>> On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
>>> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>>>
>>> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
>>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>>
>>>
>>> ________________________________________
>>> From: Sonal Goyal [sonalgoyal4@gmail.com]
>>> Sent: Monday, February 07, 2011 12:11 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: Re: Hadoop XML Error
>>>
>>> Mike,
>>>
>>> This error is not related to the XML files you are trying to copy being
>>> malformed; it means that, for some reason, the source or destination listing
>>> cannot be retrieved or parsed. Are you trying to copy between different
>>> versions of clusters? As far as I know, your destination should be writable,
>>> and distcp should be run from the destination cluster. See more here:
>>> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>>>
>>> Let us know how it goes.
>>>
>>> Thanks and Regards,
>>> Sonal
>>> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
>>> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
>>> Nube Technologies <http://www.nubetech.co>
>>>
>>> <http://in.linkedin.com/in/sonalgoyal>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>>>
>>>> I am running two instances of Hadoop on a cluster and want to copy all the
>>>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>>>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>>>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>>>> hadoop2. I get the following error:
>>>>
>>>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>>>> [Fatal Error] :1:215: XML document structures must start and end within the
>>>> same entity.
>>>> With failures, global counters are inaccurate; consider running with -i
>>>> Copy failed: java.io.IOException: invalid xml directory content
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>>>> start and end within the same entity.
>>>>        at
>>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>>>        at
>>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>>>        ... 9 more
>>>>
>>>> I am fairly certain that none of the XML files are malformed or corrupted.
>>>> This thread (
>>>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>>>> discusses a similar problem caused by file permissions but doesn't seem to
>>>> offer a solution. Any help would be appreciated.
>>>>
>>>> Thanks,
>>>> Mike
>>>>

RE: Hadoop XML Error

Posted by "Korb, Michael [USA]" <Ko...@bah.com>.
Xavier,

Yes, I'm trying to upgrade from 0.20.2 to 0.20.3. Both are running on the same cluster. I'm trying to distcp everything from the 0.20.2 instance over to the 0.20.3 instance, without any luck yet.

Mike
________________________________________
From: Xavier Stevens [xstevens@mozilla.com]
Sent: Monday, February 07, 2011 1:20 PM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop XML Error

Mike,

Are you just trying to upgrade then?  I've never heard of anyone trying
to run two versions of hadoop on the same cluster.  I don't think
that's even possible, but maybe someone else knows.

-Xavier


On 2/7/11 10:03 AM, Korb, Michael [USA] wrote:
> Xavier,
>
> Both instances of Hadoop are running on the same cluster. I tried the command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting this:
>
> 11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> Thanks,
> Mike
> ________________________________________
> From: Xavier Stevens [xstevens@mozilla.com]
> Sent: Monday, February 07, 2011 12:56 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> I've seen this when a directory has been removed or gone missing since
> distcp started stating the source files.  You'll probably want to
> make sure that no code or person is messing with the filesystem during
> your copy.  I would make sure you only have one version of hadoop
> installed on your destination cluster.  Also you should use hdfs as the
> destination protocol and run the command as the hdfs user if you're
> using hadoop security.
>
> Example (Running on destination cluster):
>
> sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
> hftp://mc00001:50070/ hdfs://mc00000:8020/
>
>  Cheers,
>
>
> -Xavier
>
>
> On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
>> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>>
>> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>>
>> ________________________________________
>> From: Sonal Goyal [sonalgoyal4@gmail.com]
>> Sent: Monday, February 07, 2011 12:11 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: Hadoop XML Error
>>
>> Mike,
>>
>> This error is not related to the XML files you are trying to copy being
>> malformed; it means that, for some reason, the source or destination listing
>> cannot be retrieved or parsed. Are you trying to copy between different
>> versions of clusters? As far as I know, your destination should be writable,
>> and distcp should be run from the destination cluster. See more here:
>> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>>
>> Let us know how it goes.
>>
>> Thanks and Regards,
>> Sonal
>> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
>> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
>> Nube Technologies <http://www.nubetech.co>
>>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>>
>>
>>
>>
>> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>>
>>> I am running two instances of Hadoop on a cluster and want to copy all the
>>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>>> hadoop2. I get the following error:
>>>
>>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>>> [Fatal Error] :1:215: XML document structures must start and end within the
>>> same entity.
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: java.io.IOException: invalid xml directory content
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>>> start and end within the same entity.
>>>        at
>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>>        ... 9 more
>>>
>>> I am fairly certain that none of the XML files are malformed or corrupted.
>>> This thread (
>>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>>> discusses a similar problem caused by file permissions but doesn't seem to
>>> offer a solution. Any help would be appreciated.
>>>
>>> Thanks,
>>> Mike
>>>

Re: Hadoop XML Error

Posted by Xavier Stevens <xs...@mozilla.com>.
Mike,

Are you just trying to upgrade then?  I've never heard of anyone trying
to run two versions of hadoop on the same cluster.  I don't think
that's even possible, but maybe someone else knows.

-Xavier


On 2/7/11 10:03 AM, Korb, Michael [USA] wrote:
> Xavier,
>
> Both instances of Hadoop are running on the same cluster. I tried the command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting this:
>
> 11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
> 	at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
> 	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
> 	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> 	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> Thanks,
> Mike
> ________________________________________
> From: Xavier Stevens [xstevens@mozilla.com]
> Sent: Monday, February 07, 2011 12:56 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> I've seen this when a directory has been removed or gone missing since
> distcp started stating the source files.  You'll probably want to
> make sure that no code or person is messing with the filesystem during
> your copy.  I would make sure you only have one version of hadoop
> installed on your destination cluster.  Also you should use hdfs as the
> destination protocol and run the command as the hdfs user if you're
> using hadoop security.
>
> Example (Running on destination cluster):
>
> sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
> hftp://mc00001:50070/ hdfs://mc00000:8020/
>
>  Cheers,
>
>
> -Xavier
>
>
> On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
>> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>>
>> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
>> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>>
>> ________________________________________
>> From: Sonal Goyal [sonalgoyal4@gmail.com]
>> Sent: Monday, February 07, 2011 12:11 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: Hadoop XML Error
>>
>> Mike,
>>
>> This error is not related to the XML files you are trying to copy being
>> malformed; it means that, for some reason, the source or destination listing
>> cannot be retrieved or parsed. Are you trying to copy between different
>> versions of clusters? As far as I know, your destination should be writable,
>> and distcp should be run from the destination cluster. See more here:
>> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>>
>> Let us know how it goes.
>>
>> Thanks and Regards,
>> Sonal
>> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
>> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
>> Nube Technologies <http://www.nubetech.co>
>>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>>
>>
>>
>>
>> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>>
>>> I am running two instances of Hadoop on a cluster and want to copy all the
>>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>>> hadoop2. I get the following error:
>>>
>>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>>> [Fatal Error] :1:215: XML document structures must start and end within the
>>> same entity.
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: java.io.IOException: invalid xml directory content
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>>> start and end within the same entity.
>>>        at
>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>>        at
>>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>>        ... 9 more
>>>
>>> I am fairly certain that none of the XML files are malformed or corrupted.
>>> This thread (
>>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>>> discusses a similar problem caused by file permissions but doesn't seem to
>>> offer a solution. Any help would be appreciated.
>>>
>>> Thanks,
>>> Mike
>>>

RE: Hadoop XML Error

Posted by "Korb, Michael [USA]" <Ko...@bah.com>.
Xavier,

Both instances of Hadoop are running on the same cluster. I tried the command "sudo -u hdfs ./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310" from the hadoop2 bin directory (the 0.20.3 install) on mc00000 (the port 55310 is specified in core-site.xml). Now I'm getting this:

11/02/07 13:03:14 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
11/02/07 13:03:14 INFO tools.DistCp: destPath=hdfs://mc00000:55310
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
	at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
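
For reference, 55310 is just what fs.default.name is set to in the 0.20.3
install's core-site.xml; a quick way to double-check it, with a guessed
conf path:

grep -A1 'fs.default.name' /usr/lib/hadoop-0.20.3/conf/core-site.xml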

Thanks,
Mike
________________________________________
From: Xavier Stevens [xstevens@mozilla.com]
Sent: Monday, February 07, 2011 12:56 PM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop XML Error

Mike,

I've seen this when a directory has been removed or gone missing since
distcp started stating the source files.  You'll probably want to
make sure that no code or person is messing with the filesystem during
your copy.  I would make sure you only have one version of hadoop
installed on your destination cluster.  Also you should use hdfs as the
destination protocol and run the command as the hdfs user if you're
using hadoop security.

Example (Running on destination cluster):

sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
hftp://mc00001:50070/ hdfs://mc00000:8020/

 Cheers,


-Xavier


On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>
> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
>       at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
>
> ________________________________________
> From: Sonal Goyal [sonalgoyal4@gmail.com]
> Sent: Monday, February 07, 2011 12:11 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> This error is not related to the XML files you are trying to copy being
> malformed; it means that, for some reason, the source or destination listing
> cannot be retrieved or parsed. Are you trying to copy between different
> versions of clusters? As far as I know, your destination should be writable,
> and distcp should be run from the destination cluster. See more here:
> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>
> Let us know how it goes.
>
> Thanks and Regards,
> Sonal
> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>
>
>
>
> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>
>> I am running two instances of Hadoop on a cluster and want to copy all the
>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>> hadoop2. I get the following error:
>>
>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>> [Fatal Error] :1:215: XML document structures must start and end within the
>> same entity.
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.io.IOException: invalid xml directory content
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>> start and end within the same entity.
>>        at
>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>        ... 9 more
>>
>> I am fairly certain that none of the XML files are malformed or corrupted.
>> This thread (
>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>> discusses a similar problem caused by file permissions but doesn't seem to
>> offer a solution. Any help would be appreciated.
>>
>> Thanks,
>> Mike
>>

Re: Hadoop XML Error

Posted by Xavier Stevens <xs...@mozilla.com>.
Mike,

I've seen this when a directory has been removed or gone missing since
distcp started stating the source files.  You'll probably want to
make sure that no code or person is messing with the filesystem during
your copy.  I would make sure you only have one version of hadoop
installed on your destination cluster.  Also you should use hdfs as the
destination protocol and run the command as the hdfs user if you're
using hadoop security.

Example (Running on destination cluster):

sudo -u hdfs /usr/lib/hadoop-0.20.3/bin/hadoop distcp -update
hftp://mc00001:50070/ hdfs://mc00000:8020/

 Cheers,


-Xavier


On 2/7/11 9:39 AM, Korb, Michael [USA] wrote:
> I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:
>
> 11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
> 	at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
> 	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
> 	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> 	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
>
> ________________________________________
> From: Sonal Goyal [sonalgoyal4@gmail.com]
> Sent: Monday, February 07, 2011 12:11 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop XML Error
>
> Mike,
>
> This error is not related to the XML files you are trying to copy being
> malformed; it means that, for some reason, the source or destination listing
> cannot be retrieved or parsed. Are you trying to copy between different
> versions of clusters? As far as I know, your destination should be writable,
> and distcp should be run from the destination cluster. See more here:
> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
>
> Let us know how it goes.
>
> Thanks and Regards,
> Sonal
> <https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
> Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>
>
>
>
> On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:
>
>> I am running two instances of Hadoop on a cluster and want to copy all the
>> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
>> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
>> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
>> hadoop2. I get the following error:
>>
>> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
>> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
>> [Fatal Error] :1:215: XML document structures must start and end within the
>> same entity.
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.io.IOException: invalid xml directory content
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>> Caused by: org.xml.sax.SAXParseException: XML document structures must
>> start and end within the same entity.
>>        at
>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>>        at
>> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>>        ... 9 more
>>
>> I am fairly certain that none of the XML files are malformed or corrupted.
>> This thread (
>> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
>> discusses a similar problem caused by file permissions but doesn't seem to
>> offer a solution. Any help would be appreciated.
>>
>> Thanks,
>> Mike
>>

RE: Hadoop XML Error

Posted by "Korb, Michael [USA]" <Ko...@bah.com>.
I'm trying to copy from 0.20.2 to 0.20.3. I was trying to follow the DistCp Guide but I think I know the problem. I'm trying to run the command on the destination cluster, but when I call hadoop, I think the path is set to run the hadoop1 executable. So I tried going to the hadoop2 install and running it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/" but now I get this error:

11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
	at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
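
To confirm which executable actually runs when I type hadoop, something
like this settles it (a sketch):

which hadoop       # the launcher that wins on PATH
hadoop version     # what the default install reports
./hadoop version   # what the hadoop2 copy reports, run from its bin directory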


________________________________________
From: Sonal Goyal [sonalgoyal4@gmail.com]
Sent: Monday, February 07, 2011 12:11 PM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop XML Error

Mike,

This error is not related to the XML files you are trying to copy being
malformed; it means that, for some reason, the source or destination listing
cannot be retrieved or parsed. Are you trying to copy between different
versions of clusters? As far as I know, your destination should be writable,
and distcp should be run from the destination cluster. See more here:
http://hadoop.apache.org/common/docs/r0.20.2/distcp.html

Let us know how it goes.

Thanks and Regards,
Sonal
<https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>





On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:

> I am running two instances of Hadoop on a cluster and want to copy all the
> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
> hadoop2. I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the
> same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must
> start and end within the same entity.
>        at
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>        ... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted.
> This thread (
> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
> discusses a similar problem caused by file permissions but doesn't seem to
> offer a solution. Any help would be appreciated.
>
> Thanks,
> Mike
>

Re: Hadoop XML Error

Posted by Sonal Goyal <so...@gmail.com>.
Mike,

This error is not related to the XML files you are trying to copy being
malformed; it means that, for some reason, the source or destination listing
cannot be retrieved or parsed. Are you trying to copy between different
versions of clusters? As far as I know, your destination should be writable,
and distcp should be run from the destination cluster. See more here:
http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
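
As a quick sanity check that the destination is writable, something like
this should succeed when run on the destination cluster (the user and test
path are placeholders):

sudo -u hdfs hadoop fs -touchz /tmp/distcp-write-test
sudo -u hdfs hadoop fs -rm /tmp/distcp-write-test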

Let us know how it goes.

Thanks and Regards,
Sonal
<https://github.com/sonalgoyal/hiho>Connect Hadoop with databases,
Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>





On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <Ko...@bah.com> wrote:

> I am running two instances of Hadoop on a cluster and want to copy all the
> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
> hadoop2. I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the
> same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must
> start and end within the same entity.
>        at
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>        ... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted.
> This thread (
> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
> discusses a similar problem caused by file permissions but doesn't seem to
> offer a solution. Any help would be appreciated.
>
> Thanks,
> Mike
>

Re: Hadoop XML Error

Posted by Xavier Stevens <xs...@mozilla.com>.
Mike,

I've seen this when a directory has been removed or gone missing since
distcp started stating the source files.  You'll probably want to
make sure that no code or person is messing with the filesystem during
your copy.  Also you should use hdfs as the destination protocol.
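
For example, run from the destination cluster (the namenode port is a
guess; match it to fs.default.name there):

hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:8020/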

Cheers,

-Xavier


On 2/7/11 7:51 AM, Korb, Michael [USA] wrote:
> I am running two instances of Hadoop on a cluster and want to copy all the data from hadoop1 to the updated hadoop2. From hadoop2, I am running the command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/" where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of hadoop2. I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
> 	at org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
> 	at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
> 	at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
> 	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
> 	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> 	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
> 	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
> 	... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted. This thread (http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html) discusses a similar problem caused by file permissions but doesn't seem to offer a solution. Any help would be appreciated.
>
> Thanks,
> Mike