Posted to user@hbase.apache.org by Vikas Jadhav <vi...@gmail.com> on 2013/01/22 12:53:40 UTC

Bulk Loading DFS Space issue in Hbase

Hi,

I am trying to bulk load a 700 MB CSV file with 31 columns into HBase.

I have written a MapReduce program for this, but when I run it, it consumes the whole disk and fails.
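
For reference, my driver follows the standard HFileOutputFormat incremental-load pattern. A simplified sketch of that pattern (the mapper logic, the table name "mytable", and the column family "f" below are placeholders, not my exact code):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CsvBulkLoadDriver {

      // Placeholder mapper: the first CSV field is used as the row key and
      // each remaining field becomes one KeyValue under column family "f".
      public static class CsvToKeyValueMapper
          extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
          String[] fields = line.toString().split(",");
          byte[] row = Bytes.toBytes(fields[0]);
          ImmutableBytesWritable key = new ImmutableBytesWritable(row);
          for (int i = 1; i < fields.length; i++) {
            context.write(key, new KeyValue(row, Bytes.toBytes("f"),
                Bytes.toBytes("c" + i), Bytes.toBytes(fields[i])));
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "csv-bulk-load");
        job.setJarByClass(CsvBulkLoadDriver.class);
        job.setMapperClass(CsvToKeyValueMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));    // CSV input
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HFile output dir

        // Configures total-order partitioning and the sorting reducer so the
        // job writes HFiles aligned with the target table's regions.
        HTable hTable = new HTable(conf, "mytable");
        HFileOutputFormat.configureIncrementalLoad(job, hTable);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }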

Here is the status before running:

  Configured Capacity:               116.16 GB
  DFS Used:                          13.28 GB
  Non DFS Used:                      61.41 GB
  DFS Remaining:                     41.47 GB
  DFS Used%:                         11.43 %
  DFS Remaining%:                    35.7 %
  Live Nodes:                        1
  Dead Nodes:                        0
  Decommissioning Nodes:             0
  Number of Under-Replicated Blocks: 68



And here is the status after running:

  Configured Capacity:               116.16 GB
  DFS Used:                          52.07 GB
  Non DFS Used:                      61.47 GB
  DFS Remaining:                     2.62 GB
  DFS Used%:                         44.83 %
  DFS Remaining%:                    2.26 %
  Live Nodes:                        1
  Dead Nodes:                        0
  Decommissioning Nodes:             0
  Number of Under-Replicated Blocks: 455





So what is taking up so much DFS space? Has anybody come across this issue?

Even though the map and reduce phases complete 100%, the incremental loading of the HFiles keeps demanding more space until the whole drive is full.

52 GB for a 700 MB CSV file.

-- 
Thanx and Regards,
Vikas Jadhav

Fwd: Bulk Loading DFS Space issue in Hbase

Posted by Vikas Jadhav <vi...@gmail.com>.
I am able to trace the problem to the bulk loading step.

The 700 MB CSV file (31 columns) generates about 6.5 GB of HFiles.
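
That blow-up is at least the right order of magnitude for the on-disk format: each of the 31 columns is stored as a separate KeyValue that repeats the full row key, column family, qualifier, and an 8-byte timestamp. A back-of-the-envelope check (all byte sizes below are assumptions, not measured from my data):

    per-cell overhead   ≈ 4 (key len) + 4 (value len) + 2 + 20 (row key)
                          + 1 + 1 (family) + 10 (qualifier)
                          + 8 (timestamp) + 1 (type)        ≈ 51 bytes
    CSV cost per field  ≈ 8 bytes of value + 1 separator    ≈  9 bytes
    HFile cost per cell ≈ 8 bytes of value + 51 overhead    ≈ 59 bytes

    700 MB × (59 / 9) ≈ 4.6 GB

which is in the same ballpark as the 6.5 GB I am seeing (before compression, which is off by default).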

But while loading them, it is the execution of the following lines that takes up all the space:

    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    loader.doBulkLoad(new Path(args[1]), hTable);
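
For completeness, the surrounding setup is roughly this (a sketch; "mytable" is a placeholder, and the comment reflects my understanding of doBulkLoad's behaviour):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoadStep {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable hTable = new HTable(conf, "mytable");  // placeholder table name

        // doBulkLoad moves the generated HFiles into the table's region
        // directories. An HFile whose key range spans a region boundary is
        // first split into two new files, so splits cost extra DFS space.
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(new Path(args[1]), hTable);
      }
    }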





Thanx and Regards,
Vikas Jadhav
