Posted to user@hadoop.apache.org by R P <ha...@outlook.com> on 2015/09/23 03:29:05 UTC

CombineFileInputFormat with Gzip files

Hello All,
What is the best way to use CombineFileInputFormat with Gzip files as input?
Thanks,

RE: CombineFileInputFormat with Gzip files

Posted by R P <ha...@outlook.com>.
It's creating temp files on HDFS. See the code below. Thanks for your response, though. I wrote my own record reader, which passes file splits to LineRecordReader, and that works for my problem.


public CompressedCombineFileRecordReader(CombineFileSplit split,
        TaskAttemptContext context, Integer index) throws IOException {

    Configuration currentConf = context.getConfiguration();
    this.path = split.getPath(index);
    boolean isCompressed = findCodec(currentConf, path);
    if (isCompressed) {
        // This is the step that creates the temp file on HDFS. The helper is
        // not shown here; dPath and rlength are presumably set inside it.
        codecWiseDecompress(context.getConfiguration());
    }

    fs = this.path.getFileSystem(currentConf);

    this.startOffset = split.getOffset(index);

    if (isCompressed) {
        // Compressed input: read the decompressed temp copy.
        this.end = startOffset + rlength;
    } else {
        // Uncompressed input: read the original file directly.
        this.end = startOffset + split.getLength(index);
        dPath = path;
    }

    boolean skipFirstLine = false;

    fileIn = fs.open(dPath);

    // The temp copy is only removed when the FileSystem is closed.
    if (isCompressed) {
        fs.deleteOnExit(dPath);
    }

    if (startOffset != 0) {
        // Not at the start of the file: back up one byte and discard the
        // partial first line, which belongs to the previous chunk.
        skipFirstLine = true;
        --startOffset;
        fileIn.seek(startOffset);
    }
    reader = new LineReader(fileIn);
    if (skipFirstLine) {
        startOffset += reader.readLine(new Text(), 0,
                (int) Math.min((long) Integer.MAX_VALUE, end - startOffset));
    }
    this.pos = startOffset;
}
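
Editor's sketch (not code from this thread): the alternative described above avoids the temp copy by reading each gzip file of the combined group as a stream, one line reader per file. The snippet below shows only that idea with plain JDK classes, so it runs without Hadoop; the class name CombinedGzipLines and its methods are hypothetical. In a Hadoop job the per-file reader would instead be built from split.getPath(index) and delegate to LineRecordReader, which selects a decompression codec from the file extension.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Hadoop or of the thread's code.
public class CombinedGzipLines {

    /** Reads all lines of all gzip files in the group, in order, without temp files. */
    public static List<String> readLines(List<Path> gzipFiles) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Path file : gzipFiles) {
            // A gzip stream can only be decoded from its first byte, so each
            // file is opened at offset 0 and read to the end by a single reader.
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    new GZIPInputStream(Files.newInputStream(file)),
                    StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    /** Helper for the demo: writes the given text to a gzip file. */
    public static void writeGzip(Path file, String text) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(file))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Two small gzip files stand in for the files of one combined split.
        Path a = Files.createTempFile("part-a", ".gz");
        Path b = Files.createTempFile("part-b", ".gz");
        writeGzip(a, "one\ntwo\n");
        writeGzip(b, "three\n");
        System.out.println(readLines(Arrays.asList(a, b)));
        Files.delete(a);
        Files.delete(b);
    }
}
```

Because the decompressed bytes are consumed as they are produced, nothing uncompressed is written back to storage, which is the side effect objected to earlier in the thread.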

Date: Thu, 24 Sep 2015 14:38:45 +0530
Subject: Re: CombineFileInputFormat with Gzip files
From: mathursharp@gmail.com
To: user@hadoop.apache.org

What sort of side effects?

On Thu, Sep 24, 2015 at 2:35 PM, R P <ha...@outlook.com> wrote:



Thanks Harshit. That approach doesn't look good, as it will write uncompressed data to HDFS, resulting in job side effects. -R P

Date: Thu, 24 Sep 2015 09:55:49 +0530
Subject: Re: CombineFileInputFormat with Gzip files
From: mathursharp@gmail.com
To: user@hadoop.apache.org
CC: mapreduce-user@hadoop.apache.org

Hi R P,

Follow this link,

http://www.ibm.com/developerworks/library/bd-hadoopcombine/


Regards,
Harshit

On Thu, Sep 24, 2015 at 4:46 AM, R P <ha...@outlook.com> wrote:



Hello All,
What is the best way to process small Gzip files with CombineFileInputFormat? If possible, please provide a link to the documentation. Appreciate your help.
Thanks,
*Adding  mapreduce-dev to the mailing list.

From: hadooper@outlook.com
To: user@hadoop.apache.org
Subject: CombineFileInputFormat with Gzip files
Date: Tue, 22 Sep 2015 18:29:05 -0700




Hello All,
What is the best way to use CombineFileInputFormat with Gzip files as input?
Thanks,


-- 
Harshit Mathur

Re: CombineFileInputFormat with Gzip files

Posted by Harshit Mathur <ma...@gmail.com>.
What sort of side effects?

On Thu, Sep 24, 2015 at 2:35 PM, R P <ha...@outlook.com> wrote:

> Thanks Harshit. That approach doesn't look good, as it will write
> uncompressed data to HDFS, resulting in job side effects.
> -
> R P
>
>
> ------------------------------
> Date: Thu, 24 Sep 2015 09:55:49 +0530
> Subject: Re: CombineFileInputFormat with Gzip files
> From: mathursharp@gmail.com
> To: user@hadoop.apache.org
> CC: mapreduce-user@hadoop.apache.org
>
>
> Hi R P,
>
> Follow this link,
>
> http://www.ibm.com/developerworks/library/bd-hadoopcombine/
>
>
> Regards,
> Harshit
>
> On Thu, Sep 24, 2015 at 4:46 AM, R P <ha...@outlook.com> wrote:
>
> Hello All,
>
> What is the best way to process small Gzip files with
> CombineFileInputFormat? If possible, please provide a link to the
> documentation.
> Appreciate your help.
>
> Thanks,
>
> *Adding  mapreduce-dev to the mailing list.
>
>
> ------------------------------
> From: hadooper@outlook.com
> To: user@hadoop.apache.org
> Subject: CombineFileInputFormat with Gzip files
> Date: Tue, 22 Sep 2015 18:29:05 -0700
>
> Hello All,
>   What is the best way to use CombineFileInputFormat with Gzip files as
> input?
>
> Thanks,
>
>
>
>
> --
> Harshit Mathur
>



-- 
Harshit Mathur

RE: CombineFileInputFormat with Gzip files

Posted by R P <ha...@outlook.com>.
Thanks Harshit. That approach doesn't look good, as it will write uncompressed data to HDFS, resulting in job side effects. -R P

Date: Thu, 24 Sep 2015 09:55:49 +0530
Subject: Re: CombineFileInputFormat with Gzip files
From: mathursharp@gmail.com
To: user@hadoop.apache.org
CC: mapreduce-user@hadoop.apache.org

Hi R P,

Follow this link,

http://www.ibm.com/developerworks/library/bd-hadoopcombine/


Regards,
Harshit

On Thu, Sep 24, 2015 at 4:46 AM, R P <ha...@outlook.com> wrote:



Hello All,
What is the best way to process small Gzip files with CombineFileInputFormat? If possible, please provide a link to the documentation. Appreciate your help.
Thanks,
*Adding  mapreduce-dev to the mailing list.

From: hadooper@outlook.com
To: user@hadoop.apache.org
Subject: CombineFileInputFormat with Gzip files
Date: Tue, 22 Sep 2015 18:29:05 -0700




Hello All,
What is the best way to use CombineFileInputFormat with Gzip files as input?
Thanks,


-- 
Harshit Mathur
 		 	   		  

Re: CombineFileInputFormat with Gzip files

Posted by Harshit Mathur <ma...@gmail.com>.
Hi R P,

Follow this link,

http://www.ibm.com/developerworks/library/bd-hadoopcombine/


Regards,
Harshit

On Thu, Sep 24, 2015 at 4:46 AM, R P <ha...@outlook.com> wrote:

> Hello All,
>
> What is the best way to process small Gzip files with
> CombineFileInputFormat? If possible, please provide a link to the
> documentation.
> Appreciate your help.
>
> Thanks,
>
> *Adding  mapreduce-dev to the mailing list.
>
>
> ------------------------------
> From: hadooper@outlook.com
> To: user@hadoop.apache.org
> Subject: CombineFileInputFormat with Gzip files
> Date: Tue, 22 Sep 2015 18:29:05 -0700
>
> Hello All,
>   What is the best way to use CombineFileInputFormat with Gzip files as
> input?
>
> Thanks,
>
>


-- 
Harshit Mathur

RE: CombineFileInputFormat with Gzip files

Posted by R P <ha...@outlook.com>.
Hello All,
What is the best way to process small Gzip files with CombineFileInputFormat? If possible, please provide a link to the documentation. Appreciate your help.
Thanks,
*Adding mapreduce-dev to the mailing list.

From: hadooper@outlook.com
To: user@hadoop.apache.org
Subject: CombineFileInputFormat with Gzip files
Date: Tue, 22 Sep 2015 18:29:05 -0700




Hello All,
What is the best way to use CombineFileInputFormat with Gzip files as input?
Thanks,
