Posted to user@hadoop.apache.org by rab ra <ra...@gmail.com> on 2014/08/24 08:39:17 UTC

Sequence files and merging

Hello,

I need a few clarifications on the following questions related to
sequence files:

1. I have a bunch of sequence files. Each file has 8 keys and corresponding
values. Each key is a name (a string), and each value is the bytes of a
float array. Storing and processing these small files is not efficient,
as there can be millions of such files. Hence, I am thinking of creating
one sequence file out of this large number of files. Is that possible? I
have read that there are ways to merge sequence files. My question is: if
I merge a large number of sequence files, how can I retrieve each
individual small sequence file in my map tasks?

2. When I merge, does the result become a different sequence file
altogether, with the keys merged? If so, my keys will be the same across
all the files. How will that be handled? Will there be any problem here?

3. Is it possible to append keys and values to an existing sequence file?



Regards,
rab