Posted to user@sqoop.apache.org by Amir Mohammad Saied <am...@gmail.com> on 2013/12/16 14:42:55 UTC
Specifying row key in sqoop-import
Hi,
I'm using Sqoop to import (only one column of) a table from MySQL to HDFS.
I'd like records to be stored as SequenceFiles so I can run Mahout's
"seq2sparse" to generate Vectors from them later.
I've two questions regarding the import process:
1) Dumping SequenceFiles generated by sqoop-import, I realized the row
"Key" is automatically generated by Sqoop, and is not the "id" column of
the MySQL table row. Can I ask sqoop-import to use the row's "id" field as
Key?
2) If it's possible to set the row "Key" (above question), can I cast it to a
specific class using sqoop-import?
Thanks,
amir
Re: Specifying row key in sqoop-import
Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Amir,
Sqoop will generate a special class when importing a table (even with only one column) and will use this class as a key for the SequenceFile. I'm not familiar with Mahout, so I'm not sure whether this format can be consumed by it.
Jarcec