Posted to user@sqoop.apache.org by lovely kasi <ka...@gmail.com> on 2013/12/06 15:03:55 UTC

Issues with the sqoop merge feature of sqoop

Sqoop import uses the --hive-table option to import data into Hive, and the
final result appears as a Hive internal table. But why doesn't sqoop merge
do the same thing? Sqoop merge can merge two HDFS directories, and also data
from Hive internal tables, but it doesn't write the output in the same way
to a Hive internal table.


Thanks,
Lovely

Re: Issues with the sqoop merge feature of sqoop

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Lovely,
there is no difference between a Hive import and an import into HDFS from a storage perspective. The data will always end up on HDFS. The only difference is that with the --hive-import parameter Sqoop will automatically populate Hive's metadata and move the data to a different location. Did you try the incremental import without --hive-import, pointing --target-dir directly into the Hive warehouse directory?

Jarcec
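
[Editor's sketch] The suggestion above could look roughly like the following. This is only an illustration under stated assumptions: the JDBC URL, the table name "orders", the check column "id", the last value, and the warehouse path are all hypothetical and do not come from this thread. It also assumes the Hive table uses Hive's default ^A field delimiter, which is why --fields-terminated-by is set explicitly. The script only prints the command unless DRY_RUN=0 is set.

```shell
# Sketch only: every name and path below is hypothetical.
JDBC_URL="jdbc:mysql://db.example.com/sales"    # hypothetical database
TABLE="orders"                                  # hypothetical table
WAREHOUSE_DIR="/user/hive/warehouse/${TABLE}"   # assumes the default warehouse location
LAST_VALUE="1000"                               # highest id already imported (assumed)

# Incremental import WITHOUT --hive-import: files are written straight into
# the directory that backs the Hive table.
IMPORT_CMD="sqoop import \
  --connect ${JDBC_URL} \
  --table ${TABLE} \
  --incremental append \
  --check-column id \
  --last-value ${LAST_VALUE} \
  --target-dir ${WAREHOUSE_DIR} \
  --fields-terminated-by '\\001'"

# Print the command by default; run it only when DRY_RUN=0.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%s\n' "${IMPORT_CMD}"
else
  eval "${IMPORT_CMD}"
fi
```

Credentials (for example --username) are left out on purpose. Whether append mode fits depends on whether existing rows can change; if they can, the merge step discussed later in the thread is still needed.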


Re: Issues with the sqoop merge feature of sqoop

Posted by lovely kasi <ka...@gmail.com>.
I am trying to do an incremental import of a table from a DB into Hive using
sqoop import.
Since the Sqoop incremental import is not able to replace the old
records with new ones or write to the same directory as the previous import,
I had to do the incremental import to another directory and then merge
them.

This merge works fine if I imported only to HDFS. But if I imported
directly to Hive in the form of internal tables, then the merge doesn't work.
I mean, whether the inputs to sqoop merge are normal HDFS directories or Hive
internal table directories, it always writes to HDFS only and doesn't write
the merge output back to a Hive internal table.

I am asking: why can't it write?
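
[Editor's sketch] The workflow described above, plus one possible way to get the merged output back under the Hive table, might be sketched as below. This is not something proposed in the thread: the follow-up Hive LOAD DATA step is an assumption about a workable approach, and every path, the jar file, the class name, the table name "orders", and the merge key "id" are hypothetical. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch only: every path, class name, and table name below is hypothetical.
TABLE="orders"
BASE_DIR="/user/hive/warehouse/${TABLE}"   # data behind the existing Hive table (assumed)
DELTA_DIR="/user/etl/${TABLE}_delta"       # output of the incremental import (assumed)
MERGED_DIR="/user/etl/${TABLE}_merged"     # must not exist before the merge runs

# Step 1: rows in DELTA_DIR replace rows in BASE_DIR that share the merge key.
# The jar and class are the ones generated by the earlier sqoop import.
MERGE_CMD="sqoop merge \
  --new-data ${DELTA_DIR} \
  --onto ${BASE_DIR} \
  --target-dir ${MERGED_DIR} \
  --jar-file ${TABLE}.jar \
  --class-name ${TABLE} \
  --merge-key id"

# Step 2 (assumed workaround): move the merged files back under the Hive
# table, replacing its current contents.
LOAD_CMD="hive -e \"LOAD DATA INPATH '${MERGED_DIR}' OVERWRITE INTO TABLE ${TABLE}\""

# Print both commands by default; run them only when DRY_RUN=0.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%s\n' "${MERGE_CMD}" "${LOAD_CMD}"
else
  eval "${MERGE_CMD}" && eval "${LOAD_CMD}"
fi
```

Writing the merge to a separate directory first keeps the table's current files readable as merge input until the merged result is complete.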




Re: Issues with the sqoop merge feature of sqoop

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Lovely,
Would you mind elaborating a bit on your use case? What are you trying to accomplish?

Jarcec
