Posted to dev@hive.apache.org by Bhavesh Shah <bh...@gmail.com> on 2012/02/09 05:12:53 UTC

Showing wrong count after importing table in Hive

Hello All,

I have imported about 10 tables into Hive from MS SQL Server. But when I
cross-checked the records in Hive, I found that one of the tables has more
records than expected when I run the query (select count(*) from tblName;).

I then dropped that table and imported it into Hive again. The console log
reported (Retrieved 203 records), but when I ran
(select count(*) from tblName;) again, I got a count of 298.

I don't understand why this happens. Is something wrong with the query, or
is it caused by an incorrect sqoop-import command?

The records in all the other tables are fine.

I am stuck here and have spent a lot of time searching for an answer. Please
help me out.


-- 
Thanks and Regards,
Bhavesh Shah

Re: [sqoop-user] Showing wrong count after importing table in Hive

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Bhavesh,
we experienced a similar issue in the past: the table in Hive appeared to have more rows than Sqoop reported importing, and more than were actually present in the database.

On our side, the problem was caused by characters in the exported data that broke lines in the exported test CSV file. For example, some of our rows contained data with newline characters. Because a couple of the exported rows were split across multiple lines, the number of Hive rows appeared to be higher than the number imported. You might be experiencing a similar issue. We solved it by using the parameter --hive-drop-import-delims (or you can possibly use --hive-delims-replacement). For semantics and usage, please take a look at the manual:

http://incubator.apache.org/sqoop/docs/1.4.0-incubating/SqoopUserGuide.html#id1765770
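
To make the row-splitting effect described above concrete, here is a minimal Python sketch. It is not Sqoop or Hive code: the sample rows, the to_text helper and the variable names are all made up for illustration. It serializes three records as delimited text, one line per record, and then counts lines the way a line-oriented reader would.

```python
# Illustrative only: made-up sample rows, not actual Sqoop or Hive code.
FIELD_DELIM = "\x01"  # Ctrl-A, the '\01' character discussed in this thread

# Three source records; the second has a newline inside a column value.
records = [
    (1, "plain text"),
    (2, "first line\nsecond line"),
    (3, "more plain text"),
]


def to_text(rows):
    """Serialize rows as delimited text, one newline-terminated line per record."""
    return "".join(FIELD_DELIM.join(str(col) for col in row) + "\n" for row in rows)


source_count = len(records)

# A line-oriented reader treats every newline as a record boundary,
# so the record with the embedded newline is counted as two rows.
hive_count = to_text(records).count("\n")

print(f"records in source: {source_count}, lines in text file: {hive_count}")
```

With this sample data the source holds 3 records but the text file has 4 lines, which is the same kind of mismatch as the 203 versus 298 reported in the original question.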

Jarcec

On Wed, Feb 08, 2012 at 08:17:54PM -0800, Kathleen Ting wrote:
> Bhavesh - please subscribe to sqoop-user@incubator.apache.org for faster
> response.
> 
> It would be helpful if you could re-run it with the --verbose flag and then
> attach the console log.
> 
> Thanks, Kathleen

Re: [sqoop-user] Showing wrong count after importing table in Hive

Posted by Kathleen Ting <ka...@cloudera.com>.
Bhavesh - please subscribe to sqoop-user@incubator.apache.org for faster
response.

It would be helpful if you could re-run it with the --verbose flag and then
attach the console log.

Thanks, Kathleen


Re: Showing wrong count after importing table in Hive

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Bhavesh,
I believe this question is more related to Sqoop than to Hive, so I've added the Sqoop user mailing list to CC.

The parameter --hive-drop-import-delims is actually very simple: it blindly removes all '\n', '\r' and '\01' characters from all input data and leaves untouched all entries that do not contain such characters. If I understand your problem correctly and you're just trying to move your entire MS SQL database to Hive using Sqoop, then you should be fine using the parameter when importing all tables at once, provided, of course, that you can live with losing those three characters. If you can't, I would suggest using --hive-delims-replacement instead.
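
As a rough illustration of the difference between the two options, the following Python sketch mimics the behaviour described above on plain strings. It is only a model under that description, not Sqoop's implementation; the function names drop_delims and replace_delims are hypothetical.

```python
# Illustrative model only; function names are hypothetical, not Sqoop APIs.
# The three characters named above: newline, carriage return and '\01'.
HIVE_DELIMS = ("\n", "\r", "\x01")


def drop_delims(value: str) -> str:
    """Model of --hive-drop-import-delims: remove the three characters."""
    for ch in HIVE_DELIMS:
        value = value.replace(ch, "")
    return value


def replace_delims(value: str, replacement: str) -> str:
    """Model of --hive-delims-replacement: substitute a user-chosen string."""
    for ch in HIVE_DELIMS:
        value = value.replace(ch, replacement)
    return value


with_newline = "first line\nsecond line"
without_delims = "no delimiters here"

print(repr(drop_delims(with_newline)))          # 'first linesecond line'
print(repr(replace_delims(with_newline, " ")))  # 'first line second line'
print(repr(drop_delims(without_delims)))        # 'no delimiters here'
```

Values without any of the three characters pass through unchanged, which is why applying the option to every table is harmless for tables that never contained them. Dropping can run words together, so a replacement string such as a single space may be preferable when the text must stay readable.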

And you're welcome, it was a pleasure to help you.

Jarcec


On Thu, Feb 09, 2012 at 11:50:50AM +0530, Bhavesh Shah wrote:
> Thanks all for your replies.
> My problem is solved by using --hive-drop-import-delims.
> Now I am getting the same count as in MS SQL Server.
> But I want to ask one more thing: if I keep using the
> --hive-drop-import-delims option every time for all tables (as with
> sqoop-import-all-tables) during the Sqoop import, will it cause
> problems later, or will it keep working fine as it does now? (I mean for
> other tables that do not contain newline characters.)
> 
> Thanks to Jarek Jarcec Cecho for the solution.

Re: Showing wrong count after importing table in Hive

Posted by Bhavesh Shah <bh...@gmail.com>.
Thanks all for your replies.
My problem is solved by using --hive-drop-import-delims.
Now I am getting the same count as in MS SQL Server.
But I want to ask one more thing: if I keep using the
--hive-drop-import-delims option every time for all tables (as with
sqoop-import-all-tables) during the Sqoop import, will it cause
problems later, or will it keep working fine as it does now? (I mean for
other tables that do not contain newline characters.)

Thanks to Jarek Jarcec Cecho for the solution.




On Thu, Feb 9, 2012 at 11:41 AM, Felix.徐 <yg...@gmail.com> wrote:

> Hi, I ran into the same problem once; after I changed the number of
> imported columns, it worked fine. Sometimes Sqoop generates blank rows. I
> don't actually know what the real problem is.









-- 
Regards,
Bhavesh Shah

Re: Showing wrong count after importing table in Hive

Posted by "Felix.徐" <yg...@gmail.com>.
Hi, I ran into the same problem once; after I changed the number of
imported columns, it worked fine. Sometimes Sqoop generates blank rows. I
don't actually know what the real problem is.
