You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@sqoop.apache.org by Thuy Nguyen <th...@gmail.com> on 2013/04/02 19:34:52 UTC

Fwd: sqoop and --direct option

Hi all,

I got error exporting from hive to mysql (without --direct option) as below:

java.lang.NumberFormatException: For input string: "\N"
        at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Long.parseLong(Long.java:410)
        at java.lang.Long.valueOf(Long.java:525)
        at
hive_category_event_type_facts.__loadFromFields(hive_category_event_type_facts.java:314)
        at
hive_category_event_type_facts.parse(hive_category_event_type_facts.java:251)
        at
com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:81)
        at
com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:40)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at
com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.had

and by running swoop with --direct option, '\N' is converted to 0 and thus,
sqoop succeeded.

Could you please advise if --direct option is expected to have that side
effect?

Thanks,
Thuy

Re: Fwd: sqoop and --direct option

Posted by Thuy Nguyen <th...@gmail.com>.
Thanks, Jarek for the insights. And using the suggested options converts
'\N' to null correctly.

Thuy


On Tue, Apr 2, 2013 at 10:56 AM, Jarek Jarcec Cecho <ja...@apache.org>wrote:

> Hi Thuy,
> in case of direct import all your data are passed directly to mysqlimport
> utility that is then doing all the parsing and inserting into your
> database. Without direct mode Sqoop itself will parse data so that they can
> be inserted. Thus slight differences in edge error cases are expected as
> there are different tools involved to process the data.
>
> Sqoop by default assumes that you're using string constant "null" to
> encode NULL values and everything else is assumed to be proper content of
> the column. However Hive is using different string constant "\N" to encode
> the NULL value. You need to tell Sqoop what is the correct substitution
> constant using --input-(non-)string parameters, like:
>
>   sqoop export ... --input-null-string '\\N' --input-null-non-string '\\N'
>
> Please note that the double backslash is needed due to way how this
> parameter is used withing Sqoop. Please check out Sqoop user guide if you
> have further questions [1].
>
> Jarcec
>
> Links:
> 1:
> http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_importing_data_into_hive
>
> On Tue, Apr 02, 2013 at 10:34:52AM -0700, Thuy Nguyen wrote:
> > Hi all,
> >
> > I got error exporting from hive to mysql (without --direct option) as
> below:
> >
> > java.lang.NumberFormatException: For input string: "\N"
> >         at
> >
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> >         at java.lang.Long.parseLong(Long.java:410)
> >         at java.lang.Long.valueOf(Long.java:525)
> >         at
> >
> hive_category_event_type_facts.__loadFromFields(hive_category_event_type_facts.java:314)
> >         at
> >
> hive_category_event_type_facts.parse(hive_category_event_type_facts.java:251)
> >         at
> >
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:81)
> >         at
> >
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:40)
> >         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >         at
> >
> com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
> >         at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at org.apache.had
> >
> > and by running swoop with --direct option, '\N' is converted to 0 and
> thus,
> > sqoop succeeded.
> >
> > Could you please advise if --direct option is expected to have that side
> > effect?
> >
> > Thanks,
> > Thuy
>

Re: Fwd: sqoop and --direct option

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Thuy,
in case of direct import all your data are passed directly to mysqlimport utility that is then doing all the parsing and inserting into your database. Without direct mode Sqoop itself will parse data so that they can be inserted. Thus slight differences in edge error cases are expected as there are different tools involved to process the data.

Sqoop by default assumes that you're using string constant "null" to encode NULL values and everything else is assumed to be proper content of the column. However Hive is using different string constant "\N" to encode the NULL value. You need to tell Sqoop what is the correct substitution constant using --input-(non-)string parameters, like:

  sqoop export ... --input-null-string '\\N' --input-null-non-string '\\N'

Please note that the double backslash is needed due to way how this parameter is used withing Sqoop. Please check out Sqoop user guide if you have further questions [1].

Jarcec

Links:
1: http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_importing_data_into_hive

On Tue, Apr 02, 2013 at 10:34:52AM -0700, Thuy Nguyen wrote:
> Hi all,
> 
> I got error exporting from hive to mysql (without --direct option) as below:
> 
> java.lang.NumberFormatException: For input string: "\N"
>         at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>         at java.lang.Long.parseLong(Long.java:410)
>         at java.lang.Long.valueOf(Long.java:525)
>         at
> hive_category_event_type_facts.__loadFromFields(hive_category_event_type_facts.java:314)
>         at
> hive_category_event_type_facts.parse(hive_category_event_type_facts.java:251)
>         at
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:81)
>         at
> com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:40)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at
> com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.had
> 
> and by running swoop with --direct option, '\N' is converted to 0 and thus,
> sqoop succeeded.
> 
> Could you please advise if --direct option is expected to have that side
> effect?
> 
> Thanks,
> Thuy