Posted to mapreduce-user@hadoop.apache.org by KayVajj <va...@gmail.com> on 2015/07/29 19:37:02 UTC
Fwd: Sqoop Codegen Null String
Sorry, the earlier email I sent didn't show up when I searched for it, so
I'm resending.
Hi,
I have a question about Sqoop codegen. I'm trying to load data from a
DB and have used the codegen tool to generate the Java code. I wanted
null strings and null non-strings to be represented as
--null-string '\\N'
--null-non-string '\\N'
The generated code looks like this (an excerpt from the __loadFromFields
method):
    __cur_str = __it.next();
    if (__cur_str.equals("null")) { this.org_id = null; } else {
      this.org_id = __cur_str;
    }
I was puzzled that, even with the options above explicitly provided, the
generated code still treats the string "null" as the null string, as if I
had not provided them. After some code browsing, I found the code below in
org.apache.sqoop.orm.ClassWriter:
    private void parseNullVal(String javaType, String colName,
        StringBuilder sb) {
      if (javaType.equals("String")) {
        sb.append(" if (__cur_str.equals(\""
            + this.options.getInNullStringValue() + "\")) { this.");
        sb.append(colName);
        sb.append(" = null; } else {\n");
      } else {
        sb.append(" if (__cur_str.equals(\""
            + this.options.getInNullNonStringValue());
        sb.append("\") || __cur_str.length() == 0) { this.");
        sb.append(colName);
        sb.append(" = null; } else {\n");
      }
    }
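To see how the option values end up inside the generated check, here is a minimal standalone sketch of the same string building. NullValSketch and its two null-value parameters are hypothetical stand-ins for ClassWriter and the options getters; this is not Sqoop code, only an illustration of the excerpt above.

```java
// Sketch only: NullValSketch is a hypothetical stand-in, not a Sqoop class.
// The two null-value parameters play the role of
// options.getInNullStringValue() and options.getInNullNonStringValue().
final class NullValSketch {

    // Returns the opening "if" line that would be emitted for one column.
    static String parseNullVal(String javaType, String colName,
                               String inNullString, String inNullNonString) {
        StringBuilder sb = new StringBuilder();
        if (javaType.equals("String")) {
            // String columns: only the configured token maps to null.
            sb.append("if (__cur_str.equals(\"" + inNullString + "\")) { this.");
            sb.append(colName);
            sb.append(" = null; } else {\n");
        } else {
            // Non-string columns: the token or an empty field maps to null.
            sb.append("if (__cur_str.equals(\"" + inNullNonString);
            sb.append("\") || __cur_str.length() == 0) { this.");
            sb.append(colName);
            sb.append(" = null; } else {\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Value "null": reproduces the check from the first generated excerpt.
        System.out.print(parseNullVal("String", "org_id", "null", "null"));
        // Value \\N (two backslashes, then N), as passed on the command line.
        System.out.print(parseNullVal("String", "org_id", "\\\\N", "\\\\N"));
    }
}
```

The point of the sketch is that only the "in" (input) values are consulted when the parsing code is emitted, so the output-side options never reach this check.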
This tells me that __loadFromFields will be generated correctly if I set
the options below:
--input-null-string '\\N'
--input-null-non-string '\\N'
My understanding is that these values need to be set only when writing to
the DB, not when reading from it. I'm not writing to the DB, yet I ended up
setting both sets of options, which resulted in the code below in the
__loadFromFields method of the newly generated code:
    __cur_str = __it.next();
    if (__cur_str.equals("\\N")) { this.org_id = null; } else {
      this.org_id = __cur_str;
    }
Is this a bug?
Thanks
Kay
Re: Sqoop Codegen Null String
Posted by Abraham Elmahrek <ab...@cloudera.com>.
Hey man,
--null-string and --null-non-string are used when serializing for writing
to Hadoop. Check out
https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java#L360
and
https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java#L1326
-Abe
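
The split between the two option pairs can be illustrated with a small round trip: one token is used when a record is serialized, the other when a text field is parsed back. This is a minimal sketch under that reading of the thread; NullTokenRoundTrip, serialize, and parse are hypothetical names, not Sqoop APIs.

```java
// Sketch only: hypothetical helpers, not part of Sqoop.
final class NullTokenRoundTrip {

    // Writing side (role of --null-string): a null field becomes the token.
    static String serialize(String value, String outNullToken) {
        return value == null ? outNullToken : value;
    }

    // Reading side (role of --input-null-string): the token becomes null.
    static String parse(String field, String inNullToken) {
        return field.equals(inNullToken) ? null : field;
    }

    public static void main(String[] args) {
        // "\\N" in Java source is the two-character string: backslash, N.
        String written = serialize(null, "\\N");

        // Same token on both sides: the null survives the round trip.
        System.out.println(parse(written, "\\N") == null);   // prints true

        // Reader still expects "null": the field comes back as a plain string.
        System.out.println(parse(written, "null") == null);  // prints false
    }
}
```

If both sides need to agree on \N, both pairs of options have to be set, which matches what the original poster ended up doing.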