Posted to dev@sqoop.apache.org by shakun grover <s2...@gmail.com> on 2014/09/24 08:00:18 UTC
Sqoop2 - Sequence Output File has (null) appended with the original values
Hi All,
Whenever I run a Sqoop import using the following job configuration:
Name: test

Database configuration

Schema name: test
Table name: emp
Table SQL statement:
Table column names: name,id
Partition column name: id
Nulls in partition column: true
Boundary query:

Output configuration

Storage type:
0 : HDFS
Choose: 0
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 1
Output directory: /tmp/Seq1/1

Throttling resources

Extractors:
Loaders:
Job was successfully updated with status FINE
It gives me the following output file:
'Tom',1 (null)
''Blue',2 (null)
'James',3 (null)
'Tom',4 (null)
'Erik',5 (null)
I want to know why (null) is appended to each record in the output sequence file.
Any help will be highly appreciated.
Thanks in advance!!
--
Thanks & Regards,
Shakun Grover
Re: Sqoop2 - Sequence Output File has (null) appended with the original values
Posted by Gwen Shapira <gs...@cloudera.com>.
I don't have a good work-around at the moment, but can you open a Jira? I
believe we can and should fix it (by grabbing the PK from the DB and using
it as a key in a sequence file).
Re: Sqoop2 - Sequence Output File has (null) appended with the original values
Posted by shakun grover <s2...@gmail.com>.
Thanks, Jarcec, for your reply.
Yes, I am using a generic tool (hadoop dfs -text) to view the output.
But could you please tell me how I can avoid using the 'value' field of the
sequence file? I am using GenericJdbcConnector to import data from an
RDBMS to HDFS through Sqoop2.
--
Thanks & Regards,
Shakun Grover
Re: Sqoop2 - Sequence Output File has (null) appended with the original values
Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Shakun,
A SequenceFile always contains key-value pairs - that is how the format is defined. However, this doesn’t suit Sqoop, as we consider the entire row to be the “key” and hence we’re not using the “value” field - and that is the null that you’re observing. If you use a generic tool such as (hadoop dfs -text), you will get generic output that includes the value field and hence shows a null string. Simply don’t use the “value” field in your application and you will be good to go!
Jarcec
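For readers who only inspect the imported data through the text dump, the advice above (ignore the "value" field) can be sketched as a small filter. This is a minimal illustration, not part of Sqoop: it assumes that each line printed by hadoop dfs -text has the form key, TAB, value, and that the unused value is rendered as (null), as in the output shown in this thread. The function names and the NULL_VALUE constant are hypothetical.

```python
import sys

# Assumed rendering of the unused SequenceFile value in the text dump.
NULL_VALUE = "(null)"


def strip_null_value(line: str) -> str:
    """Return only the key (the imported row) from one line of text output.

    Assumes the dump separates key and value with a single tab.
    """
    line = line.rstrip("\n")
    key, sep, value = line.rpartition("\t")
    # Strip the last field only when it really is the null placeholder,
    # so rows without a trailing value pass through unchanged.
    if sep and value == NULL_VALUE:
        return key
    return line


def rows(lines):
    """Yield the imported rows with the placeholder value removed."""
    for line in lines:
        if line.strip():  # skip blank lines
            yield strip_null_value(line)


if __name__ == "__main__":
    # Reads the text dump from standard input and prints the rows only.
    for row in rows(sys.stdin):
        print(row)
```

One possible use is to pipe the output of hadoop dfs -text for the files under /tmp/Seq1/1 into this script. An application that reads the SequenceFile directly would instead read each record's key and simply never look at the value.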