You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by Jilani Shaik <ji...@gmail.com> on 2017/07/03 14:22:12 UTC

Re: Review Request 57499: Sqoop incremental import - NULL column updates are not pulled into HBase table

Hi Anna,

Thanks for reviewing and providing suggestions. 
I have analyzed this further now. When I started I was not aware of the code that much as I am now. so I can incorporate the changes as suggested and do the validation of unit test. Once everything good, I will update the request. 

Thanks,
Jilani

> On Tue, Jun 27, 2017 at 7:29 AM, Anna Szonyi <sz...@cloudera.com> wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57499/#review178977
> -----------------------------------------------------------
> 
> 
> 
> Hi Jilani,
> 
> Thank you for your changes and apologies for the latency! I've reviewed the changes and they look really good! There is one last comment I am making that I want to ask your thoughts on and if we resolve that one I will commit your changes as soon as possible.
> 
> Thanks,
> Anna
> 
> 
> src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java
> Lines 172-200 (original), 174-217 (patched)
> <https://reviews.apache.org/r/57499/#comment253359>
> 
>     Hey Jilani,
> 
>     Instead of this, maybe we could initialize the List at the top of the method and instead of using the booleans, like:
> 
>      List<Mutation> mutationList = new ArrayList<>();
>       ...
>         for (...) {
>              ...
>             Put put = new Put(Bytes.toBytes(rowKey));
>             if (null != val) {
>               if ( val instanceof byte[]) {
>                 put.addColumn(...);
>               } else {
>                 put.addColumn(...);
>               }
>               mutationList.add(put);
>             } else {
>               Delete delete = new Delete(Bytes.toBytes(rowKey));
>               delete.addColumn(...);
>               mutationList.add(delete);
>             }
>           }
>         }
> 
>         return Collections.unmodifiableList(mutationList);
>       }
> 
>     or even use a Stack/Deque if you want to ensure that there is only a single (last) element that we return in either case:
> 
>         Deque<Mutation> mutations = new ArrayDeque<>();
>         ...
>         for (...) {
>           ...
>             Put put = new Put(Bytes.toBytes(rowKey));
>             if (null != val) {
>               if ( val instanceof byte[]) {
>                 put.addColumn(...);
>               } else {
>                 put.addColumn(...);
>               }
>               mutations.addFirst(put);
>             } else {
>               Delete delete = new Delete(Bytes.toBytes(rowKey));
>               delete.addColumn(...)
>               mutations.addFirst(delete);
>             }
>           }
>         }
> 
>         return Collections.singletonList(mutations.peekFirst());
> 
>     Or something similar.
>     This way we don't have to track and check the isDelete and isPop, which makes the code a bit more clean and extensible.
>     Please let me know your thoughts on this!
> 
> 
> - Anna Szonyi
> 
> 
> On June 11, 2017, 8:21 p.m., Jilani Shaik wrote:
> >
> > -----------------------------------------------------------
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/57499/
> > -----------------------------------------------------------
> >
> > (Updated June 11, 2017, 8:21 p.m.)
> >
> >
> > Review request for Sqoop, Boglarka Egyed, Attila Szabo, and Anna Szonyi.
> >
> >
> > Bugs: SQOOP-3149
> >     https://issues.apache.org/jira/browse/SQOOP-3149
> >
> >
> > Repository: sqoop-trunk
> >
> >
> > Description
> > -------
> >
> > HBase delete is added as part of incremental data import, Modified the return type of method which is responsible for holding list of Put objects to List of Mutation objects and at the time of execution of insert or delete using the type of object in Mutation, executed the actaul operation either insert or delete.
> >
> > Jira ticket for the same: https://issues.apache.org/jira/browse/SQOOP-3149
> >
> > Similar ticket to above:  SQOOP-3125
> >
> >
> > Diffs
> > -----
> >
> >   src/java/org/apache/sqoop/hbase/HBasePutProcessor.java fdbe1276
> >   src/java/org/apache/sqoop/hbase/PutTransformer.java 533467e5
> >   src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java 363e1456
> >   src/java/org/apache/sqoop/mapreduce/HBaseBulkImportMapper.java 58ccee7b
> >   src/test/com/cloudera/sqoop/hbase/HBaseImportTest.java fa14a013
> >   src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java a054eb66
> >   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 6310a398
> >
> >
> > Diff: https://reviews.apache.org/r/57499/diff/6/
> >
> >
> > Testing
> > -------
> >
> > Executed with jar prepared with these changes into hadoop cluster and tested bot import and then incremental import.
> >
> >
> > File Attachments
> > ----------------
> >
> > HBase delete support for incremental import
> >   https://reviews.apache.org/media/uploaded/files/2017/03/11/5b1895fd-4c6b-42fa-8a92-4e093153e370__hbase_delete_support_in_incremental_import
> > updated with unit test and based on below suggestions
> >   https://reviews.apache.org/media/uploaded/files/2017/03/22/708f63ba-2d8a-4a47-ab67-c1d2776354fd__hbase_delete_support_in_incremental_unit_test_also
> >
> >
> > Thanks,
> >
> > Jilani Shaik
> >
> >