Posted to user@gora.apache.org by Jared Wong <jw...@pinterest.com> on 2015/09/27 08:53:00 UTC

HBaseStore Deletes After Puts?

Hi,

Why does Gora process the deletes after the puts? Could this cause a
problem when processing MAP types, since a MAP field has its column or
column family deleted while its new content is being added?

In put:
if (put.size() > 0) {
  table.put(put);
}
if (delete.size() > 0) {
  table.delete(delete);
  table.delete(delete);
  table.delete(delete); // HBase sometimes does not delete arbitrarily
}

In addPutsAndDeletes:
case MAP:
  // if it's a map that has been modified, then the content should be
  // replaced by the new one
  // This is because we don't know if the content has changed or not.
  if (qualifier == null) {
    delete.deleteFamily(hcol.getFamily());
  } else {
    delete.deleteColumn(hcol.getFamily(), qualifier);
  }
  @SuppressWarnings({ "rawtypes", "unchecked" })
  Set<Entry> set = ((Map) o).entrySet();
  for (@SuppressWarnings("rawtypes") Entry entry : set) {
    byte[] qual = toBytes(entry.getKey());
    addPutsAndDeletes(put, delete, entry.getValue(), schema.getValueType()
        .getType(), schema.getValueType(), hcol, qual);
  }
  break;

https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L253
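
To make the concern concrete, here is a minimal sketch in plain Java. It is a
toy in-memory model of one column family in one row, not HBase client code and
not Gora's implementation: the class MapWriteOrderSketch and its methods are
hypothetical names, timestamps are passed in explicitly, and the only HBase
behavior assumed is that a family delete marker hides every cell whose
timestamp is less than or equal to the marker's timestamp.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a single column family in a single row (hypothetical, not HBase code). */
public class MapWriteOrderSketch {

    // qualifier -> (timestamp -> value); every put adds one version.
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();

    // Timestamp of the newest family delete marker seen so far.
    private long familyTombstone = Long.MIN_VALUE;

    void put(String qualifier, long timestamp, String value) {
        cells.computeIfAbsent(qualifier, q -> new TreeMap<>()).put(timestamp, value);
    }

    // Assumed semantics: the marker hides all cells with timestamp <= marker timestamp.
    void deleteFamily(long timestamp) {
        familyTombstone = Math.max(familyTombstone, timestamp);
    }

    /** Returns the newest visible value of each qualifier. */
    Map<String, String> read() {
        Map<String, String> visible = new TreeMap<>();
        for (Map.Entry<String, TreeMap<Long, String>> e : cells.entrySet()) {
            // If the newest version is hidden, all older versions are hidden too.
            Map.Entry<Long, String> newest = e.getValue().lastEntry();
            if (newest != null && newest.getKey() > familyTombstone) {
                visible.put(e.getKey(), newest.getValue());
            }
        }
        return visible;
    }

    /** The new map entry is written first, then the family delete is stamped later. */
    public static Map<String, String> putThenDelete() {
        MapWriteOrderSketch family = new MapWriteOrderSketch();
        family.put("old", 1L, "stale");   // entry from a previous write
        family.put("new", 2L, "fresh");   // entry of the updated map
        family.deleteFamily(3L);          // marker covers timestamps 1 and 2
        return family.read();
    }

    /** The family delete is stamped first, then the new map entry is written. */
    public static Map<String, String> deleteThenPut() {
        MapWriteOrderSketch family = new MapWriteOrderSketch();
        family.put("old", 1L, "stale");   // entry from a previous write
        family.deleteFamily(2L);          // marker covers timestamp 1 only
        family.put("new", 3L, "fresh");   // newer than the marker, stays visible
        return family.read();
    }

    public static void main(String[] args) {
        System.out.println("put then delete: " + putThenDelete());  // prints {}
        System.out.println("delete then put: " + deleteThenPut());  // prints {new=fresh}
    }
}
```

If the operations are stamped in the order they are sent, putting before
deleting corresponds to putThenDelete() above, where the freshly written map
entries are hidden along with the stale ones.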

Best,
Jared

Re: HBaseStore Deletes After Puts?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
I agree actually.
I've logged the issue
https://issues.apache.org/jira/browse/GORA-449
I'm working on fixing the tests in MemStore. I'll probably pivot to this if
no one else does.

On Thu, Oct 8, 2015 at 5:54 AM, Tim Robertson <ti...@gmail.com>
wrote:

> Hi Lewis,
>
> I am not an HBase developer, but have done a lot of work with HBase over
> the past years.
> It looks rather dubious that there are 3 deletes like that, and it should
> not be necessary unless there is some other race condition going on.
>
> I'd suggest removing those, and if an issue arises I'd be happy to take a
> look.
>
> Cheers,
> Tim


-- 
*Lewis*

Re: HBaseStore Deletes After Puts?

Posted by Tim Robertson <ti...@gmail.com>.
Hi Lewis,

I am not an HBase developer, but have done a lot of work with HBase over
the past years.
It looks rather dubious that there are 3 deletes like that, and it should
not be necessary unless there is some other race condition going on.

I'd suggest removing those, and if an issue arises I'd be happy to take a
look.

Cheers,
Tim




On Thu, Oct 8, 2015 at 2:07 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Yes, this should be handled by the datastore.
> I have a feeling that this was inherent behavior for HBase some time ago
> and that it has since changed and become more robust.
> We should confirm with the HBase devs.

Re: HBaseStore Deletes After Puts?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Yes, this should be handled by the datastore.
I have a feeling that this was inherent behavior for HBase some time ago
and that it has since changed and become more robust.
We should confirm with the HBase devs.

On Thursday, October 8, 2015, Renato Marroquín Mogrovejo <
renatoj.marroquin@gmail.com> wrote:

> Hi Lewis,
>
> I think that one makes sense: we have three fields inserted, and we
> delete the three of them. But it still seems off, and that test being part
> of core makes it even worse, because this is not an HBase-specific test. So
> for every field that we added to the query object, we have to call delete?
> Shouldn't the dataStore do that by itself?

-- 
*Lewis*

Re: HBaseStore Deletes After Puts?

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hi Lewis,

I think that one makes sense: we have three fields inserted, and we delete
the three of them. But it still seems off, and that test being part of core
makes it even worse, because this is not an HBase-specific test. So for
every field that we added to the query object, we have to call delete?
Shouldn't the dataStore do that by itself?

2015-10-08 2:40 GMT+02:00 Lewis John Mcgibbney <le...@gmail.com>:

> Hi Renato,
>
> On Wed, Oct 7, 2015 at 3:31 PM, Renato Marroquín Mogrovejo <
> renatoj.marroquin@gmail.com> wrote:
>
>>  I also don't understand why there are three table deletes.
>>
>
> did you ever notice
>
>
> https://github.com/apache/gora/blob/master/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java#L1041-L1043
>
>
>

Re: HBaseStore Deletes After Puts?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Renato,

On Wed, Oct 7, 2015 at 3:31 PM, Renato Marroquín Mogrovejo <
renatoj.marroquin@gmail.com> wrote:

>  I also don't understand why there are three table deletes.
>

did you ever notice

https://github.com/apache/gora/blob/master/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java#L1041-L1043

Re: HBaseStore Deletes After Puts?

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hi Jared,

I am not very familiar with this part of the code, but I think you might be
right: if we are updating a Map, we should first delete its contents and
then add the new ones, not the other way around. I also don't understand
why there are three table deletes.
Maybe you could play around with a specific test like

https://github.com/apache/gora/blob/master/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java#L535

And maybe we can detect this wrong behaviour. But I do agree with you that
there is something strange in those code lines.
Hope you can take the time to help us debug this one, Jared.


Best,

Renato M.

2015-09-29 20:08 GMT+02:00 Lewis John Mcgibbney <le...@gmail.com>:

> Hi Jared,
> I briefly had a look into this and I am still trying to properly
> understand it.
> Do you have some suggestion(s) as to answering your own questions if
> you've maybe thought about it a bit more?
> Lewis

Re: HBaseStore Deletes After Puts?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Jared,
I briefly had a look into this and I am still trying to properly understand
it.
Do you have some suggestion(s) as to answering your own questions if you've
maybe thought about it a bit more?
Lewis

On Sun, Sep 27, 2015 at 1:53 AM, Jared Wong <jw...@pinterest.com> wrote:

> Hi,
>
> Why does Gora process the deletes after the puts? Could this cause a
> problem when processing MAP types, since a MAP field has its column or
> column family deleted while its new content is being added?
>
> In put:
> if (put.size() > 0) {
>   table.put(put);
> }
> if (delete.size() > 0) {
>   table.delete(delete);
>   table.delete(delete);
>   table.delete(delete); // HBase sometimes does not delete arbitrarily
> }
>
> In addPutsAndDeletes:
> case MAP:
>   // if it's a map that has been modified, then the content should be
>   // replaced by the new one
>   // This is because we don't know if the content has changed or not.
>   if (qualifier == null) {
>     delete.deleteFamily(hcol.getFamily());
>   } else {
>     delete.deleteColumn(hcol.getFamily(), qualifier);
>   }
>   @SuppressWarnings({ "rawtypes", "unchecked" })
>   Set<Entry> set = ((Map) o).entrySet();
>   for (@SuppressWarnings("rawtypes") Entry entry : set) {
>     byte[] qual = toBytes(entry.getKey());
>     addPutsAndDeletes(put, delete, entry.getValue(), schema.getValueType()
>         .getType(), schema.getValueType(), hcol, qual);
>   }
>   break;
>
>
> https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L253
>
> Best,
> Jared
>



-- 
*Lewis*