Posted to dev@gora.apache.org by "Kevin Ratnasekera (JIRA)" <ji...@apache.org> on 2019/04/03 09:19:00 UTC

[jira] [Comment Edited] (GORA-411) Add exists(key) to DataStore interface

    [ https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808518#comment-16808518 ] 

Kevin Ratnasekera edited comment on GORA-411 at 4/3/19 9:18 AM:
----------------------------------------------------------------

[~jhnmora000] This is really great :) I would like to add a minor suggestion about the implementation.

I noticed that you have defined exists(K key) and added a default behavior in the DataStoreBase class. I personally think adding this default behavior to DataStoreBase is not needed. Leaving it out would make exists(K key) an abstract method, so the responsibility for providing an implementation falls entirely on the developer who implements the DataStore for a given database or backend.

When you specify the default behavior, it includes complete deserialization, which is not needed. That means it not only transfers all the data over the network, it also creates the Avro data bean from that transferred data. I think it makes more sense to keep this method completely abstract and leave each datastore to provide an implementation (using the database client).

Anyway, for now, in the cases where we cannot do this at the database client level, you can keep the default behavior you added.
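
To make the trade-off concrete, here is a minimal sketch. It does not use Gora's actual classes: SimpleStore, FallbackStore and NativeExistsStore are hypothetical names invented for illustration, and an in-memory map stands in for the backend. It only shows how a get-based fallback deserializes a record to answer a presence check, while a store that overrides exists(key) does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ExistsSketch {

  /** Hypothetical, simplified stand-in for a datastore interface (not Gora's API). */
  interface SimpleStore<K, T> {
    T get(K key);

    /** Fallback: fetches and deserializes the whole record just to test presence. */
    default boolean exists(K key) {
      return get(key) != null;
    }
  }

  /** Backend that relies on the fallback; counts how many records it deserializes. */
  static class FallbackStore implements SimpleStore<String, String> {
    protected final Map<String, byte[]> rows = new HashMap<>();
    int deserializations = 0;

    void put(String key, String value) {
      rows.put(key, value.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String get(String key) {
      byte[] raw = rows.get(key);
      if (raw == null) {
        return null;
      }
      deserializations++; // stands in for building the data bean
      return new String(raw, StandardCharsets.UTF_8);
    }
  }

  /** Backend whose client can answer a presence check without reading the record. */
  static class NativeExistsStore extends FallbackStore {
    @Override
    public boolean exists(String key) {
      return rows.containsKey(key);
    }
  }

  public static void main(String[] args) {
    FallbackStore fallback = new FallbackStore();
    fallback.put("row1", "payload");
    NativeExistsStore nativeStore = new NativeExistsStore();
    nativeStore.put("row1", "payload");

    // prints "true deserializations=1"
    System.out.println(fallback.exists("row1") + " deserializations=" + fallback.deserializations);
    // prints "true deserializations=0"
    System.out.println(nativeStore.exists("row1") + " deserializations=" + nativeStore.deserializations);
  }
}
```

In a real backend the override would call whatever presence check the database client offers; the counter here is only a proxy for the network transfer and bean construction described above.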


> Add exists(key) to DataStore interface
> --------------------------------------
>
>                 Key: GORA-411
>                 URL: https://issues.apache.org/jira/browse/GORA-411
>             Project: Apache Gora
>          Issue Type: Improvement
>          Components: gora-core, storage
>            Reporter: Alfonso Nishikawa
>            Priority: Minor
>             Fix For: 0.9
>
>
> NUTCH-1679 needs to check whether certain rows exist, and they are proposing to use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance, since every column will be fetched.
> Some datastores (like HBase) implement a call that just checks whether a row exists, so no data is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.


