You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Ashutosh Chauhan <ha...@apache.org> on 2011/05/12 23:03:29 UTC

Review Request: HIVE-2147 : Add api to send / receive message to metastore

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/
-----------------------------------------------------------

Review request for hive and Carl Steinbach.


Summary
-------

Updated patch to include missing ASF license and generated thrift code.


This addresses bug HIVE-2147.
    https://issues.apache.org/jira/browse/HIVE-2147


Diffs
-----

  trunk/metastore/if/hive_metastore.thrift 1102450 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1102450 
  trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1102450 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1102450 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1102450 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1102450 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java PRE-CREATION 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 1102450 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java 1102450 

Diff: https://reviews.apache.org/r/738/diff


Testing
-------

Updated TestMetaStoreEventListener to test new api.


Thanks,

Ashutosh


Re: Review Request: HIVE-2147 : Add api to send / receive message to metastore

Posted by Carl Steinbach <ca...@cloudera.com>.

> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/if/hive_metastore.thrift, line 348
> > <https://reviews.apache.org/r/738/diff/1/?file=18685#file18685line348>
> >
> >     Identifying the message type using an integer seems brittle. This won't work if you have more than one application that is firing events at the metastore.
> 
> Ashutosh Chauhan wrote:
>     There are two other alternatives that I thought of before settling on this one.
>     1) Add specific apis for different message types. This would have made doing this generic api redundant but then this will result in application specific apis in the metastore. E.g., in HCatalog we want to send a message for "set of partitions" telling Metastore to mark them as done. What does finalizePartition() mean in metastore api when Metastore itself is not aware of this concept as this is application specific. This would be confusing.
>     2) Use enums instead of integer. This will result in similar problem as above though on a lower scale. Enums give compile time safety so we have to define them in Metastore code. Defining application specific enums doesnt look like a good idea because of similar reasons.

Another approach would be to serialize the events, and then wrap the serialized event object in a Thrift struct that also contains a field with the fully qualified class name of the event. When the server receives the event struct it would pass the class name to each of the listeners, which then have the option of deserializing the event if they are capable of handling it.

Using qualified event class names would help to eliminate the burden of coordinating the event id namespace, and it would also make it much easier to emit meaningful logging information on the server side.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 3126
> > <https://reviews.apache.org/r/738/diff/1/?file=18694#file18694line3126>
> >
> >     So the event model is that each event may be handled by at most one event handler?
> 
> Ashutosh Chauhan wrote:
>     Yes.

Why not give each listener the opportunity to handle every event? It seems like a pretty conventional use case that you would want to have more than one listener operate on some events. This also seems like a more conventional policy for a pluggable listener framework.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/if/hive_metastore.thrift, line 347
> > <https://reviews.apache.org/r/738/diff/1/?file=18685#file18685line347>
> >
> >     Having separate calls for sending request and response messages looks unnecessary. A sendMessage() function with separate request and response message types should work just as well, and will help to avoid confusion -- otherwise I think people will assume that receiveMessage is a polling call.
> >     
> >     This is starting to look like a general purpose messaging/rpc framework. Is that the intent?
> >
> 
> Ashutosh Chauhan wrote:
>     >> A sendMessage() function with separate request and response message types should work just as well.
>     That is correct. But semantically they are different. In sendMessage() user is just notifying Metastore of an event and is not bothered of return value. recvMessage() user is asking for a response for his message. This distinction is further enforced by return types. We could just have one api sendMessage() for both as you suggested, but having distinct apis for sending and receiving makes it easier for client to understand the semantics.
>     
>     >> This is starting to look like a general purpose messaging/rpc framework. 
>     Well general purpose rpc framework would be much more sophisticated. I am not aiming for that.

>> Well general purpose rpc framework would be much more sophisticated. I am not aiming for that. 

Sorry, I shouldn't have said "general purpose", but it does sound like adding an extensible RPC mechanism is one of the goals of this patch. This seems like a significant departure from the goal of HIVE-2038, which I think was more narrowly focused on adding an event framework, or at least that's what I remember discussing at the contrib meeting. RPCs are quite different from events, and if if we add this feature it shouldn't be conflated with events. I think the RPC modifications should be proposed in a separate ticket, and we should keep the focus in this ticket on adding an event framework.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java, line 86
> > <https://reviews.apache.org/r/738/diff/1/?file=18697#file18697line86>
> >
> >     canProcessSendMessage() looks like a redundant call. Is there any reason that this can't be be rolled into processSendMessage()?
> >
> 
> Ashutosh Chauhan wrote:
>     Event model is every event is handled by atmost one handler. If we roll this in processSendMsg() then we have to make this method return boolean which will tell whether this event got serviced by this handler or not. Then how will it communicate back the actual return value. In case of sendMsg() this is fine, but recvMsg() returns a valid value which is then need to be returned to a client. So, we first ask handler if it can handle the message and then expect a valid return value in processRecvMsg() call.

I wasn't sure before if these changes were supposed to enable RPCs. Now that that's clear, I don't think this code belongs in MetaStoreEventListener. It belongs in MetaStoreGenericRPCHandler, or something like that. Can you please remove the RPC code, and change the names of the event code (e.g. one way communication from client to server) to make it clear that these are events, and not something more generic like RPCs?


- Carl


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/#review713
-----------------------------------------------------------


On 2011-05-12 21:03:29, Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/738/
> -----------------------------------------------------------
> 
> (Updated 2011-05-12 21:03:29)
> 
> 
> Review request for hive and Carl Steinbach.
> 
> 
> Summary
> -------
> 
> Updated patch to include missing ASF license and generated thrift code.
> 
> 
> This addresses bug HIVE-2147.
>     https://issues.apache.org/jira/browse/HIVE-2147
> 
> 
> Diffs
> -----
> 
>   trunk/metastore/if/hive_metastore.thrift 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1102450 
>   trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1102450 
>   trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java PRE-CREATION 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 1102450 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java 1102450 
> 
> Diff: https://reviews.apache.org/r/738/diff
> 
> 
> Testing
> -------
> 
> Updated TestMetaStoreEventListener to test new api.
> 
> 
> Thanks,
> 
> Ashutosh
> 
>


Re: Review Request: HIVE-2147 : Add api to send / receive message to metastore

Posted by Ashutosh Chauhan <ha...@apache.org>.

> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/if/hive_metastore.thrift, line 347
> > <https://reviews.apache.org/r/738/diff/1/?file=18685#file18685line347>
> >
> >     Having separate calls for sending request and response messages looks unnecessary. A sendMessage() function with separate request and response message types should work just as well, and will help to avoid confusion -- otherwise I think people will assume that receiveMessage is a polling call.
> >     
> >     This is starting to look like a general purpose messaging/rpc framework. Is that the intent?
> >

>> A sendMessage() function with separate request and response message types should work just as well.
That is correct. But semantically they are different. In sendMessage() user is just notifying Metastore of an event and is not bothered of return value. recvMessage() user is asking for a response for his message. This distinction is further enforced by return types. We could just have one api sendMessage() for both as you suggested, but having distinct apis for sending and receiving makes it easier for client to understand the semantics.

>> This is starting to look like a general purpose messaging/rpc framework. 
Well general purpose rpc framework would be much more sophisticated. I am not aiming for that. 


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/if/hive_metastore.thrift, line 348
> > <https://reviews.apache.org/r/738/diff/1/?file=18685#file18685line348>
> >
> >     Identifying the message type using an integer seems brittle. This won't work if you have more than one application that is firing events at the metastore.

There are two other alternatives that I thought of before settling on this one.
1) Add specific apis for different message types. This would have made doing this generic api redundant but then this will result in application specific apis in the metastore. E.g., in HCatalog we want to send a message for "set of partitions" telling Metastore to mark them as done. What does finalizePartition() mean in metastore api when Metastore itself is not aware of this concept as this is application specific. This would be confusing.
2) Use enums instead of integer. This will result in similar problem as above though on a lower scale. Enums give compile time safety so we have to define them in Metastore code. Defining application specific enums doesnt look like a good idea because of similar reasons. 


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 3126
> > <https://reviews.apache.org/r/738/diff/1/?file=18694#file18694line3126>
> >
> >     So the event model is that each event may be handled by at most one event handler?

Yes.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 3134
> > <https://reviews.apache.org/r/738/diff/1/?file=18694#file18694line3134>
> >
> >     Please add some DEBUG or TRACE level logging here that indicates which handler consumed a particular event, or if an event was unserviceable.

Will add logging.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 3149
> > <https://reviews.apache.org/r/738/diff/1/?file=18694#file18694line3149>
> >
> >     Semantically this function looks more like "sendRequest" than "receiveMessage" (and "sendMessage" looks more like "fireEvent").

Same as my very first comment.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 3151
> > <https://reviews.apache.org/r/738/diff/1/?file=18694#file18694line3151>
> >
> >     Checkstyle: you need a space between control flow tokens and open parens.
> >

will roll this in.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java, line 640
> > <https://reviews.apache.org/r/738/diff/1/?file=18696#file18696line640>
> >
> >     Nice to have: javadoc.

Will add.


> On 2011-05-25 03:43:30, Carl Steinbach wrote:
> > trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java, line 86
> > <https://reviews.apache.org/r/738/diff/1/?file=18697#file18697line86>
> >
> >     canProcessSendMessage() looks like a redundant call. Is there any reason that this can't be be rolled into processSendMessage()?
> >

Event model is every event is handled by atmost one handler. If we roll this in processSendMsg() then we have to make this method return boolean which will tell whether this event got serviced by this handler or not. Then how will it communicate back the actual return value. In case of sendMsg() this is fine, but recvMsg() returns a valid value which is then need to be returned to a client. So, we first ask handler if it can handle the message and then expect a valid return value in processRecvMsg() call.


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/#review713
-----------------------------------------------------------


On 2011-05-12 21:03:29, Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/738/
> -----------------------------------------------------------
> 
> (Updated 2011-05-12 21:03:29)
> 
> 
> Review request for hive and Carl Steinbach.
> 
> 
> Summary
> -------
> 
> Updated patch to include missing ASF license and generated thrift code.
> 
> 
> This addresses bug HIVE-2147.
>     https://issues.apache.org/jira/browse/HIVE-2147
> 
> 
> Diffs
> -----
> 
>   trunk/metastore/if/hive_metastore.thrift 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1102450 
>   trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1102450 
>   trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java PRE-CREATION 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 1102450 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java 1102450 
> 
> Diff: https://reviews.apache.org/r/738/diff
> 
> 
> Testing
> -------
> 
> Updated TestMetaStoreEventListener to test new api.
> 
> 
> Thanks,
> 
> Ashutosh
> 
>


Re: Review Request: HIVE-2147 : Add api to send / receive message to metastore

Posted by Carl Steinbach <ca...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/#review713
-----------------------------------------------------------



trunk/metastore/if/hive_metastore.thrift
<https://reviews.apache.org/r/738/#comment1428>

    Having separate calls for sending request and response messages looks unnecessary. A sendMessage() function with separate request and response message types should work just as well, and will help to avoid confusion -- otherwise I think people will assume that receiveMessage is a polling call.
    
    This is starting to look like a general purpose messaging/rpc framework. Is that the intent?
    



trunk/metastore/if/hive_metastore.thrift
<https://reviews.apache.org/r/738/#comment1429>

    Identifying the message type using an integer seems brittle. This won't work if you have more than one application that is firing events at the metastore. 



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
<https://reviews.apache.org/r/738/#comment1430>

    So the event model is that each event may be handled by at most one event handler?



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
<https://reviews.apache.org/r/738/#comment1431>

    Please add some DEBUG or TRACE level logging here that indicates which handler consumed a particular event, or if an event was unserviceable.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
<https://reviews.apache.org/r/738/#comment1432>

    Semantically this function looks more like "sendRequest" than "receiveMessage" (and "sendMessage" looks more like "fireEvent"). 



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
<https://reviews.apache.org/r/738/#comment1433>

    Checkstyle: you need a space between control flow tokens and open parens.
    



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
<https://reviews.apache.org/r/738/#comment1434>

    Nice to have: javadoc.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
<https://reviews.apache.org/r/738/#comment1435>

    canProcessSendMessage() looks like a redundant call. Is there any reason that this can't be be rolled into processSendMessage()?
    


- Carl


On 2011-05-12 21:03:29, Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/738/
> -----------------------------------------------------------
> 
> (Updated 2011-05-12 21:03:29)
> 
> 
> Review request for hive and Carl Steinbach.
> 
> 
> Summary
> -------
> 
> Updated patch to include missing ASF license and generated thrift code.
> 
> 
> This addresses bug HIVE-2147.
>     https://issues.apache.org/jira/browse/HIVE-2147
> 
> 
> Diffs
> -----
> 
>   trunk/metastore/if/hive_metastore.thrift 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1102450 
>   trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1102450 
>   trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1102450 
>   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1102450 
>   trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java 1102450 
>   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java PRE-CREATION 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 1102450 
>   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java 1102450 
> 
> Diff: https://reviews.apache.org/r/738/diff
> 
> 
> Testing
> -------
> 
> Updated TestMetaStoreEventListener to test new api.
> 
> 
> Thanks,
> 
> Ashutosh
> 
>