Posted to user@thrift.apache.org by Matthieu Imbert <ma...@ens-lyon.fr> on 2010/02/11 14:58:46 UTC

per-connexion state with txthrift

Hi,

Is there a way to maintain some per-connection state with a Python Twisted Thrift server?

Currently, my Twisted Thrift server's code lives in a handler that is instantiated only once, when the server is initialized, with code like this:

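  from twisted.internet import reactor
  from thrift.protocol import TBinaryProtocol
  from thrift.transport.TTwisted import ThriftServerFactory

  # MyService is the Thrift-generated module; MyHandler implements its interface.
  factory = ThriftServerFactory(
    processor=MyService.Processor(MyHandler()),
    iprot_factory=TBinaryProtocol.TBinaryProtocolFactory())
  reactor.listenTCP(some_port, factory)
  reactor.run()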

There is only one MyHandler instance, so it is shared by all incoming client connections, and I see no way to maintain per-connection state.

regards,

-- 
Matthieu Imbert <ma...@ens-lyon.fr>
INRIA engineer / SED / GRAAL and RESO teams
http://www.inria.fr http://www.ens-lyon.fr/LIP
+33(0)472728741 / +33(0)437287473
LIP ENS-Lyon, 46 allée d'Italie
69364 Lyon cedex 07, FRANCE

Re: per-connexion state with txthrift

Posted by Matthieu Imbert <ma...@ens-lyon.fr>.
Hi Esteve,

Thanks for your answer. I understand your point, but I also have a few objections:

Esteve Fernandez wrote:
> Hi Matthieu
> 
>> Actually, this issue is not specific to the txthrift (twisted-thrift) Python
>> generated code; it also affects the regular Python generated code, since in both
>> cases the Thrift server handler is shared between all connections and has (to
>> the best of my knowledge) no way to access a connection object.
> 
> Yes, that's true. However, in general it's a bad idea to bind your application
> logic to a particular protocol or transport. For example, the same handler
> could be used with a socket or an HTTP transport.
> 
>> I think this can be a huge problem for a lot of use cases. For example, it
>> seems difficult to implement any form of authentication (at least
>> per-connection authentication, as is usually the case) since it will be
>> impossible to attach any authentication token to the session.
> 
> You can either use Thrift over SSL, or add something like this to your service:
> 
> - authenticate(username, password) # This will return a "Session" object (a
> token) if valid credentials are passed, which contains a public part and a
> private one
> - doSomething(arg1, arg2, public_token, timestamp, nonce, signature) # Where
> public_token is the public part of the session token you received in the
> previous call, timestamp the current time, nonce a random string and signature
> is generated from all the arguments and the secret part of the session token
> 
> This is analogous to OAuth in some sense.
> 
> You may think in terms of conversations (which may require one or more
> connections), instead of one-shot connections.

I agree that binding the application to a specific transport may be a bad idea, but at some point, when dealing with authentication for example, you may not be able to avoid some transport-specific code.

In the authentication example, we can use Thrift over SSL, but then how can we retrieve the user's credentials (their certificate) from the Thrift handler with the current design? For me, these credentials are necessary to apply a permissions profile to the API provided through Thrift.

Currently, I'm using Thrift without SSL, on a connection-oriented transport, with a scheme similar to the one you describe (with authenticate(username, password)), so each function of the Thrift service takes an authentication token as its first parameter. The service API would be prettier without this authentication plumbing (removing it would only be syntactic sugar, I agree).

>> I can also quickly imagine a lot of scenarios where not being able to maintain
>> some per-connection state is an issue. Imagine for example a simple RPN
>> calculator service (with 3 functions: push, pop, add): how can one maintain a
>> per-connection stack?
> 
> You may add a sequence number to every call and check it in the handler. Or
> you may generate random numbers and return them, and the client adds them as
> an argument for the next call, e.g.:
> 
> - client calls Service.push(value, seqid=0) # 0 means that it's starting a new
> conversation
> - the service generates a random number (e.g. a UUID) and returns it to the
> client
> - the client then issues Service.push(value, seqid=the_id_from_the_previous_call)
> - the service checks the seqid argument and checks that it matches an ongoing
> "conversation"
> 
> Of course, this is a very simple case; you may need to add a timestamp so that
> conversations expire.
>
>> I don't know whether other Thrift language bindings suffer from the same
>> limitation. (So far, I've only implemented a server in Python.)
> 
> Thrift is stateless, but you can build stateful applications on top. It's more
> work, because you'll have to implement it in your handler, but it also means
> that Thrift won't force you to maintain state if you don't need to (all your
> calls are stateless).

Building stateless services may be a better alternative than building stateful ones, but sometimes you need stateful services (or perhaps you only need to maintain state for authentication or caching purposes). My point is that although you can do it manually with Thrift (with methods like authenticate returning a token that is then passed to further service calls), it makes the service API far less straightforward. Being able to access and maintain some connection state would make this easier when it is needed, without forcing Thrift to become stateful when it is not.

cheers,

-- 
Matthieu Imbert

Re: per-connexion state with txthrift

Posted by Esteve Fernandez <es...@sindominio.net>.
Hi Matthieu

> Actually, this issue is not specific to the txthrift (twisted-thrift) Python
> generated code; it also affects the regular Python generated code, since in both
> cases the Thrift server handler is shared between all connections and has (to
> the best of my knowledge) no way to access a connection object.

Yes, that's true. However, in general it's a bad idea to bind your application
logic to a particular protocol or transport. For example, the same handler
could be used with a socket or an HTTP transport.

> I think this can be a huge problem for a lot of use cases. For example, it
> seems difficult to implement any form of authentication (at least
> per-connection authentication, as is usually the case) since it will be
> impossible to attach any authentication token to the session.

You can either use Thrift over SSL, or add something like this to your service:

- authenticate(username, password) # This will return a "Session" object (a
token) if valid credentials are passed, which contains a public part and a
private one
- doSomething(arg1, arg2, public_token, timestamp, nonce, signature) # Where
public_token is the public part of the session token you received in the
previous call, timestamp the current time, nonce a random string and signature
is generated from all the arguments and the secret part of the session token

This is analogous to OAuth in some sense.
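
The token-and-signature scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (authenticate, sign, verify), the in-memory users and sessions dictionaries, the use of HMAC-SHA256, and the repr()-based serialization of the arguments are all hypothetical choices, not something Thrift provides.

```python
import hashlib
import hmac
import secrets
import time


def authenticate(username, password, users, sessions):
    """Return (public_token, secret) for valid credentials, else None.

    `users` maps usernames to passwords and `sessions` maps public tokens
    to secrets. Both are plain dicts here; a real service would store
    password hashes rather than passwords.
    """
    if users.get(username) != password:
        return None
    public_token = secrets.token_hex(8)
    secret = secrets.token_hex(32)
    sessions[public_token] = secret
    return public_token, secret


def sign(secret, method, args, public_token, timestamp, nonce):
    """Sign the call with the secret part of the session token."""
    # repr(args) is an assumed canonical encoding, good enough for a sketch.
    message = "|".join([method, repr(args), public_token, str(timestamp), nonce])
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify(sessions, seen_nonces, method, args, public_token, timestamp,
           nonce, signature, now=None, max_age=60):
    """Check token, freshness, nonce reuse and signature for one call."""
    now = time.time() if now is None else now
    secret = sessions.get(public_token)
    if secret is None or abs(now - timestamp) > max_age:
        return False
    if (public_token, nonce) in seen_nonces:
        return False  # replayed call
    expected = sign(secret, method, args, public_token, timestamp, nonce)
    if not hmac.compare_digest(expected, signature):
        return False
    seen_nonces.add((public_token, nonce))
    return True
```

A handler method such as doSomething would call verify() first and reject the call when it returns False. Only the public token travels with each call; the secret is used on both sides to compute the signature.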

You may think in terms of conversations (which may require one or more
connections), instead of one-shot connections.

> I can also quickly imagine a lot of scenarios where not being able to maintain
> some per-connection state is an issue. Imagine for example a simple RPN
> calculator service (with 3 functions: push, pop, add): how can one maintain a
> per-connection stack?

You may add a sequence number to every call and check it in the handler. Or
you may generate random numbers and return them, and the client adds them as
an argument for the next call, e.g.:

- client calls Service.push(value, seqid=0) # 0 means that it's starting a new
conversation
- the service generates a random number (e.g. a UUID) and returns it to the
client
- the client then issues Service.push(value, seqid=the_id_from_the_previous_call)
- the service checks the seqid argument and checks that it matches an ongoing
"conversation"

Of course, this is a very simple case; you may need to add a timestamp so that
conversations expire.
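
For the RPN calculator example, the conversation-id idea above might look something like the sketch below. It is not Thrift-generated code: RpnService, its ttl_seconds and clock parameters, and the use of a UUID hex string as the conversation id are assumptions made for illustration. As in the steps above, seqid=0 starts a new conversation.

```python
import time
import uuid


class RpnService:
    """Hypothetical shared handler keeping one stack per conversation id."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # conversation id -> (stack, time of last use)
        self._conversations = {}

    def _expire(self):
        # Drop conversations that have been idle for longer than the TTL.
        now = self._clock()
        stale = [cid for cid, (_, seen) in self._conversations.items()
                 if now - seen > self._ttl]
        for cid in stale:
            del self._conversations[cid]

    def _stack(self, seqid, allow_new=False):
        self._expire()
        if seqid == 0 and allow_new:
            # 0 means "start a new conversation": generate a fresh id.
            seqid = uuid.uuid4().hex
            self._conversations[seqid] = ([], self._clock())
        elif seqid not in self._conversations:
            raise KeyError("unknown or expired conversation: %r" % (seqid,))
        stack, _ = self._conversations[seqid]
        self._conversations[seqid] = (stack, self._clock())  # refresh timestamp
        return seqid, stack

    def push(self, value, seqid=0):
        seqid, stack = self._stack(seqid, allow_new=True)
        stack.append(value)
        return seqid  # the client passes this id to its next call

    def pop(self, seqid):
        _, stack = self._stack(seqid)
        return stack.pop()

    def add(self, seqid):
        _, stack = self._stack(seqid)
        stack.append(stack.pop() + stack.pop())
        return seqid
```

A client would call push(2) to obtain an id, then push(3, seqid=that_id), add(that_id) and pop(that_id). The single handler instance stays shared; only the dictionary entry is specific to the conversation.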

> I don't know whether other Thrift language bindings suffer from the same
> limitation. (So far, I've only implemented a server in Python.)

Thrift is stateless, but you can build stateful applications on top. It's more
work, because you'll have to implement it in your handler, but it also means
that Thrift won't force you to maintain state if you don't need to (all your
calls are stateless).

Cheers.


Re: per-connexion state with txthrift

Posted by Matthieu Imbert <ma...@ens-lyon.fr>.
Matthieu Imbert wrote:
> It seems to be only half a solution, since I don't see how I can then access this data from the Thrift handler (which is shared by all incoming client connections).
> 
> For example, how could I access this protocol.connection_id from one of the Thrift function handlers of the MyHandler instance in my previous example?

Actually, this issue is not specific to the txthrift (twisted-thrift) Python generated code; it also affects the regular Python generated code, since in both cases the Thrift server handler is shared between all connections and has (to the best of my knowledge) no way to access a connection object.

I think this can be a huge problem for a lot of use cases. For example, it seems difficult to implement any form of authentication (at least per-connection authentication, as is usually the case) since it will be impossible to attach any authentication token to the session.

I can also quickly imagine a lot of scenarios where not being able to maintain some per-connection state is an issue. Imagine for example a simple RPN calculator service (with 3 functions: push, pop, add): how can one maintain a per-connection stack?

I don't know whether other Thrift language bindings suffer from the same limitation. (So far, I've only implemented a server in Python.)

cheers,

-- 
Matthieu

Re: per-connexion state with txthrift

Posted by Matthieu Imbert <ma...@ens-lyon.fr>.
Esteve Fernandez wrote:
> If I understood correctly, this would suffice:
> 
> class MyThriftServerFactory(ThriftServerFactory):
> 
>     def buildProtocol(self, addr):
>         p = self.protocol()
>         p.factory = self
> 
>         # add something to the protocol instance
>         # e.g. a unique id
>         import uuid
>         p.connection_id = uuid.uuid4()
> 
>         return p
> 
> Then use MyThriftServerFactory as you would normally use ThriftServerFactory.

Hi Esteve,

It seems to be only half a solution, since I don't see how I can then access this data from the Thrift handler (which is shared by all incoming client connections).

For example, how could I access this protocol.connection_id from one of the Thrift function handlers of the MyHandler instance in my previous example?
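
To make the question concrete, here is a minimal sketch of what per-connection handlers could look like if the factory created one handler per protocol instance instead of sharing a single one. StubProtocol and PerConnectionFactory are stand-ins written for this illustration, not Thrift or Twisted classes, and RpnHandler is a hypothetical handler. How to wire a per-connection handler into the real Thrift processor is deliberately not shown here.

```python
import uuid


class RpnHandler:
    """Hypothetical handler; one instance is created per connection."""

    def __init__(self, connection_id):
        self.connection_id = connection_id
        self.stack = []  # per-connection state lives on the handler

    def push(self, value):
        self.stack.append(value)

    def pop(self):
        return self.stack.pop()

    def add(self):
        self.stack.append(self.stack.pop() + self.stack.pop())


class StubProtocol:
    """Stand-in for the per-connection protocol object built by a factory."""

    def dispatch(self, method, *args):
        # A real server would decode a Thrift message here; this stub simply
        # forwards the call to the handler attached to this connection.
        return getattr(self.handler, method)(*args)


class PerConnectionFactory:
    """Stand-in factory whose buildProtocol attaches a fresh handler."""

    protocol = StubProtocol

    def __init__(self, handler_class):
        self.handler_class = handler_class

    def buildProtocol(self, addr):
        p = self.protocol()
        p.factory = self
        p.connection_id = uuid.uuid4()
        # The key step: each connection gets its own handler instance.
        p.handler = self.handler_class(p.connection_id)
        return p
```

With this shape, two connections built by the same factory have separate stacks, and each handler can read its own connection_id. The open question in this thread is whether the generated processor can be made to dispatch to such a per-connection handler.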

cheers,

-- 
Matthieu

Re: per-connexion state with txthrift

Posted by Esteve Fernandez <es...@sindominio.net>.
Hi Matthieu

> Is there a way to maintain some per-connection state with a Python Twisted
> Thrift server?

If I understood correctly, this would suffice:


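import uuid

from thrift.transport.TTwisted import ThriftServerFactory


class MyThriftServerFactory(ThriftServerFactory):

    def buildProtocol(self, addr):
        # Same as the default behaviour: one protocol instance per connection.
        p = self.protocol()
        p.factory = self

        # Attach per-connection data to the protocol instance,
        # e.g. a unique id.
        p.connection_id = uuid.uuid4()

        return p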

Then use MyThriftServerFactory as you would normally use ThriftServerFactory.

Cheers.