Posted to user@thrift.apache.org by Kyle B <kb...@gmail.com> on 2013/03/05 00:46:56 UTC

Thrift PHP Async

Hello,

I'm using the Thrift PHP library to make remote Hive calls from PHP. Is
there a way to start an async Hive job and get back the Hadoop job id?

My goal is to start a long-running job that writes its results to a file
in HDFS. I just need a way to kick it off and get back the job id, which I
will then track using the Hadoop JobTracker.

Is this possible? And if not, has anyone come up with a solution to this
problem before?

Thanks,

-Kyle

Re: Thrift PHP Async

Posted by Kyle B <kb...@gmail.com>.
It looks like this is not possible in PHP, so I am experimenting with the
Java Thrift library. I'm trying to figure out how to make an asynchronous,
non-blocking call. I found this sample online, but it doesn't seem to
work. I feel like I'm missing some key piece. Does anyone know why the
following doesn't execute the query? I'm watching the Hive logs, and I
see the connection but not the query:

-----------------------------------------------------------------------
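import java.io.IOException;

import org.apache.hadoop.hive.service.ThriftHive;
import org.apache.thrift.TException;
import org.apache.thrift.async.AsyncMethodCallback;
import org.apache.thrift.async.TAsyncClientManager;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocolFactory;
import org.apache.thrift.transport.TNonblockingSocket;
import org.apache.thrift.transport.TNonblockingTransport;
import org.apache.thrift.transport.TTransportException;

public class Test {
 public static void main(String[] args) {
  Test obj = new Test();
  obj.invoke();
 }

 public void invoke() {
  System.out.println("Invoking.");
  try {
   // Non-blocking socket to the Hive Thrift server.
   TNonblockingTransport transport = new TNonblockingSocket("127.0.0.1", 10000);
   transport.startConnect();
   TAsyncClientManager clientManager = new TAsyncClientManager();
   TProtocolFactory protocolFactory = new TBinaryProtocol.Factory();
   ThriftHive.AsyncClient client = new ThriftHive.AsyncClient(protocolFactory, clientManager, transport);
   System.out.println("Executing data.");
   // Hands the call to the client manager; the outcome is reported to TestCallback.
   client.execute("SELECT * FROM pokes order by foo asc", new TestCallback());
   System.out.println("Executed data.");
  } catch (TTransportException e) {
   System.out.println("TTransportException");
   e.printStackTrace();
  } catch (TException e) {
   System.out.println("TException");
   e.printStackTrace();
  } catch (IOException e) {
   System.out.println("IOException");
   e.printStackTrace();
  }
  System.out.println("Finished");
 }

 // Callback invoked by the client manager when the call completes or fails.
 class TestCallback implements AsyncMethodCallback<ThriftHive.AsyncClient.execute_call> {
  public void onComplete(ThriftHive.AsyncClient.execute_call execute_call) {
   System.out.println("Complete");
  }

  public void onError(Exception e) {
   System.out.println("Error");
  }
 }
}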
-----------------------------------------------------------------------

My output is just "Invoking. Executing data. Executed data. Finished", with
no exceptions thrown. Has anybody ever played with ThriftHive.AsyncClient
to execute an async query?
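
One thing that may be worth ruling out here (a guess, not a confirmed
diagnosis): execute() on the async client appears only to queue the call, so
main() can return before any callback fires. If the client manager's
background thread is a daemon thread, which is an assumption to check against
your Thrift version, the JVM may exit before the query is ever written to the
socket. The sketch below shows the general pattern of blocking on a
CountDownLatch until a callback runs. It does not use Thrift at all:
AsyncWaitSketch, Callback, executeAsync and runAndWait are hypothetical names,
and the daemon thread merely stands in for the client's background thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an async client: the callback runs on a
// daemon thread, so the JVM will not wait for it on its own.
class AsyncWaitSketch {

    interface Callback {
        void onComplete(String result);
        void onError(Exception e);
    }

    // Simulates a non-blocking call: returns immediately, and the work
    // happens later on a background daemon thread.
    static void executeAsync(String query, Callback callback) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(200); // pretend the server takes a while
                callback.onComplete("done: " + query);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                callback.onError(e);
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Issues the call, then blocks until a callback fires or the timeout expires.
    static String runAndWait(String query, long timeoutMillis) throws InterruptedException {
        CountDownLatch finished = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>("timed out");

        executeAsync(query, new Callback() {
            @Override
            public void onComplete(String result) {
                outcome.set(result);
                finished.countDown();
            }

            @Override
            public void onError(Exception e) {
                outcome.set("error: " + e);
                finished.countDown();
            }
        });

        // Without this wait, the caller could return before the callback runs.
        finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return outcome.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndWait("SELECT * FROM pokes", 5000));
    }
}
```

In the Thrift version, the same idea would mean counting the latch down in
TestCallback.onComplete and TestCallback.onError, and awaiting it before
invoke() returns.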

Thanks,

-Kyle
On Wed, Mar 6, 2013 at 10:17 AM, Mike Stanley <mi...@mikestanley.org> wrote:

> This sounds more like a question for Hive users. Thrift is simply the
> technology used as the cross-language service interface. There is
> nothing inherently in Thrift that provides or prevents what you are asking
> for. It sounds like you have a question about the capabilities of the Hive
> service and/or client functionality itself (and not about how Thrift was/is
> used to expose this service).

Re: Thrift PHP Async

Posted by Mike Stanley <mi...@mikestanley.org>.
This sounds more like a question for Hive users. Thrift is simply the
technology used as the cross-language service interface. There is
nothing inherently in Thrift that provides or prevents what you are asking
for. It sounds like you have a question about the capabilities of the Hive
service and/or client functionality itself (and not about how Thrift was/is
used to expose this service).


On Wed, Mar 6, 2013 at 12:09 PM, Kyle B <kb...@gmail.com> wrote:

> Hello Henrique,
>
> Thanks for the reply. When submitting the job using the Thrift PHP library,
> the call is synchronous: it sits there spinning until it finally returns
> the results. What I'd like to do is submit a job to Thrift, get back the
> Hadoop job id, and then track that asynchronously with another task.
>
> Do you know if this is possible using the Thrift PHP libraries?
>
> Thanks,
>
> -Kyle

Re: Thrift PHP Async

Posted by Kyle B <kb...@gmail.com>.
Hello Henrique,

Thanks for the reply. When submitting the job using the Thrift PHP library,
the call is synchronous: it sits there spinning until it finally returns
the results. What I'd like to do is submit a job to Thrift, get back the
Hadoop job id, and then track that asynchronously with another task.
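
For what it's worth, one generic way to keep the caller from spinning is to
push the blocking call onto a background thread. The sketch below shows that
in Java (to match the Java code elsewhere in this thread) using
CompletableFuture. BackgroundCallSketch and blockingExecute are hypothetical
names standing in for the synchronous client call, not real Thrift or Hive
methods, and this approach by itself does not give you the Hadoop job id.

```java
import java.util.concurrent.CompletableFuture;

class BackgroundCallSketch {

    // Hypothetical stand-in for a blocking client call that only
    // returns once the results are ready.
    static String blockingExecute(String query) {
        try {
            Thread.sleep(200); // pretend the job takes a while
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "results of: " + query;
    }

    // Runs the blocking call on a background thread and returns immediately.
    static CompletableFuture<String> executeInBackground(String query) {
        return CompletableFuture.supplyAsync(() -> blockingExecute(query));
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = executeInBackground("SELECT 1");
        System.out.println("submitted, caller is free to do other work");
        // join() blocks only at the point where the result is actually needed.
        System.out.println(pending.join());
    }
}
```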

Do you know if this is possible using the Thrift PHP libraries?

Thanks,

-Kyle

On Tue, Mar 5, 2013 at 7:17 PM, Henrique Mendonça <he...@apache.org> wrote:

> Hi Kyle,
>
> It sounds like you want to use a normal synchronous Thrift service that
> submits the job and returns its id.
> An async call wouldn't return anything until it's done.
>
> Cheers,
> Henrique

Re: Thrift PHP Async

Posted by Henrique Mendonça <he...@apache.org>.
Hi Kyle,

It sounds like you want to use a normal synchronous Thrift service that
submits the job and returns its id.
An async call wouldn't return anything until it's done.
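
As a rough illustration of that shape, here is a minimal Java sketch (Java
only to match the code elsewhere in this thread). It is not Hive's or
Thrift's API: JobServiceSketch, submit, status and the Status values are
made-up names, and the id is a random UUID rather than a Hadoop job id. The
point is just that submit() is a plain blocking call that returns quickly
with an id, while the work continues on a worker thread and can be polled
separately.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical job service: submit() is an ordinary synchronous call that
// returns an id right away, while the job itself runs in the background.
class JobServiceSketch {

    enum Status { RUNNING, SUCCEEDED, FAILED, UNKNOWN }

    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final Map<String, Status> jobs = new ConcurrentHashMap<>();

    // Registers the job, hands it to a worker thread, and returns its id.
    String submit(Runnable job) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, Status.RUNNING);
        workers.execute(() -> {
            try {
                job.run();
                jobs.put(id, Status.SUCCEEDED);
            } catch (RuntimeException e) {
                jobs.put(id, Status.FAILED);
            }
        });
        return id;
    }

    // A second synchronous call that a tracker can poll with the id.
    Status status(String id) {
        return jobs.getOrDefault(id, Status.UNKNOWN);
    }

    // Stops accepting jobs and waits for the running ones to finish.
    void shutdown() throws InterruptedException {
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        JobServiceSketch service = new JobServiceSketch();
        String id = service.submit(() -> System.out.println("job running"));
        System.out.println("submitted " + id);
        service.shutdown();
        System.out.println("final status: " + service.status(id));
    }
}
```

A client would call submit() once, keep the returned id, and call status()
from whatever separate task does the tracking.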

Cheers,
Henrique

On 5 March 2013 00:46, Kyle B <kb...@gmail.com> wrote:

> Hello,
>
> I'm incorporating Thrift for remote Hive calls from PHP, using the Thrift
> PHP library. I was wondering if there is a way to start an async hive job,
> and get back the Hadoop job id?
>
> My goal is to start a long running job, where I will send the results to a
> file in HDFS. I just need a way to kick it off and get back the job id,
> which I will track using the Hadoop jobtracker.
>
> Is this possible? And if not, has anyone come up with a solution to this
> problem before?
>
> Thanks,
>
> -Kyle
>