Posted to user@hive.apache.org by Ahmed Kamal Abdelfatah <ah...@careem.com> on 2017/01/25 14:36:25 UTC

Scaling HCatalog REST API

Hi folks,

I’m working on a workflow that needs to call an API to fetch Hive table schemas.

Currently, I’m using the HCatalog Templeton (WebHCat) API,
"apiURL:50111/templeton/v1/ddl/database/dbName/table/tableName?user.name=hive", but as the request rate increases (the current peak is 10 requests per second), I get a lot of timeouts from the API. How can I scale this API to get better performance, other than by simply increasing the timeout? Are there any parameters that need tuning? Should this be done at the metastore layer? Thanks a lot.
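
For reference, the request described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names (`table_schema_url`, `fetch_table_schema`), the `http` scheme, and the 10-second default timeout are not from the original message, and `apiURL`, `dbName`, and `tableName` are the same placeholders used in the URL above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def table_schema_url(host, database, table, user="hive", port=50111):
    """Build the WebHCat (Templeton) DDL URL for one table.

    The http scheme and the default port are assumptions taken from the
    example URL in the message; adjust them for your deployment.
    """
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe="")
    )
    return "http://{}:{}{}?{}".format(host, port, path, urlencode({"user.name": user}))


def fetch_table_schema(url, timeout=10.0, opener=urlopen):
    """GET the URL and return the decoded JSON body.

    `opener` is injectable so the function can be exercised without a
    running WebHCat server; by default it is urllib's urlopen.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_table_schema(table_schema_url("apiURL", "dbName", "tableName"))` issues one request per call, so at 10 calls per second every call goes through to the server.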


Regards,


Ahmed Kamal
MTS Engineer in Data Science
Email: ahmed.abdelfatah@careem.com


Re: Scaling HCatalog REST API

Posted by Ahmed Kamal Abdelfatah <ah...@careem.com>.
Thanks a lot. I will try to test it soon.

Regards,


Ahmed Kamal
MTS in Data Science
Email: ahmed.abdelfatah@careem.com


From: Amin Abbaspour <am...@strutdigital.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>
Date: Friday, January 27, 2017 at 2:37 AM
To: "user@hive.apache.org" <us...@hive.apache.org>
Subject: Re: Scaling HCatalog REST API

Hi Ahmed,

I've changed WebHCat to use JDBC connectivity rather than running the hcat command, and it performs much faster.
There is a ticket (https://issues.apache.org/jira/browse/HIVE-15690) and a PR (https://github.com/apache/hive/pull/133) for that. Feel free to test it.

Kind regards,
Amin



--

Amin Abbaspour

Senior AWS Consultant



Phone  +61 450 445 271

Email     amin.abbaspour@strutdigital.com

Website   strutdigital.com

Twitter   @strutdigital<http://twitter.com/strutdigital>

This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.


Re: Scaling HCatalog REST API

Posted by Amin Abbaspour <am...@strutdigital.com>.
Hi Ahmed,

I've changed WebHCat to use JDBC connectivity rather than running the hcat
command, and it performs much faster.
There is a ticket (https://issues.apache.org/jira/browse/HIVE-15690) and a PR
(https://github.com/apache/hive/pull/133) for that. Feel free to test it.
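
To illustrate the general idea of reading a schema over an existing database connection instead of launching the hcat command for every request, here is a minimal Python sketch. It is not the code from the PR, which changes WebHCat itself. It assumes a DB-API style cursor connected to HiveServer2 (for example from a third-party driver, not shown here); the function name `describe_table` and the identifier check are hypothetical.

```python
import re

# Allow only plain identifiers so the names can be placed in the statement safely.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def describe_table(cursor, database, table):
    """Return [(column_name, column_type), ...] for one Hive table.

    `cursor` is assumed to be a DB-API cursor on a HiveServer2 connection.
    """
    for name in (database, table):
        if not _IDENTIFIER.match(name):
            raise ValueError("unexpected identifier: {!r}".format(name))

    cursor.execute("DESCRIBE {}.{}".format(database, table))

    columns = []
    for row in cursor.fetchall():
        col_name, col_type = row[0], row[1]
        # For partitioned tables, DESCRIBE appends a blank row and "# ..."
        # header rows before repeating the partition columns; stop there.
        if not col_name or col_name.startswith("#"):
            break
        columns.append((col_name.strip(), (col_type or "").strip()))
    return columns
```

Because the connection is opened once and reused, each schema lookup costs one statement rather than one process launch, which is the kind of saving the change above is aiming for.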

Kind regards,
Amin






-- 

Amin Abbaspour

Senior AWS Consultant



Phone  +61 450 445 271

Email     amin.abbaspour@strutdigital.com

Website strutdigital.com

Twitter   @strutdigital <http://twitter.com/strutdigital>
