Posted to user@spark.apache.org by Sree Eedupuganti <sr...@inndata.in> on 2016/10/09 08:55:48 UTC
Convert hive sql to spark sql
Hi users, I need to test the performance of a query in both Hive and Spark. Can
anyone convert this SQL to Spark SQL? Here is the SQL.
SELECT split(DTD.TRAN_RMKS,'/')[0] AS TRAB_RMK1,
split(DTD.TRAN_RMKS,'/')[1] AS ATM_ID,
DTD.ACID,
G.FORACID,
DTD.REF_NUM,
DTD.TRAN_ID,
DTD.TRAN_DATE,
DTD.VALUE_DATE,
DTD.TRAN_PARTICULAR,
DTD.TRAN_RMKS,
DTD.TRAN_AMT,
SYSDATE_ORA(),
DTD.PSTD_DATE,
DTD.PSTD_FLG,
G.CUSTID,
NULL AS PROC_FLG,
DTD.PSTD_USER_ID,
DTD.ENTRY_USER_ID,
G.schemecode as SCODE
FROM DAILY_TRAN_DETAIL_TABLE2 DTD
JOIN ods_gam G
ON DTD.ACID = G.ACID
where substr(DTD.TRAN_PARTICULAR,1,3) rlike '(PUR|POS).*'
AND DTD.PART_TRAN_TYPE = 'D'
AND DTD.DEL_FLG <> 'Y'
AND DTD.PSTD_FLG = 'Y'
AND G.schemecode IN
('SBPRV','SBPRS','WSSTF','BGFRN','NREPV','NROPV','BSNRE','BSNRO')
AND (SUBSTR(split(DTD.TRAN_RMKS,'/')[0],1,6) IN
('405997','406228','406229','415527','415528','417917','417918','418210','421539','421572','432198','435736','450502','450503','450504','468805','469190','469191','469192','474856','478286','478287','486292','490222','490223','490254','512932','512932','514833','522346','522352','524458','526106','526701','527114','527479','529608','529615','529616','532731','532734','533102','534680','536132','536610','536621','539149','539158','549751','557654','607118','607407','607445','607529','652189','652190','652157')
OR SUBSTR(split(DTD.TRAN_RMKS,'/')[0],1,8) IN
('53270200','53270201','53270202','60757401','60757402') )
limit 50;
--
Best Regards,
Sreeharsha Eedupuganti
Re: Convert hive sql to spark sql
Posted by Mich Talebzadeh <mi...@gmail.com>.
Ayan is correct. In Spark < 2 you can do
val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> var sqltext =
| """
| select count(1) from prices
| """
sqltext: String =
"
select count(1) from prices
"
scala> HiveContext.sql(sqltext).show
+--------+
|count(1)|
+--------+
| 323|
+--------+
You can put your query between the triple quotes:
var sqltext =
"""
SELECT split(DTD.TRAN_RMKS,'/')[0] AS TRAB_RMK1, .....
"""
HiveContext.sql(sqltext).collect.foreach(println)
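For Spark 2.x, HiveContext is superseded by SparkSession. A minimal sketch of the same approach (assuming Hive support is enabled, the tables exist in the metastore, and the app name is illustrative):

```scala
// Spark 2.x: SparkSession with Hive support replaces HiveContext.
// Assumes a Hive metastore is configured and the tables exist there.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("HiveQueryTest")
  .enableHiveSupport()
  .getOrCreate()

val sqltext =
  """
    |SELECT split(DTD.TRAN_RMKS,'/')[0] AS TRAB_RMK1, .....
  """.stripMargin

spark.sql(sqltext).show(50, truncate = false)
```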
HTH
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
On 9 October 2016 at 09:58, ayan guha <gu...@gmail.com> wrote:
> Have you tried it in Spark? It should work as-is.
>
> On Sun, Oct 9, 2016 at 7:55 PM, Sree Eedupuganti <sr...@inndata.in> wrote:
>
>> [original message and query quoted in full; snipped]
>>
>
>
>
> --
> Best Regards,
> Ayan Guha
>
Re: Convert hive sql to spark sql
Posted by ayan guha <gu...@gmail.com>.
Have you tried it in Spark? It should work as-is.
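One caveat to "it should work as-is": SYSDATE_ORA() in the query looks like a custom (Oracle-style) Hive UDF rather than a built-in, so it would need to be registered in Spark as well. A sketch, assuming Spark 2.x, an existing SparkSession named `spark`, and that the original UDF simply returns the current timestamp; Spark's built-in current_timestamp() is a drop-in alternative:

```scala
// Sketch: register a SYSDATE_ORA equivalent as a Spark UDF, assuming the
// original Hive UDF just returns the current timestamp. Alternatively,
// rewrite SYSDATE_ORA() in the query as the built-in current_timestamp().
spark.udf.register("SYSDATE_ORA",
  () => new java.sql.Timestamp(System.currentTimeMillis()))

spark.sql("SELECT SYSDATE_ORA(), current_timestamp()").show(truncate = false)
```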
On Sun, Oct 9, 2016 at 7:55 PM, Sree Eedupuganti <sr...@inndata.in> wrote:
> [original message and query quoted in full; snipped]
>
--
Best Regards,
Ayan Guha