Posted to user@hive.apache.org by Puneet Khatod <pu...@tavant.com> on 2012/07/18 10:41:15 UTC

Not in clause in hive query

Hi,

I am working on Hive 0.7. I am migrating SQL queries to Hive and facing issues with queries that use the 'NOT IN' clause.

Example:
select * from customer  where cust_id not in (12022,11783);

I am getting:
FAILED: Parse Error: line 1:38 cannot recognize input near ' cust_id ' 'not' 'in' in expression specification.

Is there any alternative available in Hive to replicate the behaviour of the 'IN' and 'NOT IN' clauses?

Regards,
Puneet

From: Saggau, Arne [mailto:Arne.Saggau@ottogroup.com]
Sent: 18 July 2012 12:30
To: user@hive.apache.org
Subject: RE: not able to access Hive Web Interface

Hi,

You have to give the relative path (relative to the Hive home directory) in hive-site.xml.
So try setting hive.hwi.war.file to lib/hive-hwi-0.8.1.war instead of the absolute path.

Regards
Arne

From: yogesh.kumar13@wipro.com [mailto:yogesh.kumar13@wipro.com]
Sent: Wednesday, 18 July 2012 08:41
To: user@hive.apache.org; bejoy_ks@yahoo.com
Subject: RE: not able to access Hive Web Interface
Importance: High

Hi all :-),

I am trying to access the Hive Web Interface but it fails.

I have these changes in hive-site.xml:

************************************************************************************************
<configuration>
    <property>
        <name>hive.hwi.listen.host</name>
        <value>0.0.0.0</value>
        <description>This is the host address the Hive Web Interface will listen on</description>
    </property>

    <property>
        <name>hive.hwi.listen.port</name>
        <value>9999</value>
        <description>This is the port the Hive Web Interface will listen on</description>
    </property>

    <property>
        <name>hive.hwi.war.file</name>
        <value>/HADOOP/hive/lib/hive-hwi-0.8.1.war</value> <!-- /HADOOP/hive is the Hive directory -->
        <description>This is the WAR file with the jsp content for Hive Web Interface</description>
    </property>

</configuration>

 ***********************************************************************************************

I also export the ANT lib like this:

export ANT_LIB=/Yogesh/ant-1.8.4/lib
export PATH=$PATH:$ANT_LIB


Now when I run the command

hive --service hwi

it results in:

12/07/17 18:03:02 INFO hwi.HWIServer: HWI is starting up
12/07/17 18:03:02 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /HADOOP/hive/conf/hive-default.xml
12/07/17 18:03:02 FATAL hwi.HWIServer: HWI WAR file not found at /HADOOP/hive/lib/hive-hwi-0.8.1.war


And if I run

hive --service hwi --help

it results in:

Usage ANT_LIB=XXXX hive --service hwi


However, if I go to the /HADOOP/hive/lib directory, I find that these files are present:

1) hive-hwi-0.8.1.war
2) hive-hwi-0.8.1.jar

What am I doing wrong :-( ?

Please help and suggest.

Greetings
Yogesh Kumar
________________________________
From: Gesli, Nicole [Nicole.Gesli@memorylane.com]
Sent: Wednesday, July 18, 2012 12:50 AM
To: user@hive.apache.org; bejoy_ks@yahoo.com
Subject: Re: DATA UPLOADTION
For the Hive query approach, check the string functions (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringFunctions) or write your own (UDF), if needed. It depends on what you are trying to get. Example:

SELECT TRIM(SUBSTR(data,
                   LOCATE(' this ', LOWER(data)),
                   LOCATE(' that ', LOWER(data)) - LOCATE(' this ', LOWER(data)) + 6)) AS my_string
FROM   log_table
WHERE  LOWER(data) LIKE '%this%and%that%'
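
If a regular expression is easier to maintain, a hedged alternative sketch using Hive's regexp_extract (Java regex; this assumes the goal is the text between "this" and "that"):

-- Capture group 1 is the text between "this" and "that" (non-greedy).
SELECT regexp_extract(LOWER(data), 'this(.*?)that', 1) AS my_string
FROM   log_table
WHERE  LOWER(data) LIKE '%this%and%that%'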


From: Bejoy KS <be...@yahoo.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>, "bejoy_ks@yahoo.com" <be...@yahoo.com>
Date: Monday, July 16, 2012 11:39 PM
To: "user@hive.apache.org" <us...@hive.apache.org>
Subject: Re: DATA UPLOADTION

Hi Yogesh

You can connect reporting tools like Tableau, MicroStrategy, etc. directly with Hive.

If you are looking for static reports based on aggregate data, you can process the data in Hive, move the resultant data into an RDBMS, and use common reporting tools over the same. I know quite a few projects following this model.
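
A minimal sketch of that aggregate-then-export pattern (table, column, and path names are illustrative, not from this thread):

-- Aggregate in Hive and write the result to an HDFS directory;
-- the output files can then be loaded into an RDBMS (e.g. via Sqoop)
-- and read by the reporting tool.
INSERT OVERWRITE DIRECTORY '/tmp/daily_report'
SELECT report_date, COUNT(*) AS events
FROM   log_table
GROUP  BY report_date;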
Regards
Bejoy KS

Sent from handheld, please excuse typos.
________________________________
From: <yo...@wipro.com>
Date: Tue, 17 Jul 2012 06:33:43 +0000
To: <us...@hive.apache.org>; <be...@yahoo.com>
Reply-To: user@hive.apache.org
Subject: RE: DATA UPLOADTION

Thanks Gesli and Bejoy,

I have created tables in Hive and uploaded data into them. I can run queries on them; please suggest how I can generate reports from those tables.

Mr. Gesli,
If I create a table with a single string column, like ( create table Log_table( Data STRING); ), then how can I perform condition-based queries over the data in Log_table?


Thanks & Regards :-)
Yogesh Kumar
________________________________
From: Gesli, Nicole [Nicole.Gesli@memorylane.com]
Sent: Monday, July 16, 2012 11:30 PM
To: user@hive.apache.org; bejoy_ks@yahoo.com
Cc: user@hbase.apache.org
Subject: Re: DATA UPLOADTION
If you are just trying to find certain text in the data files, you only want a bulk process that creates reports once a day or so, and you prefer to use Hive: you can create a table with a single string column. You need to pre-process your data to replace the default column delimiter in your data, or you can define a column delimiter that your data does not contain. That is to make sure that the entire line is assigned to the column and is not cut wherever the column delimiter appears. If your query will be different for each file type (flat files, logs, xls, ...) you can create different partitions for each file type. Dump your files into the table (or table partition) folder(s), or create external table(s) if your data is already in HDFS. You can then do "like" (faster) or "rlike" searches on the table.
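
A minimal sketch of that approach (table, partition, and path names are illustrative, not from this thread):

-- One external table over the raw files, partitioned by file type,
-- using a delimiter the data is assumed not to contain.
CREATE EXTERNAL TABLE raw_data (line STRING)
PARTITIONED BY (file_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
LOCATION '/user/yogesh/raw_data';

ALTER TABLE raw_data ADD PARTITION (file_type='logs')
LOCATION '/user/yogesh/raw_data/logs';

-- LIKE (faster) or RLIKE search over the single string column.
SELECT line
FROM   raw_data
WHERE  file_type = 'logs'
  AND  line RLIKE 'this.*that';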

-Nicole

From: Bejoy KS <be...@yahoo.com>
Reply-To: "user@hive.apache.org" <us...@hive.apache.org>, "bejoy_ks@yahoo.com" <be...@yahoo.com>
Date: Monday, July 16, 2012 12:50 AM
To: "user@hive.apache.org" <us...@hive.apache.org>
Cc: "user@hbase.apache.org" <us...@hbase.apache.org>
Subject: Re: DATA UPLOADTION

Hi Yogesh

If you are looking at indexing and search kinds of operations, you can take a look at Lucene.

Whether you are using Hive or HBase, you cannot do any operation without having a table structure defined for the data. So you need to create tables for each dataset; only then can you go ahead and issue queries and generate reports on that data.
Regards
Bejoy KS

Sent from handheld, please excuse typos.
________________________________
From: <yo...@wipro.com>
Date: Mon, 16 Jul 2012 06:21:15 +0000
To: <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Cc: <us...@hbase.apache.org>
Subject: RE: DATA UPLOADTION

Hello Debarshi,

Please suggest what tool I should use for these operations over Hadoop DFS.

Regards
Yogesh Kumar
________________________________
From: Debarshi Basak [debarshi.basak@tcs.com]
Sent: Monday, July 16, 2012 11:25 AM
To: user@hive.apache.org
Cc: user@hive.apache.org; user@hbase.apache.org
Subject: Re: DATA UPLOADTION
Hive is not the right way to go about it if you are planning to do search-type operations.


Debarshi Basak
Tata Consultancy Services
Mailto: debarshi.basak@tcs.com
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services
Business Solutions
Outsourcing
____________________________________________

----- wrote: -----
To: <us...@hive.apache.org>
From: <yo...@wipro.com>
Date: 07/16/2012 09:11AM
Cc: <us...@hbase.apache.org>
Subject: DATA UPLOADTION
Hi all,

I have data in flat files, log files, images, and .xls files, amounting to many GB.

I need to perform operations like searching and querying over that raw data, and to generate reports.
It is impossible to create tables manually for all of them just to manage them. Is there any other way out, or how can I manage them using Hive or HBase?

Please suggest how I can perform these operations over them. I want to use Hadoop DFS, and the files have already been uploaded to HDFS (single user).


Thanks & Regards
Yogesh Kumar


Re: Not in clause in hive query

Posted by Bejoy Ks <be...@yahoo.com>.
Hi Puneet

The IN clause is available in the latest versions of Hive (I checked in 0.9).

In your scenario, you can reproduce the functionality of the NOT IN clause as:
select * from customer where NOT (cust_id = 12022 OR cust_id = 11783);

Similarly you can do it for the IN clause; just drop the NOT in the above query:
select * from customer where (cust_id = 12022 OR cust_id = 11783);
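
If the excluded values come from another table rather than a literal list (NOT IN with a subquery is also unsupported in Hive 0.7), a LEFT OUTER JOIN anti-join is a common workaround. A minimal sketch, assuming a hypothetical excluded_customers table:

-- Keep only the customers with no match in excluded_customers.
SELECT c.*
FROM   customer c
LEFT OUTER JOIN excluded_customers e
  ON (c.cust_id = e.cust_id)
WHERE  e.cust_id IS NULL;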

Regards
Bejoy KS


