Posted to user@hive.apache.org by Anurag Phadke <ap...@mozilla.com> on 2010/07/31 03:58:04 UTC

Failed with exception java.io.IOException:java.lang.NullPointerException

We are importing Hadoop logs into Hive, but are running into some issues.
Sample log lines:
2010-02-25 14:27:18,000 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:

Query: SELECT * FROM logs_temp;
runs fine for the above log line.

However, for the log lines:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at cm-hadoop01.mozilla.org/10.2.72.53
************************************************************/

Query: SELECT * FROM logs_temp;
Failed with exception java.io.IOException:java.lang.NullPointerException

However, SELECT count(1) FROM logs_temp;
returns 3, which is the correct number of rows.

Table structure given below:
add jar /usr/lib/hive/lib/hive_contrib.jar;
CREATE EXTERNAL TABLE logs_temp(
line_date STRING,
line_time STRING,
message_type STRING,
classname STRING,
message STRING
)

PARTITIONED BY (ds STRING, ts STRING, hn STRING)

ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" =
"^(\\d{4}(?>-\\d{2}){2})\\s((?>\\d{2}[:,]){3}\\d{3})\\s([A-Z]+)\\s([^:]+):\\s(.*)"
)
STORED AS TEXTFILE;



Any idea on what might be going wrong here?

-Anurag


Re: Failed with exception java.io.IOException:java.lang.NullPointerException

Posted by Anurag Phadke <ap...@mozilla.com>.
It's the regex: it fails when it sees a line that doesn't match the pattern, such as (/***************).
Any tips on what can be done to fix this?
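
One possible direction, offered as a rough sketch rather than a confirmed fix: make the timestamp/level/class prefix optional, so that a line without it still matches and ends up whole in the last column. The Python below only illustrates the matching behaviour on the sample lines from this thread. Hive's RegexSerDe uses Java regex, so the atomic groups `(?>...)` are rewritten here as `(?:...)` to run on older Python versions; `TOLERANT` and `parse` are hypothetical names, and the sample lines assume each log record sits on a single line. Whether the SerDe copes with the resulting null columns has not been verified here.

```python
import re

# The thread's pattern with the Hive string escaping removed. Atomic groups
# (?>...) are written as (?:...) here; for these sample lines both forms
# match the same text.
STRICT = re.compile(
    r"^(\d{4}(?:-\d{2}){2})\s((?:\d{2}[:,]){3}\d{3})\s([A-Z]+)\s([^:]+):\s(.*)"
)

# Hypothetical tolerant variant: the structured prefix is optional, so a line
# without it still matches and lands whole in the last group.
TOLERANT = re.compile(
    r"^(?:(\d{4}(?:-\d{2}){2})\s((?:\d{2}[:,]){3}\d{3})\s([A-Z]+)\s([^:]+):\s)?(.*)"
)

# Sample lines from the thread, assumed to be one log record per line.
SAMPLE_LINES = [
    "2010-02-25 14:27:18,000 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:",
    "/************************************************************",
    "SHUTDOWN_MSG: Shutting down TaskTracker at cm-hadoop01.mozilla.org/10.2.72.53",
    "************************************************************/",
]


def parse(pattern, line):
    """Return the captured groups if the whole line matches, else None."""
    m = pattern.fullmatch(line)
    return None if m is None else m.groups()


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        print("strict  :", parse(STRICT, line))
        print("tolerant:", parse(TOLERANT, line))
```

With the strict pattern only the first sample line parses; the three banner lines come back as None, which lines up with the suggestion in this thread that the serde output was null for them. With the tolerant pattern every line parses, and the banner lines get None for the first four groups and the full text in the fifth. In the table definition the backslashes would need doubling again, as in the original "input.regex" value.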


----- Original Message -----
From: Parag Arora <pa...@webaroo.com>
To: hive-user@hadoop.apache.org
Sent: Fri, 30 Jul 2010 23:42:28 -0700 (PDT)
Subject: Re: Failed with exception java.io.IOException:java.lang.NullPointerException
It seems that your serde output must have been null.

Re: Failed with exception java.io.IOException:java.lang.NullPointerException

Posted by Parag Arora <pa...@webaroo.com>.
It seems that your serde output must have been null.


-- 
Parag
http://www.paragarora.com
Phone: +91.8080350130