Posted to user@phoenix.apache.org by Hemal Parekh <he...@bitscopic.com> on 2015/11/24 22:06:58 UTC

Phoenix 4.4 does not accept null date value in bulk load

Hi,

We recently upgraded our production HDP cluster from 2.2 to 2.3.2. Phoenix
was upgraded from 4.2 to 4.4. Our bulk load script, which uses psql.py and
worked in Phoenix 4.2, stopped working in Phoenix 4.4. Upon investigation,
I found that psql.py fails to upsert a null value into a date column, which
worked fine in Phoenix 4.2. The .csv file has an empty string for the date
column. To rule out an upgrade issue, I created a temp table in Phoenix 4.4
and tried to insert a record using psql.py, but it failed with the error
below for the null date value.

*java.lang.IllegalArgumentException: Invalid format: ""*


create table temp1 (pk varchar, c1 varchar, c2 date, c3 integer, c4 varchar
constraint pk_temp1 primary key (pk))

Values used in .csv file: I ran psql.py separately with two different
records.

p1~abc~~1~x (this one gives error)

p2~abc~2015-11-24 00:00:00~1~x (this one is getting inserted fine)


Bulk load command:

/usr/hdp/current/phoenix-client/bin/psql.py -t TEMP1 -d '~' -s <my host>:2181:/hbase-unsecure temp1_insert.csv


For columns of any type other than date, psql.py can upsert a null value.
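
To see what the loader is being handed, the two sample records can be split on the '~' delimiter. This is a minimal sketch using Python's standard csv module. The column names come from the temp1 table above; the helper names empty_columns and scan are made up for illustration. It only shows that the date field c2 arrives as an empty string in the first record, and does not reproduce Phoenix's own parsing.

```python
import csv
import io

# Column order taken from the temp1 table definition in this post.
COLUMNS = ["pk", "c1", "c2", "c3", "c4"]

# The two sample records from this post, '~'-delimited.
SAMPLE = "p1~abc~~1~x\np2~abc~2015-11-24 00:00:00~1~x\n"


def empty_columns(row):
    """Return the names of columns whose CSV field is an empty string."""
    return [name for name, value in zip(COLUMNS, row) if value == ""]


def scan(text, delimiter="~"):
    """Map each record's primary key to its list of empty columns."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {row[0]: empty_columns(row) for row in reader if row}


if __name__ == "__main__":
    for pk, empties in scan(SAMPLE).items():
        print(pk, "empty columns:", empties)
```

Running the same scan over a real input file (reading it with open() instead of io.StringIO) would list the rows likely to hit the error before a bulk load is attempted.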


Has anyone experienced this issue? Do I need to set a property in
hbase-site.xml to allow null values in date columns?


Thanks,

Hemal Parekh
Senior Data Warehouse Architect
m. 240.449.4396
[image: Bitscopic Inc] <http://bitscopic.com>

Re: Phoenix 4.4 does not accept null date value in bulk load

Posted by Hemal Parekh <he...@bitscopic.com>.
I upgraded to Phoenix 4.5.2 and the bulk load worked, allowing nulls in date
columns.

Thanks a lot!


On Wed, Nov 25, 2015 at 10:00 AM, Gabriel Reid <ga...@gmail.com>
wrote:

> Yes, although it was originally reported because of a CHAR-specific issue,
> the fix was to correct support for null values for all types.
>
> - Gabriel



Re: Phoenix 4.4 does not accept null date value in bulk load

Posted by Gabriel Reid <ga...@gmail.com>.
Yes, although it was originally reported because of a CHAR-specific issue,
the fix was to correct support for null values for all types.

- Gabriel

On Wed, Nov 25, 2015 at 3:37 PM, Hemal Parekh <he...@bitscopic.com> wrote:

> Thanks!
>
> PHOENIX-1277 was primarily about a null issue with the CHAR type. Does the
> patch also take care of date columns?

Re: Phoenix 4.4 does not accept null date value in bulk load

Posted by Hemal Parekh <he...@bitscopic.com>.
Thanks!

PHOENIX-1277 was primarily about a null issue with the CHAR type. Does the
patch also take care of date columns?


On Wed, Nov 25, 2015 at 3:33 AM, Gabriel Reid <ga...@gmail.com>
wrote:

> Indeed, this was a regression. It has since been fixed in PHOENIX-1277
> [1], and is available in Phoenix 4.4.1 and Phoenix 4.5.0.
>
> - Gabriel
>
> 1. https://issues.apache.org/jira/browse/PHOENIX-1277



Re: Phoenix 4.4 does not accept null date value in bulk load

Posted by Gabriel Reid <ga...@gmail.com>.
Indeed, this was a regression. It has since been fixed in PHOENIX-1277 [1],
and is available in Phoenix 4.4.1 and Phoenix 4.5.0.

- Gabriel

1. https://issues.apache.org/jira/browse/PHOENIX-1277
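
For anyone checking an installed version against the releases named above, a rough comparison can be sketched as follows. The function names are hypothetical, and the check rests on an assumption: that the leading major.minor.patch numbers of the version string reflect the upstream Phoenix release. Vendor builds may carry a different set of patches, so treat the result as a hint only.

```python
import re

# First release named in this thread as containing PHOENIX-1277.
# Phoenix 4.5.0 is also named and compares as newer.
FIRST_FIXED = (4, 4, 1)


def parse_version(text):
    """Extract the leading major.minor[.patch] numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognized version: %r" % text)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def includes_fix(text):
    """True if the version is at or above the first release with the fix."""
    return parse_version(text) >= FIRST_FIXED


if __name__ == "__main__":
    for version in ("4.4", "4.4.1", "4.5.2"):
        print(version, includes_fix(version))
```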

On Wed, Nov 25, 2015 at 4:07 AM, 彭昶勳 <cx...@gmail.com> wrote:

> Hi,
>
> In Phoenix 4.3.0 and later, the way a date-type column is converted to an
> object during bulk load was changed.
> If a column is of date type and its value is not null, Phoenix first
> converts the value to bytes.
> In this step, if the value is an empty string (""), it may
> cause an error.
>
> The code is at line 231 of
> org.apache.phoenix.util.csv.CsvUpsertExecutor.java
>
> You can check the difference between Phoenix-4.2.2 and Phoenix-4.3.0 at the
> following links.
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix-core/4.3.0/org/apache/phoenix/util/csv/CsvUpsertExecutor.java?av=f
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix-core/4.2.2/org/apache/phoenix/util/csv/CsvUpsertExecutor.java?av=f
>
>
> Best regards,
> Chang-Syun

Re: Phoenix 4.4 does not accept null date value in bulk load

Posted by 彭昶勳 <cx...@gmail.com>.
Hi,

In Phoenix 4.3.0 and later, the way a date-type column is converted to an
object during bulk load was changed.
If a column is of date type and its value is not null, Phoenix first
converts the value to bytes.
In this step, if the value is an empty string (""), it may
cause an error.

The code is at line 231 of
org.apache.phoenix.util.csv.CsvUpsertExecutor.java

You can check the difference between Phoenix-4.2.2 and Phoenix-4.3.0 at the
following links.
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix-core/4.3.0/org/apache/phoenix/util/csv/CsvUpsertExecutor.java?av=f
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix-core/4.2.2/org/apache/phoenix/util/csv/CsvUpsertExecutor.java?av=f
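
The failure mode described above can be illustrated outside Phoenix. The following is a Python analogy, not the code from CsvUpsertExecutor: it assumes a converter that only checks for null (None) before parsing, so an empty string still reaches the date parser and is rejected. The function names and the date pattern are hypothetical and chosen to match the sample value from the original post.

```python
from datetime import datetime

# Pattern matching the sample value "2015-11-24 00:00:00" from the original post.
DATE_PATTERN = "%Y-%m-%d %H:%M:%S"


def convert_null_check_only(value):
    """Parse unless the value is None; an empty string is still parsed."""
    if value is None:
        return None
    # strptime raises ValueError when given an empty string.
    return datetime.strptime(value, DATE_PATTERN)


def convert_empty_as_null(value):
    """Treat both None and the empty string as a null date."""
    if value is None or value == "":
        return None
    return datetime.strptime(value, DATE_PATTERN)


if __name__ == "__main__":
    print(convert_empty_as_null(""))
    try:
        convert_null_check_only("")
    except ValueError as err:
        print("rejected:", err)
```

The second function shows the general shape of a guard that maps empty CSV fields to null before any type-specific parsing. How the actual Phoenix fix is implemented is not shown in this thread.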


Best regards,
Chang-Syun

