You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by Mick Bisignani <mb...@gmail.com> on 2017/09/18 00:59:55 UTC

Apache DRILL v1.11.0 handling Postgres citext columns with Inconsistency

*Hi Everyone, I have found that it is necessary to use a CAST() operation
when selecting from tables that contain citext in postgres tables.*

*i have recently found that the following inconsistency / error *


SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.term   as term1,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at
FROM my_postgres.public.p_keywords  tb1 ;

*note*:  column term is defined as CITEXT



SELECT returns the following

+-------+---------------------------------+-------+--------------------------+--------------------------+
|  id   |              term               | term1 |        created_at
 |        updated_at        |
+-------+---------------------------------+-------+--------------------------+--------------------------+
| 1300  | tall herringbone shirt          | null  | 2017-08-29 09:11:39.261
 | 2017-08-29 09:11:39.261  |
| 1301  | short sleeve herringbone shirt  | null  | 2017-08-29 09:11:39.267
 | 2017-08-29 09:11:39.267  |
| 1302  | slim fit herringbone shirt      | null  | 2017-08-29 09:11:39.274
 | 2017-08-29 09:11:39.274  |
| 1303  | leather leggings                | null  | 2017-08-29 09:11:39.28
  | 2017-08-29 09:11:39.28   |
| 1304  | faux leather leggings           | null  | 2017-08-29 09:11:39.287
 | 2017-08-29 09:11:39.287  |
| 1305  | string bikini bottom            | null  | 2017-08-29 09:11:39.293
 | 2017-08-29 09:11:39.293  |
| 1306  | drawstring bikini bottom        | null  | 2017-08-29 09:11:39.299
 | 2017-08-29 09:11:39.299  |
| 1307  | dress shoes                     | null  | 2017-08-29 09:11:39.306
 | 2017-08-29 09:11:39.306  |
| 1308  | lace up dress shoes             | null  | 2017-08-29 09:11:39.312
 | 2017-08-29 09:11:39.312  |
| 1309  | bowl pendant                    | null  | 2017-08-29 09:11:39.319
 | 2017-08-29 09:11:39.319  |
| 1310  | shoe cabinet                    | null  | 2017-08-29 09:11:39.325
 | 2017-08-29 09:11:39.325  |
| 1311  | shawl collar                    | null  | 2017-08-29 09:11:39.331
 | 2017-08-29 09:11:39.331  |
+-------+---------------------------------+-------+--------------------------+--------------------------+

term1 is null as expected due to the lack of a specific CAST() operation on
the column


When I add another column  (term_count) to the select statement, the first
CAST also fails

SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.term   as term1,
        tb1.term_count  as term_count,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at
FROM my_postgres.public.p_keywords  tb1 ;


CAST(tb1.term AS VARCHAR) not being applied.

+-------+-------+-------+------------+--------------------------+--------------------------+
|  id   | term  | term1 | term_count |        created_at        |
 updated_at        |
+-------+-------+-------+------------+--------------------------+--------------------------+
| 1300  | null  | null  | 3          | 2017-08-29 09:11:39.261  |
2017-08-29 09:11:39.261  |
| 1301  | null  | null  | 4          | 2017-08-29 09:11:39.267  |
2017-08-29 09:11:39.267  |
| 1302  | null  | null  | 4          | 2017-08-29 09:11:39.274  |
2017-08-29 09:11:39.274  |
| 1303  | null  | null  | 2          | 2017-08-29 09:11:39.28   |
2017-08-29 09:11:39.28   |
| 1304  | null  | null  | 3          | 2017-08-29 09:11:39.287  |
2017-08-29 09:11:39.287  |
| 1305  | null  | null  | 3          | 2017-08-29 09:11:39.293  |
2017-08-29 09:11:39.293  |
| 1306  | null  | null  | 3          | 2017-08-29 09:11:39.299  |
2017-08-29 09:11:39.299  |
| 1307  | null  | null  | 2          | 2017-08-29 09:11:39.306  |
2017-08-29 09:11:39.306  |
| 1308  | null  | null  | 4          | 2017-08-29 09:11:39.312  |
2017-08-29 09:11:39.312  |
| 1309  | null  | null  | 2          | 2017-08-29 09:11:39.319  |
2017-08-29 09:11:39.319  |
| 1310  | null  | null  | 2          | 2017-08-29 09:11:39.325  |
2017-08-29 09:11:39.325  |
| 1311  | null  | null  | 2          | 2017-08-29 09:11:39.331  |
2017-08-29 09:11:39.331  |
+-------+-------+-------+------------+--------------------------+--------------------------+


is this a bug in version 1.11.0   ?  Postgres is PostgreSQL 9.6.3 using
jdbc driver version 42.1.4


Thanks

mb

Re: Apache DRILL v1.11.0 handling Postgres citext columns with Inconsistency

Posted by Mick Bisignani <mb...@gmail.com>.
Kunal , no errors  generated or is a stack generated when I run the
commands. ..Just getting "null" for column term when it does not work.

just to recap   , here is what I see

# CITEXT  must be CAST

SELECT  CAST(tb1.id AS INTEGER) AS id,
        tb1.term   as term1,
        CAST(tb1.term AS VARCHAR)  as term
FROM my_postgres.public.p_keywords    tb1 ;

  term1 is NULL
  term is OK



# works
SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at,
        tb1.term_count  as term_count
FROM my_postgres.public.p_keywords    tb1 ;


term is OK



# works  but does not contain  CITEXT column

SELECT  CAST(tb1.id AS INTEGER) AS id,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at,
        tb1.term_count  as term_count,
        tb1.occurance as occurance,
        tb1.top_cats  as top_cats,
        tb1.cats_by_priority  as  cats_by_priority
FROM my_postgres.public.p_keywords    tb1 ;




# does not work -   just added more columns  beyond column term_count
SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at,
        tb1.term_count  as term_count,
        tb1.occurance as occurance,
        tb1.top_cats  as top_cats,
        tb1.cats_by_priority  as  cats_by_priority
FROM my_postgres.public.p_keywords    tb1 ;


term returns  "NULL"


structure of my_postgres.public.p_keywords table

several  columns omitted



     Column       |            Type             |
Modifiers                           | Storage  | Stats target | Description
-------------------+-----------------------------+---------------------------------------------------------------+----------+--------------+-------------
 id                | integer                     | not null default
nextval('product_keywords_id_seq'::regclass) | plain    |              |
 term              | citext                      |
                                      | extended |              |
 term_count        | integer                     |
                                      | plain    |              |
 occurance         | integer                     |
                                      | plain    |              |

 top_cats          | character varying           |
                                      | extended |              |
 cats_by_priority  | character varying           |
                                      | extended |              |

 created_at        | timestamp without time zone | not null
                                     | plain    |              |
 updated_at        | timestamp without time zone | not null
                                     | plain    |              |


Drill Version 1.11.0   ?  Postgres is .6.3 using
jdbc driver version 42.1.4



Cheers



On Mon, Sep 18, 2017 at 2:14 PM, Kunal Khatua <kk...@mapr.com> wrote:

> It's odd that adding just a term_count column is causing an error but the
> other 2 columns (created, updated) don't seem to be... and gets resolved on
> removing the cast.
>
> Can you provide the stack trace and error message? Also, what are the data
> types for the other columns?
>
>
> -----Original Message-----
> From: Mick Bisignani [mailto:mbisignani@gmail.com]
> Sent: Sunday, September 17, 2017 6:00 PM
> To: user@drill.apache.org
> Subject: Apache DRILL v1.11.0 handling Postgres citext columns with
> Inconsistency
>
> *Hi Everyone, I have found that it is necessary to use a CAST() operation
> when selecting from tables that contain citext in postgres tables.*
>
> *i have recently found that the following inconsistency / error *
>
>
> SELECT  CAST(tb1.id AS INTEGER) AS id,
>         CAST(tb1.term AS VARCHAR)  as term,
>         tb1.term   as term1,
>         tb1.created_at as created_at ,
>         tb1.updated_at as updated_at
> FROM my_postgres.public.p_keywords  tb1 ;
>
> *note*:  column term is defined as CITEXT
>
>
>
> SELECT returns the following
>
> +-------+---------------------------------+-------+---------
> -----------------+--------------------------+
> |  id   |              term               | term1 |        created_at
>  |        updated_at        |
> +-------+---------------------------------+-------+---------
> -----------------+--------------------------+
> | 1300  | tall herringbone shirt          | null  | 2017-08-29 09:11:39.261
>  | 2017-08-29 09:11:39.261  |
> | 1301  | short sleeve herringbone shirt  | null  | 2017-08-29
> | 09:11:39.267
>  | 2017-08-29 09:11:39.267  |
> | 1302  | slim fit herringbone shirt      | null  | 2017-08-29 09:11:39.274
>  | 2017-08-29 09:11:39.274  |
> | 1303  | leather leggings                | null  | 2017-08-29 09:11:39.28
>   | 2017-08-29 09:11:39.28   |
> | 1304  | faux leather leggings           | null  | 2017-08-29 09:11:39.287
>  | 2017-08-29 09:11:39.287  |
> | 1305  | string bikini bottom            | null  | 2017-08-29 09:11:39.293
>  | 2017-08-29 09:11:39.293  |
> | 1306  | drawstring bikini bottom        | null  | 2017-08-29 09:11:39.299
>  | 2017-08-29 09:11:39.299  |
> | 1307  | dress shoes                     | null  | 2017-08-29 09:11:39.306
>  | 2017-08-29 09:11:39.306  |
> | 1308  | lace up dress shoes             | null  | 2017-08-29 09:11:39.312
>  | 2017-08-29 09:11:39.312  |
> | 1309  | bowl pendant                    | null  | 2017-08-29 09:11:39.319
>  | 2017-08-29 09:11:39.319  |
> | 1310  | shoe cabinet                    | null  | 2017-08-29 09:11:39.325
>  | 2017-08-29 09:11:39.325  |
> | 1311  | shawl collar                    | null  | 2017-08-29 09:11:39.331
>  | 2017-08-29 09:11:39.331  |
> +-------+---------------------------------+-------+---------
> -----------------+--------------------------+
>
> term1 is null as expected due to the lack of a specific CAST() operation
> on the column
>
>
> When I add another column  (term_count) to the select statement, the first
> CAST also fails
>
> SELECT  CAST(tb1.id AS INTEGER) AS id,
>         CAST(tb1.term AS VARCHAR)  as term,
>         tb1.term   as term1,
>         tb1.term_count  as term_count,
>         tb1.created_at as created_at ,
>         tb1.updated_at as updated_at
> FROM my_postgres.public.p_keywords  tb1 ;
>
>
> CAST(tb1.term AS VARCHAR) not being applied.
>
> +-------+-------+-------+------------+----------------------
> ----+--------------------------+
> |  id   | term  | term1 | term_count |        created_at        |
>  updated_at        |
> +-------+-------+-------+------------+----------------------
> ----+--------------------------+
> | 1300  | null  | null  | 3          | 2017-08-29 09:11:39.261  |
> 2017-08-29 09:11:39.261  |
> | 1301  | null  | null  | 4          | 2017-08-29 09:11:39.267  |
> 2017-08-29 09:11:39.267  |
> | 1302  | null  | null  | 4          | 2017-08-29 09:11:39.274  |
> 2017-08-29 09:11:39.274  |
> | 1303  | null  | null  | 2          | 2017-08-29 09:11:39.28   |
> 2017-08-29 09:11:39.28   |
> | 1304  | null  | null  | 3          | 2017-08-29 09:11:39.287  |
> 2017-08-29 09:11:39.287  |
> | 1305  | null  | null  | 3          | 2017-08-29 09:11:39.293  |
> 2017-08-29 09:11:39.293  |
> | 1306  | null  | null  | 3          | 2017-08-29 09:11:39.299  |
> 2017-08-29 09:11:39.299  |
> | 1307  | null  | null  | 2          | 2017-08-29 09:11:39.306  |
> 2017-08-29 09:11:39.306  |
> | 1308  | null  | null  | 4          | 2017-08-29 09:11:39.312  |
> 2017-08-29 09:11:39.312  |
> | 1309  | null  | null  | 2          | 2017-08-29 09:11:39.319  |
> 2017-08-29 09:11:39.319  |
> | 1310  | null  | null  | 2          | 2017-08-29 09:11:39.325  |
> 2017-08-29 09:11:39.325  |
> | 1311  | null  | null  | 2          | 2017-08-29 09:11:39.331  |
> 2017-08-29 09:11:39.331  |
> +-------+-------+-------+------------+----------------------
> ----+--------------------------+
>
>
> is this a bug in version 1.11.0   ?  Postgres is PostgreSQL 9.6.3 using
> jdbc driver version 42.1.4
>
>
> Thanks
>
> mb
>

RE: Apache DRILL v1.11.0 handling Postgres citext columns with Inconsistency

Posted by Kunal Khatua <kk...@mapr.com>.
It's odd that adding just a term_count column is causing an error but the other 2 columns (created, updated) don't seem to be... and gets resolved on removing the cast.

Can you provide the stack trace and error message? Also, what are the data types for the other columns?


-----Original Message-----
From: Mick Bisignani [mailto:mbisignani@gmail.com] 
Sent: Sunday, September 17, 2017 6:00 PM
To: user@drill.apache.org
Subject: Apache DRILL v1.11.0 handling Postgres citext columns with Inconsistency

*Hi Everyone, I have found that it is necessary to use a CAST() operation when selecting from tables that contain citext in postgres tables.*

*i have recently found that the following inconsistency / error *


SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.term   as term1,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at
FROM my_postgres.public.p_keywords  tb1 ;

*note*:  column term is defined as CITEXT



SELECT returns the following

+-------+---------------------------------+-------+--------------------------+--------------------------+
|  id   |              term               | term1 |        created_at
 |        updated_at        |
+-------+---------------------------------+-------+--------------------------+--------------------------+
| 1300  | tall herringbone shirt          | null  | 2017-08-29 09:11:39.261
 | 2017-08-29 09:11:39.261  |
| 1301  | short sleeve herringbone shirt  | null  | 2017-08-29 
| 09:11:39.267
 | 2017-08-29 09:11:39.267  |
| 1302  | slim fit herringbone shirt      | null  | 2017-08-29 09:11:39.274
 | 2017-08-29 09:11:39.274  |
| 1303  | leather leggings                | null  | 2017-08-29 09:11:39.28
  | 2017-08-29 09:11:39.28   |
| 1304  | faux leather leggings           | null  | 2017-08-29 09:11:39.287
 | 2017-08-29 09:11:39.287  |
| 1305  | string bikini bottom            | null  | 2017-08-29 09:11:39.293
 | 2017-08-29 09:11:39.293  |
| 1306  | drawstring bikini bottom        | null  | 2017-08-29 09:11:39.299
 | 2017-08-29 09:11:39.299  |
| 1307  | dress shoes                     | null  | 2017-08-29 09:11:39.306
 | 2017-08-29 09:11:39.306  |
| 1308  | lace up dress shoes             | null  | 2017-08-29 09:11:39.312
 | 2017-08-29 09:11:39.312  |
| 1309  | bowl pendant                    | null  | 2017-08-29 09:11:39.319
 | 2017-08-29 09:11:39.319  |
| 1310  | shoe cabinet                    | null  | 2017-08-29 09:11:39.325
 | 2017-08-29 09:11:39.325  |
| 1311  | shawl collar                    | null  | 2017-08-29 09:11:39.331
 | 2017-08-29 09:11:39.331  |
+-------+---------------------------------+-------+--------------------------+--------------------------+

term1 is null as expected due to the lack of a specific CAST() operation on the column


When I add another column  (term_count) to the select statement, the first CAST also fails

SELECT  CAST(tb1.id AS INTEGER) AS id,
        CAST(tb1.term AS VARCHAR)  as term,
        tb1.term   as term1,
        tb1.term_count  as term_count,
        tb1.created_at as created_at ,
        tb1.updated_at as updated_at
FROM my_postgres.public.p_keywords  tb1 ;


CAST(tb1.term AS VARCHAR) not being applied.

+-------+-------+-------+------------+--------------------------+--------------------------+
|  id   | term  | term1 | term_count |        created_at        |
 updated_at        |
+-------+-------+-------+------------+--------------------------+--------------------------+
| 1300  | null  | null  | 3          | 2017-08-29 09:11:39.261  |
2017-08-29 09:11:39.261  |
| 1301  | null  | null  | 4          | 2017-08-29 09:11:39.267  |
2017-08-29 09:11:39.267  |
| 1302  | null  | null  | 4          | 2017-08-29 09:11:39.274  |
2017-08-29 09:11:39.274  |
| 1303  | null  | null  | 2          | 2017-08-29 09:11:39.28   |
2017-08-29 09:11:39.28   |
| 1304  | null  | null  | 3          | 2017-08-29 09:11:39.287  |
2017-08-29 09:11:39.287  |
| 1305  | null  | null  | 3          | 2017-08-29 09:11:39.293  |
2017-08-29 09:11:39.293  |
| 1306  | null  | null  | 3          | 2017-08-29 09:11:39.299  |
2017-08-29 09:11:39.299  |
| 1307  | null  | null  | 2          | 2017-08-29 09:11:39.306  |
2017-08-29 09:11:39.306  |
| 1308  | null  | null  | 4          | 2017-08-29 09:11:39.312  |
2017-08-29 09:11:39.312  |
| 1309  | null  | null  | 2          | 2017-08-29 09:11:39.319  |
2017-08-29 09:11:39.319  |
| 1310  | null  | null  | 2          | 2017-08-29 09:11:39.325  |
2017-08-29 09:11:39.325  |
| 1311  | null  | null  | 2          | 2017-08-29 09:11:39.331  |
2017-08-29 09:11:39.331  |
+-------+-------+-------+------------+--------------------------+--------------------------+


is this a bug in version 1.11.0   ?  Postgres is PostgreSQL 9.6.3 using
jdbc driver version 42.1.4


Thanks

mb