Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/01/24 20:42:39 UTC
[jira] [Commented] (PHOENIX-2169) Illegal data error on UPSERT SELECT and JOIN with salted tables
[ https://issues.apache.org/jira/browse/PHOENIX-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114489#comment-15114489 ]
James Taylor commented on PHOENIX-2169:
---------------------------------------
[~jmahonin] - is this still an issue?
If so, [~ankit.singhal] - would you mind trying to fix?
> Illegal data error on UPSERT SELECT and JOIN with salted tables
> ---------------------------------------------------------------
>
> Key: PHOENIX-2169
> URL: https://issues.apache.org/jira/browse/PHOENIX-2169
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.5.0
> Reporter: Josh Mahonin
> Assignee: Josh Mahonin
> Labels: verify
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2169-bug.patch
>
>
> I'm seeing intermittent failures (~50%) on an UPSERT SELECT query involving a JOIN on salted tables. I haven't been able to create a reproducible test case yet, though I'll keep trying. I believe the same behaviour existed in 4.3.1 as well, so I don't think it's a regression.
> The upsert query itself looks something like this:
> {code}
> UPSERT INTO a(tid, ds, etp, eid, ts, atp, rel, tp, tpid, dt, pro)
> SELECT c.tid,
>        c.ds,
>        c.etp,
>        c.eid,
>        c.dh,
>        0,
>        c.rel,
>        c.tp,
>        c.tpid,
>        current_time(),
>        1.0 / s.th
> FROM e_c c
> JOIN e_s s
>   ON s.tid = c.tid
>  AND s.ds = c.ds
>  AND s.etp = c.etp
>  AND s.eid = c.eid
> WHERE c.tid = 'FOO';
> {code}
> Without the upsert, the query always returns the right data, but with the upsert, it ends up with failures like:
> Error: ERROR 201 (22000): Illegal data. ERROR 201 (22000): Illegal data. Expected length of at least 109 bytes, but had 19 (state=22000,code=201)
> The explain plan looks like:
> {code}
> UPSERT SELECT
> CLIENT 16-CHUNK PARALLEL 16-WAY RANGE SCAN OVER E_C [0,'FOO']
>     SERVER FILTER BY FIRST KEY ONLY
>     PARALLEL INNER-JOIN TABLE 0
>         CLIENT 16-CHUNK PARALLEL 16-WAY FULL SCAN OVER E_S
>     DYNAMIC SERVER FILTER BY (C.TID, C.DS, C.ETP, C.EID) IN ((S.TID, S.DS, S.ETP, S.EID))
> {code}
> {code}
> I'm using SALT_BUCKETS=16 for both tables in the join, and this is a dev environment, so only 1 region server. Note that without salted tables, I have no issue with this query.
> The number of rows in E_C is around 23K, and the number of rows in E_S is 62.
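
For readers unfamiliar with salting, here is a rough sketch of why the plan above shows a 16-way range scan starting at [0,'FOO']. As I understand it, a salted table prepends one byte to each row key, derived from a hash of the key modulo SALT_BUCKETS, so a filter on the leading column has to be probed once per bucket. The hash function and helper names below are stand-ins I made up for illustration; they are not Phoenix's implementation, and this does not attempt to explain the bug itself.

```python
# Illustrative sketch only. Phoenix uses its own internal hash for the salt
# byte; stand_in_hash is an ASSUMPTION and will not match real salt values.
SALT_BUCKETS = 16  # matches SALT_BUCKETS=16 from the report


def stand_in_hash(row_key: bytes) -> int:
    """Hypothetical hash, used only to show the hash-mod-buckets idea."""
    h = 0
    for b in row_key:
        h = (31 * h + b) & 0xFFFFFFFF
    return h


def salted_key(row_key: bytes, buckets: int = SALT_BUCKETS) -> bytes:
    """Prepend a single salt byte (hash of the key modulo the bucket count)."""
    salt = stand_in_hash(row_key) % buckets
    return bytes([salt]) + row_key


def scan_prefixes(leading_value: bytes, buckets: int = SALT_BUCKETS) -> list:
    """One key prefix per bucket for a filter on the leading key column.

    The salt depends on the whole row key, so rows sharing the same leading
    value may sit in any bucket and every bucket has to be scanned.
    """
    return [bytes([salt]) + leading_value for salt in range(buckets)]


if __name__ == "__main__":
    for prefix in scan_prefixes(b"FOO"):
        print(prefix)
```

The first prefix produced is the salt byte 0 followed by 'FOO', which lines up with the [0,'FOO'] range shown in the plan; the remaining 15 prefixes correspond to the other buckets.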
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)