Posted to commits@cassandra.apache.org by "Colin B. (JIRA)" <ji...@apache.org> on 2014/04/09 23:01:20 UTC

[jira] [Created] (CASSANDRA-7018) cqlsh failing to handle utf8 decode errors in certain cases

Colin B. created CASSANDRA-7018:
-----------------------------------

             Summary: cqlsh failing to handle utf8 decode errors in certain cases
                 Key: CASSANDRA-7018
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7018
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
            Reporter: Colin B.
            Priority: Minor


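When a row returned to cqlsh contains a text value that is not valid UTF-8, some queries print no results at all and show only a decoding error. Whether this happens appears to depend on the type of WHERE clause: an unrestricted select prints the row followed by a warning, while the restricted selects listed below fail outright.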

Preparation:
{code}
[cqlsh 3.1.8 | Cassandra 1.2.16 | CQL spec 3.0.0 | Thrift protocol 19.36.2]
...
cqlsh:ks> CREATE TABLE test_utf8 ( a text PRIMARY KEY, b text );
cqlsh:ks> INSERT INTO test_utf8 (a, b) VALUES (blobAsText(0x3031f6393130), '0110');
cqlsh:ks> select * from test_utf8;

 a           | b
-------------+------
 '01\xf6910' | 0110

Failed to decode value '01\xf6910' (for column 'a') as text: 'utf8' codec can't decode byte 0xf6 in position 2: invalid start byte
cqlsh:ks>
{code}
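
The decode failure itself can be reproduced outside cqlsh. The following is a minimal sketch in Python 3 (cqlsh 3.1.8 itself runs on Python 2, so the codec name in the message differs slightly); the variable names are illustrative and not taken from cqlsh:

```python
# The same six bytes that blobAsText(0x3031f6393130) stores in column 'a'.
raw = bytes.fromhex("3031f6393130")

try:
    raw.decode("utf-8")
    failure = None
except UnicodeDecodeError as err:
    # 0xf6 can never start a UTF-8 sequence, so strict decoding stops here.
    failure = (err.start, err.reason)
    print("decode failed at byte offset %d: %s" % failure)
```

The offset and reason match the "byte 0xf6 in position 2: invalid start byte" part of the message above.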

Actual Results:
{code}
cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);

value '01\xf6910' (in col 'a') can't be deserialized as text: 'utf8' codec can't decode byte 0xf6 in position 2: invalid start byte
cqlsh:ks>
{code}

Expected Results:
{code}
cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);

 a           | b
-------------+------
 '01\xf6910' | 0110

Failed to decode value '01\xf6910' (for column 'a') as text: 'utf8' codec can't decode byte 0xf6 in position 2: invalid start byte
cqlsh:ks>
{code}

Traceback with cqlsh --debug:
{code}
cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);

Traceback (most recent call last):
  File "bin/cqlsh", line 942, in onecmd
    self.handle_statement(st, statementtext)
  File "bin/cqlsh", line 982, in handle_statement
    return self.handle_parse_error(cmdword, tokens, parsed, srcstr)
  File "bin/cqlsh", line 991, in handle_parse_error
    return self.perform_statement(cqlruleset.cql_extract_orig(tokens, srcstr))
  File "bin/cqlsh", line 1031, in perform_statement
    with_default_limit=with_default_limit)
  File "bin/cqlsh", line 1059, in perform_statement_untraced
    self.print_result(self.cursor, with_default_limit)
  File "bin/cqlsh", line 1111, in print_result
    self.print_static_result(cursor)
  File "bin/cqlsh", line 1141, in print_static_result
    formatted_values = [map(self.myformat_value, self.decode_row(cursor, row), cursor.column_types) for row in cursor.result]
  File "bin/cqlsh", line 590, in decode_row
    values.append(cursor.decoder.decode_value(val, vtype, nameinfo[0]))
  File "bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/decoders.py", line 54, in decode_value
    vtype.cql_parameterized_type())
  File "bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/decoders.py", line 33, in value_decode_error
    % (valuebytes, namebytes, expectedtype, err))
ProgrammingError: value '01\xf6910' (in col 'a') can't be deserialized as text: 'utf8' codec can't decode byte 0xf6 in position 2: invalid start byte
cqlsh:ks>
{code}
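
The traceback shows the decode error escaping from decode_row while print_static_result is still building the list of formatted rows, so nothing is printed. A rough sketch of the behaviour described under Expected Results follows: each value is decoded on its own, and a failure produces an escaped fallback plus a warning instead of an exception. This is not the cqlsh or cql driver code. The helper names decode_text and decode_rows are hypothetical, the warning wording only imitates the report, and the sketch is written for Python 3:

```python
def decode_text(raw, column):
    """Return (display_value, warning); warning is None when decoding works."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        # Hypothetical fallback: show the escaped bytes instead of raising.
        shown = repr(raw)[1:]  # drop the leading 'b' of the bytes repr
        warning = "Failed to decode value %s (for column %r) as text: %s" % (
            shown, column, err)
        return shown, warning


def decode_rows(rows, columns):
    """Decode every value, collecting warnings rather than aborting."""
    formatted, warnings = [], []
    for row in rows:
        out = []
        for raw, column in zip(row, columns):
            shown, warning = decode_text(raw, column)
            out.append(shown)
            if warning is not None:
                warnings.append(warning)
        formatted.append(out)
    return formatted, warnings


# One row shaped like the test_utf8 row from the report.
rows = [(bytes.fromhex("3031f6393130"), b"0110")]
formatted, warnings = decode_rows(rows, ["a", "b"])
for row in formatted:
    print(" | ".join(row))
for warning in warnings:
    print(warning)
```

With this shape, one bad value costs a warning line rather than the whole result set, regardless of which WHERE clause produced the row.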

Problematic statements:
{code}
cqlsh:ks> select * from test_utf8 where token(a) > 0;
cqlsh:ks> select * from test_utf8 where a in (blobAsText(0x3031f6393130));
cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);
{code}

Statements that are ok:
{code}
cqlsh:ks> select * from test_utf8 where token(a) < token('qwer');
cqlsh:ks> select * from test_utf8;
{code}


