Posted to commits@cassandra.apache.org by "Tyler Hobbs (JIRA)" <ji...@apache.org> on 2012/10/18 18:58:04 UTC

[jira] [Created] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Tyler Hobbs created CASSANDRA-4833:
--------------------------------------

             Summary: get_count with 'count' param between 1024 and ~actual column count fails
                 Key: CASSANDRA-4833
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4833
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.0 beta 1, 1.1.6
            Reporter: Tyler Hobbs


If you run get_count() with the 'count' param of the SliceRange set to a number between 1024 and (approximately) the actual number of columns in the row, something seems to fail silently internally, resulting in a client-side timeout. Using a 'count' param outside of this range (lower or much higher) works just fine.

This seems to affect all of 1.1 and 1.2.0-beta1, but not 1.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4833:
--------------------------------------

    Attachment: 4833-1.1-v2.txt

Thanks for the review, Tyler.

What we want here is for the last page to fetch only the remainder, not a whole page, so we still need requestedCount -= newColumns.

Attaching v2 for this.
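
The remainder arithmetic can be sketched as follows. This is only an illustration, not the patch itself: page_quotas is a hypothetical helper, the page size of 1024 is taken from the discussion in this ticket, and the sketch assumes every page returns exactly as many new columns as it asks for (the one-column overlap between pages is ignored here).

```python
def page_quotas(requested_count, page_size=1024):
    """Return the number of new columns to ask for on each page (hypothetical helper)."""
    quotas = []
    remaining = requested_count
    while remaining > 0:
        quota = min(remaining, page_size)  # the last page gets only the remainder
        quotas.append(quota)
        remaining -= quota  # plays the role of requestedCount -= newColumns
    return quotas


print(page_quotas(2500))  # [1024, 1024, 452]
```

Without the decrement, the last page would ask for a full 1024 columns instead of the 452 that are still needed.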
                

[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4833:
--------------------------------------

    Attachment: 4833-1.1.txt

You are right.
New version attached. I also modified the test to match yours.

get_count pages through the row when the requested count is larger than the page size (which is derived from the average column size and capped at 1024). Each page starts at the last column of the previously fetched page, so a newly fetched page may contain one overlapping column.
When the page size is 1024 and the row has more than 1024 columns, counting with a limit of 1025 always fails: the second page fetches 1 column (1025 - 1024), and that column is the one that was already fetched. The same thing can happen near the actual number of columns in the row.

The attached patch changes paging so that each page fetches at least two columns.
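
The behaviour described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the Cassandra code: the row is modelled as a sorted list of integers, fetch_slice and count_columns are hypothetical names, and the min_fetch parameter stands in for the "fetch at least two columns" change. Where the real paging loop would spin forever, the sketch returns None instead.

```python
PAGE_SIZE = 1024  # page size cap mentioned above


def fetch_slice(row, start, count):
    """Return up to `count` columns of the sorted `row`, starting at `start` (inclusive)."""
    begin = 0 if start is None else row.index(start)
    return row[begin:begin + count]


def count_columns(row, limit, min_fetch=1):
    """Count up to `limit` columns by paging (hypothetical model of get_count).

    Returns None when a full page contains no new column, which is where
    a real paging loop would never terminate.
    """
    counted = 0
    remaining = limit
    last = None  # last column seen so far; the next page starts here
    while remaining > 0:
        fetch = max(min(remaining, PAGE_SIZE), min_fetch)
        raw = fetch_slice(row, last, fetch)
        # Drop the overlapping column that was already counted.
        page = raw[1:] if last is not None and raw and raw[0] == last else raw
        new = page[:remaining]
        counted += len(new)
        remaining -= len(new)
        if len(raw) < fetch:
            return counted  # short page: the row is exhausted
        if not new:
            return None  # only the overlap came back: no progress is possible
        last = new[-1]
    return counted


row = list(range(3050))  # same column count as the repro script
for limit in (1024, 1025, 3051, 2 ** 31):
    print(limit, count_columns(row, limit), count_columns(row, limit, min_fetch=2))
```

In this model, a limit of 1025 or 3051 stalls with min_fetch=1 and completes with min_fetch=2, while 1024 and 2^31 complete either way.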
                


[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4833:
--------------------------------------

    Attachment: 4833-1.1.txt

get_count runs into an infinite loop when the 'count' param is close to a multiple of the page size (1024).
Patch attached with a unit test.
                


[jira] [Commented] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Tyler Hobbs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481483#comment-13481483 ] 

Tyler Hobbs commented on CASSANDRA-4833:
----------------------------------------

The patch needs to be rebased, but I'm +1 on the code changes.
                


[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Tyler Hobbs (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-4833:
-----------------------------------

    Attachment: 4833-get-count-repro.py

Attached script reproduces the issue with pycassa.
                


[jira] [Commented] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Tyler Hobbs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479358#comment-13479358 ] 

Tyler Hobbs commented on CASSANDRA-4833:
----------------------------------------

I tested out the patch, and although the infinite loop isn't hit, the resulting count numbers are off.  For example, the repro script has 3050 columns, and when run produces these counts (manually edited for clarity):

{noformat}
specified count=1024: 1024 expected, got 1024
specified count=2^31: 3050 expected, got 2047
specified count=4000: 3050 expected, got 2047
specified count=3051: 3050 expected, got 2047
specified count=1025: 1025 expected, got 1024
{noformat}
                


[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4833:
--------------------------------------

    Attachment:     (was: 4833-1.1.txt)
    


[jira] [Commented] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Tyler Hobbs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480532#comment-13480532 ] 

Tyler Hobbs commented on CASSANDRA-4833:
----------------------------------------

The latest patch fixes the issue and passes all of the pycassa tests.

One comment on this conditional:
{code}
    if (requestedCount == 0 || columns.size() < predicate.slice_range.count)
        break;
{code}

Since you're no longer decrementing requestedCount, the first half of the disjunction isn't needed. If the user actually set a requestedCount of 0, the first column slice would be empty, so we wouldn't get this far.

Other than that, I'm +1 on the changes.
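
To illustrate the point about the conditional, here is a rough sketch, assuming a row modelled as a Python list and a hypothetical count_by_pages helper; it is not the patched Java code. It ignores the overall limit and only shows two things: the short-page check alone is enough to end the loop, and a count of 0 gives an empty first slice, so the check is never reached.

```python
def count_by_pages(row, slice_count):
    """Count all columns in `row`, paging with a one-column overlap (hypothetical model)."""
    columns = row[:slice_count]
    if not columns:
        return 0  # empty first slice (e.g. slice_count == 0): the paging loop is never reached
    total = len(columns)
    position = total - 1  # index of the last column counted so far
    asked = slice_count
    while len(columns) >= asked:  # a short page means the row is exhausted
        asked = max(slice_count, 2)  # a one-column page would hold only the overlap
        columns = row[position:position + asked]
        new = len(columns) - 1  # the first column repeats the previous page's last one
        total += new
        position += new
    return total


print(count_by_pages(list(range(3050)), 1024))
print(count_by_pages(list(range(3050)), 0))
```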
                


[jira] [Updated] (CASSANDRA-4833) get_count with 'count' param between 1024 and ~actual column count fails

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4833:
--------------------------------------

    Assignee: Yuki Morishita
    
