You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/07/09 17:16:14 UTC

[jira] Created: (CASSANDRA-287) Make iterator-based read code the One True Path

Make iterator-based read code the One True Path
-----------------------------------------------

                 Key: CASSANDRA-287
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt

Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.

This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.

I introduce a new QueryFilter interface, which has 3 main methods:

    /**
     * returns an iterator that returns columns from the given memtable
     * matching the Filter criteria in sorted order.
     */
    public abstract ColumnIterator getMemColumnIterator(Memtable memtable);

    /**
     * returns an iterator that returns columns from the given SSTable
     * matching the Filter criteria in sorted order.
     */
    public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;

    /**
     * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
     * @param superColumn
     */
    public abstract void filterSuperColumn(SuperColumn superColumn);

The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.

If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-287:
-------------------------------------

    Attachment: 0007-fixes.patch

07 fixes points 1 and 2

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729726#action_12729726 ] 

Jun Rao commented on CASSANDRA-287:
-----------------------------------

The new patch looks good.

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729681#action_12729681 ] 

Jonathan Ellis commented on CASSANDRA-287:
------------------------------------------

Actually now that I think of it I did fix a FD leak in 287 already.  In the code

                iter = filter.getSSTableColumnIterator(sstable);
                if (iter.hasNext())
                {
                    returnCF.delete(iter.getColumnFamily());
                    iterators.add(iter);
                }
                else
                {
                    iter.close();
                }

the "else" clause didn't exist before, and since it wasn't added to the list of iterators it wouldn't get closed at the end.

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-287:
-------------------------------------

    Attachment: 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt
                0005-nuke-timefilter.txt
                0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt
                0003-refactor-out-QueryFilter-SliceQueryFilter.txt
                0002-rename-lock_-sstableLock_.txt
                0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729467#action_12729467 ] 

Jonathan Ellis commented on CASSANDRA-287:
------------------------------------------

thanks for having a look, Jun.

1. oops. one too many rebases before submitting.  fixed.
2. fixed; if I'm feeling extra energetic I'll make it an enum
3. hmm, I'll need to take a closer look at this one -- making that change breaks SystemTableTest
4. yes, so I'm cheating a bit here to make the merge easier -- this is partially done in https://issues.apache.org/jira/browse/CASSANDRA-272 and will be completed in another patch
5. IMO it's bad form to @override an abstract method; are there others that I omitted?

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Michael Greene (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Greene updated CASSANDRA-287:
-------------------------------------

    Component/s: Core

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.4
>
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729740#action_12729740 ] 

Jonathan Ellis commented on CASSANDRA-287:
------------------------------------------

committed

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729289#action_12729289 ] 

Jonathan Ellis commented on CASSANDRA-287:
------------------------------------------

(01 and 02 are automated refactorings.)

06
    add IdentityQueryFilter; finish removing IFilter

05
    add TimeQueryFilter and nuke Timefilter

04
    replace namesfilter with NamesQueryFilter.  mv filter code into separate package.

03
    refactor out QueryFilter, SliceQueryFilter

02
    rename lock_ -> sstableLock_

01
    CASSANDRA-287 r/m unused (and dangerous) RowReadCommand


> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-287:
-------------------------------------

    Attachment: 0008-fix-empty-CF-handling-should-always-be-null.patch

08 fixes point 3.  the problem was inconsistent handling of empty CFs -- to be consistent with the rest of cassandra, following the original FB conventions, an empty CF should be represented with a null reference.

(IMO this is the wrong convention to pick but changing that is outside my scope for this ticket.)

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-287:
-------------------------------------

    Comment: was deleted

(was: Actually now that I think of it I did fix a FD leak in 287 already.  In the code

                iter = filter.getSSTableColumnIterator(sstable);
                if (iter.hasNext())
                {
                    returnCF.delete(iter.getColumnFamily());
                    iterators.add(iter);
                }
                else
                {
                    iter.close();
                }

the "else" clause didn't exist before, and since it wasn't added to the list of iterators it wouldn't get closed at the end.)

> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt, 0007-fixes.patch, 0008-fix-empty-CF-handling-should-always-be-null.patch
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-287) Make iterator-based read code the One True Path

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729442#action_12729442 ] 

Jun Rao commented on CASSANDRA-287:
-----------------------------------

Review comments:
1. 0006 patch breaks compilation in unit test (merge conflicts included in the patch)
2. We should remove unused CMD_TYPE_GET_ROW,GET_SLICE AND SLICE_BY_RANGE in ReadCommand and renumber the rest.
3. SystemTable.StorageMetadata, 2nd line, change IdentityQueryFilter  to NamesQueryFilter.
4. What happens to those updateReadStatistics calls in various getRow in Table? Are they just got removed?
5. Need to add @override to all overridden methods in various filters class.


> Make iterator-based read code the One True Path
> -----------------------------------------------
>
>                 Key: CASSANDRA-287
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-287
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>         Attachments: 0001-CASSANDRA-287-r-m-unused-and-dangerous-RowReadComman.txt, 0002-rename-lock_-sstableLock_.txt, 0003-refactor-out-QueryFilter-SliceQueryFilter.txt, 0004-replace-namesfilter-with-NamesQueryFilter.-mv-filter.txt, 0005-nuke-timefilter.txt, 0006-add-IdentityQueryFilter-finish-removing-IFilter.txt
>
>
> Since CASSANDRA-172 we've had two read paths; the old, ad-hoc path based on the faulty assumption that we could skip checking older sstables if we got a hit earlier in the path (fixed in CASSANDRA-223 but still bearing the marks of its origin) and the new iterator-based path.
> This makes all read operations go through the iterator path, which cleans things up enormously and sets the stage for CASSANDRA-139.
> I introduce a new QueryFilter interface, which has 3 main methods:
>     /**
>      * returns an iterator that returns columns from the given memtable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getMemColumnIterator(Memtable memtable);
>     /**
>      * returns an iterator that returns columns from the given SSTable
>      * matching the Filter criteria in sorted order.
>      */
>     public abstract ColumnIterator getSSTableColumnIterator(SSTableReader sstable) throws IOException;
>     /**
>      * subcolumns of a supercolumn are unindexed, so to pick out parts of those we operate in-memory.
>      * @param superColumn
>      */
>     public abstract void filterSuperColumn(SuperColumn superColumn);
> The first two are for pulling out indexed top-level columns, from a memtable or an sstable, respectively.
> If the query is on subcolumns of a supercolumn, which are unindexed, CFS.getColumnFamily does an indexed Name filter on the supercolumn, then asks filterSuperColumn on the primary QueryFilter to pick out the parts the user is requesting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.