You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Philip Martin <ph...@codematters.co.uk> on 2012/12/07 18:54:16 UTC

Literals in wc SQLite queries

Columns such as nodes.kind, nodes.presence, etc. have strings that
should be one of a discrete set of values.  When we bind these columns
in C code we use something like:

    svn_sqlite__bindf("t", presence_map, svn_wc__db_status_normal);

This means we only use known values (svn_wc__db_status_normal) and the
map converts it to the correct discrete string.  This checking happens
at build time.

We also have queries where the strings are defined as literals in
wc-queries.sql like:

    DELETE FROM nodes
    WHERE wc_id = ?1
      AND IS_STRICT_DESCENDANT_OF(local_relpath, ?2)
      AND (op_depth < ?3
           OR (op_depth = ?3 AND presence = 'base-deleted'))

There is no checking of these literals to catch errors such as
'base-delete'.

I've been thinking that transform_sql.py should do some checking.
Perhaps we could move the maps into a know header, annotate them:

    { "base-deleted", svn_wc__db_status_base_deleted },  /* MAP_DELETED */

and then have transform_sql.py parse the header and convert:

      OR (op_depth = ?3 AND presence = MAP_DELETED))

into

      OR (op_depth = ?3 AND presence = 'base-deleted'))

-- 
Philip

Re: Literals in wc SQLite queries

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> I see no reason why we should not introduce a depth map and use
> svn_sqlite__column_token instead.

For both 1.7 and 1.8 an excluded file has null depth:

$ rm -rf wc && svn co file://`pwd`/repo wc
$ svn up --set-depth exclude wc/A/f
$ sqlite3 -nullvalue - wc/.svn/wc.db "select depth from nodes where local_relpath='A/f'"
-

Directories are different:

$ svnadmin create repo
$ svn import -mm repo/format file://`pwd`/repo/A/f
$ svn1.7 co file://`pwd`/repo wc
$ svn1.7 up --set-depth exclude wc/A
$ sqlite3 wc/.svn/wc.db "select depth from nodes where local_relpath='A'"
unknown
$ rm -rf wc && svn1.8 co file://`pwd`/repo wc
$ svn1.8 up --set-depth exclude wc/A
$ sqlite3 wc/.svn/wc.db "select depth from nodes where local_relpath='A'"
infinity

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download

Re: Literals in wc SQLite queries

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> Daniel Shahaf <da...@elego.de> writes:
>
>> Philip Martin wrote on Mon, Dec 10, 2012 at 17:49:44 +0000:
>>> Daniel Shahaf <da...@elego.de> writes:
>>> 
>>> > Philip - perhaps you can move the relevant definitions to that header?
>>> > Then I'll follow up with a transform_sql.py patch.
>>> 
>>> OK, I've done that.  I didn't annotate the maps or elements in any way.
>>
>> Implemented.  Please note the question embedded in the r1419771 log
>> message.  (The code added in r1419817 might be relevant, too, in case
>> both the node-kind and the depth maps have an "unknown" map member.)
>
> 'unknown' for depth corresponds to svn_depth_unknown while 'unknown' for
> kind corresponds to svn_kind_unknown.  As C enums these are different
> values.
>
> You changed STMT_HAS_SPARSE_NODES to use the kind mapping for depth.
> This will work since the numeric values are not involved, but it feels
> wrong to me: we should introduce a svn_depth_t map.

The reason we don't currently have a depth map is that we are reusing
svn_depth_from_word which converts command-line literals to an enum.
I see no reason why we should not introduce a depth map and use
svn_sqlite__column_token instead.

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download

Re: Literals in wc SQLite queries

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> Philip Martin wrote on Mon, Dec 10, 2012 at 17:49:44 +0000:
>> Daniel Shahaf <da...@elego.de> writes:
>> 
>> > Philip - perhaps you can move the relevant definitions to that header?
>> > Then I'll follow up with a transform_sql.py patch.
>> 
>> OK, I've done that.  I didn't annotate the maps or elements in any way.
>
> Implemented.  Please note the question embedded in the r1419771 log
> message.  (The code added in r1419817 might be relevant, too, in case
> both the node-kind and the depth maps have an "unknown" map member.)

'unknown' for depth corresponds to svn_depth_unknown while 'unknown' for
kind corresponds to svn_kind_unknown.  As C enums these are different
values.

You changed STMT_HAS_SPARSE_NODES to use the kind mapping for depth.
This will work since the numeric values are not involved, but it feels
wrong to me: we should introduce a svn_depth_t map.

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download

Re: Literals in wc SQLite queries

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Dec 10, 2012 at 17:49:44 +0000:
> Daniel Shahaf <da...@elego.de> writes:
> 
> > Philip - perhaps you can move the relevant definitions to that header?
> > Then I'll follow up with a transform_sql.py patch.
> 
> OK, I've done that.  I didn't annotate the maps or elements in any way.

Implemented.  Please note the question embedded in the r1419771 log
message.  (The code added in r1419817 might be relevant, too, in case
both the node-kind and the depth maps have an "unknown" map member.)

Re: Literals in wc SQLite queries

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> Philip - perhaps you can move the relevant definitions to that header?
> Then I'll follow up with a transform_sql.py patch.

OK, I've done that.  I didn't annotate the maps or elements in any way.

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download

Re: Literals in wc SQLite queries

Posted by Daniel Shahaf <da...@elego.de>.
As I said on IRC, happy to look into this, my main question is how
transform_sql.py would know what files to scan for the MAP_DELETED <->
'base-deleted' mappings.

Do we want a header file with a well-known name
(subversion/include/private/)?  Maybe in the same directory as the
source .sql file?  Maybe the .sql file should have a directive pointing
to the map file?

Among these I prefer the second one, i.e., 
subversion/libsvn_wc/wc-queries.sql -> subversion/libsvn_wc/token-maps.h

Philip - perhaps you can move the relevant definitions to that header?
Then I'll follow up with a transform_sql.py patch.

Philip Martin wrote on Fri, Dec 07, 2012 at 17:54:16 +0000:
> Columns such as nodes.kind, nodes.presence, etc. have strings that
> should be one of a discrete set of values.  When we bind these columns
> in C code we use something like:
> 
>     svn_sqlite__bindf("t", presence_map, svn_wc__db_status_normal);
> 
> This means we only use known values (svn_wc__db_status_normal) and the
> map converts it to the correct discrete string.  This checking happens
> at build time.
> 
> We also have queries where the strings are defined as literals in
> wc-queries.sql like:
> 
>     DELETE FROM nodes
>     WHERE wc_id = ?1
>       AND IS_STRICT_DESCENDANT_OF(local_relpath, ?2)
>       AND (op_depth < ?3
>            OR (op_depth = ?3 AND presence = 'base-deleted'))
> 
> There is no checking of these literals to catch errors such as
> 'base-delete'.
> 
> I've been thinking that transform_sql.py should do some checking.
> Perhaps we could move the maps into a know header, annotate them:
> 
>     { "base-deleted", svn_wc__db_status_base_deleted },  /* MAP_DELETED */
> 
> and then have transform_sql.py parse the header and convert:
> 
>       OR (op_depth = ?3 AND presence = MAP_DELETED))
> 
> into
> 
>       OR (op_depth = ?3 AND presence = 'base-deleted'))
> 
> -- 
> Philip

Re: Literals in wc SQLite queries

Posted by Julian Foad <ju...@btopenworld.com>.
Greg Stein wrote:

> That sounds like a great idea! +1
> 
> It shouldn't be hard to tweak transform_sql.py. If you have a specific
> prefix for the "token" symbols (eg TOKEN_BASE_DELETED), then we can
> have the transform script simply look for lines like you suggest in
> (say) wc_db.h. If the search regex is strong enough, then it will have
> no problem finding them. And if transform_sql.py sees TOKEN_FOO that
> it does NOT recognize, then it should throw an error. Either somebody
> forgot the mapping, or the syntax was not conformant -- both are
> "errors".
> 
> I would be near -1 on putting short numeric values into the db; I
> chose the tokens for debuggability, and think that is quite important.
> I like your improvement on the strengthening the db/token system.
> 
> Cheers,
> -g
> 
> 
> On Fri, Dec 7, 2012 at 12:54 PM, Philip Martin <ph...@codematters.co.uk> 
> wrote:
>>  Columns such as nodes.kind, nodes.presence, etc. have strings that
>>  should be one of a discrete set of values.  When we bind these columns
>>  in C code we use something like:
>> 
>>      svn_sqlite__bindf("t", presence_map, svn_wc__db_status_normal);
>> 
>>  This means we only use known values (svn_wc__db_status_normal) and the
>>  map converts it to the correct discrete string.  This checking happens
>>  at build time.
>> 
>>  We also have queries where the strings are defined as literals in
>>  wc-queries.sql like:
>> 
>>      DELETE FROM nodes
>>      WHERE wc_id = ?1
>>        AND IS_STRICT_DESCENDANT_OF(local_relpath, ?2)
>>        AND (op_depth < ?3
>>             OR (op_depth = ?3 AND presence = 'base-deleted'))
>> 
>>  There is no checking of these literals to catch errors such as
>>  'base-delete'.
>> 
>>  I've been thinking that transform_sql.py should do some checking.
>>  Perhaps we could move the maps into a know header, annotate them:
>> 
>>      { "base-deleted", svn_wc__db_status_base_deleted },  /* MAP_DELETED */
>> 
>>  and then have transform_sql.py parse the header and convert:
>> 
>>        OR (op_depth = ?3 AND presence = MAP_DELETED))
>> 
>>  into
>> 
>>        OR (op_depth = ?3 AND presence = 'base-deleted'))


If we're doing string matching anyway, can we just make it parse the plain

   { "base-deleted", svn_wc__db_status_base_deleted },

lines out of the header file (recognizing that it's a line in a token map, by whatever means is convenient) and then match the literal

  ... presence = 'base-deleted'

by the pattern:

  ... <recognized_column_name> = <valid_token_for_that_column>

Even if we have to append "/* TOKEN_MAP(presence) */" to every line, but hopefully without that, that's better than "/* MAP_DELETED */".

- Julian

Re: Literals in wc SQLite queries

Posted by Greg Stein <gs...@gmail.com>.
That sounds like a great idea! +1

It shouldn't be hard to tweak transform_sql.py. If you have a specific
prefix for the "token" symbols (eg TOKEN_BASE_DELETED), then we can
have the transform script simply look for lines like you suggest in
(say) wc_db.h. If the search regex is strong enough, then it will have
no problem finding them. And if transform_sql.py sees TOKEN_FOO that
it does NOT recognize, then it should throw an error. Either somebody
forgot the mapping, or the syntax was not conformant -- both are
"errors".

I would be near -1 on putting short numeric values into the db; I
chose the tokens for debuggability, and think that is quite important.
I like your improvement on the strengthening the db/token system.

Cheers,
-g


On Fri, Dec 7, 2012 at 12:54 PM, Philip Martin <ph...@codematters.co.uk> wrote:
> Columns such as nodes.kind, nodes.presence, etc. have strings that
> should be one of a discrete set of values.  When we bind these columns
> in C code we use something like:
>
>     svn_sqlite__bindf("t", presence_map, svn_wc__db_status_normal);
>
> This means we only use known values (svn_wc__db_status_normal) and the
> map converts it to the correct discrete string.  This checking happens
> at build time.
>
> We also have queries where the strings are defined as literals in
> wc-queries.sql like:
>
>     DELETE FROM nodes
>     WHERE wc_id = ?1
>       AND IS_STRICT_DESCENDANT_OF(local_relpath, ?2)
>       AND (op_depth < ?3
>            OR (op_depth = ?3 AND presence = 'base-deleted'))
>
> There is no checking of these literals to catch errors such as
> 'base-delete'.
>
> I've been thinking that transform_sql.py should do some checking.
> Perhaps we could move the maps into a know header, annotate them:
>
>     { "base-deleted", svn_wc__db_status_base_deleted },  /* MAP_DELETED */
>
> and then have transform_sql.py parse the header and convert:
>
>       OR (op_depth = ?3 AND presence = MAP_DELETED))
>
> into
>
>       OR (op_depth = ?3 AND presence = 'base-deleted'))
>
> --
> Philip