You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@stdcxx.apache.org by Martin Sebor <se...@roguewave.com> on 2008/02/18 20:05:35 UTC

Re: svn commit: r628611 - in /stdcxx/trunk/tests: include/rw_braceexp.h self/0.braceexp.cpp src/braceexp.cpp

This is super cool! I've played with it a bit and except for a few
minor issues (mostly with ranges) it works like a charm! (I've just
committed an enhanced version of the test that reveals these problems.)

About the interface:
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/include/rw_braceexp.h?r=628611

Can the third argument be negative? (If not, it might make sense
to change its type to size_t). What about the sep argument? Can
it be outside of the range of char, such as -1? It seems to just
get converted to char here (oooh, line number links ;-):
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l274
It might make sense to do some basic checking on it (and on the
other arguments) in rw_brace_expand() before passing it to the
implementation.

A few questions/comments on the implementation:
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817

The _rw_is_lower() and _rw_is_upper() helper functions should either
be replaced with the C equivalents or rewritten to handle EBCDIC. If
the latter, I suggest making them inline.

In _rw_brace_graph, the member functions don't need to be privatized
(with the _C_ prefix). Also, unless the class is intended to be copy
constructible and assignable I suggest to explicitly declare the two
members private.

Finally, please check your braces :) E.g., these:
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l29
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l54
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l87

Martin

vitek@apache.org wrote:
> Author: vitek
> Date: Sun Feb 17 20:35:15 2008
> New Revision: 628611
> 
> URL: http://svn.apache.org/viewvc?rev=628611&view=rev
> Log:
> 
> 2008-02-17  Travis Vitek  <vi...@roguewave.com>
> 
> 	STDCXX-714
> 	* tests/include/rw_braceexp.h: Added a new header to declare the test
> 	suite helper function rw_brace_expand().
> 	* tests/src/braceexp.cpp: New source file to define rw_brace_expand().
> 	* tests/self/0.braceexp.cpp: New test to exercise rw_brace_expand()
> 
> 
> Added:
>     stdcxx/trunk/tests/include/rw_braceexp.h
>     stdcxx/trunk/tests/self/0.braceexp.cpp
>     stdcxx/trunk/tests/src/braceexp.cpp
> 
> Added: stdcxx/trunk/tests/include/rw_braceexp.h
> URL: http://svn.apache.org/viewvc/stdcxx/trunk/tests/include/rw_braceexp.h?rev=628611&view=auto
> ==============================================================================
> --- stdcxx/trunk/tests/include/rw_braceexp.h (added)
> +++ stdcxx/trunk/tests/include/rw_braceexp.h Sun Feb 17 20:35:15 2008
> @@ -0,0 +1,43 @@
> +/************************************************************************
> + *
> + * rw_braceexp.h - declarations of testsuite helpers
> + *
> + * $Id$
> + *
> + ************************************************************************
> + *
> + * Licensed to the Apache Software  Foundation (ASF) under one or more
> + * contributor  license agreements.  See  the NOTICE  file distributed
> + * with  this  work  for  additional information  regarding  copyright
> + * ownership.   The ASF  licenses this  file to  you under  the Apache
> + * License, Version  2.0 (the  "License"); you may  not use  this file
> + * except in  compliance with the License.   You may obtain  a copy of
> + * the License at
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the  License is distributed on an  "AS IS" BASIS,
> + * WITHOUT  WARRANTIES OR CONDITIONS  OF ANY  KIND, either  express or
> + * implied.   See  the License  for  the  specific language  governing
> + * permissions and limitations under the License.
> + *
> + **************************************************************************/
> +
> +#ifndef RW_BRACEEXP_H_INCLUDED
> +#define RW_BRACEEXP_H_INCLUDED
> +
> +
> +#include <testdefs.h>
> +
> +
> +// rw_brace_expand() performs brace expansion according to chapter 3.5.1 of
> +// the Bash Reference Manual with the exception of special handling for the
> +// string `${'. 
> +
> +_TEST_EXPORT char*
> +rw_brace_expand (const char* brace_expr, char* s, int n, int sep);
> +
> +
> +#endif   // RW_BRACEEXP_H_INCLUDED
> +
> 
> Added: stdcxx/trunk/tests/self/0.braceexp.cpp
> URL: http://svn.apache.org/viewvc/stdcxx/trunk/tests/self/0.braceexp.cpp?rev=628611&view=auto
> ==============================================================================
> --- stdcxx/trunk/tests/self/0.braceexp.cpp (added)
> +++ stdcxx/trunk/tests/self/0.braceexp.cpp Sun Feb 17 20:35:15 2008
> @@ -0,0 +1,121 @@
> +/************************************************************************
> + *
> + * 0.braceexp.cpp - tests exercising the rw_brace_expand() helper
> + *
> + * $Id$
> + *
> + ************************************************************************
> + *
> + * Licensed to the Apache Software  Foundation (ASF) under one or more
> + * contributor  license agreements.  See  the NOTICE  file distributed
> + * with  this  work  for  additional information  regarding  copyright
> + * ownership.   The ASF  licenses this  file to  you under  the Apache
> + * License, Version  2.0 (the  "License"); you may  not use  this file
> + * except in  compliance with the License.   You may obtain  a copy of
> + * the License at
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the  License is distributed on an  "AS IS" BASIS,
> + * WITHOUT  WARRANTIES OR CONDITIONS  OF ANY  KIND, either  express or
> + * implied.   See  the License  for  the  specific language  governing
> + * permissions and limitations under the License.
> + *
> + **************************************************************************/
> +
> +#include <rw_braceexp.h>
> +
> +#include <stdio.h>        // for fprintf(), stderr
> +#include <stdlib.h>       // for free()
> +#include <string.h>       // for strcmp()
> +
> +// the number of failed tests
> +static int nerrors;
> +
> +static void
> +test (int line, const char* brace_expr, const char* expect)
> +{
> +    char buf [128];
> +
> +    char* result = rw_brace_expand (brace_expr, buf, sizeof (buf), ' ');
> +
> +    const bool equal =   (expect && result)
> +                       ? !strcmp (expect, result)
> +                       : expect == result;
> +
> +    if (!equal) {
> +
> +        ++nerrors;
> +
> +        fprintf (stderr,
> +                 "%d. rw_brace_expand(\"%s\", ...) failed. "
> +                 "expected \"%s\" got \"%s\"\n",
> +                 line, brace_expr,
> +                 expect ? expect : "(null)",
> +                 result ? result : "(null)");
> +    }
> +
> +    if (result != buf)
> +        free (result);
> +}
> +
> +int main ()
> +{
> +#define TEST(s,e) test (__LINE__, s, e)
> +
> +    TEST ("a", "a");
> +    TEST ("a\\b", "ab");
> +
> +    TEST ("{0}"  , "{0}");
> +    TEST ("\\{0}", "{0}");
> +    TEST ("{\\0}", "{0}");
> +    TEST ("{0\\}", "{0}");
> +
> +    TEST ("{0..1}", "0 1");
> +    TEST ("\\{0..1}", "{0..1}");
> +    TEST ("{\\0..1}", "{0..1}");
> +    TEST ("{0\\..1}", "{0..1}");
> +    TEST ("{0..\\1}", "{0..1}");
> +    TEST ("{0..1\\}", "{0..1}");
> +
> +    TEST ("a{0}",      "a{0}");
> +    TEST ("a{0}b",     "a{0}b");
> +    TEST ("a\\{0\\}b", "a{0}b");
> +    TEST ("a\\{0,123,4\\}b", "a{0,123,4}b");
> +
> +    TEST ("{0..a}", "{0..a}");
> +    TEST ("{a..0}", "{a..0}");
> +
> +    TEST ("{0..0}", "0");
> +    TEST ("{0..5}", "0 1 2 3 4 5");
> +    TEST ("{4..2}", "4 3 2");
> +    TEST ("{a..g}", "a b c d e f g");
> +    TEST ("{g..c}", "g f e d c");
> +    TEST ("{A..G}", "A B C D E F G");
> +    TEST ("{G..C}", "G F E D C");
> +
> +    TEST ("{,}", "");
> +    TEST ("{abc,def}", "abc def");
> +    TEST ("{ab\\c,d\\ef}", "abc def");
> +    TEST ("abc{d,e,f}", "abcd abce abcf");
> +    
> +    TEST ("z{c,a{d..f}a,c}z", "zcz zadaz zaeaz zafaz zcz");
> +    TEST ("z{c,a{d,e,f}a,c}z", "zcz zadaz zaeaz zafaz zcz");
> +    
> +    TEST ("{abc,{,d,e,f,}}", "abc d e f");
> +    TEST ("{abc,{,d,e,f,}}{x,y}", "abcx abcy x y dx dy ex ey fx fy x y");
> +    TEST ("{abc,{,d\\,e\\,f,}}", "abc d,e,f");
> +
> +    TEST ("A{0..3}", "A0 A1 A2 A3");
> +    TEST ("A{0..2}{6..7}", "A06 A07 A16 A17 A26 A27");
> +    TEST ("A{A}{0..3}", "A{A}{0..3}");
> +    TEST ("A{0..3}{A}", "A0{A} A1{A} A2{A} A3{A}");
> +
> +    TEST ("A{0,{4..7}{,}}", "A0 A4 A4 A5 A5 A6 A6 A7 A7");
> +    TEST ("a{1,3,x{5..9}y}b", "a1b a3b ax5yb ax6yb ax7yb ax8yb ax9yb");
> +    TEST ("{en,es}_{US,MX}.*", "en_US.* en_MX.* es_US.* es_MX.*");
> +
> +    // return 0 on success, 1 on failure
> +    return !(0 == nerrors);
> +}
> 
> Added: stdcxx/trunk/tests/src/braceexp.cpp
> URL: http://svn.apache.org/viewvc/stdcxx/trunk/tests/src/braceexp.cpp?rev=628611&view=auto
> ==============================================================================
> --- stdcxx/trunk/tests/src/braceexp.cpp (added)
> +++ stdcxx/trunk/tests/src/braceexp.cpp Sun Feb 17 20:35:15 2008
> @@ -0,0 +1,717 @@
> +#include <stdlib.h>
> +#include <string.h>
> +
> +
> +static int _rw_is_digit (int ch)
> +{
> +    return !(ch < '0' || '9' < ch);
> +}
> +
> +static int _rw_is_lower (int ch)
> +{
> +    return !(ch < 'a' || 'z' < ch);
> +}
> +
> +static int _rw_is_upper (int ch)
> +{
> +    return !(ch < 'A' || 'Z' < ch);
> +}
> +
> +// search `beg' to `end' for the next unescaped open brace . if `end'
> +// is 0 then the search will terminate at the end of string. returns 0
> +// on failure.
> +static const char* _rw_find_open_brace (const char* beg,
> +                                        const char* end)
> +{
> +    bool is_escaped = false;
> +
> +    for (/**/; *beg; ++beg)
> +    {
> +        if (end && !(beg < end))
> +            break;
> +
> +        const bool is_match = (*beg == '{');
> +        if (!is_escaped && is_match) {
> +            return beg;
> +        }
> +
> +        is_escaped = !is_escaped && (*beg == '\\');
> +    }
> +
> +    return 0;
> +}
> +
> +// search `beg' to `end' for the next close brace at the current brace
> +// depth. if `end' is 0, the search will continue until a match or end
> +// of string. returns 0 on failure.
> +static const char* _rw_find_close_brace (const char* beg,
> +                                         const char* end)
> +{
> +    bool is_escaped = false;
> +
> +    // nested brace depth
> +    for (int depth = 1; *beg; ++beg)
> +    {
> +        if (end && !(beg < end))
> +            break;
> +
> +        const bool is_open_brace = (*beg == '{');
> +        if (!is_escaped && is_open_brace) {
> +            ++depth;
> +        }
> +
> +        const bool is_close_brace = (*beg == '}');
> +        if (!is_escaped && is_close_brace) {
> +            --depth;
> +        }
> +
> +        is_escaped = !is_escaped && (*beg == '\\');
> +
> +        if (depth == 0)
> +            return beg;
> +    }
> +
> +    return 0;
> +}
> +
> +// search `beg' to `end' for the next unescaped comma at the current
> +// brace depth. if `end' is 0, the search will continue until a match
> +// or end of string. returns 0 on failure.
> +static const char* _rw_find_next_comma (const char* beg,
> +                                        const char* end)
> +{
> +    bool is_escaped = false;
> +
> +    // nested brace depth
> +    for (int depth = 0; *beg; ++beg)
> +    {
> +        if (end && !(beg < end))
> +            break;
> +
> +        const bool is_open_brace = (*beg == '{');
> +        if (!is_escaped && is_open_brace) {
> +            ++depth;
> +        }
> +
> +        const bool is_close_brace = (*beg == '}');
> +        if (!is_escaped && is_close_brace) {
> +            --depth;
> +        }
> +
> +        const bool is_comma = (*beg == ',');
> +        if (!is_escaped && is_comma) {
> +            if (depth == 0)
> +                return beg;
> +        }
> +
> +        is_escaped = !is_escaped && (*beg == '\\');
> +    }
> +
> +    return 0;
> +}
> +
> +
> +
> +// this is where most of the work is done.
> +struct _rw_brace_graph
> +{
> +    //
> +    _rw_brace_graph ();
> +
> +    //
> +    ~_rw_brace_graph ();
> +
> +    // expand `brace_expr' into `len' bytes of `buf'. if it won't fit, we
> +    // allocate a buffer with malloc() and return that. so the user needs
> +    // to free() the return buffer if it is not equal to `buf'.
> +    char* _C_build_and_expand (const char* brace_expr,
> +                               char* buf, int len, int sep);
> +
> +private:
> +
> +    // node for a directed-acyclic-graph that we build from the original
> +    // brace expression
> +    struct _rw_brace_node
> +    {
> +        const char* str_;
> +        int len_;
> +
> +        _rw_brace_node* sibling_;
> +        _rw_brace_node* child_;
> +    };
> +
> +    // retrieve a new node. nodes are allocated in large blocks. those
> +    // blocks are deallocated when this graph instance is destroyed.
> +    _rw_brace_node* _C_get_new_node ();
> +
> +    // this function generates a dag from an arbitrary brace expression.
> +    // the format is `prefix{body}suffix'. prefix, and suffix can both
> +    // be the empty string. body may be a comma seperated list of tokens,
> +    // a range of tokens, or arbitrary literal text. escapes on commas
> +    // and braces are ignored, and left intact otherwise.
> +    _rw_brace_node* _C_build_anything (const char* beg, const char* end);
> +
> +    // generates a dag from an arbitrary range brace expression. the
> +    // format is `{?..?}suffix' where both ? are alphanumeric digits
> +    // of the same character class.
> +    _rw_brace_node* _C_build_range    (const char* beg, const char* end);
> +
> +    // generates a dag from an arbitrary list brace expression. the
> +    // format is `{a,b[,c]}suffix', where `a', `b' and `c' are full
> +    // brace expansions that would be processed by _C_build_anything.
> +    _rw_brace_node* _C_build_list     (const char* beg, const char* end);
> +
> +    // generates a dag from literal text. no brace expansion is done
> +    // on the literal text.
> +    _rw_brace_node* _C_build_literal  (const char* beg, const char* end);
> +
> +    // the number of nodes held by each brace buffer [see below]
> +    enum { _C_size = 64 };
> +
> +    // buffer to avoid many small allocations
> +    struct _rw_brace_buffer
> +    {
> +        _rw_brace_node nodes_ [_C_size];
> +        _rw_brace_buffer* next_;
> +    };
> +
> +    // the initial set of nodes is preallocated as part of this graph
> +    // object instance. additional blocks of nodes will be allocated
> +    // as is necessary by the _C_get_new_node() member.
> +    _rw_brace_buffer node_cache_;
> +    
> +    int used_; // number of nodes used in the last buffer
> +    _rw_brace_buffer* nodes_; // pointer to last allocated node buffer
> +
> +    // code for generating the string
> +
> +    bool _C_append (const char* beg, int n);
> +
> +    char* user_buf_;   // this is the buffer
> +    char* string_buf_;
> +    int string_cap_;
> +    int string_len_;
> +
> +    // we use this to walk the stack. we need to walk from the root
> +    // of the tree to each leaf. when we reach a leaf, we need a way
> +    // to see the path that we took to get where we are. we use this
> +    // path to write out each part of the full expansion.
> +    struct _rw_recursion_context
> +    {
> +        _rw_recursion_context (_rw_brace_node* node)
> +            : parent_ (0), node_ (node)
> +        {
> +        }
> +
> +        _rw_recursion_context (_rw_recursion_context* parent)
> +            : parent_ (parent), node_ (parent->node_->child_)
> +        {
> +        }
> +
> +        _rw_recursion_context* parent_;
> +        _rw_brace_node* node_;
> +    };
> +
> +    // this function walks the dag, leaving a trail of context
> +    // objects so we can walk back to the root and write the data
> +    // at each level in the graph.
> +    bool _C_brace_expand (_rw_recursion_context* self, char sep);
> +
> +    // this function writes the data at each level of the dag
> +    // to our internal buffer.
> +    bool _C_brace_expand_write (_rw_recursion_context* context);
> +};
> +
> +_rw_brace_graph::_rw_brace_graph ()
> +    : used_ (0)
> +    , nodes_ (&node_cache_)
> +    , user_buf_ (0)
> +    , string_buf_ (0)
> +    , string_cap_ (0)
> +    , string_len_ (0)
> +{
> +    node_cache_.next_ = 0;
> +}
> +
> +_rw_brace_graph::~_rw_brace_graph ()
> +{
> +    _rw_brace_buffer* nodes = node_cache_.next_;
> +    while (nodes)
> +    {
> +        _rw_brace_buffer* next = nodes->next_;
> +        nodes->next_ = 0;
> +
> +        // deallocate this buffer
> +        free (nodes);
> +
> +        // setup to deallocate next one
> +        nodes = next;
> +    }
> +}
> +
> +
> +
> +char*
> +_rw_brace_graph::_C_build_and_expand (const char* brace_expr,
> +                                      char* buf, int len, int sep)
> +{
> +    const int brace_expr_len = strlen (brace_expr);
> +
> +    _rw_brace_node* root = _C_build_anything (brace_expr,
> +                                              brace_expr + brace_expr_len);
> +    if (!root)
> +        return 0;
> +
> +    // setup for the expansion
> +    user_buf_   = buf;
> +    string_buf_ = buf;
> +    string_cap_ = len;
> +    string_len_ = 0;
> +
> +   // this helps us do the building of the output string
> +    _rw_recursion_context context (root);
> +
> +    if (!_C_brace_expand (&context, sep))
> +    {
> +        if (string_buf_ != user_buf_)
> +            free (string_buf_);
> +        string_buf_ = 0;
> +    }
> +
> +    // kill the last seperator with a null terminator
> +    else if (string_buf_)
> +    {
> +        const int pos = string_len_ < 1 ? 0 : string_len_ - 1;
> +        string_buf_ [pos] = '\0';
> +    }
> +
> +    return string_buf_;
> +}
> +
> +_rw_brace_graph::_rw_brace_node*
> +_rw_brace_graph::_C_get_new_node ()
> +{
> +    used_ += 1;
> +
> +    // if we run out of space, grab a new block
> +    if (! (used_ < _C_size))
> +    {
> +        nodes_->next_
> +           = (_rw_brace_buffer*)malloc (sizeof (_rw_brace_buffer));
> +        if (!nodes_->next_)
> +            return 0;
> +
> +        nodes_ = nodes_->next_;
> +        nodes_->next_ = 0;
> +
> +        used_ -= _C_size;
> +    }
> +
> +    _rw_brace_node* node = &nodes_->nodes_ [used_ - 1];
> +    node->str_     = 0;
> +    node->len_     = 0;
> +    node->sibling_ = 0;
> +    node->child_   = 0;
> +
> +    return node;
> +}
> +
> +_rw_brace_graph::_rw_brace_node*
> +_rw_brace_graph::_C_build_anything (const char* beg, const char* end)
> +{
> +    // 
> +    const char* open_brace = _rw_find_open_brace (beg, end);
> +    if (open_brace)
> +    {
> +        // build a node for the prefix if there is one
> +        _rw_brace_node* prefix = _C_get_new_node ();
> +        prefix->str_ = beg;
> +        prefix->len_ = (open_brace - beg);
> +
> +        // try to build a range expansion
> +        _rw_brace_node* child = 0;
> +        
> +        child = _C_build_range (open_brace, end);
> +        if (!child) {
> +
> +            // try to do a list expansion
> +            child = _C_build_list (open_brace, end);
> +            if (!child) {
> +
> +                // try to do a literal expansion
> +                child = _C_build_literal (open_brace, end);
> +                if (!child) {
> +                    return 0;
> +                }
> +            }
> +        }
> +
> +        // we must have a valid child pointer at this point
> +        prefix->child_ = child;
> +   
> +        return prefix;
> +    }
> +
> +    // there was no open brace, so the entire text from beg to end
> +    // is a literal
> +    _rw_brace_node* prefix = _C_get_new_node ();
> +    if (!prefix)
> +        return 0;
> +
> +    prefix->str_ = beg;
> +    prefix->len_ = (end - beg);
> +
> +    return prefix;
> +}
> +
> +_rw_brace_graph::_rw_brace_node*
> +_rw_brace_graph::_C_build_literal (const char* beg, const char* end)
> +{
> +    _rw_brace_node* prefix = _C_get_new_node ();
> +    if (!prefix)
> +        return 0;
> +
> +    prefix->str_ = beg;
> +    prefix->len_ = (end - beg);
> +
> +    return prefix;
> +}
> +
> +_rw_brace_graph::_rw_brace_node*
> +_rw_brace_graph::_C_build_range (const char* beg, const char* end)
> +{
> +    // check for {a..z} type range expression. make sure not to
> +    // reference past the end of the string.
> +    if (   beg [0] != '{'
> +        || beg [1] == '\0'
> +        || beg [2] != '.'
> +        || beg [3] != '.'
> +        || beg [4] == '\0'
> +        || beg [5] != '}')
> +        return 0;
> +
> +    // grab characters that represent the start and end of the range
> +    char cbeg = beg [1];
> +    char cend = beg [4];
> +
> +    // only works if range characters are both digit, lower or upper.
> +    const int both_are_digit = _rw_is_digit (cbeg) && _rw_is_digit (cend);
> +    const int both_are_lower = _rw_is_lower (cbeg) && _rw_is_lower (cend);
> +    const int both_are_upper = _rw_is_upper (cbeg) && _rw_is_upper (cend);
> +
> +    if (! (both_are_digit || both_are_lower || both_are_upper))
> +        return false;
> +
> +    // these must live for duration of program
> +    static const char* table [] =
> +    {
> +        "0123456789",                 // 0
> +        "abcdefghijklmnopqrstuvwxyz", // 1
> +        "ABCDEFGHIJKLMNOPQRSTUVWXYZ", // 2
> +    };
> +
> +    const int idx = (both_are_lower << 0) +
> +                    (both_are_upper << 1);
> +
> +    const int dir = (cbeg < cend) ? 1 : -1;
> +
> +    // build the suffix
> +    _rw_brace_node* suffix = _C_build_anything (beg + 6, end);
> +    if (!suffix)
> +        return false; // failed to parse suffix
> +
> +    _rw_brace_node* first_child = 0;
> +    _rw_brace_node* final_child = 0;
> +
> +    // build a list of children, associate each with suffix
> +    for (/**/; cbeg != cend; cbeg += dir) {
> +
> +        // create a child from whatever is between beg and end. that childs
> +        // next pointer will be the suffix we created above.
> +        _rw_brace_node* child = _C_get_new_node ();
> +        if (!child) {
> +            return false;
> +        }
> +
> +        // this finds beg in our array, we could have used strchr
> +        child->str_ = table [idx] + (cbeg - table [idx][0]);
> +        child->len_ = 1;
> +
> +        // track children we have created
> +        if (!first_child)
> +            first_child = child;
> +
> +        if (final_child)
> +            final_child->sibling_ = child;
> +        final_child = child;
> +
> +        // suffix is the suffix of the child expression
> +        while (child->child_)
> +            child = child->child_;
> +        child->child_ = suffix;
> +    }
> +
> +    // create the last node in the range.
> +    _rw_brace_node* child = _C_get_new_node ();
> +    if (!child) {
> +        return false;
> +    }
> +
> +    child->str_ = table [idx] + (cbeg - table [idx][0]);
> +    child->len_ = 1;
> +
> +    // trach child
> +    if (!first_child)
> +        first_child = child;
> +
> +    if (final_child)
> +        final_child->sibling_ = child;
> +    final_child = child;
> +
> +    // suffix is the suffix of the child expression
> +    while (child->child_)
> +        child = child->child_;
> +    child->child_ = suffix;
> +
> +    return first_child;
> +}
> +
> +_rw_brace_graph::_rw_brace_node*
> +_rw_brace_graph::_C_build_list (const char* beg, const char* end)
> +{
> +    //
> +    // {abc,d{1..3}e}f
> +    // ^              ^
> +    //
> +    // prefix ''
> +    // suffix 'f'
> +    // middle '{abc,d{1..3}e}'
> +
> +    // generate node for prefix [which is a literal, and is handled by the caller
> +    // generate node for the suffix [which is anything]
> +
> +    // generate list of nodes for the middle [each of which are anything]
> +    // set the next pointer of each to the suffix, and set next pointer
> +    // of the prefix to the first list item.
> +
> +    // d{1..3}e{1..3}
> +    //
> +    // root -> d -> 1 -> e -> 1 -> EMPTY
> +    //              2 //      2 //
> +    //              3 /       3 /
> +
> +    // {1..3}{4..6}  14 15 16 24 25 26 34 35 36
> +    //
> +    // root -> 1 -> 4
> +    //         2 // 5
> +    //         3 /  6
> +
> +    // we really expect that the first token is an open paren the
> +    // caller should have consumed the prefix before calling this
> +    if (*beg++ != '{')
> +        return false;
> +
> +    // now fill in the middle, each child we create directly will
> +    // have a child pointer to the suffix node
> +
> +    // all of the nodes in the list are children of `parent'. the
> +    // graph from nodes after the list are 
> +    const char* mid = _rw_find_next_comma (beg, end);
> +    if (!mid)
> +        return false; // no comma, then no list
> +
> +    // find the end of th list expression
> +    const char* eol = _rw_find_close_brace (mid, end);
> +    if (!eol)
> +        return false; // no list terminator
> +
> +    // build a node for the suffix.
> +    _rw_brace_node* suffix = _C_build_anything (eol+1, end);
> +    if (!suffix)
> +        return false; // failed to parse end
> +
> +    _rw_brace_node* first_child = 0;
> +    _rw_brace_node* final_child = 0;
> +
> +    while (mid) {
> +    
> +        // create a child from whatever is between beg and end. that childs
> +        // next pointer will be the suffix we created above.
> +        _rw_brace_node* child = _C_build_anything (beg, mid);
> +        if (!child) {
> +            return false;
> +        }
> +
> +        if (!first_child)
> +            first_child = child;
> +        
> +        // append new child to end of chain
> +        if (final_child)
> +            final_child->sibling_ = child;
> +        final_child = child;
> +
> +        // the nex pointer for this child is the suffix
> +
> +        // suffix is the suffix of the child expression
> +        while (child->child_)
> +            child = child->child_;
> +        child->child_ = suffix;
> +
> +        beg = mid + 1;
> +        mid = _rw_find_next_comma (beg, eol);
> +    }
> +
> +    // okay, we have a pointer to the last comma in the list. create an
> +    // item for the data between the comma and the list terminator
> +
> +    // '{abc,d{1..3}e}a'
> +    //      ^        ^ ^
> +    //    beg      eol end
> +
> +    // build nodes from the last entry in the list
> +    _rw_brace_node* child = _C_build_anything (beg, eol);
> +    if (!child) {
> +        return false;
> +    }
> +
> +    // append last child to end of chain
> +    final_child->sibling_ = child;
> +    final_child = child;
> +
> +    // suffix is the suffix of the child expression
> +    while (child->child_)
> +        child = child->child_;
> +    child->child_ = suffix;
> +
> +    // success, return first child in this expansion
> +    return first_child;
> +}
> +
> +bool _rw_brace_graph::_C_append (const char* s, int n)
> +{
> +    const int new_len = string_len_ + n;
> +
> +    // not enough space, grow buf
> +    if (! (new_len < string_cap_)) {
> +
> +        // buf grows in 256 byte blocks
> +        int new_cap = string_cap_;
> +        while (! (new_len < new_cap))
> +            new_cap += 256;
> +
> +        char* new_buf = (char*)malloc (new_cap);
> +        if (!new_buf)
> +            return false;
> +
> +        // copy the old data
> +        memcpy (new_buf, string_buf_, string_len_);
> +
> +        // copy the new data
> +        memcpy (new_buf + string_len_, s, n);
> +        new_buf [new_len] = '\0';
> +
> +        // don't free until after append because `s' may
> +        // be in string_buf_
> +        if (string_buf_ != user_buf_)
> +            free (string_buf_);
> +
> +        string_cap_ = new_cap;
> +        string_len_ = new_len;
> +        string_buf_ = new_buf;
> +    }
> +
> +    // just copy the data
> +    else {
> +        memcpy (string_buf_ + string_len_, s, n);
> +        string_buf_ [new_len] = '\0';
> +
> +        string_len_ = new_len;
> +    }
> +
> +    return true;
> +}
> +
> +
> +bool _rw_brace_graph::_C_brace_expand_write (_rw_recursion_context* context)
> +{
> +    if (context->parent_) {
> +        if (!_C_brace_expand_write (context->parent_))
> +            return false;
> +    }
> +
> +    // write the data at this level
> +    //return _C_append (context->node_->str_, context->node_->len_);
> +
> +    bool is_escaped = false;
> +
> +    const char* beg = context->node_->str_;
> +    for (int n = 0; n < context->node_->len_; ++n, ++beg) {
> +
> +        is_escaped = !is_escaped && (*beg == '\\');
> +        if (!is_escaped) {
> +            if (!_C_append (beg, 1))
> +                return false;
> +        }
> +    }
> +
> +    return true;
> +}
> +
> +bool _rw_brace_graph::_C_brace_expand (_rw_recursion_context* self, char sep)
> +{
> +    // if this node has no children or siblings, we have found a full
> +    // expansion.
> +    if (!self->node_ ||
> +        !self->node_->sibling_ && !self->node_->child_) {
> +
> +        const int length_before = string_len_;
> +
> +        // use recursion again to walk back to the root the graph and
> +        // write each contexts data as we unwind back toward the leaf
> +        if (!_C_brace_expand_write (self))
> +            return false;
> +
> +        const int length_after = string_len_;
> +
> +        // don't write a seperator if we wrote no data
> +        if (length_before != length_after && !_C_append (&sep, 1))
> +            return false;
> +
> +        return true;
> +    }
> +    
> +    // iterate through all of the children of the node, thus finding all
> +    // other combinations
> +    _rw_recursion_context context (self);
> +    while (context.node_) {
> +
> +        if (!_C_brace_expand (&context, sep))
> +            return false;
> +
> +        context.node_ = context.node_->sibling_;
> +    }
> +
> +    return true;
> +}
> +
> +//
> +// returns 0 if brace_expr is not a valid brace expression, else a
> +// pointer to the buf that stores the expanded expression. if the
> +// returned value is not the same pointer as the input parameter
> +// expanded, then the user must deallocate the pointer with a call
> +// to free().
> +char* rw_brace_expand (const char* brace_expr, char* s, int n, int sep)
> +{
> +    if (!brace_expr)
> +        return 0;
> +
> +    _rw_brace_graph graph;
> +
> +    // build the graph, and then expand it into buf
> +    char* buf = graph._C_build_and_expand (brace_expr, s, n, sep);
> +    if (!buf)
> +        return 0;
> +
> +    return buf;
> +}
> +
> 
> 
> 


Re: svn commit: r628611 - in /stdcxx/trunk/tests: include/rw_braceexp.h self/0.braceexp.cpp src/braceexp.cpp

Posted by Martin Sebor <se...@roguewave.com>.
Travis Vitek wrote:
> 
[...]
> Martin Sebor wrote:
>> A few questions/comments on the implementation:
>> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817
>>
>> The _rw_is_lower() and _rw_is_upper() helper functions should either
>> be replaced with the C equivalents or rewritten to handle EBCDIC. If
>> the latter, I suggest making them inline.
>>
> The C equivalents are documented to be locale dependent. I don't believe
> that we want the input string to be localized, so I've written my own
> routines that are equivalent the the C functions for the classic locale.

Don't forget about EBCDIC where letters of the alphabet don't have
consecutive values. See src/ctype.cpp.

> 
> I'm not sure why you suggest making them inline. Is there some other reason
> other than the obvious, and possibly unnecessary, optimization benefit?

Just optimization.

> 
> 
> Martin Sebor wrote:
>> In _rw_brace_graph, the member functions don't need to be privatized
>> (with the _C_ prefix). Also, unless the class is intended to be copy
>> constructible and assignable I suggest to explicitly declare the two
>> members private.
>>
> Actually, none of this needs to be privatized because it is already private
> to the implementation. Yay.
> 
> 
> Martin Sebor wrote:
>> Finally, please check your braces :) E.g., these:
>> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l29
>> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l54
>> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l87
>>
> Have I mentioned that our brace convention is wonky? ;-)

Hey, at least it's consistent (for the most part ;-)

Martin


Re: svn commit: r628611 - in /stdcxx/trunk/tests: include/rw_braceexp.h self/0.braceexp.cpp src/braceexp.cpp

Posted by Travis Vitek <vi...@roguewave.com>.


Martin Sebor wrote:
> 
> This is super cool! I've played with it a bit and except for a few
> minor issues (mostly with ranges) it works like a charm! (I've just
> committed an enhanced version of the test that reveals these problems.)
> 

Yes, thank you for your enhancements to the test suite. As you've probably
seen, I've already responded to those additions in another thread. You can
find that discussion here [http://tinyurl.com/3xnkve].



About the interface:
http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/include/rw_braceexp.h?r=628611

Can the third argument be negative? (If not, it might make sense
to change its type to size_t).

No, it should not be. Even if it were accepted it should be treated as 0.


Martin Sebor wrote:
> 
> What about the sep argument? Can
> it be outside of the range of char, such as -1? It seems to just
> get converted to char here (oooh, line number links ;-):
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l274
> It might make sense to do some basic checking on it (and on the
> other arguments) in rw_brace_expand() before passing it to the
> implementation.
> 
No, sep should be convertable to an unsigned char, just as it should be when
you call 'strchr'. 


Martin Sebor wrote:
> 
> A few questions/comments on the implementation:
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817
> 
> The _rw_is_lower() and _rw_is_upper() helper functions should either
> be replaced with the C equivalents or rewritten to handle EBCDIC. If
> the latter, I suggest making them inline.
> 
The C equivalents are documented to be locale dependent. I don't believe
that we want the input string to be localized, so I've written my own
routines that are equivalent the the C functions for the classic locale.

I'm not sure why you suggest making them inline. Is there some other reason
other than the obvious, and possibly unnecessary, optimization benefit?


Martin Sebor wrote:
> 
> In _rw_brace_graph, the member functions don't need to be privatized
> (with the _C_ prefix). Also, unless the class is intended to be copy
> constructible and assignable I suggest to explicitly declare the two
> members private.
> 
Actually, none of this needs to be privatized because it is already private
to the implementation. Yay.


Martin Sebor wrote:
> 
> Finally, please check your braces :) E.g., these:
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l29
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l54
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/src/braceexp.cpp?r=628817#l87
> 
Have I mentioned that our brace convention is wonky? ;-)

Travis


-- 
View this message in context: http://www.nabble.com/Re%3A-svn-commit%3A-r628611---in--stdcxx-trunk-tests%3A-include-rw_braceexp.h-self-0.braceexp.cpp-src-braceexp.cpp-tp15552138p15562361.html
Sent from the stdcxx-dev mailing list archive at Nabble.com.


Re: svn commit: r628611 - in /stdcxx/trunk/tests: include/rw_braceexp.h self/0.braceexp.cpp src/braceexp.cpp

Posted by Martin Sebor <se...@roguewave.com>.
Martin Sebor wrote:
> This is super cool! I've played with it a bit and except for a few
> minor issues (mostly with ranges) it works like a charm! (I've just
> committed an enhanced version of the test that reveals these problems.)
> 
> About the interface:
> http://fisheye6.cenqua.com/browse/stdcxx/trunk/tests/include/rw_braceexp.h?r=628611 
> 
> 
> Can the third argument be negative? (If not, it might make sense
> to change its type to size_t). What about the sep argument? Can
> it be outside of the range of char, such as -1?

Btw., it occurs to me that allowing -1 for the separator might be
useful to specify the absence of a separator to generate strings
of consecutive characters or digits. This probably isn't something
we will want to do in the locale queries or in the expected
failures project but it might come in handy when generating input
for test cases (we already have the <char>@<repeat> notation in
the driver -- see rw_expand()).

Martin