You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by "Hyrum K. Wright" <hy...@mail.utexas.edu> on 2010/10/12 18:35:43 UTC

object-model: Return by value, reference or pointer? (or something else?)

[ This is more thoughts (and questions!) about the work on the
object-model branch.  If you don't care about the nuances of such
things, you can stop reading now, and save your cycles for shipping
1.7 work. :) ]

I'm trying to determine the best way to return objects from the
various wrapping classes found in Types.h [1].  At issue here is how
to return NULL values back to callers.

Recall from a previous discussion that these classes simply hold a
handle to the underlying C structure, and then implement getter
methods to enable the caller to pull information out of them.  (We may
change the implementation down the road, but the API seems
reasonable.)  Currently, we return this information by value, relying
upon compiler optimizations to eliminate unneeded copies, etc.  The
problem with this approach is that many of our C structs are
incomplete, that is, they contain only certain subsets of values, with
other values represented as NULL pointers.

This creates a problem when attempting to return these NULL values by
value in the C++ bindings.  The "obvious" solution is to return the
problematic values by pointer, but that introduces difficulties of its
own.  I see the following options:

1) Return everything by value
   Pros: simpler memory management, less overhead (?)
   Cons: doesn't allow the return of NULL values, need to establish
conventions to represent NULL objects (an isNull() method?)

2) Return everything by pointer
   Pros: can represent the NULL value, potentially less memory overhead
   Cons: more complex memory management (caller must delete returned value)

3) Some combination of (1) & (2)
   Pros: Can return guaranteed existing things by value, potentially
NULL ones by pointer
   Cons: Making the distinction between optional and required takes
more effort, a heterogeneous API is more work for callers

I'm leaning toward (2), even though it requires a bit more management
on the part of callers, I think it's the better approach to take.
It's not as clean, but still allows the use of the NULL value, which
callers can use in the normal fashion.  I've not explored how this
would impact swig's ability to generate bindings based on the C++
object model, though.

Thoughts?

-Hyrum

[1] http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Philipp Marek <ph...@linbit.com>.
How about returning a pointer to an empty string (zero length) at some fixed 
address (eg. allocated statically), which is good enough for most callers, 
but provide a function or symbol that compares against that address?

This way you'd get an empty string back; if you care whether the property 
really exists you do a call to noProperty() or something like that.


Regards,

Phil

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 13, 2010 at 11:44 AM, Julien Cugnière
<ju...@gmail.com> wrote:
> 2010/10/13 Hyrum K. Wright <hy...@mail.utexas.edu>
>> Another option is a custom SVN::String type which looks / smells /
>> acts like a string, but also allows wrapping the NULL value, in a
>> manner you suggest above.
>
> Sorry to intrude, but just an additional possibility : can the strings
> manipulated by SVN contain embedded NUL characters ( "\0" ) ?
>
> If they can't, then you could use an std::string containing a single
> NUL character to represent the NULL string :
>
>    #define SVN_NULL_STRING std::string(1, '\0')
>
>    inline std::string getAuthor() const { return ptr ?
> std::string(ptr) : SVN_NULL_STRING; }

Subversion "strings" can be arbitrary byte arrays.  I haven't yet
thought about how this translates to C++, but I'm wary of using
something not-quite-standard like the above.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Julien Cugnière <ju...@gmail.com>.
2010/10/13 Hyrum K. Wright <hy...@mail.utexas.edu>
> Another option is a custom SVN::String type which looks / smells /
> acts like a string, but also allows wrapping the NULL value, in a
> manner you suggest above.

Sorry to intrude, but just an additional possibility : can the strings
manipulated by SVN contain embedded NUL characters ( "\0" ) ?

If they can't, then you could use an std::string containing a single
NUL character to represent the NULL string :

    #define SVN_NULL_STRING std::string(1, '\0')

    inline std::string getAuthor() const { return ptr ?
std::string(ptr) : SVN_NULL_STRING; }

-- 
Julien Cugnière

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> Steinar Bang <sb...@dod.no>:

>>>>> Philipp Marek <ph...@linbit.com>:
>> It might be easier for callers if this did
>> if (isNull)
>> return "";
>> so that the value could just be used in printf() and similar without 
>> explicit checking.

> If that was ok to do, one might as well represent a NULL with an empty
> string.

As I said in a different message in the thread, if you need NULL, I
think returning auto_ptr<string> is the simplest way that is both
standards compliant, won't leak memory, and doesn't force you to handle
object life cycle for the returned strings.

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> Philipp Marek <ph...@linbit.com>:

> It might be easier for callers if this did
> 	if (isNull)
> 		return "";
> so that the value could just be used in printf() and similar without 
> explicit checking.

If that was ok to do, one might as well represent a NULL with an empty
string.

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Philipp Marek <ph...@linbit.com>.
Hello Hyrum!

On Thursday 14 October 2010, Hyrum K. Wright wrote:
> The following is a somewhat naïve implementation, but does it jive
> with your suggestion?
...
>     inline const char *c_str() const
>     {
>       if (isNull)
>         return NULL;
>       else
>         return std::string::c_str();
>     }
It might be easier for callers if this did
	if (isNull)
		return "";
so that the value could just be used in printf() and similar without 
explicit checking.


Furthermore I don't know how many instances of these strings will be active 
simultaneously; but the "bool isNull" could cost up to 8 bytes (because of 
aligning on 64bit).
If that is a concern, how about using a specific, static value for the 
std::string instead that is compared against?

I know, microoptimization again, but I wanted to mention the idea.


Regards,

Phil

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 15.10.2010 15:06, Hyrum K. Wright wrote:
> (I hope our C docs point this out explicitly, too.)

If not, you're bound to find out sooner or later. :)

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Fri, Oct 15, 2010 at 8:06 AM, Hyrum K. Wright
<hy...@mail.utexas.edu> wrote:
> On Fri, Oct 15, 2010 at 4:35 AM, Branko Čibej <br...@xbc.nu> wrote:
>>  On 14.10.2010 20:39, Hyrum K. Wright wrote:
>>> The following is a somewhat naïve implementation, but does it jive
>>> with your suggestion?
>>
>> Roughly yes, see the other comment.
>> On reflection, though, I like the suggestion of returning an
>> std::pair<std::string, bool>. Make a typedef of that so that users can
>> declare return-value variables, and use it where it's absolutely
>> necessary to know that it's not an empty string but a null string. Saves
>> a lot of trouble with the string subclassing, too. And better than
>> pulling in 90% of boost just to get a poor-man's replacement for null
>> references.
>
> Yeah, after that little implementation exercise, and hearing here all
> the nuances of subclassing std::string, I'm leaning in this direction
> as well.  Having a separate NullableString type would also help folks
> know which strings are guaranteed to have values and which ones are
> "optional" via the API.  (I hope our C docs point this out explicitly,
> too.)

Added something called ValidString in r1023863.  Review welcome.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Fri, Oct 15, 2010 at 4:35 AM, Branko Čibej <br...@xbc.nu> wrote:
>  On 14.10.2010 20:39, Hyrum K. Wright wrote:
>> The following is a somewhat naïve implementation, but does it jive
>> with your suggestion?
>
> Roughly yes, see the other comment.
> On reflection, though, I like the suggestion of returning an
> std::pair<std::string, bool>. Make a typedef of that so that users can
> declare return-value variables, and use it where it's absolutely
> necessary to know that it's not an empty string but a null string. Saves
> a lot of trouble with the string subclassing, too. And better than
> pulling in 90% of boost just to get a poor-man's replacement for null
> references.

Yeah, after that little implementation exercise, and hearing here all
the nuances of subclassing std::string, I'm leaning in this direction
as well.  Having a separate NullableString type would also help folks
know which strings are guaranteed to have values and which ones are
"optional" via the API.  (I hope our C docs point this out explicitly,
too.)

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 14.10.2010 20:39, Hyrum K. Wright wrote:
> The following is a somewhat naïve implementation, but does it jive
> with your suggestion?

Roughly yes, see the other comment.
On reflection, though, I like the suggestion of returning an
std::pair<std::string, bool>. Make a typedef of that so that users can
declare return-value variables, and use it where it's absolutely
necessary to know that it's not an empty string but a null string. Saves
a lot of trouble with the string subclassing, too. And better than
pulling in 90% of boost just to get a poor-man's replacement for null
references.

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 14.10.2010 20:39, Hyrum K. Wright wrote:
>     inline operator bool() const
>     {
>       return !isNull;
>     }

This bit seriously changes the semantics of std::string. I recommend not
to override operator bool() like this. Write a different accessor
function instead.

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Thu, Oct 14, 2010 at 5:22 AM, Branko Čibej <br...@xbc.nu> wrote:
>  On 13.10.2010 16:10, Hyrum K. Wright wrote:
>> On Tue, Oct 12, 2010 at 5:06 PM, Branko Čibej <br...@xbc.nu> wrote:
>>>  On 12.10.2010 22:30, Hyrum K. Wright wrote:
>>>> On Tue, Oct 12, 2010 at 2:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>>>>>  On 12.10.2010 20:35, Hyrum K. Wright wrote:
>>>>>> 1) Return everything by value
>>>>>>    Pros: simpler memory management, less overhead (?)
>>>>>>    Cons: doesn't allow the return of NULL values, need to establish
>>>>>> conventions to represent NULL objects (an isNull() method?)
>>>>> Meh.
>>>>>
>>>>>    inline operator bool() const { return (this->c_struct_pointer != 0); }
>>>> That works great for our own types, but what about stuff like std::string?
>>>>
>>>> inline std::string getAuthor() const { return std::string(ptr->author); }
>>>>
>>>> doesn't go over so well when ptr->author is NULL.  If returning by
>>>> value, we *have* to return a string, but there just isn't any way to
>>>> indicate the null string.
>>> Good point ... that's a mess. But returning a pointer to an std::string
>>> is a bigger one ... eep.
>> Another option is a custom SVN::String type which looks / smells /
>> acts like a string, but also allows wrapping the NULL value, in a
>> manner you suggest above.
>>
>>> So typically you'd add a hasAuthor function and throw an exception from
>>> getAuthor if there is no author info for a revision. However, in this
>>> particular case, returning an empty string is just as good, unless you
>>> want to make the fine distinction between a svn:author property with an
>>> empty value (is that even allowed?) and no svn:author property on the
>>> revision. This is no different than if you had a getProperty(name) and
>>> did a lookup in a private map of property name/value pairs.
>> I just used getAuthor() as an example, and while I'm not certain as to
>> the specifics in that particular case (ed: I see Mike has answered
>> this elsethread), I know there are other places where the
>> present-but-empty vs. not-present distinction is an important and
>> valid one.
>
> All right. Then derive svn::string from std::string, and add a .null()
> method. You get to use all the standard string alogorithm
> specializations, plus you get what you want.
>
> You can even add a factory class method that is the only way you can
> construct an svn::string object for which .null() returns true, so for
> any normal use, svn::string and std:: string behave in exactly the same way.
>
> Of course, deriving that template is tricky, given that std::string is
> already a template specialization, but since this is a library, it's
> worth the trouble of doing it right.

The following is a somewhat naïve implementation, but does it jive
with your suggestion?

-Hyrum

[[[
/* This class is similar to std::string, with the benefit that it allows
   non-existent strings, similar to a NULL char * value.  When the possibility
   exists that the string is NULL, callers should check the value (using
   the "if(str)" idiom) before doing any other operation.  Most operations on
   NULL mystring's are undefined. */
class mystring : public std::string
{
  public:
    inline mystring()
      : isNull(false),
        std::string()
    {
    }

    inline mystring(const char *s)
      : isNull(s == NULL)
    {
      if (s)
        assign(s);
    }

    inline operator bool() const
    {
      return !isNull;
    }

    inline const char *c_str() const
    {
      if (isNull)
        return NULL;
      else
        return std::string::c_str();
    }

  private:
    bool isNull;
};
]]]

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> Branko Čibej <br...@xbc.nu>:

> std::string also doesn't have any virtual functions, nor does a bool
> member require a destructor. Memory deallocation is OK. std::string is
> actually a POD and tricky rules apply that allow you do do this
> safely, IIRC ...

Nevertheless, deleting a derived object from a base class pointer, when
there isn't a virtual constructor has undefined behaviour.

And IMO venturing into a language's undefined behaviour area is not a
good choice for a public API.

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 14.10.2010 21:32, Steinar Bang wrote:
>>>>>> Branko Čibej <br...@xbc.nu>:
>> All right. Then derive svn::string from std::string, and add a .null()
>> method. You get to use all the standard string alogorithm
>> specializations, plus you get what you want.
> There is one known objection to this: std::string doesn't have a virtual
> destructor, so if you hand a pointer to your derived string off to
> something that holds a std::string* and then do delete on that
> std::string* you enter into "undefined behaviour" territory.
>
> Meaning what happens isn't defined, and different compilers can
> do... interesting things.

std::string also doesn't have any virtual functions, nor does a bool
member require a destructor. Memory deallocation is OK. std::string is
actually a POD and tricky rules apply that allow you do do this safely,
IIRC ...

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> Branko Čibej <br...@xbc.nu>:

> All right. Then derive svn::string from std::string, and add a .null()
> method. You get to use all the standard string alogorithm
> specializations, plus you get what you want.

There is one known objection to this: std::string doesn't have a virtual
destructor, so if you hand a pointer to your derived string off to
something that holds a std::string* and then do delete on that
std::string* you enter into "undefined behaviour" territory.

Meaning what happens isn't defined, and different compilers can
do... interesting things.

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 13.10.2010 16:10, Hyrum K. Wright wrote:
> On Tue, Oct 12, 2010 at 5:06 PM, Branko Čibej <br...@xbc.nu> wrote:
>>  On 12.10.2010 22:30, Hyrum K. Wright wrote:
>>> On Tue, Oct 12, 2010 at 2:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>>>>  On 12.10.2010 20:35, Hyrum K. Wright wrote:
>>>>> 1) Return everything by value
>>>>>    Pros: simpler memory management, less overhead (?)
>>>>>    Cons: doesn't allow the return of NULL values, need to establish
>>>>> conventions to represent NULL objects (an isNull() method?)
>>>> Meh.
>>>>
>>>>    inline operator bool() const { return (this->c_struct_pointer != 0); }
>>> That works great for our own types, but what about stuff like std::string?
>>>
>>> inline std::string getAuthor() const { return std::string(ptr->author); }
>>>
>>> doesn't go over so well when ptr->author is NULL.  If returning by
>>> value, we *have* to return a string, but there just isn't any way to
>>> indicate the null string.
>> Good point ... that's a mess. But returning a pointer to an std::string
>> is a bigger one ... eep.
> Another option is a custom SVN::String type which looks / smells /
> acts like a string, but also allows wrapping the NULL value, in a
> manner you suggest above.
>
>> So typically you'd add a hasAuthor function and throw an exception from
>> getAuthor if there is no author info for a revision. However, in this
>> particular case, returning an empty string is just as good, unless you
>> want to make the fine distinction between a svn:author property with an
>> empty value (is that even allowed?) and no svn:author property on the
>> revision. This is no different than if you had a getProperty(name) and
>> did a lookup in a private map of property name/value pairs.
> I just used getAuthor() as an example, and while I'm not certain as to
> the specifics in that particular case (ed: I see Mike has answered
> this elsethread), I know there are other places where the
> present-but-empty vs. not-present distinction is an important and
> valid one.

All right. Then derive svn::string from std::string, and add a .null()
method. You get to use all the standard string alogorithm
specializations, plus you get what you want.

You can even add a factory class method that is the only way you can
construct an svn::string object for which .null() returns true, so for
any normal use, svn::string and std:: string behave in exactly the same way.

Of course, deriving that template is tricky, given that std::string is
already a template specialization, but since this is a library, it's
worth the trouble of doing it right.

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Tue, Oct 12, 2010 at 5:06 PM, Branko Čibej <br...@xbc.nu> wrote:
>  On 12.10.2010 22:30, Hyrum K. Wright wrote:
>> On Tue, Oct 12, 2010 at 2:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>>>  On 12.10.2010 20:35, Hyrum K. Wright wrote:
>>>> 1) Return everything by value
>>>>    Pros: simpler memory management, less overhead (?)
>>>>    Cons: doesn't allow the return of NULL values, need to establish
>>>> conventions to represent NULL objects (an isNull() method?)
>>> Meh.
>>>
>>>    inline operator bool() const { return (this->c_struct_pointer != 0); }
>> That works great for our own types, but what about stuff like std::string?
>>
>> inline std::string getAuthor() const { return std::string(ptr->author); }
>>
>> doesn't go over so well when ptr->author is NULL.  If returning by
>> value, we *have* to return a string, but there just isn't any way to
>> indicate the null string.
>
> Good point ... that's a mess. But returning a pointer to an std::string
> is a bigger one ... eep.

Another option is a custom SVN::String type which looks / smells /
acts like a string, but also allows wrapping the NULL value, in a
manner you suggest above.

> So typically you'd add a hasAuthor function and throw an exception from
> getAuthor if there is no author info for a revision. However, in this
> particular case, returning an empty string is just as good, unless you
> want to make the fine distinction between a svn:author property with an
> empty value (is that even allowed?) and no svn:author property on the
> revision. This is no different than if you had a getProperty(name) and
> did a lookup in a private map of property name/value pairs.

I just used getAuthor() as an example, and while I'm not certain as to
the specifics in that particular case (ed: I see Mike has answered
this elsethread), I know there are other places where the
present-but-empty vs. not-present distinction is an important and
valid one.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 10/12/2010 06:06 PM, Branko Čibej wrote:
> So typically you'd add a hasAuthor function and throw an exception from
> getAuthor if there is no author info for a revision. However, in this
> particular case, returning an empty string is just as good, unless you
> want to make the fine distinction between a svn:author property with an
> empty value (is that even allowed?) and no svn:author property on the
> revision. This is no different than if you had a getProperty(name) and
> did a lookup in a private map of property name/value pairs.

Not paying much attention to the larger discussion, but yes, you can set
svn:author to an empty value, and you can remove the property altogether.
Any bindings layer above our APIs should preserve the distinction.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 12.10.2010 22:30, Hyrum K. Wright wrote:
> On Tue, Oct 12, 2010 at 2:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>>  On 12.10.2010 20:35, Hyrum K. Wright wrote:
>>> 1) Return everything by value
>>>    Pros: simpler memory management, less overhead (?)
>>>    Cons: doesn't allow the return of NULL values, need to establish
>>> conventions to represent NULL objects (an isNull() method?)
>> Meh.
>>
>>    inline operator bool() const { return (this->c_struct_pointer != 0); }
> That works great for our own types, but what about stuff like std::string?
>
> inline std::string getAuthor() const { return std::string(ptr->author); }
>
> doesn't go over so well when ptr->author is NULL.  If returning by
> value, we *have* to return a string, but there just isn't any way to
> indicate the null string.

Good point ... that's a mess. But returning a pointer to an std::string
is a bigger one ... eep.

So typically you'd add a hasAuthor function and throw an exception from
getAuthor if there is no author info for a revision. However, in this
particular case, returning an empty string is just as good, unless you
want to make the fine distinction between a svn:author property with an
empty value (is that even allowed?) and no svn:author property on the
revision. This is no different than if you had a getProperty(name) and
did a lookup in a private map of property name/value pairs.

I'm afraid it's not going to be easy to be consistent, but I'd strive to
follow (1) wherever it's reasonable.

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Tue, Oct 12, 2010 at 2:40 PM, Branko Čibej <br...@xbc.nu> wrote:
>  On 12.10.2010 20:35, Hyrum K. Wright wrote:
>> 1) Return everything by value
>>    Pros: simpler memory management, less overhead (?)
>>    Cons: doesn't allow the return of NULL values, need to establish
>> conventions to represent NULL objects (an isNull() method?)
>
> Meh.
>
>    inline operator bool() const { return (this->c_struct_pointer != 0); }

That works great for our own types, but what about stuff like std::string?

inline std::string getAuthor() const { return std::string(ptr->author); }

doesn't go over so well when ptr->author is NULL.  If returning by
value, we *have* to return a string, but there just isn't any way to
indicate the null string.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 12.10.2010 20:35, Hyrum K. Wright wrote:
> 1) Return everything by value
>    Pros: simpler memory management, less overhead (?)
>    Cons: doesn't allow the return of NULL values, need to establish
> conventions to represent NULL objects (an isNull() method?)

Meh.

    inline operator bool() const { return (this->c_struct_pointer != 0); }

> 2) Return everything by pointer
>    Pros: can represent the NULL value, potentially less memory overhead
>    Cons: more complex memory management (caller must delete returned value)

This would be seriously horrible since any user of the library would
then have to invent their own way to manage the lifetime of these
objects (i.e., wrap them in smart pointers). Defeats the whole purpose
of having a C++ API in the first place IMHO.

> 3) Some combination of (1) & (2)
>    Pros: Can return guaranteed existing things by value, potentially
> NULL ones by pointer
>    Cons: Making the distinction between optional and required takes
> more effort, a heterogeneous API is more work for callers

Yup, so it is, so go with (1) and do it right. :)

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Branko Čibej <br...@xbc.nu>.
 On 13.10.2010 22:05, Steinar Bang wrote:
> I'm divided on whether I would move the delete of the RefCounter object
> into the dec_ref method, or leave it where it is.  On one hand it feels
> like good encapsulation to put it together with the dec_ref, on the
> other hand, I've always been wary of "delete this;"...

Why? Just because it feels strange to you? Same reason why you'd rather
copy-past lots of classes instead of instantiate templates? ;)

Kidding, but a serious question nevertheless.

-- Brane

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> This is an interesting approach...and parallels what has already been
> done. :)

> Out of curiosity, have you taken a chance to look at the
> existing code here:
> http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h
> ? 

Looking at it now.

> Of particular interest are the RefCounter and CStructWrapper classes.

Hm... personally I would have gone for a non-templatized solution.  More
classes (the classes I spelled out, you create by instantiating
templates), perhaps, and more code.  But IMO higher readability.

But it's a personal choice thing, and not an argument against.

I'm divided on whether I would move the delete of the RefCounter object
into the dec_ref method, or leave it where it is.  On one hand it feels
like good encapsulation to put it together with the dec_ref, on the
other hand, I've always been wary of "delete this;"...

So leaving the delete where it is is probably a good choice. :-)

Other than that it looks good to me (with the caveat that I haven't
actually tried using it, and that's when you know whether an API is good
or not).


Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 13, 2010 at 2:28 PM, Steinar Bang <sb...@dod.no> wrote:
>>>>>> Steinar Bang <sb...@dod.no>:
>
>>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:
>>> 2) Return everything by pointer
>>> Pros: can represent the NULL value, potentially less memory overhead
>>> Cons: more complex memory management (caller must delete returned value)
>
>> Um... why must the caller delete the returned value?  I thought you were
>> using refcounting smartpointers?
>
> Boost has two such (shared_ptr which is non-intrusive, and intrusive_ptr
> which is).
>  http://www.boost.org/doc/libs/1_44_0/libs/smart_ptr/smart_ptr.htm
>  http://www.drdobbs.com/184401507
>
> But personally, I wouldn't have used either, because the templated smart
> pointers are more complicated than smart pointers have to be, and it
> would create a dependency to boost (which btw is a nice library to have
> dependencies to, and contains stuff the Standard library should have
> contained, instead of what it actually contains... but that's a
> different story...).
>
> What I would do is to:
>  - create a base class ReferenceCountedObject with an increment() method
>   and a decrement() method, and a protected virtual
>   freeWrappedPointer() method
>  - I would also create a base smart pointer class for
>   ReferenceCountedObject* that
>   - In its copy constructor(s) would increment the referenced object's
>     reference count
>   - In the assignment operator would decrement the reference count of
>     any object the pointer is already pointing to, and increment the
>     reference count of the assigned object
>   - In its destructor, decrement the reference count of the referenced
>     object
>   If possible, I would make the constructors and assignment operator
>   protected
>  - I would inherit from ReferenceCountedObject to make all of my wrapper
>   classes
>  - The wrapper classes would implement freeWrappedPointer() to free the
>   SVN API object they have a pointer to in the correct manner
>  - I would inherit from the base smart pointer class to create one smart
>   pointer for each wrapper class.  These smart pointers' copy
>   constructors and assignment operators would delegate to the base
>   class for copy constructors and assignment, and implement wrapper
>   class specific de-referencing operators
>  - I would create a singleton class wrapping the API, functioning as a
>   factory for all API objects
>  - I would disallow making wrapper class instances (either by using pure
>   virtual classes in the API, and hidden implementation classes, or by
>   giving the wrappers private constructors and making the singleton API
>   wrapper class a friend.  I think I prefer the pure virtual with
>   hidden implementation approach)
>
> The reason for not creating the wrapper objects directly is that you
> would like to have control over the initial increment value.  Another
> reason is that having such a singleton class maps nicely to wrapping a C
> API.

This is an interesting approach...and parallels what has already been
done. :)  Out of curiosity, have you taken a chance to look at the
existing code here:
http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h
?  Of particular interest are the RefCounter and CStructWrapper
classes.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> Steinar Bang <sb...@dod.no>:

>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:
>> 2) Return everything by pointer
>> Pros: can represent the NULL value, potentially less memory overhead
>> Cons: more complex memory management (caller must delete returned value)

> Um... why must the caller delete the returned value?  I thought you were
> using refcounting smartpointers?

Boost has two such (shared_ptr which is non-intrusive, and intrusive_ptr
which is).
  http://www.boost.org/doc/libs/1_44_0/libs/smart_ptr/smart_ptr.htm
  http://www.drdobbs.com/184401507

But personally, I wouldn't have used either, because the templated smart
pointers are more complicated than smart pointers have to be, and it
would create a dependency to boost (which btw is a nice library to have
dependencies to, and contains stuff the Standard library should have
contained, instead of what it actually contains... but that's a
different story...).

What I would do is to:
 - create a base class ReferenceCountedObject with an increment() method
   and a decrement() method, and a protected virtual
   freeWrappedPointer() method
 - I would also create a base smart pointer class for
   ReferenceCountedObject* that
   - In its copy constructor(s) would increment the referenced object's
     reference count
   - In the assignment operator would decrement the reference count of
     any object the pointer is already pointing to, and increment the
     reference count of the assigned object
   - In its destructor, decrement the reference count of the referenced
     object 
   If possible, I would make the constructors and assignment operator
   protected 
 - I would inherit from ReferenceCountedObject to make all of my wrapper
   classes
 - The wrapper classes would implement freeWrappedPointer() to free the
   SVN API object they have a pointer to in the correct manner
 - I would inherit from the base smart pointer class to create one smart
   pointer for each wrapper class.  These smart pointers' copy
   constructors and assignment operators would delegate to the base
   class for copy constructors and assignment, and implement wrapper
   class specific de-referencing operators
 - I would create a singleton class wrapping the API, functioning as a
   factory for all API objects
 - I would disallow making wrapper class instances (either by using pure
   virtual classes in the API, and hidden implementation classes, or by
   giving the wrappers private constructors and making the singleton API
   wrapper class a friend.  I think I prefer the pure virtual with
   hidden implementation approach)

The reason for not creating the wrapper objects directly is that you
would like to have control over the initial increment value.  Another
reason is that having such a singleton class maps nicely to wrapping a C
API.

RE: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Bolstridge, Andrew" <an...@intergraph.com>.
-----Original Message-----
> From: Steinar Bang [mailto:sb@dod.no] 
> Sent: 13 October 2010 21:16
> To: dev@subversion.apache.org
> Subject: Re: object-model: Return by value, reference or pointer? (or
something else?)

> Do you need the distrinction between an empty string and a NULL?
> If not then I would return an empty string for a NULL on the C side.

> There's also the consideration that some std::string implementations
are deep copying (the gcc version uses refcounting > (or at least used
to do so), but the VC++ one used to use deep copying.  I'm not sure if
that still is the case).

--
Yes, it does, they removed refcounts after VC6. The problem with
refcounting a std::string is the performance of it in a threaded
environment is worse than simply copying the string (you have to handle
all the copy-on-write issues and this gets difficult).
http://www.sgi.com/tech/stl/string_discussion.html describes some of the
problems.


I have agree with Steiner - what's the difference between a null string
and an empty string? Typically there is no difference. In the case of C
there was, as the char* might be NULL, or it might point to a byte with
\0 as its first character. That NULLness is an artefact of how C manages
string memory, not any way of describing the string. Once you start
using both, you get stuff like .NETs '.IsNullorEmpty()' methods - and
then you're back right where you started, treating a NULL and an empty
string as the same thing.

In C++, if you really need to return either nothing or a string, return
a std::pair<bool, std::string> (like std::map uses to indicate if the
insertion succeeded or failed due to an existing key was already present
in the map).

Most DB access methods return the datatype, and another indicator
whether the data was NULL. 

Personally, I would stick to just returning a string, or if it is
essential to return a null value, return a string and an null indicator
- your choice whether that's best as a pair, or an out parameter. I
would not inherit from std::string, nor would I throw an exception (not
that anyone's yet suggested that - we're not C# or Java programmers
after all :-) )

Andy



Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> I don't want the caller to have to depend upon the lifetime of the
> source object, hence the desire to return something by value or a
> newly allocated pointer.

> Additionally, it still wouldn't work, since references have to point
> to some object, hence there is no such thing as a "null reference".

Well... this might actually be a use for auto_ptr<string>.

If you return auto_ptr<string> you force the caller to assign it to an
auto_ptr<string> (ie. something that will clean up after itself when it
goes out of scope).  This assignment would assign ownership to the
string from the return-by-value'd auto_ptr<string> to the variable you
assign it to.

If you fail to assign, the returned auto_ptr should go out of scope, and
delete the string it returns to.

That is... if auto_ptr<string> can hold NULL values.  I don't know if it
can (I think it can, but I've never used it, so I don't know).  Yep,
looks like it does.  See eg. http://en.wikipedia.org/wiki/Auto_ptr
(example program with assignment a bit down).

> (This entire conversation is reminding my why I *hate* C++.  To bad
> there isn't too much of an alternative here... :/)

(I saw a review once for Scott Meyers' "Effective C++", where the
reviewer said something like "Scott Meyers shows that he has a thorugh
grasp of C++.  It can be debated whether having a through grasp of C++
is a good thing or not...".  :-)  (btw if you have to work in C++ then
that particular book, is essential reading))

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 13, 2010 at 3:16 PM, Steinar Bang <sb...@dod.no> wrote:
>>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:
>
>> At this point, it's my turn to get confused.  I'm speaking
>> specifically of cases where we return STL classes.  Brane's earlier
>> suggestion to create an implicit bool conversion works fine for our
>> custom classes.  The problem I'm trying to solve is "how does one
>> return the equivalent of '(const char *) NULL' in an std::string?"
>> It's a this point that pointers start coming into play, because a NULL
>> std::string * is quite feasible, whereas a NULL std::string isn't.
>
> Ah, ok.  That was at a different part of the thread... which I avoided
> responding to because I have any good answers at hand. :-)
>
> Do you need the distrinction between an empty string and a NULL?
>
> If not then I would return an empty string for a NULL on the C side.

Yes, so that's a no-go.

> There's also the consideration that some std::string implementations are
> deep copying (the gcc version uses refcounting (or at least used to do
> so), but the VC++ one used to use deep copying.  I'm not sure if that
> still is the case).
>
> Then maybe what you want to return is const std::string& (which leaves
> you with the headache of managing the life cycle of the std::string, so
> maybe not... though... you have the context of the wrapper class and
> could cache them lazily there.  That would increase the footprint of the
> wrapper, though).

I don't want the caller to have to depend upon the lifetime of the
source object, hence the desire to return something by value or a
newly allocated pointer.

Additionally, it still wouldn't work, since references have to point
to some object, hence there is no such thing as a "null reference".

(This entire conversation is reminding my why I *hate* C++.  To bad
there isn't too much of an alternative here... :/)

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> At this point, it's my turn to get confused.  I'm speaking
> specifically of cases where we return STL classes.  Brane's earlier
> suggestion to create an implicit bool conversion works fine for our
> custom classes.  The problem I'm trying to solve is "how does one
> return the equivalent of '(const char *) NULL' in an std::string?"
> It's a this point that pointers start coming into play, because a NULL
> std::string * is quite feasible, whereas a NULL std::string isn't.

Ah, ok.  That was at a different part of the thread... which I avoided
responding to because I have any good answers at hand. :-)

Do you need the distrinction between an empty string and a NULL?

If not then I would return an empty string for a NULL on the C side.

There's also the consideration that some std::string implementations are
deep copying (the gcc version uses refcounting (or at least used to do
so), but the VC++ one used to use deep copying.  I'm not sure if that
still is the case).

Then maybe what you want to return is const std::string& (which leaves
you with the headache of managing the life cycle of the std::string, so
maybe not... though... you have the context of the wrapper class and
could cache them lazily there.  That would increase the footprint of the
wrapper, though).


Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by eg <eg...@gmail.com>.
On 10/13/2010 12:51 PM, Hyrum K. Wright wrote:

>
> At this point, it's my turn to get confused.  I'm speaking
> specifically of cases where we return STL classes.  Brane's earlier
> suggestion to create an implicit bool conversion works fine for our
> custom classes.  The problem I'm trying to solve is "how does one
> return the equivalent of '(const char *) NULL' in an std::string?"
> It's a this point that pointers start coming into play, because a NULL
> std::string * is quite feasible, whereas a NULL std::string isn't.


It may not be of interest, but the Boost::Optional library was designed 
as a potential solution to this type of situation.


http://www.boost.org/doc/libs/1_44_0/libs/optional/doc/html/index.html

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 13, 2010 at 2:42 PM, Steinar Bang <sb...@dod.no> wrote:
>>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:
>
>> Refcounting smartpointers are used to avoid duplication of the
>> underlying C structures, but they are completely private.  I've got
>> serious reservations about exposing that complexity in the external
>> object interfaces.  Given those reservations, the API would just
>> return a vanilla pointer, which leaves the caller the responsibility
>> of freeing it.
>
> Here I'm not following you.  You would hide the complexity of smart
> pointers, and would return plain pointers?

I would let the callers worry about the complexity of smart pointers
externally, rather than make an eternal interface decision based upon
them.

> Hm... the first I wouldn't worry about, and on the second I would go to
> great lenghts to make it hard to get at the plain pointers.

At this point, it's my turn to get confused.  I'm speaking
specifically of cases where we return STL classes.  Brane's earlier
suggestion to create an implicit bool conversion works fine for our
custom classes.  The problem I'm trying to solve is "how does one
return the equivalent of '(const char *) NULL' in an std::string?"
It's a this point that pointers start coming into play, because a NULL
std::string * is quite feasible, whereas a NULL std::string isn't.

These strings, and their contents are copied from the internal values
used by the wrapper classes, so the actual C data would be completely
obscured.  This is evidenced by the current (aforementioned) design.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> Refcounting smartpointers are used to avoid duplication of the
> underlying C structures, but they are completely private.  I've got
> serious reservations about exposing that complexity in the external
> object interfaces.  Given those reservations, the API would just
> return a vanilla pointer, which leaves the caller the responsibility
> of freeing it.

Here I'm not following you.  You would hide the complexity of smart
pointers, and would return plain pointers?

Hm... the first I wouldn't worry about, and on the second I would go to
great lenghts to make it hard to get at the plain pointers.

At least if we're talking about a variant of your alternative 2.

>> Generally 1 gives a nicer API.  But if you have any inheritance and
>> virtual methods going, then 2 is simplest.

> No inheritance or virtual stuff, as of yet, so not a concern.

Ok, that makes it simpler.  Then go for 1.  I would have inheritance
here to make a SPOC for the refcounting, but it could be done without
virtual methods.


Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> Refcounting smartpointers are used to avoid duplication of the
> underlying C structures, but they are completely private.  I've got
> serious reservations about exposing that complexity in the external
> object interfaces.  Given those reservations, the API would just
> return a vanilla pointer, which leaves the caller the responsibility
> of freeing it.

Here I'm not following you.  You would hide the complexity of smart
pointers, and would return plain pointers?

Hm... the first I wouldn't worry about, and on the second I would go to
great lenghts to make it hard to get at the plain pointers.

At least if we're talking about a variant of your alternative 2.

>> Generally 1 gives a nicer API.  But if you have any inheritance and
>> virtual methods going, then 2 is simplest.

> No inheritance or virtual stuff, as of yet, so not a concern.

Ok, that makes it simpler.  Then go for 1.  I would have inheritance
here to make a SPOC for the refcounting, but it could be done without
virtual methods.

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 13, 2010 at 11:51 AM, Steinar Bang <sb...@dod.no> wrote:
>>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:
>
>> 2) Return everything by pointer
>>    Pros: can represent the NULL value, potentially less memory overhead
>>    Cons: more complex memory management (caller must delete returned value)
>
> Um... why must the caller delete the returned value?  I thought you were
> using refcounting smartpointers?

Refcounting smartpointers are used to avoid duplication of the
underlying C structures, but they are completely private.  I've got
serious reservations about exposing that complexity in the external
object interfaces.  Given those reservations, the API would just
return a vanilla pointer, which leaves the caller the responsibility
of freeing it.

> Generally 1 gives a nicer API.  But if you have any inheritance and
> virtual methods going, then 2 is simplest.

No inheritance or virtual stuff, as of yet, so not a concern.

-Hyrum

Re: object-model: Return by value, reference or pointer? (or something else?)

Posted by Steinar Bang <sb...@dod.no>.
>>>>> "Hyrum K. Wright" <hy...@mail.utexas.edu>:

> 2) Return everything by pointer
>    Pros: can represent the NULL value, potentially less memory overhead
>    Cons: more complex memory management (caller must delete returned value)

Um... why must the caller delete the returned value?  I thought you were
using refcounting smartpointers?

Generally 1 gives a nicer API.  But if you have any inheritance and
virtual methods going, then 2 is simplest.