Posted to dev@corinthia.apache.org by jan i <ja...@apache.org> on 2015/08/10 13:26:23 UTC

C99 versus C++ (limited)

Hi

Peter and I talked the other day about, among other things, the benefits of
using C++ instead of sticking to C99.

This would be a major change in the project (less in the code, more in the
"how to"), and it is not something we should "just" do.

I favor C++, but not without limits. I see 3 places where C++ can give us more
stable code:
- Interfaces.
Using classes to group our functions (e.g. platform, core, filters/odf etc.)
would make it very clear where a function originates. It would also allow us to
group global variables and keep them private from the rest of the world.
I would not use real interface classes for our internal grouping; that is
not needed. But the DocFormats API, for example, should be a real interface class.
- Automatic memory management.
At the moment we have a lot of code managing construction/destruction
that could be fully automated by using C++ smart pointers.
- Object model.
Filters, Flat and core would be more logically represented as objects, and
copying etc. would become a lot easier.
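
To make the "Interfaces" point a bit more concrete, here is a minimal sketch of the two kinds of grouping: a real interface class for a public API, and a plain class that groups functions and hides a former global variable. All names (DocumentConverter, OdfConverter, Platform, makeOdfConverter) are hypothetical and chosen only for illustration; none of them come from the DocFormats source.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical names throughout; nothing here is taken from DocFormats.
namespace corinthia {

// A "real" interface class for a public API: pure virtual functions, no data.
class DocumentConverter {
public:
    virtual ~DocumentConverter() = default;
    virtual std::string formatName() const = 0;
    virtual bool canConvert(const std::string &filename) const = 0;
};

// Internal grouping: static functions plus a private static variable that
// code outside the class cannot touch.
class Platform {
public:
    static int nextId() { return ++counter_; }

private:
    static int counter_;
};
int Platform::counter_ = 0;

// One implementation of the interface, selected by file extension.
class OdfConverter final : public DocumentConverter {
public:
    std::string formatName() const override { return "ODF"; }
    bool canConvert(const std::string &filename) const override {
        const std::string ext = ".odt";
        return filename.size() >= ext.size() &&
               filename.compare(filename.size() - ext.size(), ext.size(), ext) == 0;
    }
};

// Callers only ever see the interface type.
std::unique_ptr<DocumentConverter> makeOdfConverter() {
    return std::unique_ptr<DocumentConverter>(new OdfConverter());
}

// Small usage example.
void interfaceExample() {
    std::unique_ptr<DocumentConverter> conv = makeOdfConverter();
    assert(conv->formatName() == "ODF");
    assert(conv->canConvert("letter.odt"));
}

} // namespace corinthia
```

The sketch stays within the limits proposed here: one level of inheritance, no multiple inheritance, and the counter is reachable only through Platform::nextId().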

I would not like to see deep inheritance hierarchies (and especially not
multiple inheritance).

I fail to see what we lose by making the change, but please give your
opinion.

rgds
jan i.

Ps. This is in no way a vote thread, but simply a way to gather opinions.

Re: C99 versus C++ (limited)

Posted by jan i <ja...@apache.org>.

Hi.

I started this thread to see if there was a big interest in converting to
C++ (limited). Looking at the comments, I would categorize such a change as
ranging from "nice to have" to "not needed", at least for now.

Peter told me very politely (using different words) that there is so much
to do in Corinthia that my time could be better spent elsewhere.

Therefore we will keep DocFormats unchanged, in C99.

I hope nobody objects if I play with C++ for the editor framework. Since
it is a consumer, the effect on other parts is non-existent.

rgds
jan i.


On 13 August 2015 at 22:31, Peter Kelly <pm...@apache.org> wrote:

> I think C++ would make working with the DocFormats library, at least in
> its current form, significantly easier. In particular, the explicit support
> for classes, and the ability to use smart pointers (thus avoiding manual
> reference counting) would be a big win in terms of complexity.
>
> As a background to why the library is in C and not C++:
>
> The reason is that originally DocFormats was written in Objective C (since
> I was targeting only iOS at the time). Objective C is a superset of C, so
> when I decided I wanted to open source the code and enable it to be used on
> non-Apple platforms, I methodically went through the source tree converting
> all the Objective C classes and reference counting statements into their C
> equivalents. Objective C has automatic reference counting now, but at the
> time I was not using it, so this meant the translation was relatively
> straightforward.
>
> While it *is* possible to mix Objective C and C++, doing so results in an
> additional layer of complexity, which I wanted to avoid - you have two ways
> of defining classes etc. The conversion to C was simpler than I expect a
> conversion to C++ would have been. However, now all the code is in C and
> completely free of any Apple-specific dependencies, I think it would be
> reasonable to move to C++ to more concisely express many of the things that
> are currently done explicitly (memory management being the most
> significant). The resulting code would also be more readable.
>
> I don’t volunteer to do the conversion myself, since it’s a lot of work.
> However for anyone willing to take on the task, this would be an excellent
> way of becoming intimately familiar with the library, which would be of
> great use in developing ODF and other filters.
>
> I kind of have a natural aversion to C++ because of its complexity, and
> the sheer number of features which, if they are all (or even a significant
> portion of them) used, can lead to very complicated code. I think we should
> agree on fairly strict guidelines on the subset of the language we use, to
> avoid things “getting out of hand” with the codebase, so to speak.
>
> There are some nice properties of C I like, such as the ability to grep
> for a function name throughout the whole source tree to find out all the
> places it’s used, which is handy for refactoring. Xcode also has some
> refactoring tools which work for Objective C and most of C, which I used a
> lot doing the original conversion, but these do not work with C++ (of
> course this is a limitation of Xcode, not a problem with C++ per se).
>
> There are some specific areas we’ll need to be careful about in terms of
> performance. Actually the first pure C code I had in DocFormats, long
> before I converted the whole library, was the DFNode and DFDocument
> structures, which use a specialised memory allocator that simply allocates
> a slab of memory and frees it all in one go after conversion has finished.
> Prior to that, every node was a separate Objective C object, and freeing a
> whole document took an inordinate amount of time, due to the large number
> of release messages sent to free individual nodes, and the fact that
> Objective C’s dispatch mechanism is not efficient for compute-intensive
> code. This had a very noticeable impact on load times of large documents,
> which was greatly improved by switching to a customised, efficient memory
> allocation strategy. We should maintain this when moving to C++.
>
> Regarding Flat, I’d like to keep that in C at least for now, because my
> plan is to build a virtual machine for executing Flat programs, and for
> which I’ll implement a garbage collector, which necessarily requires
> intimate knowledge of the memory layout of objects. While this is possible
> to do in C++, it’s easier in C as there are fewer abstractions in the way.
> Flat is also about to get its own type system, which will be different in
> many respects from that of C++ (and more tailored towards the task of
> transformation). I’ll post more on this in due course.
>
> But for the bulk of the DocFormats code, I think it makes sense to move to
> C++, and that we’ll benefit from the improved maintainability and make it
> easier for new committers coming into the project to understand the
> structure of the code.
>
> —
> Dr Peter M. Kelly
> pmkelly@apache.org
>
> PGP key: http://www.kellypmk.net/pgp-key
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
>

Re: C99 versus C++ (limited)

Posted by Peter Kelly <pm...@apache.org>.

I think C++ would make working with the DocFormats library, at least in its current form, significantly easier. In particular, the explicit support for classes, and the ability to use smart pointers (thus avoiding manual reference counting) would be a big win in terms of complexity.
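
As a rough illustration of what "avoiding manual reference counting" could look like, here is a minimal sketch. The Node struct and the NodeNew/NodeRetain/NodeRelease functions are hypothetical stand-ins for a C-style retain/release API and are not the real DocFormats functions; the point is only that a std::unique_ptr with a custom deleter can call the release function automatically.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style object with manual reference counting
// (a stand-in for the retain/release pattern, not the DocFormats API).
struct Node {
    int refCount;
    int value;
};

int liveNodes = 0; // number of nodes not yet freed, so leaks can be checked

Node *NodeNew(int value) {
    ++liveNodes;
    return new Node{1, value};
}

Node *NodeRetain(Node *n) {
    ++n->refCount;
    return n;
}

void NodeRelease(Node *n) {
    if (n != nullptr && --n->refCount == 0) {
        --liveNodes;
        delete n;
    }
}

// Deleter that lets std::unique_ptr call NodeRelease for us.
struct NodeReleaser {
    void operator()(Node *n) const { NodeRelease(n); }
};
using NodePtr = std::unique_ptr<Node, NodeReleaser>;

// Take ownership of a freshly created node (refCount == 1).
NodePtr makeNode(int value) { return NodePtr(NodeNew(value)); }

// Share an existing node: retain it and wrap the extra reference.
NodePtr shareNode(const NodePtr &other) { return NodePtr(NodeRetain(other.get())); }

// Returns the number of live nodes after both handles have gone out of scope.
int refCountExample() {
    {
        NodePtr a = makeNode(42);
        NodePtr b = shareNode(a); // two handles, one node
        assert(a->refCount == 2);
        assert(b->value == 42);
    } // NodeRelease runs once per handle here, with no explicit calls
    return liveNodes;
}
```

The existing C functions stay as they are in this approach; only the call sites stop pairing every retain with a hand-written release.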

As a background to why the library is in C and not C++:

The reason is that originally DocFormats was written in Objective C (since I was targeting only iOS at the time). Objective C is a superset of C, so when I decided I wanted to open source the code and enable it to be used on non-Apple platforms, I methodically went through the source tree converting all the Objective C classes and reference counting statements into their C equivalents. Objective C has automatic reference counting now, but at the time I was not using it, so this meant the translation was relatively straightforward.

While it *is* possible to mix Objective C and C++, doing so results in an additional layer of complexity, which I wanted to avoid - you have two ways of defining classes etc. The conversion to C was simpler than I expect a conversion to C++ would have been. However, now all the code is in C and completely free of any Apple-specific dependencies, I think it would be reasonable to move to C++ to more concisely express many of the things that are currently done explicitly (memory management being the most significant). The resulting code would also be more readable.

I don’t volunteer to do the conversion myself, since it’s a lot of work. However for anyone willing to take on the task, this would be an excellent way of becoming intimately familiar with the library, which would be of great use in developing ODF and other filters.

I kind of have a natural aversion to C++ because of its complexity, and the sheer number of features which, if they are all (or even a significant portion of them) used, can lead to very complicated code. I think we should agree on fairly strict guidelines on the subset of the language we use, to avoid things “getting out of hand” with the codebase, so to speak.

There are some nice properties of C I like, such as the ability to grep for a function name throughout the whole source tree to find out all the places it’s used, which is handy for refactoring. Xcode also has some refactoring tools which work for Objective C and most of C, which I used a lot doing the original conversion, but these do not work with C++ (of course this is a limitation of Xcode, not a problem with C++ per se).

There are some specific areas we’ll need to be careful about in terms of performance. Actually the first pure C code I had in DocFormats, long before I converted the whole library, was the DFNode and DFDocument structures, which use a specialised memory allocator that simply allocates a slab of memory and frees it all in one go after conversion has finished. Prior to that, every node was a separate Objective C object, and freeing a whole document took an inordinate amount of time, due to the large number of release messages sent to free individual nodes, and the fact that Objective C’s dispatch mechanism is not efficient for compute-intensive code. This had a very noticeable impact on load times of large documents, which was greatly improved by switching to a customised, efficient memory allocation strategy. We should maintain this when moving to C++.
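
For readers unfamiliar with the slab approach, the idea can be sketched in C++ roughly as follows. This is an illustrative arena under my own assumptions (the Arena class, its 64 KB default slab size and the TreeNode type are all hypothetical), not the actual DFNode/DFDocument allocator. Objects are carved out of large slabs, and everything is freed in one go when the arena is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical slab ("arena") allocator; not the DocFormats implementation.
class Arena {
public:
    explicit Arena(std::size_t slabSize = 64 * 1024) : slabSize_(slabSize) {}

    // The arena owns raw memory, so copying is disabled.
    Arena(const Arena &) = delete;
    Arena &operator=(const Arena &) = delete;

    // Hand out `size` bytes aligned to `align`. `align` must be a power of
    // two no larger than alignof(std::max_align_t).
    void *allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (slabs_.empty() || start + size > capacity_) {
            // Current slab is full (or there is none yet): start a new one.
            capacity_ = size > slabSize_ ? size : slabSize_;
            slabs_.push_back(std::unique_ptr<unsigned char[]>(new unsigned char[capacity_]));
            start = 0;
        }
        used_ = start + size;
        return slabs_.back().get() + start;
    }

    // Construct a T inside the arena. Destructors are never run, so only
    // trivially destructible types are accepted.
    template <typename T, typename... Args>
    T *create(Args &&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "the arena never runs destructors");
        return new (allocate(sizeof(T), alignof(T))) T{std::forward<Args>(args)...};
    }

    std::size_t slabCount() const { return slabs_.size(); }

private:
    std::size_t slabSize_;
    std::size_t capacity_ = 0; // size of the current slab
    std::size_t used_ = 0;     // bytes used in the current slab
    std::vector<std::unique_ptr<unsigned char[]>> slabs_;
};

// Hypothetical node type for the example.
struct TreeNode {
    int tag;
    TreeNode *next;
};

// Builds a 1000-node list and returns how many slabs were needed.
std::size_t arenaExample() {
    Arena arena(1024);
    TreeNode *head = nullptr;
    for (int i = 0; i < 1000; ++i)
        head = arena.create<TreeNode>(i, head);
    assert(head->tag == 999);
    return arena.slabCount();
} // all nodes are freed here, one slab at a time, with no per-node work
```

Freeing costs one deallocation per slab rather than one per node, which is the property that matters for large documents.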

Regarding Flat, I’d like to keep that in C at least for now, because my plan is to build a virtual machine for executing Flat programs, and for which I’ll implement a garbage collector, which necessarily requires intimate knowledge of the memory layout of objects. While this is possible to do in C++, it’s easier in C as there are fewer abstractions in the way. Flat is also about to get its own type system, which will be different in many respects from that of C++ (and more tailored towards the task of transformation). I’ll post more on this in due course.

But for the bulk of the DocFormats code, I think it makes sense to move to C++, and that we’ll benefit from the improved maintainability and make it easier for new committers coming into the project to understand the structure of the code.

—
Dr Peter M. Kelly
pmkelly@apache.org

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)

Re: C99 versus C++ (limited)

Posted by Ian C <ia...@amham.net>.

Hi all,

My day job is as a C++ programmer, and I was finding it quite refreshing
to go back to straight C.
In some respects it is easier.

I think we need to look at whether there really are any major benefits
to using one over the other with respect to the goals of Corinthia,
not as a flame-war-like debate on the merits of a given language.

However, I am not sure that I can clearly identify the goals well
enough to give a considered opinion.

Thinking aloud ...

Corinthia takes an input document (ignore the type for the moment) and
converts/represents it in its own tree structure, the DF nodes. These
are then converted to an HTML form to be displayed/edited.

After an edit the HTML document is converted back to its DF node form
and either merged into the original as an edit or saved in its entirety
in a new form?

At least that is how I think of it.

The internal DF nodes lend themselves to being C++ objects, but I
think the way they are currently implemented is more than adequate. So
what do we gain there?

The transformation of OOXML, ODF, LaTeX, HTML or whatever into the DF
nodes is basically a mapping exercise. As is the reverse. Do we gain
if that is done in C++? I can think of automatically generating
classes for the DOM nodes as the ODF Toolkit does for Java. That may
provide a way to create a more comprehensive and measurable mapping?
But we could just as easily generate C code too? I can even envisage a
kind of higher level mapping definition language to describe the
mappings and then the majority of the thing could be automatically
generated? That would be some task though.

So I go around in circles with no conclusion....

What do OO languages provide? Abstraction, inheritance,
encapsulation, interface management. Do we need all of that? We can
and do, to some extent, create C modules that model that. I suspect we
may be able to make the lens functions a bit clearer in C++?

Ok, I'll stop driveling now....

Does the possible editor have any influence on the language choice? If
we choose X will it align more easily with....???

Ok really stopping now.... :-)

On Mon, Aug 10, 2015 at 7:57 PM, Harry Bachmann
<ha...@powerapp.eu> wrote:
> Hi
>
> I spent some time testing C99 versus C++ a little on the ARM core and found that C99 sometimes produces much better code.
>
> Hope this Helps ;-)
>
> rgs
>
> Harry

-- 
Cheers,

Ian C

Re: C99 versus C++ (limited)

Posted by jan i <ja...@apache.org>.

On Monday, August 10, 2015, Harry Bachmann <ha...@powerapp.eu>
wrote:

> Hi
>
> I spent some time testing C99 versus C++ a little on the ARM core and
> found that C99 sometimes produces much better code.

Better code as in:
- smaller footprint
- more robust
- faster
- something else?

>
> Hope this Helps ;-)

We are always happy for input; that is the benefit of a true open-source
project.

rgds
jan i

>
> rgs
>
> Harry

-- 
Sent from My iPad, sorry for any misspellings.

Re: C99 versus C++ (limited)

Posted by Harry Bachmann <ha...@powerapp.eu>.

Hi

I spent some time testing C99 versus C++ a little on the ARM core and found that C99 sometimes produces much better code.

Hope this Helps ;-)

rgs

Harry