Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2016/06/01 15:04:08 UTC

Re: [lucy-dev] Separate binaries for test suites

On 26/05/2016 04:12, Marvin Humphrey wrote:
> With Python distutils as with Perl's Module::Build, it is at least
> theoretically possible to provide a list of object files to be linked into the
> host extension.

I'm leaning towards the "list of object files" approach for the Perl bindings. 
Unless static libraries are really used as intended, they seem to create more 
problems than they solve.

> But with CGO, I don't know how to do that.
>
> *   You can't tell the Go build system about a list of compiled object files.
> *   You can't tell CGO about arbitrary directories full of C files to compile.
> *   You *can* put loose C files into the package directory, but only that one
>     dir -- nested dirs aren't allowed.
> *   You *can* hack in linker instructions using pseudo `#cgo` directives.
>     https://golang.org/cmd/cgo/#hdr-Using_cgo_with_the_go_command
>
> We're currently using the last option to tell CGO about the static archive
> built using Charmonizer/Make.

Another solution is to add fake references to object files that aren't picked 
up when linking with the static library. The only problematic file for me was 
TestUtils.o. Object files for non-inert classes are always referenced by the 
generated binding code. For inert classes, this could even be automated by 
adding a reference to any symbol of an inert class somewhere in the generated 
code. This leaves non-Clownfish source files (stuff like LFReg) that aren't 
used by the parcel itself, which should be a rare case.
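
A minimal sketch of such a fake reference, with hypothetical names throughout
(`demo_testutils_anchor` stands in for any symbol defined in TestUtils.o;
nothing here is actual Clownfish code). The table would live in the generated
binding code, so that its object file carries an undefined reference to the
symbol and the linker has a reason to extract the archive member. To keep the
sketch self-contained, the stand-in is defined in the same file; in a real
build it would only be declared `extern`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol defined in TestUtils.o. In a real
 * build this would be an extern declaration, with the definition living
 * inside the static archive. */
int
demo_testutils_anchor(void) {
    return 1;
}

/* Generic function pointer type used only to store the references. */
typedef void (*demo_anchor_fn)(void);

/* Taking the address of each symbol is what creates the reference. The
 * table has external linkage so the compiler keeps it in the object file. */
const demo_anchor_fn demo_link_anchors[] = {
    (demo_anchor_fn)demo_testutils_anchor,
};

/* Returns the number of anchored symbols after checking each entry. */
size_t
demo_count_link_anchors(void) {
    size_t n = sizeof(demo_link_anchors) / sizeof(demo_link_anchors[0]);
    for (size_t i = 0; i < n; i++) {
        assert(demo_link_anchors[i] != NULL);
    }
    return n;
}
```

Whether such a reference survives aggressive dead-code elimination depends on
the toolchain, so this would need to be checked per platform.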

>> What about code like `xs/XSBind.c`? Should it be compiled by the host, too?
>
> Yes, IMO -- it's hard to handle otherwise because you have to know about
> installation-specific include dirs and the like.

OK.

Nick


Re: [lucy-dev] Separate binaries for test suites

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jun 14, 2016 at 5:28 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On Jun 8, 2016, at 03:02 , Marvin Humphrey <ma...@rectangular.com> wrote:
>>
>> Over time, we should try to make decisions that consolidate as much of the build
>> code as possible in a shared central system (i.e. charmonizer/make for the
>> foreseeable future).  Even if that code eventually gets refactored, it's only
>> one task to refactor it, as opposed to N tasks to refactor code in N hosts.
>
> Originally, I didn’t want to make Charmonizer a hard requirement but I’m
> willing to give that up, at least for now.

I hear you.  As Charmonizer has expanded a bit, I've been reminded of the old
Larry Wall chestnut, "It's easier to port a shell than a shell script."

What's more important than Charmonizer's implementation is satisfying two
requirements: configuration probing of a C compilation environment must not
impose any prerequisites (e.g. a Unix shell), and the build must not need any
tool beyond what the host language environment provides.  But I figure that
means that, for the time being, it's impractical to work with anything except
bundle-able, portable Charmonizer.

>> I would say tests in a separate binary.  For example, under the Perl bindings
>> the core tests for Lucy would be run from a separate XS module (Lucy::CFTest?
>> LucyCFTest? CFTest?) which has a dependency on the main Lucy XS module.
>
> Maybe we should keep the Lucy::Test convention?

After a bit of reflection: +1

I think that also means *::Test is reserved for any Clownfish parcel.

Marvin Humphrey

Re: [lucy-dev] Separate binaries for test suites

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jun 8, 2016, at 03:02 , Marvin Humphrey <ma...@rectangular.com> wrote:
> 
> Over time, we should try to make decisions that consolidate as much of the build
> code as possible in a shared central system (i.e. charmonizer/make for the
> foreseeable future).  Even if that code eventually gets refactored, it's only
> one task to refactor it, as opposed to N tasks to refactor code in N hosts.

Originally, I didn’t want to make Charmonizer a hard requirement but I’m willing to give that up, at least for now.

> I would say tests in a separate binary.  For example, under the Perl bindings
> the core tests for Lucy would be run from a separate XS module (Lucy::CFTest?
> LucyCFTest? CFTest?) which has a dependency on the main Lucy XS module.

Maybe we should keep the Lucy::Test convention?

> That's a solvable problem: we just rename such dirs when moving them to the
> host language distro.  Then we can find the dirs using logic like this:
> 
>    my $CFCORE = -d "cfcore" ? "cfcore" : "../core";
>    my $CFTEST = -d "cftest" ? "cftest" : "../test";

OK, then the next step is to move the test code.

Nick


Re: [lucy-dev] Separate binaries for test suites

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sat, Jun 4, 2016 at 12:32 PM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 01/06/2016 17:04, Nick Wellnhofer wrote:
>>
>> On 26/05/2016 04:12, Marvin Humphrey wrote:
>>>
>>> With Python distutils as with Perl's Module::Build, it is at least
>>> theoretically possible to provide a list of object files to be linked
>>> into the
>>> host extension.
>>
>>
>> I'm leaning towards the "list of object files" approach for the Perl
>> bindings.  Unless static libraries are really used as intended, they seem
>> to create more problems than they solve.
>
> I submitted a new pull request that makes the Perl bindings use "make" to
> build the core object files.

+1 to merge!

Over time, we should try to make decisions that consolidate as much of the build
code as possible in a shared central system (i.e. charmonizer/make for the
foreseeable future).  Even if that code eventually gets refactored, it's only
one task to refactor it, as opposed to N tasks to refactor code in N hosts.

This branch has some complexities, but it moves us in the right direction.

> Do we want the tests in a separate binary, or do we want a second binary
> containing both core and test code? The latter allows the test code to be
> integrated more tightly; the former requires making some test-only symbols
> visible. An advanced solution would even allow both approaches.

I would say tests in a separate binary.  For example, under the Perl bindings
the core tests for Lucy would be run from a separate XS module (Lucy::CFTest?
LucyCFTest? CFTest?) which has a dependency on the main Lucy XS module.

Most unit tests operate against the public API.  In my opinion, it is OK for
test code to be written as if it were external to the library.

It may be necessary for the test-author to modify the public API in order to
make some kinds of testing possible, or pursue other workarounds in unusual
cases.  That's acceptable.

Some test code will need access to struct definitions.  Since the tests will
only ever operate against the current version, we don't need to be concerned
about ABI compatibility promises.  It should continue to be possible to access
struct definitions, but it would be nice if the interface were a little less
obscure than the current mechanism of defining a `C_PREFIX_CLASSNAME` macro.
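
For reference, the current opt-in mechanism works roughly as sketched below.
The names (`demo_Counter`, `C_DEMO_COUNTER`) are made up for illustration and
the "header" is inlined so the example compiles on its own; this is an
assumption about the general shape, not a copy of the generated headers.

```c
#include <assert.h>
#include <stdlib.h>

/* A file that needs the struct layout opts in before including the header. */
#define C_DEMO_COUNTER

/* ---- contents of a hypothetical generated header ---- */
typedef struct demo_Counter demo_Counter;

#ifdef C_DEMO_COUNTER
/* Only visible to files that defined the opt-in macro. */
struct demo_Counter {
    int value;
};
#endif

demo_Counter *demo_Counter_new(int value);
int demo_Counter_get(const demo_Counter *self);
/* ---- end of header ---- */

demo_Counter*
demo_Counter_new(int value) {
    demo_Counter *self = malloc(sizeof(demo_Counter));
    assert(self != NULL);
    self->value = value;
    return self;
}

int
demo_Counter_get(const demo_Counter *self) {
    return self->value;
}
```

Test code that defines the macro can read `self->value` directly; code that
doesn't define it only sees the opaque typedef and the accessor functions.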

> Currently, we pass parcel privacy defines like CFP_CFISH as command-line
> arguments to the compiler. If we start to build separate binaries, this
> requires putting the test code in a separate directory and using special
> Makefile rules that add some per-directory flags. I think this can be done
> in a cross-platform way but it's a little complicated. Another approach is
> to move the #defines directly into the source files. This is more flexible
> with regard to directory layout, but less flexible when changing
> configurations. Different host languages might even need different settings.

I think "convention over configuration" is a good operating principle here.
Supporting flexibility in directory layouts would be a misfeature.

Complicated build code is technical debt and a barrier to contribution.  It is
often appropriate to take on some debt when evolving a codebase quickly and
when final solutions are not clear, but hopefully we can pay down that debt
eventually.

> It's probably a good idea to move the test code into a separate directory
> regardless of the issue above. Marvin suggested naming it "test". The only
> problem I can see is that this directory must be moved into the host
> language subdirectory when bundling host language distributions like CPAN
> tarballs. "test" is a pretty generic name that might cause clashes, but the
> same is true for "core".

That's a solvable problem: we just rename such dirs when moving them to the
host language distro.  Then we can find the dirs using logic like this:

    my $CFCORE = -d "cfcore" ? "cfcore" : "../core";
    my $CFTEST = -d "cftest" ? "cftest" : "../test";

Marvin Humphrey

Re: [lucy-dev] Separate binaries for test suites

Posted by Nick Wellnhofer <we...@aevum.de>.
On 01/06/2016 17:04, Nick Wellnhofer wrote:
> On 26/05/2016 04:12, Marvin Humphrey wrote:
>> With Python distutils as with Perl's Module::Build, it is at least
>> theoretically possible to provide a list of object files to be linked into the
>> host extension.
>
> I'm leaning towards the "list of object files" approach for the Perl bindings.
> Unless static libraries are really used as intended, they seem to create more
> problems than they solve.

I submitted a new pull request that makes the Perl bindings use "make" to build 
the core object files.

Some open questions:

Do we want the tests in a separate binary, or do we want a second binary 
containing both core and test code? The latter allows the test code to be 
integrated more tightly; the former requires making some test-only symbols 
visible. An advanced solution would even allow both approaches.

Currently, we pass parcel privacy defines like CFP_CFISH as command-line 
arguments to the compiler. If we start to build separate binaries, this 
requires putting the test code in a separate directory and using special 
Makefile rules that add some per-directory flags. I think this can be done in 
a cross-platform way but it's a little complicated. Another approach is to 
move the #defines directly into the source files. This is more flexible with 
regard to directory layout, but less flexible when changing configurations. 
Different host languages might even need different settings.
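
A rough sketch of the second approach, assuming a privacy define simply gates
parcel-private declarations in a header. `CFP_DEMO` and the `demo_*`
functions are hypothetical, and the header is inlined to keep the example
self-contained.

```c
#include <assert.h>

/* Hypothetical parcel privacy define, set in the source file itself
 * instead of via -DCFP_DEMO on the command line. The guard avoids a
 * redefinition warning if a Makefile still passes the flag. */
#ifndef CFP_DEMO
  #define CFP_DEMO
#endif

/* ---- contents of a hypothetical parcel header ---- */
int demo_public_answer(void);

#ifdef CFP_DEMO
/* Declared only for files compiled as part of the parcel. */
int demo_private_scale(int x);
#endif
/* ---- end of header ---- */

int
demo_private_scale(int x) {
    /* Keep the doubling well within int range. */
    assert(x >= 0 && x <= 1000);
    return x * 2;
}

int
demo_public_answer(void) {
    return demo_private_scale(21);
}
```

With the define in the file, the test sources can sit in any directory and
need no per-directory compiler flags, at the cost of editing sources when a
host language needs a different setting.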

It's probably a good idea to move the test code into a separate directory 
regardless of the issue above. Marvin suggested naming it "test". The only 
problem I can see is that this directory must be moved into the host language 
subdirectory when bundling host language distributions like CPAN tarballs. 
"test" is a pretty generic name that might cause clashes, but the same is true 
for "core".

Nick