Posted to dev@commons.apache.org by Alex Herbert <al...@gmail.com> on 2021/05/15 13:47:20 UTC

[rng] Add ObjectSampler interfaces and a CompositeSampler

New shape samplers have been added to the library to sample coordinates
from different shapes (see RNG-132 [1]).

I have been working on an idea to combine shape samplers together so that a
more complex shape can be sampled, for example a surface or volume.

This requires that different samplers can be combined as a common sampler.
This is facilitated by adding new interfaces to the library which are the
generically typed versions of the current DiscreteSampler (for int) and
ContinuousSampler (for double) and their SharedStateSampler extensions:

public interface ObjectSampler<T> {
    T sample();
}

public interface SharedStateObjectSampler<T> extends
        ObjectSampler<T>,
        SharedStateSampler<SharedStateObjectSampler<T>> {
    // Composite interface
}

All the samplers in the library that create object samples already use the
method name sample() and implement SharedStateSampler. The exception is the
UnitSphereSampler which has a sampling method nextVector(). So adding these
interfaces is a small change to facilitate a composite sampler.
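
As a rough illustration of how a sampler with a differently named sampling method could fit the proposed interfaces, here is a minimal self-contained sketch. It is not library code: the interfaces are local stand-ins for the ones proposed above, java.util.Random stands in for UniformRandomProvider, and UnitCircleSampler is a hypothetical class invented for the example (it only mimics the nextVector() naming of UnitSphereSampler).

```java
import java.util.Random;

/** Local stand-in for the proposed interface. */
interface ObjectSampler<T> {
    T sample();
}

/** Local stand-in for the library's SharedStateSampler; Random replaces UniformRandomProvider. */
interface SharedStateSampler<R> {
    R withUniformRandomProvider(Random rng);
}

/** Local stand-in for the proposed composite interface. */
interface SharedStateObjectSampler<T>
        extends ObjectSampler<T>, SharedStateSampler<SharedStateObjectSampler<T>> {
}

/** Hypothetical sampler of points on the unit circle with a nextVector() method. */
final class UnitCircleSampler implements SharedStateObjectSampler<double[]> {
    private final Random rng;

    UnitCircleSampler(Random rng) {
        this.rng = rng;
    }

    /** The pre-existing sampling method with a non-standard name. */
    double[] nextVector() {
        final double angle = 2 * Math.PI * rng.nextDouble();
        return new double[] {Math.cos(angle), Math.sin(angle)};
    }

    /** Adding sample() as a delegate is enough to satisfy ObjectSampler. */
    @Override
    public double[] sample() {
        return nextVector();
    }

    /** Same sampler logic bound to a different source of randomness. */
    @Override
    public SharedStateObjectSampler<double[]> withUniformRandomProvider(Random rng) {
        return new UnitCircleSampler(rng);
    }
}
```

With this in place the sampler can be passed anywhere an ObjectSampler<double[]> is expected, while existing callers of nextVector() are unaffected.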

The composite sampler should combine many samplers, each with its own
weight. The weights can be used to create a discrete probability
distribution. We have 3 samplers that can sample efficiently from this:

GuideTableDiscreteSampler
AliasMethodDiscreteSampler
MarsagliaTsangWangDiscreteSampler.Enumerated

So a composite sampler must accept a set of weighted samplers (of the same
type) and create a discrete sampler to select which one to sample. This is
facilitated using a builder API:

S is the type of sampler

public interface Builder<S> {
    int size();
    Builder<S> add(S sampler, double weight);
    Builder<S> setFactory(DiscreteProbabilitySamplerFactory factory);
    // Only works if size > 0
    S build(UniformRandomProvider rng);
}

The factory specifies a mechanism to create the user's choice of discrete
sampler:

public interface DiscreteProbabilitySamplerFactory {
    DiscreteSampler create(UniformRandomProvider rng,
                           double[] probabilities);
}

Setting a factory is optional because a default will exist. The choice for
the DiscreteProbabilityCollectionSampler was the GuideTableDiscreteSampler
due to its low construction overhead (see RNG-109 [2]).
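
To make the selection step concrete, the following is a minimal sketch of the idea, not the working version mentioned in this post. Everything in it is an assumption made for illustration: SimpleCompositeBuilder is a hypothetical name, java.util.Random stands in for UniformRandomProvider, a linear search over cumulative probabilities stands in for the discrete sampler (the library samplers listed above would be faster), and the rule that weights must be finite and strictly positive is my own simplification. It uses a lambda, so it needs Java 8+.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Local stand-in for the proposed interface. */
interface ObjectSampler<T> {
    T sample();
}

/** Hypothetical, simplified builder; not the proposed CompositeSamplers API. */
final class SimpleCompositeBuilder<T> {
    private final List<ObjectSampler<T>> samplers = new ArrayList<>();
    private final List<Double> weights = new ArrayList<>();

    int size() {
        return samplers.size();
    }

    SimpleCompositeBuilder<T> add(ObjectSampler<T> sampler, double weight) {
        // Assumption: only finite, strictly positive weights are accepted.
        if (!(weight > 0) || Double.isInfinite(weight)) {
            throw new IllegalArgumentException("Invalid weight: " + weight);
        }
        samplers.add(sampler);
        weights.add(weight);
        return this;
    }

    ObjectSampler<T> build(Random rng) {
        final int n = samplers.size();
        if (n == 0) {
            throw new IllegalStateException("No samplers added");
        }
        double sum = 0;
        for (double w : weights) {
            sum += w;
        }
        // Cumulative probabilities; the last entry is forced to 1 to absorb rounding.
        final double[] cumulative = new double[n];
        double running = 0;
        for (int i = 0; i < n; i++) {
            running += weights.get(i) / sum;
            cumulative[i] = running;
        }
        cumulative[n - 1] = 1.0;
        final List<ObjectSampler<T>> copy = new ArrayList<>(samplers);
        return () -> {
            // Pick the first sampler whose cumulative probability exceeds u.
            final double u = rng.nextDouble(); // in [0, 1)
            int i = 0;
            while (u >= cumulative[i]) {
                i++;
            }
            return copy.get(i).sample();
        };
    }
}
```

Each call to sample() on the result first selects one of the added samplers with probability proportional to its weight and then delegates to it.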

A static class provides a mechanism to create composite samplers via
builders typed to the final sample type:

public final class CompositeSamplers {
    public static <T> Builder<ObjectSampler<T>> newObjectSamplerBuilder();
    public static <T> Builder<SharedStateObjectSampler<T>>
        newSharedStateObjectSamplerBuilder();
    public static Builder<DiscreteSampler> newDiscreteSamplerBuilder();
    public static Builder<SharedStateDiscreteSampler>
        newSharedStateDiscreteSamplerBuilder();
    public static Builder<ContinuousSampler> newContinuousSamplerBuilder();
    public static Builder<SharedStateContinuousSampler>
        newSharedStateContinuousSamplerBuilder();
}

An example of usage would be:

UniformRandomProvider rng = ...;
DiscreteSampler dayOfMonth = CompositeSamplers.newDiscreteSamplerBuilder()
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Jan
    .add(DiscreteUniformSampler.of(rng, 1, 28), 28) // Feb
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Mar
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Apr
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // May
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Jun
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Jul
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Aug
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Sep
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Oct
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Nov
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Dec
    .build(rng);
int day = dayOfMonth.sample();

// Diamond vertices
double[] a = {0, 0};
double[] b = {1, 1};
double[] c = {2, 0};
double[] d = {1, -1};
// Note: The sample type (double[]) must be specified if the builder is not
// assigned
ObjectSampler<double[]> diamond =
    CompositeSamplers.<double[]>newObjectSamplerBuilder()
    .add(TriangleSampler.of(a, b, c, rng), 1) // Upper
    .add(TriangleSampler.of(a, d, c, rng), 1) // Lower
    .build(rng);
double[] coord = diamond.sample();

// Note: Type is inferred if the builder is assigned and then used:
Builder<ObjectSampler<double[]>> builder =
    CompositeSamplers.newObjectSamplerBuilder();
builder.add(TriangleSampler.of(a, b, c, rng), 1); // Upper
etc.
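
A note on why the weights in the dayOfMonth example equal the month lengths: a month with n days is chosen with probability n/365 and then a day within it with probability 1/n, so every date of a non-leap year has probability 1/365. The marginal probability of a given day number is then the count of months containing that day divided by 365. The small sketch below just does this arithmetic; the class and method names are made up for the illustration.

```java
/** Hypothetical helper: exact day-of-month probabilities implied by the example weights. */
final class DayOfMonthDistribution {
    private DayOfMonthDistribution() {}

    /** Returns p where p[d] is the probability of sampling day d (index 0 is unused). */
    static double[] probabilities() {
        final int[] daysInMonth = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
        int total = 0;
        for (int n : daysInMonth) {
            total += n; // 365 for a non-leap year
        }
        final double[] p = new double[32];
        for (int n : daysInMonth) {
            final double pMonth = (double) n / total; // weight / sum of weights
            for (int day = 1; day <= n; day++) {
                p[day] += pMonth / n; // uniform within the month
            }
        }
        return p;
    }
}
```

For example, day 1 occurs in all 12 months (12/365), days 29 and 30 occur in 11 months (11/365), and day 31 occurs in 7 months (7/365).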

I have a working version of the above and can create a WIP pull request for
a detailed inspection.

I suggest starting with adding the two new interfaces (ObjectSampler<T> and
SharedStateObjectSampler<T>) and changing the codebase to implement it.
Then adding a composite sampler in a separate change that will require
further discussion.

Note: I had started with the idea of a static factory method:

public static <T> SharedStateObjectSampler<T> of(
        UniformRandomProvider rng,
        List<? extends SharedStateObjectSampler<T>> samplers,
        double[] weights) {

This is a similar idea to the factory constructor for
the DiscreteProbabilityCollectionSampler. However, adding similar methods
for all 6 samplers (as above) requires more code. Using the builder API
it can be done with a single generic builder that encapsulates all the
functionality of collecting the samplers and constructing the discrete
probability sampler. It also allows optional arguments such as the method
to control the discrete probability sampler. The static factory method
requires a user to create a list to hold each sampler and then an array of
weights before calling the factory method. In my opinion the builder is
easier to use as the samplers can be added as they are generated.

Alex

[1] https://issues.apache.org/jira/browse/RNG-132
[2] https://issues.apache.org/jira/browse/RNG-109

Re: [rng] Add ObjectSampler interfaces and a CompositeSampler

Posted by Alex Herbert <al...@gmail.com>.
On Sat, 15 May 2021 at 16:02, Gilles Sadowski <gi...@gmail.com> wrote:

> Hi Alex.
>
> Would the proposal be any different if it used Java 8+ features?
> [IOW, is it still useful (in any sense) to stick with Java 6?]
>

The ObjectSampler<T> interface can be replaced with Supplier<T>. But then all
the sampling methods would have to be renamed to 'get()' as opposed to
'sample()' which we currently have in the codebase.

The interface BiFunction<T, U, R> can be used to specify the factory to
build the discrete sampler:

BiFunction<double[], UniformRandomProvider, DiscreteSampler>

I do not think any other Java 8 features would help reduce the new amount
of public API.
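
For illustration only, here is how the two java.util.function types might look in use. This is a sketch under assumptions: the interfaces are local stand-ins, java.util.Random replaces UniformRandomProvider, and Java8Sketch with its CUMULATIVE factory is hypothetical. It also shows that an existing sample() method can still be handed to code expecting a Supplier<T> through a method reference, which avoids a rename only at call sites that accept a Supplier.

```java
import java.util.Random;
import java.util.function.BiFunction;
import java.util.function.Supplier;

/** Local stand-in for the proposed interface. */
interface ObjectSampler<T> {
    T sample();
}

/** Local stand-in for the library's DiscreteSampler. */
interface DiscreteSampler {
    int sample();
}

/** Hypothetical holder for the Java 8 style alternatives. */
final class Java8Sketch {
    private Java8Sketch() {}

    /**
     * A factory expressed as a BiFunction. The discrete sampler does a linear
     * search over cumulative probabilities (assumed to sum to 1).
     */
    static final BiFunction<double[], Random, DiscreteSampler> CUMULATIVE =
        (probabilities, rng) -> {
            final double[] p = probabilities.clone();
            return () -> {
                final double u = rng.nextDouble(); // in [0, 1)
                double sum = 0;
                for (int i = 0; i < p.length - 1; i++) {
                    sum += p[i];
                    if (u < sum) {
                        return i;
                    }
                }
                // Fall through to the last index to absorb rounding error.
                return p.length - 1;
            };
        };

    /** Adapts a sampler to a Supplier without renaming sample() to get(). */
    static <T> Supplier<T> asSupplier(ObjectSampler<T> sampler) {
        return sampler::sample;
    }
}
```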

With Java 8 you could build a composite with streams. We would provide
access to the pair used to store the sampler and the weight:

static class WeightedSampler<S> {
    /** The weight. */
    private final double weight;
    /** The sampler. */
    private final S sampler;
    // etc...
}

You can then use streams to build a composite sampler from a source of
objects:

List<Triangle> triangles = ...;
UniformRandomProvider rng = ...;
ObjectSampler<double[]> surface = triangles.stream().map(t -> {
    double area = t.getArea();
    double[] a = t.getVertex(0);
    double[] b = t.getVertex(1);
    double[] c = t.getVertex(2);
    return WeightedSampler.of(area, TriangleSampler.of(a, b, c, rng));
}).collect(CompositeCollectors.toObjectSampler(rng));
double[] point = surface.sample();

The CompositeCollectors would amass the WeightedSamplers and construct the
composite. The class would still require 6 ways of calling it so that the 6
composite variants can be created.
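
A possible shape for such a collector, covering only the ObjectSampler variant, is sketched below. It is an assumption-laden illustration rather than a proposal: the interfaces and classes are local stand-ins, java.util.Random replaces UniformRandomProvider, and the selection is a simple linear scan over the weights instead of one of the library's discrete samplers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Collector;

/** Local stand-in for the proposed interface. */
interface ObjectSampler<T> {
    T sample();
}

/** Pair of a weight and a sampler, as outlined above. */
final class WeightedSampler<S> {
    private final double weight;
    private final S sampler;

    private WeightedSampler(double weight, S sampler) {
        this.weight = weight;
        this.sampler = sampler;
    }

    static <S> WeightedSampler<S> of(double weight, S sampler) {
        return new WeightedSampler<>(weight, sampler);
    }

    double getWeight() {
        return weight;
    }

    S getSampler() {
        return sampler;
    }
}

/** Hypothetical collectors class; only one of the 6 variants is sketched. */
final class CompositeCollectors {
    private CompositeCollectors() {}

    static <T> Collector<WeightedSampler<ObjectSampler<T>>,
                         List<WeightedSampler<ObjectSampler<T>>>,
                         ObjectSampler<T>> toObjectSampler(Random rng) {
        return Collector.of(
            ArrayList::new,                                 // container for the pairs
            List::add,                                      // amass each weighted sampler
            (left, right) -> { left.addAll(right); return left; },
            list -> compose(new ArrayList<>(list), rng));   // build the composite
    }

    private static <T> ObjectSampler<T> compose(
            List<WeightedSampler<ObjectSampler<T>>> list, Random rng) {
        if (list.isEmpty()) {
            throw new IllegalStateException("No samplers collected");
        }
        double total = 0;
        for (WeightedSampler<ObjectSampler<T>> w : list) {
            total += w.getWeight();
        }
        final double sum = total;
        return () -> {
            // Walk the weights until the scaled uniform deviate is used up.
            double u = rng.nextDouble() * sum;
            for (int i = 0; i < list.size() - 1; i++) {
                u -= list.get(i).getWeight();
                if (u < 0) {
                    return list.get(i).getSampler().sample();
                }
            }
            return list.get(list.size() - 1).getSampler().sample();
        };
    }
}
```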

This is just an idea. I do not think Java 8 offers a real benefit here, and
this kind of support can be added later.

Alex

Re: [rng] Add ObjectSampler interfaces and a CompositeSampler

Posted by Gilles Sadowski <gi...@gmail.com>.
Hi Alex.

Would the proposal be any different if it used Java 8+ features?
[IOW, is it still useful (in any sense) to stick with Java 6?]

Regards,
Gilles

> [...]
