Posted to issues@commons.apache.org by "Alex Herbert (Jira)" <ji...@apache.org> on 2023/07/13 13:06:00 UTC

[jira] [Reopened] (RNG-184) ArraySampler to shuffle all primitive array types and generic T[] arrays

     [ https://issues.apache.org/jira/browse/RNG-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Herbert reopened RNG-184:
------------------------------
      Assignee: Alex Herbert

Update the ArraySampler shuffle methods to return the input array argument instead of void. This allows a shuffle call to be chained as an input argument for another function:
{code:java}
UniformRandomProvider rng = ...
double[] values = ...
for (int i = 0; i < 10; i++) {
    someMethod(ArraySampler.shuffle(rng, values.clone()));
}
{code}
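
A minimal sketch of the return-the-array pattern described above. It assumes java.util.Random as a stand-in for UniformRandomProvider so that it runs without the Commons RNG dependency; the class name ShuffleSketch is hypothetical and this is not the ArraySampler implementation.

```java
import java.util.Random;

// Hypothetical illustration only; not the Commons RNG ArraySampler.
final class ShuffleSketch {
    private ShuffleSketch() {}

    /**
     * Fisher-Yates shuffle performed in place.
     * Returns the same array instance that was passed in, so the call
     * can be used directly as an argument to another method.
     */
    static double[] shuffle(Random rng, double[] array) {
        for (int i = array.length - 1; i > 0; i--) {
            // Pick j uniformly from [0, i] and swap it into position i.
            final int j = rng.nextInt(i + 1);
            final double tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
        return array;
    }
}
```

Because the argument is returned, a caller can shuffle a copy in a single expression, e.g. someMethod(ShuffleSketch.shuffle(rng, values.clone())), and leave the original array untouched.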

> ArraySampler to shuffle all primitive array types and generic T[] arrays
> ------------------------------------------------------------------------
>
>                 Key: RNG-184
>                 URL: https://issues.apache.org/jira/browse/RNG-184
>             Project: Commons RNG
>          Issue Type: New Feature
>          Components: sampling
>    Affects Versions: 1.5
>            Reporter: Alex Herbert
>            Assignee: Alex Herbert
>            Priority: Major
>             Fix For: 1.6
>
>
> The sampling module contains code to shuffle int[] in the PermutationSampler:
> {code:java}
> public static void shuffle(UniformRandomProvider rng, int[] list)
> public static void shuffle(UniformRandomProvider rng,
>    int[] list,
>    int start,
>    boolean towardHead)
> {code}
> Shuffling support should be expanded to all primitive types and a generic array type.
> The non-integer domain is out of scope for the PermutationSampler. The shuffle method is present as a utility for permuting arrays of indices.
> I suggest a new API in ArraySampler with a shuffle that can handle sub-ranges as:
> {code:java}
> public static void shuffle(UniformRandomProvider rng, int[] data);
> public static void shuffle(UniformRandomProvider rng, int[] data,
>                            int from, int to);
> {code}
> Note that there is a ListSampler that offers similar functionality for a List:
> {code:java}
> public static <T> void shuffle(UniformRandomProvider rng,
>                                List<T> list)
> public static <T> void shuffle(UniformRandomProvider rng,
>                                List<T> list,
>                                int start,
>                                boolean towardHead)
> // Also sampling
> public static <T> List<T> sample(UniformRandomProvider rng,
>                                  List<T> collection,
>                                  int k) 
> {code}
> I do not think supporting a range for shuffling here is required as this can be achieved using:
> {code:java}
> int from = ...
> int to = ...
> ListSampler.shuffle(rng, list.subList(from, to));
> {code}
> This is how the half-shuffle method (towards head/tail) is implemented. The origin of the half-shuffle API is unknown, but it is not as flexible as a sub-range shuffle and I do not propose to implement it for all array types.
> Note that consistency between Lists and arrays would require a sampling method for all arrays, e.g.:
> {code:java}
> public static double[] sample(UniformRandomProvider rng,
>     double[] array,
>     int k)
> public static <T> T[] sample(UniformRandomProvider rng,
>     T[] array,
>     int k)
> {code}
> Since this static sample method uses a new PermutationSampler per call, it is not suitable for repeated invocation. Development of a sampling API for arrays should be under another ticket, e.g.
> {code:java}
> // Sampler returns k elements from the given array
> SharedStateObjectSampler<double[]> createSampler(UniformRandomProvider rng,
>                                                  double[] array, int k)
> {code}
> Such a sampler would internally maintain the PermutationSampler between invocations.
> This ticket will only cover shuffling support in ArraySampler.
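
The sub-range shuffle proposed in the quoted description could be sketched as follows. This is an illustration under stated assumptions: java.util.Random stands in for UniformRandomProvider, the range is treated as half-open [from, to) to match List.subList (the ticket does not state this), the array is returned as per the reopening comment, and the class name RangeShuffleSketch is hypothetical.

```java
import java.util.Random;

// Hypothetical illustration only; not the Commons RNG ArraySampler.
final class RangeShuffleSketch {
    private RangeShuffleSketch() {}

    /**
     * Fisher-Yates shuffle of the elements in the half-open range [from, to).
     * Elements outside the range are not moved. Returns the input array.
     */
    static int[] shuffle(Random rng, int[] data, int from, int to) {
        // Assumed range check; the real API may validate differently.
        if (from < 0 || to > data.length || from > to) {
            throw new IndexOutOfBoundsException(
                "Invalid range [" + from + ", " + to + ") for length " + data.length);
        }
        for (int i = to - 1; i > from; i--) {
            // Pick j uniformly from [from, i] and swap it into position i.
            final int j = from + rng.nextInt(i - from + 1);
            final int tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
        return data;
    }
}
```

With this form a full shuffle is shuffle(rng, data, 0, data.length), and a head or tail shuffle is a special case of choosing from and to, which is why a separate towardHead flag is not needed.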



--
This message was sent by Atlassian Jira
(v8.20.10#820010)