Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/06/03 14:21:00 UTC

[jira] [Created] (ARROW-12945) [C++][Dataset] Refactor InMemoryDataset to inherit FragmentDataset

Ben Kietzman created ARROW-12945:
------------------------------------

             Summary: [C++][Dataset] Refactor InMemoryDataset to inherit FragmentDataset
                 Key: ARROW-12945
                 URL: https://issues.apache.org/jira/browse/ARROW-12945
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Ben Kietzman


InMemoryDataset could inherit from FragmentDataset. More generally, it would be beneficial for every dataset to hold a vector of its fragments; this would allow subtree pruning to be used on any dataset when performing predicate pushdown.
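
To illustrate the idea, here is a minimal, self-contained sketch. It does not use Arrow's real classes: Fragment, KeyRange, KeyFilter and MaySatisfy are hypothetical stand-ins, and the two dataset classes only borrow the names from this issue. It also performs flat, per-fragment pruning rather than the subtree pruning tracked in ARROW-12891; the point is only that once the base class owns the fragment vector, any subclass gets pruning without extra code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a fragment's guarantee: every row in the
// fragment satisfies min_key <= key <= max_key.
struct KeyRange {
  int64_t min_key;
  int64_t max_key;
};

// Hypothetical stand-in for a fragment (not arrow::dataset::Fragment).
struct Fragment {
  std::string name;
  KeyRange guarantee;
};

// Hypothetical, simplified predicate: lo <= key <= hi.
struct KeyFilter {
  int64_t lo;
  int64_t hi;
};

// True if some row in a fragment with guarantee `g` could pass `f`.
inline bool MaySatisfy(const KeyRange& g, const KeyFilter& f) {
  return g.max_key >= f.lo && g.min_key <= f.hi;
}

// Simplified base class: owns the fragments and prunes them by guarantee.
class FragmentDataset {
 public:
  explicit FragmentDataset(std::vector<std::shared_ptr<Fragment>> fragments)
      : fragments_(std::move(fragments)) {
    for (const auto& fragment : fragments_) {
      assert(fragment != nullptr);
      assert(fragment->guarantee.min_key <= fragment->guarantee.max_key);
    }
  }
  virtual ~FragmentDataset() = default;

  // Return only the fragments whose guarantee is compatible with `filter`.
  std::vector<std::shared_ptr<Fragment>> GetFragments(
      const KeyFilter& filter) const {
    std::vector<std::shared_ptr<Fragment>> kept;
    for (const auto& fragment : fragments_) {
      if (MaySatisfy(fragment->guarantee, filter)) kept.push_back(fragment);
    }
    return kept;
  }

 private:
  std::vector<std::shared_ptr<Fragment>> fragments_;
};

// In this sketch the subclass adds nothing; pruning comes from the base.
class InMemoryDataset : public FragmentDataset {
 public:
  using FragmentDataset::FragmentDataset;
};
```

With three fragments covering keys 0-9, 10-19 and 20-29, a filter of 12 <= key <= 22 would keep only the last two.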

See also ARROW-8065, for which an unmet goal was to make Dataset a concrete class which contains fragments (essentially FragmentDataset) and have subclasses simply add guarantees on those fragments (e.g. FileSystemDataset contains only FileFragments).

See also ARROW-12891 (add support for subtree pruning to FragmentDataset).

NB: This will require either promoting FragmentDataset to a public class or demoting InMemoryDataset to an internal class (with public factories).
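
The second option (an internal class behind a public factory) could look roughly like the sketch below. Everything here is hypothetical and simplified: the abstract Dataset interface, the internal namespace layout, the MakeInMemoryDataset name and the use of strings as stand-ins for record batches are all assumptions made for illustration, not Arrow's actual API.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Public interface (hypothetical): all that callers would see in a header.
class Dataset {
 public:
  virtual ~Dataset() = default;
  virtual std::size_t num_fragments() const = 0;
};

namespace internal {

// Internal implementation (hypothetical): in a real code base this would
// live in a source file or a private header, not in the public API.
class InMemoryDataset : public Dataset {
 public:
  explicit InMemoryDataset(std::vector<std::string> batches)
      : batches_(std::move(batches)) {}

  // One fragment per in-memory batch in this sketch.
  std::size_t num_fragments() const override { return batches_.size(); }

 private:
  std::vector<std::string> batches_;  // stand-in for record batches
};

}  // namespace internal

// Public factory (hypothetical name): the only way callers construct one.
inline std::shared_ptr<Dataset> MakeInMemoryDataset(
    std::vector<std::string> batches) {
  return std::make_shared<internal::InMemoryDataset>(std::move(batches));
}
```

Callers would then hold only a std::shared_ptr<Dataset>, so the concrete class could change its base (for example, to FragmentDataset) without affecting the public API.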



--
This message was sent by Atlassian Jira
(v8.3.4#803005)