Posted to dev@reef.apache.org by "Markus Weimer (JIRA)" <ji...@apache.org> on 2015/10/01 02:02:04 UTC
[jira] [Commented] (REEF-538) Implement IPartitionedDataSet on IFileSystem
[ https://issues.apache.org/jira/browse/REEF-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939083#comment-14939083 ]
Markus Weimer commented on REEF-538:
------------------------------------
One likely extension point here is the deserialization of the individual files. We should design the API so that this part can be supplied by the user, similar to how {{InputFormat}} and {{RecordReader}} interact in Hadoop. User code would then have to provide the deserializer.
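To make the idea concrete, here is a minimal sketch of what such a user-supplied extension point could look like. It is written in Java only to mirror the Hadoop analogy; the actual REEF.NET API would be C#. The names FileDeserializer, LineDeserializer and FilePartition are hypothetical and not part of REEF, and the files are modeled as plain input streams rather than through {{IFileSystem}}.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical extension point: turns the contents of one file into records.
interface FileDeserializer<T> {
    List<T> deserialize(InputStream in) throws IOException;
}

// Example of user code: one record per line of UTF-8 text.
final class LineDeserializer implements FileDeserializer<String> {
    @Override
    public List<String> deserialize(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }
}

// Hypothetical partition: knows which files back it, but delegates
// all parsing to the deserializer supplied by the user.
final class FilePartition<T> {
    private final List<InputStream> files;
    private final FileDeserializer<T> deserializer;

    FilePartition(List<InputStream> files, FileDeserializer<T> deserializer) {
        this.files = files;
        this.deserializer = deserializer;
    }

    // Reads every backing file in order and concatenates the records.
    List<T> readAll() {
        List<T> records = new ArrayList<>();
        for (InputStream file : files) {
            try (InputStream in = file) {
                records.addAll(deserializer.deserialize(in));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return records;
    }
}
```

In this sketch the partition never inspects the file format itself, so supporting a new format would only require another FileDeserializer implementation.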
> Implement IPartitionedDataSet on IFileSystem
> --------------------------------------------
>
> Key: REEF-538
> URL: https://issues.apache.org/jira/browse/REEF-538
> Project: REEF
> Issue Type: New Feature
> Components: REEF.NET
> Reporter: Markus Weimer
>
> The goal is to build an {{IPartitionedDataSet}} where each partition is backed by one or more files stored on an {{IFileSystem}}. The whole dataset is specified as a set of such files and/or path expressions containing wildcards such as {{*}}.