Posted to dev@metamodel.apache.org by "Kasper Sørensen (JIRA)" <ji...@apache.org> on 2015/06/10 13:12:00 UTC

[jira] [Resolved] (METAMODEL-148) Add a HdfsResource implementation

     [ https://issues.apache.org/jira/browse/METAMODEL-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kasper Sørensen resolved METAMODEL-148.
---------------------------------------
    Resolution: Fixed

Fixed as of commit https://git-wip-us.apache.org/repos/asf?p=metamodel.git;a=commit;h=6cf39add29cebf0787e81399ec6c3b9b150ff262

> Add a HdfsResource implementation
> ---------------------------------
>
>                 Key: METAMODEL-148
>                 URL: https://issues.apache.org/jira/browse/METAMODEL-148
>             Project: Apache MetaModel
>          Issue Type: New Feature
>            Reporter: Kasper Sørensen
>            Assignee: Kasper Sørensen
>
> I suggest implementing a Resource class that allows reading and writing files in Hadoop's HDFS file system.
> Background:
> Many of our file-based DataContext implementations accept a Resource, an interface that abstracts the underlying file system. We currently have implementations such as FileResource, UrlResource and ClasspathResource.
> A request I often get is to also support Hadoop. Obviously the ideal Hadoop integration would not even use MetaModel's query-based approach to data access, but for many simple use cases it is not that important whether the job runs natively in Hadoop (for example as MapReduce) or whether the process simply fetches the file over the wire. I have seen many cases of small-ish CSV files on HDFS, for example, where it would actually be quicker to scan the file on the client than to submit a job to Hadoop.
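
As an illustration of the "fetch the file over the wire" approach described above, here is a minimal sketch of an HDFS-backed resource built on Hadoop's FileSystem API. The class name, constructor argument and method names are illustrative assumptions for this note, not the API of the HdfsResource committed above; see the linked commit for the actual implementation against MetaModel's Resource interface.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical sketch of an HDFS-backed resource; names are illustrative.
    public class HdfsResourceSketch {

        private final Path path;

        // url is e.g. "hdfs://namenode:8020/data/persons.csv" (host and port are examples)
        public HdfsResourceSketch(String url) {
            this.path = new Path(url);
        }

        private FileSystem getFileSystem() throws IOException {
            // FileSystem.get(...) returns a cached instance shared per URI,
            // so it is intentionally not closed after each call here.
            return FileSystem.get(path.toUri(), new Configuration());
        }

        // Open the file for reading, e.g. to feed a CSV parser on the client side.
        public InputStream read() throws IOException {
            return getFileSystem().open(path);
        }

        // Open the file for writing, overwriting any existing content.
        public OutputStream write() throws IOException {
            return getFileSystem().create(path, true);
        }

        public boolean isExists() throws IOException {
            return getFileSystem().exists(path);
        }

        public long getSize() throws IOException {
            return getFileSystem().getFileStatus(path).getLen();
        }
    }

A resource of this kind could then be handed to one of the file-based DataContext implementations mentioned above (for example a CSV DataContext), so that a small file sitting on HDFS can be queried directly on the client without submitting a Hadoop job.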


