Posted to dev@mahout.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2014/04/29 19:51:21 UTC

When to move to Hadoop 2

Is there a page or Jira that describes the current Hadoop 2 status? I think Dmitriy is using Hadoop 1 now, as I am. Is it time to switch? I ask primarily for access to the filesystem (HDFS or local through Hadoop), not for access to MapReduce jobs.


Re: When to move to Hadoop 2

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Tue, Apr 29, 2014 at 12:12 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> True but I need to use the dir tree walking and file list creation code in
> HDFS 2 in order to create URIs for Spark input. So HDFS 2 it is.
>
> BTW Spark supports more file systems than local: and hdfs:, I assume we
> are sticking with those two?
>

There's no artificial limitation here. Spark talks to Hadoop's FileSystem
API, so anything that implements that API (HDFS, the local filesystem on
ext4, S3, etc.) can in theory be used. At least, it is not explicitly
blocked.
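The dispatch Dmitriy describes can be illustrated with a toy sketch: Hadoop's FileSystem.get() picks an implementation by URI scheme. Here is the same idea in plain Python; the registry and class names are illustrative, not Hadoop's real ones:

```python
from urllib.parse import urlparse

# Toy registry mimicking how Hadoop's FileSystem.get() selects an
# implementation by URI scheme. Class names here are illustrative.
FILESYSTEMS = {
    "file": "LocalFileSystem",
    "hdfs": "DistributedFileSystem",
    "s3":   "S3FileSystem",
}

def filesystem_for(uri):
    """Return the (toy) filesystem implementation name for a URI.
    A bare path with no scheme falls back to the local filesystem,
    as Hadoop does with its default filesystem setting."""
    scheme = urlparse(uri).scheme or "file"
    try:
        return FILESYSTEMS[scheme]
    except KeyError:
        raise ValueError("no FileSystem registered for scheme: " + scheme)

print(filesystem_for("hdfs://namenode:8020/user/pat/input"))
print(filesystem_for("s3://bucket/key"))
print(filesystem_for("/local/path"))
```

The point is only that nothing in Spark itself enumerates allowed schemes; whatever the FileSystem registry resolves, Spark can read.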


>
> On Apr 29, 2014, at 11:13 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>
> I am using CDH 4.3 with Hadoop 2.0.0 (as a dependency); I am not using
> mr1-cdh....
>
> For code that runs over the Spark Bindings this is a moot issue, since
> Spark can read and write to pretty much any version or distro of Hadoop
> that exists today. One just needs to compile it against the proper version
> of Hadoop HDFS/MR/YARN.
>
>
>
> On Tue, Apr 29, 2014 at 10:51 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
>
> > Is there a page or Jira that describes the current Hadoop 2 status? I
> > think Dmitriy is using Hadoop 1 now, as I am. Is it time to switch? I ask
> > primarily for access to the filesystem (HDFS or local through Hadoop),
> > not for access to MapReduce jobs.
> >
> >
>
>

Re: When to move to Hadoop 2

Posted by Pat Ferrel <pa...@occamsmachete.com>.
True, but I need to use the directory-tree walking and file-list creation code in HDFS 2 in order to create URIs for Spark input. So HDFS 2 it is.

BTW, Spark supports more file systems than local: and hdfs:; I assume we are sticking with those two?
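The directory walk Pat describes can be sketched in plain Python against the local filesystem; the HDFS 2 version would use Hadoop's FileSystem.listStatus()/listFiles() instead of os.walk(). Paths and names here are illustrative:

```python
import os

def list_file_uris(root):
    """Recursively walk a directory tree and return file:// URIs for
    every file found -- the local-filesystem analogue of walking HDFS
    to build an input list for Spark."""
    uris = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            uris.append("file://" + os.path.abspath(os.path.join(dirpath, name)))
    return uris

# Spark's textFile() accepts a comma-separated list of paths, so the
# walked list can be handed over as one input string:
#   sc.textFile(",".join(list_file_uris("/data/input")))
```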

On Apr 29, 2014, at 11:13 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

I am using CDH 4.3 with Hadoop 2.0.0 (as a dependency); I am not using
mr1-cdh....

For code that runs over the Spark Bindings this is a moot issue, since
Spark can read and write to pretty much any version or distro of Hadoop
that exists today. One just needs to compile it against the proper version
of Hadoop HDFS/MR/YARN.



On Tue, Apr 29, 2014 at 10:51 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Is there a page or Jira that describes the current Hadoop 2 status? I
> think Dmitriy is using Hadoop 1 now, as I am. Is it time to switch? I ask
> primarily for access to the filesystem (HDFS or local through Hadoop),
> not for access to MapReduce jobs.
> 
> 


Re: When to move to Hadoop 2

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I am using CDH 4.3 with Hadoop 2.0.0 (as a dependency); I am not using
mr1-cdh....

For code that runs over the Spark Bindings this is a moot issue, since
Spark can read and write to pretty much any version or distro of Hadoop
that exists today. One just needs to compile it against the proper version
of Hadoop HDFS/MR/YARN.
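For the compile step Dmitriy mentions, Spark's build of that era took the target Hadoop version as a parameter. A sketch of the invocations; the version strings are examples only, check the Spark build docs for your release:

```shell
# sbt build, selecting the Hadoop version Spark links against
SPARK_HADOOP_VERSION=2.0.0-cdh4.3.0 sbt/sbt assembly

# or the Maven equivalent
mvn -Dhadoop.version=2.0.0-cdh4.3.0 -DskipTests clean package
```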



On Tue, Apr 29, 2014 at 10:51 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Is there a page or Jira that describes the current Hadoop 2 status? I
> think Dmitriy is using Hadoop 1 now, as I am. Is it time to switch? I ask
> primarily for access to the filesystem (HDFS or local through Hadoop),
> not for access to MapReduce jobs.
>
>