Posted to user@crunch.apache.org by Pavithra Siriwardena <th...@gmail.com> on 2014/11/03 10:05:50 UTC

Input source Text() does not exist!

Hi folks,

I am trying to run my first Crunch pipeline, using the example project
<https://github.com/jwills/crunch-demo> given on the Crunch home page. I'm
using Hadoop 2.5.1 and Crunch 0.11.0.

When I try to run it locally using Hadoop's launcher script,

hadoop jar target/crunch-demo-1.0-SNAPSHOT-job.jar /home/pavithra/in /home/pavithra/out

I get the following error saying 'Input source Text() does not
exist!'. Note that the input text files are in '/home/pavithra/in'.

Exception in thread "main" java.lang.IllegalStateException: Input source Text(/home/pavithra/in) does not exist!
    at org.apache.crunch.impl.dist.collect.BaseInputCollection.getSizeInternal(BaseInputCollection.java:66)
    at org.apache.crunch.impl.dist.collect.PCollectionImpl.getSize(PCollectionImpl.java:315)
    at org.apache.crunch.impl.dist.collect.BaseDoCollection.getSizeInternal(BaseDoCollection.java:49)
    at org.apache.crunch.impl.dist.collect.PCollectionImpl.getSize(PCollectionImpl.java:315)
    at org.apache.crunch.impl.dist.collect.BaseDoCollection.getSizeInternal(BaseDoCollection.java:49)
    at org.apache.crunch.impl.dist.collect.PCollectionImpl.getSize(PCollectionImpl.java:315)
    at org.apache.crunch.util.PartitionUtils.getRecommendedPartitions(PartitionUtils.java:51)
    at org.apache.crunch.util.PartitionUtils.getRecommendedPartitions(PartitionUtils.java:40)
    at org.apache.crunch.lib.Aggregate.count(Aggregate.java:55)
    at org.apache.crunch.impl.dist.collect.PCollectionImpl.count(PCollectionImpl.java:245)
    at com.example.WordCount.run(WordCount.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.example.WordCount.main(WordCount.java:22)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Could you please guide me on this problem?

Thanks
-Pavithra

Re: Input source Text() does not exist!

Posted by Pavithra Siriwardena <th...@gmail.com>.

Hi Gabriel,

Thanks for your quick reply. You are correct: I had mixed up the two
modes. While in pseudo-distributed mode, I thought I was running the job
in local mode and tried to access the local file system. I've switched
back to standalone mode and now it's working.
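
One way to tell the two modes apart is to look at the default file system
Hadoop is configured with. The sketch below is only an illustration:
`describe_default_fs` is a hypothetical helper (not part of Hadoop or
Crunch), and the two example values are assumptions about a typical setup.

```shell
#!/bin/sh
# Hypothetical helper: label a default file system URI by its scheme.
describe_default_fs() {
  case "$1" in
    file:*) echo "local" ;;   # paths without a scheme go to the local disk
    hdfs:*) echo "hdfs" ;;    # paths without a scheme go to HDFS
    *)      echo "other" ;;
  esac
}

# With Hadoop installed, the configured value can be read with:
#   hdfs getconf -confKey fs.defaultFS
describe_default_fs "file:///"               # assumed standalone value
describe_default_fs "hdfs://localhost:9000"  # assumed pseudo-distributed value
```

If the configured value starts with `hdfs:`, a plain path such as
/home/pavithra/in is looked up on HDFS, which would explain the error above.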

Thanks & Regards,
-Pavithra

On Mon, Nov 3, 2014 at 3:16 PM, Gabriel Reid <ga...@gmail.com> wrote:

> Hi Pavithra,
>
> This looks like there is probably a difference in where you're putting
> your input files and where Hadoop is expecting to find them.
> Are your input files on your local file system, or on HDFS? And is
> Hadoop set up to work with your local file system or HDFS?
>
> The path "/home/pavithra/in" looks like it is likely a local path
> (i.e. on the drive of your machine where you're running the command).
> It could be that when Hadoop is trying to read the files, it's
> connecting to HDFS, and not your local file system.
>
> An easy way to check this is to type "hadoop dfs -ls
> /home/pavithra/in". That command should list the input files you want
> to use for your Crunch job, but I'm expecting that it will say that
> that path does not exist (because it's looking on HDFS). Could you try
> this out and then post back what you get?
>
> - Gabriel

Re: Input source Text() does not exist!

Posted by Gabriel Reid <ga...@gmail.com>.

Hi Pavithra,

This looks like there is probably a difference in where you're putting
your input files and where Hadoop is expecting to find them.
Are your input files on your local file system, or on HDFS? And is
Hadoop set up to work with your local file system or HDFS?

The path "/home/pavithra/in" looks like it is likely a local path
(i.e. on the drive of your machine where you're running the command).
It could be that when Hadoop is trying to read the files, it's
connecting to HDFS, and not your local file system.

An easy way to check this is to type "hadoop dfs -ls
/home/pavithra/in". That command should list the input files you want
to use for your Crunch job, but I'm expecting that it will say that
that path does not exist (because it's looking on HDFS). Could you try
this out and then post back what you get?
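
To make the mismatch concrete, here is a minimal shell sketch of how a path
with no scheme can end up on different file systems depending on the default
one. The `qualify` helper is hypothetical and only imitates the idea; it is
not Hadoop code, and `hdfs://localhost:9000` is an assumed value of the kind
often used for pseudo-distributed setups.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): prefix a path with the default
# file system unless the path already carries its own scheme.
qualify() {
  default_fs=$1
  path=$2
  case "$path" in
    *://*) printf '%s\n' "$path" ;;                      # explicit scheme: keep as-is
    *)     printf '%s%s\n' "${default_fs%/}" "$path" ;;  # no scheme: use the default
  esac
}

qualify "file:///" /home/pavithra/in                      # local default
qualify "hdfs://localhost:9000" /home/pavithra/in         # HDFS default (assumed URI)
qualify "hdfs://localhost:9000" file:///home/pavithra/in  # explicit local path
```

On a real installation, comparing `hadoop fs -ls /home/pavithra/in` with
`hadoop fs -ls file:///home/pavithra/in` should show the same difference.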

- Gabriel

