Posted to user@spark.apache.org by Konstantin Kudryavtsev <ku...@gmail.com> on 2014/07/02 18:38:33 UTC

Run spark unit test on Windows 7

Hi all,

I'm trying to run some transformations on *Spark*; they work fine on a cluster
(YARN, Linux machines). However, when I try to run them on a local machine
(*Windows 7*) from a unit test, I get this error:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)


My code is the following:

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{Assert, Test}

@Test
def testETL() = {
    val conf = new SparkConf()
    val sc = new SparkContext("local", "test", conf)
    try {
        val etl = new IxtoolsDailyAgg() // empty constructor

        val data = sc.parallelize(List("in1", "in2", "in3"))

        etl.etl(data) // RDD transformation, no access to SparkContext or Hadoop
        Assert.assertTrue(true)
    } finally {
        if (sc != null)
            sc.stop()
    }
}


Why is it trying to access Hadoop at all, and how can I fix it? Thank you
in advance.

Thank you,
Konstantin Kudryavtsev

Re: Run spark unit test on Windows 7

Posted by Denny Lee <de...@gmail.com>.
Thanks! Will take a look at this later today. HTH!




Re: Run spark unit test on Windows 7

Posted by Kostiantyn Kudriavtsev <ku...@gmail.com>.
Hi Denny,

I've just created https://issues.apache.org/jira/browse/SPARK-2356



Re: Run spark unit test on Windows 7

Posted by Denny Lee <de...@gmail.com>.
Hi Konstantin,

Could you please create a JIRA item at https://issues.apache.org/jira/browse/SPARK/ so this issue can be tracked?

Thanks,
Denny




Re: Run spark unit test on Windows 7

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
It sounds really strange...

I guess it is a bug, a critical bug, and it must be fixed... at the very
least some flag to disable Hadoop should be added (unable.hadoop).

I found the following workaround:
1) download a compiled winutils.exe from
http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

After that, the test runs.
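
A minimal sketch of wiring this workaround into a JUnit 4 fixture, so the
property is set once before any SparkContext is created (the EtlTestBase
name is made up for illustration, and d:\winutil is just the example path
from this thread):

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{After, Before}

class EtlTestBase {
    var sc: SparkContext = _

    @Before
    def setUp(): Unit = {
        // Must run before the first SparkContext is constructed: Hadoop's
        // Shell resolves winutils.exe in a static initializer that runs
        // only once per JVM.
        System.setProperty("hadoop.home.dir", "d:\\winutil\\")
        sc = new SparkContext("local", "test", new SparkConf())
    }

    @After
    def tearDown(): Unit = {
        if (sc != null) sc.stop()
    }
}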

Thank you,
Konstantin Kudryavtsev



Re: Run spark unit test on Windows 7

Posted by Denny Lee <de...@gmail.com>.
You don't actually need it per se; it's just that some of the Spark
libraries reference Hadoop libraries even if they ultimately do not call
them. When I was doing some early builds of Spark on Windows, I admittedly
had Hadoop running on Windows as well, so I had not run into this
particular issue.
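
This matches the stack trace earlier in the thread: SparkContext.<init>
pulls in SparkHadoopUtil, which calls UserGroupInformation.setConfiguration
and thereby runs Hadoop's Shell static initializer. A minimal sketch that
reproduces the lookup without touching any files (the object name is made
up for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object WinutilsRepro {
    def main(args: Array[String]): Unit = {
        // No Hadoop input or output anywhere; on Windows, constructing the
        // SparkContext alone triggers the winutils.exe lookup.
        val sc = new SparkContext("local", "repro", new SparkConf())
        sc.stop()
    }
}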




Re: Run spark unit test on Windows 7

Posted by Kostiantyn Kudriavtsev <ku...@gmail.com>.
No, I don't.

Why would I need to have HDP installed? I don't use Hadoop at all, and I'd like to read data from the local filesystem.



Re: Run spark unit test on Windows 7

Posted by Denny Lee <de...@gmail.com>.
By any chance, do you have HDP 2.1 installed? You may need to install the utils and update the env variables per http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows



Re: Run spark unit test on Windows 7

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Hi Andrew,

it's Windows 7, and I haven't set up any env variables here.

The full stack trace:

14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
	at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
	at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
	at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
	at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
	at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:232)
	at junit.framework.TestSuite.run(TestSuite.java:227)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)


Thank you,
Konstantin Kudryavtsev



Re: Run spark unit test on Windows 7

Posted by Andrew Or <an...@databricks.com>.
Hi Konstantin,

We use Hadoop as a library in a few places in Spark. I wonder why the path
includes "null", though.

Could you provide the full stack trace?

Andrew
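
As an aside, the "null" almost certainly comes from an unset Hadoop home.
In rough outline (a simplified sketch of the assumed behavior, not the
literal org.apache.hadoop.util.Shell source), winutils.exe is resolved
against the hadoop.home.dir system property, falling back to the
HADOOP_HOME environment variable, and when neither is set the missing
value is concatenated into the path as the literal string "null":

// Simplified sketch of the assumed path resolution, not the actual
// Hadoop Shell code.
val home: String = Option(System.getProperty("hadoop.home.dir"))
    .orElse(Option(System.getenv("HADOOP_HOME")))
    .orNull
// With neither setting present, home is null, and string concatenation
// renders it as the "null\bin\winutils.exe" seen in the exception.
val winutils = home + "\\bin\\winutils.exe"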

