Posted to user@hadoop.apache.org by Kostya Golikov <jo...@gmail.com> on 2015/01/13 19:41:16 UTC

Minimal and complete set of dependencies required to setup Miniclusters

Hey, guys!

From "How to develop Hadoop tests"
<https://wiki.apache.org/hadoop/HowToDevelopUnitTests> I discovered that
there is such a thing as miniclusters, which provide an easy-to-setup,
in-process version of Hadoop. I was eager to use both MiniDFSCluster and
MiniYARNCluster for a functional test of my Pig script (I am aware of
PigUnit and have embraced it for unit tests), but had quite a hard time
resolving all the transitive dependencies -- apparently the set of
dependencies below is not enough to get up and running:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
    <classifier>tests</classifier>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-server-tests</artifactId>
    <version>${hadoop.version}</version>
    <classifier>tests</classifier>
    <scope>test</scope>
</dependency>

What is the canonical way to introduce miniclusters into a Maven project,
given that I don't have any other Hadoop components already listed in the pom?

Re: Minimal and complete set of dependencies required to setup Miniclusters

Posted by Ted Yu <yu...@gmail.com>.
In HBase, we have the following dependencies:
          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
            <version>${hadoop-two.version}</version>
            <type>test-jar</type>
            <scope>test</scope>
            <exclusions>
              <exclusion>
                <groupId>io.netty</groupId>
                <artifactId>netty</artifactId>
              </exclusion>
            </exclusions>
          </dependency>

          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>${hadoop-two.version}</version>
            <type>test-jar</type>
            <scope>test</scope>
            <exclusions>
              <exclusion>
                <groupId>javax.servlet.jsp</groupId>
                <artifactId>jsp-api</artifactId>
              </exclusion>
              <exclusion>
                <groupId>javax.servlet</groupId>
                <artifactId>servlet-api</artifactId>
              </exclusion>
              <exclusion>
                <groupId>stax</groupId>
                <artifactId>stax-api</artifactId>
              </exclusion>
            </exclusions>
          </dependency>

          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-minicluster</artifactId>
            <version>${hadoop-two.version}</version>
            <exclusions>
              <exclusion>
                <groupId>javax.servlet.jsp</groupId>
                <artifactId>jsp-api</artifactId>
              </exclusion>
              <exclusion>
                <groupId>javax.servlet</groupId>
                <artifactId>servlet-api</artifactId>
              </exclusion>
              <exclusion>
                <groupId>stax</groupId>
                <artifactId>stax-api</artifactId>
              </exclusion>
              <exclusion>
                <groupId>io.netty</groupId>
                <artifactId>netty</artifactId>
              </exclusion>
            </exclusions>
          </dependency>
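
As far as I know, the hadoop-minicluster artifact above is the important
piece: it is an aggregator that transitively pulls in the test jars for
HDFS, YARN, and MapReduce. So for a project with no other Hadoop entries
in the pom, a minimal starting point might look like the sketch below
(assuming ${hadoop.version} is defined as a property); the exclusions
shown above can be added back if you run into servlet or netty conflicts:

          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-minicluster</artifactId>
            <version>${hadoop.version}</version>
            <scope>test</scope>
          </dependency>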

Cheers

