Posted to dev@bahir.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/07/27 03:42:20 UTC

[jira] [Commented] (BAHIR-38) Spark-submit does not use latest locally installed Bahir packages

    [ https://issues.apache.org/jira/browse/BAHIR-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395006#comment-15395006 ] 

ASF GitHub Bot commented on BAHIR-38:
-------------------------------------

GitHub user ckadner opened a pull request:

    https://github.com/apache/bahir/pull/14

    [BAHIR-38] clean Ivy cache during Maven install phase

    [BAHIR-38: Spark-submit does not use latest locally installed Bahir packages](https://issues.apache.org/jira/browse/BAHIR-38)
    
    When we `install` the org.apache.bahir jars into the local Maven repository, we also need to clean the previous jar files from the Ivy cache (`~/.ivy2/cache/org.apache.bahir/*`) so that `spark-submit --packages ...` will pick up the new version from the local Maven repository.
    
    *pom.xml:*
    
    ```xml
      <build>
        <plugins>
          ...
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-clean-plugin</artifactId>
            <executions>
              <!--
                When we `install` the org.apache.bahir jars into the local Maven repository we also need
                to clean the previous jar files from the Ivy cache (~/.ivy2/cache/org.apache.bahir/*) so
                `spark-submit --packages ...` will pick up the new version from the local Maven repository
              -->
              <execution>
                <id>cleanup-ivy-cache</id>
                <phase>install</phase>
                <goals>
                  <goal>clean</goal>
                </goals>
                <configuration>
                  <followSymLinks>false</followSymLinks>
                  <excludeDefaultDirectories>true</excludeDefaultDirectories>
                  <filesets>
                    <fileset>
                      <directory>${user.home}/.ivy2/cache/${project.groupId}/${project.artifactId}</directory>
                      <includes>
                        <include>*-${project.version}.*</include>
                        <include>jars/${project.artifactId}-${project.version}.jar</include>
                      </includes>
                    </fileset>
                  </filesets>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
      ...
    ```
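
    The effect of the two `<include>` patterns above can be sketched in a throwaway directory (file names and the 2.0.0-SNAPSHOT version below are illustrative; the real cache lives under `~/.ivy2/cache`):

    ```shell
    # Simulate the Ivy cache layout in a temp dir (illustrative names only)
    CACHE=$(mktemp -d)/org.apache.bahir/spark-streaming-mqtt_2.11
    mkdir -p "$CACHE/jars"
    touch "$CACHE/ivy-2.0.0-SNAPSHOT.xml" \
          "$CACHE/jars/spark-streaming-mqtt_2.11-2.0.0-SNAPSHOT.jar" \
          "$CACHE/jars/spark-streaming-mqtt_2.11-1.0.0.jar"   # other version, left alone

    # Equivalent of <include>*-${project.version}.*</include> and
    # <include>jars/${project.artifactId}-${project.version}.jar</include>
    # for version 2.0.0-SNAPSHOT:
    rm -f "$CACHE"/*-2.0.0-SNAPSHOT.* \
          "$CACHE/jars/spark-streaming-mqtt_2.11-2.0.0-SNAPSHOT.jar"

    ls "$CACHE/jars"   # only the 1.0.0 jar remains
    ```

    Note the patterns only touch the artifacts of the module being installed; other versions and other group IDs in the cache stay intact.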

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ckadner/bahir BAHIR-38_clean_Ivy_cache_during_mvn_install

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bahir/pull/14.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14
    
----
commit e21e5d33c0b6ba953743479c68c176a16a0b8bf6
Author: Christian Kadner <ck...@us.ibm.com>
Date:   2016-07-27T03:37:22Z

    [BAHIR-38] clean Ivy cache during Maven install phase

----


> Spark-submit does not use latest locally installed Bahir packages
> -----------------------------------------------------------------
>
>                 Key: BAHIR-38
>                 URL: https://issues.apache.org/jira/browse/BAHIR-38
>             Project: Bahir
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 2.0.0
>         Environment: Maven (3.3.9) on Mac OS X
>            Reporter: Christian Kadner
>            Assignee: Christian Kadner
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> We use {{`spark-submit --packages <maven-coordinates> ...`}} to run Spark with any of the Bahir extensions. 
> In order to perform a _manual integration test_ of a Bahir code change, developers have to _build_ the respective Bahir module and then _install_ it into their *local Maven repository*. Then, when running {{`spark-submit --packages <maven-coordinates> ...`}}, Spark will use *Ivy* to resolve the given _maven-coordinates_ in order to add the necessary jar files to the classpath.
> The first time Ivy encounters new maven coordinates, it will download the artifacts from the local or remote Maven repository. On all subsequent runs Ivy will just use the previously cached jar files, resolved by group ID, artifact ID, and version, irrespective of creation time stamp. 
> This behavior is fine when using spark-submit with released versions of Spark packages. For continuous development and integration testing, however, that Ivy caching behavior poses a problem. 
> To *work around* it, developers have to *clear the local Ivy cache* each time they _install_ a new version of a Bahir package into their local Maven repository, before they run spark-submit.
> For example, to test a code change in module streaming-mqtt, we would have to do ...
> {code}
> mvn clean install -pl streaming-mqtt
> rm -rf ~/.ivy2/cache/org.apache.bahir/spark-streaming-mqtt_2.11/
> ${SPARK_HOME}/bin/spark-submit \
>     --packages org.apache.bahir:spark-streaming-mqtt_2.11:2.0.0-SNAPSHOT \
>     test.py
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)