Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/05 02:01:54 UTC

[GitHub] [arrow] GavinRay97 opened a new issue #12570: Arrow nightly Maven releases don't seem to work

GavinRay97 opened a new issue #12570:
URL: https://github.com/apache/arrow/issues/12570


   Following the instructions listed here:
   - https://github.com/apache/arrow/blob/650f111b524fb1c5bfbfa6f533d15929c90ddc40/docs/source/java/install.rst#installing-nightly-packages
   
   I get the following error when trying to install. I think the content type is being mis-interpreted (as HTML rather than XML)
   
   ```text
   [WARNING] The POM for org.apache.arrow:arrow-flight:jar:8.0.0.dev165 is invalid, transitive dependencies (if any) will not be available: 1 problem was encountered while building the effective model
   [FATAL] Non-parseable POM C:\Users\rayga\.m2\repository\org\apache\arrow\arrow-flight\8.0.0.dev165\arrow-flight-8.0.0.dev165.pom: expected = after attribute name (position: TEXT seen ...l="preconnect" href="https://github.githubassets.com" crossorigin>... @15:77)  @ line 15, column 77
   ```
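
The fragment quoted in the parser error (`rel="preconnect" href="https://github.githubassets.com" crossorigin`) is HTML markup, which suggests the cached `.pom` is a GitHub web page rather than a Maven POM. Below is a minimal sketch of a check for that. It assumes the file contents have already been read into a string, and the helper names `looksLikeHtml` and `looksLikePom` are hypothetical, not part of Maven or Arrow.

```javascript
// Hypothetical helper: guess whether text that should be a POM is an HTML page.
function looksLikeHtml(text) {
    const head = text.trimStart().slice(0, 512).toLowerCase()
    return head.startsWith("<!doctype html") || head.startsWith("<html")
}

// Hypothetical helper: a POM is XML whose root element is <project>,
// optionally preceded by an XML declaration and comments.
function looksLikePom(text) {
    if (looksLikeHtml(text)) return false
    const withoutDeclaration = text.trimStart().replace(/^<\?xml[^>]*\?>\s*/, "")
    return /^(<!--[\s\S]*?-->\s*)*<project[\s>]/.test(withoutDeclaration)
}
```

Running `looksLikePom` over the files under `~/.m2/repository/org/apache/arrow` would be one way to spot which cached entries need to be deleted before retrying.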


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] davisusanibar commented on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
davisusanibar commented on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1060621861


   Hi team, sorry to join late.
   
   Thank you @GavinRay97. The libraries are downloaded, but the POM/JAR files are invalid.
   
   Regarding an update to the [Arrow Java Nightly Doc](https://github.com/apache/arrow/blob/650f111b524fb1c5bfbfa6f533d15929c90ddc40/docs/source/java/install.rst#installing-nightly-packages): I have just reviewed the issue and I see two options:
   
   1. Analyze how to integrate and use the GitHub nightly releases in a transparent manner
   2. Define a workaround to install the Arrow Java nightly dependencies locally:
   - Add your repo to the documentation
   - Define a more generic integration (without modifying or adding more configuration) and add it to the documentation
   
   I am working on a generic nightly install using this shell script:
   
   Code to add to the docs:
   
   ```
   #!/bin/bash
   
   # Shell variables
   ARROW_JAVA_NIGHTLY_VERSION=${1:-'nightly-2022-03-03-0-github-java-jars'}
   DEPENDENCY_TO_INSTALL=${2:-'arrow'}
   
   # Local Variables
   TMP_FOLDER=arrow_java_$(date +"%d-%m-%Y")
   PATTERN_TO_GET_LIB_AND_VERSION='([a-z].+)-([0-9].[0-9].[0-9].dev[0-9]+).([a-z]+)'
   
   # Application logic
   echo $DEPENDENCY_TO_INSTALL
   mkdir -p $TMP_FOLDER
   pushd $TMP_FOLDER
   echo "**************** 1 - Download arrow-java $ARROW_JAVA_NIGHTLY_VERSION dependencies ****************"
   wget $( \
   	wget \
   		-qO- https://api.github.com/repos/ursacomputing/crossbow/releases/tags/$ARROW_JAVA_NIGHTLY_VERSION \
   		| jq -r '.assets[] | select((.name | endswith(".pom")) or (.name | endswith(".jar"))) | .browser_download_url' \
   		| grep $DEPENDENCY_TO_INSTALL )
   
   
   echo "**************** 2 - Install arrow java libraries to local repository ****************"
   for LIBRARY in $(ls | grep -E '.jar' | grep dev); do
   	[[ $LIBRARY =~ $PATTERN_TO_GET_LIB_AND_VERSION ]]
   	FILE=$PWD/${BASH_REMATCH[0]}
   	if [[ ( ${BASH_REMATCH[0]} == *"$DEPENDENCY_TO_INSTALL"* ) ]];then
   		if [ -f "$FILE" ]; then
   			FILE=$FILE
   		else
   			if [ -f "$FILE.jar" ]; then # Out of regex: -javadoc.jar / -sources.jar
   				FILE=$FILE.jar
   			else 
   				if [ -f "$FILE-with-dependencies.jar" ]; then # Out of regex: -with-dependencies.jar
   					FILE=$FILE-with-dependencies.jar
   				else 
   				    echo "Please review $FILE: it was not installed in the local m2 repository."
   				fi
   			fi
   		fi
   		echo "$FILE"
   		mvn install:install-file \
   			-Dfile="$FILE" \
   			-DgroupId=org.apache.arrow \
   			-DartifactId=${BASH_REMATCH[1]} \
   			-Dversion=${BASH_REMATCH[2]} \
   			-Dpackaging=${BASH_REMATCH[3]} \
   			-DcreateChecksum=true \
   			-DgeneratePom=true
   	fi
   done
   popd
   # rm -rf $TMP_FOLDER
   echo "Go to your project and execute: mvn clean install"
   ```
   
   Execute: download all dependencies, or only the JARs needed (the script uses bash features such as `[[ ]]` and `pushd`, so run it with `bash`)
   ```
   # Download all dependencies
   bash arrow_java_nightly.sh nightly-2022-03-03-0-github-java-jars
   
   # Download only the needed library, for example: memory
   bash arrow_java_nightly.sh nightly-2022-03-03-0-github-java-jars memory
   ```
   
   Use: in your pom.xml, add the dependencies and the version needed
   ```
        ...
       <properties>
           <arrow.version>8.0.0.dev165</arrow.version>
       </properties>
   
       <dependencies>
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>arrow-memory-core</artifactId>
               <version>${arrow.version}</version>
           </dependency>
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>arrow-memory-netty</artifactId>
               <version>${arrow.version}</version>
           </dependency>
           <dependency>
               <groupId>ch.qos.logback</groupId>
               <artifactId>logback-classic</artifactId>
               <version>${logback.version}</version>
           </dependency>
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>flight-core</artifactId>
               <version>${arrow.version}</version>
           </dependency>
       </dependencies>
       ...
   ```
   
   Run:
   ```
   mvn clean install
   ```
   
   Could you please check whether this works on your side?
   
   Thank you in advance.
   
   
   





[GitHub] [arrow] GavinRay97 edited a comment on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
GavinRay97 edited a comment on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1059852761


   Here is a Node.js script to download from the Nightlies and extract the assets into Maven repository structure:
   
   ```json
   {
       "name": "arrow-download-nightly-as-maven-repo",
       "scripts": {
           "start": "node index.mjs"
       },
       "dependencies": {
           "cross-fetch": "^3.1.5",
           "jsdom": "^19.0.0"
       }
   }
   ```
   ```js
   // index.mjs
   // Run with: $ node index.mjs
   import fetch from "cross-fetch"
   import asyncFS from "fs/promises"
   import { JSDOM } from "jsdom"
   import path from "path"
   import { fileURLToPath } from "url"
   
   // Polyfill "__dirname" for Node.js ECMAScript Module filetype
   const __dirname = path.dirname(fileURLToPath(import.meta.url))
   
   const ARROW_NIGHTLY_TAG_URL =
       "https://github.com/ursacomputing/crossbow/releases/tag/nightly-2022-03-03-0-github-java-jars"
   
   async function main() {
       await extractArrowNightlyJarsToLocalM2Repo(ARROW_NIGHTLY_TAG_URL)
   }
   
   main().catch((err) => {
       console.error(err)
       process.exit(1)
   })
   
   async function extractArrowNightlyJarsToLocalM2Repo(arrowNightlyTagUrl) {
       // Parse HTML to DOM
       const dom = await JSDOM.fromURL(arrowNightlyTagUrl)
       const document = dom.window.document
   
       // Get all <li> tags containing the asset name and download URL
       const assetLinkEls = document.querySelectorAll("li.Box-row")
       const assets = {} // keyed by library name
       for (const el of assetLinkEls) {
           const anchorTag = el.querySelector("a")
           const assetFilename = anchorTag.textContent.trim()
           const link = anchorTag.href
           if (assetFilename.includes("Source code")) continue
           const match = getLibraryAndVersionFromAssetFilename(assetFilename)
           if (!match) continue // skip assets that are not named <library>-<version>
           const { library, version } = match
           if (assets[library]) {
               assets[library].push({ version, link, assetFilename })
           } else {
               assets[library] = [{ version, link, assetFilename }]
           }
       }
   
       for (const [library, versions] of Object.entries(assets)) {
           for (const { version, link, assetFilename } of versions) {
               const basePath = "org/apache/arrow"
               const libPath = `${library}/${version}`
               const fullPath = path.join(__dirname, "../", basePath, libPath)
               await asyncFS.mkdir(fullPath, { recursive: true })
               console.log("Downloading " + assetFilename + " to " + fullPath)
               await downloadUrlAssetToPath(link, path.join(fullPath, assetFilename))
           }
       }
   }
   
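   async function downloadUrlAssetToPath(url, filepath) {
       const response = await fetch(url)
       // JARs are binary: write the raw bytes, since decoding them as text corrupts the file
       const content = Buffer.from(await response.arrayBuffer())
       return asyncFS.writeFile(filepath, content)
   }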
   
   // M2 repo folder format:
   // org/apache/arrow/<lib-name>/<version>/<lib-name>-<version>.(ext)
   function getLibraryAndVersionFromAssetFilename(filename) {
       const libraryAndVersionRegex = /(?<library>.+)-(?<version>\d\.\d\.\d.dev\d+)/
       return filename.match(libraryAndVersionRegex)?.groups
   }
   ```
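
The folder format in the comment above (`org/apache/arrow/<lib-name>/<version>/<lib-name>-<version>.(ext)`) can be made explicit with a small parser. The following is a sketch only: the function names are hypothetical, and it assumes every artifact is named `<artifactId>-<version>[-<classifier>].<extension>` with versions shaped like `8.0.0.dev165`.

```javascript
// Hypothetical helper: split an asset filename into Maven coordinates.
// Returns null for names that do not follow the assumed pattern.
function parseArtifactFilename(filename) {
    const pattern =
        /^(?<artifactId>.+?)-(?<version>\d+\.\d+\.\d+\.dev\d+)(?:-(?<classifier>[A-Za-z0-9-]+))?\.(?<extension>[a-z]+)$/
    const match = filename.match(pattern)
    if (!match) return null
    const { artifactId, version, classifier, extension } = match.groups
    return { artifactId, version, classifier: classifier ?? null, extension }
}

// Hypothetical helper: path of the file relative to the root of an M2 repository.
function m2RelativePath(groupId, filename) {
    const parsed = parseArtifactFilename(filename)
    if (!parsed) return null
    return [...groupId.split("."), parsed.artifactId, parsed.version, filename].join("/")
}
```

For example, `m2RelativePath("org.apache.arrow", "arrow-avro-8.0.0.dev165.pom")` yields `org/apache/arrow/arrow-avro/8.0.0.dev165/arrow-avro-8.0.0.dev165.pom`, while a non-artifact name such as `Source code (zip)` yields `null`.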





[GitHub] [arrow] lidavidm commented on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
lidavidm commented on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1060970759


   A JIRA was filed here: https://issues.apache.org/jira/browse/ARROW-15865





[GitHub] [arrow] GavinRay97 edited a comment on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
GavinRay97 edited a comment on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1059817506


   I've used a regular GitHub repository as a Maven repository before. For that you have to use the "raw" URL:
   
   
   
   ```groovy
   repositories {
     maven {
       name "expecty"
       url "https://raw.github.com/pniederw/expecty/master/m2repo/"
     }
   }
   ```
   
   Maybe something like this might be needed for using tagged releases too? I checked the POM it pulled and it's the HTML of the GitHub page rather than the actual asset.
   
   (I wanted to start prototyping a project with FlightSQL but there was some issue with it making it into the v7.0.0 release POMs)
   
   Today I will try to write a script that takes the URL to the nightly Java releases, downloads all the assets, and then creates the proper M2 folder structure for the version number.
   
   I'll publish last night's releases to this repo and share the URL for anyone else who might want a temporary fix until the 7.0.1 or 8.0.0 staging releases 👍
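
As background for why the repository URL matters: with the default Maven 2 repository layout, the client builds each artifact URL from the repository base URL plus the coordinates, so the server has to return the raw file at exactly that path. A minimal sketch of that construction follows; the function name `mavenArtifactUrl` and the `example.org` base URL are placeholders, not anything used by Arrow.

```javascript
// Hypothetical helper: build the URL a Maven client would request for an
// artifact, following the default Maven 2 repository layout:
//   <base>/<groupId as path>/<artifactId>/<version>/<artifactId>-<version>[-<classifier>].<extension>
function mavenArtifactUrl(repoBaseUrl, { groupId, artifactId, version, classifier, extension = "jar" }) {
    const base = repoBaseUrl.replace(/\/+$/, "") // drop trailing slashes
    const suffix = classifier ? `-${classifier}` : ""
    const file = `${artifactId}-${version}${suffix}.${extension}`
    return [base, ...groupId.split("."), artifactId, version, file].join("/")
}

// Example with a placeholder repository URL:
console.log(
    mavenArtifactUrl("https://example.org/m2repo/", {
        groupId: "org.apache.arrow",
        artifactId: "arrow-flight",
        version: "8.0.0.dev165",
        extension: "pom",
    })
)
```

If the server answers that request with an HTML page instead of the file, the HTML is what ends up saved as the `.pom`, which would be consistent with the parse error reported in this issue.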





[GitHub] [arrow] GavinRay97 edited a comment on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
GavinRay97 edited a comment on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1059852761


   Here is a Node.js script to download from the Nightlies and extract the assets into Maven repository structure, along with the 03/03 jars published as a usable M2 repo.
   
   Instructions for use with Gradle/Maven are here:
   https://github.com/GavinRay97/arrow-nightlies-repo
   





[GitHub] [arrow] lidavidm commented on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
lidavidm commented on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1059770864


   @davisusanibar were you able to get this to work?





[GitHub] [arrow] GavinRay97 edited a comment on issue #12570: Arrow nightly Maven releases don't seem to work

Posted by GitBox <gi...@apache.org>.
GavinRay97 edited a comment on issue #12570:
URL: https://github.com/apache/arrow/issues/12570#issuecomment-1059852761


   Here is a Node.js script to download from the Nightlies and extract the assets into Maven repository structure:
   
   ```json
   {
       "name": "arrow-download-nightly-as-maven-repo",
       "scripts": {
           "start": "node index.mjs"
       },
       "dependencies": {
           "cross-fetch": "^3.1.5",
           "jsdom": "^19.0.0"
       }
   }
   ```
   ```js
   // index.mjs
   // Run with: $ node index.mjs
   import fetch from "cross-fetch"
   import fs from "fs"
   import asyncFS from "fs/promises"
   import { JSDOM } from "jsdom"
   import path from "path"
   import { fileURLToPath } from "url"
   
   // Polyfill "__dirname" for Node.js ECMAScript Module filetype
   const __dirname = path.dirname(fileURLToPath(import.meta.url))
   
   const ARROW_NIGHTLY_TAG_URL =
       "https://github.com/ursacomputing/crossbow/releases/tag/nightly-2022-03-03-0-github-java-jars"
   
   async function main() {
       await extractArrowNightlyJarsToLocalM2Repo(ARROW_NIGHTLY_TAG_URL)
   }
   
   main().catch((err) => {
       console.error(err)
       process.exit(1)
   })
   
   async function extractArrowNightlyJarsToLocalM2Repo(arrowNightlyTagUrl) {
       // Parse HTML to DOM
       const dom = await JSDOM.fromURL(arrowNightlyTagUrl)
       const document = dom.window.document
   
       // Get all <li> tags containing the asset name and download URL
       const assetLinkEls = document.querySelectorAll("li.Box-row")
       const assets = {} // keyed by library name
       for (const el of assetLinkEls) {
           const anchorTag = el.querySelector("a")
           const assetFilename = anchorTag.textContent.trim()
           const link = anchorTag.href
           if (assetFilename.includes("Source code")) continue
           const match = getLibraryAndVersionFromAssetFilename(assetFilename)
           if (!match) continue // skip assets that are not named <library>-<version>
           const { library, version } = match
           if (assets[library]) {
               assets[library].push({ version, link, assetFilename })
           } else {
               assets[library] = [{ version, link, assetFilename }]
           }
       }
   
       for (const [library, versions] of Object.entries(assets)) {
           for (const { version, link, assetFilename } of versions) {
               const basePath = "org/apache/arrow"
               const libPath = `${library}/${version}`
               const fullPath = path.join(__dirname, "../", basePath, libPath)
               await asyncFS.mkdir(fullPath, { recursive: true })
               console.log("Downloading " + assetFilename + " to " + fullPath)
               await downloadUrlAssetToPath(link, path.join(fullPath, assetFilename))
           }
       }
   }
   
   async function downloadUrlAssetToPath(url, filepath) {
       const response = await fetch(url)
       const fileStream = fs.createWriteStream(filepath)
       // Stream the binary body to disk; reject on either read or write errors
       return new Promise((resolve, reject) => {
           response.body.pipe(fileStream)
           response.body.on("error", reject)
           fileStream.on("error", reject)
           fileStream.on("finish", resolve)
       })
   }
   
   // M2 repo folder format:
   // org/apache/arrow/<lib-name>/<version>/<lib-name>-<version>.(ext)
   function getLibraryAndVersionFromAssetFilename(filename) {
       const libraryAndVersionRegex = /(?<library>.+)-(?<version>\d\.\d\.\d.dev\d+)/
       return filename.match(libraryAndVersionRegex)?.groups
   }
   ```
   
   ```sh
   user@MSI:~/projects/arrow-download-nightly-as-maven-repo$ tree org/
   org/
   └── apache
       └── arrow
           ├── arrow-algorithm
           │   └── 8.0.0.dev165
           │       ├── arrow-algorithm-8.0.0.dev165-javadoc.jar
           │       ├── arrow-algorithm-8.0.0.dev165-sources.jar
           │       ├── arrow-algorithm-8.0.0.dev165-tests.jar
           │       ├── arrow-algorithm-8.0.0.dev165.jar
           │       └── arrow-algorithm-8.0.0.dev165.pom
           ├── arrow-avro
           │   └── 8.0.0.dev165
           │       ├── arrow-avro-8.0.0.dev165-javadoc.jar
           │       ├── arrow-avro-8.0.0.dev165-sources.jar
           │       ├── arrow-avro-8.0.0.dev165-tests.jar
           │       ├── arrow-avro-8.0.0.dev165.jar
           │       └── arrow-avro-8.0.0.dev165.pom
   ```
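
The script above finds assets by scraping the release page (`li.Box-row`), which depends on GitHub's HTML markup. The shell script posted in this thread instead reads the release JSON from `api.github.com` and filters `.assets[]` on `.name`, returning `.browser_download_url`. Below is a sketch of that same filter in JavaScript. It assumes the release JSON has already been fetched and parsed, and `selectArtifactUrls` is a hypothetical name; unlike the shell version, which greps the URL, this sketch matches the optional filter against the asset name.

```javascript
// Hypothetical helper: pick the download URLs of .pom and .jar assets from a
// parsed GitHub release object, optionally keeping only names that contain
// `nameFilter` (an empty filter keeps everything).
function selectArtifactUrls(release, nameFilter = "") {
    return (release.assets ?? [])
        .filter(({ name }) => name.endsWith(".pom") || name.endsWith(".jar"))
        .filter(({ name }) => name.includes(nameFilter))
        .map((asset) => asset.browser_download_url)
}
```

Called as `selectArtifactUrls(release, "memory")`, this would keep only the assets whose names contain `memory`, similar to passing `memory` as the second argument of the shell script.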

