Posted to user@hadoop.apache.org by Joe Mocker <jm...@magnite.com.INVALID> on 2022/03/19 15:08:33 UTC

Building Hadoop on macOS Monterey?

Hi,

Curious if anyone has tips for building Hadoop on macOS Monterey, for Apple Silicon? My goal is to be able to use native (compression) libraries. After some gymnastics, I have been able to compile Hadoop 2.9.1, but problems arise in locating and loading dynamic libraries.

For example, running hadoop checknative results in the following:

22/03/19 07:57:00 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
22/03/19 07:57:00 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
22/03/19 07:57:00 ERROR snappy.SnappyCompressor: failed to load SnappyCompressor
java.lang.UnsatisfiedLinkError: Cannot load libsnappy.1.dylib (dlopen(libsnappy.1.dylib, 0x0009): tried: '/Volumes/work/zulu8.60.0.21-ca-jdk8.0.322-macosx_aarch64/zulu-8.jdk/Contents/Home/bin/./libsnappy.1.dylib' (no such file), 'libsnappy.1.dylib' (no such file), '/usr/lib/libsnappy.1.dylib' (no such file), '/Volumes/work/hadoop-2.9.1/libsnappy.1.dylib' (no such file))!
	at org.apache.hadoop.io.compress.snappy.SnappyCompressor.initIDs(Native Method)
	at org.apache.hadoop.io.compress.snappy.SnappyCompressor.<clinit>(SnappyCompressor.java:57)
	at org.apache.hadoop.io.compress.SnappyCodec.isNativeCodeLoaded(SnappyCodec.java:82)
	at org.apache.hadoop.util.NativeLibraryChecker.main(NativeLibraryChecker.java:92)
22/03/19 07:57:00 WARN zstd.ZStandardCompressor: Error loading zstandard native libraries: java.lang.InternalError: Cannot load libzstd.1.dylib (dlopen(libzstd.1.dylib, 0x0009): tried: '/Volumes/work/zulu8.60.0.21-ca-jdk8.0.322-macosx_aarch64/zulu-8.jdk/Contents/Home/bin/./libzstd.1.dylib' (no such file), 'libzstd.1.dylib' (no such file), '/usr/lib/libzstd.1.dylib' (no such file), '/Volumes/work/hadoop-2.9.1/libzstd.1.dylib' (no such file))!
WARNING: /work/zulu8.60.0.21-ca-jdk8.0.322-macosx_aarch64//bin/java is loading libcrypto in an unsafe way
Abort trap: 6
 
No matter what combination I try of setting LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, and/or DYLD_FALLBACK_LIBRARY_PATH, it will not find the necessary libraries. I think this has to do with restrictions due to Apple’s System Integrity Protection (SIP).

The only way I have figured out to work around this so far is to symlink all the dynamic libraries into one location and then run hadoop from that working directory, for example:

lrwxrwxr-x  1 mock  staff     59 Mar 18 17:55 libcrypto.dylib@ -> /opt/homebrew/Cellar/openssl@1.1/1.1.1m/lib/libcrypto.dylib
lrwxrwxr-x  1 mock  staff     45 Mar 18 18:09 libhadoop.dylib@ -> /work/hadoop-2.9.1/lib/native/libhadoop.dylib
lrwxrwxr-x  1 mock  staff     53 Mar 18 17:55 libsnappy.1.dylib@ -> /opt/homebrew/Cellar/snappy/1.1.9/lib/libsnappy.dylib
lrwxrwxr-x  1 mock  staff     51 Mar 18 18:05 libzstd.1.dylib@ -> /opt/homebrew/Cellar/zstd/1.5.2/lib/libzstd.1.dylib
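The symlink workaround above can be scripted as the following sketch. The Homebrew and Hadoop paths are the ones from this particular setup and are only illustrative; adjust them to your machine:

```shell
# Recreate the symlink-farm workaround in a scratch directory.
# Note: ln -s creates the link even if the target is absent, so the
# paths below (taken from the listing above) are only examples.
WORKDIR="$(mktemp -d)"
cd "$WORKDIR"
ln -s "/opt/homebrew/Cellar/openssl@1.1/1.1.1m/lib/libcrypto.dylib" libcrypto.dylib
ln -s "/work/hadoop-2.9.1/lib/native/libhadoop.dylib"               libhadoop.dylib
ln -s "/opt/homebrew/Cellar/snappy/1.1.9/lib/libsnappy.dylib"       libsnappy.1.dylib
ln -s "/opt/homebrew/Cellar/zstd/1.5.2/lib/libzstd.1.dylib"         libzstd.1.dylib
# Then run hadoop from this directory so dlopen() can resolve the
# libraries via the current working directory:
#   $HADOOP_HOME/bin/hadoop checknative
```

Running hadoop from that directory lets dlopen() find the libraries through its current-working-directory fallback, which is what the checknative output below shows.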

% $HADOOP_HOME/bin/hadoop checknative
22/03/19 08:05:55 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
22/03/19 08:05:55 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /Volumes/work/hadoop-2.9.1/lib/native/libhadoop.dylib
zlib:    true /usr/lib/libz.1.dylib
snappy:  true /opt/homebrew/Cellar/snappy/1.1.9/lib/libsnappy.1.1.9.dylib
zstd  :  true /opt/homebrew/Cellar/zstd/1.5.2/lib/libzstd.1.5.2.dylib
lz4:     true revision:10301
bzip2:   false 
openssl: false EVP_CIPHER_CTX_cleanup

What I am really looking to do is use Spark (and Jupyter) with native libraries, which adds even more wrinkles.

Any suggestions would be appreciated.

  —joe

Re: Building Hadoop on macOS Monterey?

Posted by Joe Mocker <jm...@magnite.com.INVALID>.
I realize we are getting off topic for the list, but yeah I had entertained the idea of just running with SIP disabled. After all, I use Linux instances all day…

As I investigated how macOS strips the DYLD* and LD* variables, I found that it will still load native libraries from places like the current working directory. So, as an attacker, I could still arrange for my code to be loaded with enough tinkering, even without the workaround I resorted to. ¯\_(ツ)_/¯ 

  —joe

> On Mar 25, 2022, at 10:55 PM, Hariharan <ha...@gmail.com> wrote:
> 
> The stripping of DYLD* and LD* variables is a "feature" that's part of Apple's SIP. So another option to stop this is to disable SIP - https://developer.apple.com/documentation/security/disabling_and_enabling_system_integrity_protection
> 
> Apple doesn't recommend this, but I've been running different macbooks with SIP disabled for years, and haven't noticed any bad side effects.
> 
> As an aside, Apple has a history of crippling various functionalities behind SIP. For example, in the latest versions of Monterey, you can't run certain `dscl` commands unless you either disable SIP or provide "Full Disk Access" to bash.
> 
> Thanks, 
> Hariharan 
> 


Re: Building Hadoop on macOS Monterey?

Posted by Hariharan <ha...@gmail.com>.
The stripping of DYLD* and LD* variables is a "feature" that's part of
Apple's SIP. So another option to stop this is to disable SIP -
https://developer.apple.com/documentation/security/disabling_and_enabling_system_integrity_protection
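The stripping is easy to observe from a shell. A minimal probe (the /opt/libs path is a placeholder; on Linux, or on macOS with SIP disabled, the variable survives into the child):

```shell
# Probe whether a child of a protected binary still sees DYLD_LIBRARY_PATH.
# On macOS with SIP enabled, /bin/sh is a protected executable and the
# dynamic linker strips DYLD_* from its environment, so the child prints
# "<stripped>"; elsewhere it prints the value we set.
out="$(DYLD_LIBRARY_PATH=/opt/libs /bin/sh -c 'printf "%s" "${DYLD_LIBRARY_PATH:-<stripped>}"')"
echo "child sees: $out"
```

This is why exporting DYLD_LIBRARY_PATH before running a script like bin/hadoop has no effect: the variable is gone by the time the script's interpreter starts.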

Apple doesn't recommend this, but I've been running different MacBooks with
SIP disabled for years and haven't noticed any bad side effects.

As an aside, Apple has a history of crippling various functionalities
behind SIP. For example, in the latest versions of Monterey, you can't run
certain `dscl` commands unless you either disable SIP or provide "Full Disk
Access" to bash.

Thanks,
Hariharan


Re: Building Hadoop on macOS Monterey?

Posted by Andrew Purtell <an...@gmail.com>.
Thank you for sharing that blog post.

> The TL;DR is that as soon as macOS executes one of its trusted
> executables, like /bin/sh or /usr/bin/env, it cripples anything you might
> have done, like setting DYLD_LIBRARY_PATH to dynamic library folders, and
> results in failure to load them.

On the one hand I can see the security requirements that led to this
decision, but it is so contrary to the UNIX philosophy, IMHO, that it's no
wonder it violates the principle of least surprise here, and I bet it does
in many other situations as well. This reminds me why 'step 1' of setting
up for dev on my new M1 MacBook was to install Parallels and a Linux
aarch64 VM. That environment is quite sane, and the VM overheads are
manageable.



-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
    It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse

Re: Building Hadoop on macOS Monterey?

Posted by Joe Mocker <jm...@magnite.com.INVALID>.
Hi, Thanks...

It ended up being more involved than that due to all the shared library
dependencies, but I figured it out (at least with an older version of
Hadoop). I ended up writing a short post about it:

https://creechy.wordpress.com/2022/03/22/building-hadoop-spark-jupyter-on-macos/

 --joe


Re: Building Hadoop on macOS Monterey?

Posted by Andrew Purtell <an...@gmail.com>.
If you build with -Dbundle.snappy and -Dbundle.zstd on the Maven command line, the build will produce a tarball containing copies of the native shared libraries in lib/native/. That would be like your symlink workaround, but perhaps less hacky, and it is something the build supports already. Does this work for you?
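For reference, a full invocation might look like the sketch below. The profile and bundle/require flags follow Hadoop's BUILDING.txt; the -Dsnappy.lib and -Dzstd.lib paths are assumptions for a Homebrew install on Apple Silicon, so adjust them to your system:

```shell
# Sketch: build a Hadoop distribution tarball with snappy and zstd
# bundled into lib/native/ (run from the Hadoop source root).
# The library paths are assumed Homebrew locations, not verified ones.
mvn package -Pdist,native -DskipTests -Dtar \
    -Drequire.snappy -Dsnappy.lib=/opt/homebrew/Cellar/snappy/1.1.9/lib -Dbundle.snappy \
    -Drequire.zstd   -Dzstd.lib=/opt/homebrew/Cellar/zstd/1.5.2/lib     -Dbundle.zstd
```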
