Posted to issues@nifi.apache.org by "Andy LoPresto (JIRA)" <ji...@apache.org> on 2017/01/10 19:09:58 UTC

[jira] [Created] (NIFI-3313) First deployment of NiFi can hang on VMs without sufficient entropy if using /dev/random

Andy LoPresto created NIFI-3313:
-----------------------------------

             Summary: First deployment of NiFi can hang on VMs without sufficient entropy if using /dev/random
                 Key: NIFI-3313
                 URL: https://issues.apache.org/jira/browse/NIFI-3313
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 1.1.1
            Reporter: Andy LoPresto
            Assignee: Andy LoPresto
            Priority: Critical


h1. Analysis of Issue

h2. Statement of Problem:

NiFi deployed on a headless VM (with little user interaction by way of keyboard and mouse I/O) can reportedly take 5-10 minutes to start up. The user reports that this occurs on a "secure" cluster. Further examination is required to determine which specific process requires the large amount of random input (no steps to reproduce, configuration files, logs, or VM environment information were provided).

h2. Context

The likely cause of this issue is that a process is attempting to read from _/dev/random_, a \*nix "device" providing a pseudo-random number generator (PRNG). Also available is _/dev/urandom_, a related PRNG. Despite common misconceptions, _/dev/urandom_ is not "less secure" than _/dev/random_ for general use cases. _/dev/random_ blocks if the entropy *estimate* (a "guess" of the existing entropy introduced into the pool) is lower than the amount of random data requested by the caller. In contrast, _/dev/urandom_ does not block, but provides the output of the same cryptographically-secure PRNG (CSPRNG) that _/dev/random_ reads from \[myths\]. After as little as 256 bytes of initial seeding, _/dev/random_ and _/dev/urandom_ are functionally equivalent, as the long period of random data generated will not require re-seeding before sufficient entropy can be provided again.
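
To make the "entropy estimate" concrete, the following is a minimal sketch (not part of NiFi) that reads the kernel's current estimate and pulls a few bytes from _/dev/urandom_. It assumes a Linux host that exposes the estimate at {{/proc/sys/kernel/random/entropy_avail}}; the class and method names are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalInt;

// Hypothetical helper for inspecting the kernel entropy estimate on Linux.
final class EntropyProbe {
    // Assumption: Linux procfs location of the entropy estimate (in bits).
    static final Path ENTROPY_AVAIL = Paths.get("/proc/sys/kernel/random/entropy_avail");
    static final Path URANDOM = Paths.get("/dev/urandom");

    // Returns the kernel's entropy estimate, or empty if it cannot be read
    // (for example, on a non-Linux system).
    static OptionalInt entropyEstimateBits() {
        try {
            byte[] raw = Files.readAllBytes(ENTROPY_AVAIL);
            String text = new String(raw, StandardCharsets.US_ASCII).trim();
            return OptionalInt.of(Integer.parseInt(text));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Reads exactly n bytes from /dev/urandom, which does not block
    // waiting for the entropy estimate to rise.
    static byte[] readUrandom(int n) {
        byte[] buf = new byte[n];
        try (InputStream in = Files.newInputStream(URANDOM)) {
            int off = 0;
            while (off < n) {
                int read = in.read(buf, off, n - off);
                if (read < 0) {
                    throw new IOException("unexpected end of stream");
                }
                off += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf;
    }

    public static void main(String[] args) {
        OptionalInt estimate = entropyEstimateBits();
        System.out.println(estimate.isPresent()
                ? "entropy estimate (bits): " + estimate.getAsInt()
                : "entropy estimate not available on this system");
        if (Files.isReadable(URANDOM)) {
            System.out.println("read " + readUrandom(16).length + " bytes from /dev/urandom");
        }
    }
}
```

Running this on an affected VM shortly after boot would be one way to check whether the estimate is low at the time NiFi starts.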

As mentioned earlier, further examination is required to determine if the process requiring random input occurs at application boot or only at "machine" (hardware or VM) boot. On the first deployment of the system with certificates, the certificate generation process will require substantial random input. However, on application launch and connection to a cluster, even the TLS/SSL protocol requires some amount of random input. 

h2. Proposed Solutions

h3. rngd

A software toolset called _rng-tools_ \[rngtools\] exists for Linux for accessing a dedicated hardware RNG (*true* RNG, or TRNG). Specialized hardware, as well as Intel chips from Ivy Bridge (2012) onward, can provide hardware-generated random input to the kernel. Using the daemon _rngd_ to seed the _/dev/random_ and _/dev/urandom_ entropy pool is the simplest solution.

*Note: Do not use _/dev/urandom_ to seed _/dev/random_ using _rngd_. This is like running a garden hose from a car's exhaust back into its gas tank and trying to drive.*

h3. Instruct Java to use /dev/urandom

The Java Runtime Environment (JRE) can be instructed to use _/dev/urandom_ for all invocations of {{SecureRandom}}, either on a per-Java process basis \[jdk-urandom\] or in the JVM configuration \[oracle-urandom\], which means it will not block on server startup. The NiFi {{bootstrap.conf}} file can be modified to contain an additional Java argument directing the JVM to use _/dev/urandom_. 
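
As a sketch of what this could look like: NiFi's {{bootstrap.conf}} passes JVM arguments through {{java.arg.N}} entries, so a line such as {{java.arg.15=-Djava.security.egd=file:/dev/urandom}} would set the standard {{java.security.egd}} system property (the index 15 is arbitrary here and only needs to be unused). The small program below is a hypothetical diagnostic, not NiFi code; it reports which entropy source settings the running JVM sees and times a {{SecureRandom}} call.

```java
import java.security.SecureRandom;
import java.security.Security;

// Hypothetical diagnostic: shows the JVM's configured entropy source settings.
final class SecureRandomSourceCheck {

    // Reports the per-process system property and the JVM-wide security property.
    static String describe() {
        // Set per process with -Djava.security.egd=...; null when not supplied.
        String egd = System.getProperty("java.security.egd");
        // Set JVM-wide in the java.security configuration file.
        String source = Security.getProperty("securerandom.source");
        return "java.security.egd=" + egd + ", securerandom.source=" + source;
    }

    // Times the creation of a SecureRandom and the generation of numBytes bytes.
    // A large value on a freshly booted VM may indicate a blocking entropy source.
    static long timeNextBytesMillis(int numBytes) {
        long start = System.nanoTime();
        SecureRandom random = new SecureRandom();
        byte[] buffer = new byte[numBytes];
        random.nextBytes(buffer);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println("algorithm: " + new SecureRandom().getAlgorithm());
        System.out.println("elapsed ms: " + timeNextBytesMillis(32));
    }
}
```

Running it once with and once without {{-Djava.security.egd=file:/dev/urandom}} on the affected VM would show whether the argument is being picked up. Some older JDK releases are reported to need the form {{file:/dev/./urandom}} instead, so this should be verified against the JRE actually in use.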

h2. Other Solutions

h3. Entropy Gathering Tools

Tools to gather entropy from non-standard sources (audio card noise, video capture from webcams, etc.), such as audio-entropyd \[wagner\], have been developed, but these tools are not verified or well-examined. When they are tested, they are usually tested only for the strength of their PRNG, not for the ability of the tool to capture entropy and generate sufficiently random data that is unavailable to an attacker who may be able to determine the internal state.

h3. haveged

A solution has been proposed to use {{haveged}} \[haveged\], a user-space daemon relying on the HAVEGE (HArdware Volatile Entropy Gathering and Expansion) construct to continually increase the entropy on the system, allowing _/dev/random_ to run without blocking.

However, on further investigation, multiple sources indicate this solution may be insecure \[dice\]\[leek-havege\]. 

Michael Kerrisk: 

bq. Having read a number of papers about HAVEGE, Peter \[Anvin\] said he had been unable to work out whether this was a "real thing". Most of the papers that he has read run along the lines, "we took the output from HAVEGE, and ran some tests on it and all of the tests passed". The problem with this sort of reasoning is the point that Peter made earlier: there are no tests for randomness, only for non-randomness.
bq. One of Peter's colleagues replaced the random input source employed by HAVEGE with a constant stream of ones. All of the same tests passed. In other words, all that the test results are guaranteeing is that the HAVEGE developers have built a very good PRNG. It is possible that HAVEGE does generate some amount of randomness, Peter said. But the problem is that the proposed source of randomness is simply too complex to analyze; thus it is not possible to make a definitive statement about whether it is truly producing randomness. (By contrast, the HWRNGs that Peter described earlier have been analyzed to produce a quantum theoretical justification that they are producing true randomness.) "So, while I can't really recommend it, I can't not recommend it either." If you are going to run HAVEGE, Peter strongly recommended running it together with rngd, rather than as a replacement for it.

Tom Leek:

bq. Of course, the whole premise of HAVEGE is questionable. For any practical security, you need a few "real random" bits, no more than 200, which you use as seed in a cryptographically secure PRNG. The PRNG will produce gigabytes of pseudo-\[data\] indistinguishable from true randomness, and that's good enough for all practical purposes.
bq. Insisting on going back to the hardware for every bit looks like yet another outbreak of that flawed idea which sees entropy as a kind of gasoline, which you burn up when you look at it.

h2. Next Steps

As described above, further investigation is necessary, but moving forward, barring new information, I would propose directing the JVM to use _/dev/urandom_ and making _rngd_ available to systems that support a TRNG. 

[myths] http://www.2uo.de/myths-about-urandom/
[rngtools] https://git.kernel.org/cgit/utils/kernel/rng-tools/rng-tools.git/about/
[jdk-urandom] http://stackoverflow.com/a/2325109/70465
[oracle-urandom] https://docs.oracle.com/cd/E13209_01/wlcp/wlss30/configwlss/jvmrand.html
[wagner] https://people.eecs.berkeley.edu/~daw/rnd/
[haveged] http://www.issihosts.com/haveged/
[dice] https://lwn.net/Articles/525459/
[leek-havege] http://security.stackexchange.com/a/34552/16485



