#core java #java
This article is inspired by Monty-Hall problem/paradox. Now I found this paradox disturbing initially (this is why it’s paradox, wink…), and decided to quickly model it to validate for myself. That brought idea of writing up all I know about Java Random API in this article.
If you want to read about paradox itself click here
java.util.Random
Instantiation
Random generator is a bit group sequence generator. Random constructor accepts “seed” parameter which is just a starting value for the sequence generator. This is why if you seed 2 Random instances with same number and call same next() methods in exact sequence, generated number will be exactly the same.
To protect default Random constructor from such behavior, Java uses current system time to seed generator. However it is possible that 2 threads will instantiate Random objects at exactly same nano second, therefore current time is XOR’ed with “seed uniquifier” value which is guaranteed to be unique for each thread via AtomicLong#compareAndSet().
Please note, that if you need multiple threads served by random generator it might be beneficial to use ThreadLocalRandom, that reduces thread contention.
Number generation
Actual “random” number generation is done by next() method as all other methods are just wrappers around that to create correct variable, using some bit arithmetic.
The next() method itself is multiplication, addition and masking with carefully selected values that ensure good uniformity. This means that once you have same bits generated by 2 random generators (which could be achieved via setSeed() method), next generated sequences will be exactly the same, similar to what is achieved by having same seed from the start.
Number predictability
At this point you may feel that aforementioned predictability may break security of system, if Random is used to generate tokens, passwords, IDs, etc. In fact you can calculate Random seed using just 2 subsequent generated values. Also since next() returns only 32 bytes, having longer (e.g. 128-bit) token will not improve security.
When to use
Use Random or ThreadLocalRandom when you care only about uniformity of generated numbers. These will give you uniform probabilities and decent performance.
java.security.SecureRandom
As said before biggest problem with Random is that it’s reversible. To break this we may use cryptographically strong (requires many iterations to guess input based on output) hashing function and apply it to generated numbers. In fact SecureRandom is a mini-framework to assemble you own generator from seeding, byte generation and hashing function.
Structure
SecureRandom is not a generator itself, but a Template pattern around how random generator must operate. It uses Java SPI functionality, that allows you to hook in any implementation of SecureRandomSpi that would be used to actually generate seed and bytes. Other than that, SecureRandom operates similarly to Random.
In case you need to use different generators in the same application, you can use constructors/factory methods with your implementation or Provider to get SecureRandom instances with different backends.
So what if you don’t provide your own SPI implementation? Then SecureRandom will fall back to default Sun implementation - sun.security.provider.SecureRandom. Let’s further describe what this Sun implementation does to generate random numbers.
Initialization
To obtain random seed Sun implementation uses SeedGenerator object. Here you have 3 options:
- You can specify a system property to read seed from your URL. In this case URLSeedGenerator instance will be used
- If you don’t go 1st option it will attempt to use NativeSeedGenerator (more on this below)
- If your OS does not support NativeSeedGenerator, it will fall back to using ThreadSeedGenerator, which will generate seed based on how much time OS takes to switch between application threads.
NativeSeedGenerator is the instance that most likely to be used. This generator will use random device of your OS/hardware:
- /dev/random/ or /dev/urandom on Unix-likes
- Crypto API on Windows These devices generate randomness based on user/OS events, e.g. interrupts (IO events such as keystrokes and network events, hardware timers, etc). The exact device used may be found in $JAVA_HOME/lib/security/java.security file under securerandom.source property.
Why would URLSeedGenerator be necessary? It could be used to implement true random number generator, e.g. you use weather forecast to generate randomness. This data may be supplied as web service and your random generator may use URL of that service to seed itself.
Number generation
Number generation is similar to Random, however every generated number gets hashed using algorithm that is specified on SecureRandom creation. This way if attacker gets hold of generated values, it is harder to figure out number generation algorithm as they also have to brute force values to get original ones.
When to use
Use when you need to generate anything that is security-related: temporary passwords, session IDs, tokens, etc.
/dev/random blocking
I though I write this up separately, as finding this took N man-hours to find out.
If your system uses /dev/random, it will block until minimal level of randomness is generated. Normally OS save some randomness before restart and no blocking happens. However, if you launch containerized application, there is nothing stored since last launch and you will have to wait. This may take very long time (I saw this taking over 1 hour on Amazon EC instance), especially if you run it on server setting when no one is pressing keyboard.
Be wary that some Java components use SecureRandom under the hood, e.g. UUID (when it generates IDs) or File (when creating temporary files), so you may encounter this in situations outside security domain.
In place of conclusion
Hopefully this sheds some light on how Java random number generators work under the hood. Thanks for reading and stay safe!