Expand/Shrink

(is_/get_[max])prime[s/_le]

Definition: integer res = is_prime(atom n, integer bIndex=-1)
-- or --
atom res = get_prime(integer k)
-- or --
sequence res = get_primes(integer count=0)
-- or --
sequence res = get_primes_le(integer hi, overrun=0)
-- or --
integer maxprime = get_maxprime(atom m)
Description: Determine whether an integer is prime, or retrieve the nth prime or a table of prime numbers or a prime index.

n: return true if n is a prime number, false otherwise, or the result of a binary_search() when bIndex is true.
bIndex: false = use binary search, true = return result from that, -1 (default) = use trial division (see notes).
k: obtain the kth prime number.
count: the number of primes to obtain, see notes below.
hi: return a sequence of all primes <= hi.
overrun: in case you need a few extra (must not be -ve).
m: return the index of the highest prime <= floor(sqrt(m+1)).
pwa/p2js: Supported.
Comments: The file builtins/primes.e (an autoinclude) contains an extensible prime generator routine, that maintains an internal table of previously generated prime numbers, initially set to {2,3,5,7}.

All five functions extend said table when necessary, then:
is_prime() performs a binary search when it can/must, and/or uses it for trial division on larger numbers.
get_prime() returns a single element from it.
get_primes() returns an exact or cropped copy of it.
get_primes_le() uses a fast binary_search() to crop it.
get_maxprime() performs a fast binary_search() for floor(sqrt(m+1)).

Note that get_prime(0) yields 0, in preference to crashing.
If count is omitted or zero, the current table is returned, which can be as short as the initial four shown above.
If count is positive, at least that number of primes will be returned.
If count is negative, then exactly abs(count) primes are returned.
The latter would normally be a little slower than creating a non-trimmed copy, especially when it gets quite big.
However, rather than say sum(get_primes(26)[1..26]) you may as well get this routine to trim it for you, ie use sum(get_primes(-26)) instead.

You are of course free to mangle your own copy of any result from get_primes(), but for obvious reasons you are not allowed to damage or otherwise wreak havoc on the internal table.
In some cases better performance may be noticed if any previous results from get_primes() are explicitly discarded (ie set to {}) before invoking get_prime[s]() again (ie avoid any unnecessary cloning), and of course it is a pretty cheap operation to re-fetch the same thing, that is assuming it does not need any further extending, or trimming. You can also invoke get_prime(-1), which happens more by accident than design to return 7[1], to clear/re-initialise the internal table, should you need to release the memory for some other use.
[1] get_primes(-1) always returns {2}, which is slightly more rational/sensible.

The biggest limiting factor is the size of the internal table. Like all Phix sequences, it can easily contain 100,000,000 items, and can probably contain 200 million, but the theoretical limit of 400 million is quite likely to get thwarted by something else being in the way (aka memory fragmentation or some other variable allocated slap in the middle of addressable memory). Those numbers are potentially much bigger on 64 bit, depending on physical memory and the operating system. I can tell you that the 100,000,000th prime is 2,038,074,743 which suggests the sort of limits that might be applicable to parameter n of is_prime() on 32-bit [which is defined as atom to allow "integer" values above 1,073,741,823], but otherwise there are no deliberate restrictions in place here, other than the occasional precision validation test.

The mpz_prime() function is likely to be significantly faster for larger numbers, and obviously is not limited to 31/53/63/64 bit values like the native 32/64-bit Phix integer/atoms are, and it will probably also use and hog less memory, then again this needs no dlls and is perfectly adequate and fast enough, for most of the time, that is assuming the results of any calculations are also all within the normal Phix integer/atom [accuracy] limits.

Fairly obviously the +1 in get_maxprime() avoids glitches along the lines of sqrt(64) = 7.99999, which inevitably becomes increasingly likely as m gets significantly larger, especially on 64-bit where as mentioned elsewhere there is no implicit (thin veneer of) rounding (that hides a multitude of sins) when storing numbers out of the hardware FPU, as yet.

Some time ago, is_prime() was renamed as is_prime2() and a replacement put in pfactors.e, which was sometimes faster...
As of 1.0.2 they have been merged [in an attempt] to get the best of both [which for the most part failed/gained almost nowt].
NB passing false for bIndex is required to obtain the same performance profile previously achieved by is_prime2().

Should bIndex not be -1, specifically true (as opposed to not 0) or false, the internal primes table is extended to cover n,
and then a fast k := binary_search() is performed on it, before finally returning iff(bIndex?k:k>0).
Note that building the complete primes table can be a bit of a memory hog, or even run out, and/or entail higher one-off setup costs.
When bIndex is true and is_prime() returns a positive result, then calling get_prime() with that will retrieve the corresponding prime number, or of course it could be used to subscript some prior result from get_primes[_le]() that you may have knocking about.
Obviously is_prime(0) yields false, because it isn’t (unlike some otherwise equivalent functions in some other programming languages that I won’t name), as in fact does any n<2, in other words all negative n as well, even when bIndex is true.

When bIndex is -1, it uses a sometimes faster/simplified version of code from prime_factors(). As just mentioned, invoking is_prime() with only a single parameter, ie not extending the internal table (beyond floor(sqrt(n)), can be much faster and deal with a much larger n.
Bear in mind, however, that while that is (usually) pretty fast, even on 15 or 16 digit numbers, it is still a trial division approach and no doubt mpz_prime() will be much faster on larger numbers, plus of course the latter is not bound by any 53/64 bit precision limits.
In some cases, is_prime(,false) is significantly faster, but sometimes much slower, up to a factor of 5 either way.
The general advice (assuming you do not want an index) is to experiment with is_prime(n[,false]) to see which is the faster.
[all my attempts to automate that decision for you have yielded pretty dismal results, yet it may be something as simple as ">1e7"...]
Example:
?get_prime(1) -- prints 2
?get_primes(-26) -- prints {2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101}
?get_primes_le(100) -- prints {2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97}
?get_maxprime(100) -- prints 4 (as in get_prime(4) is 7, since 11^2 > 100)
for i=1 to maxprime(1e12) do -- the idx of all primes < 1e6(==sqrt(1e12))
Implementation: See builtins\primes.e (an autoinclude) for details of the actual implementation.
See Also: binary_search, mpz_prime
Expand/Shrink