Expand/Shrink

[prime_]factors / powers

Definition: sequence s = prime_factors(atom n, integer duplicates=false, maxprime=100)
-- or --
sequence s = prime_powers(atom n)
-- or --
sequence s = factors(atom n, object include1=0)
Description: prime_factors() returns a list of prime factors of n.
prime_powers() returns a list of prime powers of n.
factors() returns a list of all integer factors of n.
pwa/p2js: Supported.
Comments: The parameter n must be a non-negative integer (as stored in an atom) between 0 and power(2,iff(machine_bits()=32?53:64)).

When duplicates is false, prime_factors() always returns {} when passed a prime number (or <2).
When duplicates is true, and n>=1, prime_factors() returns a list such that multiplying all the elements of it together produces the value n, ie a true decomposition. In fact, prime_factors(1,true) yields {1} as a special case to fulfil that requirement, despite 1 not actually being a prime number. That special case is also the one and only time the result of prime_factors() can ever contain a 1.
However, prime_factors(0) yields {} no matter what duplicates and maxprime are set to (hopefully less annoying than {0}).
The duplicates parameter can also take the special values 2 and 3 as detailed in auxillary functions below.

The maxprime argument limits the number of prime factors that are checked, for more details see mpz_prime_factors(), which is also capable of handling much larger inputs.
The get_maxprime() function can be used to obtain a suitable maxprime index by performing a fast binary_search() for floor(sqrt(p)), which ensures all appropriate primes are checked. However that routine is unlikely to be suitable for mpz_prime_factors() for reasons outlined therein. Alternatively you can pass -1 to maxprime (in this routine) and have it invoke get_maxprime() for you, but of course that potentially means an unnecessary sqrt() and binary_search() every time.

Values above the stated limits trigger an exception, since otherwise the dropped bits would make the result quite incorrect and utterly meaningless. Should you by some strange circumstance happen to know that several trailing bits of a very large n are meant to be 0 (something this routine could not possibly test) and therefore there really isn’t any precision loss, I supppose you could handle the powers of 2 yourself and let these routines (only have to) deal with the in-range remainder - in other words divide by power(2,k) for some k and either prepend (say) repeat(2,k) to the head of the result, or in the case of prime_powers() either prepend {2,k} or add k into the first {2,j} should there be one.

The prime_powers(n) function decomposes n into powers of small primes, eg 720 ==> {{2,4},{3,2},{5,1}}, with each element being a {prime,power} pair. Note this perseveres all the way on up to 16|20 digit numbers, and may get rather tardy towards the precision limit, as per Example 2 below. This is closer to mpz_prime_factors() than prime_factors() is was, as of 1.0.2 [for code reuse reasons] this routine has become a trivial thin shim to prime_factors(n,2,-1).

The factors() routine returns a list of all integer factors of n, and in fact when n>1e7 it turns out faster to construct the result from the prime_powers(). The result of factors(0) is {}, irrespective of the include1 setting.

If the optional include1 parameter is:
 1 the result includes 1 and n,
-1 the result includes 1 but not n,
 0 (the default), the result includes neither 1 nor n, and
-9 then n is a lim_chk setting (0=fail maxint, 1=pass it, see technicalia).
The (uppercase) strings "BOTH","JUST1","NEITHER", and "SET_LIM" can also be used to the same effect.

The factors() routine is really only suitable for relatively small values of n, for larger values consider using prime_factors(), prime_powers(), or mpz_prime_factors() instead.

On 32-bit, the power(2,53)-1 limit of 9,007,199,254,740,991 performs some 47,453,132 trial divisions, which at several hundred clock cycles (each) takes just under 2 seconds on my machine (update: just over half that with the >1e7 speedup handling mentioned above).
That should give you some idea of just how far you can push this routine before needing to a) look for something better, or b) devise a smarter approach/algorithm that relies a little less on brute force and ignorance.
Auxillary functions: bool res = square_free(atom n, integer maxprime=100)
returns true if prime_factors(n,true,maxprime) would contain no duplicates, but without constructing any unnecessary internal lists, and quitting early on failure. square_free(0) yields true, because it is.

integer fc = factor_count(atom n) effectively returns length(factors(n,1)), but without actually constructing it, and is actually the product of all the (power+1) from prime_powers(), eg from prime_powers(720) being {{2,4},{3,1},{5,1}}, factor_count(720) is (4+1)*(2+1)*(1+1)=30 but as just said without building and extracting from any intermediate.

atom fs = factor_sum(atom n) effectively returns sum(factors(n,1)), but again without actually constructing it, also with a dash of crafty number theory for extra speed (geddit?), though to be fair you’d need a pretty big n/factor list for either this or factor_count() to make any noticeable difference.

Just so you know, internally these routines have been quite heavily refactored to minimise code duplication, such that prime_powers(n) is in fact implemented as a trivial thin shim to prime_factors(n,2,-1), and factor_count(n) as prime_factors(n,3,-1), however refactoring square_free() as prime_factors(n,4,-1) or similar has been left as an exercise for the avid reader (that is a joke btw!).

The checks on the legal range of n and handling of -1 in maxprime as noted above apply equally to these routines.
Examples:
?factors(6)            -- {2,3}
?factors(6,include1:=1) -- {1,2,3,6}
?factors(6,include1:=-1) -- {1,2,3}
?prime_factors(6) -- {2,3}
?prime_factors(720) -- {2,3,5}
?prime_factors(12345) -- {3,5,823}
?prime_factors(27)                  -- {3}
?prime_factors(27,duplicates:=true) -- {3,3,3}
?prime_factors(27,duplicates:=2)    -- {{3,2}}
?get_prime(101) -- 547
atom a = power(2*get_prime(101),2)
?prime_factors(a,true)      -- {2,2,299209}
?prime_factors(a,true,101)  -- {2,2,547,547}
?prime_powers(a)            -- {{2,2},{547,2}}
Example 2:
-- Worst case scenario. Be warned this chews through memory and made my machine almost
-- completely unresponsive for several minutes, nearly forcing a hard reboot. 
-- [The times shown exclude the get_prime() costs, which are actually even more.]
-- Of course under 32-bits it crashes fairly quickly, on the second iteration.
for b=53 to 65 do
    integer pn = get_maxprime(power(2,b))
    atom a = get_prime(pn-1)*get_prime(pn), t1 = time()
    sequence pa = prime_powers(a)
    printf(1,"b:%d, pn:%,d, a:%,d, prime_powers:%v (%s)\n",{b,pn,a,pa,elapsed(time()-t1)})
end for

--/* output:
C:\Program Files (x86)\Phix>p64 test01
b:53, pn:5484599, a:9007195909437503, prime_powers:{{94906247,1},{94906249,1}} (0.5s)
b:54, pn:7603554, a:18014382671793161, prime_powers:{{134217649,1},{134217689,1}} (0.6s)
b:55, pn:10545586, a:36028786674750007, prime_powers:{{189812501,1},{189812507,1}} (0.9s)
b:56, pn:14630844, a:72057554846356433, prime_powers:{{268435367,1},{268435399,1}} (1.2s)
b:57, pn:20306370, a:144115174032001927, prime_powers:{{379625041,1},{379625047,1}} (1.7s)
b:58, pn:28192751, a:288230356824359011, prime_powers:{{536870879,1},{536870909,1}} (2.3s)
b:59, pn:39155484, a:576460715868510101, prime_powers:{{759250091,1},{759250111,1}} (3.2s)
b:60, pn:54400029, a:1152921423002469787, prime_powers:{{1073741783,1},{1073741789,1}} (4.5s)
b:61, pn:75601572, a:2305842851326038979, prime_powers:{{1518500183,1},{1518500213,1}} (6.3s)
b:62, pn:105097566, a:4611685975477714963, prime_powers:{{2147483629,1},{2147483647,1}} (8.8s)
b:63, pn:146144319, a:9223371873002223329, prime_powers:{{3037000453,1},{3037000493,1}} (12.2s)
b:64, pn:203280222, a:18446743979220271189, prime_powers:{{4294967279,1},{4294967291,1}} (17.3s)

C:\Program Files (x86)\Phix\test01.exw:7
argument to prime_powers() exceeds maximum precision

==> see C:\Program Files (x86)\Phix\ex.err
Press Enter...
--*/
Implementation: See builtins\pfactors.e (an autoinclude) for details of the actual implementation.
See Also: get_primes, mpz_prime_factors, mpz_factors, Floats Are Not Exact
Expand/Shrink