pqueue
The file builtins\pqueue.e (an autoinclude) contains a basic implementation of priority queues, plus
more traditional stacks and queues (see sub-page).
A priority queue is kind of fast to-do list, whereby you can add as many items as you want in any order and at any time, and quickly retrieve either the lowest or highest element (but not both), depending on whether the priority queue was created as a MIN_HEAP or a MAX_HEAP. You can add items on-the-fly, between peeling a few off, and the head of the priority queue will always be the min/max element.
Relatively small lists gain little, however I can attest to seeing this thoroughly trounce an earlier version of a program that had to process over 800,000 numbers in a different order to the way they were created, by a factor of 17.5 (0.8s vs 14s), and yet the only real difference was how those numbers were being stored and sorted. While a standard sort() is generally pretty fast, it can (sometimes) be completely outclassed by a priority queue, especially if (as per the example below) we can avoid the need to generate everything up-front, and/or avoid any repeated sorting or scanning.
For more technical details about the implementation of priority queues (a kind of shape-constrained binary tree kept in a flat list), see the comments at the start of builtins\pqueue.e - note that at the time of writing no consideration (or testing) whatsoever has been given to thread-safety.
Like dictionaries, there is a default queue which works as if pq_new() had returned 1, which can simplify things slightly when only one (MIN_HEAP) queue is required.
Traditionally the priority is an integer (as in the example below), but it can also be a float, string, or even a complex nested sequence. A custom comparison routine can also be provided for priorities held as (say) mpfr or mpz.
Priority queues particularly shine when used with generators, whereby as each item is popped the next (if any) for that generator (which is always >= than the one just popped, but may be lower or higher than other generators in the queue) is immediately pushed back on. [By "generator" I just mean anything you can get another value from, be that a routine_id, an index, a file handle, whatever - the data can be anything, the only constraint is the priority has to be sortable.]
These routines are fully supported by pwa/p2js.
Obviously the key point is it gets 25 (32) after 52 (25) and
33 (27) and before 72 (49), which it achieves by popping and
re-adding the p=2 generator more times than the p=3, p=5, and p=7 generators. Incidentally it would be quite fair
to say that code pushes far too many primes, apart from a starter it only needs to push the next prime when it
pops a p==v. Without a priority queue, that kind of thing can get real fiddly real fast. Should you be looking
for a true (in-order-of-addition) queue, simply use a standard Phix sequence [update:
or see the new subpage].
Note there is no way to examine any element in the queue other than the head.
If you define a MIN_HEAP queue, the minimum priority is readily available, the maximum is not.
If you define a MAX_HEAP queue, the maximum priority is readily available, the minimum is not.
Should you require both min and max to be available, and perhaps a few things in the middle, just use a dictionary instead.
A priority queue is kind of fast to-do list, whereby you can add as many items as you want in any order and at any time, and quickly retrieve either the lowest or highest element (but not both), depending on whether the priority queue was created as a MIN_HEAP or a MAX_HEAP. You can add items on-the-fly, between peeling a few off, and the head of the priority queue will always be the min/max element.
Relatively small lists gain little, however I can attest to seeing this thoroughly trounce an earlier version of a program that had to process over 800,000 numbers in a different order to the way they were created, by a factor of 17.5 (0.8s vs 14s), and yet the only real difference was how those numbers were being stored and sorted. While a standard sort() is generally pretty fast, it can (sometimes) be completely outclassed by a priority queue, especially if (as per the example below) we can avoid the need to generate everything up-front, and/or avoid any repeated sorting or scanning.
For more technical details about the implementation of priority queues (a kind of shape-constrained binary tree kept in a flat list), see the comments at the start of builtins\pqueue.e - note that at the time of writing no consideration (or testing) whatsoever has been given to thread-safety.
Like dictionaries, there is a default queue which works as if pq_new() had returned 1, which can simplify things slightly when only one (MIN_HEAP) queue is required.
Traditionally the priority is an integer (as in the example below), but it can also be a float, string, or even a complex nested sequence. A custom comparison routine can also be provided for priorities held as (say) mpfr or mpz.
Priority queues particularly shine when used with generators, whereby as each item is popped the next (if any) for that generator (which is always >= than the one just popped, but may be lower or higher than other generators in the queue) is immediately pushed back on. [By "generator" I just mean anything you can get another value from, be that a routine_id, an index, a file handle, whatever - the data can be anything, the only constraint is the priority has to be sortable.]
These routines are fully supported by pwa/p2js.
Example:
integer powers = pq_new(), p, v
sequence res = {}
for i=1 to 23 do
p = get_prime(i)
pq_add({p,p},powers) -- {2,3,5,7,11,...}, aka power(prime,1)
{p,v} = pq_pop(powers)
res &= v
pq_add({p,v*p},powers) -- {4,8,9,16,25,...}, aka power(prime,>1)
end for
pq_destroy(powers)
printf(1,"The first 23 primes or perfect powers are:\n")
pp(res) -- {2,3,4,5,7,8,9,11,13,16,17,19,23,25,27,29,31,32,37,41,43,47,49}
| integer pqid = |
|
| integer res = |
pq_size(integer pqid=1) - obtains the number of entries currently in the list. |
| bool res = |
pq_empty(integer pqid=1) - yields true when the list is empty,
false otherwise.return pq_size(pqid)==0) |
| |
| sequence item = |
|
| object data = |
|
| sequence item = |
pq_peek(integer pqid=1) - as pq_pop but leaves item in the queue. |
|
Note there is no way to examine any element in the queue other than the head.
If you define a MIN_HEAP queue, the minimum priority is readily available, the maximum is not.
If you define a MAX_HEAP queue, the maximum priority is readily available, the minimum is not.
Should you require both min and max to be available, and perhaps a few things in the middle, just use a dictionary instead.