Adjective Noun

Home

Pick and Choose (Part N)

2020-06-26 22:51, Tags: combinatorics

Hey, you remember this series, right? No? Oh that's probably because the last article was years ago. If you're interested, here's the previous articles...

Ahh, those heady days of 2018. I had big dreams. I had hoped my momentum for working on this module would eventually pick back up. I realised along the way that I'm sitting on some half-decent functions that no one can really use because I haven't published them. That changes today.

Today, I have published Math::Combinatorics to CPAN. It features the following functions

  • multicombinations (aka combinations with replacement)
  • variations (aka k-permutations of n)
  • partitions
  • derangements

Of those, only derangements is not written in NQP. My skill with NQP is that of a amateur, so I may not have written the most efficient code. However the implementations written in NQP should at least be noticeably faster than most pure-Raku functions implementing the same algorithms.

I held off on publishing this module for so long because I wanted to polish it, provide more functions, and implement faster .skip on things like permutations, where the n-th permutation in a sequence can be determined algorithmically.

In hindsight, it was silly not to publish sooner. It doesn't prevent me from working on those things, and encourages others to get involved too.

Sometimes in life, you have to pick and choose your battles.

Exportation Exploration - Update

2018-10-29 02:40, Tags: raku

I'm a little embarrassed, although I can't say I'm entirely surprised. In the last post, I made mention that I was possibly missing something obvious. I was right. In the Reddit comments, user FCO posted a gist (that's my untouched fork of FCO's work) that made me feel like an idiot. For all my aliasing experimentation, I'd totally overlooked the idea of declaring a function that does what EXPORT does and calling it... literally anything... and then exporting that function as EXPORT into the importers namespace.

Stripping the gist down to it's simplest form, is this snippet, which I have put up on the ecosystem.

multi sub exported-EXPORT(%exports, *@names --> Hash()) {
    do for @names -> $name {
        unless %exports{ $name }:exists {
            die("Unknown name for export: '$name'");
        }
        "&$name" => %exports{ $name }
    }
}

multi sub EXPORT {
    my %exports;
    multi sub trait_mod:<is>(Routine:D \r, Bool :$exportable!) is export {
        trait_mod:<is>(r, :exportable(r.name => True));
    }
    multi sub trait_mod:<is>(Routine:D \r, :$exportable!) is export {
        trait_mod:<is>(r, :export($exportable));
        %exports{ r.name } = r
    }
    {
        '&EXPORT' => sub (*@names) { exported-EXPORT %exports, |@names }
    }
}

It's beautiful in it's simplicity. A function is declared that returns key/value pairs, where the keys are names, and the values are functions. For all intents and purposes, this will be the EXPORT sub that the importer uses. The trick is that it takes an additional hash as an argument. In this modules EXPORT sub, that very hash is declared, along side the exportable trait that populates that hash with functions that use it. Finally, the function that was initially declared is exported in to the module writers namespace as EXPORT. It is given the hash of exports, as well as whatever list is passed to it by module user in the use statement.

It also handles tags, hence the call to trait_mod:<is>(:export).

The snippet does not handle any aliasing, and I think outside of experimentation, that's a good idea. Imagine it handled import aliasing with the syntax described earlier, and people start using it. Then, later down the track, Rakudo adds an exportable (or something) trait that does the same thing, and because it's in core, people stop using my module. I'm cool with that, but if people who are using those modules are using some funky import aliasing trick, that might break someones code, and I'm not cool with that.

To be clear, I think aliasing is a useful feature, but the syntax should be standardised first. In the reddit comments, raiph pointed out that if you want your users to be able to alias imports, your exportable functions should be module scoped

use Exportable;
module Foo {
    our sub bar is exportable { ... }
    our sub baz is exportable { ... }
}

This allows functions to be imported by name, or called with the fully qualified name... and aliasing can be done with a simple assignment

use Foo 'bar';

sub baz { ... }

say bar();
say Foo::baz();

my &bazz := &Foo::baz;

say bazz();

In any event, the Exportable module is now available to use, and handles the syntax in the original article where I said "Ultimately what I'd like is something like this". Thanks again FCO.

Exportation Exploration

2018-10-26 09:30, Tags: raku perl python

Well it's been another long while since I updated. You know how it is, things in life exist in a hierarchy, and playing with code comes behind family and work. In any case, I'm still working on my combinatorics modules... and as I move closer to an actual release, I've been ruminating on different ways to export functions from my module.

Maybe this is an unpopular opinion, but I have a slight distaste for modules that import functions by default. It's unfortunate that the most Huffmanised (quickest, easiest) way of exporting functions...

unit module Foo

sub bar is export { ... }

... results in polluting the users namespace. I would have hoped that we as a community would have moved to strongly preferring selective imports... but TIMTOWTDI?

A running theme of mine seems to be "let's look at what Python does", so why stop now? Typically, if I import a namespace in Python, I can only use symbols from that namespace by prefixing them with the namespace.

>>> import string
>>> list(string.ascii_lowercase)[:10]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
>>> ascii_lowercase
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'ascii_lowercase' is not defined

Alternatively, I can selectively import individual functions, and only those functions are exported, not the namespace they came from...

>>> from string import ascii_lowercase
>>> ascii_lowercase
>>> list(ascii_lowercase)[:10]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
>>> string.ascii_lowercase
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'string' is not defined

As far as I know, symbols are never imported implicitly. I think this export/import system is one of the things Python got mostly right.

So what can Rakoons start doing right now to do better. If your module is OO-based (ie. it's functions are exported via method calls on an instantiated class) then this doesn't apply... but if your module exports individual functions, I would urge you not to use is export trait, at least not without tags.

There's a couple of options when it comes to enforcing selective imports. The easiest is by tagging all your imports. I don't think that's necessarily what export tags were created for, but as a module writer, it's the certainly the simplest way to avoid polluting the users namespace.

unit module Foo

sub bar is export(:bar) { ... }
sub baz is export(:baz) { ... }

The :ALL tag is implied by the is export trait

The downside is I have to type the function name twice, and make sure I spell it correctly the second time, and remember to rename the tag if I rename the function.

Then, users can selectively import functions like this:

use Foo :bar;

say bar();
# baz() or Foo::baz() not available here

Again, I'm not sure this is the intended use for tags, but as you know, TIMTOWTDI.

In Perl, the generally standard way to export functions is to have your package inherit from the core Exporter module, then define a list of functions that are exported by default in the package scoped array @EXPORT, and selective imports in @EXPORT_OK.

package Foo;

use parent 'Exporter';
our @EXPORT_OK = qw( bar baz )

sub bar { ... }
sub baz { ... }

This means that enforcing selective imports in Perl requires a minimum of only 2 lines of code, but this also suffers from having to type the function name a second time.

Similar to how @EXPORT_OK was defined, the importing code declares a list of words to the module to selectively import those names...

use Foo qw( bar );

say bar()
# baz() not available, though Foo::baz() is

To get similar import syntax in Raku - in contrast to Perl's 2 lines - one has to jump through a few hoops.

my %exports;
module Foo {
    sub bar { ... }
    sub baz { ... }

    %exports = MY::.grep(*.key.starts-with: '&');
}

multi sub EXPORT(*@names) {
    my %imports;
    for @names -> $name {
        unless %exports{"&$name"}:exists {
            die("Unknown name for export: '$name'");
        }
        %imports{"&$name"} := %exports{"&$name"}
    }
    return %imports
}

It's a little esoteric, but I'm creating an %exports hash that contains all symbol names in the Foo namespace that start with & (ie. the names of Callable symbols) and creating a key/value Pair from the name/symbol. Later on in the EXPORT sub, I iterate through any provided names, and if they exist in %exports, then they are added to the %imports hash. It also allows me to fail early if the user tries to import something I don't export.

Despite the verbosity, at least I didn't have to type my function names twice... but this exports all my functions regardless if I want to export them or not! This could be controlled via a trait... but I'll come back to that.

The result is that I can then import individual symbols with a list of names, like so.

use Foo <bar>

say bar();
# baz() or Foo::baz() not available here

As per the comment above, Foo::baz() is also unavailable. In most cases that doesn't matter, and I could just import the symbols I want. However, there were situations in Perl where it was handy to be able to use the fully qualified name, typically when there was a name collision.

In those cases, it would be nice to be able to use the full name, or perhaps import a symbol using a different name. Python users are very used being able to alias symbols on import

>>> from itertools import combinations_with_replacement as cwr
>>> list(cwr('ABC', 2))
[('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')]

... and given a little modification to my EXPORT function, I can kinda do something similar with Raku.

multi sub EXPORT(*@args) {
    my %imports;
    for @args -> $arg {
        my ($name, $symbol) =  $arg ~~ Pair ?? |$arg.kv !! $arg, $arg;
        unless %exports{"&$name"}:exists {
            die("Unknown name for export: '$name'");
        }
        %imports{"&$symbol"} := %exports{"&$name"}
    }
    return %imports
}

Which allows me to do this

use Foo \[ 'bar', baz => 'frob' ];

say bar();
say frob(); # calls the Foo::baz() function

You might have seen functions that allow you to export a symbol of your choice. Here I'm just (ab)using that syntax... but it should be possible to add some sugar around this syntax.

Ultimately, as it stands, I can't say I'm entirely happy with the current state of exporting and selective importing. I think making the is export export names by default was the wrong choice, particularly considering the :DEFAULT tag exists. Ideally, is export would have made the function available for selective import by name. Module writers wishing to pollute their users namespace would then have to write is export(:DEFAULT). To put it another way, Raku should make it easier to do the right thing.

The upshot is that Raku is a young language, and there's still the possibility for course correction. I'd like to see some simpler syntax for making functions exportable (and eventually, alias-able). For starters, a new is exportable that allows symbols to be exported by name (and not default) would go some ways to improving the situation.

I've made some attempts... I can improve my example above (which made all Callable's exportable) by using a trait to mark exportable functions.

multi sub trait_mod:<is>(Routine:D \r, :$exportable!) {
    r.^mixin( role { method is_exportable { True } } )
}

my %exports;
module Foo {
    sub bar is exportable { ... }
    sub baz is exportable { ... }

    %exports = MY::.grep(*.value.?is_exportable);
}

This still requires me to paste in my funky EXPORT sub. I know I can write a module that exports an EXPORT multi, but I don't know how to access the importers PseudoStash to handle the exporting. The closest I got was reaching into CALLER::CALLER::() from the EXPORT sub, which feels wrong.

I truly don't understand the intricacies of exporting enough to know how to do it. As always, it's highly probable that I'm doing things completely wrong, or missing something obvious. Ultimately what I'd like is something like this...

# Exportable by 'foo'
sub foo is exportable { ... }

# Exportable by 'bar' and the :b tag
sub bar is exportable(:b) { ... }

# Exportable by 'baz' and the :b tag
sub baz is exportable(:b) { ... }

# Is not exportable
sub qux { ... }

The beauty is that is would practically be as easy as is export, doesn't require typing the function name twice, and would arguable provide a better experience for both module writers, and end users. I suspect this could possibly be done in the module space, but I think eventually having something like this in core would encourage it's usage.

If you know how to write a trait like this, I'd be interested on your input. I'm also interested in other users thoughts in general on Raku's current export functionality. Comment on Reddit.

Update!

Pick and Choose (Part 2)

2018-03-17 01:04, Tags: combinatorics

So I'm learning a bunch about combinatorics, improving my knowledge of Raku Iterators, and getting better at NQP. Of those 3 things, writing NQP is probably the least interesting one to write about, so I'm not sure I'll talk much about it, and I covered a little on Iterators already in part 1½.

Before deciding to write a fast, native, combinatorics module, I did a lot of playing around to find ways to do produce of these sequences as a simple (often one-liner) solution... even if they weren't terribly fast. For someone who never did more than high school maths, I ran up against ideas that might be obvious to the more mathematically inclined... but it's these realisations that were the most interesting to me, so that's mainly what I'm going to talk about.

The next combinatorial algorithm I wanted to implement was a type of permutation known as "k-permutations of n", or P(n, k). Although, looking up information on this algorithm, I learned that it's not strictly a permutation in the true sense of the word, so it has been known by some other names... such as "variations" or "n choose k".

Similar to a normal permutations, this algorithm is about the different ways you can arrange n items. The catch is, you can only arrange up to k of those items. I don't have a cool ice cream analogy to explain this one, but since order matters, the first one that comes to mind is a racing podium. In a given race of 8 racers, how many different possible ways could you order the participants at 1st, 2nd, and 3rd place? The answer is P(8, 3) == 336.

One of the reasons I wanted to tackle this algorithm next is because - like combinations_with_repetition - it's available in Python under the itertools module. It's generated when using the permutations function, with an additional argument (the k in P(n, k)).

>>> from itertools import permutations
>>> list(permutations([1, 2, 3, 4], 2))
[(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4),
 (3, 1), (3, 2), (3, 4), (4, 1), (4, 2), (4, 3)]

Again, like my previous multicombinations algorithm, I had originally worked out a one-liner that produces the same results, albeit, not in lexicographic order

> (1, 2, 3, 4).combinations(2).map(|*.permutations)
((1 2) (2 1) (1 3) (3 1) (1 4) (4 1)
 (2 3) (3 2) (2 4) (4 2) (3 4) (4 3))

If you can get over the odd order, this looks pretty efficient because I am not throwing away any produced elements. However, there is quite a lot of overhead associated with creating so many short lived permutations Iterators. Of course, if you just need a short solution with no dependencies, this will suffice... but we can do a little better.

I realised that with sequences of k elements, where k ≥ n - 1, I could just generate the permutations and trim the sub-lists.

> (^3).combinations(2).map(|*.permutations).sort
((0 1) (0 2) (1 0) (1 2) (2 0) (2 1))
> (^3).permutations».[^2]
((0 1) (0 2) (1 0) (1 2) (2 0) (2 1))

... and then I realised that when k == n - 2, I could generate the permutations, skip every second element, and trim the sub-lists

> (^4).combinations(2).map(|*.permutations).sort
((0 1) (0 2) (0 3) (1 0) (1 2) (1 3) (2 0) (2 1) (2 3) (3 0) (3 1) (3 2))
> (^4).permutations[0, 2 ... *]».[^2]
((0 1) (0 2) (0 3) (1 0) (1 2) (1 3) (2 0) (2 1) (2 3) (3 0) (3 1) (3 2))

Some of you are smarter than me and know where this is heading... but I had to run a few more examples until I realised that I could always generate permutations, and the number of elements to skip is (n - k)! ie. the factorial of n - k.

sub postfix:<!> ($n) { [×] 1 .. $n }  # Factorial function

sub variations(@n, $k) {
    @n.permutations[ 0, * + (@n - $k)! ... * ].map(*[^$k])
}

You might think that since I'm throwing away data, it's inefficient... and you'd be right! However, it's got less overhead than the first one-liner; it's only making one call to permutations, and it's doing a relatively simple index into some lists. It benches faster too, and there are some simple ways to speed this up a little if you sacrifice it's brevity... but if I talk about them now, I'll run out of paper... That's how this computer contraption works, right?

So now on to the meat. How do I efficiently generate this sequence algorithmically? This algorithm was a bit harder to track down, particularly because it's known by a few names, the main one of which is confusingly close to a different algorithm. I also couldn't find a RosettaCode task that talks about this specific sequence.

Eventually I found a blog post titled "A Simple, Efficient P(n,k) Algorithm" by some chap named Alistair Israel. The description matched what I was looking for, so I got to work translating it to Raku, and here it is.

sub variations(@n is copy, $k) {

    my @i = ^@n.elems;
    my $m = @n.end;
    my $e = $k - 1;

    gather loop {

        take @n[ @i[^$k] ];

        my $j = $k;
        $j++ while $j$m && @i[$e] ≥ @i[$j];

        if $j$m {
            swap( @i[$e], @i[$j] );
        }
        else {
            @i[$k .. $m] .= reverse;

            my $i = $e - 1;
            $i-- while $i0 && @i[$i] ≥ @i[$i + 1];

            last if $i < 0;

            $j = $m;
            $j-- while $j > $i && @i[$i] ≥ @i[$j];

            swap( @i[$i], @i[$j] );
            @i[$i + 1 .. $m] .= reverse;
        }
    }
}

Don't run it yet! There is no swap function in Raku. Of course, I can simply swap natively...

> my @x = 0..5
[0 1 2 3 4 5]
> (@x[2], @x[4]) = (@x[4], @x[2]); @x
[0 1 4 3 2 5]

... but I have to repeat myself too much, and since swap is a fairly common task in permutation algorithms, it's best to factor it out. You could do that with a plain ol' function

sub swap($a is rw, $b is rw) {
    ($a, $b) = $b, $a;
}

Or if you feeling extra adventurous, you could experiment with the current macro system.

use experimental :macros;

macro swap($a, $b) {
    quasi {
        ({{{$a}}}, {{{$b}}}) = {{{$b}}}, {{{$a}}}
    }
}

I say current to re-enforce that macros are an experimental feature they may change in future releases.

Whichever one you choose, this variations function - in pure Raku - benches faster than the previous one-line for shorter sequences like P(4, 2), but starts to lose on longer sequences like P(6, 5), but that should be rectified by converting this algorithm to NQP. Which it was... and is! I have pushed the NQP based variations to Github if you wanna give it a spin.

There still some improvements to make. It occurs to me that calculating the count-only each time is silly. I should calculate it once during the Iterator creation, and store it for later use.

Lastly... My guess is that if people are playing with combinatorics, they're probably playing with factorials as well... So the module now also has two additional exports: a factorial() function, and factorial-op which exports the postfix ! factorial operator. Factorials are interesting in themselves and I have been looking into faster implementations of that, so that will most likely make it's way into later commits.

Earlier