This is Part 4 in a series. You can start at the Intro here.

The last post of this series was kind of more about replicating numpy's `linspace` function in Perl 6 than it was about testing the limits of this matplotlib wrapper. In this part I am going to hit one of those limits and encounter a minor shortcoming of the wrapper as it currently exists.

If you're following along, I had decided to take a shot at one of the histograms and jumped into the first one, named `histogram_demo_features`. I took a glance at the Python code and froze...

    import numpy as np
    import matplotlib.mlab as mlab
    import matplotlib.pyplot as plt

Straight off the bat I'm in trouble. Cue dramatic music. The `Matplotlib` wrapper I'd been using was really a wrapper for `matplotlib`'s sub-module `pyplot`. My wrapper gave me no access to other sub-modules like `mlab`. I was going to have to modify the wrapper.

In Python, importing a package doesn't necessarily give me access to its sub-packages, e.g. if I import the `matplotlib` base package, I can't use `matplotlib.pyplot` or `matplotlib.mlab`... but! If I import *only* `matplotlib.pyplot`, I can now also use `matplotlib.mlab`. Maybe (probably) I just don't understand Python packaging.
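For what it's worth, here is a small Python sketch of what I think is going on. It builds a throwaway package on disk (the names `demo_pkg`, `plot` and `lab` are made up for the demo) where one sub-module imports its sibling. My assumption is that `pyplot` pulls in `mlab` in a similar way, which would explain the behaviour above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. All names here are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")  # the base package imports no sub-modules
(pkg / "lab.py").write_text("def hello():\n    return 'hi from lab'\n")
(pkg / "plot.py").write_text("from demo_pkg import lab\n")  # plot pulls in lab

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg
# Importing only the base package does not load its sub-modules.
base_has_plot = hasattr(demo_pkg, "plot")

import demo_pkg.plot
# Importing plot loaded lab as a side effect and attached it to the package.
base_has_lab = hasattr(demo_pkg, "lab")

print(base_has_plot, base_has_lab)
print(demo_pkg.lab.hello())
```

So a sub-module only shows up on the parent package once something, anywhere, has imported it.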

Stefan Seifert, the guy who created the `Inline::Python` module, is a clever guy. Me, I'm not so clever, so I just kinda hacked away until things worked, and this is what I landed on. I imported `matplotlib.pyplot` at the package level (which will run when the wrapped module is `use`d) and defined `pyplot` and `mlab` as their own classes.

    use Inline::Python;

    my $py = Inline::Python.new();
    $py.run('import matplotlib.pyplot');

    class Matplotlib::Mlab {
        method FALLBACK($name, |c) {
            $py.call('matplotlib.mlab', $name, |c);
        }
    }

    class Matplotlib::Plot {
        method FALLBACK($name, |c) {
            $py.call('matplotlib.pyplot', $name, |c);
        }
    }

    class Matplotlib {
        method FALLBACK($name, |c) {
            $py.call('matplotlib', $name, |c);
        }
    }

Oh, I also dropped the `py` from `pyplot` in my module because... reasons. I've also got a class for the top-level `matplotlib` package. I'm not sure if there are methods in there I will need to call, but it doesn't hurt to be prepared.
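If the `FALLBACK` trick looks like magic, the closest Python analogy I can think of is `__getattr__`, which is only consulted when normal attribute lookup fails. This is just a sketch of the same forwarding idea, not how `Inline::Python` works internally; `ModuleProxy` is a name I made up, and I'm using the standard `math` module as a stand-in for `matplotlib.pyplot`.

```python
import importlib


class ModuleProxy:
    """Forward unknown attribute lookups to a named module (hypothetical helper)."""

    def __init__(self, module_name):
        self._module = importlib.import_module(module_name)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, much like FALLBACK.
        return getattr(self._module, name)


maths = ModuleProxy("math")
print(maths.sqrt(16))
print(maths.floor(2.7))
```

One small class per module keeps each wrapper trivial, at the cost of a bit of repetition.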

I don't even know if I jumped this hurdle, or just kinda kicked it over and stumbled ahead. Let me know if you have a saner way I could have done this. In any case, what mattered to me most at the time was that it worked and I could move on to playing with plots. To use this fancy new wrapper in all my previous examples, all I need to do is change the class instantiation from this

    my $plt = Matplotlib.new;

... to this

    my $plt = Matplotlib::Plot.new;

Which actually maps closer to the Python code, anyways. With that out of the way I can move on to the next few lines of code.

    np.random.seed(0)

    # example data
    mu = 100  # mean of distribution
    sigma = 15  # standard deviation of distribution
    x = mu + sigma * np.random.randn(437)

I can guess what `random.seed` does. Pseudo-random number generators (or PRNGs) use an algorithm to compute each random number; this is the "pseudo" part of pseudo-random. Provided you start the algorithm at the same number (the seed) each time, the sequence of results is always the same. How the seed is normally obtained (and how the random numbers are generated) differs between operating systems and programming languages. The `seed` function in Perl is called `srand`, so that part's easy.
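Here's a quick way to see the seeding behaviour for yourself, sketched with Python's standard `random` module rather than numpy. The helper name `first_draws` is my own invention.

```python
import random


def first_draws(seed, count=5):
    """Return the first few numbers from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


run_a = first_draws(0)
run_b = first_draws(0)
run_c = first_draws(1)

print(run_a == run_b)  # same seed, same sequence
print(run_a == run_c)  # different seed, different sequence
```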

Then we come to `randn`. A quick search led me to this StackOverflow post where I learned that it creates a "normal distribution." That link jumps to one of the replies, which is from an actual statistician! This helpful human explains that a normal distribution is *"a distribution where the values are more likely to occur near the mean value"*. So, think bell curve.

I'm not a stats guy. Heck, I'm not even a maths guy... So I headed to RosettaCode to grab a normal distribution function in Perl 6. I modified it slightly (hopefully without breaking it) so that it behaves like a very simple clone of `numpy.random.randn`, and like `numpy`, stuck it in its own sub-module of the `Numpl` module I created in Part 3.

    class Numpl::Random {
        method randn($n) {
            sqrt( -2 × log(rand) ) × cos( τ × rand ) xx $n;
        }
    }

    class Numpl {
        # linspace stuff ...

        method random {
            Numpl::Random.new();
        }
    }

Which means I could now do this

    my $np = Numpl.new;
    my $x = $np.random.randn(437)
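As far as I can tell, the formula I borrowed is the Box-Muller transform, which turns two uniform random numbers into one normally distributed number. Here is the same idea sketched in plain Python so you can compare it with the Perl 6 version. The `1.0 - rng.random()` tweak is my own addition to avoid ever taking `log(0)`, and the shift and scale at the end mirrors the `mu + sigma * randn(...)` line from the gallery code.

```python
import math
import random


def randn(n, rng=random):
    """Return n samples from a standard normal distribution (Box-Muller)."""
    samples = []
    for _ in range(n):
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is always defined
        u2 = rng.random()
        samples.append(math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2))
    return samples


rng = random.Random(0)
mu, sigma = 100, 15
x = [mu + sigma * z for z in randn(437, rng)]
mean = sum(x) / len(x)

print(len(x), round(mean, 1))  # the mean should land somewhere near mu
```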

The next few lines are pretty straightforward, so I'll move on to `mlab.normpdf`. Ok, so the comment there tells me that this thing adds a line of "best fit", but what the heck does it have to do with PDF? Being curious, I did a search and found out it stands for 'Probability Density Function'. With my curiosity quenched, I converted the rest of the code to Perl without much fanfare.

    use Numpl;
    use Matplotlib;

    my $np = Numpl.new;
    my $plt = Matplotlib::Plot.new;
    my $mlab = Matplotlib::Mlab.new;

    srand(0);

    # example data
    my $mu = 100; # mean of distribution
    my $sigma = 15; # standard deviation of distribution
    my $x = $np.random.randn(437).map( * × $sigma + $mu );

    my $num_bins = 50;

    my ( $fig, $ax ) = $plt.subplots();

    # the histogram of the data
    my ( $n, $bins, $patches ) = $ax.hist( $x, $num_bins, :normed(1) );

    # add a 'best fit' line
    my $y = $bins.map(-> $value { $mlab.normpdf( $value, $mu, $sigma ) });
    $ax.plot($bins, $y, '--');
    $ax.set_xlabel('Smarts');
    $ax.set_ylabel('Probability density');
    $ax.set_title('Histogram of IQ: $\mu=100$, $\sigma=15$');

    # Tweak spacing to prevent clipping of ylabel
    $fig.tight_layout();
    $plt.show();
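In case you're wondering what a probability density function actually computes, here's my understanding sketched in plain Python. This is the textbook normal density formula, which I'm assuming is what `mlab.normpdf` evaluates. It is my own stand-in, not matplotlib's code.

```python
import math


def normpdf(x, mu, sigma):
    """Height of the normal bell curve at x, for mean mu and std deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


peak = normpdf(100, 100, 15)    # the curve is tallest at the mean
one_sd = normpdf(115, 100, 15)  # and lower one standard deviation away

print(round(peak, 4))
print(one_sd < peak)
```

Evaluating this at every bin edge gives the dashed "best fit" curve drawn over the histogram.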

So, um, yeah... Not much to say here that hasn't been covered. I'm using a `map` again, once on the results of `randn` and once to call `normpdf` on each bin. The rest is pretty standard translation stuff, and here's the result.

Even though I am seeding the PRNG, Perl will generate random numbers differently than Python, so this doesn't look exactly like the one in the gallery. You can remove the `srand` to get a different graph each time. The colours, however, are a little... academic. I think next I'll try applying one of the style sheets to a graph.

To be continued...