
How is that relevant? CPAN is a package repository, not an MCMC sampling library. Can you point me to a Perl library that implements an API for constructing a probabilistic graphical model and then performs inference on it via MCMC, like PyMC3 or Stan? Is it as robust and fully featured as either of those?


Stan isn’t really written in any of those languages either.

The Python pystan is a wrapper that ships data to/from the Stan binary and marshals it into a Python-friendly form; I think Julia’s is similar.

I’m not exactly volunteering to do it, but a PerlStan would not be that hard to implement. As for scientific communication, a point you raised above, I don’t think it’d be too bad. Most readers of a paper would be interested in the model itself, and that would be written in Stan’s DSL regardless.


Fine, Stan is a bad example since its models are written in a DSL parsed by a standalone interpreter.

But tons of other numerical methods are also missing from Perl. To use another stats example: as I mentioned in another comment, PDL only supports random variable generation for common distributions (e.g. normal, gamma, Poisson). Anything beyond the stats 101 level and you’re on your own.


In bringing up CPAN, the other poster's point might have been that Matlab/Python/Octave don't generally contain native implementations of these either. A lot of Matlab and NumPy is a wrapper around BLAS/ATLAS, for example.

One could do the same with Perl, and in fact, people have. If you need random variates from a Type 2 Gumbel distribution, for example, Math::GSL::Randist has you covered https://metacpan.org/pod/Math::GSL::Randist#Gumbel
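For what it's worth, a one-off distribution like this is also easy to sample by hand with the inverse-CDF method. Here is a rough sketch in plain Python, assuming the Type-2 Gumbel parameterization with CDF F(x) = exp(-b * x^-a) for x > 0 (which, as far as I know, is the one GSL uses). The function names are made up for illustration and are not any library's API.

```python
import math
import random


def gumbel2_cdf(x, a, b):
    """CDF of the Type-2 Gumbel distribution (assumed form): exp(-b * x**-a), x > 0."""
    return math.exp(-b * x ** (-a))


def gumbel2_quantile(u, a, b):
    """Inverse of gumbel2_cdf: solve u = exp(-b * x**-a) for x."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    return (-math.log(u) / b) ** (-1.0 / a)


def gumbel2_sample(n, a, b, rng=None):
    """Draw n variates by pushing uniform draws through the quantile function."""
    rng = rng or random.Random()
    out = []
    while len(out) < n:
        u = rng.random()  # uniform on [0, 1)
        if u > 0.0:       # skip the (very unlikely) exact zero
            out.append(gumbel2_quantile(u, a, b))
    return out
```

Something like `gumbel2_sample(5, a=2.0, b=1.0, rng=random.Random(0))` gives five positive draws. A wrapped C library will be faster and better tested, so this is only meant to show how little is involved.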

Honestly, I'm not rushing to convert our stuff to PDL, but I did want to push back a little on the idea that Python is The One True Way to do scientific computing. It's a fine language, but I think a lot of its specific benefits are overstated (or mixed in with the general idea of taking computing seriously).


Yep, there's more than one way to do things and PDL wraps all the same GSL functions <https://metacpan.org/pod/PDL::GSL::RNG#ran_gumbel1>.

Also note that PDL automatically broadcasts over its input variables, so a single call runs the whole loop in C over an array of input values. See this example <https://gist.github.com/zmughal/fd79961a166d653a7316aef2f010...> for how that applies to all GSL functions that are available in PDL. Though I do notice that some of the distributions available at <https://docs.scipy.org/doc/scipy/reference/stats.html#contin...> are not in GSL.

Though when I do stats, I often reach for R and have done some work in the past to make PDL work with the R interpreter (it currently has some build bitrot and I need to fix that).



