Apropos of the hn.algolia.com search used in the above link, it seems pretty good and accurate. I could put my HN user name into the text box along with words like prince and xml, and it worked right.
It only returned comments by me, and I did not have to put my name in a separate user name field. I'm guessing this is a full-text search feature.
Can someone explain why it is called a "functional" language?
Wikipedia says it's essentially a pure subset of Prolog, but Prolog is not classified as a functional language, so I'm curious what makes this one called a functional language.
I assume there must be some significant language features that e.g. functions.
> Wikipedia says it's essentially a pure subset of Prolog, but Prolog is not classified as a functional language
That statement used “pure” in the sense of “functionally pure”. (I'm not sure about “subset”; it certainly lacks some Prolog features that are impure, but it seems to have features that Prolog lacks; it seems more like it contains a pure subset of Prolog than that it is one.)
> I assume there must be some significant language features that e.g. functions.
I think you missed something in that sentence, but, yes, Mercury has first-class and higher-order functions.
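For instance, a minimal sketch of what that looks like (my own illustrative module, assuming only the standard io, int and list libraries):

:- module hof_demo.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module int, list.

main(!IO) :-
    % A function value built with a lambda, bound to an ordinary variable.
    Double = (func(X) = 2 * X),
    % Applied directly...
    io.print(Double(21), !IO),
    io.nl(!IO),
    % ...and passed to a higher-order library function.
    io.print(list.map(Double, [1, 2, 3]), !IO),
    io.nl(!IO).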
I think the difference is that in Prolog you have imperative state management while Mercury only has pure functions, and manages state through linear types.
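A tiny sketch of what that state threading looks like in practice: I/O takes a unique "state of the world" argument pair (di/uo modes), so even printing is a pure-looking call that consumes the old world and produces a new one. This is the standard hello-world shape:

:- module hello.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.

main(!IO) :-
    % !IO expands to a pair of arguments threading the io state:
    % the call consumes the current state (di) and produces a new one (uo).
    io.write_string("Hello, world!\n", !IO).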
Another difference between Prolog and Mercury is that Mercury lacks 'real' unification. Predicates must be moded (they must be compiled for some particular variable instantiation states). The compiler reorders literals so that free variables do not need to be represented in memory. Almost all of the 'logical' side of the language is discarded statically, at compile time. The first time I tried it (20+ years ago!) the executables were blazingly fast! The bad side (at the time) was poor flexibility and very slow compilation times (compared with Prolog).
Of course, for any prolog program of significant size, you want nearly all of it to be moded and deterministic anyway, and not leave choice points all over the shop.
Prolog feels like imperative programming, but with a special kind of depth first search through options built-in to the language. Mercury feels like a functional language, but with the same depth-first search built-in.
I like some of the semi-determinism or non-determinism, it’s sometimes a really natural way of specifying part of a problem. But it’s only magic sauce part of the time.
You should look at Icon [0][1], which is a lot like... Pascal, or any Algol family language, but which has pervasive depth-first search and backtracking built-in. Icon is and feels very procedural.
Or jq, which also has pervasive depth-first search and backtracking built-in. jq is and feels very functional.
I always wanted to try logic programming, but I felt like I never understood how to use it effectively.
For my own personal coding projects, 99% of the time I am either writing a CLI tool or a daemon/server of some kind. I've attempted to learn Prolog in the past, but I never got to the point where I felt like I could actually write a complete program that interacts with the world in a meaningful way. So I pretty much gave up before ever getting as far as checking out something like Mercury... and I couldn't even figure out how to use miniKanren at all.
I can certainly see the utility of logic programming as an embedded DSL, but I have no idea how to write a "complete" program in a logic language.
Am I missing something? Is there a specific book or guide I should follow? Are there examples of open source programs that use logic programming I should look at?
This seems pretty cool, but I don't really understand the function-mode concept; does anyone have a good explanation of what it is or why it's a feature?
Mode declarations aren't really a feature so much as they are additional information provided by the programmer to the compiler. In Prolog if you have a predicate
foo(X,Z) :-
    bar(X,Y),
    baz(Y,Z).
Traditionally, if you call foo(1,Z) it will try to find an answer for bar(1,Y) first, then feed that Y value into baz; if you call foo(X,3) it will still try to find values for bar(X,Y) first. Sometimes that's okay, because the engine will be like "well yeah, there's something that'll work", so we'll say bar is true and move on to baz, but sometimes it tells you to go suck eggs.
You can work around that by hand-writing a second version of foo with the body literals swapped, but having to do that sucks. First, it forces you to think about Prolog's implementation rather than the logical relationship between X and Z; second, it's repetitive and forces you to keep both versions in sync with each other, which is going to get rough for more complex predicates.
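That hand-written second version might look something like this (the name is made up; callers would have to pick the right one themselves):

% Same logical content as foo/2, but with the body literals swapped
% so that a call with the second argument bound works better.
foo_reverse(X,Z) :-
    baz(Y,Z),
    bar(X,Y).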
One of the things Mercury uses that mode info for is to re-arrange the body of foo(X,Y). So if you tell it that you will use both foo(in,out) and foo(out,in) it will compile two different versions of foo and dispatch appropriately.
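In Mercury the two usages might be declared something like this (a sketch; the int type and the nondet determinism are only assumptions here, since they depend on bar and baz):

:- pred foo(int, int).
% One compiled version per declared mode; the determinism annotation
% is checked by the compiler against the bodies of bar and baz.
:- mode foo(in, out) is nondet.
:- mode foo(out, in) is nondet.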
>> One of the things Mercury uses that mode info for is to re-arrange the body of foo(X,Y). So if you tell it that you will use both foo(in,out) and foo(out,in) it will compile two different versions of foo and dispatch appropriately.
Very good point (missing from my essay below). To add some context: in Prolog at least, because of the depth-first search implementation of the search for answers (more correctly, and formally: for resolvents), the order in which the clauses of a program are placed in the source file can make the difference between the program entering an infinite recursion when called in a certain mode and the program terminating in that same mode.
Could some kind of topological sort of the clauses, like I think Unix make does for the rules defined in a makefile, be applicable and of help, or is that not meaningful here? PL theory is not my area, just throwing out a wild idea :)
I had been studying make and working on some topological sorting algorithms lately, which is how I thought of the idea.
Sort of; not far off. Basically, it's possible to impose a total ordering over the Herbrand base (the set of atomic formulae from which clauses are composed) or over the set of constants in a Prolog program in such a way that a program is guaranteed to terminate, provided it adheres to the given ordering, but such an ordering has to be hand-crafted, i.e. there's no known way to come up with one automatically.
Sorry I can't think of a very good example right now, but suppose you have a program like this:
p :- q.
q :- p.
Given Prolog's top-down evaluation, that will enter an infinite recursion. It is possible to avoid this by ordering the Herbrand base so that either p > q or q > p, which precludes one of the two clauses (so p > q is violated by "q :- p", because "p" must come before "q" in a clause). This means that some programs (like the example above) can't be written. With an ordering over constants also, termination can be guaranteed as long as clauses that violate the lexicographic ordering are precluded; so for instance, with the natural numbers ordered in descending order, the following program terminates:

even(0).
even(X):- r(X,Y), odd(Y).
odd(X):- r(X,Y), even(Y).
Because (given even/1 and odd/1 are relations over the naturals) r(X,Y) can only be a relation between X and Y where X > Y, so the predicate will only count down towards 0, rather than up towards infinity.
This is basically the gist of stratification used to allow decidable recursion or negation in datalog languages.
But that is an ordering of atoms within clauses and constants within atoms, not clauses. I don't know of any principled way to order clauses in a program to guarantee termination. That would be interesting to have.
>Sorry I can't think of a very good example right now
No problem.
I should say that I didn't understand your reply fully, due to not having the full necessary background. But I think I got the rough gist of the first half of the reply, and a part of the second half.
I had learned only a little Prolog, many years ago, just out of interest, only from books, and did not have access to any machine to try it on at the time. But I do remember bits of it, fuzzily (pun intended). So let me take a guess at what the second program means:
>even(0).
That is a fact stating: 0 is even.
>even(X):- r(X,Y), odd(Y).
That is a rule, stating:
X is even if there is a relation r between X and Y, and (comma means and in Prolog, IIRC) if Y is odd.
>odd(X):- r(X,Y), even(Y).
X is odd if there is a relation r between X and Y, and Y is even.
I don't understand the rest of the second half of your reply, at present, although I did think about it a bit. Maybe I need to study it more, or maybe I lack some prerequisite knowledge.
>I don't know of any principled way to order clauses in a program to guarantee termination. That would be interesting to have.
Yes, I too think it would. That was the idea and question I originally asked about. I did intuitively think it was a difficult problem, but who knows, it may get solved in the future. I think that would make Prolog quite a bit more powerful.
Anyway, good to know that my idea was not totally off base, at least. Cheers :)
Apologies, I assumed a background in logic programming - I should have explained things better. Yes, your interpretation of the second program is correct, and yes, your idea is not off base at all :)
To try and put it another way to the other answers: Python functions take inputs in parentheses, do a calculation, and return results to the left of the equals sign. A definite arrow-of-time: inputs first, then calculation, then results. E.g. in this line of code you pass in a width and height, and out comes an area, no other way round:
area = rectangle_area(width, height)
Prolog doesn't have that arrow-of-time; instead (subtle difference) you declare that Length, Width and Area are related by equality and multiplication in some abstract sense:
rectangle_length_width_area(Length, Width, Area)
It's a pattern, not a computation; as well as the obvious giving Length and Width and calculating Area, you can fill in Area and Width and get Length. You can fill in just Length and get Widths and Areas (plural) that could go with a fixed Length. You give whichever values you have and Prolog will search for the missing ones, trying to match the pattern of how they relate to each other. If you give values for all of them, it will check them and see whether they do relate to each other in this rectangular way, and tell you yes or no. If there might be more than one value that can fill in the gap(s) you left, it can generate many answers.
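A rough sketch of that relational style in SWI-Prolog, using the clpfd library so the arithmetic works in every direction (the predicate name is made up for this example):

:- use_module(library(clpfd)).

% One clause relates all three quantities; no direction is implied.
rectangle_length_width_area(Length, Width, Area) :-
    Area #= Length * Width.

So rectangle_length_width_area(3, 4, A) gives A = 12, and rectangle_length_width_area(L, 4, 12) gives L = 3, from the same clause.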
This is one of the ways Prolog can do impressive things - the code you write can express more, and do so at a more abstract level, than the same amount of code in Python. The downside is it can bloat past what any computer could run in a reasonable time. Function modes remove some of the Prolog magic and add constraints, e.g. Length and Width are inputs, Area is an output. Slap on an arrow-of-time, lose the general-purpose search and solve, bring Prolog down from magic to mortal realms, gain something that runs on a normal computer.
I think it's more related to Prolog than to functions. In Prolog you can use a predicate in many ways. For example, append(L1, L2, L3) declares that the concatenation of the lists L1 and L2 results in L3, but you can use it in a "reverse" way too (thanks to Prolog awesomeness), like append(L1, L2, [1,2,3]), and Prolog will enumerate all possibilities for L1 and L2. This can be slow though, and I think 'mode' is a way to declare all these different ways to use a predicate, e.g. (in, in, out), (out, out, in), etc., and maybe compile things efficiently. See https://www.mercurylang.org/information/doc-latest/mercury_r...
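For instance, here is roughly what that query prints at the SWI-Prolog top level:

?- append(L1, L2, [1,2,3]).
L1 = [],
L2 = [1, 2, 3] ;
L1 = [1],
L2 = [2, 3] ;
L1 = [1, 2],
L2 = [3] ;
L1 = [1, 2, 3],
L2 = [] .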
It's Saturday and I'm on my third coffee (in a very tiny cup) so I hope you will
excuse me a mini-essay to answer your question :)
Logic programming languages like Mercury adopt the syntax and semantics of the
predicate calculus, a.k.a. First Order Logic. FOL is the language of predicates,
where predicates are _relations_ between propositional terms. "Propositional"
here means that a term is a, well, propositional logic object or variable. For
example the objects "stassa" and "mercury" are propositions while the formula
"likes(stassa, mercury)" is a relation between the two (and "likes" itself is
the predicate of the relation).
Now, a "relation" is a generalisation of a function. Recall that functions are
mathematical objects that map input terms to output terms, for example the
function ƒ: X → Y maps elements of the set X to elements of the set Y.
Unlike functions, relations have no concept of "inputs" and "outputs". For
example, take "likes(stassa, mercury)". Suppose "stassa" is an element in the
set of persons (sorry) and "mercury" is an element of the set of programming
languages. The following FOL atomic formulae are both true:
∃y: likes(stassa, y)
∃x: likes(x, mercury)
More precisely, they are true given an interpretation (an assignment of truth
values to atomic formulae) that assigns the value "true" to the atomic formula
"likes(stassa,mercury)". "Atomic" just means these are the simplest kinds of
formulae in FOL syntax. The thing to keep from this is that a truth value can be
assigned to atomic formulae in FOL regardless of which argument is a variable or
a constant term.
By contrast, to implement likes/2 as a function [1] we would have
to split it in two: one function taking as input a "person" and returning a
"language" and the other taking as input a "language" and returning a "person":
ƒ₁: P → L, e.g. ƒ₁(stassa) = mercury
ƒ₂: L → P, e.g. ƒ₂(mercury) = stassa
How this translates in logic programming, and particularly Prolog, practice, is
that Prolog programs can be called with different combinations of instantiated
and non-instantiated arguments.
Here's an example (the typical example is the "append" predicate, mentioned in
the mercury site above, but this should be simpler). The "%" denotes a comment
in Prolog and the example works in Swi-Prolog:
?- length([a,b,c,d], N). % (1)
N = 4.
?- length(List, 3). % (2)
List = [_11524, _11530, _11536]
(Note that the underscored terms are variables created by the Swi-Prolog
internals).
length/2 is a Prolog predicate (a logic program) that determines the length of a
Prolog list (a linked list represented as [H|T] where H is the head element and
T is the rest of the list). The two queries above show that length/2 can be
called with either the first or second argument instantiated to a fully-ground
term (a term that has no variables) and length/2 will work in a way that makes
sense.
The first call, in (1), is made with a ground list "bound" (instantiated) to the
first argument of length/2 and with a free variable (N) as its second argument
(variables start with uppercase letters or the underscore, in Prolog). When
length/2 exits, it binds the constant "4" (an integer) to the second argument
and Prolog reports that the variable N is now bound to 4. Thus length/2 has
determined the length of the list [a,b,c,d]. The second call leaves the first
argument as a variable and binds the integer "3" to the second argument. Now
length/2 _generates_ a list of the given length. The elements of the generated
list are all free variables, a detail of the implementation of length/2.
It is also possible to call length/2 with _both_ arguments as free variables:
?- length(List, N).
List = [],
N = 0 ;
List = [_13650],
N = 1 ;
List = [_13650, _14782],
N = 2 ;
List = [_13650, _14782, _15914],
N = 3 .
In this case, length/2 generates lists of increasing length, again a detail of
its implementation ("[]" is the empty list, of length "0" and ";" asks for more
results).
Now, try to think what would happen if we made the following query; the
findall/3 predicate is used to generate all results of a predicate call:
?- findall(List-N, length(List,N), LNs). % (3)
Well, here's what happens: in Swi-Prolog the query eventually aborts with an
out-of-stack error (something like "ERROR: Out of global stack", depending on
the version). The reason is that there is no limit to the number of lists of arbitrary
length that can be generated by length/2, so findall/3 continues to generate
them until the Prolog interpreter runs out of its allocated resources. Even
given infinite resources the query in (3) would never terminate: it would
continue to generate lists of increasing length for ever.
A straightforward way to avoid this situation is to avoid calling length/2 with
both arguments as free variables, at least when we want to generate all
solutions. So we could mandate that if one argument is a variable, the other one
must be bound to an at least partially ground term.
In logic programming convention, we represent these acceptable instantiations
with various notations, the most common of which is to use "+" to mean that an
argument must be instantiated on call, "-" to mean that an argument must be left
as a free variable and "?" to mean that either is acceptable.
For length/2, we could declare:
length(+, -).
length(-, +).
These are the "modes" in which the programmer of length/2 communicates to users
how the program can be called. We can add some additional information:
length(+, -) is det.
length(-, +) is det.
"det" meaning that the predicate is deterministic _in that mode_, where
"deterministic" means that a call of the predicate at the declared mode succeeds
once, yielding one result, or fails if a result cannot be produced given the
arguments. We could also declare the (?,?) mode as _nondeterministic_ meaning
that it generates multiple results (or fails, once):
length(?, ?) is nondet.
This last mode declaration still allows the dangerous call in (3) above, but at
least now the programmer knows to expect this.
Now, these call and determinism modes are a _convention only_ in Prolog and most
other logic programming languages, meaning that they are solely used for
documentation purposes and are not enforced by interpreters or compilers.
Mercury instead makes mode declarations a part of the syntax of the language.
The Mercury website claims that this helps to catch a lot of bugs. While I have
not actually used Mercury, I certainly agree that having no clear inputs and
outputs, or determinism constraints, is a major source of bugs in Prolog code
and enforcing mode declarations is sure to help avoid some major sources of pain
when programming in Prolog. How good is Mercury at doing that, I can't tell. As
the website says, detecting determinism is undecidable but there seem to be some
heuristics that work "often enough".
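For a flavour of what this looks like in Mercury source, here is an illustrative length-like predicate with its mode and determinism declared up front (my own sketch, not the standard library's list.length):

:- module len_demo.
:- interface.
:- import_module list, int.

% Declared mode and determinism: first argument in, second out, exactly
% one solution. The compiler checks this instead of just documenting it.
:- pred len(list(T)::in, int::out) is det.

:- implementation.

len([], 0).
len([_ | Xs], N) :-
    len(Xs, M),
    N = M + 1.

Calling it in a mode that hasn't been declared (say, with the first argument unbound) is then a compile-time error rather than a silent source of non-termination.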
________________
[1] I'm fudging the definition of "function" between mathematical functions,
that are typically not discussed in terms of "inputs" and "outputs", or
"returns", and programming-language functions, which are. Apologies, I hope the
meaning is clear from the context.
It targets GNU C extensions if available. Are there many that are useful, type safe, memory safe and not in LLVM? Because I suspect reasoning about back-end compiled state is a thing for a strongly typed deterministic language, and GNU extensions are funny.
The thing about how it's syntactic sugar over pure-ish Prolog was fascinating. (It talks about how it's self-hosted, bootstrapped up from Prolog because they could.)
Things get better if you try for a couple of hours rather than minutes. If you've never been exposed to the logic programming paradigm this is completely understandable, isn't it?
https://en.wikipedia.org/wiki/Prince_(software) https://yeslogic.com/blog/allsorts-rust-font-shaping-engine/