In this post I’ll introduce a style of programming which may be totally unfamiliar and perhaps a bit magical to many readers. What if you could write a unit test for a problem and have your compiler automatically return a correct implementation? Call it "test only" instead of "test first." I’m going to translate the problem itself into a specification language and use Microsoft Research’s Z3 theorem prover to find a solution. I won’t write any implementation code at all!

### A Simple Problem

Here’s the problem I’ll use as an example, which is problem #4 from the Project Euler:

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.

Find the largest palindrome made from the product of two 3-digit numbers.

The usual approach is to use brute force. Here’s a C# example, which I suspect is similar to what many people do:

`(`

, Factor2 = factor2, Product = product}).First()fromfactor1inEnumerable.Range(100, 899)fromfactor2inEnumerable.Range(100, 899)letproduct = factor1 * factor2whereIsPalindrome(product)// defined elsewhereorderbyproductdescendingselect new{ Factor1 = factor1

This is not a terrible solution. It runs pretty quickly and returns the correct solution, and we can see opportunities for making it more efficient. I suspect most people would declare the problem finished and move on.

However, the LINQ syntax may obscure the fact that this is still a brute force solution. Any time we have to think about how to instruct a computer to find the answer to the problem instead of the characteristics of the problem itself, we add cognitive overhead and increase the chances of making a mistake.

Also, what is that `IsPalindrome(product)`

hiding? Most people implement this by converting the number to a string, and comparing it with the reversed value. But it turns out that the mathematical properties of a palindromic number are critical to finding an efficient solution.

Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do `IsPalindrome`

this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)

### Solving with Pure Specification

For my purely declarative solution, I’m going to use the language SMT-LIB. As a pure specification language, it doesn’t allow me to define an implementation at all! Instead, I’ll use it to express the constraints of the problem, and then use MSR’s Z3 solver to find a solution. SMT-LIB uses Lisp-like S-expressions, so, yes Virginia, there will be parentheses, but don’t let that scare you off. It’s always worthwhile to learn languages which make you think about problems differently, and I think you’ll find SMT-LIB really delivers!

Since this language will seem unusual to many readers, let’s walk through the problem step by step.

First, we need to declare some of the variables used in the problem. I use "variable" here in the mathematical, rather than software, sense: A placeholder for an unknown, but not something to which I’ll be assigning varying values. So here are three variables roughly equivalent to the corresponding C# vars above:

`(`

declare-constproduct Int) (declare-constfactor1 Int) (declare-constfactor2 Int)

In an S-expression, the first item inside the parentheses is the function, and the remaining items are arguments. So

is the function here and the remaining items are the variable name and its "sort" (type).**declare-const**

Next, the problem says that `product`

must be the, ahem, product of the two factors:

`(`

assert(=(*factor1 factor2) product))

"

" sounds like a unit test, doesn’t it? Indeed, many people coming to a theorem prover from a software development background will find that programming them is much more similar to writing tests than writing code. The line above just says that **assert**`factor1 * factor2 = product`

. But it’s an assertion, not an assignment; we haven’t specified values for any of these variables.

The problem statement says that both factors are three digit numbers:

`(`

assert(and(>=factor1 100) (<=factor1 999))) (assert(and(>=factor2 100) (<=factor2 999)))

Mathematically, what does it mean for a number to be a palindrome? In this case, the largest product of 3 digit numbers is going to be a six digit number of the form **abccba**, so product = 100000**a** + 10000**b** + 1000**c** + 100**c** + 10**b** + **a**. As I noted above, expressing the relationship this way is key to finding a non-brute-force solution using algebra. But you don’t need to know that in order to use Z3, because Z3 knows algebra! All you need to know is that you should express relationships mathematically instead of using string manipulation.

`(`

declare-consta Int) (declare-constb Int) (declare-constc Int) (assert(=product (+(*100000 a) (*10000 b) (*1000 c) (*100 c) (*10 b) a)))

I implied above that **a**, **b**, and **c **are single-digit numbers, so we need to be specific about that. Also, **a** can’t be 0 or we won’t have a 6 digit number.

`(`

assert(and(>=a 1) (<=a 9))) (assert(and(>=b 0) (<=b 9))) (assert(and(>=c 0) (<=c 9)))

These 4 assertions about a, b, and c are enough to determine that product is a palindrome. We’re not quite done yet, but let’s see how we’re doing so far. `(`

asks Z3 if there is a solution to the problem we’ve posed, and **check-sat**)`(`

displays that solution. Here’s the entire script so far:**get-model**)

`(`

declare-constproduct Int) (declare-constfactor1 Int) (declare-constfactor2 Int) (assert(and(>=factor1 100) (<factor1 1000))) (assert(and(>=factor2 100) (<factor2 1000))) (assert(=(*factor1 factor2) product)) (declare-consta Int) (declare-constb Int) (declare-constc Int) (assert(and(>=a 1) (<=a 9))) (assert(and(>=b 0) (<=b 9))) (assert(and(>=c 0) (<=c 9))) (assert(=product (+(*100000 a) (*10000 b) (*1000 c) (*100 c) (*10 b) a))) (check-sat) (get-model)

When you run this through Z3 (try it yourself!), the solver responds:

`sat (model (define-fun c () Int 1) (define-fun b () Int 0) (define-fun a () Int 1) (define-fun product () Int 101101) (define-fun factor2 () Int 143) (define-fun factor1 () Int 707) )`

That’s pretty good! `sat`

, here, means that Z3 found a solution (it would have displayed `unsat`

if it hadn’t). Eliding some of the syntax, the solution it found was 143 * 707 = 101101. Not bad for zero implementation code, but also not the answer to the Project Euler problem, which asks for the *largest* product.

### Optimization

"Optimization," in Z3 parlance, refers to finding the "best" proof for the theorem, not doing it as quickly as possible. But how do we tell Z3 to find the largest product?

(**Update: **I had a mistake in the original version of this post, and so I’ve significantly changed this section.)

Z3 has a function called `maximize`

, but it’s a bit limited. If I try adding `(`

, Z3 complains:**maximize **product)

`Z3(15, 10): ERROR: Objective function '(* factor1 factor2)' is not supported`

After some fiddling, however, it seems `(`

works, sort of. Adding this to the script above causes Z3 to return:**maximize **(**+ **factor1 factor2))

`(+ factor1 factor2) |-> [1282:oo] unknown --...`

Which is to say, Z3 could not find the maximal value. ("oo" just means ∞, and `unknown`

means it could neither prove nor disprove the theorem.) Guessing that **a** might be bigger than 1, I can change its range to 8..9 and Z3 arrives at a single solution:

`(+ factor1 factor2) |-> 1906 sat (model (define-fun b () Int 0) (define-fun c () Int 6) (define-fun factor1 () Int 913) (define-fun factor2 () Int 993) (define-fun a () Int 9) (define-fun product () Int 906609) )`

The final script is:

`(`

declare-constproduct Int) (declare-constfactor1 Int) (declare-constfactor2 Int) (assert(and(>=factor1 100) (<factor1 1000))) (assert(and(>=factor2 100) (<factor2 1000))) (assert(=(*factor1 factor2) product)) (declare-consta Int) (declare-constb Int) (declare-constc Int) ((assert(and(>=a 8 ) (<=a 9)))assert(and(>=b 0) (<=b 9))) (assert(and(>=c 0) (<=c 9))) (assert(=product (+(*100000 a) (*10000 b) (*1000 c) (*100 c) (*10 b) a))) (maximize(+factor1 factor2)) (check-sat) (get-model)

This bothers me just a little, since I had to lie ever so slightly about my objective, even though I did end up with the right answer.

That’s just a limitation of Z3, and it may be fixed some day; Z3 is under active development, and the optimization features are not even in the unstable or master branches. But think about what has been achieved here: We’ve solved a problem with nothing but statements about the properties of the correct answer, and without any "implementation" code whatsoever. Also, using this technique forced me to think deeply about what the problem actually meant.

## { 22 } Comments

Color me skeptical. The concept of "purely declarative programming" has been around for a looooong time, and it fails to one degree or another every time it’s introduced, because the fundamental fact is that programming is complex and complexity doesn’t just go away when you sweep it under a layer of abstraction. Possibly the best success story for declarative programming is SQL, and anyone with experience in databases knows all about the warty side of [insert RDBMS of choice here.]

Mason, it’s not an either/or choice. Most real world use of Z3 and related systems right now uses them hand and hand with a more conventional language. For example, Microsoft uses Z3 and the VCC compiler to look for faults in the Hyper-V memory manager. I will give several additional examples of real-world use in my next post on this topic.

The sales pitch of large-scale, purely declarative systems is, at best, premature, but no single tool can build such systems by itself today anyway. Z3 and other proving systems are a tool in your toolbox. Don’t believe anyone who claims that a single tool is right for everything, even, or especially, if it’s C++!

Oh, believe me, I haven’t believed C++ is the right tool for

anythingsince I first started learning it back in college.Possibly stupid question:

You say that (maximize (* factor1 factor2)) does not work and you seem to imply it’s a temporary limitation of Z3.

Maybe Z3 is using a simplex (or something similar to a simplex) to maximize the function?

If this is true, the objective function as a sum (instead of a product) is sort of mandatory, isn’t it?

Why did you need to change "(assert (and (>= a 1) (<= a 9)))" with "(assert (= a 9))"? Turns out that the former doesn’t find the correct solution, instead it yields 137*803=110011. Whatever the reason I think it’s worth mentioning in the article.

Hmmm, calling declarative programs tests, and therefore saying they are better than programming per-se is a bit fatuous. Also, code leans on tests and tests lean on code. having one without the other is bad. There is less ‘double checking’ opportunities. Whats to stop you having faulty assertions/tests? Just as code without tests can have bugs, so ‘tests’ without code have nothing to verify them.

This is an interesting post, thanks.

Funnily enough we are also currently looking at encoding some of the Euler problems in the B formal language and solving them using our ProB constraint solver.

A B solution is as follows (you can directly paste this into our online Logic Calculator at: http://www.stups.uni-duesseldorf.de/ProB/index.php5/ProB_Logic_Calculator):

num = %s.(s:seq(0..9)|SIGMA(i).(i:dom(s)|s(i)*10**(i-1))) /* convert a sequence of digits into a number */

&

all_palindromes = {x,y,s,p| {x,y} <: 100..999 & x<y &

size(s)=6 & p=x*y & num(s) = p & num(rev(s))=p} &

best_palindrome = max(ran(all_palindromes)) &

best_factors = dom(all_palindromes~[{best_palindrome}])

ProB does not (yet) have optimisation via an objective function; so we find all solutions here. Runtime is 0.23 seconds on my MacBook Air. I would be interested in knowing the runtime of the Z3 solution. It would be nice if one could avoid having to hardcode the 6 digits (both in the Z3 as well as in the B solution).

A version of the B model using unicode operators is as follows (in the hope this is more readable):

num = λs.(s ∈ seq(0 ‥ 9)|∑(i).(i ∈ dom(s)|s(i) * 10 ↑ (i - 1))) ∧ all_palindromes = {x,y,s,p|x ∈ 100 ‥ 999 ∧ y ∈ 100 ‥ 999 ∧ x < y ∧ size(s) = 6 ∧ p = x * y ∧ num(s) = p ∧ num(rev(s)) = p} ∧

best_palindrome = max(ran(all_palindromes)) ∧

best_factors = dom((all_palindromes~)[{best_palindrome}])

What follows is an opinion.

This is a step up from Prolog. Prolog has proven itself to be a great tool for reasoning about stuff that’s not easily reasoned about in a C-like programming language. But reasoning in it is really hard anyway. And the flexibility and performance and production-use capabilities are, well, not there. I’ve been on a project that started as a Prolog program and was quickly ported to C++ because of all reasons I already mentioned.

This idea is headed towards the same fate.

Prolog is an awesome exercise for the brain. This will be as well. Therefore thet’re academic (I might event say toys, but I respect people producing those tools very much and they are truly awesome)

Both this and Prolog is about abstracting algorithms away. We’re not ready to move away from algorithms. In a future near the Singularity we’ll be able to do that and program machines by defining the result. But it’s not something that waits around the corner.

Everytime someone talks about automatic code generation without programming they come with just another programming language. SMT-LIB, UML or SQL are just programming languages. The fact that they are declarative or graphic is only a matter of representation. It does not make programming easier. Since James Martin "Application programming without programmers" (1982) and the CASE tools myth, people keeps believing in automatic program generation, but the fact is that there are no silver bullets, as Frederick Brooks said.

Paolo: No, not a stupid question at all. The default optimizer is indeed primal Simplex, so you may be right as far as that goes. There are other solvers available, but I don’t understand all of the options. There is some discussion on the optimization tutorial.

George: Thanks; I made a mistake in editing the article. It’s an important point, so thanks for bringing it to my attention! I’ve updated that section.

Peter: I don’t think I claimed that pure specification is "better than programming." Indeed, I think this

isprogramming. I quite agree that having a "double entry" system is worthwhile! The interesting part here isn’t that there is only specification, but rather than the specification can stand on its own. We want to prove that the specification itself is complete and correct and self-consistent. Compare with the "specification" in the Project Euler problem which is "just words."Michael: Nice! Don’t think you need x > y; your solver seems to work fine without that term. Noticed you support TLA+, which is something I’d like to discuss in the future. If you want to try Z3 on your own hardware to compare, you can build it from source code.

Enrique: In my opinion, this

isprogramming. I don’t want to get rid of programming; I like programming. I want to make programs which arecorrect.@Enrique @Mason, sure declarative programming has not been as successful as we wish it would have. But to suggest that it should not be attempted is shortsighted. Z3 is one of the best tools that have been made, which can perhaps achieve it some day.

In theory you could express maximize as the absence of a larger solution, but sadly Z3 doesn’t seem to handle it: http://rise4fun.com/Z3/pq6A .

Interesting ideal, Laurentiu. When Z3 silently returns nothing in the web interface, one can sometimes work around it by increasing the timeout:

(set-option :timeout 100000)

…but that doesn’t seem to help in this case.

I tried with the (admittedly old) Z3 release from CodePlex and it gave me "unknown" after an hour or so.

"Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do IsPalindrome this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)"

I’d like to know how.

Hodor, short version is you’re looking for two factors of three digits, call them def and ghi, which equal abccda. But you know that a is almost certainly 9, and hence d and g are almost certainly 9 as well. Meaning f and i are either both 3 or are 9 and 1. To get the rest, you have to write out the multiplication: abccba = fi + 10ei + 100 di + 10fh + … and solve. But because we’ve already guessed so many of the individual digits, the solution is much easier than it seems at first glance.

Hi,

great post! I’m working on something similar, although using genetic programming, so it’s very much more generating program approximations. It requires a lot of time to find solutions, only works for simple programs, but requires less strictness in specifying a scoring function. See e.g., https://github.com/cslarsen/crianza/blob/master/examples/genetic/square-number.py#L40-L67 (yes, this is almost embarrasing to post; it finds a square function for you).

Anyway, I tried running your final optimized script with Z3. I’m a z3 novice, btw, but I put it into test.smt and did `z3 -smt2 test.smt`. The output was:

unsupported

; maximize

sat

(model

(define-fun b () Int

0)

(define-fun a () Int

(define-fun product () Int

801108)

(define-fun factor1 () Int

867)

(define-fun factor2 () Int

924)

(define-fun c () Int

1)

)

If I change the constraint of a to > 8 then it produces the correct result. Did I invoke it incorrectly? I ran it with the newest version of z3 on github.

Thanks for a great post!

Christian, the problem you’re having is "unsupported maximize". That means you’re not in the optimization branch.

## { 3 } Trackbacks

[...] I’ve written about proving systems in the past, and will have more to say in the future, but today we’ll talk about homomorphic encryption. [...]

[...] at Dog Food Conference 2014, 29-30 September, in Columbus, Ohio. If you found my post on "Test-Only Development" with the Z3 Theorem Prover was interesting, then you’ll love [...]

[...] Is there a solution which is as efficient as just returning 180 but which also proves that 180 is, in fact, the correct solution to the problem? Let’s encode the specification for the problem in a pure specification language, SMT-LIB: [...]

## Post a Comment