Skip to content

"Test-Only Development" with the Z3 Theorem Prover

In this post I’ll introduce a style of programming which may be totally unfamiliar and perhaps a bit magical to many readers. What if you could write a unit test for a problem and have your compiler automatically return a correct implementation? Call it "test only" instead of "test first." I’m going to translate the problem itself into a specification language and use Microsoft Research’s Z3 theorem prover to find a solution. I won’t write any implementation code at all!

A Simple Problem

Here’s the problem I’ll use as an example, which is problem #5 from the Project Euler:

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.

Find the largest palindrome made from the product of two 3-digit numbers.

The usual approach is to use brute force. Here’s a C# example, which I suspect is similar to what many people do:

(from    factor1 in Enumerable.Range(100, 899)
 from    factor2 in Enumerable.Range(100, 899)
 let     product = factor1 * factor2
 where   IsPalindrome(product) // defined elsewhere
 orderby product descending
 select  new { Factor1 = factor1, Factor2 = factor2, Product = product}).First()

This is not a terrible solution. It runs pretty quickly and returns the correct solution, and we can see opportunities for making it more efficient. I suspect most people would declare the problem finished and move on.

However, the LINQ syntax may obscure the fact that this is still a brute force solution. Any time we have to think about how to instruct a computer to find the answer to the problem instead of the characteristics of the problem itself, we add cognitive overhead and increase the chances of making a mistake.

Also, what is that IsPalindrome(product) hiding? Most people implement this by converting the number to a string, and comparing it with the reversed value. But it turns out that the mathematical properties of a palindromic number are critical to finding an efficient solution.

Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do IsPalindrome this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)

Solving with Pure Specification

For my purely declarative solution, I’m going to use the language SMT-LIB. As a pure specification language, it doesn’t allow me to define an implementation at all! Instead, I’ll use it to express the constraints of the problem, and then use MSR’s Z3 solver to find a solution. SMT-LIB uses Lisp-like S-expressions, so, yes Virginia, there will be parentheses, but don’t let that scare you off. It’s always worthwhile to learn languages which make you think about problems differently, and I think you’ll find SMT-LIB really delivers!

Since this language will seem unusual to many readers, let’s walk through the problem step by step.

First, we need to declare some of the variables used in the problem. I use "variable" here in the mathematical, rather than software, sense: A placeholder for an unknown, but not something to which I’ll be assigning varying values. So here are three variables roughly equivalent to the corresponding C# vars above:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)

In an S-expression, the first item inside the parentheses is the function, and the remaining items are arguments. So declare-const is the function here and the remaining items are the variable name and its "sort" (type).

Next, the problem says that product must be the, ahem, product of the two factors:

(assert (= (* factor1 factor2) product))

"assert" sounds like a unit test, doesn’t it? Indeed, many people coming to a theorem prover from a software development background will find that programming them is much more similar to writing tests than writing code. The line above just says that factor1 * factor2 = product. But it’s an assertion, not an assignment; we haven’t specified values for any of these variables.

The problem statement says that both factors are three digit numbers:

(assert (and (>= factor1 100) (<= factor1 999)))
(assert (and (>= factor2 100) (<= factor2 999)))

Mathematically, what does it mean for a number to be a palindrome? In this case, the largest product of 3 digit numbers is going to be a six digit number of the form abccba, so product = 100000a + 10000b + 1000c + 100c + 10b + a. As I noted above, expressing the relationship this way is key to finding a non-brute-force solution using algebra. But you don’t need to know that in order to use Z3, because Z3 knows algebra! All you need to know is that you should express relationships mathematically instead of using string manipulation.

(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))

I implied above that a, b, and c are single-digit numbers, so we need to be specific about that. Also, a can’t be 0 or we won’t have a 6 digit number.

(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))

These 4 assertions about a, b, and c are enough to determine that product is a palindrome. We’re not quite done yet, but let’s see how we’re doing so far. (check-sat) asks Z3 if there is a solution to the problem we’ve posed, and (get-model) displays that solution. Here’s the entire script so far:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(check-sat)
(get-model)

When you run this through Z3 (try it yourself!), the solver responds:

sat
(model
  (define-fun c () Int
    1)
  (define-fun b () Int
    0)
  (define-fun a () Int
    1)
  (define-fun product () Int
    101101)
  (define-fun factor2 () Int
    143)
  (define-fun factor1 () Int
    707)
)

That’s pretty good! sat, here, means that Z3 found a solution (it would have displayed unsat if it hadn’t). Eliding some of the syntax, the solution it found was 143 * 707 = 101101. Not bad for zero implementation code, but also not the answer to the Project Euler problem, which asks for the largest product.

Optimization

"Optimization," in Z3 parlance, refers to finding the "best" proof for the theorem, not doing it as quickly as possible. But how do we tell Z3 to find the largest product?

(Update: I had a mistake in the original version of this post, and so I’ve significantly changed this section.)

Z3 has a function called maximize, but it’s a bit limited. If I try adding (maximize product), Z3 complains:

Z3(15, 10): ERROR: Objective function '(* factor1 factor2)' is not supported

After some fiddling, however, it seems (maximize (+ factor1 factor2)) works, sort of. Adding this to the script above causes Z3 to return:

(+ factor1 factor2) |-> [1282:oo]
unknown --...

Which is to say, Z3 could not find the maximal value. ("oo" just means ∞, and unknown means it could neither prove nor disprove the theorem.) Guessing that a might be bigger than 1, I can change its range to 8..9 and Z3 arrives at a single solution:

(+ factor1 factor2) |-> 1906
sat
(model
  (define-fun b () Int
    0)
  (define-fun c () Int
    6)
  (define-fun factor1 () Int
    913)
  (define-fun factor2 () Int
    993)
  (define-fun a () Int
    9)
  (define-fun product () Int
    906609)
)

The final script is:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (and (>= a 8 ) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(maximize (+ factor1 factor2))
(check-sat)
(get-model)

This bothers me just a little, since I had to lie ever so slightly about my objective, even though I did end up with the right answer.

That’s just a limitation of Z3, and it may be fixed some day; Z3 is under active development, and the optimization features are not even in the unstable or master branches. But think about what has been achieved here: We’ve solved a problem with nothing but statements about the properties of the correct answer, and without any "implementation" code whatsoever. Also, using this technique forced me to think deeply about what the problem actually meant.

Can This Be Real?

At this point, you may have questions about doing software development in this way. Sure, it works fine for this trivial problem, but can you solve real-world problems this way? Is it fast? Are there any other tools with similar features? What are the downsides of working in this way? You may find the answers to these questions as surprising as the code above. Stay tuned!

{ 20 } Comments

  1. Mason Wheeler | July 7, 2014 at 3:11 pm | Permalink

    Color me skeptical. The concept of "purely declarative programming" has been around for a looooong time, and it fails to one degree or another every time it’s introduced, because the fundamental fact is that programming is complex and complexity doesn’t just go away when you sweep it under a layer of abstraction. Possibly the best success story for declarative programming is SQL, and anyone with experience in databases knows all about the warty side of [insert RDBMS of choice here.]

  2. Craig Stuntz | July 7, 2014 at 8:50 pm | Permalink

    Mason, it’s not an either/or choice. Most real world use of Z3 and related systems right now uses them hand and hand with a more conventional language. For example, Microsoft uses Z3 and the VCC compiler to look for faults in the Hyper-V memory manager. I will give several additional examples of real-world use in my next post on this topic.

    The sales pitch of large-scale, purely declarative systems is, at best, premature, but no single tool can build such systems by itself today anyway. Z3 and other proving systems are a tool in your toolbox. Don’t believe anyone who claims that a single tool is right for everything, even, or especially, if it’s C++!

  3. Mason Wheeler | July 8, 2014 at 12:16 pm | Permalink

    Oh, believe me, I haven’t believed C++ is the right tool for anything since I first started learning it back in college.

  4. Paolo Marino | July 10, 2014 at 3:32 am | Permalink

    Possibly stupid question:
    You say that (maximize (* factor1 factor2)) does not work and you seem to imply it’s a temporary limitation of Z3.

    Maybe Z3 is using a simplex (or something similar to a simplex) to maximize the function?
    If this is true, the objective function as a sum (instead of a product) is sort of mandatory, isn’t it?

  5. George Sakkis | July 10, 2014 at 3:57 am | Permalink

    Why did you need to change "(assert (and (>= a 1) (<= a 9)))" with "(assert (= a 9))"? Turns out that the former doesn’t find the correct solution, instead it yields 137*803=110011. Whatever the reason I think it’s worth mentioning in the article.

  6. Peter Morris | July 10, 2014 at 4:09 am | Permalink

    Hmmm, calling declarative programs tests, and therefore saying they are better than programming per-se is a bit fatuous. Also, code leans on tests and tests lean on code. having one without the other is bad. There is less ‘double checking’ opportunities. Whats to stop you having faulty assertions/tests? Just as code without tests can have bugs, so ‘tests’ without code have nothing to verify them.

  7. Michael Leuschel | July 10, 2014 at 4:24 am | Permalink

    This is an interesting post, thanks.
    Funnily enough we are also currently looking at encoding some of the Euler problems in the B formal language and solving them using our ProB constraint solver.
    A B solution is as follows (you can directly paste this into our online Logic Calculator at: http://www.stups.uni-duesseldorf.de/ProB/index.php5/ProB_Logic_Calculator):
    num = %s.(s:seq(0..9)|SIGMA(i).(i:dom(s)|s(i)*10**(i-1))) /* convert a sequence of digits into a number */
    &
    all_palindromes = {x,y,s,p| {x,y} <: 100..999 & x<y &
    size(s)=6 & p=x*y & num(s) = p & num(rev(s))=p} &
    best_palindrome = max(ran(all_palindromes)) &
    best_factors = dom(all_palindromes~[{best_palindrome}])

    ProB does not (yet) have optimisation via an objective function; so we find all solutions here. Runtime is 0.23 seconds on my MacBook Air. I would be interested in knowing the runtime of the Z3 solution. It would be nice if one could avoid having to hardcode the 6 digits (both in the Z3 as well as in the B solution).

    A version of the B model using unicode operators is as follows (in the hope this is more readable):
    num = λs.(s ∈ seq(0 ‥ 9)|∑(i).(i ∈ dom(s)|s(i) * 10 ↑ (i - 1))) ∧ all_palindromes = {x,y,s,p|x ∈ 100 ‥ 999 ∧ y ∈ 100 ‥ 999 ∧ x < y ∧ size(s) = 6 ∧ p = x * y ∧ num(s) = p ∧ num(rev(s)) = p} ∧
    best_palindrome = max(ran(all_palindromes)) ∧
    best_factors = dom((all_palindromes~)[{best_palindrome}])

  8. naugtur | July 10, 2014 at 6:51 am | Permalink

    What follows is an opinion.

    This is a step up from Prolog. Prolog has proven itself to be a great tool for reasoning about stuff that’s not easily reasoned about in a C-like programming language. But reasoning in it is really hard anyway. And the flexibility and performance and production-use capabilities are, well, not there. I’ve been on a project that started as a Prolog program and was quickly ported to C++ because of all reasons I already mentioned.
    This idea is headed towards the same fate.

    Prolog is an awesome exercise for the brain. This will be as well. Therefore thet’re academic (I might event say toys, but I respect people producing those tools very much and they are truly awesome)

    Both this and Prolog is about abstracting algorithms away. We’re not ready to move away from algorithms. In a future near the Singularity we’ll be able to do that and program machines by defining the result. But it’s not something that waits around the corner.

  9. Enrique | July 10, 2014 at 7:31 am | Permalink

    Everytime someone talks about automatic code generation without programming they come with just another programming language. SMT-LIB, UML or SQL are just programming languages. The fact that they are declarative or graphic is only a matter of representation. It does not make programming easier. Since James Martin "Application programming without programmers" (1982) and the CASE tools myth, people keeps believing in automatic program generation, but the fact is that there are no silver bullets, as Frederick Brooks said.

  10. Craig Stuntz | July 10, 2014 at 8:20 am | Permalink

    Paolo: No, not a stupid question at all. The default optimizer is indeed primal Simplex, so you may be right as far as that goes. There are other solvers available, but I don’t understand all of the options. There is some discussion on the optimization tutorial.

  11. Craig Stuntz | July 10, 2014 at 8:21 am | Permalink

    George: Thanks; I made a mistake in editing the article. It’s an important point, so thanks for bringing it to my attention! I’ve updated that section.

  12. Craig Stuntz | July 10, 2014 at 8:25 am | Permalink

    Peter: I don’t think I claimed that pure specification is "better than programming." Indeed, I think this is programming. I quite agree that having a "double entry" system is worthwhile! The interesting part here isn’t that there is only specification, but rather than the specification can stand on its own. We want to prove that the specification itself is complete and correct and self-consistent. Compare with the "specification" in the Project Euler problem which is "just words."

  13. Craig Stuntz | July 10, 2014 at 8:34 am | Permalink

    Michael: Nice! Don’t think you need x > y; your solver seems to work fine without that term. Noticed you support TLA+, which is something I’d like to discuss in the future. If you want to try Z3 on your own hardware to compare, you can build it from source code.

  14. Craig Stuntz | July 10, 2014 at 8:35 am | Permalink

    Enrique: In my opinion, this is programming. I don’t want to get rid of programming; I like programming. I want to make programs which are correct.

  15. Arturo Hernandez | July 10, 2014 at 9:15 am | Permalink

    @Enrique @Mason, sure declarative programming has not been as successful as we wish it would have. But to suggest that it should not be attempted is shortsighted. Z3 is one of the best tools that have been made, which can perhaps achieve it some day.

  16. Laurentiu Nicola | July 16, 2014 at 6:50 am | Permalink

    In theory you could express maximize as the absence of a larger solution, but sadly Z3 doesn’t seem to handle it: http://rise4fun.com/Z3/pq6A .

  17. Craig Stuntz | July 16, 2014 at 9:55 am | Permalink

    Interesting ideal, Laurentiu. When Z3 silently returns nothing in the web interface, one can sometimes work around it by increasing the timeout:

    (set-option :timeout 100000)

    …but that doesn’t seem to help in this case.

  18. Laurentiu Nicola | July 16, 2014 at 10:18 am | Permalink

    I tried with the (admittedly old) Z3 release from CodePlex and it gave me "unknown" after an hour or so.

  19. Hodor | July 21, 2014 at 2:34 pm | Permalink

    "Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do IsPalindrome this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)"

    I’d like to know how.

  20. Craig Stuntz | July 25, 2014 at 8:33 am | Permalink

    Hodor, short version is you’re looking for two factors of three digits, call them def and ghi, which equal abccda. But you know that a is almost certainly 9, and hence d and g are almost certainly 9 as well. Meaning f and i are either both 3 or are 9 and 1. To get the rest, you have to write out the multiplication: abccba = fi + 10ei + 100 di + 10fh + … and solve. But because we’ve already guessed so many of the individual digits, the solution is much easier than it seems at first glance.

{ 1 } Trackback

  1. [...] I’ve written about proving systems in the past, and will have more to say in the future, but today we’ll talk about homomorphic encryption. [...]

Post a Comment

Your email is never published nor shared. Required fields are marked *

Bad Behavior has blocked 713 access attempts in the last 7 days.

Close