Designing for Problems Too Big to Test

In this post, I will show an example of where using unit testing as a design methodology does not work, and how to produce a design for correct code anyway. There is no single design methodology which works for all problems, so it’s useful to have a variety of tools at your disposal.

This post is my contribution to the 2015 F# Advent Calendar.

I’m implementing a compiler for a tiny language without use of external libraries for things like parsing and code generation. The idea is to produce a minimal example of a purely functional compiler. This is an ongoing project, and some parts are further along than others, but you can see the source code as I work, and it does produce working EXEs today.

Designing a compiler is harder than many problems in programming, because it does something difficult: a compiler must be able to accept any string and either produce an equivalent program or explain clearly to a human being why that string is not a valid program. And there are a lot of possible strings!

Designing a compiler is also easier than many problems in programming, because there exist formal methods for solving many of the harder sub-problems in the design. You can think of "formal methods," here, as recipes for a solution, but very special recipes which guarantee that you’ll touch all possible cases in the problem space.

Design Methodologies

Over the years, programmers have used a number of different methodologies when approaching program design. These include:

  • The Big Ball of Mud. Arguably the most common methodology, this involves just poking at the problem until things seem to work for the most common case, maybe.
  • Big Design Up Front. In this methodology, a full specification for the implementation of the system is developed before coding begins. Many people consider this obsolete or, at best, wildly inefficient.
  • Test Driven Design. I’m going to distinguish this from test driven development, because tests are useful both as a methodology for program design and for implementing a program design. In practice, people tend to combine these. As a design methodology, the general idea is that you start with either high or low level simple cases, and continue working until a design evolves. Some people divide this into sub-schools of test driven design.

Despite its ubiquity, few defend the big ball of mud as a design practice. Big design up front is widely ridiculed. That leaves TDD as the most prevalent design methodology that people are willing to publicly defend. Unfortunately, testing, while useful, is fundamentally limited.

"…program testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence." Edsger Dijkstra

In cases where "the happy path" is far more prevalent than "edge cases," this might not seem to be a show-stopping limitation, and test driven design works OK in many cases.

There Are No Edge Cases In Programming Languages

As noted above, a compiler must be able to accept any string and either produce an equivalent program or explain clearly to a human being why that string is not a valid program. A compiler designer cannot predict the valid programs people may write, nor the errors they may create.

For example, let’s consider Duff’s Device. It’s safe to presume that Brian Kernighan and Dennis Ritchie did not have this in mind when they designed the C programming language. For the uninitiated, Duff’s Device nests a do-while loop inside of a switch statement:

send(to, from, count)
register short *to, *from;
register count;
{
    register n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}

This is unreadable to the point that it borders on obfuscation, but is legal C, per the specification, and does perform a useful optimization on a particular case. I bring it up because, as a language implementer, I think it drives home the point that there is no possibility of creating (anywhere near) all of the possible unit tests for all of the possible ways someone might choose to use your language.

Different Tasks, Different Design Methodologies

In many programming tasks, the number of "happy path" cases is similar to the number of edge and error cases, at least within an order of magnitude. In these cases it’s probably possible to exhaustively test the system, even if people don’t usually bother to do so.

For other tasks, however, the number of "edge cases" is intractably large. I gave a programming language example above, but there are others, such as designing an API for a cloud service. In these cases, we need a design methodology which gives us some assurance that our design will work with cases that we did not and could not possibly produce tests for, because real-world use cases will vastly outnumber our test cases.

Formal Methods

The solution to this problem is to break the problem space into a countable number of conditions. This is only effective if those countable conditions represent all possible states in the problem space. For example, for a programming language, we divide the task of compilation into small phases such as lexing, parsing, etc., and within each phase we use a formalism which guarantees that we cover the entire possible range of conditions within that phase.

This will make more sense if I give an example.

Lexing and Regular Expressions

In many compilers, the first phase of compilation is lexing, where the string representing the program source code is split into tokens. The token list will be passed to the parser, which attempts to match them up with the grammar of the programming language. As a practical example, consider the following expression from a Lisp-like language, which increments the number 1, resulting in the value 2.

(inc 1)

Lexing divides this source code into a stream of tokens, like this:

LeftParenthesis
Identifier "inc"
LiteralInt 1
RightParenthesis

These tokens will be consumed by the parser to produce an abstract syntax tree, type checked, optimized, etc., but let’s just look at lexing for now.

Substrings of the input source code are mapped to tokens using regular expressions. Not the PCRE type with zillions of features you might be familiar with, but a far simpler version with only a few rules. The lexical grammar for this language looks something like this:

leftParenthesis  = '('
rightParenthesis = ')'
letter           = 'A' | 'B' | 'C' | …
digit            = '0' | '1' | '2' | …
number           = ('+'digit|'-'digit|digit) digit*
alphanumeric     = letter | digit
// …

You don’t use System.Text.RegularExpressions.Regex for this; it’s slow, and has features you won’t need.

How can we guarantee that we can tokenize any possible string? We don’t need to; as long as we explicitly handle the case of strings we can’t tokenize, we’re covered. I do this by having an extra token type for unrecognized characters. These are eventually mapped into errors the user sees.
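That fall-through strategy fits in a few lines. Here is a minimal sketch of the idea in Python (illustrative only; the real lexer in this project is F#, and the token names simply mirror the Lexeme type below): try each token pattern in order, and when no pattern matches, emit an Unrecognized token rather than failing.

```python
import re

# Token patterns tried in order; the fall-through case below means the
# lexer can never fail on any input string.
TOKEN_PATTERNS = [
    ("LeftParenthesis",  re.compile(r"\(")),
    ("RightParenthesis", re.compile(r"\)")),
    ("LiteralInt",       re.compile(r"[+-]?[0-9]+")),
    ("Identifier",       re.compile(r"[A-Za-z][A-Za-z0-9]*")),
]

def tokenize(source):
    tokens, pos = [], 0
    while pos < len(source):
        if source[pos].isspace():      # skip whitespace between tokens
            pos += 1
            continue
        for name, pattern in TOKEN_PATTERNS:
            m = pattern.match(source, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:                          # nothing matched: record it, don't fail
            tokens.append(("Unrecognized", source[pos]))
            pos += 1
    return tokens

print(tokenize("(inc 1)"))
# [('LeftParenthesis', '('), ('Identifier', 'inc'), ('LiteralInt', '1'), ('RightParenthesis', ')')]
```

Running it on the invalid input `(| 1)` yields an `('Unrecognized', '|')` token in the stream, which a later phase turns into an error message for the user.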

How can we guarantee that we can tokenize any string representing a valid program without seeing an unrecognizable character? Because the parser is designed around a formalism (a context free grammar) which maps lexemes to abstract syntax trees, and the only valid programs are those which can be constructed from repeated applications of the production rules in the parser’s grammar. We have changed the scope of the problem from "any possible string" to "any possible sequence of lexemes."

Right away we have a big win in the number of test cases. Any "character" in a string could be any one of 2^16 UTF-16 code units, but the number of possible lexemes is considerably smaller. A real language would have maybe 10 times more than this tiny example, but that’s still far better than testing an input of any possible Unicode code point:

type Lexeme =
    | LeftParenthesis
    | RightParenthesis
    | Identifier    of string
    | LiteralInt    of int
    | LiteralString of string
    | Unrecognized  of char

We can test the lexer in isolation with a much smaller number of test cases.

The example I gave was a very simple expression, but real-world programs obviously contain more complicated expressions. Also, real-world code is often invalid and must be rejected by the compiler. Some coding errors cannot be detected until further on in the compilation pipeline, but there are possible errors at the lexing stage. For example, in my language, identifiers must begin with a letter, so the expression

(| 1)

…maps to:

LeftParenthesis
Unrecognized '|'
LiteralInt 1
RightParenthesis

Importantly, we should be able to examine any character of a real-world string and map it into one of these types. The Unrecognized case serves as a fall-through for characters which do not fit any other type in the union.

F#’s exhaustiveness checking ensures that we cannot forget to handle a particular case even if we add additional lexemes to the language specification later. As a simple example, consider this pretty print function which takes a list of lexemes and produces a string similar to the originally parsed source code:

let private prettyPrintLexeme = function
| LeftParenthesis          -> "("
| RightParenthesis         -> ")"
| Identifier    identifier -> identifier
| LiteralInt    num        -> num.ToString()
| LiteralString str        -> sprintf "\"%s\"" str
| Unrecognized  ch         -> ch.ToString() 

let prettyPrint =
    List.map prettyPrintLexeme
    >> String.concat " "

If we were to, after the fact, add an additional type of lexeme, but forgot to modify the prettyPrint function, we would receive a compiler warning since the function would no longer be exhaustive.

The Rest of the Pipeline

What about parsing, type checking, and the rest of the compilation pipeline? Formalisms exist for those, as well.

Compilation phase   Formalism
Parsing             Context-free grammar
Optimization        Algebra
Type checking       Logical inference rules
Code generation     Denotational semantics

Isn’t This Just Big Design Up Front?

The idea of basing your implementation design around an exhaustive specification might sound like big design up front, but there is an important difference. The formalisms used in implementing a compiler are "off the shelf." Nobody has to sit down and create them, because they have been refined over decades of compiler implementations. You simply need to know that they exist.

If "formal methods" sounds too snobby for your taste, you can simply call this "programming with proven recipes."

One downside of this methodology is that it becomes big design up front in those cases where there is no off-the-shelf formalism available for your use. That’s OK! The important thing is to know when these formalisms exist and how to use them, when necessary.

The full source code for this post is available.


F# Presentations at CodeMash 2016

I’ve been scouring the CodeMash accepted session list for talks featuring F#, and there are quite a few!

Bonus! Other talks of possible interest to functional programmers:

Double-Bonus! Interesting looking Pre-Compiler:

Triple-Bonus! Employment offer:

Improving is hiring developers and talented QAs in the Columbus area. It’s a nice place to work. We do interesting stuff. If that sounds good to you, talk to me! We don’t have remote positions at this time, but I’m trying to make that happen. Stay tuned!


Speaking at Dog Food Conference, CloudDevelop, and CodeMash

I’ll be speaking about compilers, testing, and machine learning at a conference near you! Abstracts for all of these sessions are on my Presentations page.

  • Programs that Write Programs: How Compilers Work Dog Food Conference and CodeMash
  • What Testing Can Never Do and How to Do It Anyway Dog Food Conference
  • Machine Learning with Azure ML CloudDevelop

Adding a [FixedLength] Attribute in Code-First Entity Framework

In Code First Entity Framework models, you can define the length of a string field with StringLengthAttribute, but you have to write code in OnModelCreating to indicate a CHAR/NCHAR fixed length field:

public class MyEntity
{
	[Key]
	public int Id { get; set; }

	[StringLength(2)]
	public string FixedLengthColumn { get; set; }
}

public partial class MyContext : DbContext
{
	public virtual DbSet<MyEntity> MyEntities { get; set; }

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		if (modelBuilder == null) throw new ArgumentNullException("modelBuilder");

		modelBuilder.Entity<MyEntity>()
			.Property(e => e.FixedLengthColumn)
			.IsFixedLength();
	}
}

I find it a bit confusing to split configuration like this, especially with a real model containing lots of fields and not this trivial example. Fortunately, you can fix it! Add these:

/// <summary>
/// Used to mark entity properties that are fixed length strings (CHAR(n) DB fields).
/// </summary>
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field, AllowMultiple = false, Inherited = true)]
public sealed class FixedLengthAttribute : Attribute {}

public class FixedLengthAttributeConvention : PrimitivePropertyAttributeConfigurationConvention<FixedLengthAttribute>
{
	public override void Apply(ConventionPrimitivePropertyConfiguration configuration, FixedLengthAttribute attribute)
	{
		configuration.IsFixedLength();
	}
}

And change the model configuration to:

public class MyEntity
{
	[Key]
	public int Id { get; set; }

	[StringLength(2),
	 FixedLength]
	public string FixedLengthColumn { get; set; }
}

public partial class MyContext : DbContext
{
	public virtual DbSet<MyEntity> MyEntities { get; set; }

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		if (modelBuilder == null) throw new ArgumentNullException("modelBuilder");

		modelBuilder.Conventions.Add<FixedLengthAttributeConvention>();
	}
}

How To Use Real Computer Science in Your Day Job

I’ll be speaking at Lambda Jam next week. Here’s the synopsis:

When you leave Lambda Jam and return to work, do you expect to apply what you’ve learned here to hard problems, or is there just never time or permission to venture outside of fixing “undefined is not a function” in JavaScript? Many of us do use functional languages, machine learning, proof assistants, parsing, and formal methods in our day jobs, and employment by a CS research department is not a prerequisite. As a consultant who wants to choose the most effective tool for the job and keep my customers happy in the process, I’ve developed a structured approach to finding ways to use the tools of the future (plus a few from the 70s!) in the enterprises of today. I’ll share that with you and examine research into the use of formal methods in other companies. I hope you will leave the talk excited about your job!

For Columbus friends who can’t make it to Chicago, I’ll be rehearsing the presentation this coming Saturday at 3.

Provable Optimization with Microsoft Z3

A few months ago, some coworkers sent around a Ruby challenge. It appears simple, but we can sometimes learn a lot from simple problems.

Write a Ruby program that determines the smallest three digit number such that when said number is divided by the sum of its digits the answer is 20.

In case that’s not clear, let’s pick a number, say, 123. The sum of the digits of 123 is 6, and 123/6 = 20.5, so 123 is not a solution. What is?

Here’s some Ruby code I wrote to solve it:

def digitSum(num, base = 10)
    num.to_s(base).split(//).inject {|z, x| z + x.to_i(base)}
end

def solution
    (100..999).step(20).reject {|n| n / digitSum(n) != 20 }.first
end

puts solution

Problem solved, right?

Well, no. For starters, it doesn’t even execute. There’s a really subtle type error in this code. You probably have to be fairly good with Ruby to figure out why without actually running it. And even once fixed, the cognitive overhead of understanding the code for even this simple problem is high! It doesn’t look like the problem specification at all.

Does this version, when the bug is fixed, actually produce a correct answer to the problem? Does it even produce an incorrect solution? It’s not quite clear.
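One way to settle the question is a brute-force cross-check written as close to the problem statement as possible. Here is a quick sketch in Python (illustrative only, not the repaired Ruby). Note the division: in many languages, `n / digit_sum(n)` silently truncates for integers, which is exactly the kind of subtle bug this problem invites.

```python
# Brute-force cross-check: collect every three digit number whose
# quotient by its digit sum is exactly 20.
def digit_sum(n):
    return sum(int(d) for d in str(n))

solutions = [n for n in range(100, 1000) if n / digit_sum(n) == 20]
print(solutions)  # [180]
```

This says 180 is not only the smallest solution but the only one, which we will see proved more rigorously below.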

So maybe my solution isn’t so good. But one of my colleagues had an interesting solution:

def the_solution
    180
end

Well, that looks really, really efficient, and it typechecks. But is it the correct answer? Will you know, six months down the road, what question it’s even trying to answer? Tests would help, but the word "smallest" in the problem turns out to be tricky to test well. Do you want methods like this in code you maintain?

The Best of Both Worlds

Is there a solution which is as efficient as just returning 180 but which also proves that 180 is, in fact, the correct solution to the problem? Let’s encode the specification for the problem in a pure specification language, SMT-LIB:

(define-fun matches-problem ((n Int)) Bool
  (and
    (>= n 100)
    (< n 1000)                   ; three digits
    (= 20.0 (/ n (digit-sum n)))))

Z3/SMT-LIB doesn’t ship with a digit-sum function, so I had to write that. You can find that code in the full solution, below.

That’s most of the problem, but not quite all. We also have to show that this is the smallest solution. So let’s also assert that there exists a "smallest" solution, which means that any other solution is larger:

(declare-const num Int)
(assert
  (and
    (matches-problem num) ; "num" is a solution
    (forall ((other Int))
        (implies (matches-problem other) (>= other num))) ; any "other" solution is larger
  ))

Now let’s ask Z3 if this specification even makes sense, and if it could be reduced into something more efficient:

(check-sat)
(get-model)

And Z3 replies…

sat
(model
  (define-fun num () Int
    180)
)

A round of applause for the theorem prover, please! To see the full solution, try it yourself without installing anything.

One interesting point here: The output language is SMT-LIB, just like the input language. The "compilation" transforms a provably consistent and more obviously correct specification into a more efficient representation of the answer in the same language as the input. This is especially useful for problems which do not have a single answer. Z3 can return a function matching a specification as easily as it can return an integer.

What does it mean when I ask "if this specification even makes sense?" Well, let’s say I don’t like the number 180. I can exclude it with an additional assert:

(assert (> num 180))

This time, when I check-sat, Z3 replies unsat, meaning the model is not satisfiable, which means there’s definitely no solution. So 180 is not only the smallest solution to the original problem, it turns out to be the unique solution.

Formal methods can show that your problem specifications are consistent and that your implementation is correct, and they can also guarantee that "extreme" optimizations are correct. This turns out to be really useful in real-world problems [PDF].


What Is the Name of This Function?

There is a function I need. I know how to write it, but I don’t know if it has a standard name (like map, fold, etc.). It takes one argument — a list of something — and returns a list of 2-tuples of equal length. Each tuple contains one item from the list and the list without that item. It’s probably easiest if I show you an example:

> f [1; 2; 3];;
val it : (int * int list) list = [
    (1, [2; 3])
    (2, [1; 3])
    (3, [1; 2])
]

Here’s the implementation I’m using:

let f (items : 'T list) =
    let rec implementation (start : 'T list) = function
        | [] -> []
        | item :: tail -> (item, start @ tail) :: implementation (start @ [ item ]) tail
    implementation [] items
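For what it’s worth, the same function can be sketched in Python in one line; this operation is sometimes called "selections" (each element paired with the remaining elements), though I don’t know of one universally standard name either:

```python
# Pair each item with the list minus that item, preserving order.
def selections(items):
    return [(item, items[:i] + items[i + 1:]) for i, item in enumerate(items)]

print(selections([1, 2, 3]))
# [(1, [2, 3]), (2, [1, 3]), (3, [1, 2])]
```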

Anybody know a standard name for this function?

Appendix

In case you’re curious, the reason I want this is I’m implementing a decision tree. I have a list of functions which are predicates over the domain of my example data. I need to try each function in the list, pick the "best", and then recurse over the rest of the functions. "Best" is usually measured in terms of information gain.

It’s never a great idea to do equality comparisons on functions, so it’s helpful to transform this list into a list of functions paired with the remaining functions.


Your Flying Car is Ready: Amazing Programming Tools of the Future, Today!

That’s the title of my presentation at Dog Food Conference 2014, 29-30 September, in Columbus, Ohio. If you found my post on "Test-Only Development" with the Z3 Theorem Prover interesting, then you’ll love this.

What if simply writing "unit tests" was enough to produce a program which makes them pass? What if your compiler could guarantee that your OpenSSL replacement follows the TLS specification to the letter? What if you could write a test which showed that your code had no unintentional behavior?

Microsoft Research is well known for its contributions to Kinect, F#, the Entity Framework, WorldWide Telescope, and more, but it’s also the home of a number of programming tools which do things which many programmers would consider surprising, if not impossible. But they work, and in this session you’ll see them in action.

Like the idea of code contracts, but concerned about runtime performance and errors? The Dafny language can check contracts at compile time. Sounds a bit magical, but it works! I’ll use the Z3 theorem prover to generate working programs from specifications alone. Sound impractical? I’ll explain how it is used to make Hyper-V and Windows Azure secure. I’ll show the F7 specification language for F# and relate how its authors used it to not only produce a TLS implementation which provably follows the spec, but to also identify dangerous holes in the TLS specification itself. You’ll learn how Amazon uses the TLA+ specification language to prove that there are no edge cases in its internal protocols.

Far from being research toys, these tools are in daily use in cases where stability, security, and reliability of code matters most. Can they help with your hardest problems? You might be surprised!


Cloud Security, For Real This Time: Homomorphic Encryption and the Future of Online Privacy

That’s the title of the presentation I’ll be giving at CloudDevelop 2014, on October 17th, in Columbus, Ohio. If you read my blog at all then you’re probably interested in where software development will be headed five years in the future. Two things I recommend that you study are proving systems and homomorphic encryption.

I’ve written about proving systems in the past, and will have more to say in the future, but today we’ll talk about homomorphic encryption.

Homomorphic encryption will change the web in the same way that SSL/TLS did. I say this with quite a bit more confidence than I have in the past! If you remember the web in 1993, that comparison should mean something to you. If not, imagine the web as a magazine which could show you ads, but required calling an 800 number if you wanted to make a purchase, and contrast that with today’s amazon.com.

I have given two presentations with similar titles before. But this presentation will be an almost complete rewrite, just like the last. In the time since my first article on homomorphic encryption, it’s gone from a gleam in a mathematician’s eye to an open source DB access library. It’s a fast-moving technology, which, thankfully, becomes more practical each year.

Interestingly, as the technology for cloud security becomes more practical, the need for it becomes more pressing.

CloudDevelop 2014 is just $20, which is quite cheap, as conferences go, but you can use this link to save 50%, which means that for the $10 you might have spent for lunch that day you get a conference for free!


"Test-Only Development" with the Z3 Theorem Prover

In this post I’ll introduce a style of programming which may be totally unfamiliar and perhaps a bit magical to many readers. What if you could write a unit test for a problem and have your compiler automatically return a correct implementation? Call it "test only" instead of "test first." I’m going to translate the problem itself into a specification language and use Microsoft Research’s Z3 theorem prover to find a solution. I won’t write any implementation code at all!

A Simple Problem

Here’s the problem I’ll use as an example, which is problem #4 from Project Euler:

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.

Find the largest palindrome made from the product of two 3-digit numbers.

The usual approach is to use brute force. Here’s a C# example, which I suspect is similar to what many people do:

(from    factor1 in Enumerable.Range(100, 900)
 from    factor2 in Enumerable.Range(100, 900)
 let     product = factor1 * factor2
 where   IsPalindrome(product) // defined elsewhere
 orderby product descending
 select  new { Factor1 = factor1, Factor2 = factor2, Product = product}).First()

This is not a terrible solution. It runs pretty quickly and returns the correct solution, and we can see opportunities for making it more efficient. I suspect most people would declare the problem finished and move on.

However, the LINQ syntax may obscure the fact that this is still a brute force solution. Any time we have to think about how to instruct a computer to find the answer to the problem instead of the characteristics of the problem itself, we add cognitive overhead and increase the chances of making a mistake.

Also, what is that IsPalindrome(product) hiding? Most people implement this by converting the number to a string, and comparing it with the reversed value. But it turns out that the mathematical properties of a palindromic number are critical to finding an efficient solution.

Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do IsPalindrome this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)
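For reference, here is the string-reversal IsPalindrome most people reach for, embedded in a short brute force. This is a Python sketch (illustrative only; the C# above does the same thing). It produces the correct answer, but the string check tells you nothing about the arithmetic structure of the problem.

```python
# The usual string-based palindrome test.
def is_palindrome(n):
    s = str(n)
    return s == s[::-1]

# Brute force over all pairs of three digit factors, keeping the largest
# palindromic product (b starts at a to avoid checking each pair twice).
best = max(
    (a * b, a, b)
    for a in range(100, 1000)
    for b in range(a, 1000)
    if is_palindrome(a * b)
)
print(best)  # (906609, 913, 993)
```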

Solving with Pure Specification

For my purely declarative solution, I’m going to use the language SMT-LIB. As a pure specification language, it doesn’t allow me to define an implementation at all! Instead, I’ll use it to express the constraints of the problem, and then use MSR’s Z3 solver to find a solution. SMT-LIB uses Lisp-like S-expressions, so, yes Virginia, there will be parentheses, but don’t let that scare you off. It’s always worthwhile to learn languages which make you think about problems differently, and I think you’ll find SMT-LIB really delivers!

Since this language will seem unusual to many readers, let’s walk through the problem step by step.

First, we need to declare some of the variables used in the problem. I use "variable" here in the mathematical, rather than software, sense: A placeholder for an unknown, but not something to which I’ll be assigning varying values. So here are three variables roughly equivalent to the corresponding C# vars above:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)

In an S-expression, the first item inside the parentheses is the function, and the remaining items are arguments. So declare-const is the function here and the remaining items are the variable name and its "sort" (type).

Next, the problem says that product must be the, ahem, product of the two factors:

(assert (= (* factor1 factor2) product))

"assert" sounds like a unit test, doesn’t it? Indeed, many people coming to a theorem prover from a software development background will find that programming them is much more similar to writing tests than writing code. The line above just says that factor1 * factor2 = product. But it’s an assertion, not an assignment; we haven’t specified values for any of these variables.

The problem statement says that both factors are three digit numbers:

(assert (and (>= factor1 100) (<= factor1 999)))
(assert (and (>= factor2 100) (<= factor2 999)))

Mathematically, what does it mean for a number to be a palindrome? In this case, the largest product of 3 digit numbers is going to be a six digit number of the form abccba, so product = 100000a + 10000b + 1000c + 100c + 10b + a. As I noted above, expressing the relationship this way is key to finding a non-brute-force solution using algebra. But you don’t need to know that in order to use Z3, because Z3 knows algebra! All you need to know is that you should express relationships mathematically instead of using string manipulation.

(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
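This arithmetic claim is easy to sanity-check outside of Z3. A quick Python loop (illustrative only) confirms that the linear form in the assertion really does construct the palindrome abccba for every choice of digits:

```python
# For every digit choice (a nonzero so the number has six digits), the
# linear combination equals the literal number written as "abccba".
for a in range(1, 10):
    for b in range(10):
        for c in range(10):
            linear = 100000*a + 10000*b + 1000*c + 100*c + 10*b + a
            assert linear == int(f"{a}{b}{c}{c}{b}{a}")
print("identity holds for all 900 digit combinations")
```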

I implied above that a, b, and c are single-digit numbers, so we need to be specific about that. Also, a can’t be 0 or we won’t have a 6 digit number.

(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))

These 4 assertions about a, b, and c are enough to determine that product is a palindrome. We’re not quite done yet, but let’s see how we’re doing so far. (check-sat) asks Z3 if there is a solution to the problem we’ve posed, and (get-model) displays that solution. Here’s the entire script so far:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(check-sat)
(get-model)

When you run this through Z3 (try it yourself!), the solver responds:

sat
(model
  (define-fun c () Int
    1)
  (define-fun b () Int
    0)
  (define-fun a () Int
    1)
  (define-fun product () Int
    101101)
  (define-fun factor2 () Int
    143)
  (define-fun factor1 () Int
    707)
)

That’s pretty good! sat, here, means that Z3 found a solution (it would have displayed unsat if it hadn’t). Eliding some of the syntax, the solution it found was 143 * 707 = 101101. Not bad for zero implementation code, but also not the answer to the Project Euler problem, which asks for the largest product.

Optimization

"Optimization," in Z3 parlance, refers to finding the "best" proof for the theorem, not doing it as quickly as possible. But how do we tell Z3 to find the largest product?

(Update: I had a mistake in the original version of this post, and so I’ve significantly changed this section.)

Z3 has a function called maximize, but it’s a bit limited. If I try adding (maximize product), Z3 complains:

Z3(15, 10): ERROR: Objective function '(* factor1 factor2)' is not supported

After some fiddling, however, it seems (maximize (+ factor1 factor2)) works, sort of. Adding this to the script above causes Z3 to return:

(+ factor1 factor2) |-> [1282:oo]
unknown --...

Which is to say, Z3 could not find the maximal value. ("oo" just means ∞, and unknown means it could neither prove nor disprove the theorem.) Guessing that a might be bigger than 1, I can change its range to 8..9 and Z3 arrives at a single solution:

(+ factor1 factor2) |-> 1906
sat
(model
  (define-fun b () Int
    0)
  (define-fun c () Int
    6)
  (define-fun factor1 () Int
    913)
  (define-fun factor2 () Int
    993)
  (define-fun a () Int
    9)
  (define-fun product () Int
    906609)
)

The final script is:

(declare-const product Int)
(declare-const factor1 Int)
(declare-const factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declare-const a Int)
(declare-const b Int)
(declare-const c Int)
(assert (and (>= a 8) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(maximize (+ factor1 factor2))
(check-sat)
(get-model)

This bothers me just a little, since I had to lie ever so slightly about my objective, even though I did end up with the right answer.

That’s just a limitation of Z3, and it may be fixed some day; Z3 is under active development, and the optimization features are not even in the unstable or master branches. But think about what has been achieved here: We’ve solved a problem with nothing but statements about the properties of the correct answer, and without any "implementation" code whatsoever. Also, using this technique forced me to think deeply about what the problem actually meant.

Can This Be Real?

At this point, you may have questions about doing software development in this way. Sure, it works fine for this trivial problem, but can you solve real-world problems this way? Is it fast? Are there any other tools with similar features? What are the downsides of working in this way? You may find the answers to these questions as surprising as the code above. Stay tuned!
