In recent years, I have developed a strong feeling that any software bug can be traced down to violations of developers’ intents (assuming that the developers are not malicious). Well…, this may seem too obvious — of course no sane developer will want to put bugs in their code. Therefore, any bug is at odds with their intention. But what exactly is violated here? What is an intent?

Defining an intent

Let’s reflect a bit on how we write code. In my personal experience, programming is essentially a consecutive refinement of state machines:

$$ S_0 \sqsupseteq S_1 \sqsupseteq \, ... \, \sqsupseteq S_n $$

from business requirements (\(S_0\)) all the way down to every implementation detail (\(S_n\)). I know in practice no one really follows this waterfall model. Most of the time we don’t even have the luxury of seeing all business requirements (\(S_0\)) clearly defined, let alone the rest of the lowering. With luck, some intermediate state machines (\(S_1, S_2, ...\)) may be sketched in the form of design documents or architectural overviews or simply comments in the code, …by whim or by arm-twisting…

Nevertheless, the refinement procedure is just a formality, the more important fact is that we make decisions and assumptions, explicitly or implicitly, along the process. Some of the decisions and assumptions are rather subtle but may turn out to be more important than we originally thought, especially when someone proves it wrong by throwing us a bug report.

With this mental exercise, now we can describe what an intent is, hopefully without too much ambiguity. An intent is either

  • a decision on how to refine a state machine \(S_i\) (denoted as Type-1 intent)
    i.e., why refining \(S_i\) to \(S_{i+1}\) instead of to another plausible state machine \(S'_{i+1}\) even when both \(S_i \sqsupseteq S_{i+1}\) and \(S_i \sqsupseteq S'_{i+1}\) hold, or
  • an assumption to establish \(S_i \sqsupseteq S_{i+1}\) (denoted as Type-2 intent)
    i.e., why the refinement between two state machines \(S_i \sqsupseteq S_{i+1}\) is believed to be correct (both sound and complete).

Types of violations

Technically speaking, a violation of Type-1 intents is not a bug as it does not render the program incorrect. Instead, it is usually treated as a bad coding practice which may lead to future bugs if not restricted. For example, using unsafe Rust does not necessarily introduce memory errors to the program, but it does extended the attack surface and increases the difficulty of maintaining the codebase. Similarly, the const modifier in C/C++ is not strictly needed if the developers are careful enough not to modify a pointer/reference in the code. But a const declaration solidifies the intent here.

What’s more interesting is violations of Type-2 intents. A Type-2 violation essentially produces a state machine \(S_{i+1}\) such that \(S_i \not\sqsupseteq S_{i+1}\). In other words, the refinement is either

  • unsound, causing \(S_{i+1}\) to have more states or state transitions than allowed — a weird machine. The famous DAO attack is a good illustration of unintended state transitions. Or
  • incomplete, causing \(S_{i+1}\) to omit certain states or state transitions — missing functionalities. The stuck of the GovernMental jackpot can be a loose approximation of missing state transitions.

Other attempts to define an intent

Before arriving at the above state machine-based definition, I made several unsatisfactory attempts on describing what an intent is.

Attempt 1: An intent is something that captures certain valuable thinking, decision, or expectation made by the while software team. Note that a software team encompasses more than just the developers. As an incomplete list, a software team might include:

  • Product designers who defines core and implied features of the software. Their intents are usually expressed in the form of marketing materials and legal agreements.
  • Architects who designs the key data structures and interfaces. Their intents are usually expressed in the form of design patterns, type definitions, API specifications, etc.
  • Programmers who are responsible for the coding details. Their intents are usually expressed in the type system or in the comments.

Obviously, this attempt is an instantiation of the state machine refinement view, but maybe more relatable to real-world experiences.

Attempt 2: Instead of defining what is an intent, I also tried to breakdown the definition of intent into a set of disjoint items. However, I failed to generalize the procedure beyond the Move language. So, here is a Move-specific list I got:

  • Shape of shared resources. A complete definition of data structures that appear in the global state.
  • Properties over shared resources. Global snapshot invariants and global update invariants.
  • Functional API contracts. Function pre/post conditions.
  • Implementation-binding details. Type definitions, including data invariants.
  • Implied code properties. No data races as all Move transactions are serialized.

Again, this attempt is still an instantiation of the state machine refinement view, and is heavily biased by my personal experience. I would love to see it generalized into more programming paradigms.

In the next post, we describe how to evaluate a programming language or a language feature from the perspectives of intent solicitation.