04 Oct

Integer Complexity Measures

[Note this post is a bit outdated, and not maintained; the most up-to-date version of this information can be found in my pages on the OEIS wiki.]

I’ve recently become interested in a family of interrelated sequences that can be found in the Online Encyclopedia of Integer Sequences (OEIS). These sequences all have to do with one approach to measuring the “complexity” of an integer. This approach assumes that you have some basic “building blocks” for constructing integers: starting numbers you’re allowed to use, and operations that you may apply to them. The idea, then, is that the number of these building blocks that you need to arrive at a particular integer is one way of characterizing how “complex” that integer is.

This approach to constructing integers may seem familiar from the “Four fours” puzzle: Given four instances of the number 4 to start with, and a variety of operations that you might use, how many different whole numbers can you construct? For some examples, \[1 = \frac44 \quad 2 = \frac{4+4}4 \quad 3 = \frac{4+4+4}4 \quad 5 = 4+\frac44 \quad \ldots \quad 8 = 4 + 4 \quad \ldots \] Here it seems quite reasonable that in terms of “expressing with fours,” the number eight is “simpler” than the number three.

The lowest-numbered sequence (that I’ve found) in the OEIS that has to do with integer complexity is A000792. In the case of this sequence, the only number you’re allowed to start with is one, and the only operations you’re allowed to use are plus and times. The $n$th entry of sequence A000792 then tells you the largest number you can produce using at most $n$ ones.

Formally, let $A$ be an “alphabet” of numbers and operators; for simplicity we’ll assume all operators are unary or binary. One can then form finite sequences of elements from $A$; such a sequence is a well-formed RPN expression if no initial subsequence has as many binary operators as numbers. Each well-formed RPN expression unambiguously determines its value: imagine you have an initially-empty stack of numbers, and go through the sequence from left to right. Any time you encounter a number, put it on top of the stack; when you encounter a unary operator, modify the top number on the stack by applying that operator; and when you encounter any binary operator $\ast$, remove $a$ and $b$ (respectively) from the second-to-top and top positions from the stack, and then put $a\ast b$ on top of the stack (reducing the height of the stack by one). The value of the RPN expression is the top number on the stack when the entire expression has been scanned. The well-formed condition guarantees the stack never empties.

We can now define the $A$-operand complexity of a number $n$, which we’ll write as $c_A(n)$, as the minimum number of numbers in any well-formed RPN expression with value $n$, the $A$-operator complexity $d_A(n)$ as the minimum number of operators in any well-formed RPN expression with value $n$, and similarly the $A$-expression complexity $e_A(n)$ as the minimum length of a well-formed RPN expression with value $n$. There is no general relationship between $c_A$, $d_A$, and $e_A$, but if $A$ contains only binary operators, then $e_A(n)$ $= 2c_A(n)-1$ $= 2d_A(n)+1$. (Briefly, count the number of operators needed to reduce the operands to a single number; clearly the minimal expression in any case ends with a stack of just one number.)

Since it’s the most common notion of complexity used in the OEIS, we take “complexity” without an adjective to refer to $c_A$, the $A$-operand complexity.

I’ve also assembled a table that organizes sequences in the OEIS related to these notions of integer complexity, according to the alphabet $A$ and the particular measure or property of the complexity.