I will go into quite some mathematical detail in this post. I have tried my best to make the advanced math entirely optional to read. If you are intimidated by the proofs or simply not interested, don’t worry about skimming over them or skipping them entirely. The proofs are meant to give a convincing argument and may not be rigorous.

A recurrence relation for the self-predicting sequence Q_n. The left-hand side reads: Q with a subscript saying Q of n plus n. The right-hand side reads n.

Whenever I meet up with a certain good friend of mine, we usually have some kind of mathematical microfixation that we like to explore. Around two years ago, we were looking into the properties of some number sequences. Number sequences are an ordered list of numbers, denoted for example \(A_n\). The subscript refers to the \(n\)th element that appears in the sequence. In this post, I will talk about our efforts on a specific integer sequence we found that refers directly to itself. It turned out to take us along more mathematical concepts than I had imagined.


Self-referential sequences

On a late train ride back home, I was trying to think of interesting ideas for integer sequences. Integer sequences are number sequences in which each element is an integer. I enjoyed the idea of creating a sequence that somehow describes itself. Such a sequence is called self-referential. My first thought was to make a sequence that, at each position \(n\), gives the amount of times the number \(n\) appears in the sequence. However, I realized that constructing such a sequence would cause issues soon enough. The problem is that large numbers will take up more space than available. For instance, if the value at a large index \(N\) is another large number \(M\), then the number \(N\) has to appear in a total of \(M\) positions in the sequence. As a result, the numbers corresponding to these positions need to appear \(N\) times themselves! This already occupies \(N \cdot M\) places in the sequence, which is far more than the amount of available places up to position \(N\). For an infinite sequence we can get away with pushing numbers further and further ahead, which will indeed lead to a valid sequence called the Golomb sequence. Moreover, I realized I had previously encountered a similar idea, regarding self-describing numbers instead of sequences. The challenge is to find all the autobiographical numbers, in which the consecutive digits of the number describe the amounts of \(0\)s, \(1\)s, \(2\)s and so forth, up to \(9\)s. An example is the number \(1210\), which contains one \(0\), two \(1\)s, one \(2\) and no \(3\)s.

There are exactly seven autobiographical numbers. Can you find a pattern to construct them? What does this say about the first digit of an infinite autobiographical sequence?

My second idea proved to be more unique: is there an integer sequence that describes at position \(n\) in how many steps we encounter the number \(n\)? We could call this a self-predicting sequence, since it “predicts” when we will see a given number. A couple of ambiguities arise here: can we repeat numbers? Can the sequence refer to a certain number in any position, or does it need to point to the very next occurence of that number? I will address and make suitable choices for these questions while we build the sequence from the ground up.


Constructing the sequence

Let’s start with an empty table:


Empty table for the self-predicting sequence, showing the first sixteen terms to be filled in.


This table will contain the first sixteen elements of our self-predicting sequence. The top row shows the position \(n\) and the bottom row shows the corresponding element in the sequence, which I will call \(Q_n\).

As a start, we can consider \(Q_0\). This number should describe after how many steps we see a \(0\) in the sequence. Let’s think about what happens when a \(0\) appears in the sequence: if at position \(n\) we see \(Q_n = 0\), it tells us that after zero steps we should see the number \(n\). But taking zero steps means we stay at the same position! We cannot have both a \(0\) and an \(n\) in the same position, unless the \(n\) in question is \(0\) itself. This forces \(Q_0\) to be \(0\); any other number would require a \(0\) to appear later on the sequence, which leads to a contradiction.


Table for the self-predicting sequence that shows the first sixteen terms. A 0 is entered at position 0 and highlighted.


Now that \(Q_0\) is in place, what could \(Q_1\) be? It surely cannot be a \(0\), as we stated just now. Let’s try \(Q_1 = 1\). In one step, we should then encounter a \(1\), so \(Q_2 = 1\). This in turn tells us that in one step, we should see the number \(2\), so \(Q_3 = 2\). Placing a single number seemingly escalates into a whole list of numbers we can fill in. I will call these number progressions a chain, as each number inside it is linked to the number that initiates it:


Table for the self-predicting sequence that shows the first sixteen terms. The leading 0 is filled in. The first chain is shown and highlighted. Arrows connect the number 1 at position 1, number 1 at position 2, number 2 at position 3, number 3 at position 5 and number 5 at position 8. A final dotted arrow indicates that this chain continues indefinitely.


An interesting pattern emerges, which some may recognize as the Fibonacci sequence, commonly denoted as \(F_n\). The Fibonacci sequence has the property that each term is the sum of the two numbers prior to it: \(F_n = F_{n-1} + F_{n-2}\) for \(n \geq 2\). It begins with the numbers \(0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \cdots\).

To see why we encounter these numbers in our self-predicting sequence \(Q_n\), it might help to look at its own defining relation. \(Q_n\) tells us after how many steps we will see an \(n\) in the sequence. In other words, at position \(n + Q_n\), the sequence shows the number \(n\) itself. By the same reasoning, we will encounter the number \(n + Q_n\) at position \((n + Q_n) + n\). Notice that we keep adding the last two numbers to arrive at the new position. This follows the exact same principle as the Fibonacci sequence! Since we put \(Q_1 = 1\), we start out exactly like the original Fibonacci terms \(1, 1, 2, 3, \cdots\) and cause the rest of the chain to unroll.

Time for me to address the first ambiguity, then. Placing a \(1\) at \(Q_1\) already leads to two consecutive \(1\)s to show up in our sequence. If we really want to avoid repeating numbers, we would have to start with a higher number, only to encounter a \(1\) later on. For now, we will allow for repeating numbers. As we will see, repetition of numbers leads to a nicely increasing sequence, rather than one that looks more like an unstable rollercoaster.

The next blank entry in the table is \(n = 4\). Of course, we cannot fill in any number that points us to a position that is already occupied by another number. This means we are not allowed to enter a \(1\) or a \(4\), because the positions \(n = 5\) and \(n = 8\) are already filled in. However, the numbers \(2\) and \(3\) seem valid. We will enter the smallest number: a \(2\). It’s not a surprise that we again uncover a whole Fibonacci-esque chain of terms:


Table for the self-predicting sequence that shows the first sixteen terms. The leading 0 and the first chain are filled in. The second chain is shown and highlighted. Arrows connect the number 2 at position 4, number 4 at position 6 and number 6 at position 10. A final dotted arrow indicates that this chain continues indefinitely.


So far, so good. However, can we be sure that the two Fibonacci chains never overlap? After all, our table only extends up to \(n = 15\), so we can’t see what happens further down the sequence. Perhaps the chains collide somewhere thousands of steps ahead! Luckily, we can prove that they stay out of one another’s ways, under one important condition.



We now understand how our sequence behaves and have one rule at hand to avoid any overlap: the value of \(Q_n\) should lie between its neighbors or equal one of them. The rest of the table becomes a lot more straightforward to fill in. \(Q_7\) is bounded by its neighboring terms, so we can choose either a \(4\) or a \(5\). This is where I’ll bring up the second ambiguity: if we pick a \(5\), then \(Q_5\) no longer points to the very next occurence of a \(5\) in the sequence. We will take this as another restriction and only allow the sequence to refer to the very next appearance of a number. When we encounter a blank space, we therefore enter a number that is strictly smaller than any number already filled in further down the sequence. In fact, we can choose to always enter the lowest available number, which is \(Q_n = Q_{n-1}\). Thus, our only option is to set \(Q_7 = 4\). With this extra rule, we can complete our self-predicting sequence:


Table for the self-predicting sequence that shows the first sixteen terms. The terms read, in order: 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 9.


And there we go! We have constructed a sequence that predicts itself, by only following some simple rules. Although the method is straightforward, producing the sequence up to a large position can take quite some time and effort. Can we do better than this?


From sequence to formula

Like true mathematicians in possession of a brand new integer sequence, we can now start to wonder whether there exists a simple formula for \(Q_n\) in terms of \(n\). Where would we begin? Remember that we have already encountered an interesting property of the sequence: each number we entered, escalated into an elaborate chain of numbers. The Fibonacci-like nature of these chains makes the ratio \(\frac{n}{Q_n}\) of our sequence surprisingly predictable, as we will discover. The Fibonacci sequence itself appeared when we kept adding the previous two terms of a chain to arrive at the next position. The ratio between two consecutive terms in the Fibonacci sequence \(F_n\) and \(F_{n-1}\) approaches the golden ratio \(\varphi\) in the limit where \(n\) grows to infinity. This constant equals \(\frac{1 + \sqrt{5}}{2} \approx 1.6180339887\cdots\). We can use this together with our self-predicting property to show that \(\frac{n}{Q_n}\) also converts to \(\varphi\).



We went on a long tangent to arrive at a simple result: when we take \(n\) large enough, \(\frac{n}{Q_n} \approx \varphi\). Switching this expression around a little gives us our desired relation between \(Q_n\) and \(n\):

\[\begin{equation} Q_n \approx \frac{n}{\varphi}. \end{equation}\]

Unfortunately, the two sides of this equation cannot be exactly equal, since we require \(Q_n\) to be an integer and \(\varphi\) is irrational. Clearly, we will need to do some rounding. Let’s dust off an old reliable math tool.


Fiddling with floor functions

What if we simply round \(\frac{n}{\varphi}\) to the nearest integer? We can denote this as follows:

\[\begin{equation} \left\lfloor \frac{n}{\varphi}+\frac{1}{2} \right\rfloor, \end{equation}\]

where \(\left\lfloor x \right\rfloor\) is the floor function, which rounds \(x\) down to an integer. A few examples of the floor function are \(\left\lfloor 3.14 \right\rfloor = 3\) and \(\left\lfloor 9.81 \right\rfloor = 9\). If \(x\) is already an integer, then the function will simply return \(x\) itself: \(\left\lfloor 2 \right\rfloor = 2\). Note how we construct a function that rounds up or down to the nearest integer by first adding \(\frac{1}{2}\) to our number. For the given examples, we get \(\left\lfloor 3.14 + \frac{1}{2} \right\rfloor = 3\) but \(\left\lfloor 9.81 + \frac{1}{2} \right\rfloor = 10\).

This rounding function has a certain bias on numbers that lie halfway between two integers. What is this bias and why does it not matter for our sequence?

The table corresponding to this straightforward rounding function is shown below, along with the table we constructed before.


Table comparing the simple rounding function of n over phi against our self-predicting sequence. All the integers agree, except for a highlighted 6 at position 9 in the rounding function, which differs from the 5 that we are looking for.


This looks a lot like what we were aiming for! The sequence has an entry in The On-Line Encyclopedia of Integer Sequences (OEIS) with index A101803. However, this sequence is not exactly the same as our \(Q_n\). Within the first sixteen terms, only \(Q_9\) is rounded in the wrong direction. It still points to a \(9\), but not to the very next \(9\) in the sequence. Maybe we can obtain the sequence we want by nudging the term \(\frac{1}{2}\) that we added inside the floor function, which I will call the rounding bias. It makes sense to only consider values between \(0\) and \(1\) for the bias, so we can generalize the bias as \(\frac{1}{b}\) with the bias parameter \(b\) ranging from \(1\) to infinity. For the case \(\frac{1}{b} < \frac{1}{2}\), the function will round down more often than it rounds up. Similarly, if \(\frac{1}{b} > \frac{1}{2}\), it rounds up more. With a general bias, our function looks like

\[\begin{equation} Q_{n} = \left\lfloor \frac{n}{\varphi}+\frac{1}{b} \right\rfloor. \end{equation}\]

Fascinatingly, there is a range of values for \(b\) for which the sequence predicts itself! Let’s prove it.



With some effort, we managed to show that for bias parameters \(\varphi < b \leq \varphi^2\), our sequence is indeed self-predicting! The sequence \(Q_n\) that specifically refers to the very next occurence of each number has to be generated by the upper bound of \(b\). After all, the sequence with this extra property always picks the smallest allowed integer for each unoccupied space, as we saw before. It then makes sense that it corresponds to the lowest bias, such that numbers are rounded down most often. Indeed, \(b = \varphi^2\) leads to the lowest bias. Finally, we obtain our sought-after formula:

\[\begin{equation} Q_n = \left\lfloor \frac{n}{\varphi} + \frac{1}{\varphi^2} \right\rfloor. \end{equation}\]


Rounding off

After a rather elaborate journey along integer sequences, the golden ratio and floor functions, we managed to find a beautiful equation that describes exactly how our self-predicting sequence \(Q_n\) behaves. In a way, we found an infinite number of self-predicting sequences. Each value of the bias parameter \(b\) leads to a slightly different sequence due to the change in rounding bias.

Although the usefulness of our sequence most likely doesn’t reach beyond recreational mathematics, I found the process of exploring and uncovering its structure enlightening. A simple idea for an integer sequence unfolded into a list of progressively detailed mathematical reasoning. Much like the Fibonacci-esque chains, each new step naturally built onto the previous ones. If I had been given the sequence \(Q_n\) together with its formula, I probably wouldn’t have been particlularly impressed. However, after personally constructing and exploring the various properties of a sequence I thought up myself, reaching a final formula felt like a little discovery! This kind of drive is exactly what continues to convince me of the importance of simply giving things a go, especially in mathematics. Sometimes tackling an interesting idea leads to nothing groundbreaking, but it might as well yield an unpredicted result.


To conclude, here are the first \(100\) terms of \(Q_n\):

\[\begin{align} &0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 9, 10, 10, 11, 12, 12, 13, 13, 14, 15, 15, 16, 17, 17, 18, \\ &18, 19, 20, 20, 21, 22, 22, 23, 23, 24, 25, 25, 26, 26, 27, 28, 28, 29, 30, 30, 31, 31, 32, 33, \\ &33, 34, 34, 35, 36, 36, 37, 38, 38, 39, 39, 40, 41, 41, 42, 43, 43, 44, 44, 45, 46, 46, 47, 47, \\ &48, 49, 49, 50, 51, 51, 52, 52, 53, 54, 54, 55, 56, 56, 57, 57, 58, 59, 59, 60, 60, 61, \cdots \end{align}\]

For more interesting integer sequences, check out the OEIS.