I’ve been applying for internships in quant finance recently, and I’m now fortunate enough to have been accepted onto a quant research internship for summer 2021^{1}. Like many people in my position, I spent a considerable amount of time preparing for the interview process employed by quant firms, which can be a bit gruelling. In general, I think if you are interested in what quant firms do and you have the appropriate background, these interviews are perfectly passable - despite the hype, you don’t need to be a Fields Medallist or anything. But interview brainteasers are a very specific kind of problem, and you definitely improve with practice at the specific kinds of problems people like to ask.

I thought I could share one particular solution technique - recursion - which crops up a lot in interview questions, but which I wasn’t very familiar with in this context from my undergraduate, and which I think can easily catch out mathematically capable candidates who aren’t well-prepared for interview problems specifically. You see this a lot in Markov chain questions, but it comes up everywhere. It’s particularly relevant in Markov chains because these have very general solutions for almost any relevant quantity in terms of the transition matrix of the chain, but the matrix calculations are too complex to reproduce with pen and paper in an interview. If your undergraduate experience was anything like mine, you have studied Markov chains and learnt about the general solutions, but haven’t had a great deal of practice using this recursive ‘shortcut’, and may never have seen it before ^{2}. Of course, Markov chains are very important to quantitative finance theory, and if you can get a job at a quant shop without having to get through at least one Markov chain brainteaser I would be amazed. If you understand the fundamentals, this approach is easy once you know what you are doing. I’m going to go through a few example questions where this technique can be applied, hopefully illustrating its generality and power, and give a few concrete additional tips.

The best way to get a feel for this approach is to go through some examples. I’ve also included a few classic examples without solutions for you to try if you like. As the companies I’ve interviewed with normally asked me nicely not to share interview questions, I’ve tried to avoid using any questions I’ve been exactly asked in an interview, but the questions here have a similar flavour to problems I have been directly asked. A lot of these can also be found on the internet or in books of practice interview questions, so I feel that they are fair game to use here without giving away too much.

The most important part of recursion questions is, of course, spotting the recursive structure. This is a bit more of an art than a science, but you can definitely get better at it. I hope that the questions I’ve picked are a representative sample of the kind of places you should think of looking for this kind of angle of attack.

# Small number of states - simultaneous equations

The simplest example of a recursive solution technique is a Markov chain with a small number of discrete states. This can be brought to bear on the following classic interview question:

I repeatedly flip a fair coin. What is the expected number of flips I need to make before I get two heads in a row?

This is obviously a Markov chain. We can easily see that it has three relevant states:

- No heads so far. This is the state we start in.
- 1 head. We are in this state if the previous flip was a head (but the one before wasn’t)
- 2 heads. This is the absorbing state.

As I mentioned, you can solve this problem, and any similar problem, by computing the fundamental matrix of the Markov chain. The problem with doing this in an interview is that it requires you to invert a matrix, which is tedious and long-winded, with plenty of opportunities to get bogged down in algebra. Under pressure, I would struggle to do this without making any mistakes, especially (say) in a phone interview where your interviewer won’t easily be able to correct you as you go. Even if you can do this without breaking a sweat, the other reason to avoid this approach is that it doesn’t scale very well on paper - if we had the same question but for three heads in a row, we would have to invert a 4 by 4 matrix.
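For comparison, here is a quick sketch (mine, not something you would do in an interview) of what the fundamental matrix calculation looks like for this chain when a computer does the inversion, using numpy:

```python
import numpy as np

# Transient states: 0 = "no heads so far", 1 = "one head".
# Q[i, j] is the probability of moving from transient state i to transient state j.
Q = np.array([[0.5, 0.5],   # no heads: tails stays put, heads -> one head
              [0.5, 0.0]])  # one head: tails goes back, heads is absorbed

# Fundamental matrix (I - Q)^{-1}; its row sums give the expected
# number of steps to absorption from each transient state.
fundamental = np.linalg.inv(np.eye(2) - Q)
expected_steps = fundamental @ np.ones(2)
print(expected_steps)  # expected flips from "no heads" and "one head"
```

Trivial at a keyboard - the point is that the inversion step is exactly the part you want to avoid doing by hand under pressure.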

If we are in the no heads state, after the next flip we will be in the same state with probability \(1/2\) (if we get tails) and in the 1 head state with probability \(1/2\) (if we get heads). The key to the recursive approach is to break up the mean number of flips into the conditional means: \[ \begin{aligned} x_0 &:= \mathbb{E}[\textrm{Number of flips} \mid \textrm{No heads so far}] \\ x_1 &:= \mathbb{E}[\textrm{Number of flips} \mid \textrm{1 head}] \\ x_2 &:= \mathbb{E}[\textrm{Number of flips} \mid \textrm{2 heads}] = 0 \end{aligned} \]

The logic here is to break up the expected number of flips in a given state into \(\mathbb{E}[n \mid x_t = x ] = 1 + \sum_y \mathbb{E}[n \mid x_{t+1} = y] p(x_{t+1} = y \mid x_{t} = x)\) - the expected number of flips starting in a given (non absorbing) state is 1 + the expected number of flips from every possible subsequent state, weighted by the probability of ending up in that state.

For this specific problem, we can then formulate the simultaneous equations

\[ \begin{aligned} x_0 &= 1 + \frac{1}{2} x_0 + \frac{1}{2} x_1 \\ x_1 &= 1 + \frac{1}{2} 0 + \frac{1}{2} x_0 = 1 + \frac{1}{2} x_0 \end{aligned} \]

which gives, substituting for \(x_1\),

\[ \begin{aligned} x_0 &= 1 + \frac{1}{2} x_0 + \frac{1}{2} ( 1 + \frac{1}{2} x_0) \\ &= 1 + \frac{1}{2} x_0 + \frac{1}{2} + \frac{1}{4} x_0 \\ 4 x_0 &= 6 + 3x_0 \\ x_0 &= 6 \end{aligned} \]

Since we wanted the expected number of flips starting in the \(0\) state, this is what we actually care about, though we could also find \(x_1\) this way, of course. Hopefully you agree that this allowed us to get to the correct answer with a minimum of fuss. (You can easily check that the answer is correct by running a quick simulation.)
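For instance, a minimal simulation along these lines (my sketch, nothing to do with any particular firm) confirms the answer:

```python
import random

def flips_until_two_heads(rng):
    """Flip a fair coin until two heads appear in a row; return the flip count."""
    flips, run = 0, 0
    while run < 2:
        flips += 1
        if rng.random() < 0.5:  # heads
            run += 1
        else:                   # tails resets the run
            run = 0
    return flips

rng = random.Random(0)
trials = 200_000
estimate = sum(flips_until_two_heads(rng) for _ in range(trials)) / trials
print(estimate)  # should land close to the expected answer of 6
```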

## Exercise for the reader

How many times do you need to flip a fair coin before you see your first head, on average?

Suppose you have a fair six sided die. How many rolls do you need to get 6 three times in a row, on average?

You have a bag of sweets, in six different colours. All the colours are present in equal numbers, and you can consider the bag large enough that removing a sweet does not appreciably change the proportions of colours in the bag. Sequentially, you draw sweets from the bag and place them on the table in front of you. If you have, at any time, 2 sweets of the same colour in front of you, you eat them both. You continue doing this until you have exactly one of each colour of sweet on the table. How many sweets do you expect to eat?

This last one is significantly harder, but I assure you that it can be solved by applying the same technique - it’s just that there is a bit more careful thinking to do to choose the correct way to frame the problem, and which variables to take the expectation over.

# Recurrence Relations

The recursive technique can also be applied to Markov chains with an infinite number of states. If the recursion is of a particular type, this can often lead to a linear recurrence relation to solve:

You are playing a game. Every round, you flip a coin. If you get tails, you lose $1. If you get heads you win $2. At the start of the game, you have \(N\) dollars, and you must stop playing the game if you run out of money. What is the probability you never go bankrupt, if you play as long as you are able to?

If asked in an interview, you may well see the problem given with some concrete number instead of \(N\) - the fact I use \(N\) is a pretty big hint as to the correct approach.

The first part of this problem is very similar to the previous problem. Let \(f(n)\) be the probability of avoiding bankruptcy, given a current stake of \(n\) dollars. By applying the Markov nature of the problem, we must have that

\[f(n) = \frac{1}{2} f(n-1) + \frac{1}{2} f(n+2)\]

This is a similar principle to the one above - the probability of an event happening, given the current state, must be equal to the probability of it happening from each possible subsequent state, weighted by the probability of ending up in that state.

We can re-arrange this a little bit and re-label the indices to get

\[ f(n) - 2 f(n-2) + f(n-3) = 0 \]

This is a linear recurrence relation. Solving these is easy once you know how - rather like differential equations, the trick is to know the solution in advance. Linear recurrence relations always have solutions of the form \(f(n) = A r^n\). Substituting this into the relation above, we get the characteristic polynomial of the equation:

\[ r^3 - 2r + 1 = 0 \]

which factorises as \((r-1)(r^2 + r - 1) = 0\), and so has solutions \(r_0=1\), \(r_1=\frac{\sqrt{5}}{2} -\frac{1}{2} \approx 0.62\), \(r_2=-\frac{\sqrt{5}}{2} -\frac{1}{2} \approx -1.62\).

Linear recurrence equations obey a superposition principle ^{3}, and the general solution is therefore

\[ f(n) = A + B (\frac{\sqrt{5}}{2} -\frac{1}{2})^n + C (-\frac{\sqrt{5}}{2} -\frac{1}{2})^n \]

We need to find the values of the coefficients by applying the boundary conditions of the problem, which come from the constraints of the problem at hand. Remember that the \(f(n)\) are probabilities, and we therefore need \(0 \le f(n) \le 1\) for all \(n\). This condition implies \(C = 0\), since the \(C\) term grows without bound with \(n\) and alternates in sign. In addition, we must have that \(f(0) = 0\), since if we have no money we are certainly bankrupt, and substituting this into the above gives \(A=-B\). We must also have that \(f(n) \to 1\) in the limit \(n\to \infty\), which implies \(A = 1\) as the \(B\) term decays to zero with \(n\), so that in the end we have

\[ f(n) = 1 - (\frac{\sqrt{5}}{2} -\frac{1}{2})^n \]

which is the desired relationship. It is easy to see that it has intuitive behaviour, increasing rapidly to 1 with \(n\).
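You can check the formula against a simulation. One wrinkle: ‘never going bankrupt’ isn’t directly observable in finite time, so the sketch below (my own, with an assumed cut-off) treats surviving a long horizon as survival outright - justified here by the positive drift of the game:

```python
import random

R = (5 ** 0.5 - 1) / 2  # the decaying root of the characteristic polynomial

def survives(start, rng, horizon=500):
    """Play the +$2 / -$1 coin game from `start` dollars.

    Treats surviving `horizon` rounds as never going bankrupt - a
    truncation, justified by the positive drift of the game.
    """
    money = start
    for _ in range(horizon):
        if money == 0:
            return False
        money += 2 if rng.random() < 0.5 else -1
    return True

rng = random.Random(1)
trials = 5_000
estimates = {n: sum(survives(n, rng) for _ in range(trials)) / trials
             for n in (1, 2, 5)}
for n, est in estimates.items():
    print(n, est, 1 - R ** n)  # simulated vs predicted survival probability
```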

Many properties of Markov chains lead to linear equations - these will be linear recurrence equations if the relationship between adjacent states is translation invariant.

## Exercises for the reader

Consider a flea on a beam. The beam is exactly 100 cm long. Every second, the flea jumps either 1cm or 2cm to the left. How many distinct paths can the flea take from one end of the beam to the other?

This isn’t a Markov chain question, but it allows a very similar recursive attack to the one I just went over.

You play a game with an unfair coin, with probability \(p\) of coming up heads. Every round you flip the coin, winning $1 if it comes up heads and losing $1 if it comes up tails. What is the probability of avoiding bankruptcy if you start with $10 and play for as long as possible? For what values of \(p\) should you play the game?

You play a game where you roll a three-sided die. If it comes up 1, you lose $2. If it comes up 2, you win $1. If it comes up 3, you win $3. What is your probability of avoiding bankruptcy as a function of your starting wealth?

# Differential Equations

Finally, let’s consider a similar example in a continuous case.

I draw numbers \(x_1, x_2, \ldots\) i.i.d. from the uniform distribution on the unit interval. If \(x_i > x_{i-1}\), then I stop drawing numbers; otherwise I continue. How many draws should I expect to make?

The key here, similarly to the previous question, is to consider the expected number of draws given that the previous value was \(x_n = x\). Conditioned on this, the problem becomes Markov. Let’s define \(f(x) = \mathbb{E}[\textrm{number of subsequent draws} \mid x_n < x_{n-1}, x_n = x]\), that is, the expected number of draws given that the draw last round was \(x\) and that the game carries on, so \(x_n\) was less than \(x_{n-1}\). Obviously, in the other case, the number of subsequent draws is 0 deterministically. Applying a recursive decomposition, we must have that

\[ \begin{aligned} f(x) &= 1 + \int_0^1 \mathbb{E}[\textrm{number of subsequent draws} \mid x_{n-1} = x, x_n = y] dy \\ &= 1 + \int_0^x f(y) dy \end{aligned} \]

because for \(y > x\) the number of subsequent draws of the game is zero, as mentioned, and for \(y < x\) the expected number of draws is \(f(y)\), due to the Markov nature of the game. Defining \(F= \int f(x) dx\) as an antiderivative of \(f\) and applying the fundamental theorem of calculus, we have

\[ \begin{aligned} f(x) &= 1 + F(x) - F(0) \end{aligned} \]

Differentiating with respect to x gives us

\[ \begin{aligned} \partial_x f(x) &= f(x) \end{aligned} \]

which clearly implies that \(f\) is of the form \(f(x) = A e^x\), and we need to find the unknown constant \(A\). We can do this by considering the boundary condition that \(f(0)\) must be \(1\), since in this case \(x_{n+1} > x_n\) with certainty, and so the game will finish next round. This gives \(A = 1\), and so \(f(x) = e^x\). To find the expected number of draws from the start of the game, we only need to see that in effect the game starts with \(x_n = 1\), since \(x_1\) will be accepted no matter its value, so the expected number of draws is exactly \(f(1) = e\). This is a bit of a striking result, but it’s easily confirmed with a simulation.
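That simulation is a few lines (again, my own throwaway sketch):

```python
import random

def draws_until_increase(rng):
    """Draw uniform numbers until one exceeds its predecessor; return the draw count."""
    draws = 1
    prev = rng.random()
    while True:
        draws += 1
        x = rng.random()
        if x > prev:
            return draws
        prev = x

rng = random.Random(2)
trials = 200_000
estimate = sum(draws_until_increase(rng) for _ in range(trials)) / trials
print(estimate)  # should land close to e = 2.718...
```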

\(e\) often makes an appearance in the answer to these kinds of questions - it’s not so surprising once you see, as here, that the solution can often be expressed in terms of ordinary differential equations via a recursive argument.

## Exercises for the reader

In the scenario above, what is \(\mathbb{E}[\min_i x_i] = \mathbb{E}[x_{n-1}]\)?

You are playing a game called ‘continuous blackjack’ against the house. The game goes as follows - you draw values from the uniform distribution \(U(0, 1)\). You can draw as many as you like, and you get to see the value you draw before deciding whether to draw the next (‘hit’) or stick with your current value. Your score is the sum of the values which you drew. If your score is above 1, you are ‘bust’ and the house wins. Once you stick, the house plays in the same way as you. You win if your score is above the house’s. How should you play this game?

What is the optimal strategy in the game above if there are \(n\) players after you, rather than just the dealer?

# More complicated situations

I think that these three techniques - simultaneous equations, linear recurrence relations, and linear ODEs - are the most ‘bread and butter’ way to deal with this kind of problem, and they are the ones that you will frequently see. It’s definitely worth being able to solve these three kinds of problem without too much difficulty if you are preparing for this kind of interview; once you’ve done a few questions like this, you should be able to quickly get a feel for what kinds of problems are reducible to this kind of approach.

This technique of decomposition into sub-problems is very generally applicable, and you will also see it in computer algorithms (‘dynamic programming’) so learning to spot this kind of relationship can also come in handy in programming type questions.

If an interviewer is feeling particularly malevolent, they may ask a pen-and-paper question where you need to use this kind of decomposition, but the solution technique required is a bit more complicated and bespoke. An example is the following problem:

Consider the following procedure.

```
def count_calls(N):
    n = N
    c = 0
    while True:
        c += 1
        n = randint(n)  # assume a uniform draw from {0, 1, ..., n - 1}
        if n == 0:
            break
    return c
```

Clearly, the returned value of \(c\) is the number of calls to randint we make up to and including the first draw of 0. If \(N = 10^{10}\), what is \(\mathbb{E}[c]\)?

You should now, of course, be able to spot that \(f(n) = \mathbb{E}[c \mid N = n]\) is the quantity we want to attack, and use the recursive approach to express this in terms of itself. However, in this case we will find that \(f(n)\) depends on \(f(i)\) for all \(i < n\), rather than forming a nice discrete recurrence relation that can be solved using the technique above. A good trick in this kind of situation is to try to compute \(f(n) - f(n-1)\), and see if you can replace the sums in this expression with evaluations of \(f\). This question is a good example of one where the recursive attack is a good starting point, but you still need to do some clever manipulations to get to a nice answer, rather than simply grinding the handle. I will leave it to the reader to work through the details of this question, though I think the answer is on Math Overflow somewhere.
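Without giving away the closed form, a Monte Carlo estimate is easy to set up as a sanity check on whatever you derive. This sketch (mine) assumes, as in the comment above, that randint(n) draws uniformly from \(\{0, 1, \ldots, n-1\}\), and uses a smaller \(N\) to keep the runtime down:

```python
import random

def calls_until_zero(N, rng):
    """Run the procedure above: count the draws made up to and including the first 0."""
    n, c = N, 0
    while True:
        c += 1
        n = rng.randrange(n)  # uniform on {0, 1, ..., n - 1}
        if n == 0:
            return c

rng = random.Random(3)
trials = 20_000
estimate = sum(calls_until_zero(10 ** 6, rng) for _ in range(trials)) / trials
print(estimate)  # compare against your closed-form answer for N = 10^6
```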

Another example of a more ‘bespoke’ kind of solution is the following

A drunken pirate is standing on a plank. At every timestep, he takes a step exactly 1 foot forwards or backwards with equal probability. He starts at one end of the plank. If the plank is \(N\) feet long, how many steps, on average, does he make before he falls off either end of the plank?

We can attack this problem with the recursive method, rather than directly. To see this, it’s first worth computing the probability of falling off each end of the plank. Let’s label the pirate’s starting position \(x_0\) on a plank of length \(N\) as 0, so that -1 and \(N\) are the absorbing states, and 0, 1, …, \(N-1\) are the transient states. Since this is a random walk with equal probability of going in either direction, we can see that it’s a ‘martingale’ - for any \(t > T\), \(\mathbb{E}[x_t \mid x_T] = x_T\). We can use this, together with the fact that the only possible end states are \(x_T = -1\) and \(x_T = N\), to see that, if \(T\) is the time that the pirate ends up in either of these two states,

\[\mathbb{E}[x_T \mid x_0] = -1 \cdot p(x_T = -1) + N \cdot p(x_T = N) = x_0 = 0\]

by the optional stopping theorem. ^{4} We must, therefore, have that \(p(x_T = -1) = N/(N+1)\). We can use this to express the expected number of steps as follows. Let \(f(n)\) be the expected number of steps we spend on a plank of length \(n\), given that we start in the leftmost position.

`|*(--...N...--)|`

In the diagram above, the star is the current position of the pirate. What are the possibilities? With probability \(1/2\), we take a step to the left, ending the walk. With probability \(1/2\), we take a step to the right. We are now in the following situation

`|-(*--...N-1...--)|`

But now, consider that the amount of time we spend in the region in brackets is exactly the same problem as before, but for a plank of length \(n-1\)! We must therefore spend an average of \(f(n-1)\) steps in the region enclosed in brackets, before either falling off the right side with probability \(1/n\), or going back to the leftmost position on the plank with probability \((n-1)/n\). If we return to the leftmost position, then we are back to our original state.

This gives us the following recurrence relation;

\[ \begin{aligned} f(n) &= 1 + \frac{1}{2}(f(n-1) + \frac{n-1}{n} f(n)) \\ \implies f(n) &= \frac{n}{1 + n} (f(n-1) + 2) \end{aligned} \]

If there is a general, easy way to attack recurrence relations with non-constant coefficients, I’m not sure what it is (there is a general solution but it’s a bit unwieldy, and I don’t think memorising it is worth the effort, though if you do remember it feel free to use it). However, we can try working a few of these out and see if we can spot a trend. We have that \(f(1) = 1\), since any step will remove the pirate from the plank. Substituting in, we get that \(f(2) = 2\), \(f(3) = 3\) and so on. It seems as though the solution is simply \(f(n) = n\). Once we have this guess, it is easy to show it is correct by induction; we know that \(f(n) = n\) for \(n = 1\), and it’s easy to show \(f(n) = n\) given that \(f(n-1) = n-1\) by substitution into our recurrence relation, so it must be true for all \(n\).
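Iterating the recurrence numerically (a throwaway check, not something you’d need in the interview itself) makes the pattern jump out:

```python
def plank_expectations(n_max):
    """Iterate f(n) = n/(n+1) * (f(n-1) + 2) upwards from f(1) = 1."""
    f = [0.0, 1.0]  # f[0] is an unused placeholder; f[1] = 1
    for n in range(2, n_max + 1):
        f.append(n / (n + 1) * (f[n - 1] + 2))
    return f

values = plank_expectations(10)
print(values[1:])  # the pattern f(n) = n shows up immediately
```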

These last two examples I think are fairly representative of the harder kind of problem that can be solved with this approach - even after you apply the recursive technique to get an equation to solve, you still need to deploy a bit of lateral thinking or clever manipulation to get a final answer, rather than just grinding the handle on one of the three approaches above. The old staple of working out the first few terms of a recurrent sequence by hand to see if this lets you spot the pattern is always valuable - as in the case above, it’s easy to *show* that \(f(n) = n\) solves the problem, but it’s not very easy to see it just by staring at the recurrence relation unless you work out a few terms to spot the pattern.

If you don’t get that, at least you will probably get partial credit, or some help from an interviewer, if you can get the recurrence relation correct even if you don’t quite get to a solution. While it’s obviously good if you can, you don’t actually have to ace every single question without help to get past these interviews. Again, that’s why I think the recursive approach is so important to master, since it gives you a useful way to *begin* to attack a very wide variety of problems. It looks (and feels!) much better to be able to make a start on a question, then get stuck, than to have no idea at all what to do.

If you ever see a problem like this in the wild, I hope that this guide has been helpful.

After all, who wants to read interview advice from someone who got rejected?↩

I’m fairly certain that the first time I saw this method explained was in the Jane Street interview guide. This guide is quite interesting, though not good enough to prevent me screwing up an interview with them a couple of years ago.↩

If this isn’t obvious to you, convince yourself by substituting \(f(x) + g(x)\) into the recurrence and checking that it is a solution whenever \(f\) and \(g\) are.↩

If you aren’t comfortable with the optional stopping theorem or Martingales, you could also show this using a discrete recurrence like the one above. But this is a bit more long-winded.↩