Random sums of sines and random walks

John Cook, at his Probability Fact twitter feed (@ProbFact), asked (I’ve cleaned up the notation):

What is the expected amplitude for the sum of N sines with random phase? i.e. sum of \sin(x + \phi_i) where \phi_i ~ uniform[0, 2\pi]

Intuitively one expects something on the order of \sqrt{N}, since we're adding together what are essentially N independent random variables. It's not too hard to throw together a quick simulation, without even bothering with any trigonometry, and this was my first impulse. This code just picks the \phi_i uniformly at random and takes the maximum of f(x) = \sum_i \sin(x + \phi_i) over values of x that are multiples of \pi/100 in [0, 2\pi].

x = (0:200)*pi/100   # multiples of pi/100 covering [0, 2*pi]
n = 1:100            # numbers of sines to sum
num.samples = 100    # random draws per value of n

# amplitude of f(x) = sum_i sin(x + phi_i), estimated as the
# maximum of the sum over the grid x
max.of.sines = function(phi){
  max(rowSums(outer(x, phi, function(x, y){sin(x + y)})))
}

# average that maximum over k independent draws of n random phases
mean.of.max = function(n, k){mean(replicate(k, max.of.sines(runif(n, 0, 2*pi))))}
averages = sapply(n, function(n){mean.of.max(n, num.samples)})

This is a bit tricky: in max.of.sines, the matrix output by outer has one row per grid point x and one column per phase, so each column gives the values of a single sine function \sin(x + \phi_i), and rowSums adds them together.
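To make that concrete, here is a tiny sketch (the grid points and phases below are just illustrative, not part of the simulation): with three values of x and two phases, outer produces a 3-by-2 matrix, and rowSums collapses it to the value of the sum at each grid point.

xs = c(0, pi/2, pi)                                # three illustrative grid points
phis = c(0, pi/4)                                  # two illustrative phases
m = outer(xs, phis, function(x, y){sin(x + y)})    # 3-by-2 matrix of sine values
rowSums(m)                                         # sin(x) + sin(x + pi/4) at each grid point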

We can then plot the resulting averages and fit a model y^2 \sim Cx. I get C \approx 0.7872 from my simulation, which is close enough to \pi/4 \approx 0.7854 to ring a bell:


library(ggplot2)   # needed for qplot and stat_function

# fit averages^2 = C*n, with no intercept
C = lm(averages^2 ~ n + 0)$coefficients

qplot(n, averages, xlab="n", ylab="mean", main="means for 100 samples") +
  stat_function(fun = function(x){sqrt(C*x)})

[Figure: the sample means plotted against n, with the fitted curve y = \sqrt{Cn}; title "means for 100 samples"]

At this point we start thinking theory. If you're me and you haven't looked at a trig function in a while, you start at the Wikipedia page, and discover that it actually does all the work for you:

\sum_i a_i \sin (x + \delta_i) = a \sin (x+\delta)

where

a^2 = \sum_{i,j} a_i a_j \cos (\delta_i - \delta_j).

That is, the sum of a bunch of sinusoids with a common period is a single sinusoid with the same period, and an amplitude easily calculated from the amplitudes and phases of the original sinusoids. There's a formula for \delta as well, but it's not relevant here.
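As a quick numerical sanity check (a sketch, with arbitrarily chosen amplitudes and phases), we can compare the amplitude this formula predicts against the observed maximum of the sum:

set.seed(1)
amps = runif(5)              # arbitrary amplitudes a_i
delta = runif(5, 0, 2*pi)    # arbitrary phases delta_i
# predicted amplitude from a^2 = sum_{i,j} a_i a_j cos(delta_i - delta_j)
a.pred = sqrt(sum(outer(amps, amps) * cos(outer(delta, delta, "-"))))
f = function(x){sum(amps * sin(x + delta))}
a.obs = max(sapply(seq(0, 2*pi, length.out = 10000), f))
c(predicted = a.pred, observed = a.obs)   # should agree to several decimal places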
In our case all the a_i are 1 and so we get

a^2 = \sum_{i, j} \cos (\delta_i - \delta_j)

If you take the expectation of both sides, and recognize that E[\cos (\delta_i - \delta_j)] is 1 if i = j (it's \cos(0)) and 0 if i \neq j (just the average of the cosine function over a full period), then you learn E(a^2) = N where N is the number of summands. That agrees with our original guess, and is enough to prove that E(a) \le \sqrt{N} by Jensen's inequality.
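We can also check E(a^2) = N by simulation (a sketch; the choices N = 10 and 10000 replications are arbitrary):

N = 10
a.squared = replicate(10000, {
  delta = runif(N, 0, 2*pi)
  sum(cos(outer(delta, delta, "-")))   # the double sum over pairs (i, j)
})
mean(a.squared)                        # should be close to N = 10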

To get the exact value of E(a) we can expand on David Radcliffe's comment: "Same as mean dist from origin after N unit steps in random directions. Agree with sqrt(N*pi/4)". In particular, consider a random walk in the complex plane, where the steps are given by e^{i\theta_j} with \theta_j uniform on the interval [0, 2\pi). Its position after N steps is

S_N = \sum_{j=1}^N e^{i\theta_j} = \sum_{j=1}^N (\cos \theta_j + i \sin \theta_j)

and so, breaking up into the real and imaginary components,

|S_N|^2 = \left( \sum_{j=1}^N \cos \theta_j \right)^2 + \left( \sum_{j=1}^N \sin \theta_j \right)^2.

Rewriting the squared sums as double sums gives

|S_N|^2 = \left( \sum_{j=1}^N \sum_{k=1}^N \cos \theta_j \cos \theta_k \right) + \left( \sum_{j=1}^N \sum_{k=1}^N \sin \theta_j \sin \theta_k \right)

and combining the double sums gives

|S_N|^2 = \sum_{j=1}^N \sum_{k=1}^N (\cos \theta_j \cos \theta_k + \sin \theta_j \sin \theta_k)

and by the formula for the cosine of a difference we get

|S_N|^2 = \sum_{j=1}^N \sum_{k=1}^N \cos (\theta_j - \theta_k)

which is exactly the a^2 given above. So the amplitude of our sum of sines is just the distance from the origin in a two-dimensional random walk!
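This identity is also easy to check numerically (a sketch; the eight steps are an arbitrary choice):

theta = runif(8, 0, 2*pi)
Mod(sum(exp(1i*theta)))^2            # |S_N|^2 computed directly
sum(cos(outer(theta, theta, "-")))   # the double sum; equal up to rounding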

It just remains to show that the expected distance from the origin of the random walk with unit steps in random directions after N steps is \sqrt{\pi N/4}. A good heuristic demonstration is as follows: clearly the distribution of the position (X, Y) is rotationally invariant, i.e. symmetric around the origin. The position X is the sum of N independent variables, each of which is distributed like the cosine of a uniformly chosen angle; that is, each has mean \mu = {1 \over 2\pi} \int_0^{2\pi} \cos \theta \: d\theta = 0 and variance \sigma^2 = {1 \over 2\pi} \int_0^{2\pi} \cos^2 \theta \: d \theta - \mu^2 = 1/2. So the X-coordinate after N steps is approximately normally distributed with mean 0 and variance N/2. The overall distribution, being rotationally symmetric with normal marginals, ought to be approximately jointly normal, with X and Y both having mean 0 and variance N/2 and uncorrelated; then \sqrt{X^2 + Y^2} is known to be Rayleigh-distributed with scale \sqrt{N/2}, and the mean of that Rayleigh distribution is \sqrt{N/2} \cdot \sqrt{\pi/2} = \sqrt{\pi N/4}, which finishes the proof modulo that one nasty fact.
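As one last simulation (a sketch; N = 100 and 10000 walks are arbitrary choices), the mean distance matches \sqrt{\pi N/4} well:

N = 100
distances = replicate(10000, {
  theta = runif(N, 0, 2*pi)
  sqrt(sum(cos(theta))^2 + sum(sin(theta))^2)   # |S_N| for one walk
})
c(simulated = mean(distances), predicted = sqrt(pi*N/4))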

2 thoughts on "Random sums of sines and random walks"

  1. Hello Mr. Lugo. I cannot find any email contact for you on this site, so I'm trying to ask you a question concerning "Random sums of sines…" this way. Your blog is hugely interesting to me, since for several weeks already I have been trying to explain to myself why the expected value for the sum of N sines of random angles should be on the order of sqrt(N). I encountered this problem while trying to understand a physical theory behind some nanocrystalline materials. I'm not very good at math or English, and maybe this is why I didn't understand the last sentence of this post: "…, which finishes the proof modulo that one nasty fact." This sentence sounds a little frightening to me, although I don't understand it fully. So let me ask: what did you mean by the formulation "modulo that one nasty fact"? Which fact did you mean? And why did you call this fact nasty? Did you mean some weak point of your derivation?
