John Cook, at his Probability Fact twitter feed (@ProbFact), asked (I’ve cleaned up the notation):
What is the expected amplitude for the sum of
sines with random phase? i.e. sum of
where
Intuitively one expects something on the order of , since we’re adding together what are essentially
independent random variables. It’s not too hard to throw together a quick simulation, without even bothering with any trigonometry, and this was my first impulse. This code just picks the
uniformly at random, and takes the maximum of
for values of
which are multiples of
.
x = (0:200)/(2*Pi)
n = 1:100
num.samples = 100
max.of.sines = function(phi){
max(rowSums(outer(x, phi, function(x,y){sin(x+y)})))
}
mean.of.max = function(n, k){mean(replicate(k, max.of.sines(runif(n, 0, 2*pi))))}
averages = sapply(n, function(n){mean.of.max(n, num.samples)})
This is a bit tricky: in the matrix in max.of.sines, output by outer, each column gives the values of a single sine function , and rowSums adds them together.
We can then plot the resulting averages and fit a model . I get
from my simulation, which is close enough to
to ring a bell:
C = lm(averages^2~n+0)$coefficients
qplot(n, averages, xlab="n", ylab="mean", main="means for 100 samples") +
stat_function(fun = function(x){sqrt(C*x)})
At this point we start thinking theory. If you’re me and you haven’t looked at a trig function in a while, you start at the wikipedia page, and discover that it actually does all the work for you:
where
.
That is, the sum of a bunch of sinusoids with a period is a single sinusoid with the same period, and an amplitude easily calculated from the amplitudes and phases of the original sinusoids. There’s a formula for as well, but it’s not relevant here.
In our case all the are 1 and so we get
If you take the expectation of both sides, and recognize that is 1 if
(it’s
) and 0 if
(just the average of the cosine function), then you learn
where
is the number of summands. That agrees with our original guess, and is enough to prove that
by Jensen’s inequality.
To get the exact value of we can expand on David Radcliffe’s comment: “Same as mean dist from origin after N unit steps in random directions. Agree with sqrt(N*pi/4)”. In particular, consider a random walk in the complex plane, where the steps are given by
where
is uniform on the interval
. We can work out that its sum after
steps is
and so, breaking up into the real and imaginary components,
.
Rewriting the squared sums as double sums gives
and combining the double sums gives
and by the formula for the cosine of a difference we get
which is exactly the given above. So the amplitude of our sum of cosines is just the distance from the origin in a two-dimensional random walk!
It just remains to show that the expected distance from the origin of the random walk with unit steps in random directions after steps is
. A good heuristic demonstration is as follows: clearly the distribution of the position
is rotationally invariant, i. e. symmetric around the origin. The position
is the sum of
independent variables each of which is distributed like the cosine of a uniformly chosen angle; that is, it has mean
and variance
. So the
-coordinate after
steps is approximately normally distributed with variance
. The overall distribution, being rotationally symmetric with normal marginals, ought to be approximately jointly normal with
and
both having mean 0, variance
, and uncorrelated; then
is known to be Rayleigh-distributed, which finishes the proof modulo that one nasty fact.
Oh, nice bit of work. I’d missed the original problem posting, which is a shame, since it’s a good one.
Hallo Mr. Lugo. I can not find any email contact on you at this web so I’m trying to ask you a question concerning the “Random sums of sines..” this way. Your blog is hugely interesting for me, since for several weeks already, I was trying to explain to myself why the expected value for the sum of N sines of random angles should be kind of sqrt(N). I encountered this problem trying to understand a physical theory behind some nanocrystalline materials. I’m not any good in math nor in English and maybe this is why I didn’t understand the last sentence of this blog saying: “.., which finishes the proof modulo that one nasty fact. ” This sentence sounds a little frightening to me, although I don’t understand it fully. So let me ask what did you mean by the formulation “modulo that one nasty fact” ? Which fact did you mean? And why you called this fact nasty? Did you mean some weak point of your derivation?