Poisson processes appropriate for today

Say I have two independent Poisson processes of constant density λ on the unit interval [0, 1]. What’s the probability that the maximum of the first process is greater than the minimum of the second? (For reasons to be explained later, I’ll stipulate that the maximum of no numbers is negative infinity, and the minimum of no numbers is positive infinity.) Call this probability f(λ).

To answer this question by simulation, we can first sample two independent random Poisson(λ) variables (which is kind of annoying), M and N; then sample independent uniform random variables on [0, 1], $X_1, X_2, \ldots, X_M$ and $Y_1, Y_2, \ldots, Y_N$; and finally check if $\max(X_1, \ldots, X_M) > \min(Y_1, \ldots, Y_N)$.

For example, with λ = 3 we might have M = 3, N = 2; then perhaps $X_1 = 0.48, X_2 = 0.77, X_3 = 0.30; Y_1 = 0.07, Y_2 = 0.45$. The maximum of the $X_i$ is 0.77, which is greater than the minimum of the $Y_j$, 0.07.

A few lines of R suffice to run, say, ten thousand simulations for any given λ (which should get us f(λ) to within one percent or so):

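    simulate = function(lambda, n) {
      # max() of no points is -Inf and min() of no points is Inf (R warns, but returns them)
      x = replicate(n, max(runif(rpois(1, lambda), 0, 1)))
      y = replicate(n, min(runif(rpois(1, lambda), 0, 1)))
      sum(x > y) / n  # fraction of runs with max(X) > min(Y)
    }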

And from there we can generate data for a plot, say, by estimating f(λ) for λ = 0, 0.1, …, 6, with ten thousand simulations each:
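As a rough sketch of that step (not necessarily the code behind the original plot), the following repeats the `simulate` function so the snippet runs on its own, then evaluates it on the grid. The names `lambdas` and `results`, the seed, and the use of `suppressWarnings` (to silence R's warnings about `max` and `min` of empty vectors) are assumptions added here. The `exact` column uses the closed form $f(\lambda) = 1-e^{-\lambda}(\lambda+1)$ for comparison.

```r
# Monte Carlo estimate of f(lambda); same logic as simulate() in the post.
simulate <- function(lambda, n) {
  x <- replicate(n, max(runif(rpois(1, lambda), 0, 1)))
  y <- replicate(n, min(runif(rpois(1, lambda), 0, 1)))
  sum(x > y) / n
}

set.seed(1)  # arbitrary seed, only so the run is reproducible

# Grid lambda = 0, 0.1, ..., 6 with ten thousand simulations per value.
lambdas <- seq(0, 6, by = 0.1)
estimates <- suppressWarnings(sapply(lambdas, simulate, n = 10000))

# Collect the estimates next to the closed-form values.
results <- data.frame(
  lambda   = lambdas,
  estimate = estimates,
  exact    = 1 - exp(-lambdas) * (lambdas + 1)
)
print(head(results))
```

Calling `plot(results$lambda, results$estimate)` followed by `lines(results$lambda, results$exact, col = "red")` should then draw the simulated points with the analytic curve over them.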

An analytic solution is also possible, from standard facts about Poisson processes: the minimum of a density-λ Poisson process on [0, ∞) is exponentially distributed with rate λ. Suitably modifying this for the fact that we’re dealing with [0,1] and sometimes with maxima, and doing some double integrals, it turns out that $f(\lambda) = 1-e^{-\lambda}(\lambda+1)$, the red line in the plot above.
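One way the calculation might go, as a sketch assuming the two processes are independent: write $A$ for the maximum of the first process and $B$ for the minimum of the second (names introduced here for illustration). For $0 \le a \le 1$, $A \le a$ exactly when the first process has no points in $(a, 1]$, and $B > a$ exactly when the second has no points in $[0, a]$. Conditioning on the value of $A$ then collapses the double integral to a single one:

$$
\begin{aligned}
P(A \le a) &= e^{-\lambda(1-a)}, \qquad P(B < a) = 1 - e^{-\lambda a}, \\
f(\lambda) &= \int_0^1 \lambda e^{-\lambda(1-a)} \left(1 - e^{-\lambda a}\right) da \\
&= \int_0^1 \lambda e^{-\lambda(1-a)}\, da - \int_0^1 \lambda e^{-\lambda}\, da \\
&= \left(1 - e^{-\lambda}\right) - \lambda e^{-\lambda} = 1 - e^{-\lambda}(\lambda + 1).
\end{aligned}
$$

The case where the first process is empty (probability $e^{-\lambda}$, so $A = -\infty$) contributes nothing, since $-\infty$ is never greater than $B$.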

Finally, why would anyone care about this question? Imagine you run a web site, and you put a time stamp on each comment, showing the local time at your server when the comment was made. Then say someone comes by at 1:45 AM Pacific Daylight Time this morning and leaves a comment, and someone else comes along at 1:15 AM Pacific Standard Time (which is actually a half-hour later) and leaves a comment. The comments will appear to be in the wrong order, like they do here. Then f(λ) is the probability of this occurring, where λ is the number of comments per hour. Alternatively, it’s the probability that, given just the sequence of timestamps in local time, you can work out which are the last daylight-savings-time comment and the first standard-time comment. As I said here, this is an increasing function of λ, although I am not too lazy to work it out.

7 thoughts on “Poisson processes appropriate for today”

1. steven says:

Small typo: you use the LaTeX-style “infty” where you need HTML-style “infin”.

2. steven says:

Oh, and has anybody tried to check whether Metafilter comments are actually Poisson? Comments replying to one another could ruin the Poissonness, right?

3. Nice post. Technically, I believe it is easier to store dates in UTC and at presentation time render them in the server’s timezone (which may or may not follow the DST convention), rather than continuously tracking and updating $\lambda$ (as far as I know, comment volume follows a circadian pattern, with irregular peaks due to the occasional very popular article).

4. You can calculate it in your head without the double integrals. Consider two intervals, one running from 0 up to the minimum of the second process, the other running from 1 down to the maximum of the first process. These are independent and both have $\mathrm{Exp}(\lambda)$ distribution. The min of the second process is less than the max of the first process if the lengths of these two intervals sum to less than one. But the probability that the sum of two independent $\mathrm{Exp}(\lambda)$ variables is less than 1 is the probability that there are at least 2 points in a Poisson process of rate $\lambda$ on [0,1], which is just $P(X > 1)$ where $X$ is Poisson($\lambda$), i.e. $1 - e^{-\lambda} - \lambda e^{-\lambda}$.