Central limit theorem: Exercises

Introduction

The central limit theorem states that for a sequence of independent and identically distributed random variables, $X_1$, $X_2$, $\dots$, the following holds: $$ \lim_{n \rightarrow \infty} P\left( \frac{S_n/n - \mu}{\sigma/\sqrt{n}} \le z \right) = \Phi(z) $$ where $n$ is the number of random variables, $S_n = X_1 + X_2 + \dots + X_n$ is their sum, and $\Phi$ is the cumulative distribution function of the standard normal distribution. In the above expression $\mu=\mathbb{E}(X)$ and $\sigma^2=\textrm{var}(X)$, so this expression cannot be used if your random variable has either $\mathbb{E}(X)=\infty$ or $\textrm{var}(X)=\infty$. In doing the following exercises you will learn how to extract probabilistic information about sums of independent and identically distributed random variables using the central limit theorem.
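The statement of the theorem can be illustrated with a small simulation. The sketch below is only an illustration under assumed choices that are not part of the exercises: the variables are taken to be uniform on $(0,1)$, and the function names, the sample size of 50, the number of trials and the seed are all arbitrary. It standardises the sample mean many times and compares the fraction of values at or below $z$ with $\Phi(z)$.

```python
import random
from statistics import NormalDist


def standardised_mean(n, rng):
    """Draw n uniform(0,1) variables and return (S_n/n - mu) / (sigma/sqrt(n))."""
    mu = 0.5                 # mean of a uniform(0,1) variable
    sigma = (1 / 12) ** 0.5  # standard deviation of a uniform(0,1) variable
    s_n = sum(rng.random() for _ in range(n))
    return (s_n / n - mu) / (sigma / n ** 0.5)


def empirical_cdf_at(z, n=50, trials=20000, seed=0):
    """Fraction of simulated standardised means that are <= z."""
    rng = random.Random(seed)
    hits = sum(standardised_mean(n, rng) <= z for _ in range(trials))
    return hits / trials


z = 1.0
estimate = empirical_cdf_at(z)   # simulated left-hand side of the theorem
exact = NormalDist().cdf(z)      # Phi(z) from the standard normal distribution
print(f"empirical: {estimate:.3f}, Phi({z}): {exact:.3f}")
```

The two printed numbers should be close, and the agreement should improve as `n` and `trials` are increased.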

Example problems

Problem 1

Substituting the numbers from the question into the central limit theorem gives: $$ x = \frac{ S_n / n - \mu }{ \sigma / \sqrt{n} } = \frac{ 140/100 - 1.2 }{ 1.0/\sqrt{100} } = \frac{1.4 - 1.2}{ 0.1 } = \frac{ 0.2 }{0.1} = 2 \nonumber $$ Thus: $$ P( N \le 140 ) \approx \Phi(x) = \Phi(2) \approx 0.9772 \nonumber $$
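This substitution can be checked numerically. The following is a minimal sketch that uses `statistics.NormalDist` from the Python standard library in place of the tables; the variable names are illustrative choices, and the numbers are the ones that appear in the working above.

```python
from math import sqrt
from statistics import NormalDist

n = 100       # number of random variables in the sum
mu = 1.2      # mean of a single variable
sigma = 1.0   # standard deviation of a single variable
total = 140   # threshold for the sum

# Standardise the threshold as in the central limit theorem
x = (total / n - mu) / (sigma / sqrt(n))

# Phi(x): probability that the sum is at most the threshold
prob = NormalDist().cdf(x)
print(f"x = {x:.3f}, probability = {prob:.4f}")
```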

Problem 2

We are given information on a random variable in the question. We can use this information to calculate the expectation as shown below: $$ \mathbb{E}(X) = [(-0.05) \times 0.3] + [0.0 \times 0.2] + [0.05 \times 0.5] = 0.2 \times 0.05 = 0.01 $$ We can also use this information to calculate the variance: $$ \begin{aligned} \mathbb{E}[(X - \mu)^2] &= (-0.05 - 0.01)^2 \times 0.3 + (0.0 - 0.01)^2 \times 0.2 + ( 0.05 - 0.01 )^2 \times 0.5 \\ & = 0.0019 \end{aligned} \nonumber $$ and the standard deviation, which is the square root of the variance: $$ \sigma = \sqrt{ \mathbb{E}[(X - \mu)^2] } = \sqrt{0.0019} = 0.0436 \nonumber $$ We now use the central limit theorem to determine how the share price changes. We want to look at what happens after 3 hours. Our random variable tells us how the share price changes each minute. What we are thus calculating is the sum, $s$, of 180 of these random variables. Substituting this into the central limit theorem we find that: $$ x = \frac{ S_n / n - \mu }{ \sigma / \sqrt{n} } = \frac{ 1.2 / 180 - 0.01 }{ 0.0436 / \sqrt{180} } \approx \frac{ -0.00333 }{ 0.00325 } \approx -1.025 $$ and thus: $$ P( s \le 1.2 ) \approx \Phi(x) = \Phi(-1.025) \approx 0.1527 \nonumber $$ We want the probability that $s>1.2$, however, which is just: $$ P( s > 1.2 ) = 1 - P( s \le 1.2 ) \approx 1 - 0.1527 \approx 0.847 \nonumber $$
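The whole calculation for this problem can be reproduced in a few lines. This is a sketch rather than a definitive implementation: the list `changes` and the other variable names are illustrative, and `statistics.NormalDist` stands in for the tables. Because the code does not round intermediate values, its final probability may differ slightly in the last decimal places from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

# (change in share price per minute, probability) pairs from the question
changes = [(-0.05, 0.3), (0.0, 0.2), (0.05, 0.5)]

mu = sum(value * prob for value, prob in changes)               # expectation
var = sum((value - mu) ** 2 * prob for value, prob in changes)  # variance
sigma = sqrt(var)                                               # standard deviation

n = 180          # minutes in 3 hours
threshold = 1.2  # total change of interest

# Standardise the threshold and evaluate the normal distribution function
x = (threshold / n - mu) / (sigma / sqrt(n))
p_below = NormalDist().cdf(x)
p_above = 1 - p_below

print(f"mu = {mu:.4f}, var = {var:.4f}, sigma = {sigma:.4f}")
print(f"x = {x:.3f}, P(s > {threshold}) = {p_above:.3f}")
```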

Problem 3

We will assume that each penalty attempt is the generation of a random number from a Bernoulli random variable, $X$. In other words, for every penalty we assume that the underlying probability of the footballer scoring is the same. The number of goals is the sum of many of these random numbers. That is to say it is: $$ S_n = X_1 + X_2 + X_3 + \dots + X_n \nonumber $$ We can thus estimate the parameter $p$, which is the mean of the Bernoulli random variable, by assuming the Law of Large Numbers holds: $$ p = \frac{ \textrm{Number of goals scored} }{ \textrm{Number of attempts} } = \frac{175}{213} \approx 0.822 $$ (a more correct method here is to note that $S_n$ is a binomially distributed random variable with parameter $p$ and to calculate the parameter using maximum likelihood). We can use the central limit theorem to get error bars for this estimate. We want to be able to state the following about the probability distribution for the sum of the Bernoulli random variables (penalty attempts): $$ \begin{aligned} P\left( \frac{ - \epsilon}{ \sigma / \sqrt{n} } \le \frac{ S_n / n - \mu }{\sigma/\sqrt{n}} \le \frac{\epsilon}{ \sigma / \sqrt{n} } \right) & = 0.95 \\ \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - \Phi\left( - \frac{\epsilon}{ \sigma / \sqrt{n} } \right) & = 0.95 \\ \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - \left[ 1 - \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) \right] & = 0.95 \\ 2\Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - 1 &=0.95 \end{aligned} $$ In the second line here we use the central limit theorem, which tells us that the standardised sample mean is approximately normally distributed when $n$ is large. In the third line we use the symmetry of the normal distribution, $\Phi(-z) = 1 - \Phi(z)$. Rearranging the above tells us that our error bars are given by: $$ \epsilon = \frac{\sigma}{\sqrt{n}} \Phi^{-1}\left( \frac{1.95}{2} \right) $$ The variance ($\sigma^2$) for a Bernoulli random variable is given by $\sigma^2=p(1-p)$. So in our case this is 0.146. Substituting this and the value of $n$ into the above gives: $$ \epsilon = \frac{\sqrt{0.146}}{\sqrt{213}} \Phi^{-1}\left( \frac{1.95}{2} \right) \approx 0.0262 \times 1.9600 \approx 0.0514 $$
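The formula for $\epsilon$ can be wrapped in a small helper. The function `error_bar` below is a hypothetical name introduced only for this sketch, and `NormalDist().inv_cdf` from the Python standard library plays the role of $\Phi^{-1}$. It takes the counts from this problem and a confidence level, and returns the estimate of $p$ together with $\epsilon$.

```python
from math import sqrt
from statistics import NormalDist


def error_bar(successes, attempts, confidence):
    """Return (p_hat, epsilon) for a Bernoulli parameter via the central limit theorem."""
    p_hat = successes / attempts       # estimate of p (the sample mean)
    sigma = sqrt(p_hat * (1 - p_hat))  # Bernoulli standard deviation
    # Solve 2*Phi(eps / (sigma/sqrt(n))) - 1 = confidence for eps
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return p_hat, z * sigma / sqrt(attempts)


p_hat, eps = error_bar(175, 213, 0.95)
print(f"p = {p_hat:.3f} +/- {eps:.4f}")
```

Passing a different confidence level, for example `0.90`, changes only the quantile that is looked up; the rest of the calculation is unchanged.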

Problem 4

We will assume that each serve is the generation of a random number from a Bernoulli random variable, $X$. In other words we will assume that the probability of getting each serve in is a constant. The number of services that landed in is thus the sum of many of these random numbers. That is to say it is: $$ S_n = X_1 + X_2 + X_3 + \dots + X_n \nonumber $$ We can thus estimate the parameter $p$, which is the mean of the Bernoulli random variable, by assuming the Law of Large Numbers holds: $$ p = \frac{ \textrm{Number of first serves in} }{ \textrm{Number of first serves} } = \frac{90}{141} \approx 0.638 $$ (a more correct method here is to note that $S_n$ is a binomially distributed random variable with parameter $p$ and to calculate the parameter using maximum likelihood).

We can use the central limit theorem to get error bars for this estimate. We want to be able to state the following about the probability distribution for the sum of the Bernoulli random variables (serve attempts): $$ \begin{aligned} P\left( \frac{ - \epsilon}{ \sigma / \sqrt{n} } \le \frac{ S_n / n - \mu }{\sigma/\sqrt{n}} \le \frac{\epsilon}{ \sigma / \sqrt{n} } \right) & = 0.90 \\ \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - \Phi\left( - \frac{\epsilon}{ \sigma / \sqrt{n} } \right) & = 0.90 \\ \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - \left[ 1 - \Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) \right] & = 0.90 \\ 2\Phi\left( \frac{\epsilon}{ \sigma / \sqrt{n} } \right) - 1 & =0.90 \end{aligned} $$ In the second line here we use the central limit theorem, which tells us that the standardised sample mean is approximately normally distributed when $n$ is large. In the third line we use the symmetry of the normal distribution, $\Phi(-z) = 1 - \Phi(z)$. Rearranging the above tells us that our error bars are given by: $$ \epsilon = \frac{\sigma}{\sqrt{n}} \Phi^{-1}\left( \frac{1.90}{2} \right) $$ The variance ($\sigma^2$) for a Bernoulli random variable is given by $\sigma^2=p(1-p)$. So in our case this is 0.231. Substituting this and the value of $n$ into the above gives: $$ \epsilon \approx \frac{\sqrt{0.231}}{\sqrt{141}} \Phi^{-1}\left( \frac{1.90}{2} \right) \approx 0.0405 \times 1.6449 \approx 0.0666 $$ This is not a sensible way of calculating the probability, as serving on championship point is nothing like serving during the normal course of the match: the individual random variables are not identically distributed.
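One way to see what the 90% error bar means is to simulate many matches under the idealised model and count how often the estimated proportion falls within $\epsilon$ of the true value. The sketch below assumes, purely for illustration, that every serve really is an independent Bernoulli trial with $p = 90/141$ (the very assumption questioned above); the number of simulated matches and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

n = 141        # first serves per simulated match
p = 90 / 141   # assumed true probability of a first serve landing in
sigma = sqrt(p * (1 - p))

# Half-width of the 90% interval from the central limit theorem
eps = sigma / sqrt(n) * NormalDist().inv_cdf(1.90 / 2)

rng = random.Random(1)
trials = 5000
inside = 0
for _ in range(trials):
    # One simulated match: count the serves that land in
    serves_in = sum(rng.random() < p for _ in range(n))
    if abs(serves_in / n - p) <= eps:
        inside += 1

coverage = inside / trials
print(f"epsilon = {eps:.4f}, simulated coverage = {coverage:.3f}")
```

Under these assumptions the simulated coverage should come out close to 0.90. It will not match exactly, because the number of serves in is discrete and the normal distribution is only an approximation.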

Problem 5

When we select a bird we are generating a Bernoulli random variable. This assumes that the genders of our selected birds are all independent.
We can use the central limit theorem to estimate the value of $n$ such that: \[ P\left( \frac{S_n/n - \mu}{\sigma/\sqrt{n}} \ge z \right) = 0.9 \] where $\frac{S_n}{n} = 0.5$ as we want at least half of the birds to be female, $\mu=0.55$ as this is the mean of our Bernoulli random variable and $\sigma = \sqrt{0.55(0.45)} = 0.497$ as this is the standard deviation of a Bernoulli random variable. Because $P(Z \ge z) = 1 - \Phi(z)$, we can find the value of $z$ from the tables by looking up the inverse function for the normal distribution. We want to find $z$ such that: \[ \Phi(z) = 0.1 \qquad \rightarrow \qquad z = -1.2816 \] For the minimal sample size we thus require $\frac{S_n/n - \mu}{\sigma/\sqrt{n}} = z$, i.e.: \[ \frac{0.5 - 0.55}{0.497/\sqrt{n} } = -1.2816 \quad \rightarrow \quad n = \left( \frac{-1.2816 \times 0.497}{-0.05}\right)^2 \approx 162.3 \] Rounding up to the next whole number of birds gives $n = 163$.
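The sample-size calculation can be reproduced and then checked by substituting the answer back into the normal approximation. This is a minimal sketch with illustrative variable names; it uses the unrounded standard deviation, so the intermediate value of $n$ differs slightly from a hand calculation, although the rounded-up sample size is the same.

```python
from math import ceil, sqrt
from statistics import NormalDist

p_female = 0.55    # mean of the Bernoulli variable (from the question)
target = 0.5       # we want at least this fraction of females
confidence = 0.9   # required probability

sigma = sqrt(p_female * (1 - p_female))  # Bernoulli standard deviation
z = NormalDist().inv_cdf(confidence)     # quantile with Phi(z) = 0.9

# Solve (p_female - target) / (sigma / sqrt(n)) = z for n, then round up
n_exact = (z * sigma / (p_female - target)) ** 2
n_min = ceil(n_exact)

# Check: probability that the sample fraction is at least the target
x = (target - p_female) / (sigma / sqrt(n_min))
p_at_least_half = 1 - NormalDist().cdf(x)

print(f"n = {n_exact:.1f} -> {n_min}, check: {p_at_least_half:.4f}")
```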

Contact Details

School of Mathematics and Physics,
Queen's University Belfast,
Belfast,
BT7 1NN

Email: g.tribello@qub.ac.uk