Lisbon Accounting and Business School – Polytechnic University of Lisbon
These slides are a free translation and adaptation of the slide deck for Estatística I by Prof. Sandra Custódio and Prof. Teresa Ferreira from the Lisbon Accounting and Business School - Polytechnic University of Lisbon.
A r.v. \(X\) follows a uniform distribution in \([a,b]\subset\mathbb{R}\) with \(-\infty<a<b<\infty\), if its probability density is given by:
\[ f_X(x)=\begin{cases} \frac{1}{b-a} & a\leq x \leq b\\ 0 & \text{otherwise} \end{cases} \]
We write \(X\sim U[a,b]\)
The cumulative distribution function is given by:
\[ F_X(x)=\begin{cases} 0 & x< a\\ \frac{x-a}{b-a} & a\leq x < b\\ 1 & x\geq b \end{cases} \]
Note that this distribution is symmetric about \(\frac{a+b}{2}\), and its mean and variance are:
\[ E[X]=\frac{a+b}{2} \qquad Var[X]=\frac{(b-a)^2}{12} \]
The duration of short spots on a TV network is a r.v. \(X\) distributed \(U[5,12]\).
Let the r.v. \(X\) be distributed \(U[2,b]\) with \(b>2\). What value must \(b\) take to make \(P(3\leq X\leq 5)=0.4\)?
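A numerical check can help with this last exercise. The sketch below is illustrative and not part of the original slides; the helper `uniform_cdf` is a hypothetical name. It assumes two cases: if \(b\geq 5\) the whole interval \([3,5]\) lies inside the support, so \(\frac{5-3}{b-2}=0.4\) and \(b=7\); if instead \(3<b<5\) the interval is cut off at \(b\), so \(\frac{b-3}{b-2}=0.4\) and \(b=11/3\).

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """CDF of U[a, b]: 0 below a, (x - a)/(b - a) on [a, b), 1 from b onwards."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a = 2.0
candidates = {
    "b >= 5": a + (5 - 3) / 0.4,              # from (5 - 3) / (b - a) = 0.4
    "3 < b < 5": (3 - 0.4 * a) / (1 - 0.4),   # from (b - 3) / (b - a) = 0.4
}

# Verify each candidate by evaluating P(3 <= X <= 5) = F(5) - F(3)
probs = {
    case: uniform_cdf(5, a, b) - uniform_cdf(3, a, b)
    for case, b in candidates.items()
}

for case, b in candidates.items():
    print(f"{case}: b = {b:.4f}, P(3 <= X <= 5) = {probs[case]:.4f}")
```

Both candidates reproduce the target probability, so the answer depends on which range of \(b\) is intended.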
The exponential distribution is rooted in the Poisson distribution: it describes the waiting time between events generated by a Poisson process.
Nevertheless, we can apply the exponential distribution to many other phenomena.
A r.v. \(X\) is distributed exponentially, with parameter \(\lambda>0\), \(X\sim Exp(\lambda)\), if its probability density function is given by:
\[ f_X(x)=\begin{cases} \lambda e^{-\lambda x} & x\geq 0\\ 0 & x< 0 \end{cases} \]
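\(\lambda\) is the rate of the process (the expected number of events per unit of time or space), so \(1/\lambda\) can be interpreted as the expected waiting time (or space) between events.
The cumulative distribution function is: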
\[ F_X(x)=\begin{cases} 0 & x< 0\\ 1-e^{-\lambda x} & x\geq 0 \end{cases} \]
And its mean and variance are:
\[ E[X]=\frac{1}{\lambda} \qquad Var[X]=\frac{1}{\lambda^2} \]
Property
Lack of memory of the exponential distribution:
Let the r.v. \(X\sim Exp(\lambda)\), then:
\[P(X>x+h|X>x)=P(X>h)\]
with \(x,h>0\)
In survival applications of this distribution, this property states that the time left to live is independent of the time already lived.
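The lack-of-memory property can be checked numerically. The following is a minimal sketch using only the definition \(P(X>x)=e^{-\lambda x}\); the values of `lam`, `x` and `h` are arbitrary illustrations, not taken from the slides.

```python
import math

def exp_survival(x: float, lam: float) -> float:
    """P(X > x) = exp(-lam * x) for X ~ Exp(lam) and x >= 0."""
    return math.exp(-lam * x)

lam, x, h = 0.1, 4.0, 10.0  # illustrative values (assumption)

# P(X > x + h | X > x) = P(X > x + h) / P(X > x)
conditional = exp_survival(x + h, lam) / exp_survival(x, lam)
unconditional = exp_survival(h, lam)

print(conditional, unconditional)
```

The two printed values agree up to floating-point error, whatever positive values are chosen for `x` and `h`.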
In a factory, the execution time of a piece is a random variable distributed exponentially with an expected value of 5 minutes.
The time it takes until the first consultation, and between consultations, in the clinic of Dr. Shawn are independent and distributed exponentially with \(\lambda=0.1\).
What is the probability that no consultation occurs before the first 10 minutes?
The Gaussian, or Normal, distribution is one of the most widely used distributions and plays a key role in statistical inference.
The r.v. \(X\) is normally distributed, \(X\sim N(\mu,\sigma^2)\), if its density and cumulative distribution functions are:
\[ f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\\ P(X\leq a)=F(a)=\int_{-\infty}^a f(t)\,dt \]
Where \(E[X]=\mu\) and \(Var[X]=\sigma^2\)
If \(\mu=0\) and \(\sigma^2=1\) we call this the standard normal distribution, where \(Z\sim N(0,1)\).
Theorem
Let \(X\sim N(\mu,\sigma^2)\). Let \(Z=\frac{X-\mu}{\sigma}\), then \[Z\sim N(0,1)\]
That is, we can standardize any normally distributed r.v. The cumulative distribution function of \(Z\) is denoted \(\Phi\), and its density is \(\phi\).
\[ \phi(z)= \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}} \\ \Phi(z)=P(Z\leq z) = \int_{-\infty}^z \frac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}\, dt \]
How do we know if a r.v. follows a Normal distribution?
Consider the r.v. \(Z\sim N(0,1)\). Find \(P(Z\leq 1.65)\).
\[P(Z\leq 1.65) = F(1.65)=\Phi(1.65)=0.9505\]
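As an alternative to the table lookup, the same value can be obtained in Python. This is a minimal sketch using `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal N(0, 1)

p = Z.cdf(1.65)  # Phi(1.65) = P(Z <= 1.65)
print(round(p, 4))
```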
Warning
Consider the r.v. \(X\sim N(6,25)\). Find \(P(X\leq 12)\).
\[P(X\leq 12)=P\left(\frac{X-6}{5}\leq \frac{12-6}{5}\right)\]
Or
\[P(Z\leq 1.2)=\Phi(1.2)=0.8849\]
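The standardization step can be verified numerically. In the sketch below, note that `NormalDist` takes the standard deviation \(\sigma=5\) as its second argument, not the variance \(\sigma^2=25\) used in the \(N(\mu,\sigma^2)\) notation of these slides.

```python
from statistics import NormalDist

mu, sigma = 6.0, 5.0       # N(6, 25): variance 25, standard deviation 5
X = NormalDist(mu, sigma)
Z = NormalDist()           # standard normal N(0, 1)

direct = X.cdf(12)                        # P(X <= 12) computed on X
standardized = Z.cdf((12 - mu) / sigma)   # Phi(1.2) after standardizing

print(round(direct, 4), round(standardized, 4))
```

Both routes give the same probability, which is the content of the standardization theorem.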
If \(X\sim N(6,25)\), find \(P(6<X\leq 12)\)
\[ P(6<X\leq 12)=F(12)-F(6)= \] \[ P\left(\frac{6-6}{5}<\frac{X-6}{5}\leq \frac{12-6}{5}\right) \]
Or
\[ P(0<Z\leq 1.2)=\Phi(1.2)-\Phi(0)=\\0.8849-0.5=0.3849 \]
Let \(X\sim N(6,25)\). Find \(P(X\leq -4)\) and \(P(X\geq 16)\)
\[ P(X\leq -4)=P\left(\frac{X-6}{5}\leq \frac{-4-6}{5}\right)=\\ P(Z\leq -2)=\Phi(-2)=1-\Phi(2)=0.0228 \]
Check that \(P(X\leq -4)=P(X\geq 16)\)
Because of symmetry: \(P(X\leq \mu-k)=P(X\geq \mu+k)\) \(\forall k\in\mathbb{R}\)
\(P(X\leq -4)=P(X\leq 6-10)=P(X\geq 6+10)=P(X\geq 16)\)
Corollary
\(\Phi(-k)=P(Z\leq -k)=P(Z\geq k)=1-P(Z\leq k)=1-\Phi(k)\)
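The symmetry argument and its corollary can be checked with a short sketch, again assuming `statistics.NormalDist` with the standard deviation as second argument. The value `k = 2.0` matches the example above.

```python
from statistics import NormalDist

Z = NormalDist()        # N(0, 1)
X = NormalDist(6, 5)    # N(6, 25)

k = 2.0
left = Z.cdf(-k)        # Phi(-k)
right = 1 - Z.cdf(k)    # 1 - Phi(k)

lower_tail = X.cdf(-4)        # P(X <= mu - 10)
upper_tail = 1 - X.cdf(16)    # P(X >= mu + 10)

print(round(left, 4), round(right, 4))
print(round(lower_tail, 4), round(upper_tail, 4))
```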
Consider the distribution \(X\sim N(6,25)\)
Find \(P(0\leq X\leq 8)\)
\(P(0\leq X \leq 8)=P(0<X\leq 8)\) which is equivalent to
\[ P(-1.2<Z\leq 0.4)=\Phi(0.4)-\Phi(-1.2)=\\ \Phi(0.4)-\left[1-\Phi(1.2)\right]=\\ 0.6554-[1-0.8849]=0.5403 \]
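Interval probabilities of this kind all reduce to a difference of two CDF values. A minimal sketch, where `interval_prob` is a hypothetical helper name:

```python
from statistics import NormalDist

def interval_prob(dist: NormalDist, lo: float, hi: float) -> float:
    """P(lo < X <= hi) = F(hi) - F(lo) for a continuous r.v. X."""
    return dist.cdf(hi) - dist.cdf(lo)

X = NormalDist(6, 5)  # N(6, 25); second argument is the standard deviation

p_0_8 = interval_prob(X, 0, 8)     # P(0 <= X <= 8)
p_6_12 = interval_prob(X, 6, 12)   # P(6 < X <= 12)

print(p_0_8, p_6_12)
```

Because the table values are rounded to four decimals, the computed result can differ from the table-based one in the last digit.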
Consider the r.v. \(X\sim N(6, 25)\). Find \(P(|X-6|>10)\)
\[P(|X-6|>10)=1-P(|X-6|\leq 10)\]
\[=1-P(-10\leq X-6\leq 10)=1-P(-2<Z\leq 2)\]
\[=1-[\Phi(2)-\Phi(-2)]=1-[\Phi(2)-1+\Phi(2)]\]
\[=2-2\Phi(2)=2-2\times 0.9772=0.0456\]
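The two-sided tail probability can be computed either through the complement of the central interval or through the shortcut \(2-2\Phi(2)\). The sketch below evaluates both as a consistency check; variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(6, 5)   # N(6, 25)
Z = NormalDist()       # N(0, 1)

# Complement of the central interval: 1 - P(-4 <= X <= 16)
via_complement = 1 - (X.cdf(16) - X.cdf(-4))

# Symmetry shortcut: 2 - 2 * Phi(2)
via_symmetry = 2 - 2 * Z.cdf(2)

print(via_complement, via_symmetry)
```

The unrounded value is slightly below the table-based 0.0456, since \(\Phi(2)\) is rounded to 0.9772 in the table.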
Consider the r.v. \(X\sim N(6,25)\), find \(k\) such that \(P(X>k)=0.9\)
\(P(X>k)=0.9\Leftrightarrow 1-P(X\leq k)=0.9\)
Or
\[P(X\leq k)=0.10 \Leftrightarrow P\left(Z\leq \frac{k-6}{5}\right)=0.10\]
\[P\left(Z\leq \frac{k-6}{5}\right)=0.10\Leftrightarrow \Phi\left(\frac{k-6}{5}\right)=0.10\]
Because of symmetry we have \(P(Z\leq -z)=P(Z\geq z)\)
Using the table, \(P(Z\geq z)=0.1\) means \(\Phi(z)=0.9\), so \(z=\Phi^{-1}(0.9)=1.282\), and by symmetry \(\Phi(-1.282)=0.1\). Substituting we get:
\[\Phi(-1.282)=0.1\Leftrightarrow \frac{k-6}{5}=-1.282\] \[k=6-1.282\times 5 = -0.41\]
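Finding \(k\) is a quantile problem, so the inverse CDF gives it directly. A minimal sketch using `NormalDist.inv_cdf` from the standard library:

```python
from statistics import NormalDist

X = NormalDist(6, 5)   # N(6, 25)

# P(X > k) = 0.9  is equivalent to  P(X <= k) = 0.1
k = X.inv_cdf(0.10)

# The same value via the standard normal quantile: k = mu + sigma * z
z = NormalDist().inv_cdf(0.10)
k_from_z = 6 + 5 * z

check = 1 - X.cdf(k)   # should recover 0.9
print(round(k, 2), round(z, 3), round(check, 4))
```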
Theorem: Normal additivity
If \(X_1\sim N(\mu_1,\sigma_1^2)\) and \(X_2\sim N(\mu_2,\sigma_2^2)\) are independent, then for any \(a,b\in\mathbb{R}\) the linear combination \(T=aX_1+bX_2\) satisfies \[T\sim N(\mu_T,\sigma_T^2)\]
To find \(\mu_T\) and \(\sigma_T^2\) remember the properties of the mean and variance.
Let the independent r.v.s \(X\sim N(6,4)\) and \(Y\sim N(6,4)\), with \(T=0.5 X-Y\). Find \(\mu_T\) and \(\sigma_T^2\)
\[\mu_T=E[0.5X-Y]=0.5E[X]-E[Y]=\\ 0.5 \times 6 - 6 = -3\]
\[\sigma_T^2=V[0.5 X- Y]=V[0.5 X]+V[-Y]=\\ 0.5^2 V[X]+V[Y]= 0.25 \times 4 + 4 = 5\]
Find \(P(T>0)\)
\(T\sim N(-3, 5)\)
\[P(T>0)=P\left(Z>\frac{0+3}{\sqrt{5}}\right)=1-P(Z\leq 1.34)=\\ 1-\Phi(1.34)=0.0901\]
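The same calculation can be sketched with `statistics.NormalDist`, which supports multiplication by a constant and addition or subtraction of two instances. That arithmetic treats the two variables as independent, which is the assumption used in the variance calculation above.

```python
from statistics import NormalDist

X = NormalDist(6, 2)   # N(6, 4): standard deviation 2
Y = NormalDist(6, 2)   # N(6, 4), assumed independent of X

# Linear combination T = 0.5 X - Y of independent normals
T = X * 0.5 - Y

p = 1 - T.cdf(0)       # P(T > 0)
print(T.mean, T.variance, p)
```

The probability is computed without rounding \(3/\sqrt{5}\) to 1.34, so it is marginally smaller than the table-based 0.0901.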
Corollaries
Then:
Let \(X_i\sim N(120, 64)\), \(i=1,\dots,5\), be independent r.v.s representing the number of bank deposits made on each day of the week. Then \(T=X_1+\dots+X_5\) is the number of weekly deposits.
Find the probability that the weekly deposits exceed 620.
\(T\sim N(600, 320)\) because of the Normal additivity property.
\[P(T>620)=P\left(Z>\frac{620-5\times 120}{\sqrt{5}\times 8}\right)=\\1-P(Z\leq 1.12)=1-\Phi(1.12)=0.1314\]
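A short sketch of the weekly total, assuming the five daily counts are independent so that means and variances add. The loop uses the `+` operator that `statistics.NormalDist` defines for independent normals.

```python
from statistics import NormalDist

# Five daily deposit counts, each N(120, 64): standard deviation 8
days = [NormalDist(120, 8) for _ in range(5)]

# Sum of independent normals: means add, variances add
T = days[0]
for day in days[1:]:
    T = T + day

p = 1 - T.cdf(620)     # P(T > 620)
print(T.mean, T.variance, p)
```

The result differs from the table-based 0.1314 only in the later decimals, because the table lookup rounds the standardized value to 1.12.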
| df (rows) / upper-tail probability \(\alpha/2\) (columns) | 0.50 | 0.25 | 0.20 | 0.15 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.001 | 0.0005 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.000 | 1.000 | 1.376 | 1.963 | 3.078 | 6.314 | 12.71 | 31.82 | 63.66 | 318.31 | 636.62 | 
| 2 | 0.000 | 0.816 | 1.061 | 1.386 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 | 22.327 | 31.599 | 
| 3 | 0.000 | 0.765 | 0.978 | 1.250 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 | 10.215 | 12.924 | 
| 4 | 0.000 | 0.741 | 0.941 | 1.190 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 | 7.173 | 8.610 | 
| 5 | 0.000 | 0.727 | 0.920 | 1.156 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 5.893 | 6.869 | 
| 6 | 0.000 | 0.718 | 0.906 | 1.134 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 | 5.208 | 5.959 | 
| 7 | 0.000 | 0.711 | 0.896 | 1.119 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 | 4.785 | 5.408 | 
| 8 | 0.000 | 0.706 | 0.889 | 1.108 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 | 4.501 | 5.041 | 
| 9 | 0.000 | 0.703 | 0.883 | 1.100 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 | 4.297 | 4.781 | 
| 10 | 0.000 | 0.700 | 0.879 | 1.093 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 | 4.144 | 4.587 | 
| 11 | 0.000 | 0.697 | 0.876 | 1.088 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 | 4.025 | 4.437 | 
| 12 | 0.000 | 0.695 | 0.873 | 1.083 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 | 3.930 | 4.318 | 
| 13 | 0.000 | 0.694 | 0.870 | 1.079 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 | 3.852 | 4.221 | 
| 14 | 0.000 | 0.692 | 0.868 | 1.076 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 | 3.787 | 4.140 | 
| 15 | 0.000 | 0.691 | 0.866 | 1.074 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 | 3.733 | 4.073 | 
| 16 | 0.000 | 0.690 | 0.865 | 1.071 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 | 3.686 | 4.015 | 
| 17 | 0.000 | 0.689 | 0.863 | 1.069 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 | 3.646 | 3.965 | 
| 18 | 0.000 | 0.688 | 0.862 | 1.067 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 | 3.610 | 3.922 | 
| 19 | 0.000 | 0.688 | 0.861 | 1.066 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 | 3.579 | 3.883 | 
| 20 | 0.000 | 0.687 | 0.860 | 1.064 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 | 3.552 | 3.850 | 
| 21 | 0.000 | 0.686 | 0.859 | 1.063 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 | 3.527 | 3.819 | 
| 22 | 0.000 | 0.686 | 0.858 | 1.061 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 | 3.505 | 3.792 | 
| 23 | 0.000 | 0.685 | 0.858 | 1.060 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 | 3.485 | 3.768 | 
| 24 | 0.000 | 0.685 | 0.857 | 1.059 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 | 3.467 | 3.745 | 
| 25 | 0.000 | 0.684 | 0.856 | 1.058 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 | 3.450 | 3.725 | 
| 26 | 0.000 | 0.684 | 0.856 | 1.058 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 | 3.435 | 3.707 | 
| 27 | 0.000 | 0.684 | 0.855 | 1.057 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 | 3.421 | 3.690 | 
| 28 | 0.000 | 0.683 | 0.855 | 1.056 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 | 3.408 | 3.674 | 
| 29 | 0.000 | 0.683 | 0.854 | 1.055 | 1.311 | 1.699 | 2.045 | 2.462 | 2.756 | 3.396 | 3.659 | 
| 30 | 0.000 | 0.683 | 0.854 | 1.055 | 1.310 | 1.697 | 2.042 | 2.457 | 2.750 | 3.385 | 3.646 | 
| 40 | 0.000 | 0.681 | 0.851 | 1.050 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 | 3.307 | 3.551 | 
| 60 | 0.000 | 0.679 | 0.848 | 1.045 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 | 3.232 | 3.460 | 
| 80 | 0.000 | 0.678 | 0.846 | 1.043 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 | 3.195 | 3.416 | 
| 100 | 0.000 | 0.677 | 0.845 | 1.042 | 1.290 | 1.660 | 1.984 | 2.364 | 2.626 | 3.174 | 3.390 | 
| 1000 | 0.000 | 0.675 | 0.842 | 1.037 | 1.282 | 1.646 | 1.962 | 2.330 | 2.581 | 3.098 | 3.300 | 
| Z | 0.000 | 0.674 | 0.842 | 1.036 | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 | 3.090 | 3.291 | 
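The last row of the table (Z) holds the standard normal quantiles that leave the column's probability in the upper tail. As a minimal sketch, that row can be reproduced with `NormalDist.inv_cdf`; the other rows need the Student t distribution, which the Python standard library does not provide, so they are not reproduced here.

```python
from statistics import NormalDist

# Upper-tail probabilities from the table header
tail_probs = [0.50, 0.25, 0.20, 0.15, 0.10, 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005]

Z = NormalDist()  # N(0, 1)

# Quantile z with P(Z > z) = p, i.e. Phi(z) = 1 - p
z_row = [Z.inv_cdf(1 - p) for p in tail_probs]

for p, z in zip(tail_probs, z_row):
    print(f"{p:<7} {z:.3f}")
```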
In the clinic exercise, the r.v. \(X\) is the time until the first consultation, and \(X\sim Exp(0.1)\).
\[P(X>10)=1-P(X<10)=1-F(10)\]
Or
\[1-(1-e^{-0.1\times 10})=e^{-1}\approx 0.3679\]
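The same result can be reproduced from the exponential CDF. A minimal sketch, where `exp_cdf` is a hypothetical helper implementing \(F_X(x)=1-e^{-\lambda x}\) for \(x\geq 0\):

```python
import math

def exp_cdf(x: float, lam: float) -> float:
    """CDF of Exp(lam): 0 for x < 0, 1 - exp(-lam * x) otherwise."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-lam * x)

lam = 0.1                        # rate: 0.1 consultations per minute
p_none = 1 - exp_cdf(10, lam)    # P(X > 10): no consultation in the first 10 minutes

print(round(p_none, 4))
```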
Statistics I