Lisbon Accounting and Business School — Polytechnic University of Lisbon
Let \(X \sim N(\mu, \sigma)\), with \(\mu\) and \(\sigma\) unknown.
Let \(X_1, X_2, \ldots, X_n\) be a random sample from population \(X\).
Goal: How can we estimate \(\mu\), the population mean, from the sample \(X_1, X_2, \ldots, X_n\)?
The objective of point estimation is to use all available information from the sample in order to select a single value that is the most plausible for the (unknown) parameter to be estimated.
Let \(X\) be a random variable whose distribution has a known form, characterised by an unknown parameter \(\theta\).
If \(X_1, X_2, \ldots, X_n\) is a random sample of size \(n\) from that population, a point estimator of \(\theta\), denoted \(\hat{\theta}\), is any statistic \(T(X_1, X_2, \ldots, X_n)\) that takes values only in \(\Theta\) (the set of possible values for \(\theta\)).
Once a particular sample \(x_1, x_2, \ldots, x_n\) is observed, we obtain a point estimate for \(\theta\): \(T(x_1, x_2, \ldots, x_n)\).
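The distinction between an estimator (a function of the sample) and an estimate (its value on the observed data) can be illustrated with a minimal sketch. It takes the sample mean as the statistic \(T\); the observed values below are made up purely for illustration and are not data from the course.

```python
def sample_mean(sample):
    """Estimator: a rule T(X_1, ..., X_n) applicable to any sample."""
    return sum(sample) / len(sample)

# Hypothetical observed sample x_1, ..., x_5 (illustrative values only).
observed = [4.1, 3.8, 5.0, 4.6, 4.5]

# Estimate: the single number obtained by applying the rule to the data.
estimate = sample_mean(observed)
print(f"point estimate of mu: {estimate:.2f}")
```

The function plays the role of \(T(X_1, \ldots, X_n)\); the printed number plays the role of \(T(x_1, \ldots, x_n)\).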
There are specific methods that allow us to choose the estimator for each population parameter to be estimated, such as the maximum likelihood method, the method of moments, and others.
Because there may be several estimators for the same population parameter, we consider some properties that estimators should ideally possess, which serve as guidance on how to choose the “best” one.
Definition
An estimator \(\hat{\theta}\) is said to be unbiased (or centred) for the parameter \(\theta\) if and only if: \[E(\hat{\theta}) = \theta\]
An estimator \(\hat{\theta}\) is said to be biased for the parameter \(\theta\) if and only if: \[E(\hat{\theta}) \neq \theta\]
The bias of an estimator \(\hat{\theta}\) is measured by: \[\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta\]
Consider a population \(X\) with mean \(\mu\) and variance \(\sigma^2\).
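Given a random sample \(X_1, X_2, \ldots, X_n\) from \(X\), let \(\bar{X}\) be the sample mean, \(S'^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2\) the corrected sample variance and \(S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\) the uncorrected sample variance. The following hold for any distribution of \(X\) (provided \(E(X)\) and \(V(X)\) exist):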
\[E(\bar{X}) = \mu \quad \Rightarrow \quad \bar{X} \text{ is an unbiased estimator for } \mu\]
\[E(S'^2) = \sigma^2 \quad \Rightarrow \quad S'^2 \text{ is an unbiased estimator for } \sigma^2\]
\[E(S^2) = \frac{n-1}{n}\sigma^2 \neq \sigma^2 \quad \Rightarrow \quad S^2 \text{ is NOT an unbiased estimator for } \sigma^2\]
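These two results can be checked numerically with a small simulation. The sketch below assumes that \(S'^2\) uses the divisor \(n-1\) and \(S^2\) the divisor \(n\), which is consistent with the expectations stated above. The normal population, the parameter values, the seed, and the function name are arbitrary choices made for illustration.

```python
import random

def simulate_variance_estimators(n=5, reps=20000, mu=0.0, sigma=2.0, seed=0):
    """Average S'^2 (divisor n-1) and S^2 (divisor n) over many samples."""
    rng = random.Random(seed)
    total_corrected = 0.0
    total_uncorrected = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        total_corrected += ss / (n - 1)            # S'^2
        total_uncorrected += ss / n                # S^2
    return total_corrected / reps, total_uncorrected / reps

mean_s_prime2, mean_s2 = simulate_variance_estimators()
print(f"average S'^2: {mean_s_prime2:.3f}  (sigma^2 = 4)")
print(f"average S^2 : {mean_s2:.3f}  ((n-1)/n * sigma^2 = 3.2)")
```

With \(n = 5\) and \(\sigma^2 = 4\), the average of \(S'^2\) should land near 4 and the average of \(S^2\) near 3.2, up to simulation noise.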
Definition
Let \(\hat{\theta}\) and \(\tilde{\theta}\) be two unbiased estimators of the same parameter \(\theta\).
The estimator \(\hat{\theta}\) is said to be more efficient than \(\tilde{\theta}\) if:
\[\text{Var}(\hat{\theta}) < \text{Var}(\tilde{\theta}) \quad \Longleftrightarrow \quad \frac{\text{Var}(\hat{\theta})}{\text{Var}(\tilde{\theta})} < 1\]
Sufficient Condition for Consistency
A sufficient condition for an estimator \(\hat{\theta}\) to be consistent for the parameter \(\theta\) is:
\[\text{i)} \quad \lim_{n \to +\infty} E(\hat{\theta}) = \theta\]
\[\text{ii)} \quad \lim_{n \to +\infty} \text{Var}(\hat{\theta}) = 0\]
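A short sketch of why these two conditions are sufficient. It assumes the usual definition of consistency, namely convergence in probability of \(\hat{\theta}\) to \(\theta\), which is not stated explicitly above. Applying Markov's inequality to \((\hat{\theta} - \theta)^2\), for any \(\varepsilon > 0\):
\[P\left(|\hat{\theta} - \theta| \geq \varepsilon\right) \leq \frac{E\left[(\hat{\theta} - \theta)^2\right]}{\varepsilon^2} = \frac{\text{Var}(\hat{\theta}) + \left[\text{Bias}(\hat{\theta})\right]^2}{\varepsilon^2}\]
Under i) the bias tends to 0 and under ii) the variance tends to 0, so the right-hand side tends to 0 as \(n \to +\infty\).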
Consider a random sample \(X_1, X_2, \ldots, X_n\), \(n \geq 2\), drawn from a population \(X\) with mean \(\mu\) and variance \(\sigma^2\), both finite.
Consider the estimator \(T_1\) for \(\mu\):
\[T_1 = X_1 + \frac{1}{n-1}\sum_{i=2}^{n} X_i\]
a) True or False: This estimator is biased for \(\mu\) and its bias equals \(\mu\).
\[E(T_1) = E\!\left(X_1 + \frac{1}{n-1}\sum_{i=2}^{n}X_i\right) = E(X_1) + \frac{1}{n-1}\sum_{i=2}^{n}E(X_i)\]
\[= \mu + \frac{1}{n-1}\sum_{i=2}^{n}\mu = \mu + \frac{1}{n-1}(n-1)\mu = 2\mu \neq \mu\]
Therefore, \(T_1\) is a biased estimator for \(\mu\).
\[\text{Bias}(T_1) = E(T_1) - \mu = 2\mu - \mu = \mu\]
The statement is true.
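The result \(E(T_1) = 2\mu\) can be checked by simulation. This is a minimal sketch: the normal population, the values \(\mu = 3\), \(\sigma = 1\), \(n = 10\), the seed, and the function names are assumptions chosen for illustration, since the exercise holds for any population with finite mean and variance.

```python
import random

def t1(sample):
    """T1 = X_1 + (1/(n-1)) * sum of X_2, ..., X_n."""
    n = len(sample)
    return sample[0] + sum(sample[1:]) / (n - 1)

def mean_of_t1(n=10, reps=20000, mu=3.0, sigma=1.0, seed=1):
    """Monte Carlo approximation of E(T1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += t1(sample)
    return total / reps

MU = 3.0
approx_expectation = mean_of_t1(mu=MU)
approx_bias = approx_expectation - MU
print(f"approx E(T1)   : {approx_expectation:.3f}  (theory: 2*mu = 6)")
print(f"approx Bias(T1): {approx_bias:.3f}  (theory: mu = 3)")
```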
Now consider a second estimator \(T_2\) for \(\mu\):
\[T_2 = \frac{1}{2}T_1\]
b) True or False: \(T_2\) is a consistent estimator for \(\mu\).
We check whether \(T_2\) satisfies the two parts of the sufficient condition for consistency:
\[\text{i)} \quad \lim_{n \to +\infty} E(T_2) = \mu \qquad \text{ii)} \quad \lim_{n \to +\infty} \text{Var}(T_2) = 0\]
\[\lim_{n \to +\infty} E(T_2) = \lim_{n \to +\infty} E\!\left(\frac{1}{2}T_1\right) = \lim_{n \to +\infty} \frac{1}{2}E(T_1) = \lim_{n \to +\infty} \frac{1}{2} \times 2\mu = \mu \checkmark\]
Condition i) is satisfied.
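We need \(\lim_{n \to +\infty}\text{Var}(T_2)\). Let us first compute \(\text{Var}(T_1)\), using the fact that the observations of a random sample are independent, so the variance of the sum is the sum of the variances: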
\[\text{Var}(T_1) = \text{Var}\!\left(X_1 + \frac{1}{n-1}\sum_{i=2}^{n}X_i\right) = \sigma^2 + \frac{1}{(n-1)^2}\sum_{i=2}^{n}\sigma^2 = \sigma^2 + \frac{n-1}{(n-1)^2}\sigma^2 = \sigma^2 + \frac{\sigma^2}{n-1}\]
Then:
\[\lim_{n \to +\infty}\text{Var}(T_2) = \lim_{n \to +\infty}\frac{1}{4}\text{Var}(T_1) = \frac{1}{4}\lim_{n \to +\infty}\left(\sigma^2 + \frac{\sigma^2}{n-1}\right) = \frac{1}{4}\sigma^2 \neq 0\]
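Condition ii) is not satisfied, so the sufficient condition cannot be used to establish consistency. In fact, \(T_2 = \frac{1}{2}X_1 + \frac{1}{2(n-1)}\sum_{i=2}^{n}X_i\): the second term converges to \(\frac{\mu}{2}\), but the term \(\frac{1}{2}X_1\) keeps variance \(\frac{\sigma^2}{4}\) whatever the value of \(n\), so (for \(\sigma^2 > 0\)) \(T_2\) does not concentrate around \(\mu\) as \(n\) grows. Therefore \(T_2\) is not a consistent estimator for \(\mu\).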
The statement is false.
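A simulation makes the non-vanishing variance visible. The sketch below estimates \(\text{Var}(T_2)\) for increasing \(n\) and compares it with \(\frac{1}{4}\left(\sigma^2 + \frac{\sigma^2}{n-1}\right)\). The normal population, \(\sigma = 2\), the sample sizes, the seed, and the function names are illustrative assumptions.

```python
import random

def t2(sample):
    """T2 = (1/2) * T1, with T1 = X_1 + mean of X_2, ..., X_n."""
    n = len(sample)
    return 0.5 * (sample[0] + sum(sample[1:]) / (n - 1))

def simulated_var_t2(n, reps=4000, mu=0.0, sigma=2.0, seed=2):
    """Sample variance of T2 across many simulated samples of size n."""
    rng = random.Random(seed)
    values = [t2([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(reps)]
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)

def theoretical_var_t2(n, sigma=2.0):
    """Var(T2) = (1/4) * (sigma^2 + sigma^2 / (n - 1))."""
    return 0.25 * (sigma ** 2 + sigma ** 2 / (n - 1))

results = {n: (simulated_var_t2(n), theoretical_var_t2(n)) for n in (5, 50, 200)}
for n, (sim, theo) in results.items():
    print(f"n = {n:3d}: simulated {sim:.3f}, theoretical {theo:.3f}")
```

With \(\sigma = 2\), the variance should level off near \(\sigma^2 / 4 = 1\) instead of shrinking towards 0 as \(n\) increases.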
Consider a population \(X\) whose distribution depends on a parameter \(\theta \in \mathbb{R}\), with unknown value. It is known that:
\[E(X) = \theta - 2 \qquad \text{and} \qquad V(X) = 1\]
From a random sample \(X_1, X_2, \ldots, X_n\), \(n \geq 2\), two estimators \(\theta^*\) and \(\hat{\theta}\) were obtained, with the following known properties:
\[\theta^* = \bar{X} + 2, \qquad E(\hat{\theta}) = \theta, \qquad V(\hat{\theta}) = \frac{2}{n}\]
i) Regarding estimators \(\theta^*\) and \(\hat{\theta}\), which of the following statements is correct?
| a) Only \(\theta^*\) is unbiased | b) Neither estimator is unbiased |
| c) Both are unbiased | d) Only \(\hat{\theta}\) is unbiased |
Solution:
Population \(X\): \(\mu = E(X) = \theta - 2\); \(\sigma^2 = V(X) = 1\)
\(E(\hat{\theta}) = \theta\) is given in the problem statement, so \(\hat{\theta}\) is unbiased.
\[E(\theta^*) = E(\bar{X} + 2) = E(\bar{X}) + 2 = \mu + 2 = (\theta - 2) + 2 = \theta\]
Since \(E(\theta^*) = E(\hat{\theta}) = \theta\), both estimators are unbiased. Answer: (c).
ii) True or False: \(\theta^*\) is a more efficient estimator than \(\hat{\theta}\).
Solution:
Both estimators are unbiased, so the more efficient one has the smaller variance:
\[V(\theta^*) = V(\bar{X} + 2) = V(\bar{X}) = \frac{\sigma^2}{n} = \frac{1}{n}\]
\[V(\theta^*) = \frac{1}{n} < \frac{2}{n} = V(\hat{\theta})\]
Therefore, \(\theta^*\) is more efficient than \(\hat{\theta}\).
The statement is true.
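The comparison can be illustrated by simulation. The exercise gives only the mean and variance of \(\hat{\theta}\), not its formula, so the sketch below uses a hypothetical stand-in: the mean of the first \(n/2\) observations plus 2, which (for even \(n\)) has expectation \(\theta\) and variance \(\frac{2}{n}\). The normal population, \(\theta = 5\), \(n = 20\), the seed, and the function names are also assumptions made for illustration.

```python
import random

def theta_star(sample):
    """theta* = sample mean + 2."""
    return sum(sample) / len(sample) + 2

def theta_hat_hypothetical(sample):
    """Hypothetical stand-in with E = theta and Var = 2/n (n even):
    mean of the first half of the sample, plus 2."""
    half = sample[: len(sample) // 2]
    return sum(half) / len(half) + 2

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def compare_variances(n=20, reps=20000, theta=5.0, seed=3):
    """Simulated variances of both estimators, with X ~ N(theta - 2, 1)."""
    rng = random.Random(seed)
    stars, hats = [], []
    for _ in range(reps):
        sample = [rng.gauss(theta - 2, 1.0) for _ in range(n)]
        stars.append(theta_star(sample))
        hats.append(theta_hat_hypothetical(sample))
    return sample_variance(stars), sample_variance(hats)

var_star, var_hat = compare_variances()
print(f"Var(theta*)    ~ {var_star:.4f}  (theory: 1/n = 0.05)")
print(f"Var(theta_hat) ~ {var_hat:.4f}  (theory: 2/n = 0.10)")
```

The simulated variance of \(\theta^*\) should be roughly half that of the stand-in, matching \(\frac{1}{n} < \frac{2}{n}\).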
Statistics II — Point Estimation