Introduction to Econometrics: Chapters 1, 2, and 3


Introduction to Econometrics

I.1 What Is Econometrics?

Econometrics is the application of statistical methods to economic data in order to test hypotheses and estimate relationships between economic variables.

I.2 Why a Separate Discipline?

Econometrics is a separate discipline from economics and statistics because it requires a unique combination of economic theory, mathematical modeling, and statistical analysis.

I.3 Methodology of Econometrics

The econometric methodology involves the following steps:

1. Statement of Theory or Hypothesis: formulate a theoretical framework or hypothesis to be tested.

2. Specification of the Mathematical Model: translate the theoretical framework into a mathematical model.

3. Specification of the Econometric Model: specify the econometric model, including the variables to be included and the functional form of the model.

4. Obtaining Data: collect and prepare the data for analysis.

5. Estimation of the Econometric Model: estimate the parameters of the econometric model using statistical methods.

6. Hypothesis Testing: test hypotheses about the parameters of the model.

7. Forecasting or Prediction: use the estimated model to forecast or predict future values of the dependent variable.

8. Use of the Model for Control or Policy Purposes: use the estimated model to inform policy decisions or control variables.

I.4 Types of Econometrics

There are several types of econometrics, including:

Theoretical Econometrics: focuses on the development of econometric theory and models.

Applied Econometrics: focuses on the application of econometric methods to real-world data.

Time Series Econometrics: focuses on the analysis of time series data.

I.5 Mathematical and Statistical Prerequisites

Econometrics requires a strong foundation in mathematics and statistics, including:

Calculus: used to optimize functions and derive mathematical models.

Linear Algebra: used to manipulate matrices and vectors.

Probability Theory: used to understand random variables and probability distributions.

Statistical Inference: used to make inferences about populations based on sample data.

I.6 The Role of the Computer


Computers play a crucial role in econometrics, as they are used to:

Estimate econometric models: using statistical software packages.

Analyze and manipulate data: using spreadsheet software and statistical software packages.

Simulate economic models: using specialized software packages.
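
As a small illustration of the first point, here is a minimal sketch of estimating a two-variable regression with a statistical software package, in this case Python with the statsmodels library; the advertising and sales figures are hypothetical numbers chosen only for illustration.

```python
import numpy as np
import statsmodels.api as sm

advertising = np.array([100, 200, 300, 400, 500])    # X: hypothetical ad spending
sales = np.array([1000, 1200, 1500, 1800, 2000])     # Y: hypothetical sales

X = sm.add_constant(advertising)   # add the intercept term to the regressors
model = sm.OLS(sales, X).fit()     # estimate the model by ordinary least squares
print(model.params)                # estimated intercept and slope
print(model.rsquared)              # a measure of goodness of fit
```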

Chapter 1: The Nature of Regression Analysis

1.1 Historical Origin of the Term Regression

The term "regression" was first coined by Sir Francis Galton in the late
19th century. Galton observed that the height of offspring tended to
"regress" or move closer to the average height of the population,
rather than exceeding the height of their parents.

1.2 The Modern Interpretation of Regression

In modern statistics, regression analysis refers to the process of establishing a mathematical relationship between two or more variables. This relationship is often used to predict the value of one variable based on the value of another variable.

Examples

The relationship between the amount of rainfall and the yield of crops.

The relationship between the price of a house and its size.

The relationship between the dosage of a drug and its effect on blood pressure.
1.3 Statistical versus Deterministic Relationships

A statistical relationship is one that is based on probability, whereas a deterministic relationship is one that is exact and predictable. Regression analysis deals with statistical relationships.

1.4 Regression versus Causation

Regression analysis can identify relationships between variables, but it cannot establish causation. In other words, just because two variables are related, it does not mean that one variable causes the other.

1.5 Regression versus Correlation

Regression analysis is concerned with estimating or predicting the average value of the dependent variable on the basis of given values of the explanatory variable(s), treating the two asymmetrically, whereas correlation analysis is concerned with measuring the strength and direction of the linear association between two variables, treating them symmetrically.

1.6 Terminology and Notation

Dependent variable: the variable being predicted or explained.

Independent variable: the variable used to predict or explain the dependent variable.

Regression equation: the mathematical equation that describes the relationship between the dependent and independent variables.

1.7 The Nature and Sources of Data for Economic Analysis


Data is the raw material of statistical analysis. In economic analysis,
data can come from a variety of sources, including government
agencies, surveys, and experiments.

Types of Data

Time series data: data collected over time.

Cross-sectional data: data collected at a single point in time.

Panel data: data collected over time for multiple individuals or units.

The Sources of Data

Government agencies: such as the Bureau of Labor Statistics or the Census Bureau.

Surveys: such as the Current Population Survey or the Consumer Expenditure Survey.

Experiments: such as randomized controlled trials.

The Accuracy of Data

Data accuracy is crucial in statistical analysis. Errors in data can lead to incorrect conclusions.

A Note on the Measurement Scales of Variables

Variables can be measured on different scales, including:

Nominal scale: a scale that categorizes variables without implying any sort of order.

Ordinal scale: a scale that categorizes variables in a way that implies a certain order or ranking.
Interval scale: a scale that measures variables in a way that implies
both order and exact differences between values.

Ratio scale: a scale that measures variables in a way that implies order, exact differences, and a true zero point.

Chapter 2: Two-Variable Regression Analysis

2.1 A Hypothetical Example

Consider a simple example of how the price of a house (Y) is related to its size (X). We can collect data on these two variables and analyze their relationship.

2.2 The Concept of Population Regression Function (PRF)

The PRF is a mathematical function that describes the relationship between two variables, X and Y, for the entire population.

Here's an example:

Example of Population Regression Function (PRF)

Suppose we want to study the relationship between the amount of money spent on advertising (X) and the sales of a product (Y) for all companies in a particular industry.

The population regression function (PRF) for this relationship might be:

Y = β₁ + β₂X + ε

where:
Y = sales of the product

X = amount of money spent on advertising

β₁ = intercept or constant term

β₂ = slope coefficient

ε = stochastic disturbance term

For example, suppose the PRF for this industry is:

Y = 100 + 0.05X + ε

This PRF indicates that for every additional dollar spent on advertising, sales increase by 5 cents, on average. The intercept term (100) represents the average sales when no money is spent on advertising.

Note that the PRF describes the relationship between X and Y for the
entire population of companies in the industry.
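
A minimal sketch of what this hypothetical PRF implies, written in Python; the normal disturbance and its standard deviation are assumptions chosen purely for illustration.

```python
# Simulate the hypothetical PRF  Y = 100 + 0.05*X + e.
import numpy as np

rng = np.random.default_rng(42)

beta1, beta2 = 100.0, 0.05                               # hypothetical population parameters
advertising = rng.uniform(0, 10_000, size=1_000)         # X: advertising spending
disturbance = rng.normal(0, 25, size=advertising.size)   # e: random factors (assumed normal)
sales = beta1 + beta2 * advertising + disturbance        # Y: realized sales

# The conditional mean E(Y | X) = 100 + 0.05*X is the systematic part;
# individual observations scatter around it because of the disturbance.
print(sales[:5])
```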

2.3 The Meaning of the Term Linear

In regression analysis, "linear" refers to two types of linearity:

Linearity in the Variables: the relationship between X and Y is linear.

Linearity in the Parameters: the regression equation is linear in its parameters.

2.4 Stochastic Specification of PRF

The PRF can be specified as:

Y = β₁ + β₂X + ε
where ε is a stochastic disturbance term.

2.5 The Significance of the Stochastic Disturbance Term

The stochastic disturbance term (ε) represents the random factors that
affect the relationship between X and Y.

2.6 The Sample Regression Function (SRF)

The Sample Regression Function (SRF) is an estimate of the Population Regression Function (PRF) based on a sample of data.

Formula

The SRF can be written as:

Ŷ = b₀ + b₁X

where:

Ŷ = predicted value of Y

b₀ = estimated intercept term

b₁ = estimated slope coefficient

X = independent variable

Estimation of SRF

The SRF is estimated using the method of ordinary least squares (OLS), which minimizes the sum of the squared errors between the observed values of Y and the predicted values Ŷ.
Example

Suppose we have a sample of data on the relationship between the amount of money spent on advertising (X) and the sales of a product (Y). The SRF might be:

Ŷ = 120 + 0.03X

This SRF indicates that for every additional dollar spent on advertising, sales increase by 3 cents, on average. The intercept term (120) represents the average sales when no money is spent on advertising.

Note that the SRF is an estimate of the PRF, and the estimated coefficients (b₀ and b₁) may not be exactly equal to the true population parameters (β₁ and β₂, respectively).
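
The estimation step can be sketched in a few lines of Python using the closed-form OLS formulas; the rainfall and crop-yield figures are the hypothetical sample used in Example 1 of Section 2.7 below.

```python
# A minimal sketch of computing an SRF by ordinary least squares.
import numpy as np

def ols(x, y):
    """Return (b0, b1): the OLS intercept and slope."""
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1

rainfall = np.array([10, 20, 30, 40, 50])        # X (hypothetical sample)
crop_yield = np.array([50, 70, 90, 110, 130])    # Y (hypothetical sample)

b0, b1 = ols(rainfall, crop_yield)
print(f"SRF: Y_hat = {b0:.0f} + {b1:.0f} X")     # SRF: Y_hat = 30 + 2 X
```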

2.7 Illustrative Examples

Consider the following examples:

The relationship between the amount of rainfall (X) and the yield of
crops (Y).

The relationship between the price of a house (X) and its size (Y).


Example 1: Rainfall and Crop Yield

Suppose we want to study the relationship between the amount of rainfall (X) and the yield of crops (Y). We collect data on these two variables for a sample of farms.

The sample data might look like this:

X (Rainfall) Y (Crop Yield)

10 50

20 70

30 90

40 110

50 130

The sample regression function (SRF) might be:

Ŷ = 30 + 2X

This SRF fits the sample data above exactly: for every additional inch of rainfall, the crop yield increases by 2 units, on average.

Example 2: House Price and Size

Suppose we want to study the relationship between the price of a house


(X) and its size (Y). We collect data on these two variables for a sample
of houses.

The sample data might look like this:


X (House Price) Y (House Size)

100,000 1000

150,000 1200

200,000 1500

250,000 1800

300,000 2000

The sample regression function (SRF) estimated from these data is approximately:

Ŷ = 460 + 0.0052X

This SRF indicates that for every additional dollar of house price, the size of the house increases by about 0.005 square feet (roughly 5 square feet per additional $1,000), on average.

2.1 A Hypothetical Example

Suppose we want to study the relationship between the amount of money spent on advertising (X) and the sales of a product (Y). We collect data on these two variables for a sample of companies.

Here's a hypothetical example:

Company Advertising (X) Sales (Y)

A 100 1000

B 200 1200

C 300 1500

D 400 1800

E 500 2000

We can see that as the amount of money spent on advertising increases, the sales of the product also tend to increase.

2.2 The Concept of Population Regression Function (PRF)

The Population Regression Function (PRF) is a mathematical function that describes the relationship between two variables, X and Y, for the entire population.

In our example, the PRF might be:

Y = β₁ + β₂X + ε

where:

Y = sales of the product

X = amount of money spent on advertising

β₁ = intercept or constant term

β₂ = slope coefficient

ε = stochastic disturbance term

To estimate the parameters β₁ and β₂ from this sample of data, we can use the method of ordinary least squares (OLS). (Strictly speaking, OLS applied to a sample yields estimates of the PRF parameters, that is, the SRF.) Here are the steps:
Step 1: Calculate the means of X and Y

X̄ = (100 + 200 + 300 + 400 + 500) / 5 = 300

Ȳ = (1000 + 1200 + 1500 + 1800 + 2000) / 5 = 1500

Step 2: Calculate the deviations from the means

Company X Y X - X̄ Y - Ȳ

A 100 1000 -200 -500

B 200 1200 -100 -300

C 300 1500 0 0

D 400 1800 100 300

E 500 2000 200 500

Step 3: Calculate the slope coefficient β₂

β₂ = Σ[(Xᵢ - X̄)(Yᵢ - Ȳ)] / Σ(Xᵢ - X̄)²

= [(-200)(-500) + (-100)(-300) + (0)(0) + (100)(300) + (200)(500)] / [(-200)² + (-100)² + 0² + 100² + 200²]

= 260,000 / 100,000

= 2.6

Step 4: Calculate the intercept term β₁

β₁ = Ȳ - β₂X̄

= 1500 - 2.6(300)

= 1500 - 780

= 720

Therefore, the estimated values of the parameters are:

β₁ = 720

β₂ = 2.6

The equation of the estimated regression line is:

Ŷ = 720 + 2.6X
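
As a quick check of the hand calculation above, the same line can be fitted with numpy; polyfit fits a first-degree polynomial (a straight line) by least squares.

```python
import numpy as np

X = np.array([100, 200, 300, 400, 500])
Y = np.array([1000, 1200, 1500, 1800, 2000])

slope, intercept = np.polyfit(X, Y, deg=1)   # least-squares fit of a straight line
print(slope, intercept)                      # 2.6 720.0
```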

Here is a second worked example:

Example: Relationship between Years of Experience and Salary

Suppose we want to study the relationship between the years of experience (X) and the salary (Y) of a group of employees. We collect data on these two variables for a sample of 5 employees.

Step 1: Calculate the means of X and Y

X̄ = (2 + 4 + 6 + 8 + 10) / 5 = 6
Ȳ = (40000 + 50000 + 60000 + 70000 + 80000) / 5 = 60000

Step 2: Calculate the deviations from the means

Employee X Y X - X̄ Y - Ȳ

A 2 40000 -4 -20000

B 4 50000 -2 -10000

C 6 60000 0 0

D 8 70000 2 10000

E 10 80000 4 20000

Step 3: Calculate the slope coefficient (b₁)

b₁ = Σ[(Xᵢ - X̄)(Yᵢ - Ȳ)] / Σ(Xᵢ - X̄)²

= [(-4)(-20000) + (-2)(-10000) + (0)(0) + (2)(10000) + (4)(20000)] / [(-4)² + (-2)² + 0² + 2² + 4²]

= 200,000 / 40

= 5,000

Step 4: Calculate the intercept coefficient (b₀)

b₀ = Ȳ - b₁X̄

= 60000 - 5000(6)

= 60000 - 30000

= 30000

Step 5: Write the equation of the sample regression line

Ŷ = b₀ + b₁X

= 30000 + 5000X

This SRF indicates that each additional year of experience is associated with an increase of $5,000 in salary, on average.


Chapter 3: Two-Variable Regression Model: The Problem of Estimation

3.1 The Method of Ordinary Least Squares (OLS)

The OLS method is used to estimate the parameters of a linear regression model. The goal is to minimize the sum of the squared errors between the observed values of Y and the predicted values Ŷ.

3.2 The Classical Linear Regression Model: Assumptions

The classical linear regression model assumes:

1. Linearity: The relationship between X and Y is linear.

2. Constant variance: The variance of the error term is constant.

3. Independence: Each observation is independent of the others.


4. Normality: The error term is normally distributed.

5. No multicollinearity: The explanatory variables are not highly correlated with one another (this assumption matters when the model includes more than one regressor).

3.3 Precision or Standard Errors of Least-Squares Estimates

The standard error of an estimate measures its precision. A smaller


standard error indicates a more precise estimate.

3.4 Properties of Least-Squares Estimators: The Gauss-Markov Theorem

The Gauss-Markov theorem states that the OLS estimator is the best linear unbiased estimator (BLUE) of the true parameter value.

3.5 The Coefficient of Determination (r²): A Measure of "Goodness of Fit"

The r² measures the proportion of the variation in Y that is explained by X. A higher r² indicates a better fit.

Here's an explanation:

The Gauss-Markov Theorem

The Gauss-Markov theorem is a fundamental concept in statistics that establishes the properties of the ordinary least squares (OLS) estimator. The theorem states that the OLS estimator is the best linear unbiased estimator (BLUE) of the true parameter value.

Properties of the OLS Estimator

The Gauss-Markov theorem establishes the following properties of the OLS estimator:
1. Unbiasedness: The OLS estimator is unbiased, meaning that its
expected value is equal to the true parameter value.

2. Linearity: The OLS estimator is a linear function of the dependent variable.

3. Minimum Variance: The OLS estimator has the minimum variance among all unbiased linear estimators.

The Coefficient of Determination (r²)

The coefficient of determination, denoted by r², is a measure of the goodness of fit of a regression model. It represents the proportion of the variation in the dependent variable that is explained by the independent variable(s).

Interpretation of r²

The value of r² ranges from 0 to 1, where:

• r² = 0 indicates that the regression model does not explain any of the variation in the dependent variable.

• r² = 1 indicates that the regression model explains all of the variation in the dependent variable.

• 0 < r² < 1 indicates that the regression model explains some, but not all, of the variation in the dependent variable.

Example

Suppose we have a regression model that predicts the price of a house based on its size. The r² value is 0.8. This means that 80% of the variation in the price of the house is explained by its size.
Calculating r²

The r² value can be calculated using the following formula:

r² = 1 - (SSE / SST)

where:

• SSE is the sum of the squared errors

• SST is the total sum of squares
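
A minimal sketch of this calculation in Python, using hypothetical observed and fitted values chosen only for illustration:

```python
import numpy as np

y = np.array([4.0, 6.0, 8.0, 10.0, 12.0])      # observed values of Y
y_hat = np.array([4.2, 5.9, 8.1, 9.8, 12.0])   # fitted values from some regression

sse = np.sum((y - y_hat) ** 2)        # sum of squared errors
sst = np.sum((y - y.mean()) ** 2)     # total sum of squares
r_squared = 1 - sse / sst
print(round(r_squared, 3))            # approx. 0.998 for these numbers
```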

3.6 Numerical Examples

Suppose we have the following data:

X Y

1 2

2 4

3 6

4 8

5 10

Using OLS, we estimate the regression line as:

Ŷ = 0 + 2X

The r² is 1, indicating a perfect fit: every observation lies exactly on the line Y = 2X.

Another example:
X Y

2 3

4 5

6 7

8 9

10 11

Using OLS, we estimate the regression line as:

Ŷ = 1 + 1X

The r² is again 1, since these points also lie exactly on a straight line.
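
Both numerical examples can be checked with a few lines of Python; since each data set lies exactly on a straight line, the estimated lines and r² values come out exact.

```python
import numpy as np

datasets = [
    (np.array([1, 2, 3, 4, 5]), np.array([2, 4, 6, 8, 10])),
    (np.array([2, 4, 6, 8, 10]), np.array([3, 5, 7, 9, 11])),
]

for x, y in datasets:
    slope, intercept = np.polyfit(x, y, deg=1)
    y_hat = intercept + slope * x
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    print(intercept, slope, r2)
# Prints (approximately) 0, 2, 1 for the first data set and 1, 1, 1 for the second.
```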

3.7 Illustrative Examples

• The relationship between the amount of rainfall and the yield of crops.

• The relationship between the price of a house and its size.

• The relationship between the number of hours studied and the exam score.

• The relationship between the amount of exercise and the weight loss.

3.8 A Note on Monte Carlo Experiments


Monte Carlo experiments are a statistical technique used to study the
properties of estimators by simulating data. The basic idea is to
generate artificial data that mimics the characteristics of real data, and
then use this data to estimate the parameters of interest.

Steps involved in a Monte Carlo experiment:

1. Specify the model: Define the statistical model that you want to
study, including the parameters of interest.

2. Generate artificial data: Use a random number generator to generate artificial data that mimics the characteristics of real data.

3. Estimate the parameters: Use the artificial data to estimate the parameters of interest.

4. Repeat the process: Repeat steps 2-3 many times (e.g. 1000 times) to generate a large number of estimates.

5. Analyze the results: Analyze the distribution of the estimates to study the properties of the estimator, such as its bias, variance, and mean squared error (a code sketch of these steps follows below).
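
A minimal sketch of these steps in Python; the true model, error distribution, sample size, and number of replications are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0_true, beta1_true = 2.0, 0.5        # step 1: the model Y = 2 + 0.5X + e
n, replications = 50, 1000
x = rng.uniform(0, 10, size=n)           # regressor held fixed across replications

estimates = []
for _ in range(replications):            # step 4: repeat many times
    e = rng.normal(0, 1, size=n)         # step 2: generate artificial data
    y = beta0_true + beta1_true * x + e
    slope, intercept = np.polyfit(x, y, deg=1)   # step 3: estimate by OLS
    estimates.append((intercept, slope))

estimates = np.array(estimates)
print(estimates.mean(axis=0))   # step 5: means should be close to (2.0, 0.5)
print(estimates.std(axis=0))    # sampling variability of the estimators
```

Averaging the estimates across replications shows that they center on the true values, which is the unbiasedness property discussed in the appendix to this chapter.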

Advantages of Monte Carlo experiments:

1. Flexibility: Monte Carlo experiments can be used to study a wide range of statistical models and estimators.

2. Control: By generating artificial data, you have complete control over the characteristics of the data.

3. Repeatability: Monte Carlo experiments can be repeated many times to generate a large number of estimates.
Common applications of Monte Carlo experiments:

1. Evaluating estimator performance: Monte Carlo experiments can be used to evaluate the performance of different estimators, such as their bias, variance, and mean squared error.

2. Studying the effects of outliers: Monte Carlo experiments can be used to study the effects of outliers on estimator performance.

3. Investigating the properties of statistical tests: Monte Carlo experiments can be used to investigate the properties of statistical tests, such as their power and size.

3A.1 Derivation of Least-Squares Estimates

The least-squares estimates are derived by minimizing the sum of the squared errors (SSE) between the observed values of Y and the predicted values Ŷ.

Step 1: Define the Sum of Squared Errors (SSE)

SSE = Σ(Yᵢ - Ŷᵢ)²

where:

• Yᵢ is the observed value of Y

• Ŷᵢ is the predicted value of Y

Step 2: Define the Predicted Value of Y

Ŷᵢ = β₀ + β₁Xᵢ

where:

• β₀ is the intercept term

• β₁ is the slope coefficient

• Xᵢ is the value of the independent variable

Step 3: Substitute the Predicted Value of Y into the SSE Equation

SSE = Σ(Yᵢ - (β₀ + β₁Xᵢ))²

Step 4: Expand the SSE Equation

SSE = Σ(Yᵢ² - 2β₀Yᵢ - 2β₁XᵢYᵢ + β₀² + 2β₀β₁Xᵢ + β₁²Xᵢ²)

Step 5: Minimize the SSE Equation with Respect to β₀ and β₁

To minimize the SSE equation, we take the partial derivatives of SSE with respect to β₀ and β₁ and set them equal to zero:

∂SSE/∂β₀ = -2Σ(Yᵢ - β₀ - β₁Xᵢ) = 0

∂SSE/∂β₁ = -2Σ(Xᵢ(Yᵢ - β₀ - β₁Xᵢ)) = 0

Step 6: Solve for β₀ and β₁

Solving the above equations simultaneously, we get:

β₁ = Σ(XᵢYᵢ - X̄ Ȳ) / Σ(Xᵢ² - X̄ ²)

β₀ = Ȳ - β₁X̄

where:

• X̄ is the mean of X

• Ȳ is the mean of Y

These are the least-squares estimates of β₀ and β₁.

Here's an explanation:
3A.2 Linearity and Unbiasedness Properties of Least-Squares
Estimators

The least-squares estimators of β₀ and β₁ have two important properties:

1. Linearity: The least-squares estimators are linear functions of the dependent variable Y.

2. Unbiasedness: The least-squares estimators are unbiased, meaning that their expected values are equal to the true parameter values.

Linearity Property

The least-squares estimator of β₁ can be written as:

β̂₁ = Σ(XᵢYᵢ - X̄ Ȳ) / Σ(Xᵢ² - X̄ ²)

This estimator is a linear function of Y, since it involves a linear combination of the products of Xᵢ and Yᵢ.

Similarly, the least-squares estimator of β₀ can be written as:

β̂₀ = Ȳ - β̂₁X̄

This estimator is also a linear function of Y, since it involves a linear combination of Ȳ and β̂₁X̄.

Unbiasedness Property

To show that the least-squares estimators are unbiased, we need to show that their expected values are equal to the true parameter values.

E(β̂₁) = E[Σ(XᵢYᵢ - X̄Ȳ) / Σ(Xᵢ² - X̄²)]

= Σ(XᵢE(Yᵢ) - X̄E(Ȳ)) / Σ(Xᵢ² - X̄²)    (treating the Xᵢ as fixed in repeated samples)

= β₁    (substituting E(Yᵢ) = β₀ + β₁Xᵢ and simplifying)

Similarly, we can show that E(β̂₀) = β₀.

Therefore, the least-squares estimators of β₀ and β₁ are unbiased.

3A.3 Variances and Standard Errors of Least-Squares Estimators

The variances and standard errors of the least-squares estimators of β₀ and β₁ are derived using the properties of the error term.

Assumptions

We assume that the error term εᵢ has the following properties:

1. Zero mean: E(εᵢ) = 0

2. Constant variance: Var(εᵢ) = σ²

3. Independence: εᵢ and εⱼ are independent for i ≠ j

4. Normality: εᵢ ~ N(0, σ²)

Variance of β̂₁

The variance of β̂₁ is given by:

Var(β̂₁) = σ² / Σ(Xᵢ² - X̄ ²)

where:

• σ² is the variance of the error term

• Xᵢ is the value of the independent variable

• X̄ is the mean of the independent variable

Variance of β̂₀

The variance of β̂₀ is given by:

Var(β̂₀) = σ² * (1 / n + X̄² / Σ(Xᵢ² - X̄²))

where:

• n is the sample size

• X̄ is the mean of the independent variable

Standard Errors of β̂₁ and β̂₀

The standard errors of β̂₁ and β̂₀ are given by:

SE(β̂₁) = √Var(β̂₁) = σ / √Σ(Xᵢ² - X̄ ²)

SE(β̂₀) = √Var(β̂₀) = σ * √(1 / n + X̄ ² / Σ(Xᵢ² - X̄ ²))

where:

• σ is the standard deviation of the error term

Note that the standard errors of β̂₁ and β̂₀ are used to construct
confidence intervals and perform hypothesis tests.
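
A minimal sketch of these variance and standard-error formulas in Python, using a small hypothetical data set and the estimator σ̂² = SSE / (n - 2) discussed in Section 3A.5:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])     # hypothetical independent variable
y = np.array([4.1, 6.3, 7.8, 10.2, 11.9])    # hypothetical dependent variable

n = x.size
x_bar, y_bar = x.mean(), y.mean()
sxx = np.sum((x - x_bar) ** 2)               # Σ(Xi - X̄)²

b1 = np.sum((x - x_bar) * (y - y_bar)) / sxx # slope estimate
b0 = y_bar - b1 * x_bar                      # intercept estimate

residuals = y - (b0 + b1 * x)
sigma2_hat = np.sum(residuals ** 2) / (n - 2)  # estimate of the error variance

se_b1 = np.sqrt(sigma2_hat / sxx)
se_b0 = np.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / sxx))
print(b0, b1, se_b0, se_b1)
```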

The following practice problems apply these formulas.

Practice Problem 1:
Data:

X Y

2 4

4 6

6 8

8 10

10 12

Estimate the Regression Line using OLS:

First, calculate the means of X and Y:
X̄ = (2 + 4 + 6 + 8 + 10) / 5 = 6
Ȳ = (4 + 6 + 8 + 10 + 12) / 5 = 8
Next, calculate the slope coefficient (β₁) and the intercept term (β₀):
β₁ = Σ(XᵢYᵢ - X̄ Ȳ) / Σ(Xᵢ² - X̄ ²) = 1
β₀ = Ȳ - β₁X̄ = 2
The estimated regression line is:
Ŷ = 2 + 1X
Calculate the r²:
First, calculate the total sum of squares (SST):
SST = Σ(Yᵢ - Ȳ)² = 40
Next, calculate the sum of squared errors (SSE):
SSE = Σ(Yᵢ - Ŷᵢ)² = 0
Finally, calculate the r²:
r² = 1 - (SSE / SST) = 1 - (0 / 40) = 1
The r² value of 1 indicates a perfect fit.
Practice Problem 2:
Here's a hypothetical dataset:

X Y

1 3

2 5

3 7

4 9

5 11

6 13

7 15

8 17

9 19

10 21

Using OLS, we can estimate the regression line as:

Ŷ = 1 + 2X
The r² value is:
r² = 1
This indicates that all of the variation in Y is explained by X.
Here's a breakdown of the calculations:
Estimate the Regression Line:
First, calculate the means of X and Y:
X̄ = (1 + 2 + ... + 10) / 10 = 5.5
Ȳ = (3 + 5 + ... + 21) / 10 = 12
Next, calculate the slope coefficient (β₁) and the intercept term (β₀):
β₁ = Σ(XᵢYᵢ - X̄Ȳ) / Σ(Xᵢ² - X̄²) = 2
β₀ = Ȳ - β₁X̄ = 12 - 2(5.5) = 1
The estimated regression line is:
Ŷ = 1 + 2X
Calculate the r²:
First, calculate the total sum of squares (SST):
SST = Σ(Yᵢ - Ȳ)² = 330
Next, calculate the sum of squared errors (SSE):
SSE = Σ(Yᵢ - Ŷᵢ)² = 0
Finally, calculate the r²:
r² = 1 - (SSE / SST) = 1 - (0 / 330) = 1
An r² of 1 indicates a perfect fit: every observation lies exactly on the estimated line.
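
Both practice problems can be checked with a short Python script:

```python
import numpy as np

problems = {
    "Problem 1": (np.array([2, 4, 6, 8, 10]), np.array([4, 6, 8, 10, 12])),
    "Problem 2": (np.arange(1, 11), np.arange(3, 22, 2)),
}

for name, (x, y) in problems.items():
    slope, intercept = np.polyfit(x, y, deg=1)
    y_hat = intercept + slope * x
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    print(name, round(intercept, 4), round(slope, 4), round(r2, 4))
# Problem 1: intercept 2, slope 1, r² = 1; Problem 2: intercept 1, slope 2, r² = 1.
```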

3A.4 Covariance Between β̂₁ and β̂₂

In the two-variable model, the covariance between the estimated intercept (β̂₁) and the estimated slope (β̂₂) (written β̂₀ and β̂₁ in Sections 3A.1 through 3A.3) is given by:
Cov(β̂₁, β̂₂) = -X̄ · Var(β̂₂) = -X̄ σ² / Σ(Xᵢ - X̄)²
where:
• σ² is the variance of the error term
• X̄ is the mean of the independent variable
• Σ denotes the summation over all observations
3A.5 The Least-Squares Estimator of σ²
The least-squares estimator of σ² is given by:
σ̂² = SSE / (n - 2)
where:
• SSE is the sum of squared errors
• n is the sample size
• the divisor n - 2 reflects the two estimated parameters (the intercept and the slope); more generally, σ̂² = SSE / (n - k), where k is the number of estimated parameters
Here's an explanation:
3A.6 Minimum-Variance Property of Least-Squares Estimators
The minimum-variance property of least-squares estimators states
that among all unbiased linear estimators, the least-squares
estimators have the smallest variance.
Definition of Minimum-Variance Property
An estimator β̂ is said to have the minimum-variance property if:
1. β̂ is an unbiased estimator of β
2. Var(β̂) ≤ Var(β̃) for any other unbiased linear estimator β̃
Proof of Minimum-Variance Property
To prove that the least-squares estimator β̂ has the minimum-variance
property, we can use the following steps:
1. Show that β̂ is an unbiased estimator of β
2. Show that Var(β̂) ≤ Var(β̃) for any other unbiased linear estimator
β̃
Gauss-Markov Theorem
The Gauss-Markov theorem provides a formal proof of the minimum-
variance property of least-squares estimators. The theorem states that
the least-squares estimator β̂ is the best linear unbiased estimator
(BLUE) of β, meaning that it has the smallest variance among all
unbiased linear estimators.
Implications of Minimum-Variance Property
The minimum-variance property of least-squares estimators has
several important implications:
1. Efficiency: Least-squares estimators are efficient, meaning that
they make the most use of the available data.
2. Reliability: Least-squares estimators are reliable, meaning that
they provide consistent estimates of the true parameter values.
3. Optimality: Least-squares estimators are optimal, meaning that
they are the best possible estimators among all unbiased linear
estimators.
3A.7 Consistency of Least-Squares Estimators
The consistency property of least-squares estimators states that as
the sample size n increases, the least-squares estimators β̂₁ and β̂₂
converge in probability to the true parameter values β₁ and β₂,
respectively.
Definition of Consistency
An estimator β̂ is said to be consistent if:
1. β̂ converges in probability to the true parameter value β as the
sample size n increases.
2. The probability of β̂ deviating from β by more than a small amount
ε tends to zero as n increases.
Proof of Consistency
To prove that the least-squares estimators are consistent, we can use
the following steps:
1. Show that the least-squares estimators are unbiased.
2. Show that the variance of the least-squares estimators tends to
zero as the sample size n increases.
Implications of Consistency
The consistency property of least-squares estimators has several
important implications:
1. Reliability: Least-squares estimators are reliable, meaning that
they provide consistent estimates of the true parameter values.
2. Accuracy: Least-squares estimators are accurate, meaning that
they converge to the true parameter values as the sample size
increases.
3. Large-Sample Properties: The consistency property ensures that
the least-squares estimators have desirable large-sample
properties, such as asymptotic normality and efficiency.
Asymptotic Properties
The consistency property of least-squares estimators also implies that
they have desirable asymptotic properties, such as:
1. Asymptotic Normality: The least-squares estimators are
asymptotically normally distributed.
2. Asymptotic Efficiency: The least-squares estimators are
asymptotically efficient, meaning that they achieve the lowest
possible variance among all unbiased estimators.

Chapter Solutions
Step 1 of 17
The Consumer Price Index (CPI) measures the weighted average of prices of consumer goods and services purchased in an economy. Table 1.1 gives data on the Consumer Price Index of 7 countries for the period 1980-2005, with the 1982-84 average taken as the base of the index (= 100).

Step 2 of 17
a.
The inflation rate measures the rate of increase in the price level of an economy over a period of time.
To find the inflation rate for the current year, subtract the previous year's CPI from the current year's CPI, divide the difference by the previous year's CPI, and multiply the result by 100.
The CPI of country U is 82.4 in 1980 and 90.9 in 1981. The inflation rate of country U in 1981 is therefore:

((90.9 - 82.4) / 82.4) × 100 = 10.32%

So the inflation rate of country U in 1981 is 10.32%.


Similarly, the CPI of country G is 131.1 in 1994 and 133.3 in 1995. The inflation rate of country G in 1995 is:

((133.3 - 131.1) / 131.1) × 100 = 1.68%

So the inflation rate of country G in 1995 is 1.68%.


Similarly, the inflation rate of each of the 7 countries for each year is calculated in Table 1.2.
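
A minimal sketch of this calculation in Python; only the two CPI pairs used in the worked examples above are included, not the full Table 1.1.

```python
import numpy as np

cpi_u = np.array([82.4, 90.9])      # country U, 1980 and 1981
cpi_g = np.array([131.1, 133.3])    # country G, 1994 and 1995

def inflation_rates(cpi: np.ndarray) -> np.ndarray:
    """Percentage change of the CPI from one year to the next."""
    return (cpi[1:] - cpi[:-1]) / cpi[:-1] * 100

print(inflation_rates(cpi_u))   # approx. [10.32]
print(inflation_rates(cpi_g))   # approx. [1.68]
```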
Step 3 of 17
(Table 1.2: the computed year-by-year inflation rates for the 7 countries.)
Step 4 of 17
b.
Plot the inflation rates of the 7 countries for each year, using Table 1.2. The vertical axis shows the inflation rate and the horizontal axis shows time.
Step 5 of 17
c.
Graph 1.1, which shows the inflation rates of the 7 countries, can be divided into 4 periods, and separate conclusions can be drawn for each period.
From 1981 to 1986, the inflation rates of the countries generally decline. From 1987 to 1990, they generally rise. From 1991 to 1994, they decline again. And from 1995 to 2005, they remain roughly constant.
Step 6 of 17
d.
The standard deviation can be used to measure the variability of each country's inflation rate over time. It measures how far the observations in a data set lie from the mean of that data set.
The formula for the standard deviation σ is:

σ = √[ Σ(Xᵢ - X̄)² / (n - 1) ]

where n is the number of observations and X̄ is the mean of the data set.
Calculate the standard deviation of the inflation rate of country U using this formula.
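
A minimal sketch of this calculation in Python; the inflation-rate series below is a hypothetical placeholder, not the actual values from Table 1.2.

```python
import numpy as np

inflation_u = np.array([10.3, 6.2, 3.2, 4.3, 3.6, 1.9, 3.6])  # hypothetical series
print(np.std(inflation_u, ddof=1))   # ddof=1 applies the sample (n - 1) formula
```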
Step 7 of 17
Applying this formula to the inflation rates in Table 1.2 gives a standard deviation of 1.77 for country U.

Step 8 of 17
Calculate the standard deviation for the inflation rate of country C.
The standard deviation for country C is 2.79.

Step 9 of 17
Calculate the standard deviation for the inflation rate of country J.

Step 10 of 17
The standard deviation for country J is 1.45.

Step 11 of 17
Calculate the standard deviation for the inflation rate of country F.
The standard deviation for country F is 3.34.

Step 12 of 17
Calculate the standard deviation for the inflation rate of country G.

Step 13 of 17
The standard deviation for country G is 1.57.

Step 14 of 17
Calculate the standard deviation for the inflation rate of country I.
The standard deviation for country I is 4.48.

Step 15 of 17
Calculate the standard deviation for the inflation rate of country B.

Step 16 of 17
The standard deviation for country B is 2.60.

Step 17 of 17
The standard deviation for country I is the highest. Therefore, the inflation rate of country I is the most variable.
