Modeling non-stationary variables
1. Modeling Non-stationary Variables
Joint Vienna Institute / IMF ICD
Macro-econometric Forecasting and Analysis,
JV16.12, L02, Vienna, Austria, May 17, 2016
Presenter
Mikhail Pranovich
This training material is the property of the International Monetary Fund (IMF) and is intended for use in
IMF’s Institute for Capacity Development (ICD) courses. Any reuse requires the permission of ICD.
2. Lecture Objectives
• Revisit the concept of a non-stationary (unit root) process and
its implications for analysis and forecasting
• Understand key tests for unit roots
• Revisit the concept of cointegration
• … and testing for cointegration
3. Outline
Stationary and non-stationary variables
Testing for unit roots
Cointegration
Testing for cointegration
4. Introduction
Many economic (macro/financial) variables exhibit trending behavior
e.g., real GDP, real consumption, asset prices, dividends…
Key issue for estimation/forecasting: the nature of this trend…
… is it deterministic (e.g., a linear trend) or stochastic (e.g., a random walk)?
The nature of the trend has important implications for the
model’s parameters and their distributions…
… and thus for the statistical procedures used to conduct
inference and forecasting
5. Key Macro Series Appear to Have Trends
[Figure: four time-series panels (Share Prices, Exchange Rate, Real GDP, GDP Deflator), plotted over 1950 to 2000.]
6. Deterministic and Stochastic Trends in Data
Two types of trends: deterministic or stochastic
A deterministic trend is a non-random function of time
Example: linear time trend
$$y_t = \delta_1 + \delta_2 t + \varepsilon_t$$
A stochastic trend is random, i.e., it varies over time
Examples:
(Pure) Random Walk Model: a time series is said to follow a pure random walk if
the change is i.i.d.
$$y_t = y_{t-1} + \varepsilon_t$$
Random Walk with a Drift
$$y_t = \delta + y_{t-1} + \varepsilon_t$$
$\delta$ is a ‘drift’. If $\delta > 0$, then $y_t$ increases on average
7. Example: Processes with Trends
[Figure: two simulated series of 200 observations, one with a deterministic trend and one with a stochastic trend.]
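The contrast between the two kinds of trend can be sketched with a small simulation. This is a minimal sketch using only the Python standard library; the function names `simulate_trends` and `variance_around_line`, and the parameter values (slope 0.2, drift 0.2, unit shock variance, 200 observations), are illustrative assumptions rather than values taken from the slides.

```python
import random

def simulate_trends(T=200, delta1=0.0, delta2=0.2, drift=0.2, sigma=1.0, seed=0):
    """Return (deterministic-trend series, random walk with drift), each of length T."""
    rng = random.Random(seed)
    det, sto = [], []
    level = 0.0
    for t in range(T):
        # Deterministic trend: shocks never accumulate.
        det.append(delta1 + delta2 * t + rng.gauss(0.0, sigma))
        # Stochastic trend: every shock is carried forward in the level.
        level = drift + level + rng.gauss(0.0, sigma)
        sto.append(level)
    return det, sto

def variance_around_line(y, intercept, slope):
    """Sample variance of the deviations of y from the line intercept + slope * t."""
    dev = [v - (intercept + slope * t) for t, v in enumerate(y)]
    mean = sum(dev) / len(dev)
    return sum((d - mean) ** 2 for d in dev) / len(dev)

det, sto = simulate_trends()
print("deterministic trend, variance around the line:", variance_around_line(det, 0.0, 0.2))
print("stochastic trend, variance around the line:   ", variance_around_line(sto, 0.0, 0.2))
```

For the deterministic trend the deviations from the line are just the shocks, so their variance stays near the shock variance. For the random walk with drift the deviations are accumulated shocks, so the figure depends heavily on the particular draw.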
8. Stationary and non-stationary processes (1)
Consider the data generation process (DGP)
$$y_t = \theta y_{t-1} + v_t$$
If $|\theta| < 1.0$, the variable $y_t$ is stationary (i.e., has finite
mean and variance)
Standard econometric procedures may be used to
estimate/forecast this model
9. Stationary and non-stationary processes (2)
If $\theta \geq 1.0$, the model is said to be non-stationary and its associated
(statistical) distribution theory is non-standard.
In particular:
Sample moments do not have finite limits, but converge (weakly) to random
quantities;
The least squares estimate of $\theta$ is super-consistent, with a convergence rate greater than the
$\sqrt{T}$ rate of the stationary case;
The asymptotic distribution of the least squares estimator is non-standard (i.e., non-normal).
Bottom line: the nature of the trend has important implications for
hypothesis testing and forecasting, especially in multivariate settings
(e.g., VARs).
10. Reminder: Autoregressive AR(p) Process
We shall check how shocks affect stationary and non-stationary variables, but first recall what an AR(p) process is
An AR(p) autoregressive process (AR process of order p):
$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \dots + \theta_p y_{t-p} + \varepsilon_t$$
The error $\varepsilon_t$ is assumed to be independently and identically
distributed (i.i.d.), with a zero mean and a constant variance
11. Stochastic trends, autoregressive models and a unit root
The condition for stationarity in an AR(p) model: the roots z of the
characteristic equation
$$1 - \theta_1 z - \theta_2 z^2 - \theta_3 z^3 - \dots - \theta_p z^p = 0$$
must all be greater than one in absolute value: $|z| > 1$
If an AR(p) process has a root z = 1, the variable has a unit root
Example: AR(1) process $y_t = \delta + \theta y_{t-1} + v_t$
A special case is θ = 1 ⇒ z = 1 ⇒ $y_t$ has a unit root (stochastic trend)
Stationarity requires |θ| < 1: the root is z = 1/θ, so |z| > 1 only if |θ| < 1
12. The Impact of Shocks on Stationary and Non-stationary variables
Consider a simple AR(1):
$$y_t = \theta y_{t-1} + v_t$$
where θ takes any value for now
We can write:
$$y_{t-1} = \theta y_{t-2} + v_{t-1}$$
$$y_{t-2} = \theta y_{t-3} + v_{t-2}$$
Substituting yields:
$$y_t = \theta(\theta y_{t-2} + v_{t-1}) + v_t = \theta^2 y_{t-2} + \theta v_{t-1} + v_t$$
Successive substitution for $y_{t-2}, y_{t-3}, \dots$ gives a representation in terms of
the initial value $y_{-1}$ and the errors $v_t, v_{t-1}, \dots, v_0$
$$y_t = \theta^{t+1} y_{-1} + v_t + \theta v_{t-1} + \theta^2 v_{t-2} + \theta^3 v_{t-3} + \dots + \theta^t v_0$$
13. The Impact of Shocks for Stationary and Non-stationary Series (2)
Representation at t = T:
$$y_T = \theta^{T+1} y_{-1} + v_T + \theta v_{T-1} + \theta^2 v_{T-2} + \theta^3 v_{T-3} + \dots + \theta^T v_0$$
At t = 0 the variable is hit by a non-zero shock $v_0$
We have 3 cases (depending on the value of θ):
1. $|\theta| < 1 \Rightarrow \theta^T \to 0$ and $\theta^T v_0 \to 0$ as $T \to \infty$
Shocks have only a transitory effect (it gradually dies away with time)
2. $\theta = 1 \Rightarrow \theta^T = 1$ and $\theta^T v_0 = v_0$ for all $T$
Shocks have a permanent effect in the system and never die away:
$$y_T = y_{-1} + \sum_{i=0}^{T} v_i$$
... just a sum of past shocks plus some starting value $y_{-1}$. The variance
grows without bound (in proportion to $T\sigma^2$) as $T \to \infty$
3. $|\theta| > 1$. Now shocks become more influential as time goes on (explosive
effect), since if $|\theta| > 1$, then $|\theta|^T > \dots > |\theta|^3 > |\theta|^2 > |\theta|$
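The three cases can be checked numerically by tracing the weight $\theta^T$ that a time-0 shock carries in $y_T$. This is a minimal sketch; the helper name `shock_effect` and the chosen values of θ are illustrative assumptions.

```python
def shock_effect(theta, horizon, v0=1.0):
    """Contribution of a time-0 shock v0 to y_T for T = 0..horizon, i.e. theta**T * v0."""
    return [theta ** T * v0 for T in range(horizon + 1)]

for theta in (0.5, 1.0, 1.1):
    path = shock_effect(theta, horizon=20)
    # Stationary case decays, unit-root case stays at v0, explosive case grows.
    print(f"theta={theta}: effect after 20 periods = {path[-1]:.6f}")
```

With θ = 0.5 the effect is essentially zero after 20 periods, with θ = 1 it is still the full shock, and with θ = 1.1 it has grown several-fold.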
14. Integration
Another way to write the stochastic trend model is:
$$\Delta y_t = y_t - y_{t-1} = v_t$$
Thus the first difference of $y_t$ is stationary provided $v_t$ is
stationary (“difference stationary” process). Such a $y_t$ is also
referred to as an I(1) variable.
Similarly, in the case of the deterministic trend model, $y_t$
is interpreted as trend stationary
because removal of the deterministic trend from $y_t$ renders it
a stationary random variable
15. Order of Integration: I(d)
In general, if $y_t$ is I(d) then:
$$\Delta^d y_t = (1 - L)^d y_t = v_t$$
where $v_t$ is stationary
If d = 0, then the series is already stationary
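The operator $(1 - L)^d$ simply applies first differencing d times. A minimal sketch follows; the function name `difference` and the toy shock sequence are illustrative assumptions.

```python
from itertools import accumulate

def difference(y, d=1):
    """Apply the first-difference operator (1 - L) to y, d times; d observations are lost."""
    out = list(y)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

shocks = [1, -2, 0, 3, -1]
i1 = list(accumulate(shocks))   # shocks summed once: an I(1)-type series
i2 = list(accumulate(i1))       # shocks summed twice: an I(2)-type series

print(difference(i1, 1))  # recovers the shocks after the first observation
print(difference(i2, 2))  # recovers the shocks after the first two observations
```

Differencing the twice-accumulated series only once would still leave an accumulated (non-stationary) component, which is why the order d matters.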
16. Problems due to Stochastic Trends (from a statistical perspective)
Non-standard distribution of test statistics
Spurious regression:
in a simple linear regression, two (or more) non-stationary time series
may appear to be related even though they are not
Need to use special modeling techniques when dealing with
non-stationary data (VARs in differences or VECMs)
Need to distinguish between stochastic and deterministic trends as
it may affect estimates of policy-relevant variables
e.g., an estimate of an output gap or of a structural budget deficit
… for that we need unit root tests…
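The spurious regression problem can be illustrated with a small Monte Carlo experiment: regress one random walk on another, independent one, and count how often the usual t-test "finds" a relationship. This is a sketch under stated assumptions: the helper names, the sample size (T = 100), the number of replications (300) and the use of the conventional 1.96 cut-off are all choices made for illustration.

```python
import math
import random

def random_walk(T, rng):
    """Pure random walk with standard normal increments."""
    y, level = [], 0.0
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)
        y.append(level)
    return y

def slope_t_stat(y, x):
    """t-ratio on the slope in an OLS regression of y on a constant and x."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b / math.sqrt(rss / (n - 2) / sxx)

def rejection_rate(T=100, reps=300, seed=1, difference=False):
    """Share of replications in which |t| > 1.96 for two independent random walks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y, x = random_walk(T, rng), random_walk(T, rng)
        if difference:  # regress first differences instead of levels
            y = [y[i] - y[i - 1] for i in range(1, T)]
            x = [x[i] - x[i - 1] for i in range(1, T)]
        if abs(slope_t_stat(y, x)) > 1.96:
            hits += 1
    return hits / reps

print("levels:     ", rejection_rate())
print("differences:", rejection_rate(difference=True))
```

In levels the nominal 5% test rejects far too often, while in first differences the rejection rate is close to its nominal size.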
17. Figure 5: Distribution of OLS estimator for θ
18. Testing For Unit Roots
The previous section suggests that I(1) variables need
special handling
So how do we identify I(1) processes, i.e., test for
unit roots?
A natural test is to consider the t-statistic for the null
hypothesis of a unit root, i.e., $\theta = 1$
Given the previous graph, it is not surprising that the
distribution of the t-statistic for testing $\theta = 1$ is non-normal
19. Testing for Unit Roots: Procedures
Dickey-Fuller
Augmented Dickey-Fuller
Phillips-Perron
Kwiatkowski, Phillips, Schmidt and Shin (KPSS)
20. Dickey-Fuller Test
Fuller (1976), Dickey and Fuller (1979)
Example:
consider a particular case of an AR(1) model:
$$y_t = \theta y_{t-1} + \varepsilon_t$$
We test the hypothesis
H0: θ = 1 → the series contains a unit root/stochastic trend (is a random
walk)
against
H1: |θ| < 1 → the series is a zero-mean stationary AR(1)
21. Dickey-Fuller Test (2)
For the purpose of testing we reformulate the regression:
$$\Delta y_t = y_t - y_{t-1} = \theta y_{t-1} - y_{t-1} + v_t = (\theta - 1) y_{t-1} + v_t = \psi y_{t-1} + v_t$$
so that the test of H0: θ = 1 is equivalent to a test of H0: ψ = 0, where ψ = θ − 1
The test is based on the t-ratio for ψ
this t-ratio does not have the usual t-distribution under H0
critical values are derived from Monte Carlo experiments, and are tabulated
(known): see appendix A
The test is not invariant to the addition of deterministic
components (more general formulation: intercept + time trend)
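A minimal sketch of the Dickey-Fuller regression without deterministic terms, together with a small Monte Carlo that approximates the 5% critical value under the null. The function names, sample size and number of replications are illustrative assumptions; for actual testing the tabulated critical values should be used.

```python
import math
import random

def df_t_ratio(y):
    """t-ratio on psi in the regression dy_t = psi * y_{t-1} + v_t (no constant, no trend)."""
    lagged = y[:-1]
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]
    sxx = sum(x * x for x in lagged)
    psi = sum(x * d for x, d in zip(lagged, dy)) / sxx
    rss = sum((d - psi * x) ** 2 for x, d in zip(lagged, dy))
    s2 = rss / (len(dy) - 1)
    return psi / math.sqrt(s2 / sxx)

def simulate_ar1(theta, T, rng):
    """AR(1) without intercept, standard normal shocks, started at zero."""
    y, level = [], 0.0
    for _ in range(T):
        level = theta * level + rng.gauss(0.0, 1.0)
        y.append(level)
    return y

def mc_critical_value(T=100, reps=1000, level=0.05, seed=0):
    """Empirical lower-tail quantile of the DF t-ratio when the data are a random walk."""
    rng = random.Random(seed)
    stats = sorted(df_t_ratio(simulate_ar1(1.0, T, rng)) for _ in range(reps))
    return stats[int(level * reps)]

print("simulated 5% critical value:", round(mc_critical_value(), 2))
print("t-ratio, stationary AR(1):  ", round(df_t_ratio(simulate_ar1(0.5, 500, random.Random(0))), 2))
```

The simulated quantile lies well to the left of the −1.645 that a standard normal approximation would suggest, which is the point of using Dickey-Fuller critical values.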
22. Dickey-Fuller Test (3)
Important issue: should deterministic components be included in the test model for $y_t$? Is it
$$\Delta y_t = \psi y_{t-1} + v_t$$
or
$$\Delta y_t = \delta_1 + \psi y_{t-1} + v_t$$
or
$$\Delta y_t = \delta_1 + \delta_2 t + \psi y_{t-1} + v_t \; ?$$
Two ways around:
Use prior information/assume whether the deterministic components are included, i.e., use
one of the restrictions (easy to implement in EViews):
$\delta_1 \neq 0$ and $\delta_2 \neq 0$
$\delta_1 \neq 0$ and $\delta_2 = 0$
$\delta_1 = 0$ and $\delta_2 = 0$
Allow for uncertainty about deterministic components (more complicated in EViews) and
implement a testing strategy to find out:
the restrictions on deterministic components
whether $y_t$ is non-stationary
23. DF-Test (3): Deterministic Components are Known
Say, we assume $y_t$ includes an intercept, but not a time trend
$$y_t = \delta_1 + \theta y_{t-1} + v_t$$
We test the hypothesis:
H0: θ = 1 → the series has a unit root/stochastic trend
against
H1: |θ| < 1 → the series is a stationary AR(1) with a constant (possibly non-zero) mean
Reformulate, with ψ = θ − 1:
$$\Delta y_t = \delta_1 + \psi y_{t-1} + v_t$$
Test H0: ψ = 0 → the series has a unit root (stochastic trend) against
H1: ψ < 0 → the series has no unit root (is stationary)
This way is easy: it is ready for you in EViews
But, there are risks involved...
24. DF-Test (4): Risks Posed by Deterministic Components
If deterministic components are not included in the test when
they should be, then the test is not correctly sized:
The test will reject H0: ψ = 0 although it is in fact true and should not
be rejected ($y_t$ is non-stationary): type I error
If deterministic components are included but they should not be,
then the test has low power (especially in finite (short) samples):
The test will not reject H0: ψ = 0 although it is false and must be
rejected ($y_t$ is stationary): type II error
This is why we may prefer (a degree of) uncertainty about
deterministic components and use testing strategies (see appendix
A for details):
Enders Strategy
Elder and Kennedy Strategy
25. The Augmented Dickey Fuller (ADF) Test
The DF-test above is only valid if $\varepsilon_t$ is white noise: $\varepsilon_t \sim \text{i.i.d.}(0, \sigma^2)$
$\varepsilon_t$ will be autocorrelated if there was autocorrelation in the first
difference ($\Delta y_t$), and we have to control for it
The solution is to “augment” the test using p lags of the
dependent variable. The alternative model (including the
constant and the time trend) is now written as:
$$\Delta y_t = \delta_1 + \delta_2 t + \psi y_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \varepsilon_t$$
26. The ADF-Test (2)
Again, we have three choices:
(1) include neither a constant nor a time trend
(2) include a constant
(3) include a constant and a time trend
Again, we either:
use prior information and impose a model from the beginning, or
remain uncertain about deterministic components and follow one of the
strategies
Useful result: critical values for the ADF-test are the same as for the
DF-test
Note, however, that the test statistics are sensitive to the lag length p
27. The ADF-Test: Lag Length Selection
Three approaches are commonly used:
Akaike Information Criterion (AIC)
Schwarz-Bayesian Criterion (SBC)
General-to-Specific successive t-tests on lag coefficients
AIC and SBC are statistics that favour fit (smaller residuals) but penalize every
additional parameter that needs to be estimated:
So, we prefer a model with a smaller value of a criterion statistic
General-to-Specific: begin with a general model where p is fairly large, and
successively re-estimate with one less lag each time (keeping the sample fixed)
It is advised to use AIC
SBC tends to select too parsimonious a model
The ADF-test is biased when any autocorrelation remains in the residuals
Note: the test critical values do not depend on the method used to select the lag length
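The lag-selection step can be sketched as follows: estimate the ADF regression with a constant for p = 0, 1, …, p_max on a fixed sample and keep the p with the smallest AIC. This is a minimal sketch using only the standard library; the helper names (`ols`, `adf`, `adf_aic`), the AIC form n·ln(RSS/n) + 2k (one common variant) and the simulated test series are assumptions made for illustration.

```python
import math
import random

def ols(X, y):
    """OLS via the normal equations; returns (coefficients, RSS, inverse of X'X)."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return beta, rss, inv

def adf(y, p, max_lag):
    """ADF regression with a constant and p lagged differences; sample fixed by max_lag."""
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]  # dy[m] = y[m+1] - y[m]
    X, z = [], []
    for t in range(max_lag, len(dy)):
        X.append([1.0, y[t]] + [dy[t - i] for i in range(1, p + 1)])
        z.append(dy[t])
    beta, rss, inv = ols(X, z)
    n, k = len(z), len(beta)
    t_psi = beta[1] / math.sqrt(rss / (n - k) * inv[1][1])
    return t_psi, n * math.log(rss / n) + 2 * k

def adf_aic(y, max_lag=6):
    """Return (lag length chosen by AIC, ADF t-ratio at that lag length)."""
    results = {p: adf(y, p, max_lag) for p in range(max_lag + 1)}
    best = min(results, key=lambda p: results[p][1])
    return best, results[best][0]

rng = random.Random(0)
series, level = [], 0.0
for _ in range(400):  # stationary AR(1) with theta = 0.5, for illustration only
    level = 0.5 * level + rng.gauss(0.0, 1.0)
    series.append(level)
print(adf_aic(series))
```

The resulting t-ratio would be compared with the Dickey-Fuller critical value for the constant-only case (roughly −2.86 at the 5% level in large samples), not with a standard normal or t table.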
28. Dickey-Fuller (and ADF) Test: Criticism
The power of the tests is low if the process is stationary but
with a root “close” to 1 (so called “near unit root” process)
e.g., the test is poor at rejecting θ = 1 (ψ = 0) when the true
data generating process is
$$y_t = 0.95 y_{t-1} + \varepsilon_t$$
This problem is particularly pronounced in small samples
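The low power against near-unit-root alternatives can be seen in a small Monte Carlo experiment. This sketch assumes the no-constant Dickey-Fuller regression, a sample of T = 50, 300 replications and an approximate 5% critical value of −1.95; all function names are illustrative.

```python
import math
import random

def df_t_ratio(y):
    """t-ratio on psi in dy_t = psi * y_{t-1} + v_t (no deterministic terms)."""
    lagged = y[:-1]
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]
    sxx = sum(x * x for x in lagged)
    psi = sum(x * d for x, d in zip(lagged, dy)) / sxx
    rss = sum((d - psi * x) ** 2 for x, d in zip(lagged, dy))
    return psi / math.sqrt(rss / (len(dy) - 1) / sxx)

def df_power(theta, T=50, reps=300, crit=-1.95, seed=0):
    """Share of replications in which the unit-root null is rejected when the truth is theta."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y, level = [], 0.0
        for _ in range(T):
            level = theta * level + rng.gauss(0.0, 1.0)
            y.append(level)
        if df_t_ratio(y) < crit:
            rejections += 1
    return rejections / reps

print("power against theta = 0.50:", df_power(0.50))
print("power against theta = 0.95:", df_power(0.95))
```

With a root of 0.5 the null is rejected almost always, whereas with a root of 0.95 most samples of this length fail to reject it.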
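29. The Phillips Perron (PP) test
Rather popular in the analysis of financial time series
The test regression for the PP-tests is
$$\Delta y_t = \delta_1 + \delta_2 t + \psi y_{t-1} + \varepsilon_t$$
PP modifies the test statistic to account for any serial correlation and
heteroskedasticity of $\varepsilon_t$
The usual t-statistic in the DF-test, $t_{\psi=0}$, is modified:
$$Z_t = \left(\frac{\hat\sigma^2}{\hat\lambda^2}\right)^{1/2} t_{\psi=0} - \frac{1}{2}\,\frac{\hat\lambda^2 - \hat\sigma^2}{\hat\lambda}\cdot\frac{T \cdot SE(\hat\psi)}{\hat\sigma}$$
where
$$\hat\sigma^2 = \frac{1}{T}\sum_{t=1}^{T}\hat\varepsilon_t^2 \quad \text{(estimate of the variance)}$$
$$\hat\lambda^2 = \hat\sigma^2 + 2\sum_{j=1}^{q}\left[1 - \frac{j}{q+1}\right]\hat\gamma_j, \qquad \hat\gamma_j = \frac{1}{T}\sum_{t=j+1}^{T}\hat\varepsilon_t\hat\varepsilon_{t-j} \quad \text{(estimate of the autocovariance of order } j\text{)}$$
q is the number of lags up to which error autocorrelation might be present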
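The long-run variance that enters the PP correction is a weighted sum of residual autocovariances with linearly declining (Bartlett) weights. A minimal sketch, assuming the residuals have already been computed and have mean zero; the function name `long_run_variance` is an illustrative choice.

```python
def long_run_variance(resid, q):
    """Long-run variance: gamma_0 + 2 * sum_{j=1..q} (1 - j/(q+1)) * gamma_j."""
    T = len(resid)

    def gamma(j):
        # Sample autocovariance of order j (residuals assumed to have mean zero).
        return sum(resid[t] * resid[t - j] for t in range(j, T)) / T

    return gamma(0) + 2.0 * sum((1.0 - j / (q + 1)) * gamma(j) for j in range(1, q + 1))

resid = [1.0, -1.0, 1.0, -1.0]          # strongly negatively autocorrelated toy residuals
print(long_run_variance(resid, q=0))    # equals the ordinary variance estimate
print(long_run_variance(resid, q=1))    # smaller, because gamma_1 is negative
```

With q = 0 the long-run variance collapses to the ordinary variance estimate, in which case the second term of the correction vanishes and the statistic reduces to the usual t-ratio.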
30. The PP test (2)
Under the null hypothesis that ψ = 0, the $Z_t$ statistic has the same
asymptotic distribution as the ADF t-statistic
Advantages:
PP-test is robust to general forms of heteroskedasticity in $\varepsilon_t$
No need to specify the lag length for the test regression
31. The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
The KPSS test is a stationarity test. The H0 is: $y_t \sim I(0)$
Start with the model:
$$y_t = D_t + \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + u_t, \quad u_t \sim \text{i.i.d.}(0, \sigma_u^2)$$
$D_t$ contains deterministic components, $\varepsilon_t$ is I(0) and may be heteroskedastic
The test is then H0: $\sigma_u^2 = 0$ against the alternative H1: $\sigma_u^2 > 0$
The KPSS test statistic is:
$$KPSS = \frac{T^{-2}\sum_{t=1}^{T}\hat S_t^2}{\hat\lambda^2}$$
where $\hat S_t = \sum_{j=1}^{t}\hat\varepsilon_j$ is a cumulative residual function and $\hat\lambda^2$ is a
long-run variance of $\varepsilon_t$, as defined earlier for the PP test
See Appendix C on some details w.r.t. critical values
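A minimal sketch of the KPSS statistic for the case where the only deterministic component is a constant (level stationarity). The function name `kpss_level`, the Bartlett-weighted long-run variance and the toy inputs are assumptions made for illustration.

```python
def kpss_level(y, q):
    """KPSS statistic with a constant only: sum of squared partial sums / (T^2 * long-run variance)."""
    T = len(y)
    mean = sum(y) / T
    resid = [v - mean for v in y]          # residuals from regressing y on a constant

    partial, sum_sq = 0.0, 0.0
    for e in resid:                        # S_t = e_1 + ... + e_t
        partial += e
        sum_sq += partial * partial

    def gamma(j):
        return sum(resid[t] * resid[t - j] for t in range(j, T)) / T

    lrv = gamma(0) + 2.0 * sum((1.0 - j / (q + 1)) * gamma(j) for j in range(1, q + 1))
    return sum_sq / (T * T * lrv)

print(kpss_level([1.0, -1.0, 1.0, -1.0], q=0))           # small: series hovers around its mean
print(kpss_level([float(t) for t in range(100)], q=4))   # large: series drifts away from its mean
```

Large values of the statistic are evidence against stationarity; for the constant-only case the commonly tabulated 5% critical value is about 0.463.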
32. Testing for Higher Orders of Integration
Just when we thought it is over... Consider:
$$\Delta y_t = \psi y_{t-1} + \varepsilon_t$$
we test H0: ψ = 0 vs. H1: ψ < 0
If H0 is rejected, then $y_t$ is stationary
What if H0 is not rejected? The series has a unit root, but is that
it? No! What if $y_t \sim I(2)$? So we now need to test
H0: $y_t \sim I(2)$ vs. H1: $y_t \sim I(1)$
Regress $\Delta^2 y_t$ on $\Delta y_{t-1}$ (plus lags of $\Delta^2 y_t$, if necessary)
Test H0: $\Delta y_t \sim I(1)$, which is equivalent to H0: $y_t \sim I(2)$
33. Working with Non-Stationary Variables
Consider a regression model with two variables; there are 4 cases to deal
with:
Case 1: Both variables are stationary => the classical regression model is valid
Case 2: The variables are integrated of different orders => unbalanced
(meaningless) regression
Case 3: Both variables are integrated of the same order; regression
residuals contain a stochastic trend => spurious regression
Case 4: Both variables are integrated of the same order; the residual series
is stationary => y and x are said to be cointegrated and…
You will have more on this in L-5, L-8 and L-9
34. Cointegration
An important implication is that non-stationary time
series can be rendered stationary by differencing
Now we turn to the case of N > 1 (i.e., multiple
variables)
An alternative approach to achieving stationarity is to
form linear combinations of the I(1) series: this is
the essence of “cointegration” [Engle and Granger
(1987)]
35. Cointegration
Three main implications of cointegration:
Existence of cointegration implies a set of dynamic long-run
equilibria where the weights used to achieve stationarity are the
parameters of the long-run (or equilibrium) relationship.
The OLS estimates of the weights converge to their population
values at a super-consistent rate of “T” compared to the usual
$\sqrt{T}$ rate of convergence.
Modeling a system of cointegrated variables allows for
specification of both the long-run and short-run dynamics. The
end result is called a “Vector Error Correction Model (VECM)”.
36. Cointegration
We will see that cointegrated systems (VECMs) are
special VARs.
Specifically, cointegration implies a set of non-linear
cross-equation restrictions on the VAR.
The easiest/most flexible way to estimate VECMs is by
full-information maximum likelihood.
37. Long-Run Equilibrium Relationships: Examples
Permanent Income Hypothesis (PIH)
Postulates a long-run relationship between log real
consumption and log real income:
$$\log(rc_t) = \beta_c + \beta_y \log(ry_t) + u_t$$
Assuming real consumption and income are non-stationary (I(1)) variables, the PIH postulates that
real consumption and income move together over time
and that $u_t$ is a stationary series.
38. Term Structure Of Interest Rates
Models the relationship between the yields on bonds of
differing maturities.
The prior is that yields of different (longer) maturities can be
explained in terms of a single (typically shorter) maturity yield.
For example:
$$r_{3,t} = \beta_{c,1} + \beta_{1,1} r_{1,t} + u_{1,t}$$
$$r_{2,t} = \beta_{c,2} + \beta_{2,1} r_{1,t} + u_{2,t}$$
All the yields are assumed to be I(1), but the residuals are I(0)
[stationary]. This is an example of a system of three variables
with two (2) long-run relationships
39. VECM
Cointegration postulates the existence of long-run
equilibrium relationships between non-stationary
variables where short-run deviations from equilibrium
are stationary.
What is the underlying economic model?
How do we estimate such a model?
40. Bivariate VECMs
Consider a bivariate model containing two I(1)
variables, say $y_{1,t}$ and $y_{2,t}$.
Assume the long-run relationship is given by
$$y_{1,t} = \beta_c + \beta_y y_{2,t} + u_t$$
Here $\beta_c + \beta_y y_{2,t}$ represents the long-run equilibrium,
and $u_t$ represents the short-run deviations from the
long-run equilibrium (see next slide).
41. Phase Diagram: VECM
[Figure: phase diagram with y1 on the vertical axis and y2 on the horizontal axis, marking the points A, B, C and D referred to on the following slides.]
42. Adjusting Back To Equilibrium
Suppose there is a positive shock in the previous
period, raising $y_{1,t-1}$ to point B while leaving $y_{2,t-1}$
unchanged.
How can the system converge back to its long-run
equilibrium?
There are three possible trajectories…
43. Adjustments Are Made by Y1,t
Long-run equilibrium is restored by $y_{1,t}$ decreasing
toward point A while $y_{2,t}$ remains unchanged at its
initial position.
Assuming that the short-run changes in $y_{1,t}$ are a
linear function of the size of the deviation from the
LR equilibrium, $u_{t-1}$, the adjustment in $y_{1,t}$ is given by:
$$\Delta y_{1,t} = y_{1,t} - y_{1,t-1} = \alpha_1 u_{t-1} + v_{1,t} = \alpha_1 (y_{1,t-1} - \beta_c - \beta_y y_{2,t-1}) + v_{1,t}$$
$$\alpha_1 < 0$$
44. Adjustments Are Made by Y2,t
Long-run equilibrium is restored by $y_{2,t}$ increasing
toward point C while $y_{1,t}$ remains unchanged after the
initial shock.
Assuming that the short-run movements in $y_{2,t}$ are a
linear function of the size of the deviation from the LR equilibrium, $u_{t-1}$, the adjustment
in $y_{2,t}$ is given by:
$$\Delta y_{2,t} = y_{2,t} - y_{2,t-1} = \alpha_2 u_{t-1} + v_{2,t} = \alpha_2 (y_{1,t-1} - \beta_c - \beta_y y_{2,t-1}) + v_{2,t}$$
$$\alpha_2 > 0$$
45. Adjustments are made by both Y1,t and Y2,t
The previous two equations may operate
simultaneously with both $y_{1,t}$ and $y_{2,t}$ converging to a
point on the long-run equilibrium path such as D.
The relative strengths of the two adjustment paths
depend on the relative magnitudes of the adjustment
parameters, $\alpha_1$ and $\alpha_2$.
The parameters $\alpha_1$ and $\alpha_2$ are known as the “error-correction
parameters” or short-run adjustment
coefficients.
46. VECM = Special VAR
A VECM is actually a special case of a VAR where
the parameters are subject to a set of cross-equation
restrictions because all the variables are governed
by the same long-run equations. Consider what we
have when we put the two equations together:
$$\begin{bmatrix}\Delta y_{1,t}\\ \Delta y_{2,t}\end{bmatrix} = \begin{bmatrix}-\alpha_1\beta_c\\ -\alpha_2\beta_c\end{bmatrix} + \begin{bmatrix}\alpha_1\\ \alpha_2\end{bmatrix}\begin{bmatrix}1 & -\beta_y\end{bmatrix}\begin{bmatrix}y_{1,t-1}\\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}v_{1,t}\\ v_{2,t}\end{bmatrix}$$
or in terms of a VAR…
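47. VECM = Special VAR
$$\begin{bmatrix}y_{1,t}\\ y_{2,t}\end{bmatrix} = \begin{bmatrix}-\alpha_1\beta_c\\ -\alpha_2\beta_c\end{bmatrix} + \begin{bmatrix}1+\alpha_1 & -\alpha_1\beta_y\\ \alpha_2 & 1-\alpha_2\beta_y\end{bmatrix}\begin{bmatrix}y_{1,t-1}\\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}v_{1,t}\\ v_{2,t}\end{bmatrix}$$
which is clearly a first-order VAR
$$y_t = \mu + \Phi y_{t-1} + v_t$$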
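The equivalence between the bivariate error-correction equations and a restricted first-order VAR can be checked numerically. This is a minimal sketch; the function names and the parameter values (adjustment coefficients −0.5 and 0.25, long-run slope 1, long-run intercept 0) are illustrative assumptions.

```python
def vecm_to_var(alpha1, alpha2, beta_c, beta_y):
    """Intercept vector and coefficient matrix of the VAR(1) implied by the bivariate VECM."""
    mu = [-alpha1 * beta_c, -alpha2 * beta_c]
    phi = [[1 + alpha1, -alpha1 * beta_y],
           [alpha2, 1 - alpha2 * beta_y]]
    return mu, phi

def vecm_step(y_prev, alpha1, alpha2, beta_c, beta_y):
    """One period of error correction with the shocks set to zero."""
    u = y_prev[0] - beta_c - beta_y * y_prev[1]   # lagged deviation from equilibrium
    return [y_prev[0] + alpha1 * u, y_prev[1] + alpha2 * u]

def var_step(y_prev, mu, phi):
    """One period of the VAR(1) with the shocks set to zero."""
    return [mu[i] + phi[i][0] * y_prev[0] + phi[i][1] * y_prev[1] for i in range(2)]

params = dict(alpha1=-0.5, alpha2=0.25, beta_c=0.0, beta_y=1.0)
mu, phi = vecm_to_var(**params)

y = [2.0, 0.0]                      # y1 has been pushed above its equilibrium value
for _ in range(50):
    y = vecm_step(y, **params)
print("after adjustment:", y)       # both variables end on the long-run line y1 = y2

# Reduced rank: (Phi - I) = alpha * beta' has a zero determinant.
det = (phi[0][0] - 1) * (phi[1][1] - 1) - phi[0][1] * phi[1][0]
print("det(Phi - I):", det)
```

Because both adjustment coefficients are non-zero in this example, the two variables share the adjustment and settle at an interior point of the equilibrium line rather than returning to the starting position of either variable.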
48. VECM = Special VAR
Obviously, we have a first-order VAR with two
restrictions on the parameters.
In an unconstrained VAR of order one, no cross-equation
restrictions are imposed, implying 6
unknown parameters.
However, a VECM, owing to the cross-equation
restrictions, has only four unknown parameters.
Fewer restrictions are needed to identify the model.
49. Multivariate Methods: N > 2
We can easily generalize the relationship between a
VAR and a VECM to N variables and p lags.
Assume first that p = 1:
$$y_t = \Phi_1 y_{t-1} + v_t$$
Subtracting $y_{t-1}$ from both sides:
$$\Delta y_t = y_t - y_{t-1} = -(I_N - \Phi_1) y_{t-1} + v_t$$
or
$$\Delta y_t = -\Phi(1) y_{t-1} + v_t, \quad \text{where } \Phi(1) = I_N - \Phi_1$$
This is a VECM, but with p = 0 lags.
50. VAR with p lags > 1
Allowing for p lags gives:
$$\Phi(L) y_t = v_t$$
where $v_t$ is an N-dimensional vector of i.i.d.
disturbances and $\Phi(L) = I_N - \Phi_1 L - \dots - \Phi_p L^p$ is a p-th order
polynomial in the lag operator.
The resulting VECM has p−1 lags, given by:
$$\Delta y_t = -\Phi(1) y_{t-1} + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t, \quad \text{where } \Gamma_j = -\sum_{i=j+1}^{p}\Phi_i$$
51. Cointegration
If the vector time series $y_t$ is assumed to be I(1), then $y_t$ is
cointegrated if there exists an N x r full column rank
matrix, $\beta$, such that the r linear combinations:
$$\beta' y_t = u_t$$
are I(0).
The dimension “r” is called the cointegrating rank and the
columns of $\beta$ are called the cointegrating vectors.
This implies that (N − r) common trends exist that are I(1).
52. Granger Representation Theorem
Suppose $y_t$, which can be I(1) or I(0), is generated by
$$\Delta y_t = -\Phi(1) y_{t-1} + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
Three important cases:
(a) If $\Phi(1)$ has full rank, i.e., r = N, then $y_t$ is I(0)
(b) If $\Phi(1)$ has reduced rank 0 < r < N, with
$\Phi(1) = -\alpha\beta'$ where $\alpha$ and $\beta$ are each (N x r) matrices with full column rank,
then $y_t$ is I(1) and $\beta' y_t$ is I(0), with cointegrating vectors
given by the columns of $\beta$
(c) If $\Phi(1)$ has zero rank, r = 0, then $\Phi(1) = 0$
and $y_t$ is I(1) and not
cointegrated.
53. Examples: Rank of Long-Run Models
The form of $\Phi(1)$ for the two long-run models we
considered above:
Permanent Income: (N = 2, r = 1)
$$\Phi(1) = -\alpha\beta' = -\begin{bmatrix}\alpha_{1,1}\\ \alpha_{2,1}\end{bmatrix}\begin{bmatrix}1\\ -\beta_y\end{bmatrix}'$$
Term structure: (N = 3, r = 2)
$$\Phi(1) = -\alpha\beta' = -\begin{bmatrix}\alpha_{1,1} & \alpha_{1,2}\\ \alpha_{2,1} & \alpha_{2,2}\\ \alpha_{3,1} & \alpha_{3,2}\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\\ -\beta_{3,1} & -\beta_{3,2}\end{bmatrix}'$$
54. Key Implications of the GE Representation Theorem
The Granger-Engle theorem suggests the form of the
model that should be estimated given the nature of the
data.
If $\Phi(1)$ has full rank, N, then all the time series must be
stationary, and the original VAR should be specified in
levels. This is the “unrestricted model”.
If $\Phi(1)$ has reduced rank, with 0 < r < N, then a VECM
should be estimated subject to the restrictions
$\Phi(1) = -\alpha\beta'$, viz:
$$\Delta y_t = \alpha\beta' y_{t-1} + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
55. Key Implications of the GE Representation Theorem
If $\Phi(1) = 0$, then the appropriate model is:
$$\Delta y_t = \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
In other words, if all the variables in $y_t$ are I(1) and not
cointegrated, we should estimate a VAR(p−1) in first
differences.
Note that this is the most restricted model compared to the
previous two, which is important when calculating
likelihood ratio tests for cointegration.
56. Dealing With Deterministic Components
We can easily extend the base VECM to include a
deterministic time trend, viz:
$$\Delta y_t = \mu_0 + \mu_1 t + \alpha\beta' y_{t-1} + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
where now $\mu_0$ and $\mu_1$ are (N x 1) vectors of
parameters associated with the intercept and time
trend.
The deterministic components can contribute both to
the short-run and the long-run components of $y_t$
57. Deterministic Components
Suppose we can decompose these parameters into
their short-run and long-run components by defining:
$$\mu_j = \delta_j + \alpha\beta_j', \quad j = 0, 1$$
where $\delta_j$ (N x 1) is the short-run component and $\alpha\beta_j'$
is the long-run component.
We can rewrite the model as:
$$\Delta y_t = \delta_0 + \delta_1 t + \alpha(\beta_0' + \beta_1' t + \beta' y_{t-1}) + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
58. Deterministic Components
The term $(\beta_0' + \beta_1' t + \beta' y_{t-1})$ represents the long-run
relationship among the variables.
The parameter $\delta_0$ provides a drift component in the
equation of $\Delta y_t$, so it contributes a trend to $y_t$
Similarly $\delta_1 t$ allows for a linear time trend in $\Delta y_t$ and a
quadratic trend in $y_t$
By contrast, $\beta_0'$ contributes a constant to the EC-Eq
and $\beta_1' t$ contributes a linear time trend to the EC-Eq
59. Deterministic Components
The equation
$$\Delta y_t = \delta_0 + \delta_1 t + \alpha(\beta_0' + \beta_1' t + \beta' y_{t-1}) + \sum_{j=1}^{p-1}\Gamma_j \Delta y_{t-j} + v_t$$
contains five important special cases summarized on the next
slide.
Model 1 is the simplest (and most restricted) as there are no
deterministic components.
Model 2 allows for r intercepts in the long-run equations.
Model 3 (most common) allows for constants in both the short-run
and the long-run equations: a total of N+r intercepts.
60. Alternative Deterministic Structures
61. Estimating VECM Models
If you are willing to assume that the error term $v_t$ is
white noise and N(0, σ²), the parameters of the VECM
can be estimated directly by full-information maximum
likelihood techniques.
Basically, one estimates a traditional VAR subject to the
cross-equation restrictions implied by cointegration.
Using FIML is the most flexible approach, but it requires
one to ensure that the parameters of the overall model
are identified (via exclusion restrictions). More on this
later.
62. Three Cases:
$\Phi(1)$ can be inverted.
The VECM is equivalent to the unconstrained VAR. No
restrictions are imposed on the VAR.
The maximum likelihood estimator is obtained by
applying OLS to each equation separately.
The estimator is applied to the levels of the data,
since they are (must be) stationary.
63. Reduced Rank (Cointegration) Case: FIML
If $\Phi(1)$ cannot be inverted (i.e., the reduced rank case, or
we are dealing with a cointegrated system), we
impose the cross-equation restrictions coming from
the lagged ECM term(s), and then estimate the
system using full-information maximum likelihood
methods.
The VECM is a restricted model compared to the
unconstrained VAR.
64. Reduced Rank Case: Johansen Estimator
We can also use the Johansen (1988) estimator.
This differs from FIML in that the cross-equation
identifying restrictions are NOT imposed on the
model before estimation.
The Johansen approach estimates a basis for the
vector space spanned by the cointegrating vectors,
and THEN imposes identification on the coefficients.
65. Zero-Rank Case for Φ(1)
When $\Phi(1) = 0$, the VECM reduces to a VAR in
first differences.
As with the full-rank model, the maximum
likelihood estimator is the ordinary least squares
estimator applied to each equation separately.
This is the most constrained model compared to
a VECM/unconstrained VAR in levels.
66. Identification
The Johansen procedure requires one to normalize
the cointegrating vectors so that one of the variables
in the equation is regarded as the dependent
variable of the long-run relationship.
In the bi-variate term structure and the permanent
income example, the normalization takes the form of
designating one of the variables in the system as the
dependent variable.
67. Identification: Triangular Restrictions
Suppose there are r long-run relationships.
Identification can be achieved by transforming the
top (r x r) block of $\hat\beta$ (the long-run parameters) to
the identity matrix.
If r = 1, this corresponds to normalizing one of the
coefficients to unity.
68. Triangular Restrictions
If there are N = 3 variables and r = 2 cointegrating
equations, one sets $\hat\beta$ to:
$$\hat\beta = \begin{bmatrix}1 & 0\\ 0 & 1\\ -\hat\beta_{3,1} & -\hat\beta_{3,2}\end{bmatrix}$$
This form of the normalized estimated cointegrating
vector is appropriate for the tri-variate term structure
model introduced earlier.
69. Structural Restrictions
Traditional identification methods can also be used
with VECMs, including exclusion restrictions, cross-equation
restrictions, and restrictions on the
disturbance covariance matrix.
Example: Johansen and Juselius (1992) propose an
open economy model in which $y_t = \{s_t, p_t, p_t^*, i_t, i_t^*\}$
represents, respectively, the spot exchange rate, the
domestic price level, the foreign price level, the domestic
interest rate and the foreign interest rate.
Thus, N = 5.
70. Open Economy Model
Assuming r = 2 long-run equations, the following
restrictions consisting of normalization, exclusion
and cross-equation restrictions on $\beta$ yield the
normalized long-run parameter matrix
$$\beta' = \begin{bmatrix}1 & -\beta_{2,1} & \beta_{2,1} & 0 & 0\\ 0 & 0 & 0 & 1 & -\beta_{5,1}\end{bmatrix}$$
The long-run equations represent PPP and UIP.
$$s_t = \beta_{2,1}(p_t - p_t^*) + u_{1,t} \quad \text{[PPP]}$$
$$i_t = \beta_{5,1} i_t^* + u_{2,t} \quad \text{[Uncovered IP]}$$
71. Cointegration Rank
So far we have taken the rank of the system as given. But
how do we decide how many cointegrating vectors are in
the vector of N variables?
A simple approach is to estimate models of different rank and
then do a formal likelihood ratio test to decide whether the
restricted model (i.e., the model with rank r less than N) is
appropriate.
Specifically, one would estimate the most restricted model (r
= 0), a model that assumes r = 1, then a model that
assumes r = 2, etc. The process ends when we cannot
reject the null (r = r0).
72. Cointegration Rank: Likelihood Ratio Test
Suppose we estimate the unrestricted model (full rank, r = N), i.e., without imposing any
cointegrating restrictions. Let the parameters involved in that
model be denoted by $\hat\theta_{r=N}$.
Let the value of the likelihood of this model be
denoted by $L_T(\hat\theta_{r=N})$
Now estimate the model assuming a reduced rank r = r0 < N. Obviously,
this is a restricted model compared to the r = N
case. Let the value of the likelihood in this case be
denoted by $L_T(\hat\theta_{r=r_0})$
73. Cointegration Rank: Likelihood Ratio Test
Using the standard result for the likelihood ratio test,
we get the following LR test statistic:
$$LR = -2\left((T-p)\ln L_T(\hat\theta_{r=r_0}) - (T-p)\ln L_T(\hat\theta_{r=N})\right)$$
We reject the restricted model if the likelihood ratio
test statistic is greater than the corresponding critical value.
In this case, imposing the restrictions does not yield a
superior model.
74. Cointegration Rank: Johansen Approach
A numerically equivalent approach was proposed by
Johansen (1988).
He expressed the problem in terms of the eigenvalues
of the likelihood function, an approach that is
numerically equivalent to the likelihood ratio test. He
termed it the “trace statistic”.
The critical values of the LR test are non-standard, and
depend on the structure of the deterministic part of the
model. Critical values are shown on the next slide.
75. Critical Values of the Likelihood Ratio Test
76. Tests on the Cointegrating Vector (Long-Run Parameters)
Hypothesis tests on the cointegrating vector, $\beta$,
constitute tests of long-run economic theories.
In contrast to the cointegration rank tests, the
asymptotic distribution of the Wald, Likelihood Ratio
and Lagrange Multiplier tests is $\chi^2$ under the null
hypothesis that the restrictions are valid.
77. Exogeneity
An important feature of a VECM is that all of the variables
in the system are endogenous.
When the system is out of equilibrium, all the variables
interact with each other to move the system back into
equilibrium.
In a VECM, this process occurs (as we saw) through the
impact of lagged variables so that $y_{i,t}$ is affected by the
lags of the other variables either through the error
correction term, $u_{t-1}$, or through the lags of $\Delta y_{j,t}$, $j \neq i$
78. Weak versus Strong Exogeneity
If the first channel does not exist, i.e., the lagged errorcorrection term does not influence the adjustment
process, the variable concerned is said to be weakly
exogenous.
If the first and second channels do not exist, then only
the lagged values of a variable can be used to explain
its changes. In this case, we say that that variable is
strongly exogenous.
Strong exogeneity testing is equivalent to Granger
causality testing.
79. Example: Exogeneity
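Consider the bivariate term structure model with
one cointegrating vector.
$$\Delta y_t^{10} = \alpha_1 (y_{t-1}^{10} - \beta_0 - \beta_1 y_{t-1}^{1}) + \sum_{i=1}^{p-1} \gamma_{10,i} \Delta y_{t-i}^{10} + \sum_{i=1}^{p-1} \delta_{10,i} \Delta y_{t-i}^{1} + v_{10,t}$$
$$\Delta y_t^{1} = \alpha_2 (y_{t-1}^{10} - \beta_0 - \beta_1 y_{t-1}^{1}) + \sum_{i=1}^{p-1} \gamma_{1,i} \Delta y_{t-i}^{10} + \sum_{i=1}^{p-1} \delta_{1,i} \Delta y_{t-i}^{1} + v_{1,t}$$
The ten-year interest rate, $y_t^{10}$, is said to be weakly
exogenous if $\alpha_1 = 0$.
Strong exogeneity amounts to the requirement that
$\alpha_1 = 0$, $\delta_{10,i} = 0 \;\; \forall i$.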
80. Impulse Response Functions
The dynamics of a VECM can be investigated using
impulse response functions.
The approach is to re-express the VECM as a VAR,
but preserving the implied restrictions on the
parameters.
For example, consider the VECM
$$\Delta y_t = \mu_0 + \mu_1 t + \alpha \beta' y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta y_{t-j} + v_t$$
81. Impulse Response Functions: VECM
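This VECM can be expressed as a VAR in levels:
$$y_t = \mu_0 + \mu_1 t + \sum_{j=1}^{p} \Phi_j y_{t-j} + v_t$$
subject to the restrictions:
$$\Phi_1 = \alpha \beta' + \Gamma_1 + I_N$$
$$\Phi_j = \Gamma_j - \Gamma_{j-1}, \quad j = 2, 3, \dots, p \quad (\text{with } \Gamma_p = 0)$$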
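The mapping from VECM parameters to the levels VAR can be checked numerically. The sketch below is illustrative only: the helper names and the 2-variable parameter values are hypothetical, and it assumes the convention that the last short-run matrix, $\Gamma_p$, is zero. It builds $\Phi_1, \dots, \Phi_p$ from $\alpha\beta'$ and the $\Gamma_j$ and confirms that $\sum_j \Phi_j - I_N$ recovers $\alpha\beta'$.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def vecm_to_var(Pi, Gammas):
    """Pi = alpha * beta' (N x N); Gammas = [Gamma_1, ..., Gamma_{p-1}].

    Returns [Phi_1, ..., Phi_p] for the VAR in levels.
    """
    N = len(Pi)
    zero = [[0.0] * N for _ in range(N)]
    G = list(Gammas) + [zero]                        # append Gamma_p = 0
    Phis = [mat_add(mat_add(Pi, G[0]), identity(N))]  # Phi_1
    for j in range(1, len(G)):
        Phis.append(mat_sub(G[j], G[j - 1]))          # Phi_j, j = 2..p
    return Phis

# Hypothetical 2-variable example with one cointegrating vector and p = 2.
alpha = [[-0.2], [0.1]]
beta_T = [[1.0, -1.0]]                               # beta' as a row vector
Pi = mat_mul(alpha, beta_T)
Gamma_1 = [[0.3, 0.0], [0.1, 0.2]]
Phis = vecm_to_var(Pi, [Gamma_1])

# Check: Phi_1 + ... + Phi_p - I_N should reproduce Pi.
total = Phis[0]
for Phi in Phis[1:]:
    total = mat_add(total, Phi)
recovered_Pi = mat_sub(total, identity(2))
print(Phis)
print(recovered_Pi)
```

Impulse responses would then be computed from the $\Phi_j$ exactly as for an unrestricted VAR, with the cointegration restrictions already built in.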
82. Appendices
83. Appendix A: Process moments, key results: AR(1) model with θ < 1
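For the AR(1) model $y_t = \delta + \theta y_{t-1} + v_t$, with $v_t$ i.i.d. with mean zero and variance $\sigma^2$:
Mean (first moment):
$$E[y_t] = \delta \sum_{j=0}^{t-1} \theta^j + \theta^t y_0 \to \frac{\delta}{1-\theta} \quad \text{as } t \to \infty \qquad (8)$$
Variance (second moment):
$$\mathrm{var}[y_t] = E[(y_t - E[y_t])^2] = E\left[\left(\sum_{j=0}^{t-1} \theta^j v_{t-j}\right)^2\right] \to \frac{\sigma^2}{1-\theta^2} \quad \text{as } t \to \infty \qquad (9)$$
Key point to note is that the first and second moments converge to finite
constants.
So the WLLN applies:
$$\frac{1}{T} \sum_{t=2}^{T} y_{t-1} \xrightarrow{p} \lim_{t \to \infty} E[y_t] \quad \text{and} \quad \frac{1}{T} \sum_{t=2}^{T} y_{t-1}^2 \xrightarrow{p} \lim_{t \to \infty} E[y_t^2]$$
So any estimator based on these quantities should converge in a similar fashion.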
84. Appendix A: Process moments, Simulation of an AR(1) model
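Assume $\delta = 0.0$, $\theta = 0.8$, $\sigma^2 = 1.0$.
It follows that
$$\lim_{t \to \infty} E[y_t] = \frac{\delta}{1-\theta} = \frac{0.0}{1-0.8} = 0.0$$
$$\lim_{t \to \infty} \mathrm{var}(y_t) = \frac{\sigma^2}{1-\theta^2} = \frac{1.0}{1-0.8^2} = 2.778$$
Note that the sample moments converge to these values as the sample size
increases. Also, the variance of the estimator approaches zero as T
increases.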
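The convergence can be illustrated with a small simulation. This is a sketch only, not the simulation behind the original slide: it assumes Gaussian innovations, a starting value of zero and an arbitrary seed, and the function names are hypothetical.

```python
import random

def simulate_ar1(delta, theta, sigma, T, y0=0.0, seed=0):
    """Simulate y_t = delta + theta * y_{t-1} + v_t for t = 1..T."""
    rng = random.Random(seed)
    y, prev = [], y0
    for _ in range(T):
        prev = delta + theta * prev + rng.gauss(0.0, sigma)
        y.append(prev)
    return y

def sample_moments(y):
    """(1/T) * sum of y_{t-1} and of y_{t-1}^2 over t = 2..T."""
    T = len(y)
    lagged = y[:-1]
    m1 = sum(lagged) / T
    m2 = sum(v * v for v in lagged) / T
    return m1, m2

delta, theta, sigma = 0.0, 0.8, 1.0
limit_mean = delta / (1.0 - theta)
limit_second = sigma ** 2 / (1.0 - theta ** 2) + limit_mean ** 2

for T in (100, 10_000, 200_000):
    m1, m2 = sample_moments(simulate_ar1(delta, theta, sigma, T))
    print(f"T={T:>7}: m1={m1: .4f} (limit {limit_mean:.3f}), "
          f"m2={m2:.4f} (limit {limit_second:.3f})")
```

With $\delta = 0$ the limiting second moment equals the limiting variance, so both sample moments should settle near 0.0 and 2.778 as T grows.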
85. Appendix A: Process moments, key results: AR(1) model with θ = 1
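First moment:
$$E[y_t] = \theta^t y_0 + \delta \sum_{j=0}^{t-1} \theta^j = y_0 + \delta t$$
Second moment:
$$\mathrm{var}(y_t) = \sigma^2 \sum_{j=0}^{t-1} \theta^{2j} = \sigma^2 (1 + \theta^2 + \theta^4 + \dots + \theta^{2(t-1)}) = \sigma^2 t$$
Appropriate scaling factors for these moments are $T^{3/2}$ and $T^2$
respectively.
Define the sample moments
$$m_1 = \frac{1}{T^{3/2}} \sum_{t=2}^{T} y_{t-1}, \qquad m_2 = \frac{1}{T^2} \sum_{t=2}^{T} y_{t-1}^2$$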
86. Appendix A: Process moments, simulation of an I(1) Process
Notice that the variances of the first two sample moments do not fall
as the sample size is increased (Columns 2 and 4).
The variances converge to 1/3, so $m_1$ and $m_2$ converge to random
variables in the limit.
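A Monte Carlo sketch of this result is given below. It is not the simulation behind the original table: it assumes a pure random walk with $y_0 = 0$, standard normal innovations, 2000 replications and an arbitrary seed, and the function names are hypothetical.

```python
import random

def scaled_moments(T, rng):
    """Simulate y_t = y_{t-1} + v_t with y_0 = 0 and return (m1, m2)."""
    y = 0.0
    s1 = s2 = 0.0
    for _ in range(T - 1):          # accumulates y_1 .. y_{T-1}
        y += rng.gauss(0.0, 1.0)
        s1 += y
        s2 += y * y
    return s1 / T ** 1.5, s2 / T ** 2

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(42)
results = {}
for T in (100, 400):
    draws = [scaled_moments(T, rng) for _ in range(2000)]
    results[T] = (variance([d[0] for d in draws]),
                  variance([d[1] for d in draws]))
    print(f"T={T}: var(m1)={results[T][0]:.3f}, var(m2)={results[T][1]:.3f}")
```

Unlike the stationary case, the simulated variances should stay in the neighbourhood of 1/3 rather than shrink as T increases.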
87. Appendix B: Enders Strategy
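Step 1. Estimate $\Delta y_t = \alpha_1 + \alpha_2 t + \gamma y_{t-1} + \varepsilon_t$, where $\gamma = \theta - 1$.
• Test H0: $\gamma = 0$ (t-ratio test, 5% critical value −3.45).
  • Rejected: no unit root ($y_t$ is stationary). Additional testing is needed for deterministic components.
  • Not rejected: test H0: $\alpha_2 = \gamma = 0$ (F-test, 5% critical value 6.49).
• If H0: $\alpha_2 = \gamma = 0$ is rejected, test H0: $\gamma = 0$ using the normal distribution (t-test, 5% critical value −1.64).
  • Rejected: no unit root ($y_t$ is stationary around a deterministic trend): $y_t = \alpha_1 + \alpha_2 t + \theta y_{t-1} + \varepsilon_t$, $|\theta| < 1$.
  • Not rejected: unit root ($y_t$ has both stochastic and deterministic trends): $y_t = \alpha_1 + \alpha_2 t + y_{t-1} + \varepsilon_t$.
• If H0: $\alpha_2 = \gamma = 0$ is not rejected, go to Step 2.
Step 2. Estimate $\Delta y_t = \alpha_1 + \gamma y_{t-1} + \varepsilon_t$.
• Test H0: $\gamma = 0$ (t-ratio test, 5% critical value −2.89).
  • Rejected: no unit root ($y_t$ is stationary). Additional testing of $\alpha_1$ is needed.
  • Not rejected: test H0: $\alpha_1 = \gamma = 0$ (F-test, 5% critical value 4.71).
• If H0: $\alpha_1 = \gamma = 0$ is rejected, test H0: $\gamma = 0$ using the normal distribution (t-test, 5% critical value −1.64).
  • Rejected: no unit root ($y_t$ is stationary): $y_t = \alpha_1 + \theta y_{t-1} + \varepsilon_t$, $|\theta| < 1$.
  • Not rejected: unit root ($y_t$ is non-stationary): $y_t = \alpha_1 + y_{t-1} + \varepsilon_t$.
• If H0: $\alpha_1 = \gamma = 0$ is not rejected, go to Step 3.
Step 3. Estimate $\Delta y_t = \gamma y_{t-1} + \varepsilon_t$.
• Test H0: $\gamma = 0$ (t-ratio test, 5% critical value −1.64).
  • Rejected: no unit root ($y_t$ is stationary): $y_t = \theta y_{t-1} + \varepsilon_t$, $|\theta| < 1$.
  • Not rejected: unit root ($y_t$ is non-stationary): $y_t = y_{t-1} + \varepsilon_t$.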
88. Appendix B: Enders Strategy (2)
The Enders strategy was criticized for:
• triple- and double-testing for unit roots
• unrealistic outcomes: economic variables are unlikely to contain both
a stochastic and a deterministic trend, as in
$\Delta y_t = \alpha_1 + \alpha_2 t + \gamma y_{t-1} + \varepsilon_t$, $\alpha_2 \neq 0$, $\gamma = 0$;
this possibility should be excluded from the test
• not taking advantage of prior knowledge
Alternative: Elder and Kennedy Strategy
89. Appendix B: Elder and Kennedy Strategy
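Estimate $\Delta y_t = \alpha_1 + \alpha_2 t + \gamma y_{t-1} + \varepsilon_t$.
Test H0: $\gamma = 0$ (t-ratio test, 5% critical value −3.45).
• Not rejected: unit root ($y_t$ is non-stationary).
  Estimate $\Delta y_t = \alpha_1 + \varepsilon_t$ and test H0: $\alpha_1 = 0$ (double-sided t-test, 5% critical values −1.95 < t < 1.95).
  • Not rejected: unit root ($y_t$ is non-stationary without intercept): $y_t = y_{t-1} + \varepsilon_t$.
  • Rejected: unit root ($y_t$ is non-stationary with intercept): $y_t = \alpha_1 + y_{t-1} + \varepsilon_t$.
• Rejected: no unit root ($y_t$ is stationary).
  Test H0: $\alpha_2 = 0$ (double-sided t-test, 5% critical values −1.95 < t < 1.95).
  • Rejected: no unit root ($y_t$ is stationary around a deterministic trend): $y_t = \alpha_1 + \alpha_2 t + \theta y_{t-1} + \varepsilon_t$, $|\theta| < 1$.
  • Not rejected: no unit root ($y_t$ is stationary without a deterministic trend): $y_t = \alpha_1 + \theta y_{t-1} + \varepsilon_t$, $|\theta| < 1$.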
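The decision logic of this strategy can be written as a small function. This is a sketch under stated assumptions: the function name, argument names and returned labels are hypothetical, the t-ratios are assumed to have been computed elsewhere, and the critical values are the 5% values quoted on the slide. The symbols `gamma`, `alpha1` and `alpha2` refer to the coefficient on $y_{t-1}$, the intercept and the trend coefficient respectively.

```python
def elder_kennedy(t_gamma, t_alpha1=None, t_alpha2=None,
                  df_crit=-3.45, t_crit=1.95):
    """Classify a series from pre-computed t-ratios.

    t_gamma:  t-ratio on y_{t-1} in the regression with intercept and trend.
    t_alpha1: t-ratio on the intercept in the regression of the first
              difference on a constant (needed if the unit root is NOT rejected).
    t_alpha2: t-ratio on the trend term (needed if the unit root IS rejected).
    """
    if t_gamma < df_crit:                       # unit root rejected
        if t_alpha2 is None:
            raise ValueError("t_alpha2 is required when the unit root is rejected")
        if abs(t_alpha2) >= t_crit:             # trend coefficient significant
            return "stationary around a deterministic trend"
        return "stationary without a deterministic trend"
    if t_alpha1 is None:                        # unit root not rejected
        raise ValueError("t_alpha1 is required when the unit root is not rejected")
    if abs(t_alpha1) >= t_crit:                 # intercept (drift) significant
        return "unit root with intercept"
    return "unit root without intercept"

# Hypothetical t-ratios for illustration.
print(elder_kennedy(-4.0, t_alpha2=2.5))
print(elder_kennedy(-2.0, t_alpha1=0.3))
```

Compared with the three-step Enders procedure, only one unit root test is performed, which is the point of the criticism about repeated testing.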
90. Nonstationary Asymptotics
91. Nonstationary Asymptotics
Source: faculty.washington.edu/ezivot/econ584/notes/unitroot.pdf