0% found this document useful (0 votes)
9 views10 pages

MSVAR

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 10

The MSVAR package for gretl

by Sven Schreiber
v0.22, July 2021, this is work in progress and not a finished product!

This package started as a port of the freely available Gauss code by Anders Warne, for
which I am very grateful. The interface is not yet cast in stone, incompatible changes may
happen until further notice. Estimation is based on an EM algorithm.
In order to prevent any disappointment, let us start reporting aspects that are not yet
available:
• The way the starting values are chosen right now probably has a lot of room for
improvement; see also section 4.1.
• Arbitrary exogenous variables are not supported.
• It is not yet possible to restrict the intercept term in such a way that the restriction
applies to the means of the variables. (This is related to the model type 6 below, which
is not yet enabled.)
• Additional parameter restrictions are in principle possible in the underlying code base,
but the interface is not established nor enabled in this package, and that part of the
code has not been tested yet (after the port).

1 Models
The notation is as follows:
p
X
yt = µ(st ) + Φ(st )Dt + Ai (st )yt−i + ut (1)
i=1

with ut |st = N (0, V (st )). In the model variants below some of the coefficients will be
restricted to be homogeneous, i.e. will not depend on the regime state st .
Dt is deterministic (d, 1) and st describes q regimes, meaning that it takes discrete values
among {1, . . . , q}. The q × q transition probability matrix is denoted by P .
The following describes the implemented model variants. The corresponding number
must be specified when setting up a new MSVAR model.
1. m1:switchall – This is the most general model variant. All coefficients in (1) are
actually regime-varying. Remember that this also includes the one-step forecast error
variance-covariance.
2. m2:switchallbutV – Same as model 1, except that the variance is restricted to be
homogeneous, V (st ) = V .
3. m3:switchVmu – Here the intercepts µ(st ) and variances V (st ) are regime-dependent,
but all other coefficients are homogeneous.
4. m4:switchmu – This model is a restricted version of model 3, because in addition the
variances are homogeneous, such that only the equation intercepts µ(st ) are regime-
varying.
5. m5:switchV – In a way this model is the opposite of model 2, because only the variance
V (st ) is switching while all other coefficients are assumed constant.

1
6. This model in Warne’s original list is not (yet?) supported, as it is not really covered
by the formulation (1) and also because it is computationally heavy. Maybe later?

Note that Warne’s special model numbers 7,8,9 which make it possible to have no lags in
the system (p = 0) are automatically detected from the lag order in the VAR. Since they
are special cases of models 1,2,5, just input these numbers along with a $system bundle
containing a zero lag specification. (However, models 3 and 4 cannot be combined with no
lags.)
Also notice that fancy specifications of exogenous variables are not (yet?) supported
and the package tries to detect such situations. Please just use a constant term always and
seasonal dummies if wanted; a linear trend is not recommended, but it should in principle
also be picked up when using the GUI approach or the MS_afterVAR interface below.

2 Usage
2.1 Scripting
The MSVAR package offers two different scripting interfaces, sharing the same backend
engine. Either you first estimate a regular VAR with gretl and then pass the correspond-
ing accessor bundle named “$system” to the MS_afterVAR function (which automatically
invokes MS_exec, so you don’t have to do that explicitly). Alternatively, if you prefer
not to run a plain VAR with gretl’s command first, there’s the more direct setup function
MS_setup, then followed by an explicit call of the MS_exec function. So in either case,
there are two steps involved.

2.2 menu-based GUI


This package’s menu entry will be found in the “Edit” menu of the VAR model window,
provided the user allows the menu attachment.1 Note that you can actively trigger this
menu attachment in gretl’s local function package window by using the “+” (add to menu)
button, or alternatively by right-clicking on the package name and selecting “add to menu”.
So, first estimate a plain VAR with gretl for the desired sample and with the desired
variables and lag order. Again, please use a constant term, and seasonal dummies if wanted.2
Then in the resulting window go to Edit / MS-VAR which brings up a dialog window where
you can specify the parameters that are explained below for the function MS_afterVAR.3

2.3 MS_afterVAR
Arguments (as always with the hansl language, the arguments are positional, so the exact
names in this interface do not matter for the user):

1. bundle sys "system bundle from standard VAR"


2. int m "model type", default 1
3. int q "number of regimes", default 2
4. int qs "number of variables in state", default 1

5. bool sump "serially uncorrelated Markov process", default TRUE


6. int verbosity, from 0 to 2, default 0
7. int manageinitvals "initial value handling", from 0 to 2, default 0
1 In gretl 2021b there was a bug which suppressed the automatic menu attachment for zip packages such
as MSVAR(.zip). Please trigger the attachment manually with that version.
2 Technical detail: seasonal dummies are always centered in MSVAR.
3 If you’re running gretl in another language, the menu title “Edit” is translated to your language, but the

menu entry should always be “MS-VAR”.

2
8. bool timing, default FALSE

Returns a bundle with contents described below in Section 3.


The second argument about the model type (m) obviously refers to the list in 1.
The third argument (q) specifies how many values the state indicator may take on, as
explained above.
A special arrangement is available for the case that several (namely, qs) variables are
assumed to drive the system’s state; namely each of these variables is then restricted to
reflect only two different states, and combining all these binary possibilities then implies a
total number of regimes of q = 2qs , overriding whatever the user gave as input for q.
If the fifth argument (sump) is set to TRUE, then the underlying Markov process is
restricted to be free from autocorrelation. (TODO: add explanation what this means).
The sixth argument defines how much information is printed out.
The functionality behind the seventh argument (manageinitvals) is work in progress, and
concerns the starting values that are used for the estimation algorithm. The default setting
0 is understood to mean: if the parallel-processing technology MPI is present on the host
machine, then try two different sets of starting values, run the estimations based on those
sets independently in parallel, and return the result which yields the higher likelihood value
(given convergence).4 In contrast, the setting 1 chooses to use only a single set of starting
coefficient values, even if MPI is present. And the setting 2 indicates to run the independent
computations even if parallel processing is not possible, which may be time-consuming.
Finally, if the eighth argument is set to TRUE, then some time measurements of different
parts of the algorithm are done and printed out (if not silenced via the verbosity switch).

2.4 MS_setup
Again, this function is for scripting from scratch, without (explicitly) running a standard
linear VAR first. Its interface is heavily bundle-based, which means that all parameters and
options are specified in the options bundle (second argument). If the default settings are
OK for you, the most minimal usage only requires to specify the list of endogenous variables
(first argument).
Arguments:
1. list endo “endogenous variables”
2. bundle opts “options”; optional, can contain the following members which in this in-
terface must be exactly spelled as given. See also the corresponding arguments to
MS_afterVAR.
• “lagord”: lag order of the VAR (default: 1)
• “seasdums”: whether to include centered seasonal dummies (default: FALSE)
• “mtype”: switching model type as explained above (default: 1)
• “nregimes”: number of regimes (default: 2; irrelevant if qs > 1)
• “sdrivers”: number of latent variables driving the regime states (default 1; equiv-
alent to qs above; if qs > 1, then automatically nregimes: q = 2qs )
• “sump”, “verbosity”, “manageinitvals”, “timing” (see above)
Returns a bundle which forms the basis for the next step, to pass to MS_exec.
4 FIXME: The actual computation with different sets of starting values is not yet operational in this version

of the package. – See the manual “gretl + MPI” from the gretl help menu for background information on
what MPI is and how to enable it.

3
2.5 MS_exec
Only to be invoked by the user after the MS_setup call. In contrast, the MS_afterVAR
function takes care of this by itself already. The end result of both approaches is the same
(and described in Section 3).
Arguments:
1. pointer to bundle obtained from MS_setup
2. bool timing (optional, default FALSE)
If the second argument (timing) is TRUE, then some basic assessment of the time spent in
various parts of the code is attempted.
Returns as a scalar the number of sub-bundles that are added as nested members into
the main bundle, see subsection 3.2 for details. Also adds some stuff directly to the top level
of the input bundle. If all goes well (and verbosity is not switched off completely), it will
also print out estimates and some tests.

2.6 MS_irfs
This function computes regime-specific impulse response functions by taking internally the
array of estimated “compantops” (see subsection 3.1.2) and applying gretl’s built-in vma
function to each of them. Therefore arguments 2 and 3 are equivalent to those of vma.
Note that these impulse responses are conditional on staying in the analyzed regime and
therefore ignore the possibility that in any period the regime may switch. Also bear in
mind that the IRFs only depend on the autoregressive parameters and will only differ across
regimes if the AR coefficients are actually allowed to switch; i.e., only in model types 1 and
2.
Arguments:
1. MSVAR estimation result bundle, having been processed either by MS_afterVAR or
MS_exec (or via the GUI execution).
2. (Optional) Identification matrix K of dimension n × n, defining the relationship be-
tween reduced-form errors u and structural shocks e: ut = Ket . This matrix is taken
to be the same across regimes. If this is omitted (or an explicit null value is passed)
then it is taken to be the identity matrix.
3. (Optional) Maximum horizon range i, implying horizons from 0 (contemporaneous
impact) to i − 1. Default value is 24.
Returns q matrices arrays in a nested fashion, best explained by example:5 After a call
arrays switchIrfs = MS_irfs(b), one can access a specific impulse response matrix like
switchIrfs[2][4], which means to pick the impulse responses from the 2nd regime and
for the 4th shock. The rows of this matrix refer to the target variable, with the horizon
periods in columns.

2.7 What’s next


You might want to inspect the regime probabilities, so check out the “exanteprobs” / ”fil-
terprobs” / ”smoothprobs” matrix results in the main bundle (see also below in subsection
3.1). Presumably you want to use the coefficients for further analysis. For example, the
AR coefficients for the various regimes are collected in the “compantops” array of matrices
(again, a bundle member), and the Markov transition matrix is “Pmat”.

3 Members of the result bundle


Being bundle members, they are referenced by their keys which must be spelled exactly as
given (and are case sensitive as always in hansl ).
5A nested array of arrays is one of the most recent object type additions in gretl, starting in gretl 2019d.

4
3.1 Top level of the bundle (the one created by the setup functions
and subsequently amended)
3.1.1 Global and other results
• t1 and t2 – observation index numbers of the effective sample in the current gretl
workfile (copied from $system, more useful than ep,sp)
• T – effective sample size (t2 − t1 + 1)
• lnL – log-likelihood value (at the final iteration, thus either at convergence or at the
max iteration)
• dlnL – change of log-likelihood (at the final iteration, thus either at convergence or at
the max iteration)
• Conscore (matrix) – conditional score matrix (T columns)
• tauscore (matrix) – tau score matrix (T columns)
• it – number of iterations done
• infocrit (matrix) – row vector with the three information criteria
Finally we have the following results of the task of “Constructing the mean of A(s(t)) and
V(s(t)) conditional on y(t),y(t-1),., and the estimated parameters”:
• mus
• Vis

3.1.2 Per-regime coefficient results


We offer several matrices arrays of length q each, such that the matrix members are directly
indexed by the regime number:
• compantops (matrices array) – holds the respective estimated companion matrix (top
portions) for each regime, usable as input for gretl’s native vma() function. Each
matrix has n rows and np columns.
• resCovs (matrices array) – the n × n innovation covariance matrix for each regime.
• cnstcfs (matrices array) – the n × 1 constant term coefficients for each regime.
• exocfs (matrices array) – is an empty (zero-element) array if there are no exogenous
terms apart from the constant; otherwise has q elements with n rows each.
Then there are the “raw” results which are just different representations of the previous
terms:
• A (matrix) – the collection of cofficients µ, Φ, and Ai ; the coefficients for the various
regimes are vertically stacked, so the number of rows is: number of regimes times
number of equations (q × n). The various Ai are horizontally concatenated, and in the
case of no further deterministic variables Φ is omitted, so the number of columns is
then 1 + np.
• V (matrix) – collection of system residual covariance matrix in each regime, vertically
stacked (qn × n)
• Amod (matrix) – vector with (np)2 q rows holding the roots moduli of the AR part of
each regime (regimes stacked vertically)
• Aerror (matrix) – standard errors of A
• Verror (matrix) – standard errors of V
• Perror (matrix) – standard errors of P (Pmat)

5
3.1.3 Estimated probabilities
• Pmat (matrix) – transition probabilities (q × q)
• nstates (matrix) – has q rows; first column simply holds regime index sequence 1 . . . q,
last column the number of observations that fall into the corresponding regime; for
qs = 1 these are the only columns, for qs > 1 there are further columns in the middle
with subindices

The following matrix results with estimated regime probabilities each have q columns if
qs = 1; else they have 2qs columns. In the following, “regime-number” ranges between
1 . . . q, and “statevarnum” ranges between 1..qs.
• exanteprobs (matrix) – ex-ante probabilities: P r(st = regime-number|yt−1 , yt−2 , ...)
or P r(sstatevarnum,t = <1 or 2>|yt−1 , yt−2 , ...)

• filterprobs (matrix) – filtered probabilities: P r(st = regime-number|yt , yt−1 , ...) or


P r(sstatevarnum,t = <1 or 2>|yt , yt−1 , ...)
• smoothprobs (matrix) – smoothed probabilities: P r(st = regime-number|yT , yT −1 , ...)
or P r(sstatevarnum,t = <1 or 2>|yT , yT −1 , ...)

3.2 Nested bundles added by MS_exec


If there is no error and the estimation algorithm converged, the MS_exec function stuffs
into the main bundle some nested sub-bundles which contain further results. Otherwise they
will simply be absent.

3.2.1 inputs – Copied or encoded user inputs


These objects reflect the user input and the data properties and are moved to the sub-bundle
“inputs” to avoid clutter of the top level.
• m (alias mtype) – the model type (see numbers in section 1)

• q (alias nregimes) – number of states/regimes


• qs (alias sdrivers)– number of regime variables; if this is greater than one, then the
assumption is that each of these (hidden) variables can only have two states, such that
the total number of states then is q = 2qs . (If qs is unity, then q can be any positive
number in principle.)

• datamat (matrix) – observations of endogenous variables in rows (including initial


values for lags, so obsT rows), n variables in columns
• obsT – sample size including initial values (T + p)
• n – number of equations / endogenous variables

• verbosity – chosen value by user


• freqpd – data periodicity, basically the same as gretl’s $pd (data other than annual,
quarterly or monthly may or may not work)
• p (alias lagord) – lag order

• sump – Indicator for the user choice of a serially uncorrelated Markov process or not
(this restriction is not supported for qs > 1). Choosing this means that the transition
probabilities are restricted to be equal to the long-run probabilities.
• convcrit – convergence criterion for the loglik, usually 1e-9

6
• dparm – tolerance criterion for the relative change of parameter values (for the nuerical
calculation of derivatives), usually 1e-5

• mvars – comma-separated string with the names of endogenous variables (example:


"dy,dm")
• maxit – maximum iterations allowed (usually 500)
• Trend – indicator whether a linear trend is included (currently not supported!)

• Sea – indicating seasonal centered dummies; possible values are 0 (none) or the data
periodicity, e.g. 4 or 12
• d – number of deterministic terms (apart from the constant; Trend + Sea -1)
• manageinitvals – indicates whether to use (and how) different sets of initial values; 0:
try different sets only when MPI present (to do that in parallel), 1: only try the most
natural set (based on the plain VAR), even if MPI present, 2: try two sets, even if
MPI absent (by running them sequentially) (FIXME: not [fully?] working yet)
• initvalsets – An array of bundles (since bundles can live within bundles) holding dif-
ferent sets of possible initial (=starting) values for A, V, Pmat. So far there are two
array members. To look at the initial values for A in the second set you would have
to write (assuming the main bundle is called ’bMS’): bMS.inputs.initvalsets[2].A

3.2.2 bs – sub-bundle from the smoothing calculation


“bs” is partly related to smoothed probabilities P [s(t) = i|y(0), ., y(T )], P [s(t) = i, s(t − 1) =
j|y(0), ., y(T )] and P [(s(t) = i|y(0), ., y(t)]; but also again many internals

• bs.rho
• bs.Ji
• bs.Jij
• bs.piT

• bs.pijT
• bs.pjT
• bs.xiqt

• bs.lnL
• bs.dlnL
• bs.xikt
• bs.xikt1

• bs.Pstar

3.2.3 bg – sub-bundle stemming from the gradient calculation


“bg” is a nested bundle inside the main bundle. Normally you don’t need to access it,
members are only internally relevant.
• bg.Gradient

• bg.maxgr
• bg.Pmat

7
• bg.Pmat0

• bg.p0q
• bg.prijT
• bg.H1qs
• bg.e1t

• bg.e2t
• bg.Jhqs

3.2.4 bserr – from standard error calculations


This sub-bundle has the following members:

• Np0 – Nyblom stability tests on the Markov chain


• Nmu – Nyblom stability tests on mu
• Noma – Nyblom stability tests on the residual covariance
[These three contain 90%/95%/99% critical values in columns 2-4. / Only for at least
one lag.]

• Nyeqst (and Nyeqstcv) – Nyblom stability tests on params in equations (rows; state
in cols; again only for at least one lag)
• Wald (and Waldpv) – Wald test of serially uncorrel. Markov chain (only if serially
uncorrel. MC choice FALSE)

• Waldf (and Waldfpv) – F-test of serially uncorrel. Markov chain (only if serially
uncorrel. MC choice FALSE)
• Wald1 (and Wald1pv) – Wald tests of no serial correlation for each independent Markov
chain (only applicable to qs ≥ 2)

• Wald1f (and Wald1fpv) – F-test of no serial correlation for each independent Markov
chain (only applicable to qs ≥ 2)
• LM (and LMpv) – LM test for qs independent two-state regime variables, with p-
values (given by s11-s1 d.o.f., where both s11 and s1 are also bundle members; only
applicable to qs ≥ 2)

• LMf (and LMfpv) – same thing, small-sample corrected F version (only applicable to
qs ≥ 2)
• s1 – restriction counter such that s12 restrictions are imposed on the Markov proba-
bilities

• LMrest (and LMrestpv) – LM test of restrictions; can be either restrictions on the VAR
parameters6 and/or restrictions on the Markov transition probabilities (only applicable
if restrictions, e.g. if serially uncorrel. MC choice TRUE)
• LMfres (and LMfrespv) – same thing, small-sample corrected F version
6 not implemented yet in this package!

8
3.2.5 bmis – misspecification test-related
Tthis sub-bundle contains results that are only applicable if no restrictions are imposed
(neither on the VAR parameters nor a serially uncorrelated MC), otherwise it will be absent.
All tests are based on conditional scores. Members:

• misspece (and pvspece) – Equation-by-equation misspecification tests based on condi-


tional scores; the column contents are: 1 – equation number, 2 – Autocorr, 3 – ARCH,
4 – Markov (the last column only if qs = 1 or if n = 1)
• misspec (and pvspec) – System tests variant; columns are. 1 – Autocorr, 2 – ARCH,
3 – Markov (only if n > 1)

Warne refers to Hamilton (1991), "Specification Testing in Markov-Switching Time-Series


Models", for the special case of n = 1, and explains: “The test statistic used is an F-version
of the conditional score test proposed by White-Newey-Tauchen.”

3.2.6 bc – from the calculation of the conditional score


Computes scores for the Markov probabilities for samples τ = 1, 2, ., T . These are the basis
for the computation of the conditional scores, i.e. for dlnL(y(τ )|Y (τ − 1))/dp(i, j). Two of
the main results are actually stuffed into the top level of the main bundle already (Conscore,
tauscore). Again internals remaining.
• GrLMrest

• Diagp01
• Diagp11

3.2.7 bmv – related to mean and var of y


This may be absent for example if the stationarity check fails, or if other deterministics are
included; members:

• dely and convy – First two moments / Mean and covariance matrix conditional on s(t)
• stddely and seconvy – standard errors of the above (based on conditional scores and
the Delta method using numerical partials)
• muy and vy – First two moments / Mean and covariance matrix

• stdmuy and stdvy – standard errors of the above (based on conditional scores and the
Delta method using numerical partials)
• pivec and pierr1 – ergodic probabilities and their standard errors (based on conditional
scores) (only if serially uncorrel. MC choice FALSE)
[ these above only if no trend, and if at least one lag]

3.2.8 internal – Internal or Not used


And there are some elements which are not used yet, not really needed, or not supported at
all, but might become relevant in the future; all collected in the sub-bundle “internal”.
• rho (matrix) – internal, q 2 column vector

• rest – number of imposed restrictions (not used yet)


• Rmat, rvec, Rort (matrix-es) – representation of restrictions (not used yet)
• check – request indicator if the provided initial values should be checked whether they
fulfill the specified restrictions (not used yet)

9
• fobs – first observation numerical pseudo label, would indicate subsampling if larger
than sp, not really needed in the gretl context
• sp, ep – numerical decimal pseudo labels of the start and ending observations, not
needed usually in the gretl context (example: 1959.08)

4 Other notes
4.1 Starting values
Currently the starting values (for the coefficients) are based on a non-switching plain VAR,
where some ad-hoc variations are introduced for the various regimes. Given that there is a
known danger to get trapped in a local maximum of the likelihood function, one plan for
the future is to run several estimation sets with different initializations in parallel and check
afterwards which one produced the best likelihood.

4.2 Comparison with Eviews 11 and 12


Take the sample script but change the model type to m=1. The corresponding Eviews
settings are / should be:
• switching, Markov (not switching means)
• sample 1959m1 1995m2, lag intervals “1 2”
• endogenous: 1200*dlog(income) 1200*dlog(money), exogenous: c, probability regres-
sors: c
• number of regimes 2, varying: lagged endogenous, exogenous, error cov matrix
• initial probs: uniform, start method Eviews supplied, covariance ordinary, Hessian-
observed, optimization BFGS/marquardt, convergence 0.0001 (coefficient name: c)
⇒ This gives basically the same results; loglik Eviews -2852.911 vs MSVAR -2852.617
However, the Eviews output is strange in that it seems to have swapped the constant’s
coefficients with those of the 2nd lag of money. Given the much larger values of the coef-
ficients of the constant term, our results seem to be correct (or else the system would be
estimated as explosive in both regimes).

5 Changelog
• v0.22, July 2021, make the case of no lags (p=0) actually work, fix a bug in the
setVerrorsm1 function, fix a bug with plotting in the case qs > 1, fix the initialization
in the case q > 2 (otherwise regimes 2,3,... would be identical, a bug introduced at
some time along the way)
• v0.21, July 2021, fix harmless but show-stopping bug for the case of a single lag (p=1)
• v0.2, June 2021, add convenience function MS_irfs, and harmonize residual notation
(forecast errors are now u, as in the new SVAR ≥v1.95 notation and elsewhere); ensure
that alternative spec names are always included
• v0.12, May 2021, move internal stuff into a subbundle, introduce model type labels
• v0.11, May 2021, work on enabling seasonal dummies (and linear trend, partly)
• version 0.1, May 2021, requires bleeding-edge gretl 2021b because of the (possible)
menu attachment to the VAR model window
7 Tested on both Eviews versions 11 and 12. The Schwarz criterion from Eviews and the BIC from MSVAR

previously did not agree, but this was actually a typo bug in MSVAR (factor 2 applied to too many terms),
now fixed. Also a miscalculation of the date information was fixed.

10

You might also like