Title: | Multiple Hurdle Tobit Models |
---|---|
Description: | Estimation of models with dependent variable left-censored at zero. Null values may be caused by a selection process Cragg (1971) <doi:10.2307/1909582>, insufficient resources Tobin (1958) <doi:10.2307/1907382>, or infrequency of purchase Deaton and Irish (1984) <doi:10.1016/0047-2727(84)90067-7>. |
Authors: | Yves Croissant [aut, cre] , Fabrizio Carlevaro [aut], Stephane Hoareau [aut] |
Maintainer: | Yves Croissant <[email protected]> |
License: | GPL (>=2) |
Version: | 1.3-2 |
Built: | 2024-11-15 05:23:04 UTC |
Source: | https://github.com/ycroissant/mhurdle |
a cross section from 2014
A data frame containing:
the month of the interview,
the number of persons in the household,
the number of consumption units in the household,
the income of the household for the 12 months before the interview,
the logarithm of the net income per consumption unit divided by its mean,
the square of linc,
whether the household lives in an SMSA (yes or no),
the sex of the reference person of the household (male or female),
the race of the head of the household, one of white, black, indian, asian, pacific and multirace,
whether the reference person of the household is Hispanic (no or yes),
the number of years of education of the reference person of the household,
the age of the reference person of the household minus 50,
the square of age,
cars in the household,
food, and further expenditure items, including vacations.
number of observations : 1000
observation : households
country : United States
Consumer Expenditure Survey (CE), program of the US Bureau of Labor Statistics https://www.bls.gov/cex/, interview survey.
mhurdle fits a large set of models relevant when the dependent variable is 0 for a part of the sample.
mhurdle(
  formula, data, subset, weights, na.action,
  start = NULL,
  dist = c("ln", "n", "bc", "ihs"),
  h2 = FALSE, scaled = TRUE, corr = FALSE, robust = TRUE,
  check_gradient = FALSE,
  ...
)
formula |
a symbolic description of the model to be fitted, |
data |
a |
subset |
see |
weights |
see |
na.action |
see |
start |
starting values, |
dist |
the distribution of the error of the consumption equation: one of "ln", "n", "bc", "ihs" |
h2 |
if |
scaled |
if |
corr |
a boolean indicating whether the errors of the different equations are correlated or not, |
robust |
transformation of the structural parameters in order to avoid numerical problems, |
check_gradient |
if |
... |
further arguments. |
mhurdle fits models for which the dependent variable is zero for part of the sample. Null values of the dependent variable may occur because of one or several mechanisms: good rejection, lack of resources and purchase infrequency.
The model is described using a three-part formula: the first part describes the selection process, if any; the second part the regression equation; and the third part the purchase infrequency process. y ~ 0 | x1 + x2 | z1 + z2 means that there is no selection process. y ~ w1 + w2 | x1 + x2 | 0 and y ~ w1 + w2 | x1 + x2 describe the same model, with no purchase infrequency process. The second part is mandatory: it explains the positive values of the dependent variable.
The dist argument indicates the distribution of the error term. If dist = "n", the error term is normal and (at least part of) the zero observations are also explained by the second part, as the result of a corner solution. Several models described in the literature are obtained as special cases:
A model with a formula like y ~ 0 | x1 + x2 and dist = "n" is the Tobit model proposed by Tobin (1958).
y ~ w1 + w2 | x1 + x2 with dist = "l" or dist = "t" is the single hurdle model proposed by Cragg (1971). With dist = "n", the double hurdle model also proposed by Cragg (1971) is obtained. With corr = "h1", we get the correlated version of this model described by Blundell and Meghir (1987).
y ~ 0 | x1 + x2 | z1 + z2 is the P-Tobit model of Deaton and Irish (1984), which can be a single hurdle model if dist = "t" or dist = "l", or a double hurdle model if dist = "n".
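The Tobit special case can be illustrated without the package. The following is a minimal base-R sketch, not the estimator used by mhurdle: it maximizes the log-likelihood of a normal model left-censored at zero on simulated data. The function name tobit_loglik, the simulated coefficients (0.5 and 1) and the log(sigma) parametrization are assumptions made for the illustration.

```r
# Tobit log-likelihood for a dependent variable left-censored at zero.
# theta = c(beta, log(sigma)); working with log(sigma) keeps sigma positive.
# (Hypothetical helper, not part of mhurdle.)
tobit_loglik <- function(theta, y, X) {
  k <- ncol(X)
  beta <- theta[seq_len(k)]
  sigma <- exp(theta[k + 1])
  xb <- drop(X %*% beta)
  ll <- ifelse(
    y > 0,
    dnorm(y, mean = xb, sd = sigma, log = TRUE),  # density of a positive value
    pnorm(-xb / sigma, log.p = TRUE)              # probability of a zero
  )
  sum(ll)
}

# Simulated data: latent variable with intercept 0.5, slope 1 and sigma 1
set.seed(1)
n <- 2000
x1 <- rnorm(n)
X <- cbind(1, x1)
y_star <- 0.5 + 1 * x1 + rnorm(n)
y <- pmax(y_star, 0)  # observed variable, censored at zero

# fnscale = -1 makes optim() maximize the log-likelihood
fit <- optim(c(0, 0, 0), tobit_loglik, y = y, X = X,
             method = "BFGS", control = list(fnscale = -1))
est <- c(beta = fit$par[1:2], sigma = exp(fit$par[3]))
est
```

The zero observations enter the likelihood through the probability that the latent variable is non-positive, which is the corner solution mechanism described above.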
An object of class c("mhurdle", "maxLik").
A mhurdle object has the following elements:
coefficients: the vector of coefficients,
vcov: the covariance matrix of the coefficients,
fitted.values: a matrix of fitted values, the first column being the probability of 0 and the second one the mean values for the positive observations,
logLik: the log-likelihood,
gradient: the gradient at convergence,
model: a data.frame containing the variables used for the estimation,
coef.names: a list containing the names of the coefficients in the selection equation, the regression equation, the infrequency of purchase equation and the other coefficients (the standard deviation of the error term and the coefficient of correlation if corr = TRUE),
formula: the model formula, an object of class Formula,
call: the call,
rho: the Lagrange multiplier test of no correlation.
Blundell R, Meghir C (1987). “Bivariate Alternatives to the Tobit Model.” Journal of Econometrics, 34, 179-200.
Cragg JG (1971). “Some Statistical Models for Limited Dependent Variables with Applications for the Demand for Durable Goods.” Econometrica, 39(5), 829-44.
Deaton AS, Irish M (1984). “A Statistical Model for Zero Expenditures in Household Budgets.” Journal of Public Economics, 23, 59-80.
Tobin J (1958). “Estimation of Relationships for Limited Dependent Variables.” Econometrica, 26(1), 24-36.
data("Interview", package = "mhurdle")
# independent double hurdle model
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
                dist = "ln", h2 = TRUE, method = "bfgs")
# dependent double hurdle model
ddhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
                dist = "ln", h2 = TRUE, method = "bfgs", corr = TRUE)
# a double hurdle p-tobit model
ptm <- mhurdle(vacations ~ 0 | linc + linc2 | car + size, Interview,
               dist = "ln", h2 = TRUE, method = "bfgs", corr = TRUE)
Specific predict, fitted, coef, vcov, summary, ... methods for mhurdle objects. In particular, these methods make it possible to extract the different parts of the model.
## S3 method for class 'mhurdle'
coef(
  object,
  which = c("all", "h1", "h2", "h3", "h4", "sd", "corr", "tr", "pos"),
  ...
)
## S3 method for class 'mhurdle'
vcov(
  object,
  which = c("all", "h1", "h2", "h3", "h4", "sd", "corr", "tr", "pos"),
  ...
)
## S3 method for class 'mhurdle'
logLik(object, naive = FALSE, ...)
## S3 method for class 'mhurdle'
print(
  x,
  digits = max(3, getOption("digits") - 2),
  width = getOption("width"),
  ...
)
## S3 method for class 'mhurdle'
summary(object, ...)
## S3 method for class 'summary.mhurdle'
coef(
  object,
  which = c("all", "h1", "h2", "h3", "sd", "corr", "tr", "pos"),
  ...
)
## S3 method for class 'summary.mhurdle'
print(
  x,
  digits = max(3, getOption("digits") - 2),
  width = getOption("width"),
  ...
)
## S3 method for class 'mhurdle'
fitted(object, which = c("all", "zero", "positive"), mean = FALSE, ...)
## S3 method for class 'mhurdle'
predict(object, newdata = NULL, what = c("E", "Ep", "p"), ...)
## S3 method for class 'mhurdle'
update(object, new, ...)
## S3 method for class 'mhurdle'
nobs(object, which = c("all", "null", "positive"), ...)
## S3 method for class 'mhurdle'
effects(
  object,
  covariate = NULL,
  data = NULL,
  what = c("E", "Ep", "p"),
  reflevel = NULL,
  mean = FALSE,
  ...
)
object, x |
an object of class |
which |
which coefficients or covariances should be extracted
? Those of the selection ( |
... |
further arguments. |
naive |
a boolean, it |
digits |
see |
width |
see |
mean |
if |
newdata, data |
a |
what |
for the |
new |
an updated formula for the |
covariate |
the covariate for which the effect has to be computed, |
reflevel |
for the computation of effects for a factor, the reference level, |
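As a short illustration of these methods, the sketch below refits the independent double hurdle model from the mhurdle examples and extracts parts of it with the which argument. It assumes the mhurdle package is installed, and that "h1" and "h2" select the selection and regression equations respectively, following the order of the formula parts.

```r
library(mhurdle)
data("Interview", package = "mhurdle")

# Independent double hurdle model, taken from the examples of mhurdle()
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
                dist = "ln", h2 = TRUE, method = "bfgs")

# Coefficients by equation (assumed mapping: h1 = selection, h2 = regression)
coef(idhm, which = "h1")
coef(idhm, which = "h2")

# Covariance matrix restricted to the same subset of coefficients
vcov(idhm, which = "h1")

# Log-likelihood, number of observations and the usual summary table
logLik(idhm)
nobs(idhm)
summary(idhm)
```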
This function computes the R squared for multiple hurdle models. The measure is a pseudo coefficient of determination or may be based on the likelihood.
rsq(
  object,
  type = c("coefdet", "lratio"),
  adj = FALSE,
  r2pos = c("rss", "ess", "cor")
)
object |
an object of class |
type |
one of |
adj |
if |
r2pos |
only for pseudo coefficient of determination, should the
positive part of the R squared be computed using the residual sum of squares
( |
a numerical value
McFadden D (1974). “The Measurement of Urban Travel Demand.” Journal of Public Economics, 3, 303-328.
data("Interview", package = "mhurdle")
# independent double hurdle model
idhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
                dist = "ln", h2 = TRUE, method = "bfgs")
rsq(idhm, type = "lratio")
rsq(idhm, type = "coefdet", r2pos = "rss")
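For intuition about a likelihood-based R squared, the following base-R sketch computes McFadden's likelihood ratio index on a simulated probit model. It assumes that the likelihood-based measure is in the spirit of McFadden (1974); it is not the computation performed by rsq, and the simulated data and variable names are purely illustrative.

```r
# Simulated binary outcome driven by one covariate (illustrative values)
set.seed(42)
n <- 500
x <- rnorm(n)
d <- as.integer(0.3 + 0.8 * x + rnorm(n) > 0)

# Fitted model and intercept-only (null) model
fit  <- glm(d ~ x, family = binomial(link = "probit"))
null <- glm(d ~ 1, family = binomial(link = "probit"))

# McFadden's index: 1 - logLik(model) / logLik(null model)
lratio <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))
lratio
```

The index is 0 when the covariates add nothing to the intercept-only model and approaches 1 as the fitted log-likelihood approaches zero.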
The Vuong test is suitable to discriminate between two non-nested models.
vuongtest(
  x, y,
  type = c("non-nested", "nested", "overlapping"),
  true_model = FALSE,
  variance = c("centered", "uncentered"),
  matrix = c("large", "reduced")
)
x |
a first fitted model of class |
y |
a second fitted model of class |
type |
the kind of test to be computed, |
true_model |
a boolean, |
variance |
the variance is estimated using the |
matrix |
the W matrix can be computed using the general expression
|
an object of class "htest"
Vuong QH (1989). “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica, 57(2), 307-333.
vuong in package pscl.
data("Interview", package = "mhurdle")
# dependent double hurdle model
dhm <- mhurdle(vacations ~ car + size | linc + linc2 | 0, Interview,
               dist = "ln", h2 = TRUE, method = "bhhh", corr = TRUE)
# a double hurdle p-tobit model
ptm <- mhurdle(vacations ~ 0 | linc + linc2 | car + size, Interview,
               dist = "ln", h2 = TRUE, method = "bhhh", corr = TRUE)
vuongtest(dhm, ptm)
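The idea behind the non-nested version of the test can be sketched in base R from the observation-level log-likelihoods of two competing models. This is a simplified illustration of the classical statistic with a centered variance, not the implementation of vuongtest; the helper name vuong_stat and the simulated log-normal versus exponential comparison are assumptions made for the example.

```r
# Classical Vuong statistic for two non-nested models, computed from
# the pointwise log-likelihoods ll1 and ll2 (hypothetical helper).
vuong_stat <- function(ll1, ll2) {
  stopifnot(length(ll1) == length(ll2))
  m <- ll1 - ll2                         # pointwise log-likelihood ratios
  n <- length(m)
  omega <- sqrt(mean((m - mean(m))^2))   # centered standard deviation
  z <- sqrt(n) * mean(m) / omega
  c(statistic = z, p.value = 2 * pnorm(-abs(z)))
}

# Simulated positive data, truly log-normal
set.seed(123)
y <- rlnorm(500, meanlog = 0, sdlog = 0.5)

# Model 1: log-normal, fitted by maximum likelihood
mu <- mean(log(y))
s <- sqrt(mean((log(y) - mu)^2))
ll_lnorm <- dlnorm(y, meanlog = mu, sdlog = s, log = TRUE)

# Model 2: exponential, fitted by maximum likelihood
ll_exp <- dexp(y, rate = 1 / mean(y), log = TRUE)

res <- vuong_stat(ll_lnorm, ll_exp)
res
```

A large positive statistic favours the first model, a large negative one the second, and a value close to zero means the two models cannot be discriminated.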