Hurdle Models for Count Data Regression (2024)

hurdle {pscl}    R Documentation

Description

Fit hurdle regression models for count data via maximum likelihood.

Usage

hurdle(formula, data, subset, na.action, weights, offset,
  dist = c("poisson", "negbin", "geometric"),
  zero.dist = c("binomial", "poisson", "negbin", "geometric"),
  link = c("logit", "probit", "cloglog", "cauchit", "log"),
  control = hurdle.control(...),
  model = TRUE, y = TRUE, x = FALSE, ...)

Arguments

formula

symbolic description of the model, see details.

data, subset, na.action

arguments controlling formula processing via model.frame.

weights

optional numeric vector of weights.

offset

optional numeric vector with an a priori known component to be included in the linear predictor of the count model. See below for more information on offsets.

dist

character specification of count model family.

zero.dist

character specification of the zero hurdle model family.

link

character specification of link function in the binomial zero hurdle (only used if zero.dist = "binomial").

control

a list of control arguments specified via hurdle.control.

model, y, x

logicals. If TRUE the corresponding components of the fit (model frame, response, model matrix) are returned.

...

arguments passed to hurdle.control in the default setup.

Details

Hurdle count models are two-component models with a truncated count component for positive counts and a hurdle component that models the zero counts. Thus, unlike zero-inflation models, there are not two sources of zeros: the count model is only employed if the hurdle for modeling the occurrence of zeros is exceeded. The count model is typically a truncated Poisson or negative binomial regression (with log link). The geometric distribution is a special case of the negative binomial with size parameter equal to 1. For modeling the hurdle, either a binomial model can be employed or a censored count distribution. The outcome of the hurdle component of the model is the occurrence of a non-zero (positive) count. Thus, for most models, positive coefficients in the hurdle component indicate that an increase in the regressor increases the probability of a non-zero count. Binomial logit and censored geometric models as the hurdle part both lead to the same likelihood function and thus to the same coefficient estimates. A censored negative binomial model for the zero hurdle is only identified if there is at least one non-constant regressor with (true) coefficient different from zero (and if all coefficients are close to zero the model can be poorly conditioned).
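As a minimal sketch of the two-component structure described above, the following base R code writes out the probability mass function of a hurdle model with a binomial hurdle and a truncated Poisson count part. It also checks numerically that a logit hurdle and a censored geometric hurdle (with log link) imply the same probability of a positive count. The helper name dhurdle_pois and all parameter values are illustrative assumptions and are not part of pscl.

```r
# Hypothetical helper (not part of pscl): pmf of a hurdle model with
# P(y > 0) = p_pos and a zero-truncated Poisson(mu) for positive counts.
dhurdle_pois <- function(y, p_pos, mu) {
  ifelse(y == 0,
         1 - p_pos,
         p_pos * dpois(y, mu) / (1 - dpois(0, mu)))
}

# The probabilities sum to one (up to numerical error).
total <- sum(dhurdle_pois(0:200, p_pos = 0.6, mu = 2.5))

# Logit hurdle: P(y > 0) = plogis(eta).
# Censored geometric hurdle with log link: the geometric mean is exp(eta),
# so prob = 1 / (1 + exp(eta)) and P(y > 0) = 1 - P(Y = 0).
eta <- c(-2, -0.5, 0, 1.3)
p_logit <- plogis(eta)
p_geom <- 1 - dgeom(0, prob = 1 / (1 + exp(eta)))
same_prob <- isTRUE(all.equal(p_logit, p_geom))
```

Because both hurdle specifications give the same probability of crossing the hurdle for every value of the linear predictor, they lead to the same likelihood for the zero part.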

The formula can be used to specify both components of the model: If a formula of type y ~ x1 + x2 is supplied, then the same regressors are employed in both components. This is equivalent to y ~ x1 + x2 | x1 + x2. Of course, a different set of regressors could be specified for the zero hurdle component, e.g., y ~ x1 + x2 | z1 + z2 + z3 giving the count data model y ~ x1 + x2 conditional on (|) the zero hurdle model y ~ z1 + z2 + z3.
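The expansion rule for one-part and two-part formulas can be illustrated with a small base R sketch. The helper split_two_part_formula is hypothetical and only mimics the rule stated above; it is not how pscl processes formulas internally.

```r
# Hypothetical helper (not part of pscl): split y ~ count | zero into
# two ordinary formulas; a one-part formula is reused for both components.
split_two_part_formula <- function(f) {
  lhs <- f[[2]]
  rhs <- f[[3]]
  # A two-part right-hand side parses as a call to the `|` operator.
  has_bar <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  count_rhs <- if (has_bar) rhs[[2]] else rhs
  zero_rhs  <- if (has_bar) rhs[[3]] else rhs
  list(count = eval(call("~", lhs, count_rhs)),
       zero  = eval(call("~", lhs, zero_rhs)))
}

# Different regressors for count and zero hurdle components.
parts <- split_two_part_formula(y ~ x1 + x2 | z1 + z2 + z3)

# One-part formula: the same regressors are used in both components.
same <- split_two_part_formula(y ~ x1 + x2)
```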

Offsets can be specified in both parts of the model pertaining to count and zero hurdle model: y ~ x1 + offset(x2) | z1 + z2 + offset(z3), where x2 is used as an offset (i.e., with coefficient fixed to 1) in the count part and z3 analogously in the zero hurdle part. By the rule stated above y ~ x1 + offset(x2) is expanded to y ~ x1 + offset(x2) | x1 + offset(x2). Instead of using the offset() wrapper within the formula, the offset argument can also be employed which sets an offset only for the count model. Thus, formula = y ~ x1 and offset = x2 is equivalent to formula = y ~ x1 + offset(x2) | x1.
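What "coefficient fixed to 1" means can be seen in an ordinary Poisson glm, used here only as an analogy for the count part of a hurdle model. The simulated data and the variable name exposure are assumptions made for illustration.

```r
# Simulated data (illustrative assumption): counts proportional to exposure.
set.seed(1)
n <- 200
x1 <- rnorm(n)
exposure <- runif(n, 1, 5)
y <- rpois(n, exposure * exp(0.3 + 0.5 * x1))

# Offset supplied inside the formula ...
fit_formula <- glm(y ~ x1 + offset(log(exposure)), family = poisson)

# ... and the same offset supplied via the offset argument.
fit_argument <- glm(y ~ x1, offset = log(exposure), family = poisson)

# Both fits add log(exposure) to the linear predictor with coefficient 1,
# so the estimated coefficients agree.
same_coef <- isTRUE(all.equal(coef(fit_formula), coef(fit_argument)))
```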

All parameters are estimated by maximum likelihood using optim, with control options set in hurdle.control. Starting values can be supplied, otherwise they are estimated by glm.fit (the default). By default, the two components of the model are estimated separately using two optim calls. Standard errors are derived numerically using the Hessian matrix returned by optim. See hurdle.control for details.
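The idea of estimating the two components separately can be sketched in base R for a logit hurdle with a truncated Poisson count part. This is a simplified illustration on simulated data, not the pscl implementation: the starting values, the choice of method = "BFGS", and all object names are assumptions.

```r
# Simulated hurdle data (illustrative assumption).
set.seed(42)
n <- 500
x <- rnorm(n)
X <- cbind(1, x)
crossed <- rbinom(n, 1, plogis(0.2 + 0.8 * x)) == 1
mu_true <- exp(0.5 + 0.4 * x)
# Zero-truncated Poisson draws by inversion on (P(Y = 0), 1).
y <- ifelse(crossed, qpois(runif(n, dpois(0, mu_true), 1), mu_true), 0)

# Negative log-likelihood of the zero hurdle (binomial logit on y > 0).
nll_zero <- function(beta) {
  eta <- drop(X %*% beta)
  -sum(ifelse(y > 0, plogis(eta, log.p = TRUE), plogis(-eta, log.p = TRUE)))
}

# Negative log-likelihood of the zero-truncated Poisson on positive counts.
nll_count <- function(beta) {
  keep <- y > 0
  mu <- exp(drop(X[keep, , drop = FALSE] %*% beta))
  -sum(dpois(y[keep], mu, log = TRUE) - log1p(-exp(-mu)))
}

# Two separate optim calls, each returning a numerical Hessian.
fit_zero  <- optim(c(0, 0), nll_zero,  method = "BFGS", hessian = TRUE)
fit_count <- optim(c(0, 0), nll_count, method = "BFGS", hessian = TRUE)

# Standard errors from the inverse Hessian of the negative log-likelihood.
se_zero  <- sqrt(diag(solve(fit_zero$hessian)))
se_count <- sqrt(diag(solve(fit_count$hessian)))

# The full log-likelihood is the sum of the two component log-likelihoods.
loglik <- -(fit_zero$value + fit_count$value)
```

The two negative log-likelihoods share no parameters, which is why they can be minimized independently and then added.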

The returned fitted model object is of class "hurdle" and is similar to fitted "glm" objects. For elements such as "coefficients" or "terms" a list is returned with elements for the zero and count components, respectively. For details see below.

A set of standard extractor functions for fitted model objects is available for objects of class "hurdle", including methods to the generic functions print, summary, coef, vcov, logLik, residuals, predict, fitted, terms, model.matrix. See predict.hurdle for more details on all methods.

Value

An object of class "hurdle", i.e., a list with components including

coefficients

a list with elements "count" and "zero" containing the coefficients from the respective models,

residuals

a vector of raw residuals (observed - fitted),

fitted.values

a vector of fitted means,

optim

a list (of lists) with the output(s) from the optim call(s) for minimizing the negative log-likelihood(s),

control

the control arguments passed to the optim call,

start

the starting values for the parameters passed to the optim call(s),

weights

the case weights used,

offset

a list with elements "count" and "zero" containing the offset vectors (if any) from the respective models,

n

number of observations (with weights > 0),

df.null

residual degrees of freedom for the null model (= n - 2),

df.residual

residual degrees of freedom for fitted model,

terms

a list with elements "count", "zero" and "full" containing the terms objects for the respective models,

theta

estimate of the additional \theta parameter of the negative binomial model(s) (if negative binomial component is used),

SE.logtheta

standard error(s) for \log(\theta),

loglik

log-likelihood of the fitted model,

vcov

covariance matrix of all coefficients in the model (derived from the Hessian of the optim output(s)),

dist

a list with elements "count" and "zero" with character strings describing the respective distributions used,

link

character string describing the link if a binomial zero hurdle model is used,

linkinv

the inverse link function corresponding to link,

converged

logical indicating successful convergence of optim,

call

the original function call,

formula

the original formula,

levels

levels of the categorical regressors,

contrasts

a list with elements "count" and "zero" containing the contrasts corresponding to levels from the respective models,

model

the full model frame (if model = TRUE),

y

the response count vector (if y = TRUE),

x

a list with elements "count" and "zero" containing the model matrices from the respective models (if x = TRUE).
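To make the fitted.values and residuals components concrete, the following base R sketch computes the mean of a hurdle model with a binomial hurdle and a truncated Poisson count part, and forms raw residuals as observed minus fitted. The helper hurdle_pois_mean, the parameter values, and the toy observations are assumptions for illustration only.

```r
# Hypothetical helper (not part of pscl): mean of a hurdle model with
# P(y > 0) = p_pos and a zero-truncated Poisson(mu) count component.
hurdle_pois_mean <- function(p_pos, mu) p_pos * mu / (1 - exp(-mu))

p_pos <- 0.6
mu <- 2.5

# Check the closed form against a direct sum over the pmf of positive counts.
k <- 1:200
pmf_pos <- p_pos * dpois(k, mu) / (1 - dpois(0, mu))
mean_by_sum <- sum(k * pmf_pos)
mean_closed <- hurdle_pois_mean(p_pos, mu)

# Raw residuals: observed counts minus the fitted mean.
y_obs <- c(0, 3, 1, 0, 5)
raw_resid <- y_obs - mean_closed
```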

Author(s)

Achim Zeileis <Achim.Zeileis@R-project.org>

References

Cameron, A. Colin and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. New York: Cambridge University Press.

Cameron, A. Colin and Pravin K. Trivedi. 2005. Microeconometrics: Methods and Applications. Cambridge: Cambridge University Press.

Mullahy, J. 1986. Specification and Testing of Some Modified Count Data Models. Journal of Econometrics. 33:341–365.

Zeileis, Achim, Christian Kleiber and Simon Jackman. 2008. “Regression Models for Count Data in R.” Journal of Statistical Software, 27(8). URL https://www.jstatsoft.org/v27/i08/.

See Also

hurdle.control, glm, glm.fit, glm.nb, zeroinfl

Examples

## data
data("bioChemists", package = "pscl")

## logit-poisson
## "art ~ ." is the same as "art ~ . | .", i.e.
## "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment"
fm_hp1 <- hurdle(art ~ ., data = bioChemists)
summary(fm_hp1)

## geometric-poisson
fm_hp2 <- hurdle(art ~ ., data = bioChemists, zero = "geometric")
summary(fm_hp2)

## logit and geometric model are equivalent
coef(fm_hp1, model = "zero") - coef(fm_hp2, model = "zero")

## logit-negbin
fm_hnb1 <- hurdle(art ~ ., data = bioChemists, dist = "negbin")
summary(fm_hnb1)

## negbin-negbin
## (poorly conditioned zero hurdle, note the standard errors)
fm_hnb2 <- hurdle(art ~ ., data = bioChemists, dist = "negbin", zero = "negbin")
summary(fm_hnb2)

[Package pscl version 1.5.9 Index]
