Title: | Generalized Waring Regression Model for Count Data |
---|---|
Description: | Statistical functions to fit, validate and describe a Generalized Waring Regression Model (GWRM). |
Authors: | Silverio Vilchez-Lopez [aut, cre], Antonio Jose Saez-Castillo [aut], Maria Jose Olmo-Jimenez [aut], Jose Rodriguez-Avi [aut], Antonio Conde-Sanchez [aut], Ana Maria Martinez-Rodriguez [aut] |
Maintainer: | Silverio Vilchez-Lopez <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 2.1.0.4 |
Built: | 2025-02-19 02:42:11 UTC |
Source: | https://github.com/cran/GWRM |
Compute all the single terms in the scope argument that can be added to the GWRM model, fit those models and compute a table of the changes in fit.
## S3 method for class 'gw'
add1(object, scope, test = c("none", "Chisq"), k = 2, trace = FALSE, ...)
object |
a fitted object of class inheriting from |
scope |
a formula giving the terms to be considered for adding. |
test |
|
k |
the penalty constant in AIC / Cp. |
trace |
if |
... |
further arguments passed to or from other methods. |
An object of class "anova" summarizing the differences in fit between the models.
data(goals)
fit0 <- gw(goals ~ offset(log(played)), data = goals)
summary(fit0)
fit1 <- add1(fit0, ~ position)
summary(fit1)
Compute all the single terms in the scope argument that can be dropped from the GWRM model, fit those models and compute a table of the changes in fit.
## S3 method for class 'gw'
drop1(object, scope, test = c("none", "Chisq"), k = 2, trace = FALSE, ...)
object |
a fitted object of class inheriting from |
scope |
a formula giving the terms to be considered for dropping. |
test |
|
k |
the penalty constant in AIC / Cp. |
trace |
if |
... |
further arguments passed to or from other methods. |
An object of class "anova" summarizing the differences in fit between the models.
data(goals)
fit0 <- gw(goals ~ offset(log(played)), data = goals)
summary(fit0)
fit1 <- step(fit0, ~ position)
summary(fit1)
The response variable, goals, is the number of goals scored by footballers (excluding goalkeepers) in the first division of the Spanish league from the 2000/2001 to the 2006/2007 seasons. For footballers who played in more than one season, only the season in which they played the most matches was selected. The covariates considered are the final classification of the team in each season, the position in the field (forward, midfielder and defender) and the number of matches played.
data(goals)
A data frame with 1224 observations on the following 4 variables:
clasif
a numeric vector
position
a factor with levels Defender, Forward and Midfielder
played
a numeric vector
goals
a numeric vector
MARCA sports paper
Rodriguez-Avi, J., Conde-Sanchez, A., Saez-Castillo, A. J., Olmo-Jimenez, M. J. and Martinez-Rodriguez, A. M. (2009). A generalized Waring regression model for count data. Computational Statistics and Data Analysis, 53, pp. 3717-3725.
gw is used to fit Generalized Waring Regression Models (GWRM), specified by giving a symbolic description of the linear predictor.
gw(
  formula, data, weights, k = NULL, subset, na.action,
  kstart = 1, rostart = 2, betastart = NULL, offset,
  control = list(...), method = NULL, hessian = TRUE,
  model = TRUE, x = FALSE, y = TRUE, ...
)
formula |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from |
weights |
an optional vector of 'prior weights' to be used in the fitting process. Should be |
k |
optional value for the |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data contain |
kstart |
starting value for the |
rostart |
starting value for the |
betastart |
starting values for the vector of means. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be |
control |
a list of parameters for controlling the fitting process. |
method |
the method to be used in fitting the model. The default method initially uses non-linear minimization ( |
hessian |
if |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
x , y
|
For For |
... |
further arguments. |
gw returns an object of class "gw". The function summary can be used to obtain or print a summary of the results. An object of class "gw" is a list containing the following components:
Y
if requested (the default), the y vector used.
W
the weights supplied, a vector of 1s if none were.
covars
names of the covariates in the model.
nobs
number of observations.
covoffset
a logical value specifying if an offset is present.
loglik
the maximized log-likelihood.
aic
a version of Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of parameters.
bic
Bayesian Information Criterion, minus twice the maximized log-likelihood plus the number of parameters multiplied by the logarithm of the number of observations.
df.residual
the residual degrees of freedom.
residuals
the residuals in the final iteration of the fit.
coefficients
a named vector of coefficients.
betaIIpars
parameter estimates of the BetaII distribution.
betascoefs
a vector of coefficients.
fitted.values
the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.
hessian
a symmetric matrix giving an estimate of the Hessian at the solution found in the optimization of the log-likelihood function.
cov
an estimate of the covariance matrix of the model coefficients.
se
a vector of standard error estimates for the estimated coefficients.
corr
an estimate of the correlation matrix of the model coefficients.
code
a code that indicates successful convergence of the fitter function used (see the nlm and optim help pages).
converged
a logical value that indicates whether the optimization algorithm converged successfully.
method
the name of the fitter function used.
k
if requested, the k value used.
kBool
a logical value specifying whether a k value was supplied or k was estimated.
call
the matched call.
formula
the formula supplied.
terms
the terms object used.
data
the data argument.
offset
the offset vector used.
control
the value of the control argument used.
contrasts
(where relevant) the contrasts used.
xlevels
(where relevant) a record of the levels of the factors used in fitting.
data(goals)
gw(goals ~ position + offset(log(played)), data = goals)
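The aic and bic components described above differ only in the penalty applied per parameter, and the same idea sits behind the k argument of add1 and drop1 (the penalty constant, not to be confused with the k parameter of the model). The following is a minimal sketch in base R of that arithmetic. The log-likelihood and parameter count are made-up numbers for illustration, not results from the goals data, and penalized is a hypothetical helper, not a GWRM function.

```r
# Hypothetical values, for illustration only (not results from the goals data).
loglik <- -1500.25   # maximized log-likelihood
n_par  <- 5          # number of estimated parameters
n_obs  <- 1224       # number of observations

# AIC: minus twice the log-likelihood plus twice the number of parameters.
aic <- -2 * loglik + 2 * n_par

# BIC: the penalty per parameter is log(n_obs) instead of 2.
bic <- -2 * loglik + log(n_obs) * n_par

# Hypothetical helper: general penalized criterion with penalty constant k.
# k = 2 gives AIC, k = log(n_obs) gives BIC.
penalized <- function(loglik, n_par, k = 2) -2 * loglik + k * n_par

c(aic = aic, bic = bic)
```

With more than about 7 observations, log(n_obs) exceeds 2, so BIC penalizes each extra term more heavily than AIC.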
Statistical functions to fit, validate and describe a Generalized Waring Regression Model (GWRM).
GWRM is a package for fitting and computing Generalized Waring Regression Models. It includes functions for fitting the model to data, for diagnosis and for computing important statistics, such as those related to the partition of the variance. It also includes the dataset and the example of Rodriguez-Avi et al. (2009). The main function you're likely to need from GWRM is gw, which obtains a GWRM fit from data.
Rodriguez-Avi, J., Conde-Sanchez, A., Saez-Castillo, A. J., Olmo-Jimenez, M. J. and Martinez-Rodriguez, A. M. (2009). A generalized Waring regression model for count data. Computational Statistics and Data Analysis, 53, pp. 3717-3725.
In a GWRM model, the variance may be split into three terms. The first component of this decomposition represents the variability due to randomness and it comes from the underlying Poisson model. The other two components refer to the variability that is not due to randomness but is explained by the presence of liability and proneness, respectively.
partvar(object, newdata = NULL, ...)
object |
an object class |
newdata |
optionally, a data frame in which to look for variables with which to obtain the partition. If omitted, all the cases are used. |
... |
further arguments passed to or from other methods. |
One of the main drawbacks of using the Univariate Generalized Waring Distribution with parameters a, k and ro is that the parameters a and k are interchangeable when there is no auxiliary information given by the covariates. This identification problem prevents the liability and proneness components from being distinguished in univariate fits. To solve it, Irwin (1968) proposed that the expert should deduce which of these components is which from their own knowledge of the phenomenon. Xekalaki (1984) proposed a less subjective solution, developing a bivariate model that divides the observation period into two non-overlapping subperiods in which the model for proneness does not change. In a GWRM with at least one covariate, the parameters a and k are not interchangeable because, as in Xekalaki's bivariate model, the random model for proneness does not change. The identification problem of the non-random components is therefore solved.
Two data frames: one with the ratios of the sources of variation and one with the sources of variation into which the variance is split.
data(goals)
fit <- gw(goals ~ position, data = goals)
pos <- factor(c("Defender", "Midfielder"), levels = c("Defender", "Midfielder", "Forward"))
lev <- data.frame(position = pos, played = c(17, 21))
partvar(fit, newdata = lev)
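The three-way split of the variance described for partvar can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the package's implementation: gw_variance_parts is a hypothetical helper, and it assumes the usual mixture representation of the Generalized Waring distribution (Poisson counts, a gamma-distributed rate with shape a, and a BetaII-distributed scale with parameters k and ro, with ro > 2 so that the variance is finite). Which non-random term is labelled liability and which proneness follows the order given in the description above and should be read as an assumption.

```r
# Hypothetical helper (not part of GWRM): split the variance of a
# Generalized Waring distribution with parameters a, k and ro (ro > 2).
gw_variance_parts <- function(a, k, ro) {
  stopifnot(a > 0, k > 0, ro > 2)
  # Poisson variability: equals the mean a * k / (ro - 1).
  randomness <- a * k / (ro - 1)
  # Expected variance of the gamma rate given the BetaII scale (assumed: liability).
  liability <- a * k * (k + 1) / ((ro - 1) * (ro - 2))
  # Variance of the conditional mean of the rate (assumed: proneness).
  proneness <- a^2 * k * (k + ro - 1) / ((ro - 1)^2 * (ro - 2))
  parts <- c(randomness = randomness, liability = liability, proneness = proneness)
  list(parts = parts, ratio = parts / sum(parts))
}

res <- gw_variance_parts(a = 2, k = 3, ro = 6)

# Total variance from the closed-form expression, for comparison.
total <- 2 * 3 * (2 + 6 - 1) * (3 + 6 - 1) / ((6 - 1)^2 * (6 - 2))

res$parts
res$ratio
```

The three parts add up to the closed-form variance a k (a + ro - 1)(k + ro - 1) / ((ro - 1)^2 (ro - 2)), which is a quick consistency check on the formulas.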
Obtains predictions from a fitted GWRM object.
## S3 method for class 'gw'
predict(object = NULL, newdata = NULL, ...)
object |
a fitted object of class inheriting from |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
... |
further arguments passed to or from other methods. |
A data frame with newdata and their fitted means.
data(goals)
fit <- gw(goals ~ position, data = goals)
predict(fit)
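To make the idea of fitted means concrete, the sketch below computes them by hand in base R for a model with a position factor and an offset log(played). It assumes a log link, which is what the offset(log(played)) term in the other examples suggests but is not stated explicitly here. The coefficient values are invented for illustration and are not estimates from the goals data.

```r
# Hypothetical coefficients, for illustration only (not estimates from the goals data).
beta <- c("(Intercept)" = -3.0, positionForward = 1.5, positionMidfielder = 0.7)

newdata <- data.frame(
  position = factor(c("Defender", "Forward", "Midfielder"),
                    levels = c("Defender", "Forward", "Midfielder")),
  played = c(20, 20, 30)
)

# Design matrix with treatment contrasts, matching the coefficient names above.
X <- model.matrix(~ position, data = newdata)

# Linear predictor plus the offset log(played); a log link is assumed.
eta <- as.vector(X %*% beta) + log(newdata$played)

# Fitted means: inverse link (exp) applied to the linear predictor.
newdata$fit <- exp(eta)
newdata
```

Because of the offset, the fitted mean scales with the number of matches played: the exponentiated linear predictor without the offset acts as a goals-per-match rate.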
residuals is a method which extracts model residuals from a "gw" object, commonly returned by the gw function. Optionally, it produces a normal plot of the residuals with a simulated envelope.
## S3 method for class 'gw'
residuals(
  object, type = "pearson", rep = 19, envelope = FALSE,
  title = "Simulated Envelope of Residuals", trace = FALSE,
  parallel = TRUE, ncores = 2, ...
)
object |
object of class |
type |
type of residuals to be extracted. Default is |
rep |
number of replications for envelope construction. Default is 19, which gives the smallest 95 percent band that can be built. |
envelope |
a logical value to specify if the envelope is required. |
title |
a title for the envelope. |
trace |
if |
parallel |
if |
ncores |
is the number of cores that we use if |
... |
further arguments passed to or from other methods. |
The usual Q-Q plot may show an unsatisfactory pattern in the residuals of a fitted model, which may lead us to conclude that the model is misspecified. The normal plot with a simulated envelope indicates that, under the assumed distribution of the response variable, the model is adequate if only a few points fall outside the envelope.
Residual values and, if requested, the envelope plot.
data(goals)
set.seed(1)
fit0 <- gw(goals ~ position, data = goals[sample(1:nrow(goals), 75), ])
residuals(fit0, type = "pearson", rep = 19, envelope = TRUE, trace = FALSE, ncores = 2)
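The general recipe behind a simulated envelope can be sketched with base R alone. To keep the example self-contained, a Poisson glm is swapped in for the GWRM fit, so this shows the technique rather than what residuals does internally for "gw" objects. The data are simulated and all object names are illustrative.

```r
set.seed(1)

# Simulated data and a Poisson GLM standing in for a GWRM fit (assumption).
n <- 60
x <- runif(n)
y <- rpois(n, exp(0.5 + x))
fit <- glm(y ~ x, family = poisson)

# Ordered Pearson residuals of the observed fit.
obs <- sort(residuals(fit, type = "pearson"))

# Simulate rep responses from the fitted model, refit, and keep the
# ordered residuals of each refit (one column per replication).
rep <- 19
mu <- fitted(fit)
sims <- replicate(rep, {
  y_sim <- rpois(n, mu)
  refit <- glm(y_sim ~ x, family = poisson)
  sort(residuals(refit, type = "pearson"))
})

# Envelope: smallest and largest simulated value at each order statistic.
lower <- apply(sims, 1, min)
upper <- apply(sims, 1, max)

# Number of observed residuals falling outside the envelope.
outside <- sum(obs < lower | obs > upper)
outside
```

Plotting obs, lower and upper against normal quantiles (for example with qnorm(ppoints(n))) would give the usual picture; a model is judged adequate when few points fall outside the band.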
Random generation of values from a Generalized Waring distribution with parameters a, k and ro.
rgw(n, a, k, ro)
n |
number of random values to return. |
a |
vector of (non-negative) first parameters. |
k |
vector of (non-negative) second parameters. |
ro |
vector of (non-negative) third parameters. |
rgw is an auxiliary function which generates random samples from a Generalized Waring distribution to be used in the simulated envelope called by residuals.
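One way to draw such values with base R is through the mixture representation of the distribution: a Poisson count whose rate is gamma distributed with shape a, and whose gamma scale follows a beta distribution of the second kind with parameters k and ro. The sketch below follows that route. rgw_sketch is a hypothetical name, and this is not necessarily how rgw is implemented in the package.

```r
# Hypothetical re-implementation for illustration; not the GWRM source code.
rgw_sketch <- function(n, a, k, ro) {
  stopifnot(n >= 1, all(a > 0), all(k > 0), all(ro > 0))
  b <- rbeta(n, shape1 = k, shape2 = ro)     # Beta(k, ro) draw
  v <- b / (1 - b)                           # transform to BetaII (beta prime)
  lambda <- rgamma(n, shape = a, scale = v)  # gamma rate with random scale
  rpois(n, lambda)                           # Poisson count given the rate
}

set.seed(123)
x <- rgw_sketch(1e5, a = 2, k = 3, ro = 6)

# For ro > 1 the theoretical mean is a * k / (ro - 1), here 1.2.
mean(x)
```

The sample mean should land close to a * k / (ro - 1); for small ro the distribution is heavy tailed, so the agreement gets slower as ro approaches 2.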