serie de experimentos independientes donde la respuesta puede ser sólo de dos categorías.
serie de experimentos independientes donde la respuesta puede ser sólo de dos categorías.
\[P(y = 1) = p\] \[P(y = 0) = 1-p\] \[P(y) = p^y(1-p)^{1-y}\]
La variable respuesta se distribuye como Binomial
\[Y_{i} \sim B(N, p)\] El enlace canónico es logit
\[\eta =log(\frac{\mu_{i}}{1-\mu{i}})\] Por lo que el valor esperado es igual al logit inverso.
\[\mu_{i} = \frac{e^{\eta}}{1+e^{\eta}}\]
Colocando las variables predictoras en la parte sistemática…
\[\mu_{i} = \frac{e^{\beta_{0} + \beta_{1}X_{1i} + ... + \beta_{k}X_{ki}}}{1+e^{\beta_{0} + \beta_{1}X_{1i} + ... + \beta_{k}X_{ki}}}\]
\[logit(p(x)) = log\frac{p(x)}{1-p(x)}\]
Donde \(p(x)\) es la probabilidad de éxito y \(1-p(x)\) es la probabilidad de fracaso.
dat <- read.table("horse.txt", header = TRUE)
fit <- glm(y ~ x, data = dat, family = binomial)
summary(fit)
##
## Call:
## glm(formula = y ~ x, family = binomial, data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0281 -1.0458 0.5480 0.9066 1.6942
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -12.3508 2.6287 -4.698 2.62e-06 ***
## x 0.4972 0.1017 4.887 1.02e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 225.76 on 172 degrees of freedom
## Residual deviance: 194.45 on 171 degrees of freedom
## AIC: 198.45
##
## Number of Fisher Scoring iterations: 4
dat <- read.table("horse.txt", header = TRUE)
fit <- glm(y ~ x, data = dat, family = binomial)
anova(fit, test = "Chisq")
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: y
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 172 225.76
## x 1 31.306 171 194.45 2.204e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
dat <- read.table("horse.txt", header = TRUE)
fit0 <- glm(y ~ 1, data = dat, family = binomial)
fit1 <- glm(y ~ x, data = dat, family = binomial)
anova(fit0, fit1, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: y ~ 1
## Model 2: y ~ x
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 172 225.76
## 2 171 194.45 1 31.306 2.204e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
dat <- read.table("horse.txt", header = TRUE)
fit1 <- glm(y ~ x, data = dat, family = binomial)
exp(fit$coefficients)
## (Intercept) x
## 4.326214e-06 1.644162e+00
\[LD_{50} = -\beta_{0}/\beta_{1}\]
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit <- glm(rta ~ espolon + flores, data = dat, family = binomial)
summary(fit)
##
## Call:
## glm(formula = rta ~ espolon + flores, family = binomial, data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1171 -1.1451 0.0584 0.8644 3.3657
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.82541 0.84009 -4.554 5.27e-06 ***
## espolon 0.25991 0.06662 3.902 9.56e-05 ***
## flores 0.01628 0.01716 0.949 0.343
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 216.85 on 80 degrees of freedom
## Residual deviance: 201.26 on 78 degrees of freedom
## AIC: 409.67
##
## Number of Fisher Scoring iterations: 4
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit <- glm(rta ~ espolon + flores, data = dat, family = binomial)
anova(fit, test = "Chisq")
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: rta
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 80 216.84
## espolon 1 14.6886 79 202.16 0.0001268 ***
## flores 1 0.8991 78 201.26 0.3430157
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit.q <- glm(rta ~ espolon + flores, data = dat, family = quasibinomial)
summary(fit.q)
##
## Call:
## glm(formula = rta ~ espolon + flores, family = quasibinomial,
## data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1171 -1.1451 0.0584 0.8644 3.3657
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.82541 1.28630 -2.974 0.00391 **
## espolon 0.25991 0.10200 2.548 0.01279 *
## flores 0.01628 0.02628 0.619 0.53741
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasibinomial family taken to be 2.344379)
##
## Null deviance: 216.85 on 80 degrees of freedom
## Residual deviance: 201.26 on 78 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 4
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit.q <- glm(rta ~ espolon + flores, data = dat, family = quasibinomial)
anova(fit.q, test = "F")
## Analysis of Deviance Table
##
## Model: quasibinomial, link: logit
##
## Response: rta
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev F Pr(>F)
## NULL 80 216.84
## espolon 1 14.6886 79 202.16 6.2655 0.0144 *
## flores 1 0.8991 78 201.26 0.3835 0.5375
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit <- glm(rta ~ espolon + flores, data = dat, family = binomial)
PL <- fit$linear.predictors^2
fit2 <- glm(rta ~ espolon + flores + PL, data = dat, family = binomial)
summary(fit2)
##
## Call:
## glm(formula = rta ~ espolon + flores + PL, family = binomial,
## data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.0619 -1.1211 0.1236 0.8965 3.3048
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.29040 2.81956 -2.231 0.0257 *
## espolon 0.43586 0.20335 2.143 0.0321 *
## flores 0.02915 0.02222 1.312 0.1895
## PL 0.62963 0.68520 0.919 0.3581
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 216.85 on 80 degrees of freedom
## Residual deviance: 200.42 on 77 degrees of freedom
## AIC: 410.83
##
## Number of Fisher Scoring iterations: 4
dat <- read.table("gourlieana.txt", header = TRUE)
rta <- cbind(dat$pol.si, dat$pol.no)
fit <- glm(rta ~ espolon + flores, data = dat, family = quasibinomial)
fit$coefficients
exp(fit$coefficients)
exp(confint(fit))