|
| 1 | +# Moran Eigenvectors |
| 2 | + |
| 3 | +## Introduction [^1] |
| 4 | + |
| 5 | +The Moran eigenvector approach ([Dray, Legendre, and Peres-Neto |
| 6 | +2006](#ref-dray+legendre+peres-neto:06); [Griffith and Peres-Neto |
| 7 | +2006](#ref-griffith+peres-neto:06)) involved the spatial patterns |
| 8 | +represented by maps of eigenvectors; by choosing suitable orthogonal |
| 9 | +patterns and adding them to a linear or generalised linear model, the |
| 10 | +spatial dependence present in the residuals can be moved into the model. |
| 11 | + |
| 12 | +It uses brute force to search the set of eigenvectors of the matrix |
| 13 | +$`\mathbf{M W M}`$, where |
| 14 | + |
| 15 | +\$\$\mathbf{M} = \mathbf{I} - \mathbf{X}(\mathbf{X}^{\rm T} |
| 16 | +\mathbf{X})^{-1}\mathbf{X}^{\rm T}\$\$ is a symmetric and idempotent |
| 17 | +projection matrix and $`\mathbf{W}`$ are the spatial weights. In the |
| 18 | +spatial lag form of `SpatialFiltering` and in the GLM `ME` form below, |
| 19 | +$`\mathbf{X}`$ is an $`n`$-vector of ones, that is the intercept only. |
| 20 | + |
| 21 | +In its general form, `SpatialFiltering` chooses the subset of the $`n`$ |
| 22 | +eigenvectors that reduce the residual spatial autocorrelation in the |
| 23 | +error of the model with covariates. The lag form adds the covariates in |
| 24 | +assessment of which eigenvectors to choose, but does not use them in |
| 25 | +constructing the eigenvectors. `SpatialFiltering` was implemented and |
| 26 | +contributed by Yongwan Chun and Michael Tiefelsdorf, and is presented in |
| 27 | +Tiefelsdorf and Griffith ([2007](#ref-tiefelsdorf+griffith:07)); `ME` is |
| 28 | +based on Matlab code by Pedro Peres-Neto and is discussed in Dray, |
| 29 | +Legendre, and Peres-Neto ([2006](#ref-dray+legendre+peres-neto:06)) and |
| 30 | +Griffith and Peres-Neto ([2006](#ref-griffith+peres-neto:06)). |
| 31 | + |
| 32 | +``` r |
| 33 | +library(spdep) |
| 34 | +require("sf", quietly=TRUE) |
| 35 | +if (packageVersion("spData") >= "2.3.2") { |
| 36 | + NY8 <- sf::st_read(system.file("shapes/NY8_utm18.gpkg", package="spData")) |
| 37 | +} else { |
| 38 | + NY8 <- sf::st_read(system.file("shapes/NY8_bna_utm18.gpkg", package="spData")) |
| 39 | + sf::st_crs(NY8) <- "EPSG:32618" |
| 40 | + NY8$Cases <- NY8$TRACTCAS |
| 41 | +} |
| 42 | +``` |
| 43 | + |
| 44 | + ## Reading layer `NY8_utm18' from data source |
| 45 | + ## `/home/rsb/lib/r_libs/spData/shapes/NY8_utm18.gpkg' using driver `GPKG' |
| 46 | + ## Simple feature collection with 281 features and 17 fields |
| 47 | + ## Geometry type: POLYGON |
| 48 | + ## Dimension: XY |
| 49 | + ## Bounding box: xmin: 358241.9 ymin: 4649755 xmax: 480393.1 ymax: 4808545 |
| 50 | + ## Projected CRS: WGS 84 / UTM zone 18N |
| 51 | + |
| 52 | +``` r |
| 53 | +NY_nb <- read.gal(system.file("weights/NY_nb.gal", package="spData"), override.id=TRUE) |
| 54 | +``` |
| 55 | + |
| 56 | +``` r |
| 57 | +library(spatialreg) |
| 58 | +nySFE <- SpatialFiltering(Z~PEXPOSURE+PCTAGE65P+PCTOWNHOME, data=NY8, nb=NY_nb, style="W", verbose=FALSE) |
| 59 | +nylmSFE <- lm(Z~PEXPOSURE+PCTAGE65P+PCTOWNHOME+fitted(nySFE), data=NY8) |
| 60 | +summary(nylmSFE) |
| 61 | +``` |
| 62 | + |
| 63 | + ## |
| 64 | + ## Call: |
| 65 | + ## lm(formula = Z ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME + fitted(nySFE), |
| 66 | + ## data = NY8) |
| 67 | + ## |
| 68 | + ## Residuals: |
| 69 | + ## Min 1Q Median 3Q Max |
| 70 | + ## -1.5184 -0.3523 -0.0105 0.3221 3.1964 |
| 71 | + ## |
| 72 | + ## Coefficients: |
| 73 | + ## Estimate Std. Error t value Pr(>|t|) |
| 74 | + ## (Intercept) -0.51728 0.14606 -3.542 0.000469 *** |
| 75 | + ## PEXPOSURE 0.04884 0.03230 1.512 0.131717 |
| 76 | + ## PCTAGE65P 3.95089 0.55776 7.083 1.25e-11 *** |
| 77 | + ## PCTOWNHOME -0.56004 0.15688 -3.570 0.000423 *** |
| 78 | + ## fitted(nySFE)vec13 -2.09397 0.60534 -3.459 0.000630 *** |
| 79 | + ## fitted(nySFE)vec44 -2.24003 0.60534 -3.700 0.000261 *** |
| 80 | + ## fitted(nySFE)vec6 1.02979 0.60534 1.701 0.090072 . |
| 81 | + ## fitted(nySFE)vec38 1.29282 0.60534 2.136 0.033613 * |
| 82 | + ## fitted(nySFE)vec20 1.10064 0.60534 1.818 0.070150 . |
| 83 | + ## fitted(nySFE)vec14 -1.05105 0.60534 -1.736 0.083662 . |
| 84 | + ## fitted(nySFE)vec75 1.90600 0.60534 3.149 0.001826 ** |
| 85 | + ## fitted(nySFE)vec21 -1.06331 0.60534 -1.757 0.080138 . |
| 86 | + ## fitted(nySFE)vec36 -1.17861 0.60534 -1.947 0.052578 . |
| 87 | + ## fitted(nySFE)vec61 -1.08582 0.60534 -1.794 0.073986 . |
| 88 | + ## --- |
| 89 | + ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 |
| 90 | + ## |
| 91 | + ## Residual standard error: 0.6053 on 267 degrees of freedom |
| 92 | + ## Multiple R-squared: 0.3401, Adjusted R-squared: 0.308 |
| 93 | + ## F-statistic: 10.58 on 13 and 267 DF, p-value: < 2.2e-16 |
| 94 | + |
| 95 | +``` r |
| 96 | +nylm <- lm(Z~PEXPOSURE+PCTAGE65P+PCTOWNHOME, data=NY8) |
| 97 | +anova(nylm, nylmSFE) |
| 98 | +``` |
| 99 | + |
| 100 | + ## Analysis of Variance Table |
| 101 | + ## |
| 102 | + ## Model 1: Z ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME |
| 103 | + ## Model 2: Z ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME + fitted(nySFE) |
| 104 | + ## Res.Df RSS Df Sum of Sq F Pr(>F) |
| 105 | + ## 1 277 119.619 |
| 106 | + ## 2 267 97.837 10 21.782 5.9444 3.988e-08 *** |
| 107 | + ## --- |
| 108 | + ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 |
| 109 | + |
| 110 | +Since the `SpatialFiltering` approach does not allow weights to be used, |
| 111 | +we see that the residual autocorrelation of the original linear model is |
| 112 | +absorbed, or ‘whitened’ by the inclusion of selected eigenvectors in the |
| 113 | +model, but that the covariate coefficients change little. The addition |
| 114 | +of these eigenvectors – each representing an independent spatial pattern |
| 115 | +– relieves the residual autocorrelation, but otherwise makes few changes |
| 116 | +in the substantive coefficient values. |
| 117 | + |
| 118 | +The `ME` function also searches for eigenvectors from the spatial lag |
| 119 | +variant of the underlying model, but in a GLM framework. The criterion |
| 120 | +is a permutation bootstrap test on Moran’s $`I`$ for regression |
| 121 | +residuals, and in this case, because of the very limited remaining |
| 122 | +spatial autocorrelation, is set at $`\alpha = 0.5`$. Even with this very |
| 123 | +generous stopping rule, only few eigenvectors are chosen; their combined |
| 124 | +contribution only just improves the fit of the GLM model. |
| 125 | + |
| 126 | +``` r |
| 127 | +NYlistwW <- nb2listw(NY_nb, style = "W") |
| 128 | +set.seed(111) |
| 129 | +``` |
| 130 | + |
| 131 | +``` r |
| 132 | +nyME <- ME(Cases~PEXPOSURE+PCTAGE65P+PCTOWNHOME, data=NY8, offset=log(POP8), family="poisson", listw=NYlistwW, alpha=0.46) |
| 133 | +``` |
| 134 | + |
| 135 | +``` r |
| 136 | +nyME |
| 137 | +``` |
| 138 | + |
| 139 | + ## Eigenvector ZI pr(ZI) |
| 140 | + ## 0 NA NA 0.31 |
| 141 | + ## 1 24 NA 0.44 |
| 142 | + ## 2 223 NA 0.42 |
| 143 | + ## 3 206 NA 0.43 |
| 144 | + ## 4 169 NA 0.48 |
| 145 | + |
| 146 | +``` r |
| 147 | +NY8$eigen_1 <- fitted(nyME)[,1] |
| 148 | +NY8$eigen_2 <- fitted(nyME)[,2] |
| 149 | +``` |
| 150 | + |
| 151 | +``` r |
| 152 | +#gry <- brewer.pal(9, "Greys")[-1] |
| 153 | +plot(NY8[,c("eigen_1", "eigen_2")]) |
| 154 | +``` |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | +``` r |
| 159 | +nyglmME <- glm(Cases~PEXPOSURE+PCTAGE65P+PCTOWNHOME+offset(log(POP8))+fitted(nyME), data=NY8, family="poisson") |
| 160 | +summary(nyglmME) |
| 161 | +``` |
| 162 | + |
| 163 | + ## |
| 164 | + ## Call: |
| 165 | + ## glm(formula = Cases ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME + offset(log(POP8)) + |
| 166 | + ## fitted(nyME), family = "poisson", data = NY8) |
| 167 | + ## |
| 168 | + ## Coefficients: |
| 169 | + ## Estimate Std. Error z value Pr(>|z|) |
| 170 | + ## (Intercept) -8.13431 0.18388 -44.237 < 2e-16 *** |
| 171 | + ## PEXPOSURE 0.14136 0.03134 4.511 6.45e-06 *** |
| 172 | + ## PCTAGE65P 4.16875 0.60149 6.931 4.19e-12 *** |
| 173 | + ## PCTOWNHOME -0.39290 0.19222 -2.044 0.04096 * |
| 174 | + ## fitted(nyME)vec24 1.62658 0.72243 2.252 0.02435 * |
| 175 | + ## fitted(nyME)vec223 0.92941 0.70391 1.320 0.18671 |
| 176 | + ## fitted(nyME)vec206 -0.11559 0.68987 -0.168 0.86693 |
| 177 | + ## fitted(nyME)vec169 1.82674 0.68142 2.681 0.00735 ** |
| 178 | + ## --- |
| 179 | + ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 |
| 180 | + ## |
| 181 | + ## (Dispersion parameter for poisson family taken to be 1) |
| 182 | + ## |
| 183 | + ## Null deviance: 428.25 on 280 degrees of freedom |
| 184 | + ## Residual deviance: 340.08 on 273 degrees of freedom |
| 185 | + ## AIC: Inf |
| 186 | + ## |
| 187 | + ## Number of Fisher Scoring iterations: 5 |
| 188 | + |
| 189 | +``` r |
| 190 | +nyGLMp <- glm(Cases~PEXPOSURE+PCTAGE65P+PCTOWNHOME+offset(log(POP8)), data=NY8,family="poisson") |
| 191 | +anova(nyGLMp, nyglmME, test="Chisq") |
| 192 | +``` |
| 193 | + |
| 194 | + ## Analysis of Deviance Table |
| 195 | + ## |
| 196 | + ## Model 1: Cases ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME + offset(log(POP8)) |
| 197 | + ## Model 2: Cases ~ PEXPOSURE + PCTAGE65P + PCTOWNHOME + offset(log(POP8)) + |
| 198 | + ## fitted(nyME) |
| 199 | + ## Resid. Df Resid. Dev Df Deviance Pr(>Chi) |
| 200 | + ## 1 277 353.35 |
| 201 | + ## 2 273 340.08 4 13.269 0.01003 * |
| 202 | + ## --- |
| 203 | + ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 |
| 204 | + |
| 205 | +Figure |
| 206 | +``` math |
| 207 | +fig:geigen2 |
| 208 | +``` |
| 209 | +shows the spatial patterns chosen to match the very small amount of |
| 210 | +spatial autocorrelation remaining in the model. As with the other |
| 211 | +Poisson regressions, the closeness to TCE sites is highly significant. |
| 212 | +Since, however, many TCE sites are also in or close to more densely |
| 213 | +populated urban areas with the possible presence of both point-source |
| 214 | +and non-point-source pollution, it would be premature to take such |
| 215 | +results simply at their face value. There is, however, a potentially |
| 216 | +useful contrast between the cities of Binghamton in the south of the |
| 217 | +study area with several sites in its vicinity, and Syracuse in the north |
| 218 | +without TCE sites in this data set. |
| 219 | + |
| 220 | +## References |
| 221 | + |
| 222 | +Dray, S., P. Legendre, and P. R. Peres-Neto. 2006. “Spatial Modeling: A |
| 223 | +Comprehensive Framework for Principle Coordinate Analysis of Neighbor |
| 224 | +Matrices (PCNM).” *Ecological Modelling* 196: 483–93. |
| 225 | + |
| 226 | +Griffith, D. A., and P. R. Peres-Neto. 2006. “Spatial Modeling in |
| 227 | +Ecology: The Flexibility of Eigenfunction Spatial Analyses.” *Ecology* |
| 228 | +87: 2603–13. |
| 229 | + |
| 230 | +Tiefelsdorf, M., and D. A. Griffith. 2007. “Semiparametric Filtering of |
| 231 | +Spatial Autocorrelation: The Eigenvector Approach.” *Environment and |
| 232 | +Planning A* 39: 1193–1221. |
| 233 | + |
| 234 | +[^1]: This vignette formed pp. 302–305 of the first edition of Bivand, |
| 235 | + R. S., Pebesma, E. and Gómez-Rubio V. (2008) Applied Spatial Data |
| 236 | + Analysis with R, Springer-Verlag, New York. It was retired from the |
| 237 | + second edition (2013) to accommodate material on other topics, and |
| 238 | + is made available in this form with the understanding of the |
| 239 | + publishers. |
0 commit comments