Source: UNIV OF WISCONSIN submitted to
APPLICATIONS OF STATISTICS TO AGRICULTURE: ANALYSIS OF SPATIALLY AUTOCORRELATED CATEGORICAL DATA
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0193926
Grant No.
(N/A)
Project No.
WIS04674
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Oct 1, 2002
Project End Date
Sep 30, 2006
Grant Year
(N/A)
Project Director
Clayton, M.
Recipient Organization
UNIV OF WISCONSIN
21 N PARK ST STE 6401
MADISON,WI 53715-1218
Performing Department
PLANT PATHOLOGY
Non Technical Summary
It is often of interest to study correlated data that are collected over a two-dimensional region, such as a tree plantation. Currently there are few tools for analyzing such data if they are categorical: e.g. if we are focusing on tree condition (dead/alive) or absence/presence of a pest. This research will provide new methods for studying such data, focusing especially on situations where data are missing, or where there are large regional trends in the data.
Animal Health Component
(N/A)
Research Effort Categories
Basic
65%
Applied
35%
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
901 | 7310 | 2090 | 100%
Goals / Objectives
The proposed research involves the application of statistical methodologies to the analysis and interpretation of categorical data that are collected over space. Such data arise in many settings in the agricultural, biological, and environmental sciences, including plant pathology, forestry, soil science, and hydrogeology. A specific application arises in forest entomology, where, for example, it is of interest to determine whether the death of trees is independent of the presence of bark beetles on the trees. When data are spatially correlated, the usual chi-squared tests of such a hypothesis are invalid, and new methods must be developed. In work previously funded by Hatch, we developed statistical analysis tools that accurately test the hypothesis of independence in the presence of spatial correlation in the data. Specifically, we derived asymptotic distributions for test statistics of independence under spatial correlation, and we developed methods for simulating spatially dependent categorical data. Using these simulations, we have demonstrated that our asymptotic approach is effective, although small-sample corrections would be valuable. We have begun developing methods to address missing data. The proposed work will extend these efforts and will also address the uncertainties that arise when spatial autocorrelation must be estimated. In addition, we will consider situations where there are large-scale regional trends in the data.
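The failure of the usual chi-squared test under spatial correlation can be illustrated with a small simulation. The Python sketch below is not the simulation method developed in this project. It assumes a simple construction, a moving average of independent Gaussian noise clipped at zero, to produce two binary lattices that are spatially autocorrelated but independent of each other, and it counts how often the ordinary Pearson test (1 degree of freedom, 5% critical value 3.841) rejects independence. The function names, lattice size, and smoothing radius are hypothetical choices made for illustration.

```python
import random


def pearson_chi2(table):
    """Pearson chi-squared statistic for a 2x2 table [[a, b], [c, d]]."""
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected == 0:
                return 0.0  # degenerate table: treat as no evidence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def binary_field(size, radius, rng):
    """Clip a moving-average Gaussian field at zero to get a 0/1 lattice.

    radius = 0 gives independent sites; a larger radius gives stronger
    positive spatial autocorrelation (an assumed, illustrative mechanism).
    """
    pad = size + 2 * radius
    noise = [[rng.gauss(0.0, 1.0) for _ in range(pad)] for _ in range(pad)]
    width = 2 * radius + 1
    field = []
    for i in range(size):
        row = []
        for j in range(size):
            total = sum(noise[i + di][j + dj]
                        for di in range(width) for dj in range(width))
            row.append(1 if total > 0 else 0)
        field.append(row)
    return field


def rejection_rate(size, radius, reps, seed, critical=3.841):
    """Share of replicates in which the usual test rejects independence,
    even though the two fields are generated independently."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = binary_field(size, radius, rng)
        y = binary_field(size, radius, rng)
        table = [[0, 0], [0, 0]]
        for i in range(size):
            for j in range(size):
                table[x[i][j]][y[i][j]] += 1
        if pearson_chi2(table) > critical:
            rejections += 1
    return rejections / reps
```

Comparing `rejection_rate(20, 0, 200, seed=1)` (independent sites) with `rejection_rate(20, 2, 200, seed=1)` (autocorrelated sites) shows how far the nominal 5% level can drift when the two variables are truly independent but each is spatially correlated.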
Project Methods
We will use three general sets of tools to develop and evaluate our methods: statistical theory, computer simulations, and applications to field data. Statistical theory relevant to spatial methods and to categorical data will be used as a foundation for the methods that we develop. Researchers at the University of Wisconsin-Madison have offered to provide field data with which we can test our methods. Computer simulations will play an important role because, although our work has been motivated by specific problems arising in the state of Wisconsin, we intend to produce methods that are applicable more broadly. Thus, we will use simulations to generate data beyond the field data presently available and to test our methods more extensively. To proceed, we will follow Handcock and Stein and use Bayesian methods, in which probability distributions are used to characterize information about the parameters. This lets us take advantage of Markov chain Monte Carlo (MCMC) methods, which can produce an approximation to the posterior distribution of the parameters of interest even when conventional analytical methods fail. We will consider three approaches: (1) use of an autologistic model for spatial correlation; (2) a direct extension of our current model that will use quasi-likelihood methods to approximate the likelihood in an MCMC approach; (3) development and use of a multinomial autologistic model. These approaches differ in interpretability, ease of implementation, and their ability to deal with the three concerns of this proposal: missing data, variability in parameter estimation, and trend surfaces.
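To make approach (1) concrete, the following Python sketch shows a Gibbs sampler for a binary autologistic model on a rectangular lattice. The proposal does not specify the model's form, so the sketch assumes a common textbook version: 0/1 coding, a four-nearest-neighbour structure, and conditional probability P(y = 1 | neighbours) = expit(alpha + beta * s), where s is the sum of the neighbouring values. The parameters `alpha` and `beta` are treated as known here, whereas the project estimates such parameters. All function names are hypothetical.

```python
import math
import random


def neighbor_sum(grid, i, j):
    """Sum of the four nearest neighbours (fewer on the lattice boundary)."""
    nrows, ncols = len(grid), len(grid[0])
    total = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        a, b = i + di, j + dj
        if 0 <= a < nrows and 0 <= b < ncols:
            total += grid[a][b]
    return total


def conditional_prob(alpha, beta, s):
    """Assumed autologistic conditional: expit(alpha + beta * s)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * s)))


def gibbs_autologistic(nrows, ncols, alpha, beta, sweeps, seed=0):
    """Simulate a binary lattice by repeatedly resampling every site."""
    rng = random.Random(seed)
    grid = [[rng.randint(0, 1) for _ in range(ncols)] for _ in range(nrows)]
    for _ in range(sweeps):
        for i in range(nrows):
            for j in range(ncols):
                p = conditional_prob(alpha, beta, neighbor_sum(grid, i, j))
                grid[i][j] = 1 if rng.random() < p else 0
    return grid


def impute_missing(observed, alpha, beta, sweeps, seed=0):
    """Resample only the missing sites (marked None); observed sites stay fixed."""
    rng = random.Random(seed)
    missing = [(i, j) for i, row in enumerate(observed)
               for j, v in enumerate(row) if v is None]
    grid = [[rng.randint(0, 1) if v is None else v for v in row]
            for row in observed]
    for _ in range(sweeps):
        for i, j in missing:
            p = conditional_prob(alpha, beta, neighbor_sum(grid, i, j))
            grid[i][j] = 1 if rng.random() < p else 0
    return grid
```

For example, `gibbs_autologistic(20, 20, alpha=-1.0, beta=0.5, sweeps=50)` returns one simulated lattice, and `impute_missing` returns one draw of the missing sites given the observed ones. In a full Bayesian analysis the parameters would also be updated at each iteration, which this sketch omits.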

Progress 10/01/02 to 09/30/06

Outputs
This research involves the application of statistical methodologies to the analysis of categorical data that are collected over space. Such data arise in the agricultural, biological, and environmental sciences. A specific application of interest arises in forest entomology. Of interest is whether death of trees is independent of the presence of Ips pini on the trees. When data are spatially correlated, the usual chi-squared tests of this hypothesis will be invalid. We have developed methods based on a multinomial autologistic model. This allows us to take advantage of Markov chain Monte Carlo methods and use a Bayesian approach to deal with the uncertainties that arise when spatial autocorrelation must be estimated, and to analyze multinomial data.

Impacts
This research will provide new methods for studying spatially correlated categorical data, focusing especially on situations where data are missing, or where there are large regional trends in the data.

Publications

  • Mladenoff DJ, Clayton MK, Sickley TA, Wydeven AP. 2006. L. D. Mech critique of our work lacks scientific validity. Wildlife Society Bulletin 34:878-881.
  • Gangnon RE, Clayton MK. 2006. Cluster detection using Bayes factors from overparametrized cluster models. Environmental and Ecological Statistics. (to appear).
  • Hawbaker TJ, Radeloff VC, Clayton MK, Hammer RB, Gonzalez-Abraham CE. 2006. Road development, housing growth, and landscape fragmentation in northern Wisconsin: 1937-1999. Ecological Applications 16:1222-1237.
  • St-Louis V, Pidgeon AM, Radeloff VC, Hawbaker TJ, Clayton MK. 2006. High-resolution image texture as a predictor of bird species richness. Remote Sensing Of Environment 105:299-312.
  • Yan P, Clayton MK, 2006. A cluster model for space-time disease counts. Statistics in Medicine. 25:867-881.
  • Gonzalez-Abraham CE, Radeloff VC, Hammer RB, Hawbaker TJ, Stewart SI, Clayton MK. 2007. Effects of building density, land ownership and land cover on landscape fragmentation in northern Wisconsin, USA. Landscape Ecology. (to appear).
  • Syphard AD, Radeloff VC, Keeley JE, Hawbaker TJ, Clayton MK, Stewart SI, Hammer RB. 2007. Human influence on California fire regimes. Ecological Applications. (to appear).


Progress 01/01/05 to 12/31/05

Outputs
This research involves the application of statistical methodologies to the analysis of categorical data that are collected over space. Such data arise in the agricultural, biological, and environmental sciences. A specific application of interest arises in forest entomology. Of interest is whether death of trees is independent of the presence of Ips pini on the trees. When data are spatially correlated, the usual chi-squared tests of this hypothesis will be invalid. We have developed methods based on a multinomial autologistic model. This allows us to take advantage of Markov chain Monte Carlo methods and use a Bayesian approach to deal with the uncertainties that arise when spatial autocorrelation must be estimated, and to analyze multinomial data.

Impacts
This research will provide new methods for studying spatially correlated categorical data, focusing especially on situations where data are missing, or where there are large regional trends in the data.

Publications

  • Bennett EM, Carpenter SR, Clayton MK, 2005. Soil phosphorus variability: Scale-dependence in an urbanizing agricultural landscape. Landscape Ecology. 20:389-400.
  • Hawbaker TJ, Radeloff VC, Hammer RB, Clayton MK. 2005. Road density and landscape pattern in relation to housing density, land ownership, land cover, and soils. Landscape Ecology. 20:609-625.
  • Lin P-S, Clayton MK, 2005. Properties of binary data generated from a truncated Gaussian random field. Communications in Statistics-Theory and Methods. 34:537-544.
  • Lin P-S, Clayton MK, 2005. Analysis of binary spatial data by quasi-likelihood estimating equations. Annals of Statistics. 33:542-555.
  • Yan, P. and Clayton, M. K., 2005. A cluster model for space-time disease counts. Statistics in Medicine. (to appear).


Progress 01/01/04 to 12/31/04

Outputs
This research involves the application of statistical methodologies to the analysis of categorical data that are collected over space. Such data arise in the agricultural, biological, and environmental sciences. A specific application of interest arises in forest entomology. Of interest is whether death of trees is independent of the presence of Ips pini on the trees. When data are spatially correlated, the usual chi-squared tests of this hypothesis will be invalid. We have developed methods based on an autologistic model. This allows us to take advantage of Markov chain Monte Carlo methods, thus permitting us, through a Bayesian approach, to deal with the uncertainties that arise when spatial autocorrelation must be estimated, and to deal with situations where there are large-scale regional trends in the data.

Impacts
This research will provide new methods for studying spatially correlated categorical data, focusing especially on situations where data are missing, or where there are large regional trends in the data.

Publications

  • Gangnon, R. E. and Clayton, M. K. 2004. Likelihood based tests for localized spatial clustering of disease. Environmetrics.
  • Bennett, E. M., Carpenter, S. R., and Clayton, M. K., 2004. Soil phosphorus variability: Scale-dependence in an urbanizing agricultural landscape. Landscape Ecology.
  • Brown, D. J., Clayton, M. K., and McSweeney, K. 2004. Potential terrain controls on soil color, texture contrast and grain-size deposition for the original catena landscape in Uganda. Geoderma.


Progress 01/01/03 to 12/31/03

Outputs
This research involves the application of statistical methodologies to the analysis of categorical data that are collected over space. Such data arise in the agricultural, biological, and environmental sciences. A specific application of interest arises in forest entomology. Of interest is whether death of trees is independent of the presence of Ips pini on the trees. When data are spatially correlated, the usual chi-squared tests of this hypothesis will be invalid. We have initiated the development of methods based on an autologistic model. This will allow us to take advantage of Markov chain Monte Carlo methods, thus allowing us, through a Bayesian approach, to deal with the uncertainties that arise when spatial autocorrelation must be estimated, and to deal with situations where there are large-scale regional trends in the data.

Impacts
This research will provide new methods for studying spatially correlated categorical data, focusing especially on situations where data are missing, or where there are large regional trends in the data.

Publications

  • Brown, D.J., Helmke, P. A., and Clayton, M. K. 2003. Robust geochemical indices for redox and weathering on a granitic laterite landscape in central Uganda. Geochimica et Cosmochimica Acta. 67:2711-2723.
  • Burrows, S.N., S.T. Gower, J.M. Norman, G. Diak, D.S. Mackay, D.E. Ahl, and M.K. Clayton. 2003. Spatial variability of aboveground net primary productivity for a forested landscape in northern Wisconsin. Canadian Journal of Forest Research, 33:2007-2018.
  • McManus, P.S., Caldwell, R.W., Voland, R.P., Best, V.M., and Clayton, M.K. 2003. Evaluation of sampling strategies for determining incidence of cranberry fruit rot and fruit rot fungi. Plant Disease. 87:585-590.
  • Upper, C. D., Hirano, S. S., Dodd, K. K., and Clayton, M. K. 2003. Factors that affect spread of Pseudomonas syringae in the phyllosphere. Phytopathology. 93:1082-1092.


Progress 10/01/02 to 12/31/02

Outputs
This research involves the application of statistical methodologies to the analysis of categorical data that are collected over space. Such data arise in the agricultural, biological, and environmental sciences. A specific application of interest arises in forest entomology. Of interest is whether death of trees is independent of the presence of Ips pini on the trees. When data are spatially correlated, the usual chi-squared tests of this hypothesis will be invalid. We have initiated the development of methods based on an autologistic model. This will allow us to take advantage of Markov chain Monte Carlo methods, thus allowing us, through a Bayesian approach, to deal with the uncertainties that arise when spatial autocorrelation must be estimated, and to deal with situations where there are large-scale regional trends in the data.

Impacts
This project will supply needed statistical tools for the analysis of important ecological data. The advent of increased computing power facilitates the development of these tools. Their ultimate application will be of value in numerous ecological and agricultural research settings.

Publications

  • No publications reported this period