Spatial Analytics with ArcGIS

By: Eric Pimpler

Overview of this book

Spatial statistics has the potential to provide insight that is not otherwise available through traditional GIS tools. This book is designed to introduce you to the use of spatial statistics so that you can solve complex geographic analysis problems. The book begins by introducing you to the many spatial statistics tools available in ArcGIS. You will learn how to analyze patterns, map clusters, and model spatial relationships with these tools. Further on, you will explore how to extend the spatial statistics tools currently available in ArcGIS, and use the R programming language to create custom tools in ArcGIS through the ArcGIS Bridge using real-world examples. At the end of the book, you will be presented with two case studies in which you will apply what you have learned to analyze and gain insights into real estate data.

An overview of the Spatial Statistics Tools toolbox in ArcGIS

The ArcGIS Spatial Statistics Tools toolbox is available for all license levels of ArcGIS Desktop, including Basic, Standard, and Advanced. The toolbox includes the following toolsets:

  • The Analyzing Patterns toolset
  • The Mapping Clusters toolset
  • The Measuring Geographic Distributions toolset
  • The Modeling Spatial Relationships toolset

The Measuring Geographic Distributions toolset

The Measuring Geographic Distributions toolset in the Spatial Statistics Tools toolbox contains tools that provide descriptive geographic statistics, including the Central Feature, Directional Distribution, Linear Directional Mean, Mean Center, Median Center, and Standard Distance tools. Together, these tools provide a basic set of statistical exploration tools. These basic descriptive statistics are used only as a starting point in the analysis process. The following screenshot displays the output from the Directional Distribution tool for an analysis of crime data:

The Central Feature, Mean Center, and Median Center tools all provide similar functionality. Each creates a feature class containing a single feature that represents the centrality of a geographic dataset.

The Linear Directional Mean tool identifies the mean direction, length, and geographic center for a set of lines. The output of this tool is a feature class with a single linear feature.

The Standard Distance and Directional Distribution tools are similar, in that they both measure the degree to which features are concentrated or dispersed around the geometric center, but the Directional Distribution tool, also known as the Standard Deviational Ellipse, is superior as it also provides a measure of directionality in the dataset.
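
To make the idea of centrality and dispersion concrete, the following is a minimal sketch in plain Python of what a mean center and a standard distance measure. It is not the ArcGIS implementation; the function names and the sample coordinates are hypothetical, and the points are assumed to be unweighted and in a projected coordinate system:

```python
import math

def mean_center(points):
    """Return the (x, y) mean center of a list of (x, y) tuples."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Return the root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical sample: four points on the corners of a 2 x 2 square
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
center = mean_center(points)
sd = standard_distance(points)
print("Mean center:", center)
print("Standard distance:", round(sd, 4))
```

A small standard distance indicates features concentrated around the center, while a large one indicates dispersion. A directional measure such as the Standard Deviational Ellipse would additionally require the orientation of the point cloud, which this sketch does not compute.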

The Analyzing Patterns toolset

The Analyzing Patterns toolset in the Spatial Statistics Tools toolbox contains a series of tools that help evaluate whether features or the values associated with features form a clustered, dispersed, or random spatial pattern. These tools generate a single result for the entire dataset in question. In addition, the result does not take the form of a map, but rather statistical output, as shown in the following screenshot:

Tools in this category generate what are known as inferential statistics: a measure of how confident we can be that the pattern is clustered or dispersed rather than random. Let's examine the following tools found in the Analyzing Patterns toolset:

  • Average Nearest Neighbor: This tool calculates the nearest neighbor index based on the average distance from each feature to its nearest neighboring feature. For each feature in a dataset, the distance to its nearest neighbor is computed. An average distance is then computed and compared to the expected average distance. In doing so, an ANN ratio is created, which in simple terms is the observed average distance divided by the expected average distance. If the ratio is less than 1, we can say that the data exhibits a clustered pattern, whereas a value greater than 1 indicates a dispersed pattern in our data.
  • Spatial Autocorrelation (Morans I): This tool measures spatial autocorrelation by simultaneously measuring feature locations and attribute values. If features that are close together have similar values, then that is said to be clustering. However, if features that are close together have dissimilar values, then they form a dispersed pattern. This tool outputs a Moran's I index value along with a z-score and a p-value.
  • Incremental Spatial Autocorrelation: This tool is similar to the previous tool, but it measures spatial autocorrelation for a series of distances and can create an optional line graph of those distances along with their corresponding z-scores. It is often used as a distance aid for other tools such as Hot Spot Analysis or Point Density. Its role overlaps with the newer Optimized Hot Spot Analysis tool, so it isn't used as frequently anymore.
  • High/Low Clustering (Getis-Ord General G): This looks for high value clusters and low value clusters. It is used to measure the concentration of high or low values for a given study area and return the Observed General G, Expected General G, z-score, and p-value. It is most appropriate when there is a fairly even distribution of values.
  • Multi-Distance Spatial Cluster Analysis (Ripleys K Function): This determines whether feature locations show significant clustering or dispersion. However, unlike the other spatial pattern tools that we've examined in this section, it does not take the value at a location into account. It only determines clustering by the location of the features. This tool is often used in fields such as environmental studies, health care, and crime where you are attempting to determine whether one feature attracts another feature.
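
The ratios and indexes described in the list above can be sketched in a few lines of plain Python. The following is an illustration only, not the ArcGIS implementation. It assumes the usual expected nearest neighbor distance for a random pattern, 0.5 / sqrt(n / area), and a simple binary distance-band weighting for Moran's I; the function names, the sample points, and the study area size are all hypothetical, and no z-scores or p-values are computed:

```python
import math

def ann_ratio(points, area):
    """Observed mean nearest neighbor distance divided by the expected one."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from feature i to its closest other feature
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # expectation under a random pattern
    return observed / expected

def morans_i(points, values, band):
    """Global Moran's I with binary weights: 1 if within `band`, else 0."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = 0.0      # sum of all weights
    cross = 0.0   # sum of w_ij * z_i * z_j
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= band:
                s0 += 1.0
                cross += z[i] * z[j]
    return (n / s0) * (cross / sum(d * d for d in z))

# Hypothetical 10 x 10 study area (area = 100)
tight = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
spread = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
print("ANN ratio, tight points:", ann_ratio(tight, 100.0))    # below 1
print("ANN ratio, spread points:", ann_ratio(spread, 100.0))  # above 1

# Four features along a line, neighbors defined by a distance band of 1
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print("Moran's I, similar neighbors:", morans_i(line, [1, 1, 5, 5], 1.0))
print("Moran's I, alternating values:", morans_i(line, [1, 5, 1, 5], 1.0))
```

A positive Moran's I suggests that similar values sit near each other, and a negative value suggests that neighboring values differ. In practice, the z-score and p-value reported by the tools are what tell you whether such a result is statistically significant.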

The Mapping Clusters toolset

The Mapping Clusters toolset is probably the most well-known and commonly used toolset in the Spatial Statistics Tools toolbox, and for good reason. The output from these tools is highly visual and beneficial in the analysis of clustering phenomena. There are many examples of clustering: housing, businesses, trees, crimes, and many others. The degree of this clustering is also important. The tools in the Mapping Clusters toolset don't just answer the question "Is there clustering?"; they also take on the question "Where is the clustering?"

The Mapping Clusters toolset contains the following tools:

  • Hot Spot Analysis: This tool is probably the most popular tool in the Spatial Statistics Tools toolbox. Given a set of weighted features, it identifies statistically significant hot and cold spots using the Getis-Ord Gi* statistic, as shown in the output of real estate sales activity in the following screenshot:
  • Similarity Search: This tool is used to identify candidate features that are most similar or most dissimilar to one or more input features, based on feature attributes. Dissimilarity searches can be just as important as similarity searches. For example, a community development organization, in its attempts to attract new businesses, might show that its city is dissimilar to competing cities when comparing crime.
  • Grouping Analysis: This tool groups features based on feature attributes, as well as optional spatial/temporal constraints. The output of this tool is a set of distinct groups in which the features within each group are as similar as possible and the groups themselves are as dissimilar from each other as possible. An example is displayed in the following screenshot. The tool is capable of multivariate analysis, and the output is a map and a report. The output map can have either contiguous groups or non-contiguous groups:
  • Cluster and Outlier Analysis: This is the final tool in the Mapping Clusters toolset. In addition to performing hot spot analysis, it identifies outliers in your data. Outliers are extremely relevant to many types of analyses. The tool starts by separating features and neighborhoods from the study area. Each feature is examined against every other feature to see whether it is significantly different from the other features. Likewise, each neighborhood is examined in relation to all other neighborhoods to see whether it is statistically different from them. An example of the output from the Cluster and Outlier Analysis tool is provided in the following screenshot:
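
To illustrate how a hot spot statistic answers the "where" question, the following is a minimal sketch of the Getis-Ord Gi* z-score in plain Python. It is not the ArcGIS implementation. It assumes binary fixed-distance-band weights in which each feature counts as its own neighbor, and the function name, coordinates, and values are hypothetical:

```python
import math

def gi_star(points, values, band):
    """Return one Getis-Ord Gi* z-score per feature using binary band weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for xi, yi in points:
        # Weight 1 for every feature within the band (including the feature itself)
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # Guard against a zero denominator (constant values or an all-inclusive band)
        scores.append(numerator / denominator if denominator else 0.0)
    return scores

# Hypothetical features along a line: high values on the left, low on the right
points = [(float(x), 0.0) for x in range(6)]
values = [9, 9, 9, 1, 1, 1]
scores = gi_star(points, values, band=1.0)
for point, score in zip(points, scores):
    print(point, round(score, 3))
```

Large positive z-scores mark features surrounded by high values (hot spots), and large negative z-scores mark features surrounded by low values (cold spots). Each feature gets its own score, which is what allows the result to be drawn as a map rather than reported as a single number for the whole dataset.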

The Modeling Spatial Relationships toolset

The Modeling Spatial Relationships toolset contains a number of regression analysis tools that help you examine and/or quantify the relationships between features. They help measure how features in a dataset relate to each other in space.

The regression tools provided in the Spatial Statistics Tools toolbox model relationships among data variables associated with geographic features, allowing you to make predictions for unknown values or to better understand key factors influencing a variable you are trying to model. Regression methods allow you to verify relationships and to measure how strong those relationships are. The Exploratory Regression tool allows you to examine a large number of Ordinary Least Squares (OLS) models quickly, summarize variable relationships, and determine whether any combination of candidate explanatory variables satisfies all of the requirements of the OLS method.

There are two regression analysis tools in ArcGIS which are as follows:

  • Ordinary Least Squares: This tool is a linear regression tool used to generate predictions or model a dependent variable in terms of its relationships to a set of explanatory variables. OLS is the best-known regression technique and provides a good starting point for spatial regression analysis. This tool provides a global model of a variable or process you are trying to understand or predict. The result is a single regression equation that depicts a positive or negative linear relationship. The following screenshot depicts partial output from the OLS tool:
  • Geographically Weighted Regression: Geographically Weighted Regression or GWR is a local form of linear regression for modeling spatially varying relationships. Note that this tool does require an Advanced ArcGIS license. GWR constructs a separate equation for each feature and is most appropriate when you have several hundred features. GWR creates an output feature class (shown in the following screenshot) and table. The output table contains a summary of the tool execution. When running GWR, you should use the same explanatory variables that you specified in your OLS model:
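
The notion of a global model, meaning a single regression equation for the whole study area, can be sketched with a one-variable ordinary least squares fit in plain Python. This is only an illustration of the technique, not the output of the OLS tool; the function name and the sample data (a hypothetical size-versus-price relationship) are assumptions, and none of the diagnostics the tool reports are computed:

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares; return R-squared too."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical data: explanatory variable (size) and dependent variable (price)
size = [1.0, 2.0, 3.0, 4.0]
price = [3.1, 4.9, 7.2, 8.8]
intercept, slope, r_squared = ols_fit(size, price)
print("price = %.2f + %.2f * size (R-squared %.3f)" % (intercept, slope, r_squared))
```

A positive slope corresponds to a positive linear relationship and a negative slope to a negative one. A local method such as GWR differs in that it estimates a separate equation for each feature rather than one equation for all of them.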

The Modeling Spatial Relationships toolset also includes the Exploratory Regression tool.

  • Exploratory Regression: This tool can be used to evaluate combinations of explanatory variables for OLS models that best explain the dependent variable. This data-mining tool does much of the work of finding well-suited variables for you and can save you a lot of time in finding the right combination. The results of this tool are written to the progress dialog, result window, and an optional report file. An example of the output from the Exploratory Regression tool can be seen in the following screenshot: