## An overview of the Spatial Statistics Tools toolbox in ArcGIS

The ArcGIS Spatial Statistics Tools toolbox is available at all license levels of ArcGIS Desktop: Basic, Standard, and Advanced. The toolbox includes the following toolsets:

- The `Analyzing Patterns` toolset

- The `Mapping Clusters` toolset

- The `Measuring Geographic Distributions` toolset

- The `Modeling Spatial Relationships` toolset

### The Measuring Geographic Distributions toolset

The **Measuring Geographic Distributions** toolset in the `Spatial Statistics Tools` toolbox contains a set of tools that provide descriptive geographic statistics, including the `Central Feature`, `Directional Distribution`, `Linear Directional Mean`, `Mean Center`, `Median Center`, and `Standard Distance` tools. Together, these tools support basic statistical exploration. The descriptive statistics they produce are used only as a starting point in the analysis process. The following screenshot displays the output from the `Directional Distribution` tool for an analysis of crime data:

The `Central Feature`, `Mean Center`, and `Median Center` tools all provide similar functionality. Each creates a feature class containing a single feature that represents the centrality of a geographic dataset.
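To make the three notions of centrality concrete, here is a minimal pure-Python sketch. It is not the ArcGIS implementation: the function names and sample coordinates are invented for illustration, features are assumed to be unweighted points, and the median center is approximated with a Weiszfeld-style iteration, which is only one possible way to compute it.

```python
import math

def mean_center(points):
    """Arithmetic mean of the x and y coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def central_feature(points):
    """The input point with the smallest total distance to all other points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

def median_center(points, iterations=100):
    """Approximate the location minimizing total distance to all points."""
    cx, cy = mean_center(points)  # start from the mean center
    for _ in range(iterations):
        wsum = xsum = ysum = 0.0
        for x, y in points:
            d = math.dist((cx, cy), (x, y))
            if d == 0:
                continue  # skip a coincident point to avoid dividing by zero
            wsum += 1 / d
            xsum += x / d
            ysum += y / d
        if wsum == 0:
            break
        cx, cy = xsum / wsum, ysum / wsum
    return (cx, cy)

# Hypothetical sample: a tight square of points plus one distant outlier.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (10, 10)]
center_mean = mean_center(points)
center_feature = central_feature(points)
center_median = median_center(points)
```

In this sample the outlier pulls the mean center away from the square, while the median center stays closer to the bulk of the points and the central feature is always one of the input points.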

The `Linear Directional Mean` tool identifies the mean direction, length, and geographic center for a set of lines. The output of this tool is a feature class with a single linear feature.
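The idea of a mean direction can be sketched with circular averaging of line angles. The following is an illustrative approximation, not the tool's actual algorithm: the function name is hypothetical, each line is assumed to be a simple two-point segment with a meaningful start and end, and angles are measured counterclockwise from the positive x-axis.

```python
import math

def linear_directional_mean(lines):
    """Return (mean angle in degrees, mean length, mean midpoint) for segments."""
    sin_sum = cos_sum = length_sum = mid_x = mid_y = 0.0
    for (x1, y1), (x2, y2) in lines:
        angle = math.atan2(y2 - y1, x2 - x1)
        # Average directions through their sine and cosine components so that
        # angles near 0 and 360 degrees do not cancel each other out.
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
        length_sum += math.hypot(x2 - x1, y2 - y1)
        mid_x += (x1 + x2) / 2
        mid_y += (y1 + y2) / 2
    n = len(lines)
    mean_angle = math.degrees(math.atan2(sin_sum, cos_sum)) % 360
    return mean_angle, length_sum / n, (mid_x / n, mid_y / n)

# Hypothetical segments pointing east, north, and northeast from the origin.
lines = [((0, 0), (1, 0)), ((0, 0), (0, 1)), ((0, 0), (1, 1))]
mean_angle, mean_length, mean_midpoint = linear_directional_mean(lines)
```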

The `Standard Distance` and `Directional Distribution` tools are similar in that they both measure the degree to which features are concentrated or dispersed around the geometric center. However, the `Directional Distribution` tool, also known as the `Standard Deviational Ellipse`, is the more informative of the two because it also provides a measure of directionality in the dataset.
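The difference between the two measures can be sketched as follows. This is a simplified illustration with invented function names and data: the standard distance is computed as the root mean squared distance from the mean center, and the orientation is taken from the principal axis of the coordinate covariance, expressed in degrees counterclockwise from the positive x-axis. ArcGIS may compute and report the ellipse rotation using a different convention.

```python
import math

def _mean_center(points):
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n

def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    mx, my = _mean_center(points)
    return math.sqrt(
        sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points) / len(points)
    )

def dispersion_orientation(points):
    """Angle of the main axis of dispersion, in degrees within [0, 180)."""
    n = len(points)
    mx, my = _mean_center(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Principal-axis angle of the 2x2 covariance matrix.
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180

# Hypothetical samples: one spread along a diagonal, one spread east-west.
diagonal = [(0, 0), (1, 1), (2, 2), (3, 3)]
east_west = [(0, 0), (4, 0), (8, 0), (4, 1), (4, -1)]
spread = standard_distance(diagonal)
diagonal_angle = dispersion_orientation(diagonal)
east_west_angle = dispersion_orientation(east_west)
```

The standard distance summarizes dispersion as a single radius, while the orientation shows which way the points are stretched, which is the extra information the ellipse conveys.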

### The Analyzing Patterns toolset

The **Analyzing Patterns** toolset in the `Spatial Statistics Tools` toolbox contains a series of tools that help evaluate whether features or the values associated with features form a clustered, dispersed, or random spatial pattern. These tools generate a single result for the entire dataset in question. In addition, the result does not take the form of a map, but rather statistical output, as shown in the following screenshot:

Tools in this category generate inferential statistics, which indicate how confident we can be that a pattern is either dispersed or clustered. Let's examine the following tools found in the **Analyzing Patterns** toolset:

- `Average Nearest Neighbor`: This tool calculates the nearest neighbor index based on the average distance from each feature to its nearest neighboring feature. For each feature in a dataset, the distance to its nearest neighbor is computed, and these distances are averaged. The observed average distance is then compared to the expected average distance to produce the ANN ratio, which in simple terms is observed/expected. If the ratio is less than 1, the data exhibits a clustered pattern, whereas a value greater than 1 indicates a dispersed pattern.

- `Spatial Autocorrelation (Morans I)`: This tool measures spatial autocorrelation by simultaneously measuring feature locations and attribute values. If features that are close together have similar values, the pattern is said to be clustered. However, if features that are close together have dissimilar values, they form a dispersed pattern. This tool outputs a Moran's I index value along with a z-score and a p-value.

- `Incremental Spatial Autocorrelation`: This tool is similar to the previous tool, but it measures spatial autocorrelation for a series of distances and can create an optional line graph of those distances along with their corresponding z-scores. This tool is similar to the newer `Optimized Hot Spot` tool and isn't used as frequently anymore as a result. It is often used as a distance aid for other tools such as `Hot Spot Analysis` or `Point Density`.

- `High/Low Clustering (Getis-Ord General G)`: This tool looks for high value clusters and low value clusters. It is used to measure the concentration of high or low values for a given study area and returns the Observed General G, Expected General G, z-score, and p-value. It is most appropriate when there is a fairly even distribution of values.

- `Multi-Distance Spatial Cluster Analysis (Ripleys K Function)`: This tool determines whether feature locations show significant clustering or dispersion. However, unlike the other spatial pattern tools that we've examined in this section, it does not take the value at a location into account. It only determines clustering by the location of the features. This tool is often used in fields such as environmental studies, health care, and crime analysis, where you are attempting to determine whether one feature attracts another feature.
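Two of the statistics described in this list can be sketched in a few lines of Python. This is a simplified illustration, not the ArcGIS implementation: the function names and sample data are invented, the expected nearest neighbor distance assumes a random pattern over a study area whose size you supply, Moran's I uses simple binary weights (1 for features within a distance band, 0 otherwise), and neither function computes the z-score or p-value that the tools report.

```python
import math

def average_nearest_neighbor_ratio(points, area):
    """Observed mean nearest-neighbor distance divided by the expected one."""
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)  # expectation for a random pattern
    return observed / expected

def morans_i(points, values, band):
    """Global Moran's I with binary weights for features within `band`."""
    n = len(points)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = cross = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= band:
                weight_sum += 1
                cross += dev[i] * dev[j]
    return (n / weight_sum) * (cross / sum(d * d for d in dev))

# Hypothetical point patterns inside a 100 x 100 study area.
clustered = [(10, 10), (10, 11), (11, 10), (11, 11)]
dispersed = [(0, 0), (0, 100), (100, 0), (100, 100)]
ratio_clustered = average_nearest_neighbor_ratio(clustered, area=100 * 100)
ratio_dispersed = average_nearest_neighbor_ratio(dispersed, area=100 * 100)

# Hypothetical values along a row of six evenly spaced features.
row = [(float(i), 0.0) for i in range(6)]
i_similar = morans_i(row, [1, 1, 1, 5, 5, 5], band=1.0)      # like values adjacent
i_alternating = morans_i(row, [1, 5, 1, 5, 1, 5], band=1.0)  # unlike values adjacent
```

A ratio below 1 corresponds to the clustered sample and a ratio above 1 to the dispersed one. Likewise, Moran's I is positive when neighboring values are similar and negative when they alternate.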

### The Mapping Clusters toolset

The **Mapping Clusters** toolset is probably the most well-known and commonly used toolset in the `Spatial Statistics Tools` toolbox, and for a good reason. The output from these tools is highly visual and beneficial in the analysis of clustering phenomena. There are many examples of clustering: housing, businesses, trees, crimes, and many others. The degree of this clustering is also important. The tools in the `Mapping Clusters` toolset don't just answer the question *Is there clustering?*, but they also take on the question of *Where is the clustering?*

Tools in the **Mapping Clusters** toolset are among the most commonly used in the `Spatial Statistics Tools` toolbox:

- `Hot Spot Analysis`: This tool is probably the most popular tool in the Spatial Statistics Tools toolbox. Given a set of weighted features, it identifies statistically significant hot and cold spots using the `Getis-Ord Gi*` statistic, as shown in the output of real estate sales activity in the following screenshot:

- `Similarity Search`: This tool is used to identify candidate features that are most similar or most dissimilar to one or more input features, based on feature attributes. Dissimilarity searches can be just as important as similarity searches. For example, a community development organization, in its attempts to attract new businesses, might show that its city is dissimilar to competing cities when comparing crime.

- `Grouping Analysis`: This tool groups features based on feature attributes, as well as optional spatial/temporal constraints. The output is a set of distinct groups in which the features within each group are as similar as possible and the groups themselves are as dissimilar as possible. An example is displayed in the following screenshot. The tool is capable of multivariate analysis, and the output is a map and a report. The output map can have either contiguous or non-contiguous groups:

- `Cluster and Outlier Analysis`: The final tool in the `Mapping Clusters` toolset is the `Cluster and Outlier Analysis` tool. This tool, in addition to performing hot spot analysis, identifies outliers in your data. Outliers are extremely relevant to many types of analyses. The tool starts by separating features and neighborhoods from the study area. Each feature is examined against every other feature to see whether it is significantly different from the other features. Likewise, each neighborhood is examined in relationship to all other neighborhoods to see whether it is statistically different from other neighborhoods. An example of the output from the `Cluster and Outlier Analysis` tool is provided in the following screenshot:
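As a rough illustration of how hot and cold spots are scored, the sketch below computes a Getis-Ord Gi* z-score for each feature. It is a simplified version under stated assumptions: the function name and sample data are invented, weights are binary (1 for every feature within a fixed distance band, including the feature itself), and no significance correction is applied. The ArcGIS tool offers other ways to define spatial relationships.

```python
import math

def getis_ord_gi_star(points, values, band):
    """Return one Gi* z-score per feature, using binary weights within `band`."""
    n = len(points)
    mean = sum(values) / n
    variance = max(sum(v * v for v in values) / n - mean ** 2, 0.0)
    s = math.sqrt(variance)
    scores = []
    for p in points:
        # Gi* includes the feature itself in its own neighborhood.
        neighbors = [v for q, v in zip(points, values) if math.dist(p, q) <= band]
        w_sum = len(neighbors)  # with binary weights, sum(w) equals sum(w^2)
        numerator = sum(neighbors) - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        # A zero denominator means no variation or an all-inclusive neighborhood.
        scores.append(numerator / denominator if denominator else 0.0)
    return scores

# Hypothetical row of features: high values on the left, low values on the right.
points = [(float(i), 0.0) for i in range(8)]
values = [9, 9, 9, 1, 1, 1, 1, 1]
z_scores = getis_ord_gi_star(points, values, band=1.0)
```

Large positive scores mark features surrounded by high values (hot spots), and negative scores mark features surrounded by low values (cold spots).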

### The Modeling Spatial Relationships toolset

The **Modeling Spatial Relationships** toolset contains a number of regression analysis tools that help you examine and/or quantify the relationships between features. They help measure how features in a dataset relate to each other in space.

The regression tools provided in the **Spatial Statistics Tools** toolbox model relationships among data variables associated with geographic features, allowing you to make predictions for unknown values or to better understand key factors influencing a variable you are trying to model. Regression methods allow you to verify relationships and to measure how strong those relationships are. The `Exploratory Regression` tool allows you to examine a large number of `Ordinary Least Squares` models quickly, summarize variable relationships, and determine whether any combination of candidate explanatory variables satisfies all of the requirements of the OLS method.

There are two regression analysis tools in ArcGIS, which are as follows:

- `Ordinary Least Squares`: This tool is a linear regression tool used to generate predictions or model a dependent variable in terms of its relationships to a set of explanatory variables. OLS is the best-known regression technique and provides a good starting point for spatial regression analysis. This tool provides a global model of a variable or process you are trying to understand or predict. The result is a single regression equation that depicts a positive or negative linear relationship. The following screenshot depicts partial output from the OLS tool:

- `Geographically Weighted Regression`: `Geographically Weighted Regression`, or GWR, is a local form of linear regression for modeling spatially varying relationships. Note that this tool does require an Advanced ArcGIS license. GWR constructs a separate equation for each feature and is most appropriate when you have several hundred features. GWR creates an output feature class (shown in the following screenshot) and a table. The output table contains a summary of the tool execution. When running GWR, you should use the same explanatory variables that you specified in your OLS model:
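The "single regression equation" produced by a global OLS model can be illustrated with one explanatory variable. This is a minimal sketch with an invented function name and made-up data. The ArcGIS tool handles multiple explanatory variables and reports many more diagnostics than the slope, intercept, and R-squared shown here.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 compares the unexplained variation with the total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical explanatory variable and dependent variable.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = ols_fit(x, y)
```

A positive slope corresponds to a positive linear relationship. GWR can be thought of as fitting a separate equation of this kind around each feature, using only nearby features.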

The **Modeling Spatial Relationships** toolset also includes the `Exploratory Regression` tool.

- `Exploratory Regression`: This tool can be used to evaluate combinations of explanatory variables to find the OLS models that best explain the dependent variable. This data-mining tool does much of the work of finding well-suited variables for you and can save a lot of time in identifying the right combination. The results of this tool are written to the progress dialog, the result window, and an optional report file. An example of the output from the `Exploratory Regression` tool can be seen in the following screenshot:
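The core idea of trying many variable combinations can be sketched as below. This is only an illustration: the function names, variable names, and data are hypothetical, models are ranked by adjusted R-squared alone, and the small linear solver stands in for a proper statistics library. The ArcGIS tool applies additional checks on each candidate model that are not reproduced here.

```python
from itertools import combinations

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def adjusted_r2(columns, y):
    """Fit OLS with an intercept on `columns`; return adjusted R-squared."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [c[i] for c in columns] for i in range(n)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)] for p in range(k + 1)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)

def explore(candidates, y, max_vars=2):
    """Score every combination of up to `max_vars` variables, best first."""
    results = []
    for size in range(1, max_vars + 1):
        for names in combinations(sorted(candidates), size):
            score = adjusted_r2([candidates[name] for name in names], y)
            results.append((score, names))
    return sorted(results, reverse=True)

# Hypothetical data: y depends on income and distance, but not on noise.
candidates = {
    "income": [1, 2, 3, 4, 5, 6, 7, 8],
    "distance": [5, 3, 6, 2, 7, 1, 8, 4],
    "noise": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [-2.9, 0.9, 0.05, 5.95, 3.1, 10.9, 6.05, 11.95]
ranked = explore(candidates, y)
best_score, best_names = ranked[0]
```

With three candidates and at most two variables per model, the sketch evaluates six models and ranks the income and distance pair first.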