Cluster Analysis
Identifying incident clusters for a fire station and income clusters for the economic development office in Dallas County
Problem and Objective
This assignment covered two separate scenarios. In the first, a fire chief requested a map, to be presented at the next city council meeting, showing hot and cold spots for incident reports responded to during January 2015. In the second, the Dallas County Economic Development Office requested a map showing hot and cold spots for median household income so that it can better direct its job creation and fundraising efforts.
Analysis Procedures
To address these requests, I used Esri's ArcGIS Pro 3.1 and tools from the Mapping Clusters toolset of the Spatial Statistics toolbox, including the Cluster and Outlier Analysis and Hot Spot Analysis tools. The data for the first exercise included the January 2015 incident reports for the 2nd Battalion Fire Department and median household income data for the area. The second exercise used median household income for Dallas County. This data was provided by our GIS 520 class and came from the “GIS Tutorial 2 – Spatial Analyst Workbook” (4th Ed.) by David W. Allen.
After establishing the statistical significance of the patterns using global statistics, I used local statistics to find the areas where clustering occurs. After importing the map provided with the exercises, I ran the Cluster and Outlier Analysis tool, using the FEE value (representing incident urgency) as the weight field. In the first run, spatial relationships were conceptualized using Inverse Distance, with no standardization and 0 permutations. I then changed the symbology so that the cluster types were easier to distinguish on the map. I repeated the process with spatial relationships conceptualized using a fixed distance band of 900 ft.
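To make the difference between the two conceptualizations concrete, the following is a minimal Python sketch of a local Moran's I calculation under inverse distance and fixed distance band weights. It is an illustration only, not the ArcGIS Pro implementation: the coordinates and FEE values are made up, the function names are hypothetical, coincident points are simply given a weight of 0 under inverse distance, and no significance test is performed, so the labels show only the sign pattern behind each cluster type.

```python
import math

def spatial_weight(d, method, band=None):
    """Weight for two features d feet apart under the chosen conceptualization."""
    if method == "inverse_distance":
        # Nearer features count more. Coincident points (d == 0) get weight 0
        # here to avoid dividing by zero; that is a simplification.
        return 1.0 / d if d > 0 else 0.0
    if method == "fixed_distance_band":
        # Every feature inside the band counts equally; the rest are ignored.
        return 1.0 if d <= band else 0.0
    raise ValueError(f"unknown method: {method}")

def local_morans_i(points, values, method, band=None):
    """Return (local I, cluster label) per feature, without significance testing."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    results = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Variance of all other features around the global mean.
        s2 = sum(dev[j] ** 2 for j in others) / (n - 1)
        # Weighted sum of the other features' deviations from the mean.
        lag = sum(
            spatial_weight(math.dist(points[i], points[j]), method, band) * dev[j]
            for j in others
        )
        local_i = dev[i] / s2 * lag if s2 > 0 else 0.0
        if dev[i] == 0 or lag == 0:
            label = "none"
        elif dev[i] > 0:
            label = "HH" if lag > 0 else "HL"  # high value, high or low surroundings
        else:
            label = "LL" if lag < 0 else "LH"  # low value, low or high surroundings
        results.append((local_i, label))
    return results

# Hypothetical incidents: three high-FEE points close together, three low-FEE
# points about 5,000 ft away. Coordinates are in feet.
points = [(0, 0), (300, 0), (0, 400), (5000, 0), (5300, 0), (5000, 400)]
fee = [10, 9, 10, 1, 2, 1]

inverse = local_morans_i(points, fee, "inverse_distance")
banded = local_morans_i(points, fee, "fixed_distance_band", band=900)
```

With inverse distance every other feature contributes something, however small, while the 900 ft band limits each feature's neighborhood to the points in its own group. In data this cleanly separated both choices give the same labels; with real incident data the two can differ.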
For the next exercise, I imported the Dallas County map, which contains a line layer of major roads in Dallas County and a polygon layer of census income data. I then ran the Hot Spot Analysis tool using median household income as the weight field, with the distance band set to 5,280 feet (1 mile).
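The statistic behind the Hot Spot Analysis tool is Getis-Ord Gi*. As a rough illustration of how a fixed distance band turns income values into z-scores, here is a minimal Python sketch. The centroids (spaced 4,000 ft apart on a line), the income values (in thousands of dollars), and the function names are all hypothetical, and the 1.96 cutoff is just the usual two-sided 95% threshold. It is not meant to reproduce the tool's output.

```python
import math

def gi_star(points, values, band):
    """Getis-Ord Gi* z-score per feature, using binary fixed-distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        # Weight 1 inside the band, 0 outside. Feature i is at distance 0 from
        # itself, so it is included in its own neighborhood.
        w = [1.0 if math.dist(points[i], p) <= band else 0.0 for p in points]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # The denominator is 0 when every feature falls inside the band.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores

def classify(z, cutoff=1.96):
    """Label a z-score using an assumed 95% confidence cutoff."""
    if z >= cutoff:
        return "hot spot"
    if z <= -cutoff:
        return "cold spot"
    return "not significant"

# Hypothetical tract centroids (feet) and median household income ($1,000s).
centroids = [(i * 4000, 0) for i in range(10)]
income = [100, 100, 100, 60, 60, 60, 60, 20, 20, 20]

z_scores = gi_star(centroids, income, band=5280)
labels = [classify(z) for z in z_scores]
```

In this toy data the 5,280 ft band reaches only the immediately adjacent centroids, and just one feature in the high-income group and one in the low-income group cross the cutoff. The end features have fewer neighbors inside the band, and the inner ones border mid-income features.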

Results
The first map shows the results of the Cluster and Outlier Analysis tool (Anselin Local Moran’s I) using Inverse Distance.

The second map shows the results of running the Cluster and Outlier Analysis tool with a fixed distance band of 900 ft.

The third map shows the clustering of income levels in Dallas County from the Hot Spot Analysis tool, revealing distinct sections of income levels within the county.

Application & Reflection
One especially useful application of Hot Spot Analysis is finding hot spots for diseases such as influenza and tracking their spread, as in this hypothetical scenario in Cumberland County.
Problem description
An epidemiologist has been tasked with learning more about influenza outbreaks in Cumberland County, NC, and whether population factors affect the outbreak numbers. They want to know whether there are hot spots within the county, to help direct vaccines and aid.
Data needed
A shapefile of Cumberland County, NC; data on influenza outbreaks and their severity; and census data on age and population density.
Analysis procedures
I would first import all of the data into ArcGIS. Next, I would run the Cluster and Outlier Analysis tool on the influenza outbreak data, using influenza severity as the weight field, and visually inspect the results to see where high-severity clusters occur. I would then run the Hot Spot Analysis tool on the age data and on the population density data to create two separate layers. Finally, I would compare the layers to see whether influenza outbreaks cluster in areas with older populations and/or greater population density.
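The final comparison could also be quantified rather than judged by eye alone. As a minimal sketch, assuming all three analyses are run on the same census tracts and that the tract IDs of each layer's significant high-value areas have been exported, the share of influenza clusters that overlap each census layer can be computed as follows. The tract IDs and the function name are hypothetical.

```python
def overlap_share(flu_clusters, census_hot_spots):
    """Share of influenza high-value cluster tracts that are also census hot spots."""
    if not flu_clusters:
        return 0.0
    return len(flu_clusters & census_hot_spots) / len(flu_clusters)

# Hypothetical tract IDs flagged by each analysis.
flu_clusters = {"T1", "T2", "T3", "T4"}
older_age_hot_spots = {"T2", "T3", "T9"}
density_hot_spots = {"T1", "T2", "T3", "T4", "T5"}

age_share = overlap_share(flu_clusters, older_age_hot_spots)
density_share = overlap_share(flu_clusters, density_hot_spots)
```

A high share for one layer and a low share for the other would suggest which population factor lines up more closely with the outbreak clusters, though overlap alone does not establish a causal link.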

