Saturday, November 14, 2015

Arcpy: Anselin Local Moran’s I

Introduction

Global spatial analysis or global spatial autocorrelation analysis yields only one statistic to summarize the study area. In other words, global analysis assumes homogeneity. If that assumption does not hold, then having only one statistic does not make sense as the statistic should offer over space.
But if there is no global autocorrelation or no clustering, we can still find clusters at a local level using local spatial autocorrelation. The fact that Moran's I is a summation of individual crossproducts is exploited by the "local indicators of spatial association" (LISA) to evaluate the clustering in those individual units by calculating Local Moran's I for each spatial unit and evaluating the statistical significance for each Ii, the Local Moran's I statistic of spatial association is given as:
\[{I_i} = \frac{{{x_i} - \overline x }}{{S_i^2}}\sum\limits_{j = 1,j \ne i}^n {{w_{i,j}}\left( {{x_j} - \overline x } \right)} \]
where xi is an value of location i, xbar is the mean of the corresponding x, wij is the spatial weight between i and j, and
\[S_i^2 = \frac{{\sum\limits_{j = 1,j \ne i}^n {{{\left( {{x_j} - \overline x } \right)}^2}} }}{{n - 1}} - {\overline x ^2}\]
with n equating to the total number of pixels.
The zIi-score for the statistics are computed as:
\[{z_{{I_i}}} = \frac{{{I_i} - {\rm{E}}\left[ {{I_i}} \right]}}{{\sqrt {{\rm{V}}\left[ {{I_i}} \right]} }}\]
Where
\[{\rm{E}}\left[ {{I_i}} \right] =  - \frac{{\sum\limits_{j = 1,j \ne i}^n {{w_{ij}}} }}{{n - 1}}\]
\[{\rm{V}}\left[ {{I_i}} \right] = {\rm{E}}\left[ {I_i^2} \right] - E{\left[ {{I_i}} \right]^2}\]
\[{\rm{E}}\left[ {{I^2}} \right] = A - B\]
\[A = \frac{{\left( {n - {b_{{2_i}}}} \right)\sum\limits_{j = 1,j \ne i}^n {w_{i,j}^2} }}{{n - 1}}\]
\[B = \frac{{\left( {2{b_{{2_i}}} - n} \right)\sum\limits_{k = 1,k \ne i}^n {\sum\limits_{h = 1,h \ne i}^n {{w_{i,k}}{w_{i,h}}} } }}{{\left( {n - 1} \right)\left( {n - 2} \right)}}\]
\[{b_{{2_i}}} = \frac{{\sum\limits_{i = 1,i \ne j}^n {{{\left( {{x_i} - \overline x } \right)}^4}} }}{{{{\left( {{{\sum\limits_{i = 1,i \ne j}^n {\left( {{x_i} - \overline x } \right)} }^2}} \right)}^2}}}\]
The p-value for the null hypothesis is given as:
\[p = {\rm{erfc}}\left( {\frac{{\left| {{I_i} - {\rm{E}}\left[ {{I_i}} \right]} \right|}}{{\sqrt {2{\rm{V}}\left[ {I{}_i} \right]} }}} \right)\]
无奈,上面的这些公式计算结果就是不能与ArcGIS计算结果一致,只好放弃Matlab对栅格文件的直接操作,转而调用ArcPy操作。

Example

Fig.1
This tool creates a new Output Feature Class with the following attributes for each feature in the Input Feature Class: Local Moran's I index (LMiIndex), z-score (LMiZScore), p-value (LMiPValue), and cluster/outlier type (COType). The field names of these attributes are also derived tool output values for potential use in custom models and scripts. (GRID_CODE: Original Input Value)
Fig. 2
The z-scores and p-values are measures of statistical significance which tell you whether or not to reject the null hypothesis, feature by feature. In effect, they indicate whether the apparent similarity (a spatial clustering of either high or low values) or dissimilarity (a spatial outlier) is more pronounced than one would expect in a random distribution.
A high positive z-score for a feature indicates that the surrounding features have similar values (either high values or low values). The COType field in the Output Feature Class will be HH for a statistically significant (0.05 level) cluster of high values and LL for a statistically significant (0.05 level) cluster of low values.
A low negative z-score (for example, < -1.96) for a feature indicates a statistically significant (0.05 level) spatial outlier. The COType field in the Output Feature Class will indicate if the feature has a high value and is surrounded by features with low values (HL) or if the feature has a low value and is surrounded by features with high values (LH).
代码运行结果如Fig. 3所示,红框之处的COType显示非空,表示该点通过显著性检验(α=0.05)。
Fig. 3

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
# Created by LI Xu
# Version 1.0
# November 10, 2015

# Description:
# Arcpy implemented Local Moran's I

# If you have any question about this code,
# please do not hesitate to contact me via E-mail: 
# jeremy456@163.com

# Blog:
# http://blog.sciencenet.cn/u/lixujeremy

import arcpy
import os
import datetime
import time

def GetFileListExt(SouDir, FlagStr):
    import glob, os
    FileList=[]
    os.chdir(SouDir)
    filter_str='*'+FlagStr
    #print filter_str
    for filepath in glob.glob(filter_str):
        FileList.append(os.path.join(SouDir, filepath))
    return FileList


##Main
begintime=time.strftime("%Y-%m-%d %H:%M:%S")
print 'Time Begin:'+begintime
starttime = datetime.datetime.now()

#Source Directory
SouDir=r'E:\Tools\localMoranI\features'
#Destination Directory
DesDir=r'E:\Tools\localMoranI\features\output'
#InputField
InField='GRID_CODE'
#Spatial Relationships
SpRelations='INVERSE_DISTANCE'
#Distance_Method
DisMethod='EUCLIDEAN_DISTANCE'

#Retrieve files
FlagStr='.shp'
files=GetFileListExt(SouDir,FlagStr)
for filepath in files:
    filename=filepath.split('\\')
    filename=filename[len(filename)-1]
    filename=filename.split('.')
    otpath=DesDir+'\\'+filename[0]+'_LMI.shp'
    # Set geoprocessor object property to overwrite outputs if they already exist
    arcpy.gp.OverwriteOutput=True
    #Process: Local Moran's I
    clusters=arcpy.ClustersOutliers_stats(filepath,
                                          InField,
                                          otpath,
                                          SpRelations,
                                          DisMethod,
                                          'NONE',
                                          '#', '#')
    print otpath

print "END"
endtime=time.strftime("%Y-%m-%d %H:%M:%S")
print 'Time End:'+endtime
finishtime=datetime.datetime.now()
timespan=(finishtime-starttime).seconds
timespan='%f' %timespan
print 'Time Span:'+timespan+' s'

References






No comments:

Post a Comment