Abstract Owing to the rapid development of Geographic Information Systems (GIS) in recent years, spatial data analysis has received considerable attention and has played an important role in social science. Although many standard statistical techniques are attractive in traditional data analysis, they cannot be applied uncritically to spatial data. Most studies in spatial data analysis fall into one of two branches: the model-driven approach and the data-driven approach. The main aim of this paper is to compare both approaches and to illustrate how spatial effects can be treated within spatial econometric models, assessing the limitations of standard techniques in a spatial context and suggesting alternative methods to deal with this problem. To carry out this task, crime rate data for Columbus (Ohio), taken from a well-known database, have been used.
Keywords Weight matrix * Spatial correlation * Spatial econometrics * Econometric models * Autocorrelation * Kriging estimator
JEL C10 * C21 * C40 * E00
Introduction
The importance of space as a fundamental concept in social science is unquestioned. Since the 1950s, a large number of spatial theories and operational models have been developed, and these have gradually spread into the practice of urban and regional policy and analysis. However, this theoretical contribution has not been matched by a similar advance in the methodology for the econometric analysis of georeferenced data.
Spatial Term Development
"Spatial" means that each item of data has a geographical reference, so we know where each case occurs on a map. Spatial analysis can be defined as a collection of techniques and models that explicitly use the spatial referencing associated with each data value or object specified within the system under study. The main question when spatial effects appear is how such effects can be measured. The term spatial econometrics was coined by Paelinck and Klaassen (1979) to designate a growing body of the regional science literature that dealt primarily with estimation and testing problems encountered in the implementation of multiregional econometric models. On the one hand, the distinction between spatial econometrics and standard econometrics is easy and essential. If activities such as estimating spatial interaction models, statistically analyzing urban density functions and empirically implementing regional econometric models are carried out with standard econometrics, the study of the models in question will tend to ignore specific spatial aspects. On the other hand, the distinction between spatial econometrics and spatial statistics is less straightforward, and methods tend to be categorized as belonging to one field or the other depending on the personal preference of the researcher. One possible categorization can be extracted from Haining (1986) or Anselin (1988). They refer to the data-driven orientation in spatial statistics (1) and to the model-driven approach in spatial econometrics (2). Moreover, spatial econometrics typically deals with models related to regional and urban economics, whereas a substantial body of the spatial statistics literature is primarily focused on physical phenomena in biology and geology.
This paper first addresses the model-driven approach (reviewing the definition of the spatial weight matrix). Secondly, the data-driven orientation (kriging methodology) is studied. Thirdly, an application to the crime rate in Columbus (Ohio) is presented and, finally, concluding remarks are given.
The Traditional Econometric Approach
As Anselin (1988) pointed out, spatial effects are the essential reason for the existence of a separate field of spatial econometrics. Spatial effects can be divided into two general groups: spatial dependence or spatial autocorrelation and spatial heterogeneity. While spatial dependence can be considered as the existence of a functional relationship between what happens at one point in space and what happens elsewhere, spatial heterogeneity implies that functional forms and parameters vary by location and are not homogeneous throughout the data set.
The paper focuses on spatial dependence. Because dependence in space is multidirectional, it requires a different methodological framework from the one used for one-directional dependence in time. It is well known that the consequence of spatial dependence and heterogeneity is that the observations contain less information than if they had been independent. The statistical properties of estimators and hypothesis tests in standard econometric approaches will not hold in the presence of such spatial dependencies. In order to obtain approximately the same degree of information as in an independent set of observations, information on the structure of the spatial pattern in question will be required.
The main characteristic of spatial econometrics is the way in which spatial effects (in our case, spatial autocorrelation) are taken into account. Typically, the use of a spatial weight matrix makes it possible for spatial models to be applied to many empirical contexts, provided that spatial dependence is properly expressed in terms of weights and that spatial heterogeneity is accounted for in the specification of the model. The most important taxonomies for spatial regression models have been provided by Anselin (1988, 2001) and Florax and Folmer (1992).
Specification of Models with Spatial Effects
The simple regression model (SRM) and the multiple linear regression model (MLRM) have been widely used in the valuation of spatial variables, but they have not usually taken the location of those variables into account. The introduction of microlocalization factors and spatial dependence within regression models implies the definition of the concept of vicinity and the well-known matrix W of vicinities. The ultimate objective of using W in the specification of spatial econometric models is to relate a variable at one point in space to the observations for that variable in other spatial units in the system.
A general spatial weight matrix can be defined by a symmetric binary contiguity matrix, which can be generated from the topological information given by GIS based on either adjacency or distance criteria. According to adjacency criteria, the element $w_{ij}$ of the spatial weight matrix is one if location i is adjacent to location j, and zero otherwise. According to distance criteria, the element $w_{ij}$ is one if the separation between locations i and j is within a given distance d, and zero otherwise. For ease of interpretation, the weight matrix is usually defined in row-standardized form, in which the elements of each row sum to one (see Chasco 2003). The most commonly used weights are displayed in Table 1.
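The distance criterion and the row standardization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinates are hypothetical, the distance band is taken as inclusive (distance <= d), and a location is not counted as its own neighbour. None of these choices is specified in the text, and the function names are illustrative.

```python
import numpy as np

def distance_weights(coords, d):
    """Binary distance-band weights: w_ij = 1 if i != j and dist(i, j) <= d, else 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = (dist <= d).astype(float)
    np.fill_diagonal(w, 0.0)  # assumption: a location is not its own neighbour
    return w

def row_standardize(w):
    """Divide each row by its sum so rows sum to one; rows without neighbours stay zero."""
    w = np.asarray(w, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Hypothetical coordinates: four locations equally spaced on a line.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
W_binary = distance_weights(coords, d=1.0)
W = row_standardize(W_binary)
```

Note that the binary matrix is symmetric, whereas the row-standardized matrix generally is not, because locations with different numbers of neighbours are divided by different row sums.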
There are two basic alternatives for introducing W in the model: including WY and/or WX as explanatory variables (substantive dependence), or including a spatial autoregressive disturbance. The two alternatives can also be combined. They lead to the following models:
1) The first-order spatial autoregressive model, SAR(1) (Besag 1974), consists of a linear relationship between the conditional expectation of the dependent variable and its values in the rest of the system. It can be expressed in the following form:
$$ y = \rho W y + u, \quad u \sim N(0, \sigma^2 I) \qquad (1) $$
in which y is expressed in deviations from the mean, $\rho$ is the autoregressive parameter, W denotes a spatial weight matrix, u is a vector of independent, normally distributed errors with zero expectation and variance $\sigma^2$, and I is the identity matrix.
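As a minimal sketch of how model (1) generates data, the equation can be solved for y, giving y = (I - rho W)^{-1} u, provided that I - rho W is invertible. For a row-standardized W this holds when |rho| < 1; that condition is an assumption added here, not one stated in the text. The weight matrix, the parameter value and the function name below are hypothetical.

```python
import numpy as np

def simulate_sar1(W, rho, sigma=1.0, seed=0):
    """Draw y from y = rho*W*y + u with u ~ N(0, sigma^2 I), using y = (I - rho*W)^{-1} u."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma, size=n)
    # Solve the linear system (I - rho*W) y = u rather than forming the inverse.
    y = np.linalg.solve(np.eye(n) - rho * W, u)
    return y, u

# Hypothetical row-standardized weights for four locations on a line.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
y, u = simulate_sar1(W, rho=0.4)
```

Every element of y depends on every element of u through the inverse, which is one way to see the multidirectional nature of spatial dependence.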
2) The spatial lag model (SLM) is a general spatial autoregressive model in which the explanatory variables include a spatial lag of the dependent variable as well as a set of exogenous variables, for example, the homicide rate model of Baller et al. (2001). It can be expressed as:
$$ y = \rho W y + X\beta + u, \quad u \sim N(0, \sigma^2 I) \qquad (2) $$
where X is a matrix of explanatory variables and $\beta$ is the associated vector of coefficients. The SLM is a suitable model when spatial dependence is significant and persists regardless of the explanatory variables added, implying the presence of substantive spatial dependence. This means that the observed value of a variable at each location is jointly determined by the values at other locations, which makes the independence assumption unreliable.
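The joint determination in the spatial lag model can be made explicit by solving (2) for y. The short derivation below is a sketch that assumes the matrix I - rho W is invertible and that X is uncorrelated with u; neither assumption is stated in the text.

```latex
\begin{aligned}
y &= (I - \rho W)^{-1} (X\beta + u), \\
Wy &= W (I - \rho W)^{-1} (X\beta + u), \\
E\left[(Wy)\,u'\right] &= W (I - \rho W)^{-1} E[u u'] = \sigma^2\, W (I - \rho W)^{-1}.
\end{aligned}
```

Since this matrix is generally non-zero, the spatial lag Wy is correlated with the error term, so it cannot be treated like an ordinary exogenous regressor.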
3) The spatial autoregressive model with spatial error dependence consists of a linear relationship between the conditional expectation of the dependent variable and its values in the rest of the system, with spatially dependent error terms. It can be written as:
$$ y = \rho W_1 y + u, \quad u = \lambda W_2 u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \qquad (3) $$
where $W_1$ and $W_2$ denote spatial weight matrices, $\lambda$ is the coefficient in the spatial autoregressive structure for the disturbance u, and $\varepsilon$ is a vector of independent, normally distributed errors.
4) In the linear regression model with a spatial autoregressive disturbance, or spatial error model (SEM), the explanatory variables are all exogenous, but the error term follows a spatial autoregressive process, for example, the cancer mortality rate model of Haining (1995). It is usually expressed as:
$$ y = X\beta + u, \quad u = \lambda W u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \qquad (4) $$
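To illustrate what the spatial autoregressive disturbance in (4) implies, note that u = (I - lambda W)^{-1} epsilon, so the covariance matrix of u is sigma^2 [(I - lambda W)'(I - lambda W)]^{-1} whenever I - lambda W is invertible. The following sketch computes this matrix for a hypothetical weight matrix and a hypothetical value of lambda; the function name is illustrative and not taken from the text.

```python
import numpy as np

def sem_error_covariance(W, lam, sigma2=1.0):
    """Covariance of u when u = lam*W*u + eps and eps ~ N(0, sigma2*I).

    Since u = A^{-1} eps with A = I - lam*W, Var(u) = sigma2 * (A' A)^{-1}.
    """
    W = np.asarray(W, dtype=float)
    A = np.eye(W.shape[0]) - lam * W
    return sigma2 * np.linalg.inv(A.T @ A)

# Hypothetical row-standardized weights for four locations on a line.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
omega = sem_error_covariance(W, lam=0.5)      # spatially correlated errors
omega_iid = sem_error_covariance(W, lam=0.0)  # reduces to sigma2 * I
```

With lambda equal to zero the covariance matrix reduces to the identity times sigma^2, which is the standard regression case. With a non-zero lambda the off-diagonal elements are non-zero, so the errors at different locations are correlated.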
5) By combining the SLM and the spatial autoregressive model with spatial error dependence and adding one or more spatially lagged exogenous variables, the mixed-regressive-spatial autoregressive model with a spatial autoregressive disturbance or general spatial regression model (GSRM) is obtained:



