[Openspace] Spatial Poisson Regression

Julia Koschinsky koschins at uiuc.edu
Sat Aug 7 16:12:16 CDT 2004


Mark,

For references on the subject, you might be interested in:

Sudipto Banerjee, Bradley P. Carlin, Alan E. Gelfand (2004).
Hierarchical Modeling and Analysis for Spatial Data.
Chapman & Hall/CRC.

Re. your #2 assumption: Your coefficients will be biased.

Julia


---- Original message ----
>Date: Sun, 18 Jul 2004 11:25:08 -0400
>From: "Burkey" <mburkey at triad.rr.com>  
>Subject: [Openspace] Spatial Poisson Regression  
>To: <openspace at agec221.agecon.uiuc.edu>
>
>
>I am aware that at the present time correctly estimating a model with a
>count dependent variable is not possible.  I want to get some feedback on
>a process I used, to ensure that my assumptions are correct and that the
>process and conclusions seem reasonable.  I appreciate all feedback!
>
>Basic setup: Descriptive regression on the number of locations of various
>types of businesses using approximately 800 Zip Code Tabulation Areas.
>Explanatory variables such as population, race, income, etc. are used.
>
>1) I estimated a Poisson model (MLE with log link) without including lagged
>values of Y. ASSUMPTION: If the true value of the spatial coefficient p in
>pWy is nonzero, this could cause omitted variable bias in the B's of the
>included variables.
>2) To check for bias, I included Wy (calculated using GeoDa) as an
>explanatory variable in a Poisson model (estimated outside of GeoDa).
>ASSUMPTION: By doing this I think my coefficient estimates will be unbiased,
>but the standard errors will not be computed correctly.  Is this correct?
>3) When I compared the results from #1 and #2 above, the coefficients were
>very similar.  Therefore I concluded that though there is likely spatial
>autocorrelation, omitting the lagged values did not appear to significantly
>bias the coefficients.
>4) As a final check, I followed a suggestion from Cameron and Trivedi.  They
>suggest that if you must estimate a count data model that can't be estimated
>properly with existing software, you can run a log-linear model with the
>ad-hoc solution of converting all y to (y+1) or converting all zeros to 0.5.
>So, I ran several variants of this model in GeoDa.  The Spatial Lag model
>seemed the best specification, and once again, the signs, size, and
>significance of the coefficients on the major explanatory variables were in
>the same ballpark as with the other models.
>
>In a paper on this work, I included items 1, 2, and 3 above, but omitted
>discussion of item 4.
>
>Are my assumptions, conclusions, and process above reasonable?  What better
>suggestions are there for working with spatial count data?  Any references
>on the subject?
>
>Thank you.  If anyone would like to see the paper, feel free to email me
>directly.
>
>Mark L. Burkey
>burkeym at ncat.edu
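
For readers who want to try the comparison described in items 1 to 4 of
the original message, here is a minimal sketch in Python with NumPy.
Everything in it is an illustrative assumption and not the data or
software from the thread: the ring-shaped weights matrix, the simulated
covariate x, the spatially correlated latent effect v, the "true"
coefficients 0.5 and 0.3, and the helper names ring_weights, fit_poisson
and compare_fits are all hypothetical. It fits a Poisson model without
Wy, a Poisson model with Wy added as a regressor, and the ad-hoc
log(y+1) linear model, so the coefficient on x can be compared across
the three. It only reproduces the mechanics of the comparison; it does
not show that any of the three estimators is unbiased.

```python
import numpy as np


def ring_weights(n):
    """Row-standardized weights for a hypothetical ring of n areas
    (each area's neighbors are the areas just before and after it)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 1.0
    W[idx, (idx + 1) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def fit_poisson(X, y, n_iter=100, tol=1e-10):
    """Poisson MLE with log link by Newton-Raphson.
    Assumes the first column of X is the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())           # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)           # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])   # Fisher information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def compare_fits(n=800, seed=0):
    """Simulate counts (assumed data-generating process) and fit the
    three models; returns a dict of coefficient vectors."""
    rng = np.random.default_rng(seed)
    W = ring_weights(n)
    x = rng.normal(size=n)
    e = rng.normal(size=n)
    v = 0.4 * (e + W @ e)                # spatially correlated latent effect
    y = rng.poisson(np.exp(0.5 + 0.3 * x + v)).astype(float)
    Wy = W @ y                           # spatial lag of the counts

    ones = np.ones(n)
    X1 = np.column_stack([ones, x])      # item 1: no lagged counts
    X2 = np.column_stack([ones, x, Wy])  # item 2: Wy as a regressor
    return {
        "poisson": fit_poisson(X1, y),
        "poisson_with_lag": fit_poisson(X2, y),
        # item 4: ad-hoc log-linear model on log(y + 1), fitted by OLS
        "log_linear": np.linalg.lstsq(X1, np.log(y + 1.0), rcond=None)[0],
    }


if __name__ == "__main__":
    for name, coef in compare_fits().items():
        print(name, np.round(coef, 3))
```

Each coefficient vector is ordered as intercept, x, and (where included)
Wy. The log-linear coefficients are on a different scale from the
Poisson ones, so only sign and rough magnitude are comparable there.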

