Protocol - Tobacco Retailer Density/Proximity - Administrative Neighborhoods

Protocol Name from Source:

N/A, see source.


Publicly available


The Duncan et al. protocol is used when a more specific residential location is unknown.


Administrative Neighborhoods (when residence address is unavailable)

  1. For tobacco retailer density calculations, administrative neighborhoods may be as large as counties or as small as census block groups. Respondents may provide their county or ZIP code, or the cross streets nearest to their residential address, so that geolocation data can be obtained for the tract or block group.
  2. The Duncan protocol obtained address data for tobacco retailers from a state agency (cigarette and tobacco excise unit of the Department of Revenue). To compute density, defined in this protocol as retailers per square kilometer, ArcGIS was used to count the tobacco retailers in each participant’s census block group, and the count was divided by the block group’s land area. The protocol compared this density measure with computations for census tract as well as for 400-meter and 800-meter egocentric buffers.
  3. To calculate proximity to tobacco retailers when residence address is unknown, the Duncan protocol used the census areas’ internal points, which are calculated by the US Census Bureau (http://www.census.gov/geo/www/2010census/gtc/gtc_area_attr.html). Usually, the internal point is at or near the geographical center of the unit. However, for some nonconvex geographical units, the calculated geographical center may be located outside the boundaries of the unit. In this circumstance, the internal point is identified as the point inside the entity boundaries nearest to the calculated geographical center. For simplicity, the internal points are referred to as centroids, which they are in many cases.
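
The proximity step can be illustrated with a minimal sketch. It assumes that proximity is taken as the straight-line (great-circle) distance from a unit's internal point to the closest retailer; the source does not state which distance metric was used for internal points, so the haversine formula here is an assumption, and all coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_retailer_m(internal_point, retailers):
    """Distance from a census unit's internal point to the closest retailer."""
    lat, lon = internal_point
    return min(haversine_m(lat, lon, r_lat, r_lon) for r_lat, r_lon in retailers)

# Hypothetical coordinates for illustration only (not from the protocol).
block_group_internal_point = (42.3601, -71.0589)
retailers = [(42.3610, -71.0570), (42.3550, -71.0650), (42.3700, -71.0400)]

distance_m = nearest_retailer_m(block_group_internal_point, retailers)
print(f"Nearest retailer: {distance_m:.0f} m from the internal point")
```

A GIS package would normally read the internal point from the Census Bureau's geographic files rather than from hard-coded values.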

Personnel and Training Required

Personnel must have GIS expertise as a result of training or education (e.g., GIS Specialist).

Personnel must also have knowledge of census data products and websites, such as American FactFinder (http://factfinder.census.gov), and/or commercial geospatial data products.

After extracting the necessary data, statistical methods are used (e.g., principal component analysis (PCA) and factor analysis).

Equipment Needs

Geospatial Data Products


Requirement Category                                                  Required
Average time of greater than 15 minutes in an unaffected individual   Yes
Major equipment                                                       No
Specialized requirements for biospecimen collection                   No
Specialized training                                                  No

Mode of Administration


Life Stage:

Infant, Toddler, Child, Adolescent, Adult, Senior, All Ages, Pregnancy

Specific Instructions:

Collectively, this measure for tobacco retailer density/proximity includes the following components:

  1. Valid data sources providing the location of tobacco product retailers are required.
  2. Density requires a definition of "neighborhood" or other spatial unit that is then linked with census-based information about neighborhood characteristics, such as land area, roadway miles, or population size. Density is typically a ratio measure (retailer count divided by neighborhood attribute), and the "correct" choice for the denominator depends on the research question. The protocol computes density for each neighborhood by dividing the count of retailers by the land area (e.g., retailers per km²). Alternatively, one could compute density by dividing the count of retailers by the population in a spatial unit (e.g., retailers per 1,000 people).
  3. Proximity of tobacco retailers may be computed for residences, schools, or other locations (e.g., distance between two retailers). The Duncan et al. protocol measures the distance from a known residence to the nearest tobacco retailer in roadway miles and also contains a discussion about data confidentiality.
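
As a minimal sketch of the density ratio described above, the example below divides a retailer count by two different denominators (land area and population). The neighborhood values and the `density` helper are hypothetical and serve only to show how the choice of denominator changes the measure.

```python
def density(retailer_count, denominator, per=1.0):
    """Retailers per `per` units of the denominator (e.g., per km2 or per 1,000 people)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return retailer_count / denominator * per

# Hypothetical neighborhood attributes for illustration only.
neighborhood = {"retailers": 6, "land_area_km2": 2.5, "population": 4000}

# Area-based density, as computed in the protocol (retailers per square kilometer).
per_km2 = density(neighborhood["retailers"], neighborhood["land_area_km2"])

# Population-based alternative (retailers per 1,000 people).
per_1000 = density(neighborhood["retailers"], neighborhood["population"], per=1000)

print(f"{per_km2:.2f} retailers per km2; {per_1000:.2f} retailers per 1,000 people")
```

The same count yields different rankings of neighborhoods depending on the denominator, which is why the denominator should follow from the research question.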

If a current address (see PhenX Demographics - Current Address) has been collected for a study respondent, geocoding makes it possible to link the participant's address to a measure of tobacco retailer proximity (distance to the nearest retailer) and to a measure of density for the neighborhood, however it is defined.

For any density/proximity measure, the WG suggests using GIS software, such as ESRI ArcGIS version 10.1 (ESRI, Redlands, CA). Investigators without such software or expertise may employ a third-party vendor to compute these measures for a nominal cost. Multiple steps are required:

  • Obtain address data for licensed or likely tobacco retailers: Where there are state or local tobacco retailer licensing requirements, the investigator may obtain retailer addresses from the appropriate licensing authority. When licensing is not required or licensing data are unavailable to researchers, address lists for likely tobacco retailers may be obtained from commercial vendors (e.g., Dun & Bradstreet), along with some determination of whether or not they sell tobacco products, or investigators may use on-the-ground assessments to identify tobacco retailers in communities.
  • Geocode the latitudes and longitudes of addresses for tobacco retailers and participants’ residences (and/or schools and workplaces). Mapping rates of 90% or greater are typical, but the mapping rate depends on the individual data set, and one would expect lower rates in rural areas. When geocoding residential address data, a random shift may be employed to avoid incidental disclosure in shared data.
  • Define neighborhood: Egocentric neighborhoods (also referred to as "egocentric buffers" and "egohoods") are defined by a radius around a particular location, such as a residence, and these definitions are preferred by the Duncan et al. protocol. Network-based data better capture the travel distance necessary to obtain tobacco products from the retailers nearest to participants’ residences. The appropriate distance (400 m, 500 m, 800 m, 1 km) depends on the research question. Street-network buffers excluding highways and ramps are created by using software such as ESRI’s ArcGIS 10 Buffer tool, ArcGIS 10 Data and Maps, and ArcGIS Network Analysis Extension. According to the Duncan et al. protocol, when residential address data are unavailable, alternative definitions of neighborhood are administrative units, such as census block group, tract, zip code tabulation area, city, or county.
  • Extract census data to characterize each neighborhood: Use data from decennial census or intercensal estimates to compute the land area (or other attribute, such as roadway miles, population size). When buffers overlap multiple tracts, buffer characteristics are weighted in proportion to tract area inside the buffer.
  • Compute density: Use software (such as the ArcGIS Spatial Join tool) or a third-party vendor to calculate the count of tobacco retailers in each neighborhood, and compute retailer density by dividing the count of retailers by the attribute of interest (e.g., acres, roadway miles, or population size).
  • Compute proximity: Use the ArcGIS Closest Facility tool (or a comparable tool in alternate software) to determine the distance between two points, such as the roadway distance from each residential address to the nearest tobacco retailer.
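
Two of the steps above can be sketched in a few lines: the random shift applied to geocoded residences, and the area weighting of tract characteristics for a buffer that overlaps several tracts. Both functions are illustrative assumptions rather than the protocol's implementation. The source does not specify the size or distribution of the random shift, so the uniform bearing and distance are hypothetical, and the area weighting is interpreted here as a weighted mean whose weights are the tract areas inside the buffer (as would be produced by a GIS overlay).

```python
import math
import random

METERS_PER_DEG_LAT = 111_320  # approximate meters per degree of latitude

def random_shift(lat, lon, max_shift_m, rng):
    """Displace a point by a random bearing and a random distance up to max_shift_m.

    Uses a flat-earth approximation that is adequate for shifts of a few hundred meters.
    """
    bearing = rng.uniform(0, 2 * math.pi)
    dist = rng.uniform(0, max_shift_m)
    dlat = dist * math.cos(bearing) / METERS_PER_DEG_LAT
    dlon = dist * math.sin(bearing) / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

def area_weighted_mean(pieces):
    """Weighted mean of a tract attribute, with weights equal to tract area inside the buffer.

    `pieces` is a list of (area_inside_buffer_km2, tract_value) pairs.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area <= 0:
        raise ValueError("buffer does not overlap any tract area")
    return sum(area * value for area, value in pieces) / total_area

# Hypothetical residence, masked by up to 100 m before data sharing.
rng = random.Random(2016)  # fixed seed so the example is reproducible
masked_lat, masked_lon = random_shift(42.3601, -71.0589, max_shift_m=100.0, rng=rng)

# Hypothetical overlaps: a buffer covering parts of three tracts, paired with
# each tract's percentage of residents below the poverty line.
overlaps = [(0.30, 12.0), (0.15, 20.0), (0.05, 8.0)]
buffer_poverty_pct = area_weighted_mean(overlaps)

print(f"Masked location: {masked_lat:.5f}, {masked_lon:.5f}")
print(f"Area-weighted poverty rate for the buffer: {buffer_poverty_pct:.1f}%")
```

Count-type attributes such as population are usually apportioned differently (each tract's count multiplied by the share of the tract inside the buffer), so the weighting scheme should match the type of attribute.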

Research Domain Information

Release Date:

October 17, 2016


Using geospatial data, density measures the spatial concentration of tobacco retailers in a neighborhood, defined by either an area centered on a respondent’s residence, school/workplace, or an administrative area, such as counties, school districts, or census tracts. Proximity measures distance to the nearest tobacco retailer from a point of interest (e.g., residence, school/workplace, or another retailer).


There is growing evidence that tobacco retailers are concentrated in areas of economic disadvantage, and that greater physical access is associated with increased tobacco use, particularly among youth. There is some evidence that proximity to tobacco retailers is associated with lower efficacy to quit and less success with quitting. This measure describes the retail availability of tobacco products by characterizing the quantity and location of retailers with respect to a respondent’s residence, school or workplace.

Selection Rationale

The Duncan et al. protocol provides examples of using geolocation data to measure the spatial relationship of tobacco retailers to administrative neighborhoods (e.g., tracts, school districts, counties) when precise address data are not available.




Common Data Elements (CDE)

Tobacco Retailer Proximity to Neighborhood Distance in Meter (5519683)

Process and Review

The Expert Review Panel has not reviewed this measure yet.


Duncan D, et al. Examination of How Neighborhood Definition Influences Measurements of Youths’ Access to Tobacco Retailers: A Methodological Note on Spatial Misclassification. Am J Epidemiol. 2014;179(3):373-381.

General References

Frank LD, Schmid TL, Sallis JF, Chapman J, Saelens BE. Linking objectively measured physical activity with objectively measured urban form: findings from SMARTRAQ. Am J Prev Med. 2005;28(suppl 2):117-125.

Timperio A, Crawford D, Telford A, et al. Perceptions about the local neighborhood and walking and cycling among children. Prev Med. 2004;38(1):39-47.

Colabianchi N, Dowda M, Pfeiffer KA, et al. Towards an understanding of salient neighborhood boundaries: adolescent reports of an easy walking distance and convenient driving distance. Int J Behav Nutr Phys Act. 2007;4:66.

Protocol ID:


