SPATIAL DIFFERENCES IN FRESH VEGETABLE SPENDING: A CASE STUDY IN GUILFORD COUNTY, NORTH CAROLINA

This paper investigates the spatial differences in fresh vegetable spending in Guilford County, North Carolina. We create a geocoded spatial-temporal database of both human and natural factors to understand why food deserts have become a serious issue in a county with many farming activities. We find that residents living in food deserts buy fewer fresh vegetables than their counterparts, even when they shop at full-service grocery stores. Socioeconomic factors are the most sensitive and are important determinants of fresh food demand. Using an agent-based toy model, we find that fresh vegetable demand varies widely across census tracts in Guilford County. The results suggest that the formation of food deserts may be rooted in the demand side.


INTRODUCTION
Vegetables are important in a healthy diet. However, only ten percent of adult Americans meet the vegetable consumption standard recommended by the Dietary Guidelines for Americans, and vegetables are the most under-consumed nutritious food in the United States (2018 State Indicator Report on Fruits and Vegetables). There is a growing literature on enhancing food security, especially on providing sufficient fresh vegetables to communities through local food systems (e.g., Torjusen, Lieblein & Vittersø 2008; Wilkins, Farrell & Rangarajan 2015). Local food systems can improve food access (Kantor 2001), decrease the carbon footprint (Kaplin 2011), reduce consumers' energy intake (Rose et al. 2008), and ease the food desert problem (McKinney and Kato 2017), which is one type of geographic disparity in the food supply. We propose to mitigate the food desert problem by using local food systems from a coupled human and natural system perspective. We focus in particular on improving the availability and accessibility of fresh vegetables in food desert areas by making good use of local food systems. This research takes the first step by investigating spatial differences in fresh vegetable demand, using Guilford County, North Carolina (N.C.), as a case study.
A food desert is a metaphor for neighborhood-level deprivation of healthy food. It has been defined in many ways. In general, it refers to areas with low access to affordable fresh vegetables and fruits, usually measured in units of census tracts. The United States Department of Agriculture (USDA) defines a census tract as a food desert if it meets two thresholds: 1) "a poverty rate of 20 percent or greater, or a median family income at or below 80 percent of the statewide or metropolitan area median family income"; 2) "at least 500 persons and/or at least 33 percent of the population lives more than 1 mile from a supermarket or large grocery store (10 miles, in the case of rural census tracts)". In other words, one USDA measure of food deserts is whether most residents in a census tract have nearby access to a full-service grocery store, such as a supermarket or a wholesale market, which offers a variety of fresh vegetables and fruits. Based on recent data (USDA ERS 2010, 2015; U.S. Census Bureau 2017), we find that the food deserts in N.C. are expanding and the number of full-service grocery stores is declining. Policymakers have tried some solutions to ease the food desert problem, but very few have had meaningful effects. For example, in Greensboro, N.C., the Renaissance Co-op opened as a food desert rescuer, but it went out of business within two years because of insufficient demand.
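The two USDA thresholds quoted above can be read as a simple classification rule. The following is a minimal sketch under stated assumptions, not USDA's own implementation: the field names are hypothetical, the number of residents living beyond the distance cutoff (1 mile, or 10 miles for rural tracts) is assumed to be precomputed, and "and/or" is read as an inclusive or.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    # All field names are hypothetical, chosen for illustration only.
    poverty_rate: float             # share of residents in poverty, 0 to 1
    median_family_income: float     # tract median family income
    reference_median_income: float  # statewide or metro-area median
    population: int                 # total tract population
    low_access_population: int      # residents beyond the distance cutoff


def is_low_income(t: Tract) -> bool:
    # Threshold 1: poverty rate of 20 percent or greater, or median family
    # income at or below 80 percent of the reference median.
    return (t.poverty_rate >= 0.20
            or t.median_family_income <= 0.80 * t.reference_median_income)


def is_low_access(t: Tract) -> bool:
    # Threshold 2: at least 500 persons and/or at least 33 percent of the
    # population live beyond the cutoff (read here as an inclusive or).
    if t.population <= 0:
        return False
    share = t.low_access_population / t.population
    return t.low_access_population >= 500 or share >= 0.33


def is_food_desert(t: Tract) -> bool:
    # A tract is flagged only if it meets both thresholds.
    return is_low_income(t) and is_low_access(t)
```

Under this reading, a tract with a 25 percent poverty rate and 600 residents beyond the cutoff is flagged, while a tract that is low-access but not low-income is not.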
Guilford County, N.C., has a population of more than half a million. It is divided into about 120 census tracts, 21 of which are defined as food deserts by USDA. In Figure 1a, we find that those food deserts are surrounded by undeveloped areas containing about 854 farms (USDA 2017, Census of Agriculture), and those farms are capable of producing a large variety of fresh vegetables. If we further overlay individual household addresses (RTI International's U.S. synthetic household population) on the map (see Figure 1b), we find that a significant portion of households live in food desert areas. USDA estimates that about one-fifth of the population of Guilford County is underserved by full-service grocery stores. One important difference between a full-service grocery store and a corner store is the supply of fresh vegetables and fruits. Supply and demand come side by side, and most of the literature (e.g., Allcott et al. 2019) argues that food desert areas are underserved mostly because of low demand for healthy food. However, we find that the current literature and data mostly focus on the association between food environment/food deserts and dietary behavior/healthy food at the individual level. For example, the U.S. Bureau of Labor Statistics (BLS) conducts consumer expenditure surveys every year, which give average spending on fresh vegetables per household. From that dataset, we can draw a map that displays the correlation between food deserts and low consumption of fresh vegetables. Figure 2 demonstrates that households living in most food desert areas spend less than $210 per year on fresh vegetables on average. However, the literature lacks studies that appropriately calculate the spatial differences in vegetable spending between food-desert and non-food-desert areas at the census tract level.
The gap in the literature leads to our first-step research question: what are the spatial differences in fresh vegetable spending between food-desert and non-food-desert census tracts in Guilford County? We propose to answer it using a combination of one private dataset and several public datasets. The technique used to calculate aggregated fresh vegetable consumption is agent-based modeling, which is a bottom-up approach. In general, we start from the household level and use agents in the software NetLogo to represent the approximately 220k households of Guilford County. Their fresh vegetable purchasing behavior is affected by the food environment and their household features, which are programmed as a set of fresh vegetable purchasing rules in NetLogo. We use the software to simulate each household's behavior for one year, and then we aggregate fresh vegetable demand for each census tract.

DATA
The first dataset that we look at for our study area is the Cropland Data Layer (CDL) from the USDA National Agricultural Statistics Service (NASS). Using satellite remote sensing images and national agricultural statistics, USDA NASS creates the CDL dataset annually to illustrate how the continental United States is covered by specific crops, including different types of vegetables. CDL data contain one layer of raster data that shows how specific crops are distributed over space and time. For the purpose of this paper, we mainly use the 2019 CDL dataset to identify the current spatial relationships among vegetable-growing parcels, household addresses, and food desert areas. In the next steps, we will use about ten years of CDL data to estimate how land use, especially vegetable-growing area, changes over space and time, and whether those changes are correlated with farm-level characteristics and uncertain scenarios, such as a flood.
The second dataset that we use for our study area is RTI International's U.S. synthetic household population dataset (Wheaton et al. 2009). Because we use a bottom-up strategy with an agent-based model to estimate how fresh vegetable demand differs spatially, and because socioeconomic differences matter in the correlation between food deserts and dietary behavior (Mackenbach et al. 2019), we need data containing the spatial distribution of households and the characteristics of each household in our study area. Unlike data aggregated at the census tract or zip code level, RTI International's U.S. synthetic household population dataset represents an accurate and complete set of household addresses and household (member) characteristics, such as household income, member ages, race, and household size. Therefore, in the agent-based model, each household, represented by one agent, can follow specific rules and make fresh vegetable purchasing decisions based on its household-specific data (e.g., whether the household is located in a food desert area). We can also intuitively understand the spatial relationships between food desert areas, fresh vegetable production areas, and household locations.
The third dataset, from which we extract behavior patterns, is the Nielsen Homescan dataset. Nielsen Homescan data are a national-level dataset provided by the Nielsen company. The company maintains a balanced sample across the United States, and each panelist in the sample reports all purchased items, including all kinds of fresh vegetables. From the dataset, we can observe the type of vegetable, the unit price, and the total spending on each vegetable item. We can also observe the types of stores where those vegetables are bought, such as a wholesale club, a supermarket, or a convenience store.
Furthermore, each panelist reports the location of his/her household and household characteristics, such as household income, household member ages, employment status, and race. We use the N.C. subset so that we can better mimic fresh vegetable purchasing behavior in our study area. The dataset records each grocery shopping trip made by the panelists and each item they bought during the trip. We also combine the Nielsen Homescan data with USDA food desert data to estimate the food desert status of each panelist.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIV-M-2-2020, 2020 ASPRS 2020 Annual Conference Virtual Technical Program, 22-26 June 2020
One important feature of the Nielsen Homescan data is that, to protect panelists' privacy, the dataset discloses their geographic location only at the zip code level. However, the USDA food desert definition is based on the census tract level. We therefore use a fourth dataset, the HUD-USPS ZIP Crosswalk, to estimate the food desert status of a zip code. This dataset is organized by the U.S. Department of Housing and Urban Development's (HUD's) Office of Policy Development and Research (PD&R). It estimates how many of the residents of a census tract live in a given zip code. We define any zip code that overlaps with at least one food-desert census tract as a food-desert zip code.
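The zip code rule described above can be sketched as follows. This is an illustrative sketch only: the row layout (zip code, tract identifier, share of the zip code's residents in that tract) and all identifiers in the usage example are hypothetical and are not taken from the actual crosswalk files.

```python
from collections import defaultdict


def food_desert_share_by_zip(crosswalk_rows, food_desert_tracts):
    """Return {zip_code: share of residents in food-desert tracts}.

    crosswalk_rows: iterable of (zip_code, tract_id, resident_share) tuples,
    a hypothetical layout assumed for this sketch.
    food_desert_tracts: set of tract ids flagged as food deserts.
    """
    shares = defaultdict(float)
    for zip_code, tract_id, resident_share in crosswalk_rows:
        # Touch the key so every zip code appears in the result.
        shares[zip_code] += 0.0
        if tract_id in food_desert_tracts:
            shares[zip_code] += resident_share
    return dict(shares)


def food_desert_zips(crosswalk_rows, food_desert_tracts):
    # Rule from the text: any overlap with a food-desert tract flags the zip.
    shares = food_desert_share_by_zip(crosswalk_rows, food_desert_tracts)
    return {z for z, s in shares.items() if s > 0.0}


# Made-up example: zip "Z1" partly overlaps food-desert tract "T2".
rows = [("Z1", "T1", 0.75), ("Z1", "T2", 0.25), ("Z2", "T3", 1.0)]
flagged = food_desert_zips(rows, {"T2"})
```

Keeping the overlap share alongside the binary flag makes it possible to check how sensitive results are to the "any overlap" rule, for example by requiring a minimum share instead.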

METHOD
For our whole project, we aim to build an integrated agent-based model. The overarching goal of the agent-based model is to represent a food system that includes vegetable production, vegetable consumption, and the related environmental impact. Thus, the agent-based model includes three types of agents: households, farms, and environmental agents. We identify the rules of the agents from our datasets and the previous literature.
Following Abel and Faust (2020), we assume that households make one to four decisions each tick (one day) based on their food environment and household characteristics: whether to go grocery shopping, which store to shop at, whether to buy fresh vegetables, and how much to spend on fresh vegetables. By analyzing the Nielsen Homescan data, we can estimate the parameters of fresh vegetable purchasing behavior.
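The four household decisions and the aggregation to census tracts can be sketched in a few lines. This is a minimal Python illustration, not the NetLogo model used in the paper. All parameters are hypothetical placeholders rather than estimates from the Nielsen Homescan data, the exponential spending distribution is an assumption, and restricting vegetable purchases to full-service trips is a simplification.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Household:
    # All fields are hypothetical placeholder parameters.
    tract_id: str
    p_shop: float          # daily probability of a grocery shopping trip
    p_full_service: float  # probability the trip goes to a full-service store
    p_buy_veg: float       # probability of buying fresh vegetables on the trip
    mean_spend: float      # mean dollars spent when vegetables are bought (> 0)


def simulate_day(h: Household, rng: random.Random) -> float:
    """Return one household's fresh vegetable spending for one tick (day)."""
    if rng.random() >= h.p_shop:          # decision 1: go shopping today?
        return 0.0
    if rng.random() >= h.p_full_service:  # decision 2: which store type?
        return 0.0                        # simplification: no vegetables elsewhere
    if rng.random() >= h.p_buy_veg:       # decision 3: buy fresh vegetables?
        return 0.0
    # Decision 4: how much to spend (assumed exponential with the given mean).
    return rng.expovariate(1.0 / h.mean_spend)


def simulate_year(households, days=365, seed=0):
    """Aggregate simulated spending by census tract over the given days."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    for h in households:
        for _ in range(days):
            totals[h.tract_id] += simulate_day(h, rng)
    return dict(totals)
```

Passing a fixed seed makes a run reproducible, which helps when comparing tract totals across parameter settings.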
Farm agents decide at the beginning of each season whether to change their land use, either converting vegetable plots to other uses or vice versa. Based on the previous years' CDL data, we model the change process as a function of changes in previous years' consumption, changes in vegetable prices, and other variables that have significant effects on land-use change. The environmental agents take variables from farm agents' actions, such as land-use change, to simulate environmental changes, such as changes in water quality, using the Agricultural Policy/Environmental eXtender (APEX) model. The three parts, production, consumption, and environment, are linked and integrated through variables such as vegetable prices, vegetable yield, and weather conditions.
However, to answer the research question of this paper, we initialize and parametrize only the consumption part of the agent-based model to calculate the spatial differences in fresh vegetable consumption in Guilford County. We will initialize and parametrize the other parts of the agent-based model in future work. After analyzing the data systematically, we can identify a few consumption patterns. Firstly, neither the total number of trips in a year nor the number of trips to full-service grocery stores differs between food-desert households and their counterparts. For example, during a year, both food-desert and non-food-desert households visit full-service grocery stores 92 times on average, and the difference is not statistically significant (t-test, p > 0.1).
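A comparison of mean trip counts between the two groups can be sketched as below. The text does not state which t-test variant was used, so this sketch assumes Welch's unequal-variance form, and it approximates the p-value with the standard normal distribution, which is reasonable only for large samples. The trip counts in the usage example are made-up toy values, not Nielsen data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def approx_two_sided_p(t):
    """Two-sided p-value using a large-sample normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Toy yearly trip counts (hypothetical values for illustration only).
food_desert_trips = [90, 92, 94, 91, 93]
other_trips = [91, 93, 92, 90, 94]
t_stat = welch_t(food_desert_trips, other_trips)
p_value = approx_two_sided_p(t_stat)
```

With equal sample means, as in the toy data, the statistic is zero and the difference is not significant at any conventional level.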

RESULTS
When we consider the other confounders jointly and examine how they affect the probability that a household goes to a full-service grocery store, we see that vehicle ownership is the most powerful indicator (Table 2). It decreases the probability of choosing a full-service grocery store by about 15%. Intuitively, a household that owns a vehicle has more stores to choose from. In other words, although families without cars may make fewer grocery shopping trips, they shop at full-service grocery stores with a higher probability than families with cars. The food environment (food desert status) matters, but the magnitude seems small. Households with children have a 4% higher probability of shopping at full-service grocery stores, ceteris paribus. Ethnicity also plays an important role in the choice of grocery store.
Secondly, we find that most vegetables are bought in full-service grocery stores. Households spend about 98% of their fresh vegetable budget at full-service grocery stores, no matter where they live. After arriving at a full-service grocery store, households with different characteristics show different patterns of buying fresh vegetables. The logistic regression results (
After we implement the above patterns and the trigger for going grocery shopping (a probability function similar to that in Abel and Faust, 2020) in the agent-based model and simulate agents' behavior with other environmental data, such as the household food environment and household characteristics, we can find the spatial differences in yearly fresh vegetable consumption in Guilford County (Table 5).
Table 5. Spatial differences in yearly fresh vegetable consumption in Guilford County

DISCUSSION AND FUTURE STEPS
In this paper, we propose a way to build an integrated model of a local food system using the agent-based modeling technique. We take the first step of estimating the spatial differences in fresh vegetable spending in Guilford County, N.C., using one part of the model. By combining private and public datasets, we can estimate aggregated fresh vegetable spending at the census tract level. We find that census tracts defined as food deserts by USDA have a much lower fresh vegetable demand, which indicates that food deserts may be equilibrium responses to consumers' demand.
With these datasets and the proposed method, we have built a platform to estimate spatial fresh vegetable demand. The method can easily be extended to other foods, such as fresh fruits, canned vegetables, or prepared food. We believe the proposed method will help future research on food security. One limitation of the current platform is that we have not validated our results against data from other sources. We plan to do that in our next step.
In the future, we will develop a more integrated model that includes consumers, farmers, and environmental agents. More data, such as remote sensing data, GIS information, and land use data, will also be included in the model as environmental data. Ultimately, we want to find out to what extent the local food system can help alleviate the food desert problem and what environmental consequences would follow.