This paper explores the relationship between increasing housing prices and some of the neighborhood factors that influence them. Using multiple regression analysis in R with data from the US Census, Yelp, Zillow and RMLS, it examines the strong correlation between trendy establishments and rising real estate values.
This paper was submitted to Hafencity University on March 31st, 2019.
1. Introduction: A gentrifying city
Portland, Oregon is often considered one of the greenest cities in the United States. With its surrounding lush forests, bike friendly bridges, and liberal, environmentally conscious populace, Portland has been booming in popularity. Having lived in Portland for years, I find that many of these claims ring true. Yet with an influx of increasingly wealthy residents and rising home prices, social diversity is diminishing as minorities are pushed to the fringes of the city.
Portland is one of the nation’s “whitest” major cities, with 71.6 percent of the population identifying as White, and while Portland as a whole is increasing in diversity, some of the city’s most livable neighborhoods are only getting whiter (US Census, 2010). Key gathering places that once served minority communities, such as churches, barber shops, beauty salons, restaurants and nonprofits, are closing, and in their place are hip new brunch spots, artisan coffee shops and vegan friendly restaurants.
This analysis will explore the relationship between increasing housing prices and some of the neighborhood factors that influence them. While there are many variables that affect home prices, I am interested in observing the relationship between perceived signs of increasing gentrification (e.g. the introduction of places like coffee shops and vegetarian restaurants) and the increase in mean home value in the neighborhood. It should be noted that new coffee shops and vegetarian restaurants may be a consequence of increasing gentrification rather than a cause of it. However, taking a closer look at the relationship between trendy establishments and home price could give greater insight into the future of the city’s neighborhoods.
1.1 Research Question
Does an increase in the number of brunch restaurants, cafes and vegetarian friendly restaurants correlate positively with home value?
Null Hypothesis: The number of brunch restaurants, cafes and vegetarian friendly restaurants in a neighborhood does NOT affect the mean home value.
1.2 Variable Selection
1.2.1 Dependent Variable
As the dependent ‘y’ variable, I chose the 2017 mean home value (MHV) in dollars per square foot ($/sqft) for 30 of Portland’s neighborhoods. The neighborhoods were chosen at random after removing from the sample the purely residentially zoned ‘gated community’ type neighborhoods that lie on the outskirts of Portland. While higher mean housing prices are not a direct indicator of gentrification, census data show that 36.77% of Blacks, 32.23% of Native Americans, and 26.27% of Hispanics are below the poverty line, compared with only 12.5% of Whites (US Census, 2010). This, combined with higher education levels, likely indicates that Whites in Portland have more buying power and can thus afford more expensive homes, contributing to gentrification.
1.2.2 Independent Variables
As independent variables I chose the number of brunch restaurants ‘x1’, the number of cafes ‘x2’ and the number of vegetarian friendly restaurants ‘x3’ in each neighborhood, each expressed per 1000 people to account for differences in neighborhood size. Vegetarian “friendly” restaurants were chosen because there were not enough purely vegetarian restaurants in each neighborhood for a meaningful statistical analysis.
1.3 Expectations
I expect the coefficients of x1 (β1), x2 (β2), and x3 (β3) to be greater than 0, meaning that the more brunch restaurants, coffee shops and vegetarian restaurants in each neighborhood, the higher the mean home value.
2. Description of data
I gathered the data for this research from three main sources. The population data came from the United States Census Bureau’s 2013-2017 American Community Survey 5-Year Estimates. The 2017 real estate values, in dollars ($) per square foot (sqft), came from the Regional Multiple Listing Service (RMLS) and were cross-checked against Zillow Research data to identify inconsistencies. The number of establishments in each neighborhood came from Yelp data.
With regard to the Yelp data, I refined the search using Yelp’s filter tool to account for the overlap between restaurants that could be both brunch and vegetarian, or both cafe and brunch, and so on. Some overlap might remain. It can also be noted that the number of times keywords like “vegetarian friendly” or “brunch” appear in searches could itself be a factor in neighborhood value, so the actual number of establishments might be less important than the number of reviews mentioning these characteristics. Due to the limitations of this study, however, the focus remains on the actual number of establishments that appeared in searches for the terms “brunch”, “vegetarian”, and “cafes”. Each count was then divided by the neighborhood population in thousands (population divided by 1000) to account for differences in neighborhood size.
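The per-1000 normalization described above can be sketched in a few lines. This is an illustrative Python sketch, not the R workflow used for the paper; the function name and the example neighborhood (12 cafes, 8000 residents) are hypothetical.

```python
def per_thousand(count, population):
    """Return the number of establishments per 1000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Divide the raw count by the population expressed in thousands.
    return count / (population / 1000)


# Hypothetical neighborhood, NOT taken from the study's data:
# 12 cafes found in the search, 8000 residents.
cafes_per_1000 = per_thousand(12, 8000)
print(cafes_per_1000)  # 1.5
```

Expressing every regressor on this scale keeps large and small neighborhoods comparable, but it also means each regression coefficient refers to one additional establishment per 1000 people rather than one additional establishment.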
Table 1: Table of Neighborhoods, Mean Home Values ($/sqft) and establishments per 1000 people arranged by MHV
2.1 Regression Equation
y = β0 + β1 x1 + β2 x2 + β3 x3 + ε
y = mean home value ($/sqft)
x1 = number of brunch restaurants per 1000 people
x2 = number of cafes per 1000 people
x3 = number of vegetarian friendly restaurants per 1000 people
β0 = y intercept
β1 = brunch coefficient
β2 = cafe coefficient
β3 = vegetarian friendly restaurant coefficient
3. Statistical Analysis
To determine whether the three chosen regressors have an impact on mean home value (MHV) in a given neighborhood, I began by exploring the relationships between the ‘x’ and ‘y’ variables to get an idea of their linear relationship. As can be seen in the figure below, there is a positive relationship between the number of brunch, cafe and vegetarian establishments and the MHV of real estate in a neighborhood. In general, more of these establishments indicates a higher MHV, with cafes having the greatest variance.
3.1 Single Regression Analysis
After exploring the data, I performed three simple linear regressions to find out how strongly each independent (regressor) ‘x’ variable is related to the dependent ‘y’ variable. The null hypothesis for each of the three variables is: There is NO correlation between ‘y’ and ‘x1’, ‘x2’, or ‘x3’. However, as will be shown in the analysis below, all three variables play a significant role in predicting the dependent variable. The Pearson Correlation Coefficients of 0.92 for x1, 0.79 for x2 and 0.93 for x3 show that all three independent variables have a strong linear correlation with ‘y’, with cafes showing the weakest of the three, though still quite strong. When checking the correlation between the ‘x’ variables themselves, strong correlation was also found, meaning there could be a risk of multicollinearity among the independent variables.
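For readers who want to reproduce this kind of check, the Pearson Correlation Coefficient can be computed directly from its definition. The sketch below is in Python rather than R, and the data values are invented for illustration; they are not the study's data. The same function serves both for correlating a regressor with ‘y’ and for checking two regressors against each other.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five neighborhoods, NOT the study's data.
brunch_per_1000 = [0.5, 1.0, 1.5, 2.0, 3.0]
cafes_per_1000 = [1.2, 1.0, 2.4, 2.2, 3.9]
mhv = [210.0, 230.0, 240.0, 265.0, 300.0]

print(round(pearson_r(brunch_per_1000, mhv), 3))             # regressor vs. y
print(round(pearson_r(brunch_per_1000, cafes_per_1000), 3))  # regressor vs. regressor
```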
The linear regression of brunch restaurants ‘x1’ on MHV ‘y’ gave an “R Squared” of 0.8509. R Squared is the coefficient of determination, which shows the proportion of the variance in the dependent variable that is predictable from the independent variable, so ‘x1’ explains 85.09 percent of the variance in ‘y’.
The linear regression of cafes ‘x2’ on MHV ‘y’ gave an “R Squared” of 0.6256, meaning that ‘x2’ explains 62.56 percent of the variance in ‘y’.
The linear regression of vegetarian friendly restaurants ‘x3’ on MHV ‘y’ gave an “R Squared” of 0.8644, meaning that ‘x3’ explains 86.44 percent of the variance in ‘y’.
For all three variables, a change in ‘x’ is associated with a significant change in ‘y’, with P-values well below the 0.05 threshold, so we reject all three null hypotheses. There seems to be a very strong correlation between all variables, with ‘x2’ being the weakest, though still significant. Therefore, the number of brunch restaurants, cafes and vegetarian friendly restaurants in a neighborhood DOES seem to correlate with higher home value. Now, we will take a look at the multiple regression analysis.
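In a simple linear regression with one regressor, R Squared equals the square of the Pearson Correlation Coefficient, so the reported values can be cross-checked against each other. The Python sketch below uses only the figures reported in this section and assumes the correlations are positive and were rounded to two decimals.

```python
import math

# (Pearson r, R Squared) pairs as reported in this section.
reported = {
    "x1 brunch": (0.92, 0.8509),
    "x2 cafes": (0.79, 0.6256),
    "x3 vegetarian friendly": (0.93, 0.8644),
}


def consistent(r, r_squared):
    """True if sqrt(R^2), rounded to two decimals, matches the reported r.

    Assumes a positive correlation and an r rounded to two decimals.
    """
    return round(math.sqrt(r_squared), 2) == r


checks = {name: consistent(r, r2) for name, (r, r2) in reported.items()}
print(checks)
```

All three pairs pass this check, which suggests the reported correlation coefficients and R Squared values are mutually consistent.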
3.2 Multiple Regression Analysis
To determine whether these factors combine to have more of an effect on MHV than they would alone, I performed a series of multiple regression analyses to assess the null hypothesis: The number of brunch restaurants, cafes and vegetarian friendly restaurants in a neighborhood does NOT affect the mean home value. I created several models to check the variables against each other: (x1 + x2), (x1 + x3), (x2 + x3) and (x1 + x2 + x3).
From the results of the full model (x1 + x2 + x3), found in the figure below, the adjusted R squared is 0.8506, similar to the result for the simple regression on brunch restaurants, higher than the result for cafes, and lower than the result for vegetarian friendly restaurants. The model p-value of 1.75e-11 is far below the 0.05 threshold, which is very strong evidence against the null hypothesis, so we can reject it. The number of brunch restaurants, cafes and vegetarian friendly restaurants in a neighborhood DOES seem to affect mean home values.
Using the fitted multiple regression model (Ŷ = β0 + β1 x1 + β2 x2 + β3 x3), it can be estimated that each additional brunch restaurant, cafe or vegetarian friendly restaurant per 1000 people is associated with an increase in the neighborhood MHV of $1.36, $1.54 and $9.20 per square foot, respectively, holding the other two variables constant.
Ŷ = 204.633 + 1.356 x1 + 1.535 x2 + 9.196 x3
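The model comparison described above can be sketched as an ordinary least squares fit over each combination of regressors. The Python sketch below solves the normal equations with plain Gaussian elimination so that it needs no external libraries; it is not the R code used for the paper, the helper names are hypothetical, and the six data rows are invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """Fit y = b0 + b1*x1 + ... and return (coefficients, r_squared)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]  # intercept first
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Hypothetical per-1000 rates and MHV for six neighborhoods, NOT the study's data.
x1 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
x2 = [1.0, 0.8, 2.0, 1.5, 3.2, 2.9]
x3 = [0.2, 0.5, 0.4, 0.9, 1.1, 1.6]
y = [215.0, 228.0, 236.0, 262.0, 281.0, 305.0]

models = {"x1+x2": [x1, x2], "x1+x3": [x1, x3], "x2+x3": [x2, x3], "x1+x2+x3": [x1, x2, x3]}
for name, cols in models.items():
    beta, r2 = ols(cols, y)
    print(name, [round(b, 3) for b in beta], round(r2, 4))
```

R Squared never decreases when a regressor is added, so comparing models of different sizes is better done with the adjusted R Squared.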
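The adjusted R squared penalizes R squared for the number of regressors: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of regressors. The Python sketch below inverts that formula to recover the unadjusted R squared implied by the reported 0.8506, assuming all 30 sampled neighborhoods entered the three-regressor model; the function names are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R squared for n observations and p regressors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def r2_from_adjusted(adj, n, p):
    """Invert the adjustment to recover the unadjusted R squared."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - adj) * (n - p - 1) / (n - 1)


# Assumption: n = 30 neighborhoods, p = 3 regressors (x1, x2, x3).
implied_r2 = r2_from_adjusted(0.8506, n=30, p=3)
print(round(implied_r2, 4))  # 0.8661
```

Under that assumption, the full model's unadjusted R squared would be about 0.866, only marginally above the 0.8644 obtained from vegetarian friendly restaurants alone.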
Where Ŷ = the predicted values of the dependent variable
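As a worked example of the fitted equation, the Python sketch below plugs per-1000 rates into the reported coefficients. The function name and the example neighborhood (2 brunch restaurants, 3 cafes and 4 vegetarian friendly restaurants per 1000 people) are hypothetical; only the coefficients come from the model above.

```python
# Coefficients of the full model as reported above.
B0, B1, B2, B3 = 204.633, 1.356, 1.535, 9.196


def predict_mhv(x1, x2, x3):
    """Predicted mean home value ($/sqft) from establishments per 1000 people."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x3


# Hypothetical neighborhood, NOT one of the 30 in the sample.
print(round(predict_mhv(2, 3, 4), 3))  # 248.734
```

A neighborhood with none of these establishments would be predicted at the intercept, $204.63 per square foot, though that prediction is only as reliable as the sample's coverage of such neighborhoods.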
Figure 5: R Studio Calculation with relevant P-values and R squared
The model (x2 + x3), cafes and vegetarian friendly restaurants, seemed to be the most significant, with a p-value of 1.668e-12 and an R squared of 0.8559. The coefficients were also high, with ‘x3’ at 10.397 and ‘x2’ at 2.233. This means that each additional vegetarian friendly restaurant or cafe per 1000 people in a neighborhood is associated with an MHV increase of $10.40 or $2.23 per square foot, respectively.
It must be noted that the results show strong evidence of data-based multicollinearity between the variables, with the ‘y’ intercept having a lower P-value than the regressors in almost every model I ran. This could be due to the reliance on purely observational data, i.e. the number of search results per restaurant type, or to the increased likelihood that neighborhoods with more restaurants simply have more of every kind of restaurant.
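One common way to quantify multicollinearity is the variance inflation factor (VIF), which was not computed in this study. For a model with exactly two regressors, such as (x2 + x3), the VIF of each regressor reduces to 1 / (1 - r²), where r is the Pearson correlation between the two regressors. The Python sketch below illustrates this; the function name is hypothetical, and the correlation of 0.9 is an assumed value, since the correlations between the ‘x’ variables are not reported here.

```python
def vif_two_regressors(r):
    """VIF shared by both regressors in a two-regressor model.

    r is the Pearson correlation between the two regressors.
    """
    if abs(r) >= 1:
        raise ValueError("perfectly collinear regressors have an undefined VIF")
    return 1 / (1 - r * r)


# Assumed correlation between cafes and vegetarian friendly restaurants;
# this value is hypothetical, NOT a result of the study.
print(round(vif_two_regressors(0.9), 2))
```

A VIF of 1 means no inflation of the coefficient variance; values above roughly 5 to 10 are often treated as a sign that individual coefficients cannot be interpreted reliably.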
4. Conclusions
The expectations I had for this analysis were that the coefficients of x1 (β1), x2 (β2), and x3 (β3) would be greater than 0, meaning that the more brunch restaurants, coffee shops and vegetarian restaurants in each neighborhood, the higher the mean home value. After the analysis, I can conclude that there is a positive correlation between MHV and the number of establishments within a neighborhood.
However, there is a strong likelihood of multicollinearity between my independent variables, as indicated by the high Pearson Correlation Coefficients and the extremely low ‘y’ intercept P-values. The results of the regression analyses therefore cannot be fully trusted, and I cannot confidently reject my null hypothesis that there is NO influence on MHV.
To address this in further analyses, I would choose independent variables that are less related to each other, for example restaurant counts alongside crime, walkability or the amount of green space. I would also examine other aspects of Yelp data, such as the number of reviews per neighborhood or how many restaurants have entered a neighborhood within the last five years, as possible indications of gentrification. With the near immediacy of available data from many different sources, the possibility of gauging neighborhood change in almost real time is too attractive an opportunity to ignore.