Abstract

Chicago has a long history of redlining, a discriminatory housing practice that has led to segregation in Chicago up to the modern day. This practice marginalized people of color and created a significant and still prevalent wealth gap between redlined and non-redlined communities (Chicago, n.d.a). It is well known that poorer communities in Chicago are disproportionately exposed to negative environmental factors, such as proximity to industrial corridors and higher air pollution levels. These disparities have been studied, but the many overlapping factors behind Chicago’s prevalent inequality make it a complex problem (Geertsma 2018).

Our study aimed to tackle some of this complexity and to analyze how various environmental indicators impacted the health outcomes of people based on the HOLC (Home Owners’ Loan Corporation) grade of the area they lived in. We conducted a literature review to see which health impacts were tied to which significant environmental indicators and designed our technical portion to account for the complexity and inter-relation of the data, but we also wanted to work with the communities themselves, rather than just analyzing data. A major part of our project was inspired by the work of Phillip Boda of UIC on designing research centered around communities rather than around data.

We came to the conclusion that the scope of our project was larger and more complex than any two people could sufficiently cover, and decided to design a coding tool alongside our analysis that will allow communities to both interpret our findings and run their own analyses in the ways they deem most useful. We hope that this can provide communities with a more thorough understanding of the complex relations between the environment and health. We also hope to provide a tool that can be more useful than existing resources we’ve used and found to have issues, such as the EPA’s Environmental Justice Screening and Mapping Tool (EJScreen).

We found that higher percentages of greenlined and bluelined areas had little significance to negative health outcomes, or lowered risk, while higher percentages of red and yellowlined areas had greater significance to the model, increasing risk of negative health outcomes with increasing area. The impact of environmental indicators on our model varied depending on the health impact being analyzed, but in the case of every health impact, there was a semi-linear positive trend when a model was created taking into account all of the independent variables and their coefficients.

Background

Chicago has a long history of segregation, caused in large part by housing restrictions and zoning laws. In 1934, the Federal Housing Administration’s Chief Economist released a ranking of races in order of their perceived desirability, and this ranking went on to be used by the Home Owners’ Loan Corporation (HOLC) to make determinations about mortgage lending. The HOLC graded neighborhoods in Chicago, with A-graded areas being the most desirable for living and D-graded areas the least desirable. On segregation maps, the four grades corresponded to Greenlined, Bluelined, Yellowlined, and Redlined areas, from most to least desirable. Based on the FHA’s ranking, predominantly European American neighborhoods were treated as most desirable for lending, while African and Mexican American neighborhoods were marked as Red, the highest risk. This practice, even after its end, has perpetuated segregation in the city of Chicago to the present day (Rossi 2020).

Zoning laws have also contributed not only to perpetuating segregation, but also to ensuring that places with higher densities of people of color are less economically privileged. While explicit racial zoning was forbidden, more subtle methods have been used to prevent integration, such as blocking affordable community housing from being built in predominantly white neighborhoods and urban renewal projects that destroyed the homes of many of Chicago’s poor, pushing them westward. Unequal zoning laws have led to the northeastern parts of the city being predominantly white with access to parks and beaches, in contrast to the predominantly Latino and African American south and west parts of the city, which were zoned for manufacturing and higher density buildings (Chicago, n.d.b).

This has led to an unequal distribution of environmental burdens based on race, something this project addresses, in part by utilizing an EPA tool called the Environmental Justice Screening and Mapping Tool (EJScreen) for one facet of its analysis. EJScreen has compiled environmental and demographic data for each of Chicago’s Census Block Groups. It also includes methods to combine the environmental and demographic indices. The environmental indicators included in the tool are PM 2.5, Ozone, Lead Paint Indicator, Air Toxics Cancer Risk, Respiratory Hazard Index, Diesel Particulate Matter, Proximity to RMP Sites, Traffic Proximity and Volume, Proximity to NPL Sites, Wastewater Discharge, and Proximity to Hazardous Waste Treatment, Storage, and Disposal Facilities. The data are used to calculate percentiles for the block groups, which can be compared to either a single state or the entirety of the United States, with a higher percentile indicating a higher environmental impact (Agency, n.d.b).

Tied to these environmental indicators are health impacts, and we reviewed the literature on the health impacts of each of the EJScreen environmental indicators. In Chicago, where lead in water is an issue, poor water quality has led to kidney problems and increased blood pressure, especially in neighborhoods with a higher density of people of color (McCormick, Uteuova, and Moore 2022). PM (Particulate Matter) 2.5 is linked to nonfatal heart attacks, irregular heartbeat, aggravated asthma, decreased lung function, and increased respiratory symptoms. It also poses a higher risk to those already in vulnerable populations (Agency, n.d.c). Research has shown green space to have a positive impact on mental health and to lower all cause mortality (Berg et al. 2015).

Noise exposure due to traffic proximity may be related to high blood pressure and cardiovascular disease (Davies and Kamp 2012), and traffic proximity in general has shown correlation to low birth weight and preterm birth (Brender, Maantay, and Chakraborty 2011). The Risk Management Plan facility and Hazardous Waste proximity indicators correlate to a wide variety of health issues. Proximity to polluted sites can cause low birth weights, as well as birth defects, though those were only found at waste sites emitting substances with specific biological effects, chromium waste, and waste sites classified as ‘high priority’, and there was uncertainty in study methodology (Kihal-Talantikite et al. 2017). Research regarding multiple communities exposed to air pollution and hazardous waste facilities showed a higher risk of cancer and potential for respiratory illness from air pollution (White 2018). Another report on residential proximity to environmental hazards found correlation between childhood cancer and proximity to hazards (Brender, Maantay, and Chakraborty 2011).

Methodology

For the technical portion of the project, we began by analyzing which environmental indicators varied significantly based on HOLC grade. We did this by utilizing the EJScreen tool to log percentiles of environmental indicators for each census tract. We also overlaid an HOLC grade map on the EJScreen map in order to record the HOLC grades for each census tract, so that we could decide whether or not the percentiles significantly varied based on HOLC grade. After this was done, QGIS software was used to divide the census tracts along HOLC graded areas. This information was saved in a dataframe, which is available in our repository, and Python was used to loop through the data and find the redlined, yellowlined, bluelined, and greenlined areas of each census tract. These values were then put into their own dataframe to complete the analysis.
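The aggregation step described above can be sketched as follows. This is a minimal illustration rather than the repository's code: the tract IDs, areas, row layout, and the `holc_percentages` helper are hypothetical, and it assumes each overlay row gives the area of one tract that falls inside one HOLC-graded polygon.

```python
from collections import defaultdict

# Hypothetical rows from a tract/HOLC overlay: (tract id, HOLC grade, overlap area).
# The IDs, areas, and field layout are illustrative, not the repository's schema.
overlay_rows = [
    ("tract_1", "D", 30.0),
    ("tract_1", "C", 50.0),
    ("tract_1", "D", 10.0),
    ("tract_2", "A", 25.0),
]
# Total area of each tract, in the same units as the overlap areas.
tract_area = {"tract_1": 100.0, "tract_2": 50.0}

GRADE_TO_COLUMN = {"A": "greenlined", "B": "bluelined", "C": "yellowlined", "D": "redlined"}


def holc_percentages(rows, areas):
    """Return {tract: {column: percent of tract area}} for all four grades."""
    overlap = defaultdict(lambda: defaultdict(float))
    for tract, grade, area in rows:
        overlap[tract][GRADE_TO_COLUMN[grade]] += area
    result = {}
    for tract, total in areas.items():
        result[tract] = {
            column: 100.0 * overlap[tract][column] / total
            for column in GRADE_TO_COLUMN.values()
        }
    return result


percentages = holc_percentages(overlay_rows, tract_area)
print(percentages["tract_1"])
```

Grades that do not appear in a tract come out as zero percent, so every tract ends up with the same four columns.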

Following these preliminary steps, we began the coding portion that would make up the bulk of our work. This included Variance Inflation Factor and Ridge Regression analysis. Variance Inflation Factors (VIF) are a tool used to explore multicollinearity, which occurs when independent variables are correlated with each other. VIF measures the correlation between multiple independent variables in a least squares linear regression model. Each independent variable is given a VIF value that indicates how much multicollinearity exists between it and the other independent variables. While there is a wide range of values that can be considered indicators of significant multicollinearity, a good general rule is to consider VIF values of 5 and above as indicators of significant multicollinearity, as was done in this analysis (Frost 2020).
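As a rough illustration of what a VIF measures, the sketch below uses the standard definition: the VIF of a variable is 1 / (1 - R^2), where R^2 comes from regressing that variable on all the others. It uses only NumPy and synthetic data, and the `variance_inflation_factors` helper is a hypothetical name, so it should be read as an explanatory sketch and not as the code in our repository.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X: 1 / (1 - R^2) from regressing it on the others."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Add an intercept column so R^2 is measured around the mean.
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Synthetic data only: two strongly related columns and one independent column.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.1 * rng.normal(size=500)   # nearly a copy of `a`
c = rng.normal(size=500)             # unrelated to `a` and `b`
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], vifs.round(2))))
```

In this toy setup the two related columns receive VIF values well above the threshold of 5, while the independent column stays near 1.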

For this analysis, the independent variables were selected to be percentages of redlining, yellowlining, bluelining, and greenlining for each census tract, as well as the EJScreen census tract percentiles for Particulate Matter 2.5, Traffic Proximity, Diesel Particulate Matter, RMP Facility Proximity, Hazardous Waste Proximity, Underground Storage Tanks, and Wastewater Discharge. Wastewater Discharge was eventually dropped due to insufficient data.

A VIF analysis was run using code available in our repository to determine the degree of multicollinearity. The next step was to analyze the independent variables’ effects on the health impacts. In this analysis, the health impacts studied were high blood pressure, medium blood pressure, cancer, asthma, coronary heart disease, cholesterol screening, chronic obstructive pulmonary disease, chronic kidney disease, poor mental health, strokes, sleep quality (less than seven hours of sleep), and poor physical health. These health indicators were drawn from our literature review of health outcomes tied to environmental indicators that we found to be significant.

In order to effectively analyze the impact of all of the independent variables on each health outcome, a regression method was needed that accounted for the multicollinearity present in the data. The method we employed to achieve this was Ridge Regression, a specialized type of linear regression designed to account for multicollinearity. The underlying linear model can be written as:
\(y = w[0]x[0] + w[1]x[1] + … + w[n]x[n].\)
This model can include as many features as are needed, and w is the coefficient, or weight, which is assigned to each feature. This coefficient describes how heavily each feature will affect the model, either positively or negatively. Linear models use a cost function to calculate errors between predictions and actual values of data. The cost function is a single real-valued number summarizing the error across the dataset. For ordinary least squares it is built from the squared error of each observation, \(Loss = (y - y_{prediction})^2\), summed or averaged over all observations.

Because ordinary least squares minimizes only this error, coefficients can become very large and cause overfitting of a model, which is when the fitted model adheres too closely to the training points, causing inaccurate predictions on new data. When there are multiple predictors, some may have much larger coefficients, thus influencing the model more heavily (DataCamp 2022).

Ridge regression provides a method of tackling this problem, as it adds a penalty to the loss function that imposes an additional cost on models with larger coefficients. The specific penalty used by Ridge Regression is called the L2 penalty, which penalizes a model based on the sum of the squared coefficient values, \(L2_{penalty} = w[0]^2 + w[1]^2 + … + w[n]^2\). The size of all of the coefficients is shrunk toward zero, but none of them is forced to be exactly zero. The weighting of the penalty is controlled by a hyperparameter \(\lambda\) (often called alpha in code), and the penalized loss is defined by: \(Ridge\ Loss = Loss + \lambda \cdot L2_{penalty}\) (Brownlee 2020b).
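To make the effect of the L2 penalty concrete, the sketch below fits the same synthetic data with and without the penalty. It assumes centered data with no intercept and uses the textbook closed-form ridge solution, w = (XᵀX + λI)⁻¹ Xᵀy. The helper names `ridge_loss` and `ridge_fit`, the data, and the value λ = 10 are illustrative assumptions, not values from our analysis.

```python
import numpy as np


def ridge_loss(X, y, w, lam):
    """Squared-error loss plus lam times the L2 penalty (sum of squared weights)."""
    residuals = y - X @ w
    return residuals @ residuals + lam * (w @ w)


def ridge_fit(X, y, lam):
    """Closed-form minimizer of ridge_loss for centered data (no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Synthetic, centered data with two nearly collinear features.
rng = np.random.default_rng(1)
x0 = rng.normal(size=200)
x1 = x0 + 0.05 * rng.normal(size=200)
X = np.column_stack([x0, x1])
X = X - X.mean(axis=0)
y = x0 + x1 + rng.normal(size=200)
y = y - y.mean()

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)
print("OLS weights:  ", w_ols)
print("ridge weights:", w_ridge)
print("L2 norms:", np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The penalized weights have a smaller overall size than the unpenalized ones, yet neither is driven exactly to zero, which is the behavior described above.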

The final step of this analysis was running this ridge regression, going through each of the health impacts and treating each as the dependent variable in relation to all of the independent variables. The data were split into training and testing sets and scaled, and were then used to produce standard linear regression and ridge regression models, displaying the models’ coefficients and test scores for comparison. After the linear regression scores were calculated, ridge regression was run, which required calculation of an optimal lambda value, denoted as alpha in the code. A repeated k-fold cross-validation method was used for this, using the functions RepeatedKFold() and RidgeCV() (Brownlee 2020a; scikit, n.d.).
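A minimal sketch of this workflow with scikit-learn is shown below. The data are synthetic, and the test fraction, the grid of candidate alpha values, and the number of folds and repeats are assumptions chosen for illustration. The settings actually used are in our repository.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, RidgeCV
from sklearn.model_selection import RepeatedKFold, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the tract-level table: 300 rows, 4 correlated predictors.
rng = np.random.default_rng(42)
base = rng.normal(size=(300, 1))
X = base + 0.3 * rng.normal(size=(300, 4))
y = X @ np.array([1.0, 0.5, -0.5, 2.0]) + rng.normal(size=300)

# Split first, then fit the scaler on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

linear = LinearRegression().fit(X_train_s, y_train)

# Choose alpha by repeated k-fold cross-validation over an assumed grid.
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
alphas = np.logspace(-3, 3, 25)
search = RidgeCV(alphas=alphas, cv=cv, scoring="neg_mean_squared_error").fit(X_train_s, y_train)

# Refit a ridge model with the selected alpha and compare scores.
ridge = Ridge(alpha=search.alpha_).fit(X_train_s, y_train)
print("chosen alpha:", search.alpha_)
print("linear R^2 train/test:", linear.score(X_train_s, y_train), linear.score(X_test_s, y_test))
print("ridge  R^2 train/test:", ridge.score(X_train_s, y_train), ridge.score(X_test_s, y_test))
```

Fitting the scaler on the training rows only keeps information from the test rows out of the model, which is one reasonable way to order the split and scale steps.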

Once the ideal alpha value was calculated, the ridge model was fitted using alpha and the data for the independent and dependent variables, and its training and testing scores, as well as its calculated coefficients for each independent variable, were compared to those from the linear regression model. The ridge regression coefficients were plugged into the model provided above, and each of the health indicators was independently plotted against all of the independent variables with their appropriate calculated coefficients.

Finally, graphs were created plotting each individual health indicator against redlined, yellowlined, bluelined, and greenlined percentages of census tract area. In these graphs, the HOLC percentages were multiplied by their ridge regression coefficients in order to indicate how heavily weighted each HOLC grade was for each health indicator. This analysis was run for all of the health indicators, with each one being treated as a singular dependent variable to be plotted against the same independent variables each time.
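The two uses of the coefficients described above, weighting each HOLC percentage and combining all weighted terms into the model \(y = w[0]x[0] + w[1]x[1] + … + w[n]x[n]\), can be sketched with NumPy. The coefficient values, the intercept, and the three example tracts are hypothetical and chosen only to show the arithmetic. Only the HOLC columns are shown, and the environmental indicators would enter as additional columns in the same way.

```python
import numpy as np

# Hypothetical ridge coefficients for one health outcome (illustrative values only).
feature_names = ["redlined", "yellowlined", "bluelined", "greenlined"]
coefficients = np.array([0.08, 0.05, -0.01, -0.03])
intercept = 30.0   # assumed: a fitted model usually also carries an intercept term

# Percent of each tract's area in each HOLC grade (rows are tracts).
holc_percent = np.array([
    [100.0, 0.0, 0.0, 0.0],
    [20.0, 50.0, 30.0, 0.0],
    [0.0, 0.0, 40.0, 60.0],
])

# Each column scaled by its coefficient: the rescaled x axis for that grade.
weighted = holc_percent * coefficients

# Summing the weighted columns (plus the intercept) gives w[0]x[0] + ... + w[n]x[n].
predicted = intercept + weighted.sum(axis=1)

for name, column in zip(feature_names, weighted.T):
    print(f"{name:12s} weighted range: {column.min():.1f} to {column.max():.1f}")
print("predicted:", predicted)
```

A grade with a small or negative coefficient ends up with a narrow or inverted weighted range, while a grade with a larger positive coefficient has its range stretched.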

Findings and Conclusions

For the purpose of discussing our findings and conclusions, the analysis examples shown are for the health impacts of high blood pressure and chronic obstructive pulmonary disease. However, these same analyses were done for each health impact listed in the methodology, and the full code with every health impact can be found in our repository (Brown and Exline 2023). In the code, it can be seen that there are a few exceptions to all of our findings, and that the conclusions we have drawn are based on overall trends in the data.

Our first finding was that there was significant multicollinearity, indicated by VIF values greater than five in this analysis, for all of the environmental indicators. This indicates that all of the environmental indicators were strongly correlated with one another, while green and bluelined areas had minimal multicollinearity and redlined and yellowlined areas were approaching significance. This can be seen in Figure 1.

On a technical level, this indicates that environmental indicators have significantly more inter-relation than the HOLC grades do, and that green and blue lining percentages are almost entirely insignificant when inter-relation is being taken into account. While this doesn’t tell us anything concrete about these factors’ impacts on communities, it does demonstrate the complexity of their situations. Green and bluelining won’t change based on the presence of environmental indicators, nor are environmental indicators likely to change based on green and bluelining. However, environmental indicators depend heavily on each other, meaning change in one environmental indicator affects the significance of other indicators in each census tract, making it more likely for communities to suffer from cumulative environmental impacts.

From the Ridge Regression Analysis, it was found that for each of the health impacts graphed against the environmental indicators and HOLC percentages, there was an overall, semi-linear positive trend to the graphs. Examples of this can be seen in Figures 2 and 3.

The analysis of coefficients showed similar coefficients for ridge regression and standard linear regression, a comparison of which can be seen in Figures 6 and 7. In cases where the health impacts were negative, greenlined and bluelined percentages were likely to have smaller or negative coefficients in the model, indicating they had little impact or decreased the overall model. Similarly, redlined and yellowlined percentages were likely to have larger coefficients, indicating more significance to the model. The environmental indicator coefficients had less consistency than the HOLC percentage coefficients did, tending to vary depending on the health impact being analyzed. Examples of the coefficients can be seen in Figures 4 and 5.

Based on the overall ridge regression, it is reasonable to conclude that a rise in percentile of environmental indicators will lead to greater health impacts, regardless of the health impact. Negative or small coefficients do not have enough significance to combat the overall positive trend, meaning that it is reasonable to predict that proximity to a significant number of environmental indicators will lead to larger negative health outcomes.

It can also be concluded that a census tract having a high percentage of greenlining or, to a lesser extent, bluelining, makes it less likely for that census tract to have health risks for the population. Based on the coefficients, bluelining and greenlining will, at worst, have no significant impact on negative health outcomes, and at best, decrease likelihood of negative health outcomes. The inverse can be found to be true for redlined, and to a lesser extent, yellowlined tracts. There was no such consistent pattern for the coefficients of environmental indicators, which is in line with the research we did. It was found that, while some health impacts were tied to multiple environmental indicators, each indicator had unique health impacts. This is supported by our findings, as each health impact was affected differently by different environmental indicators.

On a community level, we can infer three major things from this. The first is that higher percentiles of environmental indicators for a community, and in general, higher percentages of redlining and yellowlining, will predictably lead to higher rates of negative health outcomes, regardless of that health outcome.

Second, it can be determined that redlining and yellowlining, while not the cause of negative health outcomes, are significantly tied to them in ways that bluelining and greenlining aren’t, a finding that can be used as evidence in the proposal of more equitable environmental policy.

Third, the variation of the environmental indicator coefficients can aid in the development of clearer and more effective environmental policies in areas that are at risk from multiple environmental indicators. Knowing more specifically which indicators have significant impacts on which health outcomes can help with future community planning and better development of social justice resources, especially when communities have the power to conduct their own more focused analyses.

The final part of our findings, with examples shown in Figures 8 and 9, involved graphing just the HOLC grade percentages against the health impacts and scaling the x axes, which all initially ran from zero to 100, by their calculated regression coefficients. This showed that the x axis value ranges were generally shrunk, or even inverted, for the blue and greenlined percentages. Conversely, the yellow and redlined percentages generally had their value ranges expanded, which visually indicated their larger significance to the model.

It could also be seen from the plots that there was no clear linear relationship between just the health impacts and the HOLC grade percentages, unlike in the complete regression. However, even if no good predictions can be made using a linear model, probabilistic predictions can be made. Each dot represents a census tract and its value of the selected health impact, as well as its percentage of the relevant HOLC grade. On the greenlined percentage plots generally, but referencing Figure 8 as an example, there are many census tracts with zero percent greenlining, and these tracts have a wide range of possible values for high blood pressure. However, as greenlined percentages rise, there are fewer census tracts that have high values of high blood pressure. Conversely, on the redlining graph, there is a wide distribution of high blood pressure values for census tracts at every redlining percentage.

Linear regression methods can’t be applied to this portion of the analysis to make predictions, due to the scattered nature, but statistical methods can be used to make determinations that support the conclusions made from the regression model. While living in a redlined area doesn’t guarantee someone will have more negative health outcomes, their chances of having a negative health outcome increase as percentage of redlining increases, while likelihood of negative health outcome is mostly unchanged by percentage of greenlining increasing. The likelihood of negative health impacts being tied to ‘less desirable’ HOLC grades supports the earlier conclusions, but these nonlinear graphs add a layer of sociological complexity.
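One simple way to express this kind of probabilistic reading is to group tracts by HOLC percentage and compute the share of tracts in each group whose health value exceeds a cutoff. The sketch below does this in plain Python. The tract values, the cutoff of 30, the bin edges, and the `share_high_by_bin` helper are all made up for illustration and are not results from our data.

```python
# Synthetic (greenlined percent, health prevalence) pairs, one per tract.
tracts = [
    (0, 22.0), (0, 35.0), (0, 41.0), (0, 28.0),
    (30, 24.0), (45, 33.0),
    (70, 21.0), (90, 23.0),
]
HIGH_PREVALENCE = 30.0                 # assumed cutoff for a "high" value
BINS = [(0, 1), (1, 50), (50, 101)]    # half-open percent ranges [low, high)


def share_high_by_bin(rows, bins, cutoff):
    """Fraction of tracts in each percent bin whose prevalence exceeds cutoff."""
    shares = {}
    for low, high in bins:
        values = [prev for pct, prev in rows if low <= pct < high]
        if values:
            shares[(low, high)] = sum(v > cutoff for v in values) / len(values)
    return shares


shares = share_high_by_bin(tracts, BINS, HIGH_PREVALENCE)
for (low, high), share in shares.items():
    print(f"greenlined percent in [{low}, {high}): {share:.2f} of tracts above cutoff")
```

The same grouping could be repeated for each HOLC grade and each health impact to compare how the share of high-value tracts changes across grades.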

It can’t be proven just from looking at the graphs in Figures 8 and 9 that HOLC grades are explicitly tied to negative health impacts, and that’s because the grades themselves aren’t the cause. The cause of the negative health impacts is the set of environmental indicators tied to those grades. This means that an increased percentage of a less desirable HOLC grade in a census tract increases the likelihood of being near one of the environmental indicators linked to negative health outcomes. Thus, the statistical likelihood of experiencing that negative impact increases as desirability decreases.

The issue with investigating this is the fact that, as shown by the coefficients, not every environmental indicator affects each health impact in the same way, and not every redlined or yellowlined area has each environmental indicator present within it. For example, it is more likely for an RMP (Risk Management Plan) Facility to be located in redlined areas, but this doesn’t guarantee that a redlined tract will have an RMP Facility (“EPA Emergency Response (ER) Risk Management Plan (RMP) Facilities,” n.d.).

This complexity can make it difficult to prove that redlining has any real correlation with health, because it is not the redlining itself that is the root cause of the health issues, but the compounding injustices that have been placed upon those historically redlined areas. It is easy to view the past as simply something of the past and not something that affects people every day as environmental and economic inequalities pile up in their lives. It is important that communities are empowered to bring these injustices to light for themselves, because without extensive research, inequality is allowed to become invisible, passed off as a relic of a bygone era rather than an ongoing and increasingly complex system of oppression that will require equally complex research and problem solving.

Future Work

The ultimate goal of this project is to make future work as easy as possible.

While this analysis provides information on health impacts of environmental inequalities in redlined areas, the way this study was conducted may not provide the most useful information for each individual Chicago community. Thus, our methodology and datasets have been compiled into a tool made available on GitHub so that any community can recreate the analysis to their specifications. Different data could be explored using the same methodology, which could include adding additional health impacts, or something else entirely. Suggestions for this included data on community proximity to environmental indicators or environmental education data.

To make these analyses possible, the tool contains in-depth explanations of how to use and interpret the code, the code itself, and the datasets we used or modified. The datasets in the GitHub repository include environmental indicators data from EJScreen (Agency, n.d.a) and the Chicago portion of the health data from the 2022 PLACES: Census Tract Data (“Centers for Disease Control and Prevention,” n.d.). Further health data sources are referenced in the tool, but not used in our analysis.

In this vein, there is also potential for future work due to lack of data on the Census Tract level. The Chicago Health Atlas, which we had intended to use as a principal data resource, has data at Community Area levels, but not much regarding Census Tracts. Thus, even though health impacts such as preterm birth, all cause mortality, and childhood cancer were tied to environmental factors with significant disparities, they were not included in the analysis due to lack of available data. However, if any community wanted to include them, the same methodology could be applied, and the data could be downloaded from the Chicago Health Atlas, assuming it wasn’t a census tract level analysis.

There is also potential for work to be done, either by us or by others, that is outside the purview of our toolkit. It was suggested that we could do analyses based on factors other than HOLC grade to attempt to determine more of the root causes of the inequalities, rather than just correlations. These possible analyses included using environmental education or community proximity to environmental indicators in place of HOLC grade percentage. It was also recommended that future analyses could use Marginal Mean Weighting Through Stratification (MMW-S). This is a method that adjusts for selection bias when categorizing nonexperimental data that has pre-existing covariates, that is, independent variables that can affect the outcome of a trial (Hong 2012).

Acknowledgements

We would like to thank the whole SoReMo board for their input and help with technical aspects of the project. In particular, we would like to thank Professor Petrovic for running SoReMo. We are grateful to our other fellows for their feedback and inspiration, and to Amirreza Eshraghi for his considerable help with our most tedious piece of code.

Appendix

Figure 1: VIF Values

Figure 1 shows the Variance Inflation Factor values for each of the independent variables, with values greater than five being significant in terms of multicollinearity.

Figure 2: High Blood Pressure Regression

Figure 2 shows the regression model for the health impact of high blood pressure that was plotted using the model \(y = w[0]x[0] + w[1]x[1] + … + w[n]x[n]\). This model was determined by the coefficients calculated using ridge regression for high blood pressure.

Figure 3: Chronic Obstructive Pulmonary Disease Regression

Figure 3 shows the regression model for the health impact of chronic obstructive pulmonary disease that was plotted using the model \(y = w[0] x[0] + w[1]x[1] + … + w[n] x[n]\). This model was determined by the coefficients calculated using ridge regression for chronic obstructive pulmonary disease.

Figure 4: High Blood Pressure Coefficients

These are the coefficients for each of the independent variables calculated for the high blood pressure ridge regression model.

Figure 5: Chronic Obstructive Pulmonary Disease Coefficients

These are the coefficients for each of the independent variables calculated for the chronic obstructive pulmonary disease ridge regression model.

Figure 6: High Blood Pressure Coefficient Comparison

Figure 6 shows the high blood pressure coefficient comparison for the coefficients calculated from linear regression vs the coefficients calculated by ridge regression. Ridge coefficients are in red, while linear coefficients are in blue.

Figure 7: Chronic Obstructive Pulmonary Disease Coefficient Comparison

Figure 7 shows the chronic obstructive pulmonary disease coefficient comparison for the coefficients calculated from linear regression vs the coefficients calculated by ridge regression. Ridge coefficients are in red, while linear coefficients are in green.

Figure 8: High Blood Pressure HOLC Grade Graphs

These graphs plot high blood pressure against each of the HOLC grade percentages, with the x axes scaled by each of the grades’ respective ridge coefficients in order to demonstrate their significance. Each dot represents a census tract and its HOLC percentage and high blood pressure value, and the graphs show the distribution of high blood pressure values varying with HOLC percentage.

Figure 9: Chronic Obstructive Pulmonary Disease HOLC Grade Graphs

These graphs plot chronic obstructive pulmonary disease against each of the HOLC grade percentages, with the x axes scaled by each of the grades’ respective ridge coefficients in order to demonstrate their significance. Each dot represents a census tract and its HOLC percentage and chronic obstructive pulmonary disease value, and the graphs show the distribution of chronic obstructive pulmonary disease values varying with HOLC percentage.

References

Agency, Environmental Protection. n.d.a. “Download EJScreen Data.” EPA. https://www.epa.gov/ejscreen/download-ejscreen-data.
———. n.d.b. “EJSCREEN Technical Documentation 2014 - US EPA.” EPA. https://www.epa.gov/sites/default/files/2017-09/documents/2017_ejscreen_technical_document.pdf.
———. n.d.c. “Particulate Matter (PM) Pollution.” EPA. https://www.epa.gov/pm-pollution/health-and-environmental-effects-particulate-matter-pm.
Berg, M., W. Wendel-Vos, M. Poppel, H. Kemper, W. Mechelen, and J. Maas. 2015. “Health Benefits of Green Spaces in the Living Environment: A Systematic Review of Epidemiological Studies.” Urban Forestry & Urban Greening 14 (4): 806–16. https://doi.org/10.1016/j.ufug.2015.07.008.
Brender, J. D., J. A. Maantay, and J. Chakraborty. 2011. “Residential Proximity to Environmental Hazards and Adverse Health Outcomes.” American Journal of Public Health 101 (S1). https://doi.org/10.2105/ajph.2011.300183.
Brown, N., and D. Exline. 2023. “Environmental Inequality and Redlining.” Github. https://github.com/sarrypotter237/Environmental-Inequality-and-Redlining.
Brownlee, J. 2020a. “Repeated k-Fold Cross-Validation for Model Evaluation in Python.” MachineLearningMastery.com, August. https://machinelearningmastery.com/repeated-k-fold-cross-validation-with-python/.
———. 2020b. “How to Develop Ridge Regression Models in Python.” In Machine Learning Mastery. https://machinelearningmastery.com/ridge-regression-with-python/.
“Centers for Disease Control and Prevention.” n.d. https://chronicdata.cdc.gov/500-Cities-Places/500-Cities-Census-Tract-level-Data-GIS-Friendly-Fo/k86t-wghb.
Chicago, Digital. n.d.a. “Racial Restriction and Housing Discrimination in the Chicagoland Area. Digital Chicago Lake Forest College.” https://digitalchicagohistory.org/exhibits/show/restricted-chicago/other/redlining.
———. n.d.b. “Racial Restriction and Housing Discrimination in the Chicagoland Area. Digital Chicago Lake Forest College.” https://digitalchicagohistory.org/exhibits/show/restricted-chicago/other/redlining.
DataCamp. 2022. “Lasso and Ridge Regression in Python Tutorial.” DataCamp, March. https://www.datacamp.com/tutorial/tutorial-lasso-ridge-regression.
Davies, H., and I. Kamp. 2012. “Noise and Cardiovascular Disease: A Review of the Literature 2008-2011.” Noise and Health 14 (61): 287. https://doi.org/10.4103/1463-1741.104895.
“EPA Emergency Response (ER) Risk Management Plan (RMP) Facilities.” n.d. HIFLD Open Data. https://hifld-geoplatform.opendata.arcgis.com/datasets/geoplatform::epa-emergency-response-er-risk-management-plan-rmp-facilities/explore?location=41.947847%2C-87.746910%2C9.73.
Frost, J. 2020. “Variance Inflation Factors (VIFs).” Statistics By Jim, December. https://statisticsbyjim.com/regression/variance-inflation-factors/.
Geertsma, M. 2018. “New Map Shows Chicago Needs Environmental Justice Reforms.” Natural Resources Defense Council, October. https://www.nrdc.org/bio/meleah-geertsma/new-map-shows-chicago-needs-environmental-justice-reforms.
Hong, G. 2012. “Marginal Mean Weighting Through Stratification: A Generalized Method for Evaluating Multivalued and Multiple Treatments with Nonexperimental Data.” Psychological Methods 17 (1): 44–60. https://doi.org/10.1037/a0024918.
Kihal-Talantikite, W., D. Zmirou-Navier, C. Padilla, and S. Deguen. 2017. “Systematic Literature Review of Reproductive Outcome Associated with Residential Proximity to Polluted Sites.” International Journal of Health Geographics 16 (1). https://doi.org/10.1186/s12942-017-0091-y.
McCormick, E., A. Uteuova, and T. Moore. 2022. “Revealed: The ’Shocking’ Levels of Toxic Lead in Chicago Tap Water.” The Guardian, September. https://www.theguardian.com/us-news/2022/sep/21/lead-contamination-chicago-tap-water-revealed.
Rossi, M. R. 2020. “Chicago’s History of Zoning Against Affordable Housing.” Progressive City, July. https://www.progressivecity.net/single-post/2020/07/07/CHICAGOS-HISTORY-OF-ZONING-AGAINST-AFFORDABLE-HOUSING.
scikit. n.d. “sklearn.linear_model.RidgeCV.” scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeCV.html.
White, R. 2018. “Life at the Fenceline: Understanding Cumulative Health Hazards in Environmental Justice Communities.” https://ej4all.org/life-at-the-fenceline.