Air Quality in the Pittsburgh Metropolitan Area:
An Analysis of PM2.5 Particles in the Pittsburgh Area from 1999 to 2022
Executive Summary:
This analysis of Pittsburgh's air quality from 1999 to 2022 reveals:
Significant Improvement: Overall air quality has improved markedly since 1999.
Recent Data (2012-2022): About 75% of days had "good" air quality. Across the full 1999-2022 period, more than 98% of days were acceptable.
Seasonal Pattern: Air quality consistently worsens in summer (June-August).
Geographical Consistency: PM2.5 levels vary little across the metropolitan area.
Key Stakeholders: Residents with respiratory conditions, public health officials, policymakers, environmental agencies.
Action Items:
Investigate causes of summer air quality deterioration.
Implement targeted strategies to improve summer air quality.
Enhance public awareness and health advisories for high-risk periods.
Continue efforts to increase "good" air quality days year-round.
These actions aim to further improve public health outcomes and maintain Pittsburgh's environmental progress, particularly addressing the summer air quality issues.
Background
In mid-June 2024, while procrastinating at one of Pittsburgh's libraries, I stumbled upon an advertisement for "Solar Punk Pittsburgh," an eco-tech event. Attending a week later, I was surprised to learn that Pittsburgh has some of the worst air quality in the U.S., a fact confirmed by the American Lung Association.
This revelation is both shocking and not. Pittsburgh was once known as the home of American steel manufacturing, famous for its bad air and polluted water. However, the city has transformed into a medical and tech hub, industries not typically associated with air pollution. This contrast sparked my curiosity about Pittsburgh's current air quality situation.
Objectives:
Discover patterns in Pittsburgh’s air quality by analyzing trends in PM2.5 particles.
PM2.5 particles:
According to the EPA website, PM2.5 particles are “fine inhalable particles, with diameters that are generally 2.5 micrometers and smaller. How small is 2.5 micrometers? Think about a single hair from your head. The average human hair is about 70 micrometers in diameter – making it 30 times larger than the largest fine particle.”
They further state “Particulate matter contains microscopic solids or liquid droplets that are so small that they can be inhaled and cause serious health problems. Some particles less than 10 micrometers in diameter can get deep into your lungs and some may even get into your bloodstream. Of these, particles less than 2.5 micrometers in diameter, also known as fine particles or PM2.5, pose the greatest risk to health.”
https://www.epa.gov/pm-pollution/particulate-matter-pm-basics
Dataset and Preparation
The data for this analysis was taken from the “Historic Air Quality” data set provided by the United States Environmental Protection Agency (EPA), which is available in Google BigQuery. The EPA plays a crucial role in safeguarding public health and the environment by setting national air quality standards. Its data includes annual summaries and detailed hourly and daily measurements for various categories, such as criteria gases, particulates, meteorological factors, and toxic substances. The datasets start from 1990 and are updated twice a year; the June update adds the complete data for the previous year.
Tools and Technologies:
For this analysis, I used Google BigQuery, SQL, and Google Sheets. BigQuery is great for handling large datasets and running complex queries quickly. SQL was the go-to for manipulating and querying the data, as well as for the creation of tables. Sheets was used to perform final analysis and to create visualizations.
Data Extraction and Table Creation
To facilitate the analysis, I created separate tables for each of the following years: 2002, 2007, 2012, 2017, and 2021. Five-year increments were chosen to give evenly spaced snapshots across the period of analysis. Data from 2021 was substituted for 2022, as the available data for the later year was incomplete. Each table contains daily AQI measurements for every municipality within the Pittsburgh metropolitan area. Segmenting the data by year in this way allows for year-over-year comparisons and trend analysis.
The SQL queries grouped data by city and date and calculated averages. I also created a table containing all the averaged daily AQI data from 1999 to 2022, broken down by municipality or township. Because some days contained multiple measurements for the same location, all AQI readings were averaged per day, per municipality or township.
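The per-day averaging described above was done in SQL, and the original queries are not reproduced here. As a rough illustration of the same idea, here is a minimal Python sketch. The municipality names, dates, and AQI values are invented for the example, and the function name daily_average is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw readings: (municipality, ISO date, AQI).
# All names and values are invented for illustration only.
readings = [
    ("City A", "2021-07-01", 58),
    ("City A", "2021-07-01", 62),  # second reading, same place and day
    ("City B", "2021-07-01", 70),
    ("City A", "2021-07-02", 45),
]

def daily_average(rows):
    """Collapse multiple readings into one mean AQI per (municipality, date)."""
    grouped = defaultdict(list)
    for city, day, aqi in rows:
        grouped[(city, day)].append(aqi)
    return {key: mean(values) for key, values in grouped.items()}

daily = daily_average(readings)
for (city, day), aqi in sorted(daily.items()):
    print(city, day, aqi)
```

Grouping on the (municipality, date) pair plays the role that grouping by city and date would play in a SQL query: every pair ends up with exactly one value.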
Further, I created a table containing the average daily AQI per year, per municipality or township. This allowed for analysis of year-on-year trends, broken down by local administrative unit. Additionally, a table was created with the average AQI for the entire metropolitan area, aggregated from the city-level figures, per year for the entire period.
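The two levels of yearly aggregation can be sketched as follows. This is an illustration under assumptions, not the SQL actually used: the rows are invented, the function names are hypothetical, and the metro figure is assumed to be a plain mean of the city-level yearly averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily averages: (municipality, ISO date, AQI). Values are invented.
daily_rows = [
    ("City A", "2020-01-15", 40.0),
    ("City A", "2020-07-15", 60.0),
    ("City B", "2020-07-15", 70.0),
    ("City A", "2021-07-15", 50.0),
]

def yearly_by_city(rows):
    """Mean AQI per (municipality, year)."""
    grouped = defaultdict(list)
    for city, day, aqi in rows:
        grouped[(city, int(day[:4]))].append(aqi)
    return {key: mean(values) for key, values in grouped.items()}

def yearly_metro(city_year_averages):
    """Metro-wide mean per year, taken as the mean of the city-level averages."""
    grouped = defaultdict(list)
    for (_city, year), average in city_year_averages.items():
        grouped[year].append(average)
    return {year: mean(values) for year, values in grouped.items()}

city_year = yearly_by_city(daily_rows)
metro = yearly_metro(city_year)
print(city_year)
print(metro)
```

Averaging the city averages gives every municipality equal weight, however many days it reported. Averaging all daily rows directly would instead favor municipalities with more complete records, so the choice is worth stating whichever way it goes.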
So, in a nutshell, I pulled data, combined it, and calculated averages to understand how air quality has changed over time. This helped in visualizing and analyzing the trends in the air quality of the Pittsburgh area.
Data Aggregation:
The daily AQI data was aggregated to calculate yearly averages and other summary statistics. This aggregation helps understand long-term trends and seasonal variations in air quality across different municipalities.
Data Validation:
I performed data validation checks to ensure the accuracy and completeness of the extracted data. This included verifying the consistency of AQI values and cross-referencing with known benchmarks to confirm data integrity.
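The specific checks are not listed above, so the following is only a sketch of two plausible ones: flagging AQI values outside the 0 to 500 scale, and listing calendar days with no reading. The function name validate_daily_series and the sample values are hypothetical.

```python
from datetime import date, timedelta

def validate_daily_series(rows, start, end):
    """Check one municipality's daily AQI series.

    rows: list of (date, aqi) tuples.
    Returns (out_of_range, missing_dates).
    """
    # AQI is defined on a 0-500 scale, so anything outside it is suspect.
    out_of_range = [(day, aqi) for day, aqi in rows if not 0 <= aqi <= 500]

    # Walk the calendar and report days that have no reading at all.
    seen = {day for day, _ in rows}
    span = (end - start).days + 1
    calendar = (start + timedelta(days=i) for i in range(span))
    missing = [day for day in calendar if day not in seen]
    return out_of_range, missing

# Invented sample: one impossible value and one missing day.
sample = [
    (date(2021, 1, 1), 42.0),
    (date(2021, 1, 2), 512.0),
    (date(2021, 1, 4), 38.0),
]
bad, gaps = validate_daily_series(sample, date(2021, 1, 1), date(2021, 1, 4))
print(bad)
print(gaps)
```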
Analysis and Findings
The headline is this: Pittsburgh’s air quality in terms of PM2.5 concentrations in the atmosphere has improved since 1999.
Another key takeaway is that PM2.5 air quality varied relatively little across the metropolitan area, based on data averaged across the period of analysis. On the AQI scale of 0 to 500, the long-run average for every city fell between 40 and 54. This suggests that localized factors such as median income levels and municipal policy have little effect on PM2.5 levels across localities within the Pittsburgh metro area.
Scale
0 - 50 Good
51 - 100 Moderate
101 - 150 Unhealthy for Sensitive Groups
151 - 200 Unhealthy
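To make the scale concrete, here is a small Python sketch that maps an AQI value to its category. The Good band (0 - 50) and the bands above 200 come from the EPA's published AQI scale rather than from the table above. How fractional averages such as 50.5 are treated is an assumption of this sketch: they fall into the next band up.

```python
def aqi_category(aqi):
    """Return the EPA AQI category name for a numeric AQI value."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    # Upper bound of each band, checked in ascending order.
    bands = [
        (50, "Good"),
        (100, "Moderate"),
        (150, "Unhealthy for Sensitive Groups"),
        (200, "Unhealthy"),
        (300, "Very Unhealthy"),
    ]
    for upper, name in bands:
        if aqi <= upper:
            return name
    return "Hazardous"

for value in (42, 51, 125, 180):
    print(value, aqi_category(value))
```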
Comparing the entire data set from 1999 - 2022 for the Pittsburgh metro region against the EPA air quality rating scale above, we can see that the AQI was Moderate on just over one third of all days. Only 1.4% of all days were considered unhealthy for sensitive groups, and a further 0.2% were unhealthy for the general population. Pittsburgh’s air quality was therefore acceptable (Good or Moderate) about 98.3% of the time, with a solid majority of those days ranked as good air quality days.
Number of Days per AQI ranking, 1999 - 2022
Good 42,912
Moderate 23,256
Unhealthy for Sensitive Groups 965
Unhealthy 160
When we zoom in on the most recent part of the period, 2012 - 2022, the share of good air quality days rises to about 75%. Only 8 days in that entire span were considered unhealthy for the general population.
Number of Days per AQI ranking, 2012 - 2022
Good 28,465
Moderate 9,073
Unhealthy for Sensitive Groups 80
Unhealthy 8
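The percentages quoted for both periods can be checked against the two count tables above. The sketch below simply divides each count by the table total. It treats the third row of each table as the EPA's "Unhealthy for Sensitive Groups" band and counts Good plus Moderate as "acceptable", both of which are readings of the text rather than anything stated in the data.

```python
# Day counts copied from the two tables above.
counts_1999_2022 = {
    "Good": 42912,
    "Moderate": 23256,
    "Unhealthy for Sensitive Groups": 965,
    "Unhealthy": 160,
}
counts_2012_2022 = {
    "Good": 28465,
    "Moderate": 9073,
    "Unhealthy for Sensitive Groups": 80,
    "Unhealthy": 8,
}

def shares(counts):
    """Percentage of all counted days in each category, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

def acceptable_share(counts):
    """Percentage of days rated Good or Moderate."""
    total = sum(counts.values())
    return round(100 * (counts["Good"] + counts["Moderate"]) / total, 1)

for label, counts in (("1999-2022", counts_1999_2022), ("2012-2022", counts_2012_2022)):
    print(label, shares(counts), acceptable_share(counts))
```

For 1999 - 2022 this gives roughly 63.8% Good, 34.6% Moderate, and 98.3% acceptable. For 2012 - 2022 it gives roughly 75.7% Good and 99.8% acceptable. The totals are far larger than the number of calendar days in either period, which suggests the counts are per municipality per day.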
A more puzzling but intriguing pattern also emerges from the data. AQI averages spike between approximately June and August of each year, with the strongest spikes visible in 2002 and 2021 among the sampled years. Conversely, PM2.5 concentrations drop around April, and again around September and October.
Not being a climate scientist, I was not able to determine the cause of these seasonal fluctuations. What I can recommend, however, is that people with lung conditions or similar respiratory concerns exercise additional caution during the summer months. According to airnow.gov, a US government air quality website, moderate AQI conditions (between 51 and 100) mean “Air quality is acceptable. However, there may be a risk for some people, particularly those who are unusually sensitive to air pollution.” In Pittsburgh, AQI is most often at or above 51 between June and August.
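One way to surface this seasonal pattern is to group the daily values by calendar month and look at both the mean AQI and the share of days above the Good band. The sketch below is illustrative only: the sample values are invented, the function name monthly_profile is hypothetical, and the threshold of 50 is taken from the EPA scale.

```python
from collections import defaultdict
from statistics import mean

def monthly_profile(rows, threshold=50):
    """rows: (ISO date, AQI) pairs.

    Returns {month: (mean AQI, share of days with AQI above threshold)}.
    """
    by_month = defaultdict(list)
    for day, aqi in rows:
        by_month[int(day[5:7])].append(aqi)
    return {
        month: (mean(values), sum(v > threshold for v in values) / len(values))
        for month, values in sorted(by_month.items())
    }

# Invented sample values, not taken from the EPA data.
sample_days = [
    ("2021-04-10", 35),
    ("2021-04-20", 45),
    ("2021-07-10", 65),
    ("2021-07-20", 48),
]
profile = monthly_profile(sample_days)
for month, (average, share_above) in profile.items():
    print(month, average, share_above)
```

Run over a full year of daily averages, a profile like this would show whether June through August stand out on both measures, as the sampled years suggest.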
Potential Weaknesses
For this project, for the purposes of my own learning and practice, I used only SQL and Google Sheets to perform basic analysis of a very large data set. More in-depth analysis could be performed using other tools, such as Python, R, or Tableau. This is not a scientific report and should not be taken as one.
Conclusion and Recommendations
In summary, Pittsburgh is actually doing quite well in terms of its air quality. While one would hope that all days, or nearly all days, would be considered “good” air quality days, the metro area has made huge strides in the past few decades. In a city once synonymous with soot-covered buildings, dirty air, and polluted rivers, Pittsburgh has come a long way. There is still room for improvement, as ideally we would have an even greater number of days with good air quality.
The seasonal trends in AQI suggest that the greatest number of moderate and unhealthy-for-sensitive-groups days fall in the summer months. This leaves us with a key question: why are the summer months the worst for air quality? These are among the rainiest months in Pittsburgh, which in principle should reduce the concentration of PM2.5 particles in the atmosphere. Yet the opposite is taking place.