Trace metals (e.g., Cu, Mn, and Fe) in fine particulate matter (PM2.5) are notorious for their adverse
health effects on human beings. Studies suggest that they play significant roles in the production of
reactive oxygen species when inhaled into the lungs and that the consequent oxidative stress is one
probable cause of PM2.5-induced organ injuries. As a result, it is of great scientific interest to investigate
the spatial distributions and major sources of trace metals in PM2.5. This thesis presents
an observation-constrained hybrid model for regional source apportionment of trace metals in PM2.5,
in which observation data are combined with source contributions to primary PM2.5 (PPM2.5) from the
Community Multiscale Air Quality (CMAQ) Model. As part of the hybrid model, two novel statistical
methods are proposed to address the inadequacy of traditional techniques. Their
applicability is tested with different real-world cases.
First, we describe a Bayesian inference (BI) approach to determine the primary organic carbon
(POC) and secondary organic carbon (SOC) levels using only major species measurement data, i.e.,
organic carbon (OC), elemental carbon (EC), and secondary inorganic aerosol (SIA) species. Traditional
methods to quantify POC and SOC in observation data such as Positive Matrix Factorization (PMF)
largely rely on the availability of source-specific organic tracer measurements. Other techniques that do
not require such extensive measurements first determine an appropriate POC/EC ratio, using either the minimum
OC/EC ratio (MIN) principle or the minimum R squared (MRS) method. Here, our BI model determines
appropriate POC/EC and SOC/SIA values using concentration and uncertainty data of major species. A Markov
chain Monte Carlo technique is employed to numerically estimate the optimal ratio values. Two case
studies are conducted to test its applicability. One consists of filter-based daily observation data collected
in the Pearl River Delta (PRD) region, China during 2012, while the other uses online measurement data
recorded at the Dianshan Lake (DSL) monitoring site in Shanghai in wintertime 2019. In both cases, source-specific
organic tracer measurements are available and PMF results have been reported
in previous publications, which allows us to regard PMF-resolved POC and SOC as references for model
evaluation. Meanwhile, traditional techniques such as elemental carbon tracer methods (both MIN and
MRS) and multiple linear regression (MLR) are also employed. The MLR and BI methods are each further
divided into 4 scenarios depending on the tracer(s) used for SOC, giving 10 models in total. In both case
studies, BI models show significant advantages in estimating POC and SOC levels in terms of their
comparability to PMF results. For the PRD 2012 data, the BI model with sulfate as the SOC tracer, denoted as
BI-SO4, gives the best correlation R for POC (0.852) and SOC (0.926) among all models evaluated, while
the BI-NH4 model has the smallest mean fractional errors (MFEs) for POC (0.244) and SOC (0.385). For
the DSL 2019 dataset, the BI-SO4 model returns the lowest errors for both POC (0.275) and SOC (0.260), while
the best correlations for POC (0.880) and SOC (0.907) are seen in the BI-NH4 model outcomes. We
conclude that BI-SO4 is the most reliable one given the higher measurement quality of sulfate data. In
the hybrid model, the primary PM2.5 (PPM2.5) concentration in ambient measurement data
is derived from the POC and SOC information and is then used to improve CMAQ model performance.
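As a rough illustration of the ratio-estimation idea described above, the sketch below samples POC/EC and SOC/SIA ratios with a random-walk Metropolis algorithm, one simple member of the Markov chain Monte Carlo family. It is not the thesis's BI model: the Gaussian likelihood, the flat prior on positive ratios, the synthetic data, and all names and tuning values (`log_likelihood`, `metropolis`, `step`, `n_iter`, `burn_in`) are assumptions made only for this example.

```python
import math
import random

def log_likelihood(a, b, ec, sia, oc, sigma):
    """Gaussian log-likelihood (up to a constant) of OC = a*EC + b*SIA."""
    return -0.5 * sum(((o - a * e - b * s) / sg) ** 2
                      for e, s, o, sg in zip(ec, sia, oc, sigma))

def metropolis(ec, sia, oc, sigma, n_iter=20000, burn_in=5000, step=0.02, seed=0):
    """Random-walk Metropolis sampler for (POC/EC, SOC/SIA) with a flat positive prior."""
    rng = random.Random(seed)
    a, b = 1.0, 1.0                      # arbitrary starting ratios
    ll = log_likelihood(a, b, ec, sia, oc, sigma)
    samples = []
    for i in range(n_iter):
        a_new = a + rng.gauss(0.0, step)
        b_new = b + rng.gauss(0.0, step)
        if a_new > 0 and b_new > 0:      # proposals outside the prior support are rejected
            ll_new = log_likelihood(a_new, b_new, ec, sia, oc, sigma)
            # 1 - random() lies in (0, 1], so the logarithm is always defined
            if math.log(1.0 - rng.random()) < ll_new - ll:
                a, b, ll = a_new, b_new, ll_new
        if i >= burn_in:
            samples.append((a, b))
    return samples

# Hypothetical synthetic data generated with POC/EC = 2.0 and SOC/SIA = 0.3
ec = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 2.0, 5.0]
sia = [10.0, 4.0, 12.0, 3.0, 8.0, 5.0, 15.0, 14.0]
oc = [2.0 * e + 0.3 * s for e, s in zip(ec, sia)]
sigma = [0.3] * len(oc)                  # assumed OC measurement uncertainty

samples = metropolis(ec, sia, oc, sigma)
a_hat = sum(s[0] for s in samples) / len(samples)   # posterior mean POC/EC
b_hat = sum(s[1] for s in samples) / len(samples)   # posterior mean SOC/SIA
poc = [a_hat * e for e in ec]
soc = [b_hat * s for s in sia]
```

With ratios in hand, POC and SOC follow by scaling the EC and SIA series, which is the quantity a comparison against PMF-resolved values would use.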
Second, we demonstrate a novel regression model using a log-transformed objective function to
better cope with the ubiquitous lognormality in atmospheric concentration data. First, we prove that
using a log-transformed objective function is equivalent to solving a regression model with
multiplicative log-normal error terms under a paradigm of maximum likelihood estimation (MLE). MLE
also allows us to quantify the estimation errors based on asymptotic normality. Second, we apply our
new model to a case study in which one year of black carbon observations is regressed on PPM2.5
source contributions from CMAQ at 12 background sites across China in 2017. Results from our new
approach have better compatibility, while residual analysis of results from the commonly used ordinary least
squares (OLS) method shows clear heteroscedasticity, i.e., a violation of its model assumption. Finally, the
new method is further demonstrated to have clear advantages in numerical simulation experiments of a
5-variable multiple linear regression model using synthesized data with prescribed coefficients and
lognormally distributed multiplicative errors. Under all 9 simulation scenarios, the new method yields
the most accurate estimations of the regression coefficients and has significantly higher coverage
probability (on average, 95% for all five coefficients) than the OLS (79%) and weighted least squares (WLS,
72%) methods. This new technique provides a powerful statistical tool to solve the regression equations
in the hybrid model.
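To make the log-transformed objective concrete, the sketch below fits a two-source model y = (b1*x1 + b2*x2) * error by minimizing the sum of squared differences between ln(observed) and ln(predicted). Under multiplicative lognormal errors this is the same as maximizing the likelihood. The Gauss-Newton solver with step halving, the two-variable restriction, the function names, and the noise-free synthetic data are all assumptions chosen for brevity, not the thesis's implementation.

```python
import math

def log_objective(beta, X, y):
    """Sum of squared differences between ln(observed) and ln(predicted)."""
    total = 0.0
    for row, obs in zip(X, y):
        pred = sum(b * x for b, x in zip(beta, row))
        total += (math.log(obs) - math.log(pred)) ** 2
    return total

def fit_log_regression(X, y, beta0=(1.0, 1.0), n_iter=50):
    """Gauss-Newton fit of two coefficients under the log-transformed objective."""
    b1, b2 = beta0
    for _ in range(n_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for (x1, x2), obs in zip(X, y):
            pred = b1 * x1 + b2 * x2
            resid = math.log(obs) - math.log(pred)
            j1, j2 = x1 / pred, x2 / pred     # derivatives of ln(pred) w.r.t. b1, b2
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * resid
            g2 += j2 * resid
        det = a11 * a22 - a12 * a12
        d1 = (a22 * g1 - a12 * g2) / det      # solve the 2x2 normal equations
        d2 = (a11 * g2 - a12 * g1) / det
        current = log_objective((b1, b2), X, y)
        step = 1.0
        while step > 1e-8:
            c1, c2 = b1 + step * d1, b2 + step * d2
            positive = all(c1 * x1 + c2 * x2 > 0 for x1, x2 in X)
            if positive and log_objective((c1, c2), X, y) <= current:
                b1, b2 = c1, c2
                break
            step /= 2                         # halve the step and try again
        else:
            break                             # no acceptable step: treat as converged
    return b1, b2

# Hypothetical noise-free data with prescribed coefficients (0.5, 2.0)
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 7.0)]
y = [0.5 * x1 + 2.0 * x2 for x1, x2 in X]
beta_hat = fit_log_regression(X, y)
```

Because the residuals are formed in log space, a given relative error counts equally at low and high concentrations, whereas an OLS objective is dominated by the largest observations.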
Finally, we discuss the model details of the observation-constrained hybrid model. In this method,
source contributions to PPM2.5 from the CMAQ Model at each monitoring location are first improved to
align better with the PPM2.5 levels from observation data by applying source-specific scaling factors.
These factors are estimated from a regularized regression method, which is an improved version of the
regression model introduced before. The adjusted PPM2.5 predictions and speciation measurements are
then used to generate region-specific observation-based source profiles of primary species (i.e., trace
metals, EC, and POC) using regression. Lastly, spatial distributions of the source contributions are
produced by multiplying the improved CMAQ PPM2.5 contributions with the deduced source profiles.
The model is applied to the PRD region, China, using data collected at multiple stations in 2015 to
resolve source contributions to 18 elements, EC, and POC. Including the penalty term in the multilinear
regression leads to more scaling factors for the modeled PPM2.5. The source profiles determined in this
study compare well with those collected from the literature. In terms of the source apportionment results
in the PRD region, Cu is mainly from area sources (37.9%), power generation (31.3%), and the industry
sector (14.0%), with annual average concentrations as high as 50 ng m⁻³ in some districts. Meanwhile,
the major contributors to Mn are sources outside the PRD (25.7%), power plant (25.4%), and marine vessel
(14.9%) emissions, leading to a mean level of around 10 ng m⁻³.
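The final multiplication step of the hybrid model can be sketched as below, assuming that each source's CMAQ PPM2.5 contribution is given in µg m⁻³ and each source profile entry is a species mass per unit PPM2.5 in ng µg⁻¹, so that the product is in ng m⁻³. The source names, numbers, and function names are hypothetical placeholders, not values from the thesis.

```python
def apportion(ppm_by_source, scaling, profile):
    """Species contribution per source: scaled PPM2.5 times the source profile entry."""
    return {src: ppm_by_source[src] * scaling[src] * profile[src]
            for src in ppm_by_source}

def percent_shares(contributions):
    """Convert absolute contributions into percentage shares of the total."""
    total = sum(contributions.values())
    return {src: 100.0 * value / total for src, value in contributions.items()}

# Hypothetical inputs for one grid cell and one species
ppm = {"power": 4.0, "industry": 6.0, "area": 2.0}       # CMAQ PPM2.5, ug m^-3
scaling = {"power": 1.5, "industry": 0.5, "area": 2.0}   # source-specific scaling factors
profile = {"power": 2.0, "industry": 1.0, "area": 0.5}   # species per PPM2.5, ng ug^-1

contrib = apportion(ppm, scaling, profile)                # ng m^-3 per source
shares = percent_shares(contrib)
```

Repeating this calculation for every grid cell would yield a spatial field of source contributions for each species.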