THESIS
2023
1 online resource (viii, 49 pages) : color illustrations
Abstract
Organizations rely on automated algorithmic decisions every day, so developing
efficient decision methods is a top priority. Organizational efficiency is a complex
target because several aspects must be optimized simultaneously. These include, but
are not limited to, algorithmic fairness (minimizing systemic bias against
disadvantaged social groups) and economic efficiency (minimizing the cost induced
by wrong decisions).
With these targets in mind, we adopt a dual-focused Neyman-Pearson (NP) classification
paradigm, which seeks the minimal type II error under simultaneous control of both
the type I error and the fairness bias. By leveraging an LDA model, we develop for
the first time an oracle framework for dual-focused NP classification. In particular, we
propose finite-sample-based classifiers that, with high probability, satisfy both the
fairness constraint and the type I error constraint at the population level, and we derive
oracle bounds on the excess type II error. Notably, the new classifier does not require
sample splitting, which most existing NP methods need, and thus further increases
data efficiency. Numerical and real-data analyses demonstrate its superiority.
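
For orientation, the constrained problem described in the abstract can be written schematically as below. This is only a sketch: the symbols $\phi$, $R_0$, $R_1$, $\alpha$, $\varepsilon$, $\delta$ and the bias measure $B$ are notation assumed here for illustration, and the abstract does not state which fairness metric the thesis uses or whether the two constraints are guaranteed jointly or separately.

```latex
% Assumed notation: \phi is a classifier, Y = 0 / Y = 1 are the two classes,
% B(\phi) is an unspecified measure of bias against fairness.
\begin{aligned}
\min_{\phi}\quad
  & R_1(\phi) = \Pr\bigl(\phi(X) = 0 \mid Y = 1\bigr)
  && \text{(type II error)} \\
\text{subject to}\quad
  & R_0(\phi) = \Pr\bigl(\phi(X) = 1 \mid Y = 0\bigr) \le \alpha
  && \text{(type I error control)} \\
  & \lvert B(\phi) \rvert \le \varepsilon
  && \text{(fairness control)}
\end{aligned}

% One possible reading of the high-probability, population-level guarantee
% for a classifier \hat{\phi} built from a finite sample:
\Pr\Bigl( R_0(\hat{\phi}) \le \alpha \ \text{and}\ \lvert B(\hat{\phi}) \rvert \le \varepsilon \Bigr) \ge 1 - \delta
```

Under this reading, the excess type II error is the gap $R_1(\hat{\phi}) - R_1(\phi^{*})$, where $\phi^{*}$ denotes an oracle solution of the constrained problem.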