THESIS
2015
ix, 44 pages : illustrations ; 30 cm
Abstract
Class membership probability estimates are important for many applications, especially Click-Through Rate (CTR) prediction in online advertising, where classification outputs are combined with other sources, such as bid price, for decision-making. Existing calibration models can learn a mapping function from predicted probabilities to empirical CTRs and thus reduce the systematic bias (the difference between the average predicted and observed CTRs on some slices of the data). However, current methods have some theoretical issues, and the classifier used in display advertising has special properties that they do not take into account. To address these limitations, this thesis proposes a model called Calibration Trees (CT) as a post-processing step that calibrates the bias of predictions. CT scales to large data sets and is robust to extremely imbalanced data. Experimental results on two display-advertising data sets show that the proposed model significantly outperforms state-of-the-art calibration models in terms of accuracy and calibration quality. An advanced version of CT, called Calibration Forest, can also be implemented in a distributed system and further improves prediction performance.
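
The calibration mapping described in the abstract can be illustrated with a minimal sketch. The code below is not the thesis's Calibration Trees algorithm; it uses scikit-learn's DecisionTreeRegressor as a stand-in calibration map, and the data, variable names, and tree settings are hypothetical. It only shows the general idea: fit a tree on held-out data so that each leaf groups a slice of raw predicted probabilities and outputs the empirical CTR of that slice, shrinking the systematic bias.

# Illustrative sketch of tree-based probability calibration (assumed setup,
# not the thesis's Calibration Trees). A regression tree maps raw predicted
# probabilities to empirical click rates on held-out data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical held-out data: raw classifier scores and observed click labels.
raw_scores = rng.uniform(0.0, 0.2, size=10_000)            # predicted CTRs
clicks = rng.binomial(1, np.clip(raw_scores * 0.5, 0, 1))  # observed clicks (biased low)

# Calibration map: each leaf covers a slice of scores and predicts the
# empirical CTR of that slice.
calibrator = DecisionTreeRegressor(max_depth=4, min_samples_leaf=500)
calibrator.fit(raw_scores.reshape(-1, 1), clicks)

# Calibrated probabilities for new predictions.
new_scores = np.array([[0.05], [0.10], [0.15]])
print(calibrator.predict(new_scores))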