THESIS
2020
1 online resource (viii, 33 pages) : illustrations (some color)
Abstract
K-nearest neighbors (KNN) has been successfully used for recommendation, but retrieving high-quality neighbors is difficult when the feature space is small and the training data are limited. Pooling data across owners would help, but due to privacy requirements and government policies, directly transferring data from one data owner to another is not feasible. We therefore propose a novel KNN approach, secured federated KNN (SF-KNN), which takes privacy requirements into account and builds a federated model that obtains global neighbors across the participating parties, improving model performance. Specifically, it enables parties to train high-quality models from little data. More importantly, it makes cross-domain training possible. We implement SF-KNN with both Euclidean and cosine metrics, using user-based and item-based methods. In our experiments, we evaluate the proposed SF-KNN against various baselines on three data sources (MovieLens, Netflix, and Amazon) spanning several diverse domains (movies, books, clothes, jewellery, and food). The results indicate that SF-KNN learns more precise neighbors than a local KNN trained by each party individually. It outperforms the local KNN on all of the datasets, achieving an average accuracy gain of 8.24% with the Euclidean metric and 4.40% with the cosine metric when simulating 10 parties across all data sources.
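
The abstract summarizes rather than specifies the protocol, but the general idea of federated neighbor search can be sketched: each party ranks candidate neighbors on its own data under a Euclidean or cosine metric, and a coordinator merges the per-party candidates into a global top-k list. The sketch below is a minimal, unsecured illustration of that idea under assumed details, not the thesis's actual SF-KNN protocol; the names local_topk and federated_topk are hypothetical, and the secure-computation layer that protects the exchanged values is omitted.

    import numpy as np

    def euclidean_dist(q, X):
        # Euclidean distance from query vector q to each row of X
        return np.linalg.norm(X - q, axis=1)

    def cosine_dist(q, X):
        # Cosine distance: 1 - cosine similarity
        sims = (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q) + 1e-12)
        return 1.0 - sims

    def local_topk(q, X, k, metric=euclidean_dist):
        # Each party ranks its own rows against the query and returns
        # only its k best (distance, local_index) pairs.
        d = metric(q, X)
        idx = np.argsort(d)[:k]
        return [(d[i], i) for i in idx]

    def federated_topk(q, parties, k, metric=euclidean_dist):
        # Coordinator merges per-party candidates into global top-k
        # (distance, party_id, local_index) triples. In SF-KNN the
        # exchanged values would be secured; here they are sent in
        # the clear purely for illustration.
        candidates = []
        for pid, X in enumerate(parties):
            for dist, i in local_topk(q, X, k, metric):
                candidates.append((dist, pid, i))
        candidates.sort(key=lambda t: t[0])
        return candidates[:k]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        parties = [rng.random((50, 8)) for _ in range(10)]  # 10 simulated parties
        query = rng.random(8)
        print(federated_topk(query, parties, k=5, metric=cosine_dist))

Because each party sends only k candidate distances rather than its raw data, the merged global top-k matches what a single pooled KNN would return under the same metric, which is the sense in which a federated model can recover better neighbors than any one party's local KNN.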