THESIS
2025
1 online resource (xviii, 197 pages) : illustrations (chiefly color)
Abstract
As artificial intelligence continues to permeate various domains, the challenges of data scarcity and privacy preservation become increasingly significant. Federated Learning (FL) provides a framework for collaborative model training across organizations while protecting data privacy. In particular, Vertical Federated Learning (VFL) addresses the distinctive challenges that arise when data are vertically partitioned among participants. However, VFL faces heightened privacy risks and inefficiencies stemming from the pervasive exposure of data and model information. This thesis proposes a minimum-exposure approach to trustworthy VFL that strategically identifies and exposes only the minimum-necessary information, thereby optimizing the trade-offs among multiple objectives of trustworthiness, including privacy, utility, robustness, and efficiency. The thesis categorizes information exposure into data exposure and model parameter exposure.

First, we address intra-sample label exposure in VFL with a two-phase framework: offline-phase cleansing and training-phase perturbation. Our proposed Label Privacy Source Coding (LPSC) encodes the minimum-necessary label information in the offline phase; adversarial training then further enhances privacy during training.

Second, we explore a more challenging VFL scenario with arbitrarily aligned samples across parties. To tackle this challenge, we introduce the Complementary Knowledge Distillation (CKD) framework, which enables privacy-preserving knowledge transfer among passive parties while minimizing intra-sample information exposure.

Third, we address inter-sample information exposure with a secure vertical federated dataset condensation (VFDC) framework, which efficiently condenses the entire real dataset in VFL into a small synthetic dataset, reducing the inter-sample information exposure that could compromise privacy while maintaining model utility.

Finally, we tackle model parameter exposure in heterogeneous federated transfer learning with PP-HFTL, a privacy-preserving framework that transfers knowledge securely using cryptographic methods. PP-HFTL integrates the transferred models to reduce parameter exposure and to allow purely local inference, eliminating the need for secure cross-party inference.

Extensive experiments on real-world datasets demonstrate the effectiveness and efficiency of our approaches, which outperform existing baselines across these objectives.
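To make the intra-sample label exposure problem concrete, the following is a minimal sketch of a two-party VFL training step in PyTorch, assuming a split-model setup in which a passive party sends embeddings and the active party holds the labels. The model names, sizes, and the simple Gaussian noise are illustrative assumptions; the sketch shows where label information leaks through the returned gradients and where a training-phase perturbation would apply, not the LPSC encoding itself.

```python
# Minimal two-party VFL round illustrating where label information leaks:
# the gradients the active party returns to a passive party are a function
# of the labels. A noise term sketches training-phase perturbation.
# All names and sizes here are illustrative, not from the thesis.
import torch
import torch.nn as nn

passive_bottom = nn.Linear(16, 8)           # passive party: features only
active_top = nn.Linear(8, 2)                # active party: holds the labels
criterion = nn.CrossEntropyLoss()

x_passive = torch.randn(32, 16)             # passive party's feature slice
y = torch.randint(0, 2, (32,))              # labels, seen only by the active party

h = passive_bottom(x_passive)               # embedding sent to the active party
h_active = h.detach().requires_grad_(True)  # active party's view of it
loss = criterion(active_top(h_active), y)
loss.backward()

grad = h_active.grad                        # gradient returned to the passive party:
                                            # this is the intra-sample label exposure
noisy_grad = grad + 0.1 * torch.randn_like(grad)  # training-phase perturbation (sketch)
h.backward(noisy_grad)                      # passive party updates its bottom model
```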
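The knowledge-transfer step underlying a framework like CKD can be sketched with a standard temperature-scaled distillation loss: a passive party learns from another party's soft predictions rather than its raw features, so only output-level information crosses the party boundary. The function name, temperature, and shapes below are assumptions for illustration; the thesis's complementary distillation scheme is more involved.

```python
# Sketch of a knowledge-distillation step: a passive party matches another
# party's temperature-softened outputs, so only the minimum-necessary
# output information is exchanged. Shapes and T=2.0 are assumptions.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Standard KL-based distillation loss on temperature-softened outputs."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

teacher_logits = torch.randn(32, 2)     # received from the other passive party
student_logits = torch.randn(32, 2, requires_grad=True)
loss = distill_loss(student_logits, teacher_logits)
loss.backward()
```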
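For intuition about dataset condensation, here is a minimal single-party sketch of gradient matching, a common condensation technique in which a small synthetic set is optimized so that its training gradients mimic those of the real data. The secure, vertically partitioned machinery of VFDC is not modeled here; the network size, step count, and synthetic-set size are arbitrary assumptions.

```python
# Sketch of dataset condensation by gradient matching: learn a tiny
# synthetic set whose gradients on a network match those of the real
# data, so downstream training never touches the real samples.
# Single-party, non-secure version for illustration only.
import torch
import torch.nn as nn

net = nn.Linear(16, 2)
criterion = nn.CrossEntropyLoss()
params = tuple(net.parameters())

x_real = torch.randn(256, 16)
y_real = torch.randint(0, 2, (256,))
g_real = torch.autograd.grad(criterion(net(x_real), y_real), params)

x_syn = torch.randn(8, 16, requires_grad=True)   # 8 synthetic samples
y_syn = torch.randint(0, 2, (8,))
opt = torch.optim.Adam([x_syn], lr=0.01)

for _ in range(100):
    g_syn = torch.autograd.grad(criterion(net(x_syn), y_syn),
                                params, create_graph=True)
    opt.zero_grad()
    sum(((a - b) ** 2).sum() for a, b in zip(g_syn, g_real)).backward()
    opt.step()
```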
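Finally, one standard cryptographic building block for limiting model parameter exposure is additive secret sharing: each party splits its parameters into random shares so that only the integrated sum is ever reconstructed and no single party's parameters are revealed. The sketch below is a generic illustration under a fixed-point encoding assumption, not PP-HFTL's actual protocol.

```python
# Sketch of additive secret sharing over a prime field: shares of each
# party's parameter look uniformly random, and only the aggregate is
# reconstructed. The 3-party setup and fixed-point scale are assumptions.
import random

P = 2**61 - 1          # Mersenne prime modulus for share arithmetic
SCALE = 10**6          # fixed-point scaling for float parameters

def share(value, n_parties):
    """Split an integer-encoded value into n additive shares mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def encode(x):
    return int(round(x * SCALE)) % P

def decode(v):
    return (v if v <= P // 2 else v - P) / SCALE

# Each party secret-shares one parameter; only the sum is reconstructed.
params = [0.25, -0.10, 0.40]                      # one parameter per party
all_shares = [share(encode(p), 3) for p in params]
sum_shares = [sum(col) % P for col in zip(*all_shares)]
integrated = decode(sum(sum_shares) % P)
print(integrated)                                  # 0.55; no single parameter revealed
```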