Network data analysis and its applications

HKUST Electronic Theses

Network data analysis and its applications

by Xianshi Yu

THESIS 2018

Ph.D. Mathematics

xvi, 99 pages : illustrations ; 30 cm

Abstract

In this thesis, I study two problems about network data. The first problem is about network community detection and the second problem is about the application of network data in recommender systems.

In the first part of this thesis, we propose a new method for network community detection, with an emphasis on sparse networks. Previous methods focused mainly on dense scenarios, where the average degree of nodes E(degree) = Ω(log n). Recently, sparse networks have received increasing attention amongst researchers. The new proposed method uses the symmetrized Laplacian inverse matrix (SLIM) to measure the closeness between nodes. The idea comes from the first hitting time in random walk, and it also has a nice interpretation in diffusion maps. Community membership is acquired by applying the spectral method to the SLIM. We show that the SLIM ensures the misclassification rate of order O(log(max nP_i,j)/max nP_i,j), for the stochastic block model (SBM) as sparse as E(degree) → ∞, with edge probability matrix P. The SLIM outperforms other methods in many simulations and real datasets. It's robust with respect to the choice of tuning parameter, in contrast to normalized spectral clustering with regularization.

In the second part of this thesis, we present the NetRec method to leverage the network data in collaborative filtering (CF). The recent prevalence of online social networks provides valuable side information to the technique of CF. Previous efforts to exploit network data lack theoretical justifications that measure the improvement achieved by adding the relational data. Moreover, all of them find a local optimum of the proposed objective function. In this thesis, we formulate a convex optimization problem by including a new network related term, which is particularly designed to confine the introduced bias. Our main results demonstrate that the inclusion of the designed term leads to a sharper error bound than before, as long as the observed users social network is consistent with their preferences and is well structured. We point out that the larger the noise in the observed user preferences, the larger the reduction in the level of error bound. Moreover, our theory shows that the combination of the network related term and the nuclear norm term obtains estimates better than those achieved by any of them alone. We provide an algorithm to solve our new objective function and prove that it is guaranteed to find the global optimum. Both simulations and real data experiments are carried out to validate our theoretical findings. Data on Yelp.com is used to test the new method and substantial improvements achieved by using the social network data are observed.

[ Hide abstract ]

View Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection HKUST Electronic Theses Degree Ph.D. Department Mathematics Supervisors Jing, Bing-Yi Authors Yu, Xianshi Subjects Social networks Mathematical models Recommender systems (Information filtering) Data processing Communication Network analysis Signal processing Information society Language English Call number Thesis MATH 2018 Yu DOI 10.14711/thesis-991012655568503412

Full record

Network data analysis and its applications

by Xianshi Yu

Post a Comment Cancel reply