THESIS
2012
xii, 55 p. : ill. ; 30 cm
Abstract
Detecting clusters or communities in complex networks is a hot topic in machine learning and data mining. Many community detection algorithms that discover disjoint or non-overlapping communities have been proposed. However, the community structure of many networks consists of overlapping communities, making many community detection methods based on the overly strong assumption of having non-overlapping communities inadequate for practical use. Along the recent research direction of extending community detection methods for overlapping communities, we propose a probabilistic model which infers the latent membership vector for each entity in the network for its robustness especially when dealing with sparse networks. Computational advantages come from the use of conjugate priors for the...[
Read more ]
Detecting clusters or communities in complex networks is a hot topic in machine learning and data mining. Many community detection algorithms that discover disjoint or non-overlapping communities have been proposed. However, the community structure of many networks consists of overlapping communities, making many community detection methods based on the overly strong assumption of having non-overlapping communities inadequate for practical use. Along the recent research direction of extending community detection methods for overlapping communities, we propose a probabilistic model which infers the latent membership vector for each entity in the network for its robustness especially when dealing with sparse networks. Computational advantages come from the use of conjugate priors for the latent variables and parameters in the model. We seek to find the maximum a posteriori (MAP) estimate of the latent membership matrix for simplicity and efficiency. Moreover, compared with a related recent model, the likelihood function in our model is more suitable for many network datasets and applications.
In addition to detecting overlapping communities in networks with only non-negative links, we also consider an extension of our model that takes into account the existence of negative links. Networks that contain both positive and negative links are called signed networks. While positive links may represent the similarity relationship among network entities, negative links display the dissimilarity relationship. This extension amounts to generalizing the expression of the likelihood term to adapt to the problem of overlapping community detection in signed networks.
For experimental validation, we report extensive experiments on small toy datasets as well as larger synthetic datasets to compare our model against a closely related probabilistic method for overlapping community detection in unsigned networks. We compared the signed version of our method against a closely related baseline proposed by ourselves. Moreover, we also perform experiments on a few real-world datasets in the comparative study of overlapping community detection in unsigned networks.
Post a Comment