Community Detection Based on Structure and Content: A Content Propagation Perspective

Abstract

With the recent advances in information networks, the problem of identifying group structure or communities has received a significant amount of attention. Most of the existing principles of community detection or clustering mainly focus on either the topological structure of a network or the node attributes separately, while both of the two aspects provide valuable information to characterize the nature of communities. In this paper we combine the topological structure of a network as well as the content information of nodes in the task of detecting communities in information networks. Specifically, we treat a network as a dynamic system and consider its community structure as a consequence of interactions among nodes. To model the interactions we introduce the principle of content propagation and integrate the aspects of structure and content in a network naturally. We further describe the interactions among nodes in two different ways, including a linear model to approximate influence propagation, and modeling the interactions directly with random walk. Based on interaction modeling, the nature of communities is described by analyzing the stable status of the dynamic system. Extensive experimental results on benchmark datasets demonstrate the superiority of the proposed framework over the state of the art.

Publication
the 2015 IEEE International Conference on Data Mining (ICDM 2015)
Avatar
Liyuan Liu
Senior Researcher @ MSR

Understand the underlying mechanism of pretraining heuristics.