THESIS
2014
ix, 51 pages : illustrations ; 30 cm
Abstract
As our lives become more and more inseparable from social networks, our judgment
is also more easily influenced by the online community. On the Internet, when
people see that an opinion they agree with is widely welcomed, they participate
actively. On the other hand, if they find that no one, or only a few people, cares
about a certain opinion, they keep silent even if they agree with it. Nowadays, a
large number of paid posters on social networks are hired to influence the opinions
of others. These accounts, the so-called water army, are often hired by PR
companies to post specific content on social networks in return for a reward.
Whereas previous work mainly detects the water army from the perspective of
individual accounts, this work investigates a collective detection approach. We
clustered the posts collected from a water army activity into groups and labeled
them according to their nature: normal user groups, water army groups and
traditional advertising groups. Five group features, such as tweet interval of time
(TIT) and group member diversity (GMD), and three group member features, such as
group reputation value (GRV), are used as attributes for machine-learning
classification. Our strategy detects many of the water army groups while
misclassifying only a small percentage of normal user groups: approximately 70% of
water army groups and 98% of normal user groups were correctly classified. Our
results also highlight the most important attributes for water army detection on
Sina micro-blog. Finally, we apply the same approach to distinguish traditional
advertising groups from normal user groups, and the results clearly indicate the
difference between water army groups and traditional advertising groups.
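
As a rough illustration of what group-level attributes of this kind might look like, the sketch below computes two features for a single cluster of posts. The abstract names TIT and GMD but does not define them, so the definitions here (mean gap between consecutive posts, and the ratio of distinct accounts to posts) are assumptions made for the example, as are the function names and the `account`/`time` fields.

```python
from datetime import datetime
from statistics import mean


def tweet_interval_of_time(timestamps):
    """Mean gap in seconds between consecutive posts in a group.

    Assumed definition; the abstract does not specify how TIT is computed.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return 0.0
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return mean(gaps)


def group_member_diversity(account_ids):
    """Distinct accounts divided by number of posts in a group.

    Assumed definition; the abstract does not specify how GMD is computed.
    """
    if not account_ids:
        return 0.0
    return len(set(account_ids)) / len(account_ids)


def group_features(posts):
    """Build a feature dict for one group of posts (hypothetical schema)."""
    return {
        "TIT": tweet_interval_of_time([p["time"] for p in posts]),
        "GMD": group_member_diversity([p["account"] for p in posts]),
    }


# Toy group: three posts from two accounts, with gaps of 30 s and 60 s.
posts = [
    {"account": "a", "time": datetime(2014, 1, 1, 12, 0, 0)},
    {"account": "b", "time": datetime(2014, 1, 1, 12, 0, 30)},
    {"account": "a", "time": datetime(2014, 1, 1, 12, 1, 30)},
]
features = group_features(posts)
```

Feature dictionaries like this one, computed per group, could then serve as the attribute vectors passed to a classifier; the choice of classifier is not stated in the abstract and is left open here.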