یک راه‌حل افزایشی جهت خوشه‌بندی محتوایی- ساختاری یک گراف

نوع مقاله : مقاله پژوهشی

نویسندگان

1 دانشگاه جامع امام حسین (ع)

2 استادیار دانشگاه جامع امام حسین (ع)

3 دانشگاه علم و صنعت ایران

چکیده

خوشه‌بندی گره‌های گراف از جنبه ساختاری یا محتوایی، همواره موردتوجه پژوهشگران حوزه داده‌کاوی بوده است؛ اما به خوشه‌بندی گراف بر مبنای ساختار و محتوا به‌طور ترکیبی کمتر توجه شده است. با توجه به نیاز خوشه‌بندی ساختاری-محتوایی در شبکه‌های اطلاعاتی که شبکه‌های اجتماعی نمونه‌ای از آن‌هاست، در این مقاله الگوریتم خوشه‌بندی ICS-Cluster ارائه‌شده که هر دو جنبه ساختار و محتوا را به‌صورت هم‌زمان در نظر می‌گیرد. هدف این روش، رسیدن به خوشه‌هایی با ساختار درونی منسجم (ساختاری) و مقادیر ویژگی (محتوایی) همگن در گراف است. در این روش ابتدا گراف اولیه به یک گراف ساختاری-محتوایی تبدیل می‌شود که در آن وزن هر یال (ارتباط) بیانگر شباهت ساختاری-محتوایی دو گره (موجودیت) است. خوشه‌بندی با توجه به وزن یال‌ها به‌صورت افزایشی انجام می‌شود بدین معنا که گره‌های یالِ با وزن بالا به‌عنوان خوشه در نظر گرفته می‌شوند و وزن یال‌های متصل به خوشه با یکدیگر ادغام‌شده و به‌صورت یک یال متصل به خوشه در نظر گرفته می‌شوند، این مراحل تا زمانی که الگوریتم به تعداد خوشه موردنظر کاربر برسد، ادامه خواهد یافت. الگوریتم       ICS-Cluster به هر تعداد خوشه که مدنظر کاربر است، گراف را خوشه‌بندی می‌کند. مقایسه الگوریتم مطرح‌شده با سه الگوریتم خوشه‌بندی ساختاری- محتوایی ارائه‌شده، بر اساس معیارهای شش‌گانه سنجش کیفیت خوشه، بیانگر عملکرد مناسب روش ICS-Cluster است. این معیارها معیارهای ساختاریِ تراکم خوشه، خطای یال و پیمانگی، معیار محتواییِ میانگین شباهت، معیار ساختاری-محتوایی CS-Measure و زمان اجرای روش‌ها است.

کلیدواژه‌ها


عنوان مقاله [English]

An Incremental Solution for Content-Structural Graph Clustering

نویسندگان [English]

  • S. Keshvari 1
  • M. Javadzadeh 2
  • H. Naderi 3
1 Department of Information and Communication Technologyو Imam Hossein University.
2 Assistant Professor, Imam Hossein University
3 Assistant Professor, Iran University of Science & Technology
چکیده [English]

Researchers have always been interested in graph nodes clustering based on content or structure. But less attention has been paid to clustering based on both structure and content. But a content-structural clustering is needed in information networks like social networks. In this paper, the ICS-Cluster algorithm is proposed which takes into consideration both the structure and content aspects of the nodes. The purpose of this approach is to gain a coherent internal structure (structural aspect) and homogeneous attribute values (content aspect) in the graph. In this approach firstly the graph is converted into a content-structural graph which edges’ weight show similarity between the connected nodes. Incremental clustering is done based on edges’ weight in this process the edges with the most weight is considered as clusters then the weight of connected edge to the cluster is aggregated and they’ll be one edge, the process is repeated until the algorithm reaches the number of clusters that indicated by the user. ICS-Cluster algorithm number of cluster is indicated by the user. Comparing ICS-Cluster with other content structural algorithm based on six criteria for measuring cluster quality shows that ICS-Cluster has good performance. These criteria contain structural criteria (Modularity, Error Link, and Density), content criterion (Average Similarity), content-structural criterion (CS-Measure) and the run time.

کلیدواژه‌ها [English]

  • Graph Clustering
  • Content-Structural Clustering
  • Attributed Graph
  • Cluster Extraction
[1]     Sambhoos, K.; Nagi, R.; Sudit, M.; Stotz, A. “Enhancements to High Level Data Fusion Using Graph Matching and State Space Search”; Inform. Fusion 2010, 11, 4, 351-364.##
[2]     Zhou, L.; Wang, Q.; Fujita, H. “One Versus One Multi-Class Classification Fusion Using Optimizing Decision Directed Acyclic Graph for Predicting Listing Status of Companies”; Inform. Fusion 2017, 36, 80-89.##
[3]     Gross, G. A.; Nagi, R. “Precedence Tree Guided Search for the Efficient Identification of Multiple Situations of Interest – AND/OR Graph Matching”; Inform. Fusion 2016, 27, 240-254.##
[4]     Guedes, G. P.; Ogasawara, E.; Bezerra, E.; Xexeo, G. “Discovering Top-Non-Redundant Clustering in Attributed Graphs”; Neurocomputing 2016, 210, 45-54.##
[5]     Schoch, D.; Valente, T. W.; Brandes, U. “Correlations Among Centrality Indices and a Class of Uniquely Ranked Graphs”; Soc. Networks 50, 46-54.##
[6]     Johnson, M.; Paulusma, D.; Leeuwen, E. J. V. “Algorithms for Diversity and Clustering in Social Networks Through Dot Product Graphs”; Soc. Networks 2015, 41, 48-55.##
[7]     Ertem, Z.; Veremyev, A.; Butenko, S. “Detecting Large Cohesive Subgroups with High Clustering Coefficients in Social Networks”; Soc. Networks 2016, 46, 1-10.##
[8]     Vörös, A.; Snijders, T. A. B. “Cluster Analysis of Multiplex Networks: Defining Composite Network Measures”; Soc. Networks 2017, 49, 93-112.##
[9]     Opsahl, T.; Panzarasa, P. “Clustering in Weighted Networks”; Soc. Networks 2009, 31, 155-163.##
[10]  Aggarwal, C.; Wang, H. “Managing and Mining Graph Data”; Springer US, 2010.##
[11]  Patkar, S. B.; Narayanan, H. “An Efficient Practical Heuristic for Good Ratio-Cut Partitioning”; 16th Int. Conf. VLSI Design 2003, 1-6.##
[12]  Feldmann, A. E.; Foschini, L. “Balanced Partitions of Trees and Applications”; Algorithmica 2015, 71, 354-376.##
[13]  Newman, M. “Community Detection in Networks: Modularity Optimization and Maximum Likelihood Are Equivalent”; Phys. Rev. 2016, 94, 1-8.##
[14]  Bhatia, V.; Rani, R. “A Parallel Fuzzy Clustering Algorithm for Large Graphs Using PREGEL”; Expert Syst. Appl. 2017, 78, 135-144.##
[15]  Fortunatoa, S.; Hricb, D. “Community Detection in Networks: A User Quide”; Phys. Rep. 2016, 659, 1-44.##
[16]  Khatoon, M.; Banu, A. “A Survey on Community Detection Methods In Social Networks”; IJEME. 2015, 1, 8-18.##
[17]  Elhadi, H.; Agam, G. “Structure and Attributes Community Detection: Comparative Analysis of Composite Ensemble and Selection Methods”; Proc. 7th Workshop on Social Network Mining and Analysis 2013, 1-7.##
[18]  Harenberg, S.; Bello, G.; Gjeltema, L.; Ranshous, S.; Harlalka, J.; Seay, R.; Padmanabhan, K.; Samatova, N. “Community Detection in Large-Scale Networks: A Survey and Empirical Evaluation”; Computation Stat. 2014, 6, 426-439.##
[19]  Bu, Z.; Gao, G.; Li, H. J.; Cao, J. “CAMAS: A Cluster-Aware Multiagent System for Attributed Graph Clustering”; Inform. Fusion 2017, 37, 10-21.##
[20]  Weber L. M.; Robinson M. D. “Comparison of Clustering Methods for High-Dimensional Single-Cell Flow and Mass Cytometry Data”; Cold Spring Harbor Labs Journals 2016, 047613.##
[21]  Shchukin, V.; Khristich, D.; Galinskaya, I. “Word Clustering Approach to Bilingual Document Alignment”; First Conf. Machine Translation 2016, 2, 953-994.##
[22]  Skabar, A. “Clustering Mixed-Attribute Data Using Random Walk”; Procedia Comput. Sci. 2017, 108, 988-997.##
[23]  Xu, Z.; Cheng, J.; Xiao, X.; Fujimaki, R.; Muraoka, Y. “Efficient Nonparametric and Asymptotic Bayesian Model Selection Methods for Attributed Graph Clustering”; Knowl. Inform. Syst. 2017, 53, 239-268.##
[24]  Gross, G.; Nagi, R.; Sambhoos, K. “A Fuzzy Graph Matching Approach in Intelligence Analysis and Maintenance of Continuous Situational Awareness”; Inform. Fusion 2014, 18, 43-61.##
[25]  Boobalan, M. P.; Lopez, D.; Gao, X. Z. “Graph Clustering Using K-Neighbourhood Attribute Structural Similarity”; Appl. Soft Comput. 2016, 47, 216-223.##
[26]  Zhou, H.; Li, J.; Li, J.; Zhang, F.; Cui, Y. “A Graph Clustering Method for Community Detection in Complex Networks”; Physica A 2017, 469, 551-562.##
[27]  Bai, L.; Cheng, X.; Liang, J.; Guo, Y. “Fast Graph Clustering with a New Description Model for Community Detection”; Inform. Sci. 2017, 388, 37-47.##
[28]  Ruan, Y.; Fuhry, D.; Parthasarathy, S. “Efficient Community Detection in Large Networks Using Content and Links”; Proc. 22nd Int. Conf. World Wide Web WWW '13).  ACM, New York, NY, USA, 2013, 1089-1098.##
[29]  Zhou, Y.; Cheng, H.; Xu, Y. J. “Graph Clustering Based on Structural/Attribute Similarities”; Proc. VLDB Endowment 2009, 2, 718-729.##
[30]  Parimala, M.; Daphne, L. “Graph Clustering Based on Structural Attribute Neighborhood Similarity (SANS)”; IEEE Int. Conf. Elec. Comput. Commun. Technol. 2015, 1-5.##
[31]  Pool, S.; Bonchi, F.; Leeuwen, M. “Description-Driven Community Detection”; ACM Trans. Intel. Syst. Technol. 2014, 03, 201-210.##
[32]  Rahmati, K.; Naderi, H.; Keshvari, S. “Content-Structural Graph Clustering and a New Measure For Its Evaluation”; Adv. Defence Sci. Technol. 2018, 03, 201-210. (In Persian)##
[33]  Halloush, R. “Overhearing-Aware Modified Dijkstra's Algorithm for Multicasting over Multi-Hop Wireless Networks”; Int. J. Commun. Netw. Distrib. Syst. 2016, 16, 240-260.##
[34]  Pool, S.; Bonchi, F.; Leeuwen, M. “Description-Driven Community Detection”; ACM Trans. Intel. Syst. Technol. 2014, 5, 28.##
[35]  Wang, M.; Wang, Ch.; Xu, Y. J.; Zhang, J. “Community Detection in Social Networks: An In-Depth Benchmarking Study with a Procedure-Oriented Framework”; Proceedings of the VLDB Endowment 2015, 8, 998-1009.##
[36]  Yang, J.; Leskovec, J. “Defining and Evaluating Network Communities Based on Ground-Truth”; Knowl. Inf. Syst. 2015, 42, 181-213.##
[37]  Biswas, A.; Biswas, B. “Defining Quality Metrics for Graph Clustering Evaluation”; Expert Syst. Appl. 2017, 71, 1-17.##