ISBN-13: 9783659333224 / English / Paperback / 2015 / 96 pp.
This work develops a new similarity measure, SSM (Sequence Similarity Measure), and shows its effect on clustering when both sequence and content information are incorporated into the similarity computation. SSM captures both the order in which pages are visited and the pages themselves. SSM-DBSCAN and SSM-OPTICS are extended versions of the DBSCAN and OPTICS clustering techniques. The results are compared with those obtained using the Euclidean, Jaccard, Cosine, and Fuzzy similarity measures. A variety of density-based clustering experiments are performed on data from MSNBC.COM, a free news website with multiple news categories, using the existing methods (DBSCAN and OPTICS) and the newly developed SSM-DBSCAN and SSM-OPTICS. SSM yields significant results when compared with the other existing similarity/distance measures. The favorable time requirements of the newly developed algorithms are also presented.
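The description does not give the formula for SSM, so the following is only an illustrative sketch of what a measure combining page order and page content might look like, not the book's definition. It assumes a weighted blend of a longest-common-subsequence ratio (order) and Jaccard similarity (content). The function names, the `weight` parameter, and the sample page categories are all hypothetical.

```python
from typing import Sequence


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Content similarity: overlap of the sets of visited pages (order ignored)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sequence_aware_similarity(
    a: Sequence[str], b: Sequence[str], weight: float = 0.5
) -> float:
    """Hypothetical blend of order and content similarity (NOT the book's SSM).

    weight = 1.0 uses only the order component, weight = 0.0 only the content one.
    """
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    order = lcs_length(a, b) / max(len(a), len(b))
    content = jaccard(a, b)
    return weight * order + (1.0 - weight) * content


# Two sessions visiting the same pages in opposite order (illustrative names).
s1 = ["frontpage", "news", "sports"]
s2 = ["sports", "news", "frontpage"]
print(jaccard(s1, s2))                    # content alone cannot tell them apart
print(sequence_aware_similarity(s1, s2))  # the order component lowers the score
```

Under these assumptions, the two sessions are identical by Jaccard similarity but score lower once order is taken into account. A distance such as one minus this similarity could then be supplied to a density-based clusterer as a precomputed distance matrix.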