I don't have much experience with spectral clustering and simply followed the docs (skip to the end for the results!):

Code:

import numpy as np
import networkx as nx
from sklearn.cluster import SpectralClustering
from sklearn import metrics

np.random.seed(1)

# Load the graph mentioned in the question
G = nx.karate_club_graph()

# Ground truth: club labels -> 0/1 numpy array
# (possibly overcomplicated networkx usage here)
gt_dict = nx.get_node_attributes(G, 'club')
gt = [gt_dict[i] for i in G.nodes()]
gt = np.array([0 if i == 'Mr. Hi' else 1 for i in gt])

# Adjacency matrix as a numpy array
# (nx.to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array replaces it)
adj_mat = nx.to_numpy_array(G)

print('ground truth')
print(gt)

# Cluster, treating the adjacency matrix as a precomputed affinity
sc = SpectralClustering(2, affinity='precomputed', n_init=100)
sc.fit(adj_mat)

# Compare ground truth and clustering results
print('spectral clustering')
print(sc.labels_)
print('just for better-visualization: invert clusters (permutation)')
print(np.abs(sc.labels_ - 1))

# Calculate some clustering metrics
print(metrics.adjusted_rand_score(gt, sc.labels_))
print(metrics.adjusted_mutual_info_score(gt, sc.labels_))

Output:

ground truth
[0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1]
spectral clustering
[1 1 0 1 1 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
just for better-visualization: invert clusters (permutation)
[0 0 1 0 0 0 0 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
0.204094758281
0.271689477828
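The manual inversion is only there for reading the labels by eye: cluster ids are arbitrary, so cluster 0 in the result may correspond to label 1 in the ground truth. A minimal sketch of a permutation-aware agreement score for the two-cluster case follows. `best_match_accuracy` is a hypothetical helper (not part of sklearn), and the label lists are copied from the printed output.

```python
def best_match_accuracy(truth, pred):
    """Fraction of matching labels under the better of the two 0/1 labelings."""
    if len(truth) != len(pred):
        raise ValueError("label sequences must have the same length")
    same = sum(t == p for t, p in zip(truth, pred))
    # With binary labels, flipping every predicted label turns each match
    # into a mismatch and vice versa.
    flipped = len(truth) - same
    return max(same, flipped) / len(truth)


# Labels copied from the printed output (first run, k-means label assignment).
gt = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0,
      0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
pred = [1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(best_match_accuracy(gt, pred))
```

The adjusted Rand and adjusted mutual information scores used in the script are already invariant to this kind of relabeling, which is why `sc.labels_` can be passed to them directly.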

General idea:

Introduction to the data and the task (quoted from here): The nodes in the graph represent the 34 members in a college Karate club. (Zachary is a sociologist, and he was one of the members.) An edge between two nodes indicates that the two members spent significant time together outside normal club meetings. The dataset is interesting because while Zachary was collecting his data, there was a dispute in the Karate club, and it split into two factions: one led by “Mr. Hi”, and one led by “John A”. It turns out that using only the connectivity information (the edges), it is possible to recover the two factions.

Using sklearn and spectral clustering to tackle this: If affinity is the adjacency matrix of a graph, this method can be used to find normalized graph cuts.

This describes normalized graph cuts as: Find two disjoint partitions A and B of the vertices V of a graph, so that A ∪ B = V and A ∩ B = ∅

Given a similarity measure w(i,j) between two vertices (e.g. identity when they are connected) a cut value (and its normalized version) is defined as:

cut(A, B) = SUM u in A, v in B: w(u, v)

...

we seek the minimization of disassociation between the groups A and B and the maximization of the association within each group
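To make the cut value concrete, here is a small sketch on a toy graph. `cut_value` follows the quoted definition. The normalized form is elided in the quote, so `normalized_cut` uses the commonly cited Shi and Malik formulation, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V), which is an assumption here. Both function names and the toy graph are illustrative only.

```python
def cut_value(w, a, b):
    """cut(A, B): total weight of the edges running from A to B."""
    return sum(w[u][v] for u in a for v in b)


def normalized_cut(w, a, b):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    all_nodes = range(len(w))
    cut = cut_value(w, a, b)
    # assoc(X, V): total weight from the nodes in X to every node in the graph.
    return cut / cut_value(w, a, all_nodes) + cut / cut_value(w, b, all_nodes)


# Toy graph (an assumption for illustration): two triangles joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
w = [[0] * 6 for _ in range(6)]
for u, v in edges:
    w[u][v] = w[v][u] = 1  # w(u, v) = 1 when u and v are connected

balanced = normalized_cut(w, [0, 1, 2], [3, 4, 5])
lopsided = normalized_cut(w, [0], [1, 2, 3, 4, 5])
print(balanced, lopsided)
```

Splitting along the single bridging edge gives a much smaller normalized cut than slicing off one node. Plain cut minimization does not reward that balance, whereas the normalization does.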

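Sounds good. So we build the adjacency matrix (nx.to_numpy_matrix(G), or nx.to_numpy_array(G) on NetworkX 3.0 and later, where to_numpy_matrix was removed) and set the affinity parameter to precomputed, since the adjacency matrix is our precomputed similarity measure: Alternatively, using precomputed, a user-provided affinity matrix can be used.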
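For intuition about what a precomputed adjacency matrix is used for, here is a simplified sketch of a two-way spectral split: build the symmetric normalized Laplacian from the adjacency matrix, take the eigenvector of the second-smallest eigenvalue, and split nodes by its sign. This is not sklearn's implementation (which embeds into several eigenvectors and then assigns labels with k-means or discretization). `spectral_bipartition` and the toy matrix are hypothetical, and the sketch assumes a connected graph with no isolated nodes.

```python
import numpy as np


def spectral_bipartition(adj):
    """Split a connected graph in two using the sign of the Fiedler vector."""
    adj = np.asarray(adj, dtype=float)
    degree = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the second smallest.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = d_inv_sqrt * eigvecs[:, 1]
    return (fiedler > 0).astype(int)


# Toy adjacency matrix (assumed for illustration): two triangles joined by edge 2-3.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1

labels = spectral_bipartition(adj)
print(labels)
```

Which triangle ends up as cluster 0 depends on the arbitrary sign of the eigenvector, the same label permutation issue seen in the sklearn output.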

Edit: Although I'm not familiar with this, I looked for parameters to tune and found assign_labels: The strategy to use to assign labels in the embedding space. There are two ways to assign labels after the laplacian embedding. k-means can be applied and is a popular choice. But it can also be sensitive to initialization. Discretization is another approach which is less sensitive to random initialization.

So, trying the less sensitive approach:

sc = SpectralClustering(2, affinity='precomputed', n_init=100, assign_labels='discretize')

Output:

ground truth
[0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1]
spectral clustering
[0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1]
just for better-visualization: invert clusters (permutation)
[1 1 0 1 1 1 1 1 0 0 1 1 1 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
0.771725032425
0.722546051351

That is a much closer fit to the ground truth!
