【论文阅读】Learning with Hypergraphs: Clustering, Classification, and Embedding

人工智能89

目录

Abstract

我们通常赋予被调查对象成对的关系,这可以用图来说明。但是在许多现实问题中,我们感兴趣的对象之间的关系比成对的更复杂。如果简单地将复杂的关系压缩成两两关系,将不可避免地导致对我们的学习任务有价值的信息的丢失。
因此,我们考虑使用超图来完全表示我们感兴趣的对象之间的复杂关系,因此出现了用超图学习的问题。
我们在本文中的主要贡献是将最初在无向图上运行的强大的谱聚类方法推广到超图,并在 谱聚类方法的基础上进一步开发 超图嵌入和直推分类算法

Introduction

举个栗子:考虑把一组文章分成不同主题,而仅仅知道文章的作者。顶点表示文章,边表示作者。可以构造下面一个无向图:
【论文阅读】Learning with Hypergraphs: Clustering, Classification, and Embedding

例如v 3 v_3 v 3 ​表示文章3由三个作者共同撰写。可以看出,当作者写了三篇及更多的文章时,上图便无法表示了。而同一个作者写的文章很可能是同一个领域的,对于分类有一定的意义。

超图(Hypergraph) 相对于普通图而言,可以更加准确的描述存在多元关联的对象之间的关系。超图与普通图的主要不同在于图中边上顶点的个数的不同,在普通图中,一条边包含两个顶点,在超图中,边被称为 超边(hyperedge),一条超边包含多个顶点。如果一个超图中,所有的超边最多只包含两个顶点,那么超图就会退化为普通图。
使用超图就不会产生这种信息丢失:
【论文阅读】Learning with Hypergraphs: Clustering, Classification, and Embedding
e e e为超边,图中的e 2 e_2 e 2 ​表示作者写了v 5 、 v 6 、 v 7 v_5、v_6、v_7 v 5 ​、v 6 ​、v 7 ​四篇文章,v 6 v_6 v 6 ​由e 2 e_2 e 2 ​和e 3 e_3 e 3 ​两位作者共同撰写。

本文主要解决的问题为超图的分割(partition)。普通图中常常采用谱聚类实现,因此可以将谱聚类技术推过到超图中。

; Preliminaries

𝑉:节点集合,𝑒:超边𝑒上的节点集合,𝐸:超边集合,其关系可以表示为𝑈 ( 𝑒 ∈ 𝐸 ) = 𝑉 𝑈_{(𝑒∈𝐸)}=𝑉U (e ∈E )​=V
给超边赋予权重𝑤(𝑒)
可以将超图表示为𝐺=(𝑉,𝐸,𝑤),可以用一个|𝑉|∗|𝐸|的矩阵来表示

每个元素的值可以表示为h ( n , θ ) = { 1 , if v n ∈ e n , j 0 , otherwise h(n,\theta) = \begin{cases} 1, & \text{if }v_n\in e_{n,j} \ 0, & \text{otherwise} \end{cases}h (n ,θ)={1 ,0 ,​if v n ​∈e n ,j ​otherwise ​

节点的度d ( v ) = ∑ e ∈ E w ( e ) h ( v , e ) d(v)=\sum_{e\in E}w(e)h(v,e)d (v )=∑e ∈E ​w (e )h (v ,e )

超边的度δ ( e ) = ∑ v ∈ V h ( v , e ) \delta(e)=\sum_{v\in V}h(v,e)δ(e )=∑v ∈V ​h (v ,e )

采用对角矩阵D v D_v D v ​ 和𝐷 e 𝐷_e D e ​表示节点和超边的度矩阵,𝑊为权重对角矩阵
因此邻接矩阵A = H W H T − D v A=HWH^T-D_v A =H W H T −D v ​

Normalized hypergraph cut

将超图切割后,将𝑉分为两个互补子集𝑆 𝑆S和𝑆 𝐶 𝑆^𝐶S C ,𝑆 ⊂ 𝑉 𝑆⊂𝑉S ⊂V,其补集为𝑆 𝐶 𝑆^𝐶S C
假如超边e e e中同时包含有𝑆 𝑆S和𝑆 𝐶 𝑆^𝐶S C中的节点,表明超边e e e被切割。
定义∂ 𝑆 \partial𝑆∂S为一个由被切割的超边组成的超边集,即∂ 𝑆 = { 𝑒 ∈ 𝐸 ∣ 𝑒 ∩ 𝑆 ≠ ∅ , 𝑒 ∩ 𝑆 𝐶 ≠ ∅ } \partial𝑆=\left{𝑒∈𝐸|𝑒∩𝑆≠∅,𝑒∩𝑆^𝐶≠∅\right}∂S ={e ∈E ∣e ∩S ​=∅,e ∩S C ​=∅}
定义v o l S = ∑ v ∈ S d ( v ) vol S=\sum_{v\in S}d(v)v o l S =∑v ∈S ​d (v ),可以将∂ S \partial S ∂S的体积定义为:
v o l ∂ S = ∑ e ∈ ∂ S w ( e ) ∣ e ∩ S ∣ ∣ e ∩ S C ∣ δ ( e ) vol \partial S=\sum_{e\in \partial S}w(e)\frac{|e\cap S||e\cap S^C|}{\delta(e)}v o l ∂S =e ∈∂S ∑​w (e )δ(e )∣e ∩S ∣∣e ∩S C ∣​
将一条超边看作一个全连通图(任意两个顶点间都存在一条虚拟的边),并且每条边赋予相同的权重w ( e ) / δ ( e ) w(e)/\delta(e)w (e )/δ(e ),因此当一条超边被切割时,其顶点被分为了两个部分,因此一共有∣ e ∩ S ∣ ∣ e ∩ S C ∣ |e\cap S||e\cap S^C|∣e ∩S ∣∣e ∩S C ∣条虚拟的边被切割,因此上式可以理解为被切割的边的权重之和。
【论文阅读】Learning with Hypergraphs: Clustering, Classification, and Embedding
上图中,顶点1、2、3为同一超边内的顶点,为它们构建虚拟的边,使其成为全连通图。如在1、2与3之间切割,则一共有2*1条边被切割。
在进行分割时,我们希望在同一个分区中的顶点关系是较为紧密的,被分割开的两个区域的联系是稀疏的。得到下面的目标函数:
a r g m i n ∅ ≠ S ⊂ V ( S ) = v o l ∂ S ( 1 / v o l S + 1 / v o l S c ) argmin_{\varnothing \ne S\subset V}(S)=vol \partial S(1/vol S+1/volS^c)a r g m i n ∅​=S ⊂V ​(S )=v o l ∂S (1 /v o l S +1 /v o l S c )
采用随机游走来解释上式:

; Random walk explanation

给定一个起始顶点u ∈ V u\in V u ∈V,选择一个包含u u u的超边𝑒 𝑒e ,选到该超边的概率与w ( e ) w(e)w (e )成正比,再在这条超边内随机选取一个点𝑣 𝑣v,选取该点的概率为1 / δ ( 𝑒 ) 1/\delta(𝑒)1 /δ(e ) 。
在一个∣ 𝑉 ∣ × ∣ 𝑉 ∣ |𝑉|×|𝑉|∣V ∣×∣V ∣的矩阵中,定义𝑃 𝑃P为超图随机游走的概率矩阵:
p ( u , v ) = ∑ e ∈ E w ( e ) h ( u , e ) d ( u ) h ( v , e ) δ ( e ) p(u,v)=\sum_{e\in E}w(e)\frac{h(u,e)}{d(u)}\frac{h(v,e)}{\delta(e)}p (u ,v )=e ∈E ∑​w (e )d (u )h (u ,e )​δ(e )h (v ,e )​
定义平稳分布π \pi π为:π ( v ) = d ( v ) v o l V \pi (v)=\frac{d(v)}{vol V}π(v )=v o l V d (v )​
证明π P = π \pi P=\pi πP =π:
∑ u ∈ V π ( u ) p ( u , v ) = d ( v ) v o l V = π ( v ) \sum_{u\in V}\pi(u)p(u,v)=\frac{d(v)}{vol V}=\pi(v)u ∈V ∑​π(u )p (u ,v )=v o l V d (v )​=π(v )
因此其满足马尔科夫平稳分布。

将目标函数改写为c ( S ) = v o l ∂ S v o l V ( 1 v o l S / v o l V + 1 v o l S c / v o l V ) c(S)=\frac{vol\partial S}{vol V}(\frac{1}{vol S/volV}+\frac{1}{vol S^c/vol V})c (S )=v o l V v o l ∂S ​(v o l S /v o l V 1 ​+v o l S c /v o l V 1 ​),由于v o l S / v o l V = ∑ v ∈ S d ( v ) v o l V = ∑ v ∈ V π ( v ) volS/volV=\sum_{v\in S}\frac{d(v)}{volV}=\sum_{v\in V}\pi(v)v o l S /v o l V =∑v ∈S ​v o l V d (v )​=∑v ∈V ​π(v ),因此v o l S / v o l V volS/volV v o l S /v o l V表示随机游走占用S S S中某个顶点的概率。
而:
v o l ∂ S v o l V = ∑ u ∈ S ∑ v ∈ S c π ( u ) p ( u , v ) \frac{vol\partial S}{vol V}=\sum_{u\in S}\sum_{v\in S^c}\pi(u)p(u,v)v o l V v o l ∂S ​=u ∈S ∑​v ∈S c ∑​π(u )p (u ,v )
因此v o l ∂ S / v o l V vol\partial S/vol V v o l ∂S /v o l V可以理解为随机游走从S S S走到S c S^c S c的概率,因此目标函数:
c ( S ) = v o l ∂ S v o l V ( 1 v o l S / v o l V + 1 v o l S c / v o l V ) c(S)=\frac{vol\partial S}{vol V}(\frac{1}{vol S/volV}+\frac{1}{vol S^c/vol V})c (S )=v o l V v o l ∂S ​(v o l S /v o l V 1 ​+v o l S c /v o l V 1 ​)
可以将超图分割理解为:寻找一个切割点,使随机游走穿过不同簇的概率尽可能小(即v o l ∂ S v o l V \frac{vol\partial S}{vol V}v o l V v o l ∂S ​部分尽可能小),在同一簇内的概率尽可能大(1 v o l S / v o l V + 1 v o l S c / v o l V \frac{1}{vol S/volV}+\frac{1}{vol S^c/vol V}v o l S /v o l V 1 ​+v o l S c /v o l V 1 ​尽可能小)。

Spectarl hypergraph partitioning

将上述目标函数松弛为实值化问题(这里没讲具体是怎么松弛的),总之得到了如下的目标函数:
a r g m i n f ∈ R ∣ V ∣ 1 2 ∑ e ∈ E ∑ { u , v } ⊆ e w ( e ) δ ( e ) ( f ( u ) d ( u ) − f ( v ) d ( v ) ) 2 argmin_{f\in R^{|V|}}\frac{1}{2}\sum_{e\in E}\sum_{\left{u,v\right} \subseteq e}\frac{w(e)}{\delta (e)}(\frac{f(u)}{ \sqrt{d(u)}}-\frac{f(v)}{ \sqrt{d(v)}})^2 a r g m i n f ∈R ∣V ∣​2 1 ​e ∈E ∑​{u ,v }⊆e ∑​δ(e )w (e )​(d (u )​f (u )​−d (v )​f (v )​)2
上面的公式经过代入和推导可以简化为:
a r g m i n f ∈ R ∣ V ∣ 1 2 ∑ e ∈ E ∑ { u , v } ⊆ e w ( e ) δ ( e ) ( f ( u ) d ( u ) − f ( v ) d ( v ) ) 2 = 2 f T Δ f argmin_{f\in R^{|V|}}\frac{1}{2}\sum_{e\in E}\sum_{\left{u,v\right} \subseteq e}\frac{w(e)}{\delta (e)}(\frac{f(u)}{ \sqrt{d(u)}}-\frac{f(v)}{ \sqrt{d(v)}})^2=2f^T\Delta f a r g m i n f ∈R ∣V ∣​2 1 ​e ∈E ∑​{u ,v }⊆e ∑​δ(e )w (e )​(d (u )​f (u )​−d (v )​f (v )​)2 =2 f T Δf
这里的Δ = I − D V − 1 / 2 H W D e − 1 H T D v − 1 / 2 \Delta=I-D_V^{-1/2}HWD_e^{-1}H^TD_v^{-1/2}Δ=I −D V −1 /2 ​H W D e −1 ​H T D v −1 /2 ​,定义为超图的拉普拉斯矩阵(为标准拉普拉斯矩阵的归一化形式)。

对于特征值𝜆和特征向量v ⃗ \vec{v}v有:
A v ⃗ = λ v ⃗ A\vec{v}=\lambda \vec{v}A v =λv
在上式中f f f为列向量,因此f T Δ f = λ f T f = λ n f^T\Delta f=\lambda f^Tf=\lambda n f T Δf =λf T f =λn ,因此要最小化f T Δ f f^T\Delta f f T Δf相当于最小化λ \lambda λ,并找到其对应的特征向量即可。

因此目标函数即转化为寻找拉普拉斯矩阵的最小非零特征值对应的特征向量

拉普拉斯矩阵参考文章:
拉普拉斯矩阵与拉普拉斯算子的关系
超图的拉普拉斯矩阵

Spectral hypergraph embedding

将超图顶点划分为k个不相交的子集,𝑐 ( 𝑉 1 , ... , 𝑉 𝑘 ) = ∑ 𝑖 = 1 𝑘 𝑣 𝑜 𝑙 ∂ 𝑉 𝑖 𝑣 𝑜 𝑙 𝑉 𝑖 𝑐(𝑉_1,...,𝑉_𝑘 )=\sum_{𝑖=1}^𝑘\frac{𝑣𝑜𝑙\partial 𝑉_𝑖}{𝑣𝑜𝑙𝑉_𝑖 }c (V 1 ​,...,V k ​)=∑i =1 k ​v o l V i ​v o l ∂V i ​​
优化该组合问题也可以被松弛为实值优化问题,化简为𝑐 𝑘 ( G ) = m i n ⁡ 𝑐 ( 𝑉 1 , ... , 𝑉 𝑘 ) = 𝑡 𝑟 𝐹 𝑇 Δ 𝐹 𝑐_𝑘 (G)=min ⁡𝑐(𝑉_1,...,𝑉_𝑘)=𝑡𝑟 𝐹^𝑇 Δ𝐹c k ​(G )=m i n ⁡c (V 1 ​,...,V k ​)=t r F T ΔF

其解是∆ ∆∆的第k个最小特征值相关联的k个特征向量

Experiment

同时利用多个特征向量来进行k k k路划分:

  1. 构造超图,计算关联矩阵(遍历每个顶点,找到离点距离前m m m小的点,将其权重设置为1,其余为0)得到一个N ∗ N N*N N ∗N的矩阵,N N N为顶点个数;
  2. 根据公式计算拉普拉斯矩阵;
  3. 计算l l l小特征值对应的特征向量;
  4. 根据得到的特征向量进行k-means聚类。
from numpy import *
from sklearn.cluster import KMeans
import numpy as np
import  matplotlib.pyplot as plt
plt.rcParams["font.sans-serif"]=["SimHei"]
plt.rcParams['axes.unicode_minus']=False
from adjustText import adjust_text

def Eu_dis(x):
    x = np.array(x)
    aa = np.sum(x**2, 1)
    ab = np.dot(x,x.T)
    dist_mat = aa + aa.T - 2 * ab
    dist_mat[dist_mat < 0] = 0
    dist_mat = np.sqrt(dist_mat)
    dist_mat = np.maximum(dist_mat, dist_mat.T)
    return dist_mat

def topvec(matrix,k):
    e_vals, e_vecs = np.linalg.eig(matrix)
    sorted_indices=np.argsort(e_vals)
    print("sorted:",sorted_indices)
    print("sorted:",sorted_indices[:k])
    return e_vals[sorted_indices[:k]],e_vecs[:,sorted_indices[1:k]]

def calH(new_array,k):
    distance = Eu_dis(new_array)
    distance = np.array(distance)
    print(distance)

    a, b = distance.shape
    print(a, b)
    H = np.zeros((a, b))
    print(H)

    for i in range(a):

        sorted_distance = np.argsort(distance[i])
        print(sorted_distance)
        for j in range(k):
            H[sorted_distance[j]][i] = 1
    print(H)
    return H

group = np.array([[1, 101], [5, 89], [108, 5], [115, 8]])

x = group.shape[0]

k=2

print(x)
new_array = group
print(new_array)

H=calH(new_array,k)

Dv=np.diag(np.power(np.sum(H,axis=0),-0.5))
De=np.diag(np.sum(H,axis=1))
print(Dv)
print(De)

H=np.mat(H)
Dv=np.mat(Dv)
W=np.mat(De)
I=np.eye(H.shape[0])
L=I-1/2*(Dv*H* W * H.T * Dv)
print(L)

val,vec=topvec(L,2)
vec_real=vec.real

y_pread=KMeans(n_clusters=2).fit_predict(vec_real)

print(y_pread)

Original: https://blog.csdn.net/qq_43955154/article/details/121389757
Author: 六九八
Title: 【论文阅读】Learning with Hypergraphs: Clustering, Classification, and Embedding



相关阅读

Title: ROS移植机器人小车:问题集(2)

小问题比较多,根据自己编译的情况罗列,不一定是最佳解决办法。

ubuntu@ubuntu:/opt$ whereis qmake
qmake: /usr/bin/qmake
ubuntu@ubuntu:/opt$ whereis qt5
qt5: /usr/lib/aarch64-linux-gnu/qt5 /usr/lib/qt5 /usr/share/qt5

CMake Error at lidar/hector_slam/hector_geotiff/CMakeLists.txt:28 (include):
include called with wrong number of arguments. include() only takes one
file.

CMake Warning at /opt/ros/noetic/share/catkin/cmake/catkin_package.cmake:166 (message):
catkin_package() DEPENDS on 'QT' but neither 'QT_INCLUDE_DIRS' nor
'QT_LIBRARIES' is defined.

Call Stack (most recent call first):
/opt/ros/noetic/share/catkin/cmake/catkin_package.cmake:102 (_catkin_package)
lidar/hector_slam/hector_geotiff/CMakeLists.txt:69 (catkin_package)

CMake Error at /usr/lib/aarch64-linux-gnu/cmake/Boost-1.71.0/BoostConfig.cmake:117 (find_package):
Could not find a package configuration file provided by "boost_signals"
(requested version 1.71.0) with any of the following names:

boost_signalsConfig.cmake
boost_signals-config.cmake

Add the installation prefix of "boost_signals" to CMAKE_PREFIX_PATH or set
"boost_signals_DIR" to a directory containing one of the above files. If
"boost_signals" provides a separate development package or SDK, be sure it
has been installed.

Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/Boost-1.71.0/BoostConfig.cmake:182 (boost_find_component)
/usr/share/cmake-3.16/Modules/FindBoost.cmake:443 (find_package)
imu_filter_madgwick/CMakeLists.txt:6 (find_package)

-- Configuring incomplete, errors occurred!

See also "/home/ubuntu/riki/catkin_ws/build/CMakeFiles/CMakeOutput.log".

See also "/home/ubuntu/riki/catkin_ws/build/CMakeFiles/CMakeError.log".

这个问题十分奇怪,我是尝试着解决的,办法如下,

sudo apt autoremove

sudo apt update

sudo apt upgrade

后来还重新安装了一次ros-noetic才解决。具体原因可能是文件包冲突。

In file included from /home/ubuntu/riki/catkin_ws/src/clbrobot_project/clbrobot/src/riki_base.cpp:4:
/home/ubuntu/riki/catkin_ws/src/clbrobot_project/clbrobot/include/riki_base.h:5:10: fatal error: riki_msgs/Velocities.h: No such file or directory
5 | #include

解决办法,

catkin_make -DCATKIN_WHITELIST_PACKAGES="riki_msgs"
catkin_make -DCATKIN_WHITELIST_PACKAGES=""

[ 84%] Generating EusLisp code from frontier_exploration/ExploreTaskFeedback.msg
/bin/sh: 1: python2.7: not found
/bin/sh: 1: python2.7: not found
/bin/sh: 1: python2.7: not found
/bin/sh: 1: python2.7: not found
make[3]: Packaging/Harvest.py: Command not found
make[3]: [Makefile:205: release] Error 127
make[2]:
[depth_camera/ros_astra_camera/CMakeFiles/astra_openni2.dir/build.make:112: depth_camera/ros_astra_camera/astra_openni2/src/astra_openni2-stamp/astra_openni2-build] Error 2
make[1]: [CMakeFiles/Makefile2:6461: depth_camera/ros_astra_camera/CMakeFiles/astra_openni2.dir/all] Error 2
make[1]:
Waiting for unfinished jobs....

[ 84%] Generating EusLisp manifest code for frontier_exploration
[ 84%] Built target frontier_exploration_generate_messages_eus
[ 84%] Linking CXX executable /home/ubuntu/riki/catkin_ws/devel/lib/exploration_server/plugin_client
[ 84%] Built target plugin_client
[ 84%] Linking CXX executable /home/ubuntu/riki/catkin_ws/devel/lib/exploration_server/exploration_server_node
[ 84%] Built target exploration_server_node
make: *** [Makefile:141: all] Error 2
Invoking "make -j4 -l4" failed

解决办法,

Step 1: Update system:

sudo apt-get update

Step 2: Install: python2.7-minimal

Ater updaing the OS run following command to install the packae:

sudo apt-get install python2.7-minimal

参考:

/home/ubuntu/riki/catkin_ws/src/opencv_apps/src/nodelet/discrete_fourier_transform_nodelet.cpp:61:24: error: 'DiscreteFourierTransformConfig' in namespace 'opencv_apps' does not name a type; did you mean 'DiscreteFourierTransformNodelet'?

61 | typedef opencv_apps::DiscreteFourierTransformConfig Config;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| DiscreteFourierTransformNodelet
/home/ubuntu/riki/catkin_ws/src/opencv_apps/src/nodelet/discrete_fourier_transform_nodelet.cpp:62:39: error: 'Config' was not declared in this scope; did you mean 'dynamic_reconfigure::Config'?

62 | typedef dynamic_reconfigure::Server

解决办法,

到下面的地址,

https:// github.com/ros-perception/ opencv_apps

根据你自己的系统,选择合适版本的opencv_apps,比如我在树莓派上装的ubuntu20.04.4 focal系统,就选这个
opencv_apps-release-debian-ros-noetic-opencv-apps_2.0.2-1_focal

Original: https://blog.csdn.net/tanmx219/article/details/123365215
Author: 高精度计算机视觉
Title: ROS移植机器人小车:问题集(2)