egc.model.node_embedding package
Submodules
egc.model.node_embedding.SENet module
SENet Kmeans
- egc.model.node_embedding.SENet.get_improved_graph(adj: ndarray, lam: float) ndarray[source]
Get adjacency matrix of the improved graph.
- Parameters:
adj (np.ndarray) – the adjacency matrix of the graph.
lam (float) – weighting hyper-parameter.
- Returns:
improved graph.
- Return type:
np.ndarray
\(S_{ij} = \dfrac{|N(v_i) \cap N(v_j)|}{\min\{|N(v_i)|, |N(v_j)|\}}\)
\(S'_{ij} = \begin{cases} S_{ij}, & S_{ij} \ge \min\{S_{iq} \mid q \in N(v_i)\} \\ 0, & \text{otherwise} \end{cases}\)
\(A' = A + \lambda S'\)
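Read together, the three formulas amount to the following NumPy sketch (`improved_graph_sketch` is a hypothetical name; the library's own implementation may differ in details):

```python
import numpy as np

def improved_graph_sketch(adj: np.ndarray, lam: float) -> np.ndarray:
    """Hypothetical re-implementation of the formulas above; not the library code."""
    n = adj.shape[0]
    common = adj @ adj.T                       # |N(v_i) ∩ N(v_j)| for all pairs
    deg = adj.sum(axis=1)                      # |N(v_i)|
    denom = np.minimum(deg[:, None], deg[None, :])
    S = np.divide(common, denom, out=np.zeros((n, n)), where=denom > 0)
    S_prime = np.zeros_like(S)
    for i in range(n):
        neighbors = np.flatnonzero(adj[i])
        if neighbors.size == 0:
            continue
        # keep S_ij only if it reaches the smallest similarity among v_i's neighbors
        threshold = S[i, neighbors].min()
        keep = S[i] >= threshold
        S_prime[i, keep] = S[i, keep]
    return adj + lam * S_prime                 # A' = A + λ S'
```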
- class egc.model.node_embedding.SENet.SENetEmbed(feature: FloatTensor, labels: IntTensor, adj: array, n_clusters: int, hidden0: int = 16, hidden1: int = 16, lr: float = 0.03, epochs: int = 50, weight_decay: float = 0.0, lam: float = 1.0, n_iter: int = 3)[source]
Bases: Module
SENet Embedding
- Parameters:
feature (FloatTensor) – node features.
labels (IntTensor) – node labels.
adj (ndarray) – the graph's adjacency matrix.
n_clusters (int) – number of clusters.
hidden0 (int, optional) – hidden units size of gnn layer 1. Defaults to 16.
hidden1 (int, optional) – hidden units size of gnn layer 2. Defaults to 16.
lr (float, optional) – learning rate. Defaults to 3e-2.
epochs (int, optional) – number of embedding training epochs. Defaults to 50.
weight_decay (float, optional) – weight decay. Defaults to 0.0.
lam (float, optional) – used to construct the improved graph. Defaults to 1.0.
n_iter (int, optional) – number of feature convolution iterations. Defaults to 3.
seed (int, optional) – random seed. Defaults to 20.
- forward()[source]
Get the embedding through the three networks
- Returns:
(torch.FloatTensor, torch.FloatTensor, torch.FloatTensor): \(Z_1 = \tanh(D'^{-1} A' X W_1)\), \(Z_2 = \tanh(D'^{-1} A' Z_1 W_2)\), \(F = Z_2 W_3\), \(F^T F = Q Q^T\), \(Z_3 = F (Q^{-1})^T\)
- get_imporved_feature(n_iter, features)[source]
Get the improved feature after n_iter graph convolutions
- Parameters:
n_iter (int) – number of convolution iterations
features (tensor) – original graph features
- Returns:
(tensor) \(X' = (D'^{-1} A')^{n_{iter}} X\)
- get_normalized_kernel_martix(feature)[source]
Get the kernel matrix
- Parameters:
feature (tensor) – improved graph features
- Returns:
(tensor) \(K = \mathrm{ReLU}(X' X'^T)\), symmetrized as \(K = (K + K^T)/2\)
- training: bool
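The two helper methods above reduce to a few tensor operations. A minimal sketch of the documented formulas behind `get_imporved_feature` and `get_normalized_kernel_martix` (hypothetical function names, dense tensors assumed):

```python
import torch

def improved_feature_sketch(adj_norm: torch.Tensor, x: torch.Tensor, n_iter: int = 3) -> torch.Tensor:
    # X' = (D'^-1 A')^{n_iter} X: repeated smoothing with the normalized improved adjacency
    for _ in range(n_iter):
        x = adj_norm @ x
    return x

def normalized_kernel_sketch(x_improved: torch.Tensor) -> torch.Tensor:
    # K = ReLU(X' X'^T), then symmetrize: K = (K + K^T) / 2
    k = torch.relu(x_improved @ x_improved.t())
    return (k + k.t()) / 2
```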
egc.model.node_embedding.ae module
AE Embedding
- class egc.model.node_embedding.ae.AE(n_input: int, n_clusters: int, hidden1: int = 500, hidden2: int = 500, hidden3: int = 2000, hidden4: int = 2000, hidden5: int = 500, hidden6: int = 500, lr: float = 0.0005, epochs: int = 100, n_z: int = 10, activation: str = 'relu', early_stop: int = 20, if_eva: bool = False, if_early_stop: bool = False)[source]
Bases: Module
AutoEncoder Model
- Parameters:
n_input (int) – dim of features
n_clusters (int) – cluster num.
hidden1 (int) – hidden units size of encoder layer 1.
hidden2 (int) – hidden units size of encoder layer 2.
hidden3 (int) – hidden units size of encoder layer 3.
hidden4 (int) – hidden units size of decoder layer 1.
hidden5 (int) – hidden units size of decoder layer 2.
hidden6 (int) – hidden units size of decoder layer 3.
lr (float, optional) – learning rate. Defaults to 5e-4.
epochs (int, optional) – number of embedding training epochs. Defaults to 100.
n_z (int) – number of dimensions of Z. Defaults to 10.
activation (str, optional) – activation function. Defaults to 'relu'.
early_stop (int) – number of early-stopping steps. Defaults to 20.
if_eva (bool) – whether to use k-means to judge the embedding quality.
if_early_stop (bool) – whether to use early stopping.
- forward(x)[source]
Forward Propagation
- Parameters:
x (torch.Tensor) – node’s features
- Returns:
x_hat (torch.Tensor): reconstructed attribute matrix generated by the AE decoder
z_ae (torch.Tensor): latent embedding of AE
- fit(data, train_loader, label) None[source]
Fit an AE clustering model.
- Parameters:
data (torch.Tensor) – node’s features
train_loader (DataLoader) – DataLoader of AE train
label (torch.Tensor) – node’s label
- training: bool
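A usage sketch built only from the signatures documented above (toy shapes; assumes fit takes the raw features, a DataLoader, and labels exactly as listed):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from egc.model.node_embedding.ae import AE

x = torch.randn(1000, 128)                    # toy node features
labels = torch.randint(0, 7, (1000,))         # toy labels, used for evaluation only
loader = DataLoader(TensorDataset(x), batch_size=256, shuffle=True)

model = AE(n_input=128, n_clusters=7)
model.fit(x, loader, labels)                  # train the autoencoder
x_hat, z_ae = model(x)                        # reconstruction and latent embedding
```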
- class egc.model.node_embedding.ae.AE_encoder(n_input: int, hidden1: int, hidden2: int, hidden3: int, n_z: int, activation: object)[source]
Bases: Module
Encoder for AE
- Parameters:
n_input (int) – dim of features.
hidden1 (int) – hidden units size of encoder layer 1.
hidden2 (int) – hidden units size of encoder layer 2.
hidden3 (int) – hidden units size of encoder layer 3.
n_z (int) – number of dimensions of Z.
activation (object) – activation function.
- forward(x)[source]
Forward Propagation
- Parameters:
x (torch.Tensor) – node’s features
- Returns:
Latent embedding of AE
- Return type:
z_ae (torch.Tensor)
- training: bool
egc.model.node_embedding.agc module
AGC Embedding
- class egc.model.node_embedding.agc.AGCEmbed(adj: Tensor, feature: Tensor, labels: Tensor, epochs: int = 60, n_clusters: int = 7, rep: int = 10)[source]
Bases: Module
AGC Embedding
- forward()[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
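The AGC paper's core operation is k-order low-pass filtering of node features; a hedged NumPy sketch of that filter (not necessarily how AGCEmbed organizes the computation):

```python
import numpy as np

def agc_filter_sketch(adj: np.ndarray, x: np.ndarray, k: int) -> np.ndarray:
    # AGC paper: X_bar = (I - L_sym / 2)^k X, a low-pass filter on node features
    n = adj.shape[0]
    a = adj + np.eye(n)                            # add self-loops
    d_inv_sqrt = np.diag(a.sum(axis=1) ** -0.5)
    lap = np.eye(n) - d_inv_sqrt @ a @ d_inv_sqrt  # symmetric normalized Laplacian
    g = np.eye(n) - 0.5 * lap                      # filter G = I - L_sym / 2
    for _ in range(k):
        x = g @ x
    return x
```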
egc.model.node_embedding.age module
AGE Model
- class egc.model.node_embedding.age.AGE(dims: list | None = None, feat_dim: int | None = None, gnnlayers_num: int = 3, linlayers_num: int = 1, lr: float = 0.001, upth_st: float = 0.0015, upth_ed: float = 0.001, lowth_st: float = 0.1, lowth_ed: float = 0.5, upd: float = 10, bs: int = 10000, epochs: int = 400, norm: str = 'sym', renorm: bool = True, estop_steps: int = 5)[source]
Bases: Module
AGE. Paper: Adaptive Graph Encoder for Attributed Graph Embedding
- Parameters:
dims (list, optional) – number of units in each hidden layer.
feat_dim (int, optional) – input feature dimension.
gnnlayers_num (int) – number of gnn layers.
linlayers_num (int, optional) – number of linear layers.
lr (float, optional) – learning rate. Defaults to 0.001.
upth_st (float, optional) – upper threshold start value.
upth_ed (float, optional) – upper threshold end value.
lowth_st (float, optional) – lower threshold start value.
lowth_ed (float, optional) – lower threshold end value.
upd (float, optional) – threshold update interval in epochs.
bs (int, optional) – batch size. Defaults to 10000.
epochs (int, optional) – number of epochs to train.
norm (str, optional) – normalization mode of the Laplacian matrix.
renorm (bool, optional) – whether to use the renormalization trick.
estop_steps (int, optional) – number of early-stopping steps.
- forward(x, y)[source]
Forward Propagation
- Parameters:
x (torch.Tensor) – Sample node embedding for x-axis
y (torch.Tensor) – Sample node embedding for y-axis
- Returns:
prediction of adj
- Return type:
batch_pred (torch.Tensor)
- fit(adj: csr_matrix, features: Tensor) None[source]
Fit an AGE model
- Parameters:
adj (sp.csr_matrix) – 2D sparse adj.
features (torch.Tensor) – features.
- training: bool
- class egc.model.node_embedding.age.LinTrans(layers, dims)[source]
Bases: Module
Linear Transform Model
- Parameters:
layers (int) – number of linear layers.
dims (list) – Number of units in hidden layers.
- forward(x)[source]
Forward Propagation
- Parameters:
x (torch.Tensor) – feature embedding
- Returns:
hidden embedding
- Return type:
out (torch.Tensor)
- training: bool
- class egc.model.node_embedding.age.SampleDecoder(act=torch.sigmoid)[source]
Bases: Module
Inner-product decoder model
- Parameters:
act (object, optional) – activation of the decoder. Defaults to torch.sigmoid.
- forward(zx, zy)[source]
Forward Propagation
- Parameters:
zx (torch.Tensor) – Sample node embedding for x-axis
zy (torch.Tensor) – Sample node embedding for y-axis
- Returns:
prediction of adj
- Return type:
sim (torch.Tensor)
- training: bool
- egc.model.node_embedding.age.loss_function(adj_preds, adj_labels)[source]
Compute loss
- Parameters:
adj_preds (torch.Tensor) – reconstructed adj
adj_labels (torch.Tensor) – ground-truth adjacency labels
- Returns:
loss
- Return type:
torch.Tensor
- egc.model.node_embedding.age.update_similarity(z, upper_threshold, lower_treshold, pos_num, neg_num)[source]
update similarity
- Parameters:
z (numpy.ndarray) – hidden embedding
upper_threshold (float) – upper threshold
lower_treshold (float) – lower threshold
pos_num (int) – number of positive samples
neg_num (int) – number of negative samples
- Returns:
numpy.ndarray: list of positive indices
numpy.ndarray: list of negative indices
- egc.model.node_embedding.age.update_threshold(upper_threshold, lower_treshold, up_eta, low_eta)[source]
update threshold
- Parameters:
upper_threshold (float) – upper threshold
lower_treshold (float) – lower threshold
up_eta (float) – update step size of the upper threshold
low_eta (float) – update step size of the lower threshold
- Returns:
upth (float): updated upper threshold
lowth (float): updated lower threshold
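A hedged sketch of how these two helpers typically cooperate in AGE's adaptive training loop; the ranking-based pair selection is an assumption about how the thresholds translate into the pos_num/neg_num counts:

```python
import numpy as np

def update_similarity_sketch(z: np.ndarray, pos_num: int, neg_num: int):
    # rank all pairwise cosine similarities of the row-normalized embeddings;
    # the most similar pairs become positives, the least similar negatives
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T).ravel()
    order = np.argsort(sim)
    return order[-pos_num:], order[:neg_num]

def update_threshold_sketch(upth: float, lowth: float, up_eta: float, low_eta: float):
    # anneal both thresholds by their step sizes after each update period
    return upth + up_eta, lowth + low_eta
```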
- egc.model.node_embedding.age.preprocess_graph(adj: csr_matrix, layer: int, norm: str = 'sym', renorm: bool = True) Tensor[source]
Generalized Laplacian Smoothing Filter
- Parameters:
adj (sp.csr_matrix) – 2D sparse adj.
layer (int) – number of linear layers
norm (str) – normalize mode of Laplacian matrix
renorm (bool) – If with the renormalization trick
- Returns:
Laplacian Smoothing Filter
- Return type:
adjs (sp.csr_matrix)
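A sketch of the smoothing filter stack, assuming the common AGE formulation \(H = I - k L_{sym}\) with k = 2/3 (the coefficient is borrowed from the SGC variant later on this page, not read from this function):

```python
import numpy as np
import scipy.sparse as sp

def smoothing_filters_sketch(adj: sp.csr_matrix, layer: int, renorm: bool = True, k: float = 2 / 3):
    n = adj.shape[0]
    ident = sp.eye(n, format="csr")
    a = adj + ident if renorm else adj             # renormalization trick: add self-loops
    d_inv_sqrt = sp.diags(np.asarray(a.sum(axis=1)).ravel() ** -0.5)
    lap = ident - d_inv_sqrt @ a @ d_inv_sqrt      # symmetric normalized Laplacian
    h = ident - k * lap                            # smoothing filter H = I - k * L_sym
    return [h] * layer                             # one filter per gnn layer
```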
egc.model.node_embedding.dgi module
Embedding By DGI
Adapted from: https://github.com/PetarV-/DGI
- egc.model.node_embedding.dgi.avg_readout(h: Tensor, mask: Tensor | None = None)[source]
Average readout of the whole graph
- Parameters:
h (torch.Tensor) – embeddings of all nodes in graph.
mask (torch.Tensor, optional) – node mask. Defaults to None.
- Returns:
Average readout of the whole graph.
- Return type:
(torch.Tensor)
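A minimal sketch of the readout, assuming mask holds per-node weights:

```python
import torch

def avg_readout_sketch(h: torch.Tensor, mask: torch.Tensor = None) -> torch.Tensor:
    # unweighted mean over nodes, or a mask-weighted mean when a node mask is given
    if mask is None:
        return h.mean(dim=0)
    w = mask.float().unsqueeze(-1)
    return (h * w).sum(dim=0) / w.sum()
```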
- class egc.model.node_embedding.dgi.DGIEmbed(in_feats: int, out_feats_list: List[int], n_epochs: int = 10000, early_stopping_epoch: int = 20, batch_size: int = 1024, neighbor_sampler_fanouts: List[int] = -1, lr: float = 0.001, l2_coef: float = 0.0, activation: str = 'prelu', model_filename: str = 'dgi')[source]
Bases: Module
DGI Embedding
- Parameters:
in_feats (int) – input feature dimension.
out_feats_list (List[int]) – List of hidden units dimensions.
n_epochs (int, optional) – number of embedding training epochs. Defaults to 10000.
early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.
batch_size (int, optional) – batch size. Defaults to 1024.
neighbor_sampler_fanouts (List[int] or int, optional) –
List of neighbors to sample for each GNN layer, with the i-th element being the fanout for the i-th GNN layer. Defaults to -1.
If only a single integer is provided, DGL assumes that every layer will have the same fanout.
If -1 is provided on one layer, then all inbound edges will be included.
lr (float, optional) – learning rate. Defaults to 0.001.
l2_coef (float, optional) – weight decay. Defaults to 0.0.
activation (str, optional) – activation of gcn layer. Defaults to prelu.
model_filename (str, optional) – path to save best model parameters. Defaults to dgi.
- forward(block, input_feats) Tensor[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
Fit on a specific graph
- Parameters:
graph (dgl.DGLGraph) – dgl graph.
device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).
- get_embedding(graph: DGLGraph, device: device = device(type='cpu'), model_filename: str | None = None) Tensor[source]
Get the embeddings.
- Parameters:
graph (dgl.DGLGraph) – dgl graph.
device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).
model_filename (str, optional) – Model file to load. Defaults to None.
- Returns:
Embeddings.
- Return type:
torch.Tensor
- training: bool
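A usage sketch from the documented signatures; the assumption that node features are stored under g.ndata['feat'] is ours, not the library's documented contract:

```python
import dgl
import torch

from egc.model.node_embedding.dgi import DGIEmbed

g = dgl.rand_graph(500, 4000)
g.ndata["feat"] = torch.randn(500, 64)   # assumption: features live under 'feat'

model = DGIEmbed(in_feats=64, out_feats_list=[256])
model.fit(g, device=torch.device("cpu"))
embeddings = model.get_embedding(g)      # node embeddings, one row per node
```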
egc.model.node_embedding.gae module
GAE embedding
- class egc.model.node_embedding.gae.DGL_GAE(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]
Bases: Module
An implementation of "GAE"
- Parameters:
epochs (int) – number of embedding training epochs.
n_clusters (int) – cluster num.
fead_dim (int) – dim of features.
n_nodes (int) – number of nodes.
hidden_dim1 (int) – hidden units size of gcn_1. Defaults to 32.
dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.
lr (float, optional) – learning rate. Defaults to 0.01.
early_stop (int, optional) – early stopping threshold. Defaults to 10.
activation (str, optional) – activation of gcn layer_1. Defaults to 'relu'.
- Encode(graph, features)[source]
Encoder for GAE
- Parameters:
graph (dgl.DGLGraph) – Graph data in dgl
features (torch.Tensor) – node’s features
- Returns:
Latent embedding of GAE
- Return type:
h1 (torch.Tensor)
- Decode(z)[source]
Decoder for GAE
- Parameters:
z (torch.Tensor) – latent embedding of GAE
- Returns:
reconstructed adjacency matrix
- Return type:
(torch.Tensor)
- forward()[source]
Forward Propagation
- Returns:
Graph_Reconstruction (torch.Tensor): reconstructed adj matrix
Latent_Representation (torch.Tensor): latent embedding of GAE
- fit(adj_csr: csr_matrix, features: Tensor, device: device = device(type='cpu')) None[source]
Fit a GAE model
- Parameters:
adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.
features (torch.Tensor) – node’s features.
device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).
- training: bool
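GAE's decoder is conventionally the inner product of the latent embeddings; a sketch under that assumption (the library's Decode may differ):

```python
import torch

def inner_product_decode_sketch(z: torch.Tensor) -> torch.Tensor:
    # A_hat = sigmoid(Z Z^T): edge probability from latent similarity
    return torch.sigmoid(z @ z.t())
```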
egc.model.node_embedding.gmi module
Embedding By GMI
Adapted From: https://github.com/zpeng27/GMI
- egc.model.node_embedding.gmi.mi_loss_jsd(pos: Tensor, neg: Tensor) Tensor[source]
Jensen-Shannon MI Estimator
- Parameters:
pos (torch.Tensor) – \(D_w(h_i, x_i)\) or \(D_w(h_i, x_j)\).
neg (torch.Tensor) – \(D_w(h_i, x'_i)\) or \(D_w(h_i, x'_j)\).
- Returns:
JSD loss.
\[\begin{split}& sp(-D_w(h_i,x_i))+E(sp(D_w(h_i,x'_i)))\\ & \textbf{or} \\ & sp(-D_w(h_i,x_j))+E(sp(D_w(h_i,x'_j))). \\\end{split}\]
- Return type:
(torch.Tensor)
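The estimator above maps directly onto softplus terms; a minimal sketch:

```python
import torch
import torch.nn.functional as F

def mi_loss_jsd_sketch(pos: torch.Tensor, neg: torch.Tensor) -> torch.Tensor:
    # sp(-D_w(h, x)) + E[sp(D_w(h, x'))], with sp the softplus function
    return F.softplus(-pos).mean() + F.softplus(neg).mean()
```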
- egc.model.node_embedding.gmi.reconstruct_loss(pred: Tensor, gnd: Tensor) Tensor[source]
Loss of Rebuilt Adj
- Parameters:
pred (torch.Tensor) – \(w_{ij}\).
gnd (torch.Tensor) – \(a_{ij}\).
- Returns:
reconstruction loss.
\[\begin{split}\text{reconstruct}_{loss} = & \frac{n^2}{n^2 - |E|} * AVG(\frac{-(n^2-|E|)}{|E|} * a_{ij} * \log(w_{ij} + e^{-10}) \\ & - (1 - a_{ij}) * \log(1 - w_{ij} + e^{-10})).\end{split}\]
- Return type:
(torch.Tensor)
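A direct transcription of the reconstruction formula (dense tensors assumed):

```python
import math
import torch

def reconstruct_loss_sketch(pred: torch.Tensor, gnd: torch.Tensor) -> torch.Tensor:
    # weighted BCE over the dense adjacency, following the formula above
    n_sq = gnd.numel()                     # n^2
    e = gnd.sum()                          # |E|
    eps = math.e ** -10                    # the e^{-10} stabilizer from the formula
    term = (-(n_sq - e) / e) * gnd * torch.log(pred + eps) \
           - (1 - gnd) * torch.log(1 - pred + eps)
    return (n_sq / (n_sq - e)) * term.mean()
```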
- egc.model.node_embedding.gmi.preprocess_adj(adj_orig: csr_matrix) Tuple[Tensor, Tensor][source]
Preprocess the adjacency matrix: row average and self-loops
- Parameters:
adj_orig (sp.csr_matrix) – input original adjacency matrix.
- Returns:
row-averaged adj and self-loop adj
- Return type:
adj_orig, adj_target (sp.csr_matrix, np.matrix)
- class egc.model.node_embedding.gmi.GMIEmbed(in_features: int, hidden_units: int = 512, n_epochs: int = 550, early_stopping_epoch: int = 20, lr: float = 0.001, l2_coef: float = 0.0, alpha: float = 0.8, beta: float = 1.0, gamma: float = 1.0, activation: str = 'prelu', gcn_depth: int = 2)[source]
Bases: Module
GMI Embedding
- Parameters:
in_features (int) – input feature dimension.
hidden_units (int, optional) – hidden units size of gcn. Defaults to 512.
n_epochs (int, optional) – number of embedding training epochs. Defaults to 550.
early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.
lr (float, optional) – learning rate. Defaults to 0.001.
l2_coef (float, optional) – weight decay. Defaults to 0.0.
alpha (float, optional) – parameter for \(I(h_i; x_i)\). Defaults to 0.8.
beta (float, optional) – parameter for \(I(h_i; x_j)\). Defaults to 1.0.
gamma (float, optional) – parameter for \(I(w_{ij}; a_{ij})\). Defaults to 1.0.
activation (str, optional) – activation of gcn layer. Defaults to “prelu”.
- calc_loss(mi_pos: Tensor, mi_neg: Tensor, local_mi_pos: Tensor, local_mi_neg: Tensor, adj_rebuilt: Tensor) Tensor[source]
Calculate Loss
- Parameters:
mi_pos (torch.Tensor) – \(D_w(h_i, x_i)\).
mi_neg (torch.Tensor) – \(D_w(h_i, x'_i)\).
local_mi_pos (torch.Tensor) – \(D_w(h_i, x_j)\).
local_mi_neg (torch.Tensor) – \(D_w(h_i, x'_j)\).
adj_rebuilt (torch.Tensor) – \(w_{ij}\)
- Returns:
loss.
\[\begin{split}loss = & \alpha * sp(-D_w(h_i,x_i))+E(sp(D_w(h_i,x'_i))) \\ & + \beta * sp(-D_w(h_i,x_j))+E(sp(D_w(h_i,x'_j))) \\ & + \gamma * \text{reconstruct}_{loss} \\\end{split}\]
- Return type:
(torch.Tensor)
- forward(neg_sample_list: Tensor) Tensor[source]
Forward Propagation
- Parameters:
neg_sample_list (torch.Tensor) – negative sample list.
- Returns:
loss.
- Return type:
torch.Tensor
- fit(features: lil_matrix, adj_orig: csr_matrix, neg_list_num: int = 5) None[source]
Fit on a specific graph
- Parameters:
features (sp.lil_matrix) – 2D sparse features.
adj_orig (sp.csr_matrix) – 2D sparse adj.
neg_list_num (int, optional) – negative sample times. Defaults to 5.
- set_features_norm(features_norm) None[source]
Set the features row normalized
- Parameters:
features_norm (torch.Tensor) – normalized 3D features tensor in shape of [1, xx, xx]
- set_adj_norm(adj_norm) None[source]
Set the adjacency symmetrically normalized
- Parameters:
adj_norm (torch.Tensor) – symmetrically normalized 2D adjacency tensor
- get_features_norm() Tensor[source]
Get the features row normalized
- Returns:
normalized 3D features tensor in shape of [1, xx, xx]
- Return type:
features_norm (torch.Tensor)
- get_adj_norm() Tensor[source]
Get the adjacency symmetrically normalized
- Returns:
symmetrically normalized 2D adjacency tensor
- Return type:
adj_norm (torch.Tensor)
- get_embedding() Tensor[source]
Get the embeddings (graph or node level).
- Returns:
embedding.
- Return type:
(torch.Tensor)
- training: bool
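A usage sketch from the fit and get_embedding signatures above (toy random graph):

```python
import scipy.sparse as sp

from egc.model.node_embedding.gmi import GMIEmbed

features = sp.lil_matrix(sp.random(300, 64, density=0.1))   # 2D sparse features
adj = sp.random(300, 300, density=0.02, format="csr")
adj = (adj + adj.T).tocsr()                                 # symmetrize the toy graph

model = GMIEmbed(in_features=64)
model.fit(features, adj, neg_list_num=5)
embeddings = model.get_embedding()
```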
egc.model.node_embedding.igae module
IGAE Embedding
- class egc.model.node_embedding.igae.IGAE(args: Namespace, device)[source]
Bases: Module
A symmetric improved graph autoencoder (IGAE). The network reconstructs both the weighted attribute matrix and the adjacency matrix simultaneously.
- Parameters:
args (argparse.Namespace) – all parameters
- forward(g, feat)[source]
Forward Propagation
- Parameters:
g (dgl.DGLGraph) – Graph data in dgl
feat (torch.Tensor) – node’s features
- Returns:
z_igae (torch.Tensor): latent embedding of IGAE
z_hat (torch.Tensor): reconstructed weighted attribute matrix generated by the IGAE decoder
adj_hat (torch.Tensor): reconstructed adjacency matrix generated by the IGAE decoder
- fit(g, data, adj)[source]
Fit an IGAE clustering model.
- Parameters:
g (dgl.DGLGraph) – Graph data in dgl
data (torch.Tensor) – node’s features
adj (sp.csr.csr_matrix) – adjacency matrix
- training: bool
- class egc.model.node_embedding.igae.IGAE_encoder(args: Namespace)[source]
Bases: Module
Encoder for IGAE
- Parameters:
args (argparse.Namespace) – all parameters
- forward(g, feat)[source]
Forward Propagation
- Parameters:
g (dgl.DGLGraph) – Graph data in dgl
feat (torch.Tensor) – node’s features
- Returns:
z_igae (torch.Tensor): latent embedding of IGAE
z_igae_adj (torch.Tensor): reconstructed adjacency matrix generated by the IGAE encoder
- training: bool
- class egc.model.node_embedding.igae.IGAE_decoder(args: Namespace)[source]
Bases: Module
Decoder for IGAE
- Parameters:
args (argparse.Namespace) – all parameters
- forward(g, z_igae)[source]
Forward Propagation
- Parameters:
g (dgl.DGLGraph) – Graph data in dgl
z_igae (torch.Tensor) – Latent embedding of IGAE
- Returns:
z_hat (torch.Tensor): reconstructed weighted attribute matrix generated by the IGAE decoder
z_hat_adj (torch.Tensor): reconstructed adjacency matrix generated by the IGAE decoder
- training: bool
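A hedged sketch of the dual reconstruction objective described above; the MSE terms and the gamma weighting are assumptions, as the actual loss lives in the training code:

```python
import torch.nn.functional as F

def igae_loss_sketch(feat_weighted, z_hat, adj, adj_hat, gamma: float = 0.1):
    # reconstruct the weighted attribute matrix and the adjacency simultaneously
    loss_w = F.mse_loss(z_hat, feat_weighted)   # weighted attribute reconstruction
    loss_a = F.mse_loss(adj_hat, adj)           # adjacency reconstruction
    return loss_w + gamma * loss_a
```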
egc.model.node_embedding.mvgrl module
Contrastive Multi-View Representation Learning on Graphs https://arxiv.org/abs/2006.05582
- class egc.model.node_embedding.mvgrl.Readout[source]
Bases: Module
Read out
- static forward(seq, msk)[source]
Forward Propagation
- Parameters:
seq (torch.Tensor) – features tensor.
msk (torch.Tensor) – node mask.
- Returns:
graph-level representation
- Return type:
(torch.Tensor)
- training: bool
- class egc.model.node_embedding.mvgrl.MVGRL(in_feats: int, n_clusters: int, n_h: int = 512, model_filename: str = 'mvgrl', sparse: bool = False, nb_epochs: int = 3000, patience: int = 20, lr: float = 0.001, weight_decay: float = 0.0, sample_size: int = 2000, batch_size: int = 4, dataset: str = 'Citeseer')[source]
Bases: Module
MVGRL: Contrastive Multi-View Representation Learning on Graphs
- Parameters:
in_feats (int) – Input feature size.
n_clusters (int) – Num of clusters.
n_h (int, optional) – hidden units dimension. Defaults to 512.
model_filename (str,optional) – Path to store model parameters. Defaults to ‘mvgrl’.
sparse (bool,optional) – Use sparse tensor. Defaults to False.
nb_epochs (int,optional) – Maximum training epochs. Defaults to 3000.
patience (int,optional) – Early stopping patience. Defaults to 20.
lr (float,optional) – Learning rate. Defaults to 0.001.
weight_decay (float,optional) – Weight decay. Defaults to 0.0.
sample_size (int,optional) – Sample size. Defaults to 2000.
batch_size (int,optional) – Batch size. Defaults to 4.
dataset (str,optional) – Dataset. Defaults to ‘Citeseer’.
- forward(seq1, seq2, adj, diff, sparse, msk)[source]
Forward Propagation
- Parameters:
seq1 (torch.Tensor) – features of the raw graph
seq2 (torch.Tensor) – shuffled features of the diffused graph
adj (torch.Tensor) – adj matrix of the raw graph
diff (torch.Tensor) – ppr matrix of the diffused graph
sparse (bool) – whether tensors are sparse
msk (torch.Tensor) – node mask
- Returns:
ret (torch.Tensor): probability of a positive or negative node
h_1 (torch.Tensor): node embedding of the raw graph from one gcn layer
h_2 (torch.Tensor): node embedding of the diffused graph from one gcn layer
- fit(adj_csr, features)[source]
Fit the model
- Parameters:
adj_csr (sp.csr_matrix) – sparse adjacency matrix.
features (torch.Tensor) – features.
- get_embedding()[source]
Get the embeddings (graph or node level).
- Returns:
(torch.Tensor): embedding of each node
(torch.Tensor): embedding of the graph representations
- training: bool
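MVGRL's second view is a graph diffusion; a sketch of the personalized-PageRank (PPR) diffusion the paper uses, with the teleport probability alpha as an assumed hyper-parameter:

```python
import numpy as np

def ppr_diffusion_sketch(adj: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    # PPR diffusion: alpha * (I - (1 - alpha) * D^-1/2 (A + I) D^-1/2)^-1
    n = adj.shape[0]
    a = adj + np.eye(n)                            # add self-loops
    d_inv_sqrt = np.diag(a.sum(axis=1) ** -0.5)
    a_norm = d_inv_sqrt @ a @ d_inv_sqrt
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * a_norm)
```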
egc.model.node_embedding.saif module
A structure and attribute information fusion (SAIF) module
- class egc.model.node_embedding.saif.SAIF(adj_orig_graph: DGLGraph, data: Tensor, train_loader: DataLoader, label: Tensor, adj: csr_matrix, n_clusters: int, n_node: int, device: device, args: Namespace)[source]
Bases: Module
A structure and attribute information fusion (SAIF) module
- Parameters:
adj_orig_graph (dgl.DGLGraph) – Graph data in dgl
data (torch.Tensor) – node’s features
train_loader (DataLoader) – DataLoader of AE train
label (torch.Tensor) – node’s label
adj (sp.csr.csr_matrix) – adjacency matrix
n_clusters (int) – number of clusters
n_node (int) – number of nodes
device (torch.device) – device
args (argparse.Namespace) – all parameters
- forward()[source]
Forward Propagation
- Returns:
x_hat (torch.Tensor): reconstructed attribute matrix generated by the AE decoder
z_hat (torch.Tensor): reconstructed weighted attribute matrix generated by the IGAE decoder
z_tilde (torch.Tensor): clustering embedding
adj_hat (torch.Tensor): reconstructed adjacency matrix generated by the IGAE decoder
- fit(epochs)[source]
Fit a SAIF clustering model.
- Parameters:
epochs (int) – number of training epochs
- training: bool
egc.model.node_embedding.sgc module
SGC
- egc.model.node_embedding.sgc.eliminate_zeros(adj: spmatrix) spmatrix[source]
Remove self-loops and zero-valued edges.
- Parameters:
adj (sp.spmatrix) – adjacency matrix.
- Returns:
adjacency matrix.
- Return type:
sp.spmatrix
- egc.model.node_embedding.sgc.scale(z)[source]
Feature scaling
- Parameters:
z (torch.Tensor) – hidden embedding
- Returns:
scaled embedding
- Return type:
z_scaled (torch.Tensor)
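A plausible reading of scale is row-wise min-max scaling; this is an assumption, so check the source for the exact variant:

```python
import torch

def scale_sketch(z: torch.Tensor) -> torch.Tensor:
    # row-wise min-max scaling of the hidden embedding (assumed variant)
    zmin = z.min(dim=1, keepdim=True).values
    zmax = z.max(dim=1, keepdim=True).values
    return (z - zmin) / (zmax - zmin)
```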
- class egc.model.node_embedding.sgc.LinTrans(layers, dims)[source]
Bases: Module
Linear Transform Model
- Parameters:
layers (int) – number of linear layers.
dims (list) – Number of units in hidden layers.
- forward(x)[source]
Forward Propagation
- Parameters:
x (torch.Tensor) – feature embedding
- Returns:
hidden embedding
- Return type:
out (torch.Tensor)
- training: bool
- class egc.model.node_embedding.sgc.SGC(in_feats: int, hidden_units: ~typing.List, n_lin_layers: int = 1, n_gnn_layers: int = 10, lr: float = 0.001, n_epochs: int = 400, inner_act: ~typing.Callable = <function SGC.<lambda>>, early_stop: int = 10)[source]
Bases: Module
- preprocess_graph(adj: csr_matrix, layer: int, norm: str = 'sym', renorm: bool = True, lbd: float = 0.6666666666666666) Tensor[source]
Generalized Laplacian Smoothing Filter
- Parameters:
adj (sp.csr_matrix) – 2D sparse adj without self-loops
layer (int) – number of linear layers
norm (str) – normalize mode of Laplacian matrix
renorm (bool) – whether to use the renormalization trick
lbd (float, optional) – filter coefficient. Defaults to 2/3.
- Returns:
Laplacian Smoothing Filter
- Return type:
adjs (sp.csr_matrix)
- forward()[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- fit(graph: DGLGraph, device: device) Tuple[Tensor, Tensor][source]
Fit the model
- Parameters:
graph (dgl.DGLGraph) – dgl graph.
device (torch.device) – torch device.
- training: bool
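SGC's defining simplification is to drop nonlinearities and pre-smooth features with repeated applications of the filter before any linear layer; a hedged sketch:

```python
import torch

def sgc_precompute_sketch(filter_mat: torch.Tensor, x: torch.Tensor, n_gnn_layers: int = 10) -> torch.Tensor:
    # X_smooth = H^{n_gnn_layers} X: all graph convolutions collapse into one precomputation
    for _ in range(n_gnn_layers):
        x = filter_mat @ x
    return x
```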
egc.model.node_embedding.vgae module
GAE & VGAE
- class egc.model.node_embedding.vgae.Encoder(in_features: int, hidden_units_1: int = 32, hidden_units_2: int = 16, activation: str = 'relu')[source]
Bases: Module
Encoder for VGAE
- Parameters:
in_features (int) – input feature dimension.
hidden_units_1 (int) – hidden units size of gcn_1. Defaults to 32.
hidden_units_2 (int) – hidden units size of gcn_2. Defaults to 16.
activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.
- forward(features_norm: Tensor, adj_norm: Tensor) Tuple[Tensor][source]
- Parameters:
features_norm (torch.Tensor) – normalized features
adj_norm (torch.Tensor) – normalized adjacency matrix
- Returns:
(mu, log_sigma, feat_hidden)
- Return type:
Tuple[torch.Tensor]
- training: bool
- class egc.model.node_embedding.vgae.Decoder[source]
Bases: Module
Decoder for VGAE
- forward(mu: Tensor, log_sigma: Tensor, training: bool = True) Tensor[source]
Decoder
- Parameters:
mu (torch.Tensor) – latent mean
log_sigma (torch.Tensor) – latent log standard deviation
training (bool) – whether in training mode
- Returns:
A_hat
- Return type:
(torch.Tensor)
- training: bool
- class egc.model.node_embedding.vgae.VGAE(in_features: int, hidden_units_1: int = 32, hidden_units_2: int = 16, n_epochs: int = 200, early_stopping_epoch: int = 20, lr: float = 0.01, l2_coef: float = 0.0, activation: str = 'relu', model_filename: str = 'vgae')[source]
Bases: Module
- Parameters:
in_features (int) – input feature dimension.
hidden_units_1 (int) – hidden units size of gcn_1. Defaults to 32.
hidden_units_2 (int) – hidden units size of gcn_2. Defaults to 16.
n_epochs (int, optional) – number of embedding training epochs. Defaults to 200.
early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.
lr (float, optional) – learning rate. Defaults to 0.01.
l2_coef (float, optional) – weight decay. Defaults to 0.0.
activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.
model_filename (str, optional) – path to save best model parameters. Defaults to vgae.
- fit(features: lil_matrix, adj_orig: csr_matrix) None[source]
- Parameters:
features (sp.lil_matrix) – 2D sparse features.
adj_orig (sp.csr_matrix) – 2D sparse adj.
- get_embedding(model_filename: str | None = None) Tensor[source]
Get the embeddings (graph or node level).
- Parameters:
model_filename (str, optional) – Model file to load. Defaults to None.
- Returns:
embedding.
- Return type:
(torch.Tensor)
- training: bool
- egc.model.node_embedding.vgae.loss_function(preds, labels, mu, logvar, n_nodes, norm, pos_weight)[source]
Weighted adjacency reconstruction loss plus the KL divergence between the latent distribution and the unit Gaussian prior (see the sketch at the end of this module).
- class egc.model.node_embedding.vgae.DGL_VGAE(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, hidden_dim2: int = 16, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]
Bases: Module
- Parameters:
epochs (int) – number of embedding training epochs.
n_clusters (int) – cluster num.
fead_dim (int) – dim of features.
n_nodes (int) – number of nodes.
hidden_dim1 (int) – hidden units size of gcn_1. Defaults to 32.
hidden_dim2 (int) – hidden units size of gcn_2. Defaults to 16.
dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.
lr (float, optional) – learning rate. Defaults to 0.01.
early_stop (int, optional) – early stopping threshold. Defaults to 10.
activation (str, optional) – activation of gcn layer_1. Defaults to 'relu'.
- encode(g, feat)[source]
Encoder for VGAE
- Parameters:
g (dgl.DGLGraph) – Graph data in dgl
feat (torch.Tensor) – node’s features
- Returns:
self.gc2(g, hidden1) (torch.Tensor): latent mean
self.gc3(g, hidden1) (torch.Tensor): latent log variance
- reparameterize(mu, logvar)[source]
reparameterization trick
- Parameters:
mu (torch.Tensor) – latent mean
logvar (torch.Tensor) – latent log variance
- Returns:
latent variable after the reparameterization trick
- Return type:
(torch.Tensor)
- forward()[source]
Forward Propagation
- Returns:
self.dc(z) (torch.Tensor): reconstructed adj matrix
mu (torch.Tensor): latent mean
logvar (torch.Tensor): latent log variance
- fit(adj_csr, features)[source]
Fit a VGAE model
- Parameters:
adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.
features (torch.Tensor) – node's features
- training: bool
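For reference, the reparameterization trick and a plausible reading of loss_function (the standard VGAE objective matching the signature above; a sketch, not necessarily the library's exact code):

```python
import torch
import torch.nn.functional as F

def reparameterize_sketch(mu: torch.Tensor, logvar: torch.Tensor, training: bool = True) -> torch.Tensor:
    # z = mu + eps * sigma with eps ~ N(0, I); return the mean outside training
    if not training:
        return mu
    std = torch.exp(0.5 * logvar)
    return mu + torch.randn_like(std) * std

def vgae_loss_sketch(preds, labels, mu, logvar, n_nodes, norm, pos_weight):
    # weighted adjacency reconstruction plus KL divergence to the unit Gaussian prior
    rec = norm * F.binary_cross_entropy_with_logits(preds, labels, pos_weight=pos_weight)
    kl = -0.5 / n_nodes * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return rec + kl
```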
Module contents
Node Embedding Methods