AE

AE Embedding

class egc.model.node_embedding.ae.AE(n_input: int, n_clusters: int, hidden1: int = 500, hidden2: int = 500, hidden3: int = 2000, hidden4: int = 2000, hidden5: int = 500, hidden6: int = 500, lr: float = 0.0005, epochs: int = 100, n_z: int = 10, activation: str = 'relu', early_stop: int = 20, if_eva: bool = False, if_early_stop: bool = False)[source]

Bases: Module

AutoEncoder Model

Parameters:
  • n_input (int) – dimension of input features.

  • n_clusters (int) – number of clusters.

  • hidden1 (int) – hidden units size of encoder layer 1.

  • hidden2 (int) – hidden units size of encoder layer 2.

  • hidden3 (int) – hidden units size of encoder layer 3.

  • hidden4 (int) – hidden units size of decoder layer 1.

  • hidden5 (int) – hidden units size of decoder layer 2.

  • hidden6 (int) – hidden units size of decoder layer 3.

  • lr (float, optional) – learning rate. Defaults to 0.0005.

  • epochs (int, optional) – number of embedding training epochs. Defaults to 100.

  • n_z (int, optional) – dimension of the latent embedding z. Defaults to 10.

  • activation (str, optional) – activation function. Defaults to 'relu'.

  • early_stop (int, optional) – number of steps for early stopping. Defaults to 20.

  • if_eva (bool, optional) – whether to run k-means to evaluate the embedding quality. Defaults to False.

  • if_early_stop (bool, optional) – whether to use early stopping. Defaults to False.

forward(x)[source]

Forward Propagation

Parameters:

x (torch.Tensor) – node’s features

Returns:
  • x_hat (torch.Tensor) – reconstructed attribute matrix generated by the AE decoder.

  • z_ae (torch.Tensor) – latent embedding of the AE.

fit(data, train_loader, label) None[source]

Fitting an AE clustering model.

Parameters:
  • data (torch.Tensor) – node features

  • train_loader (DataLoader) – DataLoader for AE training

  • label (torch.Tensor) – node labels

training: bool
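
A minimal usage sketch with random stand-in data; the exact batch format fit expects from train_loader is an assumption:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from egc.model.node_embedding.ae import AE

# Random stand-ins for a real dataset.
X = torch.randn(500, 128)                  # node features
y = torch.randint(0, 7, (500,))            # node labels
# Assumption: each batch yields (features, node indices).
loader = DataLoader(TensorDataset(X, torch.arange(X.shape[0])),
                    batch_size=256, shuffle=True)

model = AE(n_input=X.shape[1], n_clusters=7)
model.fit(X, loader, y)
x_hat, z_ae = model(X)                     # reconstruction and latent embedding
```
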
class egc.model.node_embedding.ae.AE_encoder(n_input: int, hidden1: int, hidden2: int, hidden3: int, n_z: int, activation: object)[source]

Bases: Module

Encoder for AE

Parameters:
  • n_input (int) – dimension of input features.

  • hidden1 (int) – hidden units size of encoder layer 1.

  • hidden2 (int) – hidden units size of encoder layer 2.

  • hidden3 (int) – hidden units size of encoder layer 3.

  • n_z (int) – dimension of the latent embedding z.

  • activation (object) – activation function.

forward(x)[source]

Forward Propagation

Parameters:

x (torch.Tensor) – node’s features

Returns:

Latent embedding of AE

Return type:

z_ae (torch.Tensor)

training: bool
class egc.model.node_embedding.ae.AE_decoder(n_input: int, hidden1: int, hidden2: int, hidden3: int, n_z: int, activation: object)[source]

Bases: Module

Decoder for AE

Parameters:
  • n_input (int) – dimension of the reconstructed features.

  • hidden1 (int) – hidden units size of decoder layer 1.

  • hidden2 (int) – hidden units size of decoder layer 2.

  • hidden3 (int) – hidden units size of decoder layer 3.

  • n_z (int) – dimension of the latent embedding z.

  • activation (object) – activation function.

training: bool
forward(z_ae)[source]

Forward Propagation

Parameters:

z_ae (torch.Tensor) – Latent embedding of AE

Returns:

Reconstructed attribute matrix generated by AE decoder

Return type:

x_hat (torch.Tensor)

GAE

GAE embedding

class egc.model.node_embedding.gae.DGL_GAE(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]

Bases: Module

An implementation of “GAE”

Parameters:
  • epochs (int) – number of embedding training epochs.

  • n_clusters (int) – number of clusters.

  • fead_dim (int) – dimension of input features.

  • n_nodes (int) – number of nodes.

  • hidden_dim1 (int, optional) – hidden units size of gcn_1. Defaults to 32.

  • dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • early_stop (int, optional) – early stopping threshold. Defaults to 10.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

Encode(graph, features)[source]

Encoder for GAE

Parameters:
  • graph (dgl.DGLGraph) – Graph data in dgl

  • features (torch.Tensor) – node’s features

Returns:

Latent embedding of GAE

Return type:

h1 (torch.Tensor)

Decode(z)[source]

Decoder for GAE

Parameters:

z (torch.Tensor) – latent embedding of GAE

Returns:

Reconstructed adjacency matrix

Return type:

(torch.Tensor)

forward()[source]

Forward Propagation

Returns:
  • Graph_Reconstruction (torch.Tensor) – reconstructed adjacency matrix.

  • Latent_Representation (torch.Tensor) – latent embedding of GAE.

fit(adj_csr: csr_matrix, features: Tensor, device: device = device(type='cpu')) None[source]

Fitting a GAE model

Parameters:
  • adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.

  • features (torch.Tensor) – node’s features.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

training: bool
get_embedding() ndarray[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(numpy.ndarray)
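
A minimal usage sketch with random stand-in data:

```python
import scipy.sparse as sp
import torch
from egc.model.node_embedding.gae import DGL_GAE

n, d = 100, 32
features = torch.randn(n, d)
adj_csr = sp.random(n, n, density=0.05, format="csr")
adj_csr = adj_csr + adj_csr.T              # symmetrize the adjacency

model = DGL_GAE(epochs=50, n_clusters=7, fead_dim=d, n_nodes=n)
model.fit(adj_csr, features, device=torch.device("cpu"))
emb = model.get_embedding()                # numpy.ndarray
```
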

IGAE

IGAE Embedding

class egc.model.node_embedding.igae.IGAE(args: Namespace, device)[source]

Bases: Module

This is a symmetric improved graph autoencoder (IGAE). The network reconstructs both the weighted attribute matrix and the adjacency matrix simultaneously.

Parameters:

args (argparse.Namespace) – all parameters

forward(g, feat)[source]

Forward Propagation

Parameters:
  • g (dgl.DGLGraph) – Graph data in dgl

  • feat (torch.Tensor) – node’s features

Returns:
  • z_igae (torch.Tensor) – latent embedding of IGAE.

  • z_hat (torch.Tensor) – reconstructed weighted attribute matrix generated by the IGAE decoder.

  • adj_hat (torch.Tensor) – reconstructed adjacency matrix generated by the IGAE decoder.

fit(g, data, adj)[source]

Fitting an IGAE clustering model.

Parameters:
  • g (dgl.DGLGraph) – Graph data in dgl

  • data (torch.Tensor) – node’s features

  • adj (sp.csr.csr_matrix) – adjacency matrix

training: bool
class egc.model.node_embedding.igae.IGAE_encoder(args: Namespace)[source]

Bases: Module

Encoder for IGAE

Parameters:

args (argparse.Namespace) – all parameters

forward(g, feat)[source]

Forward Propagation

Parameters:
  • g (dgl.DGLGraph) – Graph data in dgl

  • feat (torch.Tensor) – node’s features

Returns:
  • z_igae (torch.Tensor) – latent embedding of IGAE.

  • z_igae_adj (torch.Tensor) – reconstructed adjacency matrix generated by the IGAE encoder.

training: bool
class egc.model.node_embedding.igae.IGAE_decoder(args: Namespace)[source]

Bases: Module

Decoder for IGAE

Parameters:

args (argparse.Namespace) – all parameters

forward(g, z_igae)[source]

Forward Propagation

Parameters:
  • g (dgl.DGLGraph) – Graph data in dgl

  • z_igae (torch.Tensor) – Latent embedding of IGAE

Returns:
  • z_hat (torch.Tensor) – reconstructed weighted attribute matrix generated by the IGAE decoder.

  • z_hat_adj (torch.Tensor) – reconstructed adjacency matrix generated by the IGAE decoder.

training: bool

VGAE

GAE & VGAE

class egc.model.node_embedding.vgae.Encoder(in_features: int, hidden_units_1: int = 32, hidden_units_2: int = 16, activation: str = 'relu')[source]

Bases: Module

Encoder for VGAE

Parameters:
  • in_features (int) – input feature dimension.

  • hidden_units_1 (int) – hidden units size of gcn_1. Defaults to 32.

  • hidden_units_2 (int) – hidden units size of gcn_2. Defaults to 16.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

forward(features_norm: Tensor, adj_norm: Tensor) Tuple[Tensor][source]
Parameters:
  • features_norm (torch.Tensor) – normalized feature matrix

  • adj_norm (torch.Tensor) – normalized adjacency matrix

Returns:

(mu, log_sigma, feat_hidden)

Return type:

Tuple[torch.Tensor]

training: bool
class egc.model.node_embedding.vgae.Decoder[source]

Bases: Module

Decoder for VGAE

forward(mu: Tensor, log_sigma: Tensor, training: bool = True) Tensor[source]

Decoder

Parameters:
  • mu (torch.Tensor) – mu

  • log_sigma (torch.Tensor) – log_sigma

  • training (bool) – whether in training mode

Returns:

A_hat

Return type:

(torch.Tensor)

training: bool
class egc.model.node_embedding.vgae.VGAE(in_features: int, hidden_units_1: int = 32, hidden_units_2: int = 16, n_epochs: int = 200, early_stopping_epoch: int = 20, lr: float = 0.01, l2_coef: float = 0.0, activation: str = 'relu', model_filename: str = 'vgae')[source]

Bases: Module

Parameters:
  • in_features (int) – input feature dimension.

  • hidden_units_1 (int) – hidden units size of gcn_1. Defaults to 32.

  • hidden_units_2 (int) – hidden units size of gcn_2. Defaults to 16.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 200.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

  • model_filename (str, optional) – path to save best model parameters. Defaults to ‘vgae’.

forward()[source]
Returns:

loss

Return type:

loss (torch.Tensor)

fit(features: lil_matrix, adj_orig: csr_matrix) None[source]
Parameters:
  • features (sp.lil_matrix) – 2D sparse features.

  • adj_orig (sp.csr_matrix) – 2D sparse adj.

get_embedding(model_filename: str | None = None) Tensor[source]

Get the embeddings (graph or node level).

Parameters:

model_filename (str, optional) – Model file to load. Defaults to None.

Returns:

embedding.

Return type:

(torch.Tensor)

training: bool
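
A minimal usage sketch with random stand-in data:

```python
import numpy as np
import scipy.sparse as sp
from egc.model.node_embedding.vgae import VGAE

n, d = 100, 32
features = sp.lil_matrix(np.random.rand(n, d).astype("float32"))
adj_orig = sp.csr_matrix(np.random.binomial(1, 0.05, (n, n)).astype("float32"))

model = VGAE(in_features=d, n_epochs=50)
model.fit(features, adj_orig)
emb = model.get_embedding()                # torch.Tensor
```
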
egc.model.node_embedding.vgae.loss_function(preds, labels, mu, logvar, n_nodes, norm, pos_weight)[source]
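
The helper below is a hedged sketch of what a VGAE objective with this signature typically computes (norm-weighted BCE reconstruction plus a KL term); the library’s actual implementation may differ in details:

```python
import torch
import torch.nn.functional as F

def vgae_loss(preds, labels, mu, logvar, n_nodes, norm, pos_weight):
    # Norm-weighted reconstruction loss over all node pairs.
    cost = norm * F.binary_cross_entropy_with_logits(
        preds, labels, pos_weight=pos_weight)
    # KL divergence between q(z | X, A) and the standard normal prior.
    kl = -0.5 / n_nodes * torch.mean(
        torch.sum(1 + 2 * logvar - mu.pow(2) - logvar.exp().pow(2), dim=1))
    return cost + kl
```
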
class egc.model.node_embedding.vgae.DGL_VGAE(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, hidden_dim2: int = 16, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]

Bases: Module

Parameters:
  • epochs (int) – number of embedding training epochs.

  • n_clusters (int) – number of clusters.

  • fead_dim (int) – dimension of input features.

  • n_nodes (int) – number of nodes.

  • hidden_dim1 (int, optional) – hidden units size of gcn_1. Defaults to 32.

  • hidden_dim2 (int, optional) – hidden units size of gcn_2. Defaults to 16.

  • dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • early_stop (int, optional) – early stopping threshold. Defaults to 10.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

encode(g, feat)[source]

Encoder for VGAE

Parameters:
  • g (dgl.DGLGraph) – Graph data in dgl

  • feat (torch.Tensor) – node’s features

Returns:
  • self.gc2(g, hidden1) (torch.Tensor) – latent mean.

  • self.gc3(g, hidden1) (torch.Tensor) – latent log variance.

reparameterize(mu, logvar)[source]

reparameterization trick

Parameters:
  • mu (torch.Tensor) – latent mean

  • logvar (torch.Tensor) – latent log variance

Returns:

latent sample drawn via the reparameterization trick

Return type:

(torch.Tensor)
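
A common formulation of the trick is sketched below; whether the implementation uses exp(logvar) or exp(0.5 * logvar) as the standard deviation is a convention we do not assert here:

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    std = torch.exp(0.5 * logvar)   # standard deviation from log variance
    eps = torch.randn_like(std)     # noise drawn from N(0, I)
    return mu + eps * std           # differentiable sample from N(mu, std^2)
```
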

forward()[source]

Forward Propagation

Returns:
  • self.dc(z) (torch.Tensor) – reconstructed adjacency matrix.

  • mu (torch.Tensor) – latent mean.

  • logvar (torch.Tensor) – latent log variance.

fit(adj_csr, features)[source]

Fitting a VGAE model

Parameters:
  • adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.

  • features (torch.Tensor) – node’s features

get_embedding()[source]

Get cluster embedding.

Returns: embedding (numpy.ndarray)

training: bool
get_memberships()[source]

Get cluster membership.

Returns: memberships (numpy.ndarray)

SGC

SGC

class egc.model.node_embedding.sgc.SGC(in_feats: int, hidden_units: List, n_lin_layers: int = 1, n_gnn_layers: int = 10, lr: float = 0.001, n_epochs: int = 400, inner_act: Callable = <function SGC.<lambda>>, early_stop: int = 10)[source]

Bases: Module

static bce_loss(preds, labels, norm=1.0, pos_weight=None)[source]
preprocess_graph(adj: csr_matrix, layer: int, norm: str = 'sym', renorm: bool = True, lbd: float = 0.6666666666666666) Tensor[source]

Generalized Laplacian Smoothing Filter

Parameters:
  • adj (sp.csr_matrix) – 2D sparse adj without self-loops

  • layer (int) – number of smoothing (gnn) layers

  • norm (str) – normalization mode of the Laplacian matrix

  • renorm (bool) – whether to use the renormalization trick

  • lbd (float, optional) – smoothing coefficient of the filter. Defaults to 2/3.

Returns:

Laplacian Smoothing Filter

Return type:

adjs (sp.csr_matrix)
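
A hedged sketch of an AGE-style smoothing filter H = I - lbd * L_sym with the renormalization trick A' = A + I (our reading of this helper; assumes no isolated nodes):

```python
import numpy as np
import scipy.sparse as sp

def smoothing_filter(adj: sp.csr_matrix, lbd: float = 2 / 3,
                     renorm: bool = True) -> sp.csr_matrix:
    ident = sp.eye(adj.shape[0], format="csr")
    a = adj + ident if renorm else adj              # renormalization trick
    deg = np.asarray(a.sum(1)).flatten()
    d_inv_sqrt = sp.diags(np.power(deg, -0.5))      # D'^{-1/2}
    lap_sym = ident - d_inv_sqrt @ a @ d_inv_sqrt   # normalized Laplacian
    return ident - lbd * lap_sym                    # H = I - lbd * L_sym
```
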

update_features(adj)[source]

Check whether the adjacency matrix needs its self-loops removed, then update the smoothed features.

forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fit(graph: DGLGraph, device: device) Tuple[Tensor, Tensor][source]

Fitting

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device) – torch device.

get_embedding()[source]
training: bool

AGE

AGE Model

class egc.model.node_embedding.age.AGE(dims: list | None = None, feat_dim: int | None = None, gnnlayers_num: int = 3, linlayers_num: int = 1, lr: float = 0.001, upth_st: float = 0.0015, upth_ed: float = 0.001, lowth_st: float = 0.1, lowth_ed: float = 0.5, upd: float = 10, bs: int = 10000, epochs: int = 400, norm: str = 'sym', renorm: bool = True, estop_steps: int = 5)[source]

Bases: Module

AGE paper: Adaptive Graph Encoder for Attributed Graph Embedding

Parameters:
  • dims (list, optional) – number of units in each hidden layer.

  • feat_dim (int, optional) – input feature dimension.

  • gnnlayers_num (int) – number of gnn layers. Defaults to 3.

  • linlayers_num (int, optional) – number of linear layers. Defaults to 1.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • upth_st (float, optional) – upper threshold start value. Defaults to 0.0015.

  • upth_ed (float, optional) – upper threshold end value. Defaults to 0.001.

  • lowth_st (float, optional) – lower threshold start value. Defaults to 0.1.

  • lowth_ed (float, optional) – lower threshold end value. Defaults to 0.5.

  • upd (float, optional) – threshold update interval in epochs. Defaults to 10.

  • bs (int, optional) – batch size. Defaults to 10000.

  • epochs (int, optional) – number of epochs to train. Defaults to 400.

  • norm (str, optional) – normalization mode of the Laplacian matrix. Defaults to ‘sym’.

  • renorm (bool, optional) – whether to use the renormalization trick. Defaults to True.

  • estop_steps (int, optional) – number of early-stop steps. Defaults to 5.

forward(x, y)[source]

Forward Propagation

Parameters:
  • x (torch.Tensor) – Sample node embedding for x-axis

  • y (torch.Tensor) – Sample node embedding for y-axis

Returns:

prediction of adj

Return type:

batch_pred (torch.Tensor)

fit(adj: csr_matrix, features: Tensor) None[source]

Fitting an AGE model

Parameters:
  • adj (sp.csr_matrix) – 2D sparse adj.

  • features (torch.Tensor) – features.

get_embedding()[source]

Get cluster embedding.

Returns: embedding (torch.Tensor)

training: bool
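
A minimal usage sketch with random stand-in data:

```python
import scipy.sparse as sp
import torch
from egc.model.node_embedding.age import AGE

n, d = 100, 32
adj = sp.csr_matrix(sp.random(n, n, density=0.05))
features = torch.randn(n, d)

model = AGE(dims=[500], feat_dim=d, gnnlayers_num=3, epochs=40)
model.fit(adj, features)
emb = model.get_embedding()
```
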
class egc.model.node_embedding.age.LinTrans(layers, dims)[source]

Bases: Module

Linear Transform Model

Parameters:
  • layers (int) – number of linear layers.

  • dims (list) – Number of units in hidden layers.

forward(x)[source]

Forward Propagation

Parameters:

x (torch.Tensor) – feature embedding

Returns:

hidden embedding

Return type:

out (torch.Tensor)

training: bool
class egc.model.node_embedding.age.SampleDecoder(act=torch.sigmoid)[source]

Bases: Module

Decoder model based on the inner product

Parameters:

act (object, optional) – activation of the decoder. Defaults to torch.sigmoid.

forward(zx, zy)[source]

Forward Propagation

Parameters:
  • zx (torch.Tensor) – Sample node embedding for x-axis

  • zy (torch.Tensor) – Sample node embedding for y-axis

Returns:

prediction of adj

Return type:

sim (torch.Tensor)

training: bool
egc.model.node_embedding.age.loss_function(adj_preds, adj_labels)[source]

Compute loss

Parameters:
  • adj_preds (torch.Tensor) – reconstructed adj

  • adj_labels (torch.Tensor) – adjacency labels

Returns:

loss

Return type:

torch.Tensor

egc.model.node_embedding.age.update_similarity(z, upper_threshold, lower_treshold, pos_num, neg_num)[source]

update similarity

Parameters:
  • z (numpy.ndarray) – hidden embedding

  • upper_threshold (float) – upper threshold

  • lower_treshold (float) – lower threshold

  • pos_num (int) – number of positive samples

  • neg_num (int) – number of negative samples

Returns:
  • (numpy.ndarray) – indices of positive samples.

  • (numpy.ndarray) – indices of negative samples.

egc.model.node_embedding.age.update_threshold(upper_threshold, lower_treshold, up_eta, low_eta)[source]

update threshold

Parameters:
  • upper_threshold (float) – upper threshold

  • lower_treshold (float) – lower threshold

  • up_eta (float) – update step size of upper threshold

  • low_eta (float) – update step size of lower threshold

Returns:
  • upth (float) – updated upper threshold.

  • lowth (float) – updated lower threshold.
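
A hedged sketch of how the step sizes might be derived, assuming a linear schedule from the start to the end thresholds over epochs / upd update steps (an assumption based on the AGE parameter docs above):

```python
from egc.model.node_embedding.age import update_threshold

upth_st, upth_ed = 0.0015, 0.001
lowth_st, lowth_ed = 0.1, 0.5
epochs, upd = 400, 10

up_eta = (upth_ed - upth_st) / (epochs / upd)     # negative: upper bound shrinks
low_eta = (lowth_ed - lowth_st) / (epochs / upd)  # positive: lower bound grows
upth, lowth = update_threshold(upth_st, lowth_st, up_eta, low_eta)
```
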

egc.model.node_embedding.age.preprocess_graph(adj: csr_matrix, layer: int, norm: str = 'sym', renorm: bool = True) Tensor[source]

Generalized Laplacian Smoothing Filter

Parameters:
  • adj (sp.csr_matrix) – 2D sparse adj.

  • layer (int) – number of smoothing (gnn) layers

  • norm (str) – normalize mode of Laplacian matrix

  • renorm (bool) – If with the renormalization trick

Returns:

Laplacian Smoothing Filter

Return type:

adjs (sp.csr_matrix)

egc.model.node_embedding.age.scale(z)[source]

Feature scaling

Parameters:

z (torch.Tensor) – hidden embedding

Returns:

scaled embedding

Return type:

z_scaled (torch.Tensor)
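
One common reading of “feature scale” is per-row min-max scaling; the sketch below (with the hypothetical name scale_rows) illustrates that reading, though the library’s scale() may differ:

```python
import torch

def scale_rows(z: torch.Tensor) -> torch.Tensor:
    zmin = z.min(dim=1, keepdim=True).values
    zmax = z.max(dim=1, keepdim=True).values
    return (z - zmin) / (zmax - zmin)   # each row mapped into [0, 1]
```
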

MVGRL

Contrastive Multi-View Representation Learning on Graphs https://arxiv.org/abs/2006.05582

class egc.model.node_embedding.mvgrl.MVGRL(in_feats: int, n_clusters: int, n_h: int = 512, model_filename: str = 'mvgrl', sparse: bool = False, nb_epochs: int = 3000, patience: int = 20, lr: float = 0.001, weight_decay: float = 0.0, sample_size: int = 2000, batch_size: int = 4, dataset: str = 'Citeseer')[source]

Bases: Module

MVGRL: Contrastive Multi-View Representation Learning on Graphs

Parameters:
  • in_feats (int) – input feature size.

  • n_clusters (int) – number of clusters.

  • n_h (int, optional) – hidden units dimension. Defaults to 512.

  • model_filename (str, optional) – path to store model parameters. Defaults to ‘mvgrl’.

  • sparse (bool, optional) – use sparse tensors. Defaults to False.

  • nb_epochs (int, optional) – maximum training epochs. Defaults to 3000.

  • patience (int, optional) – early stopping patience. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • weight_decay (float, optional) – weight decay. Defaults to 0.0.

  • sample_size (int, optional) – sample size. Defaults to 2000.

  • batch_size (int, optional) – batch size. Defaults to 4.

  • dataset (str, optional) – dataset name. Defaults to ‘Citeseer’.

forward(seq1, seq2, adj, diff, sparse, msk)[source]

Forward Propagation

Parameters:
  • seq1 (torch.Tensor) – features of the raw graph

  • seq2 (torch.Tensor) – shuffled features of the diffused graph

  • adj (torch.Tensor) – adjacency matrix of the raw graph

  • diff (torch.Tensor) – PPR matrix of the diffused graph

  • sparse (bool) – whether to use sparse tensors

  • msk (torch.Tensor) – node mask

Returns:
  • ret (torch.Tensor) – probability of positive or negative nodes.

  • h_1 (torch.Tensor) – node embedding of the raw graph from one gcn layer.

  • h_2 (torch.Tensor) – node embedding of the diffused graph from one gcn layer.

fit(adj_csr, features)[source]

Fitting

Parameters:
  • adj_csr (sp.csr_matrix) – sparse adjacency matrix.

  • features (torch.Tensor) – features.

get_embedding()[source]

Get the embeddings (graph or node level).

Returns:
  • (torch.Tensor) – embedding of each node.

  • (torch.Tensor) – embedding of graph representations.

get_memberships()[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

training: bool
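
A minimal usage sketch with random stand-in data:

```python
import numpy as np
import scipy.sparse as sp
import torch
from egc.model.node_embedding.mvgrl import MVGRL

n, d = 300, 64
adj_csr = sp.csr_matrix(np.random.binomial(1, 0.02, (n, n)).astype("float32"))
features = torch.randn(n, d)

model = MVGRL(in_feats=d, n_clusters=6, sample_size=200, nb_epochs=50)
model.fit(adj_csr, features)
emb = model.get_embedding()
clusters = model.get_memberships()
```
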

SENet

SENet Kmeans

class egc.model.node_embedding.SENet.SENetEmbed(feature: FloatTensor, labels: IntTensor, adj: array, n_clusters: int, hidden0: int = 16, hidden1: int = 16, lr: float = 0.03, epochs: int = 50, weight_decay: float = 0.0, lam: float = 1.0, n_iter: int = 3)[source]

Bases: Module

SENet Embedding

Parameters:
  • feature (FloatTensor) – node features.

  • labels (IntTensor) – node labels.

  • adj (ndarray) – graph adjacency matrix.

  • n_clusters (int) – number of clusters.

  • hidden0 (int, optional) – hidden units size of gnn layer 1. Defaults to 16.

  • hidden1 (int, optional) – hidden units size of gnn layer 2. Defaults to 16.

  • lr (float, optional) – learning rate. Defaults to 3e-2.

  • epochs (int, optional) – number of embedding training epochs. Defaults to 50.

  • weight_decay (float, optional) – weight decay. Defaults to 0.0.

  • lam (float, optional) – used to construct the improved graph. Defaults to 1.0.

  • n_iter (int, optional) – number of feature convolution iterations. Defaults to 3.

forward()[source]

Get embedding by three networks

Returns:

(torch.FloatTensor, torch.FloatTensor, torch.FloatTensor)

\[\begin{split}Z_1 &= \tanh(D'^{-1} A' X W_1) \\ Z_2 &= \tanh(D'^{-1} A' Z_1 W_2) \\ F &= Z_2 W_3, \quad F^T F = Q Q^T \\ Z_3 &= F (Q^{-1})^T\end{split}\]

get_imporved_feature(n_iter, features)[source]

Get the improved feature after n_iter convolutions

Parameters:
  • n_iter (int) – number of convolution iterations

  • features (tensor) – original graph feature

Returns:

(tensor) \(X' = (D'^{-1} A')^{n_{iter}} X\)

get_normalized_kernel_martix(feature)[source]

Get kernel matrix

Parameters:

feature (tensor) – improved graph feature

Returns: (tensor) \(K = \mathrm{ReLU}(X' X'^T)\), symmetrized as \(K = (K + K^T)/2\)

init_weights()[source]

Initialize the parameters of the networks

get_embedding()[source]

Get the concatenated embedding

Returns: (tensor) \(Z = [Z_1, Z_2, Z_3]\)

fit()[source]

Train the model

training: bool

SAIF

A structure and attribute information fusion (SAIF) module

class egc.model.node_embedding.saif.SAIF(adj_orig_graph: DGLGraph, data: Tensor, train_loader: DataLoader, label: Tensor, adj: csr_matrix, n_clusters: int, n_node: int, device: device, args: Namespace)[source]

Bases: Module

A structure and attribute information fusion (SAIF) module

Parameters:
  • adj_orig_graph (dgl.DGLGraph) – Graph data in dgl

  • data (torch.Tensor) – node’s features

  • train_loader (DataLoader) – DataLoader of AE train

  • label (torch.Tensor) – node’s label

  • adj (sp.csr.csr_matrix) – adjacency matrix

  • n_clusters (int) – numbers of clusters

  • n_node (int) – number of nodes.

  • device (torch.device) – torch device.

  • args (argparse.Namespace) – all parameters

forward()[source]

Forward Propagation

Returns:
  • x_hat (torch.Tensor) – reconstructed attribute matrix generated by the AE decoder.

  • z_hat (torch.Tensor) – reconstructed weighted attribute matrix generated by the IGAE decoder.

  • z_tilde (torch.Tensor) – clustering embedding.

  • adj_hat (torch.Tensor) – reconstructed adjacency matrix generated by the IGAE decoder.

get_embedding()[source]

Get cluster embedding.

Returns: embedding (numpy.ndarray)

fit(epochs)[source]

Fitting a SAIF clustering model.

Parameters:

epochs (int) – number of training epochs

training: bool

DGI

Embedding By DGI

Adapted from: https://github.com/PetarV-/DGI

class egc.model.node_embedding.dgi.DGIEmbed(in_feats: int, out_feats_list: List[int], n_epochs: int = 10000, early_stopping_epoch: int = 20, batch_size: int = 1024, neighbor_sampler_fanouts: List[int] = -1, lr: float = 0.001, l2_coef: float = 0.0, activation: str = 'prelu', model_filename: str = 'dgi')[source]

Bases: Module

DGI Embedding

Parameters:
  • in_feats (int) – input feature dimension.

  • out_feats_list (List[int]) – List of hidden units dimensions.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 10000.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • batch_size (int, optional) – batch size. Defaults to 1024.

  • neighbor_sampler_fanouts (List[int] or int, optional) –

    List of neighbors to sample for each GNN layer, with the i-th element being the fanout for the i-th GNN layer. Defaults to -1.

    • If only a single integer is provided, DGL assumes that every layer will have the same fanout.

    • If -1 is provided on one layer, then all inbound edges will be included.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • activation (str, optional) – activation of gcn layer. Defaults to prelu.

  • model_filename (str, optional) – path to save best model parameters. Defaults to dgi.

forward(block, input_feats) Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]

Fit for Specific Graph

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding(graph: DGLGraph, device: device = device(type='cpu'), model_filename: str | None = None) Tensor[source]

Get the embeddings.

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

  • model_filename (str, optional) – Model file to load. Defaults to None.

Returns:

Embeddings.

Return type:

torch.Tensor

training: bool
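
A minimal usage sketch; the node feature field name “feat” is an assumption about what fit expects on the graph:

```python
import dgl
import torch
from egc.model.node_embedding.dgi import DGIEmbed

g = dgl.rand_graph(1000, 5000)
g.ndata["feat"] = torch.randn(1000, 64)    # assumed feature field

model = DGIEmbed(in_feats=64, out_feats_list=[512], n_epochs=100)
device = torch.device("cpu")
model.fit(g, device)
emb = model.get_embedding(g, device)
```
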

GMI

Embedding By GMI

Adapted From: https://github.com/zpeng27/GMI

class egc.model.node_embedding.gmi.GMIEmbed(in_features: int, hidden_units: int = 512, n_epochs: int = 550, early_stopping_epoch: int = 20, lr: float = 0.001, l2_coef: float = 0.0, alpha: float = 0.8, beta: float = 1.0, gamma: float = 1.0, activation: str = 'prelu', gcn_depth: int = 2)[source]

Bases: Module

GMI Embedding

Parameters:
  • in_features (int) – input feature dimension.

  • hidden_units (int, optional) – hidden units size of gcn. Defaults to 512.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 550.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • alpha (float, optional) – parameter for \(I(h_i; x_i)\). Defaults to 0.8.

  • beta (float, optional) – parameter for \(I(h_i; x_j)\). Defaults to 1.0.

  • gamma (float, optional) – parameter for \(I(w_{ij}; a_{ij})\). Defaults to 1.0.

  • activation (str, optional) – activation of gcn layer. Defaults to “prelu”.

  • gcn_depth (int, optional) – depth of the gcn encoder. Defaults to 2.

calc_loss(mi_pos: Tensor, mi_neg: Tensor, local_mi_pos: Tensor, local_mi_neg: Tensor, adj_rebuilt: Tensor) Tensor[source]

Calculate Loss

Parameters:
  • mi_pos (torch.Tensor) – \(D_w(h_i, x_i)\).

  • mi_neg (torch.Tensor) – \(D_w(h_i, x'_i)\).

  • local_mi_pos (torch.Tensor) – \(D_w(h_i, x_j)\).

  • local_mi_neg (torch.Tensor) – \(D_w(h_i, x'_j)\).

  • adj_rebuilt (torch.Tensor) – \(w_{ij}\)

Returns:

loss.

\[\begin{split}loss = & \alpha * sp(-D_w(h_i,x_i))+E(sp(D_w(h_i,x'_i))) \\ & + \beta * sp(-D_w(h_i,x_j))+E(sp(D_w(h_i,x'_j))) \\ & + \gamma * \text{reconstruct}_{loss} \\\end{split}\]

Return type:

(torch.Tensor)

forward(neg_sample_list: Tensor) Tensor[source]

Forward Propagation

Parameters:

neg_sample_list (torch.Tensor) – negative sample list.

Returns:

loss.

Return type:

torch.Tensor

fit(features: lil_matrix, adj_orig: csr_matrix, neg_list_num: int = 5) None[source]

Fit for Specific Graph

Parameters:
  • features (sp.lil_matrix) – 2D sparse features.

  • adj_orig (sp.csr_matrix) – 2D sparse adj.

  • neg_list_num (int, optional) – negative sample times. Defaults to 5.

set_features_norm(features_norm) None[source]

Set the features row normalized

Parameters:

features_norm (torch.Tensor) – normalized 3D features tensor in shape of [1, xx, xx]

set_adj_norm(adj_norm) None[source]

Set the adjacency symmetrically normalized

Parameters:

adj_norm (torch.Tensor) – symmetrically normalized 2D adjacency tensor

get_features_norm() Tensor[source]

Get the features row normalized

Returns:

normalized 3D features tensor in shape of [1, xx, xx]

Return type:

features_norm (torch.Tensor)

get_adj_norm() Tensor[source]

Get the adjacency symmetrically normalized

Returns:

symmetrically normalized 2D adjacency tensor

Return type:

adj_norm (torch.Tensor)

get_embedding() Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

training: bool
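
A minimal usage sketch with random stand-in data:

```python
import numpy as np
import scipy.sparse as sp
from egc.model.node_embedding.gmi import GMIEmbed

n, d = 100, 32
features = sp.lil_matrix(np.random.rand(n, d).astype("float32"))
adj_orig = sp.csr_matrix(np.random.binomial(1, 0.05, (n, n)).astype("float32"))

model = GMIEmbed(in_features=d, n_epochs=50)
model.fit(features, adj_orig, neg_list_num=5)
emb = model.get_embedding()
```
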

AGC

AGC Embedding

class egc.model.node_embedding.agc.AGCEmbed(adj: Tensor, feature: Tensor, labels: Tensor, epochs: int = 60, n_clusters: int = 7, rep: int = 10)[source]

Bases: Module

AGC Embedding

Parameters:
  • adj (torch.Tensor) – adjacency matrix.

  • feature (torch.Tensor) – node features.

  • labels (torch.Tensor) – node labels.

  • epochs (int, optional) – number of training epochs. Defaults to 60.

  • n_clusters (int, optional) – number of clusters. Defaults to 7.

  • rep (int, optional) – number of repetitions. Defaults to 10.

forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fit()[source]
normalize_adj()[source]
to_onehot(prelabel)[source]
square_dist(prelabel, feature)[source]
get_embedding()[source]
training: bool
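
A minimal usage sketch; AGCEmbed takes the data at construction time (random stand-ins below):

```python
import torch
from egc.model.node_embedding.agc import AGCEmbed

n, d, k = 100, 32, 7
adj = (torch.rand(n, n) < 0.05).float()
feature = torch.randn(n, d)
labels = torch.randint(0, k, (n,))

model = AGCEmbed(adj=adj, feature=feature, labels=labels, n_clusters=k)
model.fit()
emb = model.get_embedding()
```
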