egc.model.graph_clustering.disjoint package

Submodules

egc.model.graph_clustering.disjoint.ComE module

ComE model

class egc.model.graph_clustering.disjoint.ComE.ComE(graph, n_clusters=7, size=2, down_sampling=0, table_size=100000000, labels=None, batch_size=50, num_workers=10, negative=5, lr=0.025, window_size=10, num_walks=10, walk_length=80, num_iter=1, output_file='Cora', alpha=0.1, beta=0.1, reg_covar=1e-05)[source]

Bases: object

Class that keeps track of all the parameters used during the learning of the embedding.

Parameters:
  • nodes_degree – Dict with node_id: degree of node

  • size – dimensionality of the projection (embedding) space

  • down_sampling – down-sampling rate for common (frequent) nodes

  • table_size – size of the negative-sampling table to generate

  • path_labels – location of the file containing the ground truth (one label per node)

  • input_file – name of the file containing the ground truth (one label per node)

build_vocab_(nodes_degree)[source]

Build the vocabulary from a sequence of paths (can be a once-only generator stream), sorted by node id.

precalc_sampling()[source]

Precompute each vocabulary item’s threshold for down-sampling

reset_weights()[source]

Reset all projection weights to an initial (untrained) state, but keep the existing vocabulary.

reset_communities_weights()[source]

Reset all community (cluster) weights to an initial (untrained) state, but keep the existing vocabulary.

make_table(power=0.75)[source]

Create a table using stored vocabulary word counts for drawing random words in the negative sampling training routines.

Called internally from build_vocab().
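Example: a minimal sketch of such a unigram table (word2vec-style sampling with counts raised to the 0.75 power; the helper name and the node_counts input are illustrative, not ComE’s actual internals):

import numpy as np

def make_unigram_table(node_counts, table_size=100_000_000, power=0.75):
    # node_counts: dict mapping node id -> occurrence count (hypothetical input).
    nodes = sorted(node_counts)
    counts = np.array([node_counts[n] for n in nodes], dtype=np.float64)
    probs = counts ** power
    probs /= probs.sum()
    # Node i fills a share of slots proportional to probs[i]; rounding makes
    # the final table length only approximately table_size.
    reps = np.round(probs * table_size).astype(np.int64)
    return np.repeat(np.array(nodes), reps)

# Drawing negatives is then a uniform lookup into the table:
# table = make_unigram_table({0: 10, 1: 3, 2: 1}, table_size=1000)
# negatives = table[np.random.randint(0, len(table), size=5)]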

fit()[source]
get_memberships()[source]

egc.model.graph_clustering.disjoint.SEComm module

SEComm implementation

class egc.model.graph_clustering.disjoint.SEComm.SEComm(n_clusters: int, n_nodes: int, num_features: int, activation: str, base_model: str, batch_size: int, num_hidden: int, num_layers: int, num_proj_hidden: int, tau: float, num_cl_hidden: int, dropout: float, pretrain_epochs: int, learning_rate: float, weight_decay: float, drop_edge_rate_1: float, drop_edge_rate_2: float, drop_feature_rate_1: float, drop_feature_rate_2: float, x_norm: bool, iterations: int, threshold: float, se_epochs: int, se_alpha: float, se_patience: int, se_lr: float, cluster_epochs: int, cluster_alpha: float, final_beta: float, cluster_patience: int)[source]

Bases: Base, Module

SEComm model

Parameters:

See the _SEComm_subparser function in utils/argparser.py.

forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fit(graph, features, label)[source]

Fitting a SEComm model

Parameters:
  • graph (dgl.DGLGraph) – data graph.

  • features (torch.Tensor) – features.

  • label (torch.Tensor) – cluster label of each node

get_embedding()[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships()[source]

Get the memberships (graph or node level).

Returns:

memberships.

Return type:

(numpy.ndarray)
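The drop_edge_rate_* and drop_feature_rate_* arguments control two stochastic graph views in the style of GRACE-type contrastive learning. A minimal illustration of such augmentations (hypothetical helpers, not the model’s internals):

import torch

def drop_edges(src: torch.Tensor, dst: torch.Tensor, drop_rate: float):
    # Keep each edge independently with probability 1 - drop_rate.
    keep = torch.rand(src.shape[0]) >= drop_rate
    return src[keep], dst[keep]

def mask_features(x: torch.Tensor, drop_rate: float) -> torch.Tensor:
    # Zero out whole feature dimensions with probability drop_rate,
    # shared across all nodes.
    keep = (torch.rand(x.shape[1]) >= drop_rate).float()
    return x * keep

# Two views of the same graph, built with (drop_edge_rate_1, drop_feature_rate_1)
# and (drop_edge_rate_2, drop_feature_rate_2), are then compared contrastively.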

egc.model.graph_clustering.disjoint.SENet_kmeans module

SENet Kmeans

class egc.model.graph_clustering.disjoint.SENet_kmeans.SENetKmeans(feature: FloatTensor, labels: IntTensor, adj: FloatTensor, n_clusters: int, hidden0: int = 16, hidden1: int = 16, lr: float = 0.03, epochs: int = 50, weight_decay: float = 0.0, lam: float = 1.0, n_iter: int = 3)[source]

Bases: Base

SENet Kmeans

Parameters:
  • feature (FloatTensor) – node features.

  • labels (IntTensor) – node labels.

  • adj (FloatTensor) – graph adjacency matrix.

  • n_clusters (int) – number of clusters.

  • hidden0 (int, optional) – hidden units size of GNN layer 1. Defaults to 16.

  • hidden1 (int, optional) – hidden units size of GNN layer 2. Defaults to 16.

  • lr (float, optional) – learning rate. Defaults to 3e-2.

  • epochs (int, optional) – number of embedding training epochs. Defaults to 50.

  • weight_decay (float, optional) – weight decay. Defaults to 0.0.

  • lam (float, optional) – used to construct the improved graph. Defaults to 1.0.

  • n_iter (int, optional) – number of feature convolution iterations. Defaults to 3.

fit()[source]

Fit for Specific Graph

get_embedding()[source]

Get embedding from trained model

Returns:

(torch.FloatTensor) node embedding

get_memberships()[source]

Get predicted labels by k-means

Returns:

(torch.IntTensor) predicted labels

egc.model.graph_clustering.disjoint.agc_kmeans module

AGC Kmeans

class egc.model.graph_clustering.disjoint.agc_kmeans.AGC(adj: Tensor, feature: Tensor, labels: Tensor, epochs: int = 60, n_clusters: int = 7, rep: int = 10)[source]

Bases: Base

AGC Kmeans

Parameters:
  • feature (FloatTensor) – node features.

  • labels (IntTensor) – node labels.

  • adj (FloatTensor) – graph adjacency matrix.

  • n_clusters (int) – number of clusters.

  • epochs (int, optional) – number of embedding training epochs. Defaults to 60.

  • rep (int, optional) – number of times intra(c) is computed. Defaults to 10.

fit()[source]

Fit for Specific Graph

get_embedding()[source]

Get embedding from trained model

Returns:

(torch.FloatTensor) node embedding

get_memberships()[source]

Get predicted labels by k-means

Returns:

(torch.IntTensor) predicted labels
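For context, AGC smooths node features with a k-order low-pass graph filter and clusters the smoothed features; the class selects the order k adaptively via the intra(c) criterion (computed rep times). A sketch of the filter-then-cluster pipeline, assuming a dense numpy adjacency (not the class’s actual code):

import numpy as np
from sklearn.cluster import KMeans

def agc_style_clustering(adj: np.ndarray, x: np.ndarray, k: int, n_clusters: int) -> np.ndarray:
    # k-order low-pass filtering: x <- (I - L_sym / 2)^k x, where L_sym is the
    # symmetrically normalized Laplacian of adj with self-loops added.
    n = adj.shape[0]
    a_hat = adj + np.eye(n)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    l_sym = np.eye(n) - d_inv_sqrt @ a_hat @ d_inv_sqrt
    g = np.eye(n) - 0.5 * l_sym
    for _ in range(k):
        x = g @ x
    # Cluster the smoothed features.
    return KMeans(n_clusters=n_clusters).fit_predict(x)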

egc.model.graph_clustering.disjoint.agcn module

AGCN implementation

class egc.model.graph_clustering.disjoint.agcn.MLP_L(n_mlp)[source]

Bases: Module

Used to reduce the dimension of features

forward(mlp_in)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class egc.model.graph_clustering.disjoint.agcn.MLP_1(n_mlp)[source]

Bases: Module

Used to reduce the dimension of features

forward(mlp_in)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class egc.model.graph_clustering.disjoint.agcn.MLP_2(n_mlp)[source]

Bases: Module

Used to reduce the dimension of features

forward(mlp_in)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class egc.model.graph_clustering.disjoint.agcn.MLP_3(n_mlp)[source]

Bases: Module

Used to reduce the dimension of features

forward(mlp_in)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class egc.model.graph_clustering.disjoint.agcn.AGCN(graph: DGLGraph, X: FloatTensor, labels: IntTensor, n_input, n_clusters, hidden1: int = 500, hidden2: int = 500, hidden3: int = 2000, lr: float = 0.0001, epochs: int = 200, pretrain_lr: float = 0.001, pretrain_epochs: int = 100, n_z: int = 10, v: int = 1, gpu: int = 0)[source]

Bases: Module

forward(graph, x)[source]

Calculate the distribution of p,q and z

Parameters:
  • graph (dgl.DGLgraph) – graph

  • x (torch.FloatTensor) – node features

Returns:
  • x_bar (torch.FloatTensor): node features after AE reconstruction

  • q (torch.FloatTensor): q-distribution

  • predict (torch.FloatTensor): z-distribution (label prediction)

  • p (torch.FloatTensor): p-distribution

init_cluster_layer_parameter(features, n_init)[source]

Initialize the cluster center

Parameters:
  • features (torch.FloatTensor) – node feature

  • n_init (int) – Number of kmeans iterations

fit()[source]

Train model

Returns:

the result of model predict

Return type:

label_predict (ndarray)

get_memberships()[source]

Get predicted labels.

Returns:

(numpy.ndarray) predicted labels
get_embedding()[source]

Get embedding.

Returns:

torch.Tensor

training: bool

egc.model.graph_clustering.disjoint.age_cluster module

Implementation of the AGE model

class egc.model.graph_clustering.disjoint.age_cluster.age_cluster(dims: list | None = None, feat_dim: int | None = None, gnnlayers_num: int = 3, linlayers_num: int = 1, lr: float = 0.001, upth_st: float = 0.0015, upth_ed: float = 0.001, lowth_st: float = 0.1, lowth_ed: float = 0.5, upd: float = 10, bs: int = 10000, epochs: int = 400, norm: str = 'sym', renorm: bool = True, estop_steps: int = 5, n_clusters: int | None = None)[source]

Bases: Base

AGE cluster implementation

Parameters:
  • dims (list, optional) – list of hidden layer dimensions.

  • feat_dim (int, optional) – input feature dimension.

  • gnnlayers_num (int) – number of GNN layers.

  • linlayers_num (int, optional) – number of linear layers.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • upth_st (float, optional) – upper threshold (start).

  • upth_ed (float, optional) – upper threshold (end).

  • lowth_st (float, optional) – lower threshold (start).

  • lowth_ed (float, optional) – lower threshold (end).

  • upd (float, optional) – update epoch.

  • bs (int, optional) – batch size.

  • epochs (int, optional) – number of training epochs.

  • norm (str, optional) – normalization mode of the Laplacian matrix.

  • renorm (bool, optional) – whether to use the renormalization trick.

  • estop_steps (int, optional) – number of early-stop steps.

  • n_clusters (int, optional) – number of clusters.

fit(adj_csr, features)[source]

Fit for Specific Graph

Parameters:
  • adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.

  • features (torch.Tensor) – node features.

get_embedding()[source]

Get cluster embedding

Returns:

numpy.ndarray

get_memberships()[source]

Get spectral cluster membership.

Returns:

numpy.ndarray

egc.model.graph_clustering.disjoint.cc module

Contrastive Clustering

class egc.model.graph_clustering.disjoint.cc.ContrastiveClustering(in_feats: int, out_feats_list: List[int], n_clusters: int, aggregator_type: str = 'gcn', bias: bool = True, batch_size: int = 1024, instance_temperature: float = 0.5, cluster_temperature: float = 1.0, aug_types: List | None = None, n_epochs: int = 1000, lr: float = 0.001, l2_coef: float = 0.0, early_stopping_epoch: int = 20, model_filename: str = 'cc')[source]

Bases: Base, Module

Parameters:
  • in_feats (int) – Input feature size.

  • out_feats_list (List[int]) – List of hidden units dimensions.

  • n_clusters (int) – Num of clusters.

  • aggregator_type (str, optional) – Aggregator type to use (mean, gcn, pool, lstm). Defaults to ‘gcn’.

  • bias (bool, optional) – If True, adds a learnable bias to the output. Defaults to True.

  • batch_size (int, optional) – Batch size. Defaults to 1024.

  • instance_temperature (float, optional) – Instance Contrastive Head temperature. Defaults to 0.5.

  • cluster_temperature (float, optional) – Cluster Contrastive Head temperature. Defaults to 1.0.

  • aug_types (List, optional) – Augmentation types list. Defaults to [‘edge’, ‘edge’].

  • n_epochs (int, optional) – Maximum training epochs. Defaults to 1000.

  • lr (float, optional) – Learning Rate. Defaults to 0.001.

  • l2_coef (float, optional) – Weight decay. Defaults to 0.0.

  • early_stopping_epoch (int, optional) – Early stopping threshold. Defaults to 20.

  • model_filename (str, optional) – Path to store model parameters. Defaults to ‘cc’.

forward(blocks_i, blocks_j)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_cluster(blocks)[source]
forward_instance(blocks)[source]
fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
Parameters:
  • graph (dgl.DGLGraph) – graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) ndarray[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray
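The instance_temperature parameter scales an NT-Xent-style instance contrastive objective over the two augmented views (cluster_temperature plays the analogous role for the cluster head). A minimal sketch of such a loss, not the class’s verbatim implementation:

import torch
import torch.nn.functional as F

def nt_xent(z_i: torch.Tensor, z_j: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    # z_i, z_j: (n, d) projections of the same nodes under two augmentations.
    # Matching rows are positives; every other row acts as a negative.
    n = z_i.shape[0]
    z = F.normalize(torch.cat([z_i, z_j], dim=0), dim=1)   # (2n, d)
    sim = z @ z.t() / temperature                          # (2n, 2n) similarities
    sim.fill_diagonal_(float("-inf"))                      # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)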

egc.model.graph_clustering.disjoint.clusternet module

ClusterNet Paper: https://proceedings.neurips.cc/paper/2019/file/8bd39eae38511daad6152e84545e504d-Paper.pdf Source Code: https://github.com/bwilder0/clusternet

class egc.model.graph_clustering.disjoint.clusternet.ClusterNet(in_feats: int, out_feats_list: List[int], n_clusters: int, cluster_temp: float = 30, aggregator_type: str = 'gcn', bias: bool = True, dropout: float = 0.5, n_epochs: int = 1000, lr: float = 0.01, l2_coef: float = 1e-05, early_stopping_epoch: int = 20, model_filename: str = 'clusternet')[source]

Bases: Module

GCN ClusterNet. The ClusterNet architecture: the first step is a 2-layer GCN that generates embeddings. The output is the cluster means mu and soft assignments r, along with the embeddings and the node similarities (output only for debugging purposes).

The forward pass inputs are x, a feature matrix for the nodes, and adj, a sparse adjacency matrix. The optional parameter num_iter determines how many k-means update steps to run (a sketch of one such update follows the forward entry below).

Parameters:
  • in_feats (int) – Input feature size.

  • out_feats_list (List[int]) – List of hidden units dimensions.

  • n_clusters (int) – Num of clusters.

  • cluster_temp (float, optional) – softmax temperature. Defaults to 30.

  • aggregator_type (str, optional) – Aggregator type to use (mean, gcn, pool, lstm). Defaults to ‘gcn’.

  • bias (bool, optional) – If True, adds a learnable bias to the output. Defaults to True.

  • dropout (float, optional) – Percentage for dropping in GCN. Defaults to 0.5.

  • n_epochs (int, optional) – Maximum training epochs. Defaults to 1000.

  • lr (float, optional) – Learning Rate. Defaults to 0.01.

  • l2_coef (float, optional) – Weight decay. Defaults to 1e-05.

  • early_stopping_epoch (int, optional) – Early stopping threshold. Defaults to 20.

  • model_filename (str, optional) – Path to store model parameters. Defaults to ‘clusternet’.

forward(blocks) Tuple[Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
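A sketch of the soft k-means update that the forward pass iterates (num_iter steps during fit); cosine similarity and this exact update form follow the ClusterNet paper and are assumptions about this class, not its verbatim code:

import torch
import torch.nn.functional as F

def soft_kmeans_step(x: torch.Tensor, mu: torch.Tensor, cluster_temp: float):
    # x: (n, d) node embeddings; mu: (k, d) cluster means.
    sim = F.normalize(x, dim=1) @ F.normalize(mu, dim=1).t()  # cosine similarity
    r = torch.softmax(cluster_temp * sim, dim=1)              # soft assignments
    mu_new = (r.t() @ x) / r.sum(dim=0).unsqueeze(1)          # weighted means
    return mu_new, r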

fit(graph: DGLGraph, num_iter: int = 10, device: device = device(type='cpu')) None[source]
Parameters:
  • graph (dgl.DGLGraph) – graph.

  • num_iter (int, optional) – clustering iteration. Defaults to 10.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the memberships.

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

Returns:

Memberships.

Return type:

torch.Tensor

training: bool

egc.model.graph_clustering.disjoint.daegc module

DAEGC implementation. Reference: https://github.com/kouyongqi/DAEGC

class egc.model.graph_clustering.disjoint.daegc.DAEGC(num_features: int, hidden_size: int, embedding_size: int, alpha: float, num_clusters: int, pretrain_lr: float, lr: float, weight_decay: float, pre_epochs: int, epochs: int, update_interval: int, estop_steps: int, t: int, v: int = 1)[source]

Bases: Base, Module

Parameters:
  • num_features (int) – input feature dimension.

  • hidden_size (int) – number of units in the hidden layer.

  • embedding_size (int) – output embedding dimension.

  • alpha (float) – Alpha for the leaky_relu.

  • num_clusters (int) – cluster num.

  • pretrain_lr (float) – learning rate of the pretrain model.

  • lr (float) – learning rate of the final model.

  • weight_decay (float) – weight decay.

  • pre_epochs (int) – number of epochs to pretrain the model.

  • epochs (int) – number of epochs to train the final model.

  • update_interval (int) – update interval of DAEGC.

  • estop_steps (int) – number of early-stop steps.

  • t (int) – order up to which the topological relevance matrix M is computed.

  • v (int, optional) – degrees of freedom of the Student’s t-distribution. Defaults to 1.

forward(x, adj, M)[source]

Forward Propagation

Parameters:
  • x (torch.Tensor) – features of nodes

  • adj (torch.Tensor) – adj matrix

  • M (torch.Tensor) – the topological relevance of node j to node i up to t orders.

Returns:
  • A_pred (torch.Tensor): reconstructed adjacency matrix

  • z (torch.Tensor): latent representation

  • q (torch.Tensor): soft assignments

fit(adj, feats, label)[source]

Fitting a DAEGC model

Parameters:
  • adj (sp.lil_matrix) – sparse adjacency matrix.

  • feats (torch.Tensor) – features.

  • label (torch.Tensor) – cluster label of each node.

get_Q(z)[source]

get soft clustering assignment distribution

Parameters:

z (torch.Tensor) – node embedding

Returns:

Soft assignments

Return type:

torch.Tensor

get_embedding()[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships()[source]

Get the memberships (graph or node level).

Returns:

memberships.

Return type:

(numpy.ndarray)

egc.model.graph_clustering.disjoint.daegc.target_distribution(q)[source]

get target distribution P

Parameters:

q (torch.Tensor) – Soft assignments

Returns:

target distribution P

Return type:

torch.Tensor
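This is the standard DEC-style sharpening rule, p_ij = (q_ij^2 / f_j) / Σ_k (q_ik^2 / f_k) with f_j = Σ_i q_ij; a self-contained sketch:

import torch

def target_distribution(q: torch.Tensor) -> torch.Tensor:
    # q: (n, k) soft assignments; returns the sharpened (n, k) target P.
    weight = q ** 2 / q.sum(dim=0)                 # q_ij^2 / f_j
    return (weight.t() / weight.sum(dim=1)).t()    # normalize each row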

egc.model.graph_clustering.disjoint.daegc.get_M(adj, t=2)[source]

get the topological relevance of node j to node i up to t orders.

Parameters:
  • adj (torch.Tensor) – adj matrix

  • t (int, optional) – order t. Defaults to 2.

Returns:

M

Return type:

torch.Tensor
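In the DAEGC paper, M averages the first t powers of the transition matrix B = D^{-1} A; a dense sketch (the module’s exact computation may differ):

import torch

def get_m(adj: torch.Tensor, t: int = 2) -> torch.Tensor:
    # B: row-normalized transition matrix; M = (B + B^2 + ... + B^t) / t.
    b = adj / adj.sum(dim=1, keepdim=True)
    m = torch.zeros_like(b)
    b_power = torch.eye(adj.shape[0])
    for _ in range(t):
        b_power = b_power @ b
        m = m + b_power
    return m / t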

egc.model.graph_clustering.disjoint.danmf module

DANMF

Adapted from: https://github.com/benedekrozemberczki/DANMF

class egc.model.graph_clustering.disjoint.danmf.DANMF(graph, args)[source]

Bases: object

Deep autoencoder-like non-negative matrix factorization class.

Parameters:
  • graph – Networkx graph.

  • args – Arguments object.

setup_z(i)[source]

Setup target matrix for pre-training process.

sklearn_pretrain(i)[source]

Pretraining a single layer of the model with sklearn.

Parameters:

i – Layer index.

pre_training()[source]

Pre-training each NMF layer.

setup_Q()[source]

Setting up Q matrices.

update_U(i)[source]

Updating left hand factors.

Parameters:

i – Layer index.

update_P(i)[source]

Setting up P matrices.

Parameters:

i – Layer index.

update_V(i)[source]

Updating right hand factors.

Parameters:

i – Layer index.

calculate_cost(i)[source]

Calculate loss.

Parameters:

i – Global iteration.

save_embedding()[source]

Save embedding matrix.

gen_membership()[source]

Generate cluster membership.

training()[source]

Training process after pre-training.

egc.model.graph_clustering.disjoint.dfcn module

An implementation of “DFCN” from the AAAI’21 paper “Deep Fusion Clustering Network”.

An interdependency learning-based Structure and Attribute Information Fusion (SAIF) module is proposed to explicitly merge the representations learned by an autoencoder and a graph autoencoder for consensus representation learning.

Also, a reliable target distribution generation measure and a triplet self-supervision strategy, which facilitate cross-modality information exploitation, are designed for network training.

class egc.model.graph_clustering.disjoint.dfcn.DFCN(graph: DGLGraph, data: Tensor, label: Tensor, n_clusters: int, n_node: int, device: device, args: Namespace)[source]

Bases: Module, Base

DFCN.

Parameters:
  • graph (dgl.DGLGraph) – Graph data in dgl

  • data (torch.Tensor) – node’s features

  • label (torch.Tensor) – node’s label

  • n_clusters (int) – numbers of clusters

  • n_node (int) – number of nodes.

  • device (torch.device) – device.

  • args (argparse.Namespace) – all parameters

forward()[source]

Forward Propagation

Returns:
  • x_hat (torch.Tensor): reconstructed attribute matrix generated by the AE decoder

  • z_hat (torch.Tensor): reconstructed weighted attribute matrix generated by the IGAE decoder

  • adj_hat (torch.Tensor): reconstructed adjacency matrix generated by the IGAE decoder

  • q (torch.Tensor): soft assignment distribution of the fused representations

  • q1 (torch.Tensor): soft assignment distribution of IGAE

  • q2 (torch.Tensor): soft assignment distribution of AE

  • z_tilde (torch.Tensor): clustering embedding

get_embedding()[source]

Get cluster embedding.

Returns:

torch.Tensor

get_memberships()[source]

Get cluster membership.

Returns:

numpy.ndarray

fit(epochs)[source]

Fitting a DFCN clustering model.

Parameters:

epochs (int) – number of training epochs

training: bool

egc.model.graph_clustering.disjoint.dgc module

Deep Graph Clustering

class egc.model.graph_clustering.disjoint.dgc.DGC(in_feats: int, out_feats_list: List[int], n_clusters: int, aggregator_type: str = 'gcn', bias: bool = True, encoder_act: List[str] | None = None, dropout: float = 0.0, batch_size: int = 1024, n_epochs: int = 1000, lr: float = 0.01, l2_coef: float = 0.0, early_stopping_epoch: int = 20, model_filename: str = 'dgc')[source]

Bases: Base, Module

Deep Graph Clustering

forward(blocks)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(adj_hat: Tensor, adj: Tensor) Tensor[source]
fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) ndarray[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

egc.model.graph_clustering.disjoint.dgc_mlp module

Deep Graph Clustering

class egc.model.graph_clustering.disjoint.dgc_mlp.DGC(in_feats: int, out_feats_list: List[int], n_clusters: int, classifier_hidden_list: List[int] | None = None, aggregator_type: str = 'gcn', bias: bool = True, encoder_act: List[str] | None = None, classifier_act: List[str] | None = None, dropout: float = 0.0, n_epochs: int = 1000, lr: float = 0.01, l2_coef: float = 0.0, early_stopping_epoch: int = 20, model_filename: str = 'dgc_mlp')[source]

Bases: Base, Module

Deep Graph Clustering

forward(blocks)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(adj_hat: Tensor, adj: Tensor) Tensor[source]
fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) ndarray[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

egc.model.graph_clustering.disjoint.dgc_mlp_gsl module

Deep Graph Clustering

class egc.model.graph_clustering.disjoint.dgc_mlp_gsl.DGC(in_feats: int, out_feats_list: List[int], n_clusters: int, classifier_hidden_list: List[int] | None = None, aggregator_type: str = 'gcn', bias: bool = True, k: int = 20, tau: float = 0.9999, encoder_act: List[str] | None = None, classifier_act: List[str] | None = None, dropout: float = 0.0, n_epochs: int = 1000, n_pretrain_epochs: int = 800, lr: float = 0.01, l2_coef: float = 0.0, early_stopping_epoch: int = 20, model_filename: str = 'dgc_mlp_gsl')[source]

Bases: Base, Module

Deep Graph Clustering

load_best_model(device: device) None[source]
forward(blocks)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(adj_hat: Tensor, adj: Tensor) Tensor[source]
pretrain(data_loader: NaiveDataLoader, adj_label: Tensor) None[source]
learn_structure(preds: Tensor) Tensor[source]
fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) ndarray[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

egc.model.graph_clustering.disjoint.dgi_kmeans module

DGI + Kmeans Graph Clustering

class egc.model.graph_clustering.disjoint.dgi_kmeans.DGIKmeans(in_feats: int, out_feats_list: List[int], n_epochs: int = 10000, early_stopping_epoch: int = 20, batch_size: int = 1024, neighbor_sampler_fanouts: List[int] = -1, lr: float = 0.001, l2_coef: float = 0.0, activation: str = 'prelu', model_filename: str = 'dgi')[source]

Bases: Base

DGI + Kmeans

Parameters:
  • in_feats (int) – input feature dimension.

  • out_feats_list (List[int]) – List of hidden units dimensions.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 10000.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • batch_size (int, optional) – batch size. Defaults to 1024.

  • neighbor_sampler_fanouts (List[int] or int, optional) –

    List of neighbors to sample for each GNN layer, with the i-th element being the fanout for the i-th GNN layer. Defaults to -1.

    • If only a single integer is provided, DGL assumes that every layer will have the same fanout.

    • If -1 is provided on one layer, then all inbound edges will be included.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • activation (str) – activation of gcn layer. Defaults to prelu.

  • model_filename (str, optional) – path to save best model parameters. Defaults to dgi.

fit(graph: DGLGraph, n_clusters: int, device: device = device(type='cpu'))[source]

Fit for Specific Graph

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • n_clusters (int) – cluster num.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding(graph: DGLGraph, device: device = device(type='cpu'), model_filename: str | None = None) Tensor[source]

Get the embeddings.

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

  • model_filename (str, optional) – Model file to load. Defaults to None.

Returns:

Embeddings.

Return type:

torch.Tensor

get_memberships(graph: DGLGraph, device: device = device(type='cpu'), model_filename: str | None = None) Tensor[source]

Get the memberships.

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

  • model_filename (str, optional) – Model file to load. Defaults to None.

Returns:

Memberships.

Return type:

torch.Tensor

egc.model.graph_clustering.disjoint.gae_kmeans module

gae_kmeans

class egc.model.graph_clustering.disjoint.gae_kmeans.DGL_GAEKmeans(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]

Bases: Base

GAE Kmeans implementation using DGL

Parameters:
  • epochs (int, optional) – number of embedding training epochs. Defaults to 200.

  • n_clusters (int) – cluster num.

  • fead_dim (int) – dim of features

  • n_nodes (int) – number of nodes

  • hidden_dim1 (int) – hidden units size of gcn_1. Defaults to 32.

  • dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • early_stop (int, optional) – early stopping threshold. Defaults to 10.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

fit(adj_csr: csr_matrix, features: Tensor, device: device = device(type='cpu'))[source]

Fit for Specific Graph

Parameters:
  • adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.

  • features (torch.Tensor) – node features.

get_embedding()[source]
get_memberships()[source]
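An illustrative end-to-end usage sketch (the dataset choice is an assumption, and the CSR adjacency is built from the edge list to avoid version-specific DGL adjacency APIs):

import dgl
import numpy as np
import scipy.sparse as sp
from egc.model.graph_clustering.disjoint.gae_kmeans import DGL_GAEKmeans

dataset = dgl.data.CoraGraphDataset()
graph = dataset[0]
features = graph.ndata["feat"]

# Build a scipy CSR adjacency matrix from the edge list.
src, dst = graph.edges()
n = graph.num_nodes()
adj_csr = sp.csr_matrix((np.ones(len(src)), (src.numpy(), dst.numpy())), shape=(n, n))

model = DGL_GAEKmeans(epochs=200, n_clusters=dataset.num_classes,
                      fead_dim=features.shape[1], n_nodes=n)
model.fit(adj_csr, features)
pred = model.get_memberships()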

egc.model.graph_clustering.disjoint.gala module

GALA

class egc.model.graph_clustering.disjoint.gala.GALA(adj: Tensor, X: Tensor, lr: float = 0.0001, epochs: int = 1000, hidden1: int = 800, hidden2: int = 700, n_clusters: int = 7)[source]

Bases: Module

forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()[source]

Initialize the parameters of the networks

fit()[source]
get_encoder_operator()[source]
get_decoder_operator()[source]
get_memberships()[source]
training: bool

egc.model.graph_clustering.disjoint.gdcl module

Graph Debiased Contrastive Learning with Joint Representation Clustering https://www.ijcai.org/proceedings/2021/0473.pdf

class egc.model.graph_clustering.disjoint.gdcl.Readout[source]

Bases: Module

read out

static forward(seq, msk)[source]

Forward Propagation

Parameters:
  • seq (torch.Tensor) – features tensor.

  • msk (torch.Tensor) – node mask.

Returns:

graph-level representation

Return type:

(torch.Tensor)

training: bool
class egc.model.graph_clustering.disjoint.gdcl.GDCL(in_feats, n_clusters, n_h: int = 512, nb_epochs: int = 1500, lr: float = 5e-05, alpha=0.0001, mask_num: int = 100, batch_size: int = 4, update_interval: int = 10, model_filename: str = 'gdcl', beta: float = 0.001, weight_decay: float = 0.0, pt_n_h: int = 512, pt_model_filename: str = 'mvgrl', pt_nb_epochs: int = 3000, pt_patience: int = 20, pt_lr: float = 0.001, pt_weight_decay: float = 0.0, pt_sample_size: int = 2000, pt_batch_size: int = 4, sparse: bool = False, dataset: str = 'Citeseer', device: device = device(type='cpu'))[source]

Bases: Module

GDCL: Graph Debiased Contrastive Learning with Joint Representation Clustering

Parameters:
  • in_feats (int) – Input feature size.

  • n_clusters (int) – Num of clusters.

  • n_h (int) – hidden units dimension. Defaults to 512.

  • nb_epochs – epoch number of GDCL. Defaults to 1500.

  • lr – learning rate of GDCL. Defaults to 0.00005.

  • alpha – alpha parameter of distribution. Defaults to 0.0001.

  • mask_num – mask number. Defaults to 100.

  • batch_size – batch size of GDCL. Defaults to 4.

  • update_interval – update interval of GDCL. Defaults to 10.

  • model_filename – model filename of GDCL. Defaults to ‘gdcl’.

  • beta – balance factor. Defaults to 0.001.

  • weight_decay – weight decay of GDCL. Defaults to 0.0.

  • pt_n_h – hidden units dimension of pretrained MVGRL. Defaults to 512.

  • pt_model_filename – model filename of pretrained MVGRL. Defaults to ‘mvgrl’.

  • pt_nb_epochs – epoch number of pretrained MVGRL. Defaults to 3000.

  • pt_patience – patience of pretrained MVGRL. Defaults to 20.

  • pt_lr – learning rate of pretrained MVGRL. Defaults to 0.001.

  • pt_weight_decay – weight decay of pretrained MVGRL. Defaults to 0.0.

  • pt_sample_size – sample size of pretrained MVGRL. Defaults to 2000.

  • pt_batch_size – batch size of pretrained MVGRL. Defaults to 4.

  • sparse – if sparse. Defaults to False.

  • dataset – dataset name. Defaults to ‘Citeseer’.

  • device – device. Defaults to torch.device(‘cpu’).

pretrain(graph)[source]

Pretraining the model

Parameters:

graph (dgl.DGLGraph) – graph.

embed(seq, adj, diff, sparse)[source]

Embed.

Parameters:
  • seq (torch.Tensor) – features of the raw graph

  • adj (torch.Tensor) – adjacency matrix of the raw graph

  • diff (torch.Tensor) – PPR matrix of the diffused graph

  • sparse (bool) – if sparse

Returns:

node embedding

Return type:

(torch.Tensor)

forward(bf, mask_fts, bd, sparse)[source]

Forward Propagation

Parameters:
  • bf (torch.Tensor) – features of the raw graph

  • mask_fts (torch.Tensor) – masked features

  • bd (torch.Tensor) – PPR matrix of the diffused graph

  • sparse (bool) – if sparse

Returns:
  • h_mask (torch.Tensor): node embedding of the masked-features graph

  • h (torch.Tensor): node embedding of the raw graph

  • q (torch.Tensor): soft assignment

fit(graph, labels)[source]

Fitting

Parameters:
  • graph (dgl.DGLGraph) – graph.

  • labels (torch.Tensor) – labels of each node

get_embedding()[source]

Get the embeddings (graph or node level).

Returns:
  • (torch.Tensor): embedding of each node

  • (torch.Tensor): embedding of the graph representations

get_memberships()[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

training: bool
egc.model.graph_clustering.disjoint.gdcl.target_distribution(q)[source]

get target distribution P

Parameters:

q (torch.Tensor) – Soft assignments

Returns:

target distribution P

Return type:

torch.Tensor

egc.model.graph_clustering.disjoint.gmi_kmeans module

GMI Kmeans Graph Clustering

class egc.model.graph_clustering.disjoint.gmi_kmeans.GMIKmeans(in_features: int, hidden_units: int = 512, n_epochs: int = 550, early_stopping_epoch: int = 20, lr: float = 0.001, l2_coef: float = 0.0, alpha: float = 0.8, beta: float = 1.0, gamma: float = 1.0, activation: str = 'prelu', gcn_depth: int = 2)[source]

Bases: Base

GMI Kmeans

Parameters:
  • in_features (int) – input feature dimension.

  • hidden_units (int, optional) – hidden units size of gcn. Defaults to 512.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 550.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • alpha (float, optional) – parameter for I(h_i; x_i). Defaults to 0.8.

  • beta (float, optional) – parameter for I(h_i; x_j). Defaults to 1.0.

  • gamma (float, optional) – parameter for I(w_ij; a_ij). Defaults to 1.0.

  • activation (str, optional) – activation of gcn layer. Defaults to “prelu”.

fit(features_lil: lil_matrix, adj_csr: csr_matrix, n_clusters: int, neg_list_num: int = 5)[source]

Fit for Specific Graph

Parameters:
  • features_lil (sp.lil_matrix) – 2D sparse features.

  • adj_csr (sp.csr_matrix) – 2D sparse adj.

  • n_clusters (int) – cluster num.

  • neg_list_num (int, optional) – negative sample times. Defaults to 5.

get_embedding()[source]
get_memberships()[source]

egc.model.graph_clustering.disjoint.idec module

DEC / IDEC

class egc.model.graph_clustering.disjoint.idec.IDEC(in_feats: int, out_feats_list: List[int], n_clusters: int, aggregator_type: str = 'gcn', bias: bool = True, batch_size: int = 1024, alpha: float = 1.0, beta: float = 10.0, n_epochs: int = 1000, n_pretrain_epochs: int = 400, lr: float = 0.001, l2_coef: float = 0.0, early_stopping_epoch: int = 20, model_filename: str = 'dec')[source]

Bases: Base, Module

DEC / IDEC. Set beta to 0.0 for DEC or to nonzero for IDEC.

Parameters:
  • in_feats (int) – Input feature size.

  • out_feats_list (List[int]) – List of hidden units dimensions.

  • n_clusters (int) – Num of clusters.

  • aggregator_type (str, optional) – Aggregator type to use (mean, gcn, pool, lstm). Defaults to ‘gcn’.

  • bias (bool, optional) – If True, adds a learnable bias to the output. Defaults to True.

  • batch_size (int, optional) – Batch size. Defaults to 1024.

  • alpha (float, optional) – Alpha of student-T distribution. Defaults to 1.0.

  • beta (float, optional) – Coefficient of reconstruction loss. 0.0 for DEC while nonzero for IDEC. Defaults to 10.0.

  • n_epochs (int, optional) – Maximum training epochs. Defaults to 1000.

  • n_pretrain_epochs (int, optional) – Maximum pretraining epochs. Defaults to 400.

  • lr (float, optional) – Learning Rate. Defaults to 0.001.

  • l2_coef (float, optional) – Weight decay. Defaults to 0.0.

  • early_stopping_epoch (int, optional) – Early stopping threshold. Defaults to 20.

  • model_filename (str, optional) – Path to store model parameters. Defaults to ‘dec’.

clustering(h: Tensor, device: device = device(type='cpu')) None[source]

Clustering with MiniBatchKMeans.

Parameters:
  • h (torch.Tensor) – features.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_distance(h: Tensor) Tensor[source]

Get the sum of distances from every point to each cluster center.

Parameters:

h (torch.Tensor) – features.

Returns:

sum of distances from every point to each cluster center.

Return type:

distance (torch.Tensor)

get_t_distribution(h: Tensor) Tuple[Tensor, Tensor][source]

Student’s t-distribution, the same as used in the t-SNE algorithm: q_ij = 1/(1 + dist(x_i, u_j)^2), then normalized.

Parameters:

h (torch.Tensor) – features.

Returns:

(distance, q)

Return type:

Tuple[torch.Tensor, torch.Tensor]
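A sketch of that computation, generalized with the class’s alpha parameter (alpha = 1 recovers the quoted formula):

import torch

def t_distribution_q(h: torch.Tensor, centers: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # q_ij = (1 + ||h_i - mu_j||^2 / alpha)^(-(alpha + 1) / 2), row-normalized.
    dist_sq = torch.cdist(h, centers) ** 2
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)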

static target_distribution(q: Tensor) Tensor[source]
pretrain(train_loader: NaiveDataLoader, features: Tensor) None[source]
forward(blocks) Tuple[Tensor, Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fit(graph: DGLGraph, device: device = device(type='cpu')) None[source]
Parameters:
  • graph (dgl.DGLGraph) – graph.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding(graph: DGLGraph, device: device = device(type='cpu')) Tensor[source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships(graph: DGLGraph, device: device = device(type='cpu')) ndarray[source]

Get memberships

Returns:

memberships

Return type:

np.ndarray

egc.model.graph_clustering.disjoint.mnmf module

MNMF implement

class egc.model.graph_clustering.disjoint.mnmf.MNMF(dimensions=128, clusters=10, lambd=0.2, alpha=0.05, beta=0.05, iterations=200, lower_control=1e-15, eta=5.0)[source]

Bases: Base

An implementation of “M-NMF” from the AAAI ‘17 paper “Community Preserving Network Embedding”. The procedure uses joint non-negative matrix factorization with modularity based regularization in order to learn a cluster membership distribution over nodes. The method can be used in an overlapping and non-overlapping way.

Parameters:
  • dimensions (int) – Number of dimensions. Default is 128.

  • clusters (int) – Number of clusters. Default is 10.

  • lambd (float) – KKT penalty. Default is 0.2.

  • alpha (float) – Clustering penalty. Default is 0.05.

  • beta (float) – Modularity regularization penalty. Default is 0.05.

  • iterations (int) – Number of power iterations. Default is 200.

  • lower_control (float) – Floating point overflow control. Default is 10**-15.

  • eta (float) – Similarity mixing parameter. Default is 5.0.

get_memberships()[source]

Getting the cluster membership of nodes.

Return types:
  • memberships (dict) - Node cluster memberships.

get_embedding()[source]

Getting the node embedding.

Return types:
  • embedding (Numpy array) - The embedding of nodes.

get_cluster_centers()[source]

Getting the cluster centers.

Return types:
  • centers (Numpy array) - The cluster centers.

fit(graph)[source]

Fitting an M-NMF clustering model.

Arg types:
  • graph (NetworkX graph) - The graph to be clustered.
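A short usage sketch (the toy graph and hyperparameters are illustrative):

import networkx as nx
from egc.model.graph_clustering.disjoint.mnmf import MNMF

graph = nx.karate_club_graph()           # small undirected test graph
model = MNMF(dimensions=16, clusters=2, iterations=50)
model.fit(graph)
memberships = model.get_memberships()    # per-node cluster memberships
embedding = model.get_embedding()        # one embedding row per node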

egc.model.graph_clustering.disjoint.pca_kmeans module

pca_kmeans

egc.model.graph_clustering.disjoint.pca_kmeans.pca_kmeans(X: ndarray, n_clusters: int, n_components: int | None = None) ndarray[source]

Principal component analysis (PCA) followed by k-means clustering.

Parameters:
  • X (np.ndarray) – array-like of shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.

  • n_clusters (int) – num of clusters.

  • n_components (int or float or str) – Number of components to keep. Defaults to None.

Returns:

Community memberships.

Return type:

np.ndarray
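A sketch of the equivalent pipeline with scikit-learn (illustrative, not necessarily the function’s exact code):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def pca_kmeans_sketch(X: np.ndarray, n_clusters: int, n_components=None) -> np.ndarray:
    # Reduce X with PCA, then cluster the reduced representation with k-means.
    X_reduced = PCA(n_components=n_components).fit_transform(X)
    return KMeans(n_clusters=n_clusters).fit_predict(X_reduced)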

egc.model.graph_clustering.disjoint.sdcn module

SDCN implementation

class egc.model.graph_clustering.disjoint.sdcn.SDCN(graph: DGLGraph, X: FloatTensor, labels: IntTensor, n_input, n_clusters, hidden1: int = 500, hidden2: int = 500, hidden3: int = 200, lr: float = 0.0001, epochs: int = 200, pretrain_lr: float = 0.0005, pretrain_epochs: int = 100, n_z: int = 10, v: int = 1, gpu: int = 0)[source]

Bases: Module

forward(graph, x)[source]

Calculate the distribution of p,q and z

Parameters:
  • graph (dgl.DGLgraph) – graph

  • x (torch.FloatTensor) – node features

Returns:
  • x_bar (torch.FloatTensor): node features after AE reconstruction

  • q (torch.FloatTensor): q-distribution

  • predict (torch.FloatTensor): z-distribution (label prediction)

  • p (torch.FloatTensor): p-distribution

init_cluster_layer_parameter(features, n_init)[source]

Initialize the cluster center

Parameters:
  • features (torch.FloatTensor) – node feature

  • n_init (int) – Number of kmeans iterations

fit()[source]

Train model

Returns:

the result of model predict

Return type:

label_predict (ndarray)

get_memberships()[source]

Get predicted labels.

Returns:

(numpy.ndarray) predicted labels

get_embedding()[source]

Get embedding.

Returns:

torch.Tensor

training: bool

egc.model.graph_clustering.disjoint.sgc_kmeans module

sgc_kmeans

class egc.model.graph_clustering.disjoint.sgc_kmeans.SGCKmeans(in_feats: int, n_epochs: int = 400, hidden_units: ~typing.List = [500], lr: float = 0.01, early_stop: int = 10, inner_act: ~typing.Callable = <function SGCKmeans.<lambda>>, n_lin_layers: int = 1, n_gnn_layers: int = 10)[source]

Bases: Base

SGC Kmeans implementation using DGL

Parameters:
  • in_feats (int) – input feature dimension.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 400.

  • hidden_units (List, optional) – list of hidden units dimensions. Defaults to [500].

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • early_stop (int, optional) – early stopping threshold. Defaults to 10.

  • inner_act (Callable, optional) – inner activation function.

  • n_lin_layers (int, optional) – number of linear layers. Defaults to 1.

  • n_gnn_layers (int, optional) – number of GNN layers. Defaults to 10.

fit(graph: DGLGraph, n_clusters: int, device: device = device(type='cpu'))[source]

Fit for Specific Graph

Parameters:
  • graph (dgl.DGLGraph) – dgl graph.

  • n_clusters (int) – cluster num.

  • device (torch.device, optional) – torch device. Defaults to torch.device(‘cpu’).

get_embedding()[source]
get_memberships()[source]

egc.model.graph_clustering.disjoint.vgae_kmeans module

vgae_kmeans

class egc.model.graph_clustering.disjoint.vgae_kmeans.DGL_VGAEKmeans(epochs: int, n_clusters: int, fead_dim: int, n_nodes: int, hidden_dim1: int = 32, hidden_dim2: int = 16, dropout: float = 0.0, lr: float = 0.01, early_stop: int = 10, activation: str = 'relu')[source]

Bases: Base

VGAE Kmeans implementation using DGL

Parameters:
  • epochs (int, optional) – number of embedding training epochs. Defaults to 200.

  • n_clusters (int) – cluster num.

  • fead_dim (int) – dim of features

  • n_nodes (int) – number of nodes

  • hidden_dim1 (int) – hidden units size of gcn_1. Defaults to 32.

  • hidden_dim2 (int) – hidden units size of gcn_2. Defaults to 16.

  • dropout (float, optional) – dropout rate (1 - keep probability). Defaults to 0.0.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • early_stop (int, optional) – early stopping threshold. Defaults to 10.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

fit(adj_csr, features)[source]

Fit for Specific Graph

Parameters:
  • adj_csr (sp.csr_matrix) – 2D sparse adjacency matrix.

  • features (torch.Tensor) – node features.

get_embedding()[source]
get_memberships()[source]
class egc.model.graph_clustering.disjoint.vgae_kmeans.VGAEKmeans(in_features: int, hidden_units_1: int = 32, hidden_units_2: int = 16, n_epochs: int = 400, early_stopping_epoch: int = 20, lr: float = 0.001, l2_coef: float = 0.0, activation: str = 'relu', model_filename: str = 'vgae_kmeans')[source]

Bases: Base

VGAE Kmeans

Parameters:
  • in_features (int) – input feature dimension.

  • hidden_units_1 (int, optional) – gcn_1 hidden units. Defaults to 32.

  • hidden_units_2 (int, optional) – gcn_2 hidden units. Defaults to 16.

  • n_epochs (int, optional) – node embedding epochs. Defaults to 400.

  • early_stopping_epoch (int, optional) – early stopping epoch number. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.001.

  • l2_coef (float, optional) – l2 weight decay. Defaults to 0.0.

  • activation (str, optional) – activation of gcn layer. Defaults to ‘relu’.

  • model_filename (str, optional) – path to save best model parameters. Defaults to ‘vgae_kmeans’.

fit(features_lil: lil_matrix, adj_csr: csr_matrix, n_clusters: int)[source]

Fit for Specific Graph

Parameters:
  • features_lil (sp.lil_matrix) – 2D sparse features.

  • adj_csr (sp.csr_matrix) – 2D sparse adj.

  • n_clusters (int) – cluster num.

get_embedding()[source]
get_memberships()[source]

egc.model.graph_clustering.disjoint.vgaecd module

VGAECD

class egc.model.graph_clustering.disjoint.vgaecd.VGAECD(in_features: int, n_clusters: int, alpha: float = 25.0, beta: float = 1.0, hidden_units_1: int = 32, hidden_units_2: int = 16, n_epochs: int = 800, early_stopping_epoch: int = 20, n_epochs_pretrain: int = 200, lr: float = 0.01, l2_coef: float = 0.0, activation: str = 'relu')[source]

Bases: Base, Module

Parameters:
  • in_features (int) – input feature dimension.

  • n_clusters (int) – cluster num.

  • alpha (float) – coefficient of reconstruction loss. Defaults to 25.0.

  • beta (float) – coefficient of the loss except reconstruction loss. Defaults to 1.0.

  • hidden_units_1 (int) – hidden units size of gcn_1. Defaults to 32.

  • hidden_units_2 (int) – hidden units size of gcn_2. Defaults to 16.

  • n_epochs (int, optional) – number of embedding training epochs. Defaults to 800.

  • n_epochs_pretrain (int, optional) – number of pretraining epochs. Defaults to 200.

  • early_stopping_epoch (int, optional) – early stopping threshold. Defaults to 20.

  • lr (float, optional) – learning rate. Defaults to 0.01.

  • l2_coef (float, optional) – weight decay. Defaults to 0.0.

  • activation (str, optional) – activation of gcn layer_1. Defaults to ‘relu’.

recon_loss(adj_hat: Tensor) Tensor[source]
forward() Tuple[Tensor, Tensor, Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
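For context, a standard VGAE-style encoder samples the latent code with the reparameterization trick and decodes the adjacency with an inner product; a generic sketch (an assumption based on the VGAE formulation, not this class’s verbatim code):

import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # z = mu + eps * std with eps ~ N(0, I): differentiable sampling.
    std = torch.exp(0.5 * logvar)
    return mu + torch.randn_like(std) * std

# The latent z is then typically decoded as adj_hat = torch.sigmoid(z @ z.t()).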

fit(features: lil_matrix, adj_orig: csr_matrix) None[source]
Parameters:
  • features (sp.lil_matrix) – 2D sparse features.

  • adj_orig (sp.csr_matrix) – 2D sparse adj.

get_embedding() Tuple[Tensor, Tensor][source]

Get the embeddings (graph or node level).

Returns:

embedding.

Return type:

(torch.Tensor)

get_memberships() ndarray[source]

Module contents

Graph Clustering Models