egc.utils package
Submodules
egc.utils.ComE_utils module
Utils for ComE model
- egc.utils.ComE_utils.chunkize_serial(iterable, chunksize, as_numpy=False)[source]
Return elements from the iterable in chunksize-ed lists. The last returned element may be smaller (if length of collection is not divisible by chunksize).
>>> print(list(chunkize_serial(range(10), 3)))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
- egc.utils.ComE_utils.prepare_sentences(model, paths)[source]
- Parameters:
model – current model containing the vocabulary and the index
paths – list of the random walks; each node is translated to its vocabulary index, with dropout applied
- Returns:
generator over the paths, with nodes mapped to the correct index and kept according to the dropout probability
- egc.utils.ComE_utils.batch_generator(iterable, batch_size=1)[source]
Same as chunkize_serial, but without using an infinite while loop.
- Parameters:
iterable – list to split into batches
batch_size – batch size
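For illustration, a minimal usage sketch (assuming batches are yielded as lists, matching the chunkize_serial semantics above):
>>> batches = list(batch_generator([0, 1, 2, 3, 4], batch_size=2))
>>> [len(b) for b in batches]  # the last batch may be smaller
[2, 2, 1]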
- class egc.utils.ComE_utils.RepeatCorpusNTimes(corpus, n)[source]
Bases: object
Class used to repeat the same corpus of paths n times.
- Parameters:
corpus – list of paths that we want to repeat
n – number of times we want to repeat our corpus
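A hedged usage sketch (assuming the class is directly iterable over the repeated paths):
>>> corpus = [[1, 2, 3], [4, 5]]
>>> repeated = RepeatCorpusNTimes(corpus, n=2)
>>> # iterating over `repeated` should yield the two paths twice, four paths in total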
- class egc.utils.ComE_utils.Vocab(**kwargs)[source]
Bases: object
A single vocabulary item, used internally for constructing binary trees (including both word leaves and inner nodes).
- class egc.utils.ComE_utils.WriteWalksToDisk[source]
Bases: object
Used for writing random walks to disk.
- write_walks_to_disk(G, filebase, num_paths, path_length, alpha=0, rand=random.Random(), num_workers=56)[source]
Save the random walks to files so that they do not need to be regenerated at each execution.
- Parameters:
G – graph to walk on
filebase – location where to save the final walks
num_paths – number of walks to start from each node
path_length – length of each walk
alpha – restart probability for the random walks
rand – random number generator
num_workers – number of threads used to execute the job
- Returns:
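A hedged usage sketch (G is assumed to be a graph object of the type ComE walks on; './walks/walk' is a hypothetical file base):
>>> writer = WriteWalksToDisk()
>>> writer.write_walks_to_disk(G, filebase='./walks/walk', num_paths=10,
...                            path_length=80, alpha=0, num_workers=4)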
- egc.utils.ComE_utils.build_deepwalk_corpus_iter(G, num_paths, path_length, alpha=0, rand=random.Random())[source]
- egc.utils.ComE_utils.count_words(file)[source]
Counts the word frequencies in a list of sentences.
Note
This is a helper function for parallel execution of the Vocabulary.from_text method.
egc.utils.SEComm_utils module
SEComm utils
- egc.utils.SEComm_utils.enhance_sim_matrix(C: ndarray, K: int, d: int, alpha: float) ndarray[source]
Enhance similarity matrix.
- Parameters:
C (np.ndarray) – coefficient matrix.
K (int) – number of clusters.
d (int) – dimension of each subspace.
alpha (float) – coefficient.
- Returns:
enhanced similarity matrix
- Return type:
np.ndarray
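A minimal usage sketch with a random coefficient matrix (the shapes and parameter values are illustrative assumptions):
>>> import numpy as np
>>> C = np.random.rand(100, 100)  # coefficient matrix for 100 nodes
>>> S = enhance_sim_matrix(C, K=5, d=10, alpha=0.25)  # enhanced 100 x 100 similarity matrix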
egc.utils.argparser module
Parse All Model Args
- egc.utils.argparser.models: Dict
Info of the supported models. Each key maps to a dict with 'name', 'description', 'paper url', and 'source code' entries:
- AGC: paper https://dl.acm.org/doi/abs/10.1145/3474085.3475276; source code https://github.com/karenlatong/AGC-master
- AGCN: source code https://github.com/ZhihaoPENG-CityU/MM21---AGCN
- AGE: paper https://dl.acm.org/doi/pdf/10.1145/3394486.3403140; source code https://github.com/thunlp/AGE
- ComE: paper https://dl.acm.org/doi/pdf/10.1145/3132847.3132925; source code https://github.com/andompesta/ComE
- CommunityGAN: paper https://dl.acm.org/doi/pdf/10.1145/3308558.3313564; source code https://github.com/SamJia/CommunityGAN
- DAEGC: paper https://www.ijcai.org/Proceedings/2019/0509.pdf; source code https://github.com/Tiger101010/DAEGC
- DANMF: paper https://dl.acm.org/doi/pdf/10.1145/3269206.3271697; source code https://github.com/benedekrozemberczki/DANMF
- DFCN: paper https://arxiv.org/pdf/2012.09600.pdf; source code https://github.com/WxTu/DFCN
- GALA: paper https://arxiv.org/pdf/1908.02441v1.pdf; source code https://github.com/sseung0703/GALA_TF2.0
- GDCL: paper https://www.ijcai.org/proceedings/2021/0473.pdf; source code https://github.com/hzhao98/GDCL
- MNMF: paper https://ojs.aaai.org/index.php/AAAI/article/view/10488; source code https://github.com/AnryYang/M-NMF
- MVGRL: paper https://arxiv.org/abs/2006.05582; source code https://github.com/kavehhassani/mvgrl
- SDCN: paper https://arxiv.org/pdf/2002.01633.pdf; source code https://github.com/bdy9527/SDCN
- SEComm: paper https://proceedings.mlr.press/v161/bandyopadhyay21a/bandyopadhyay21a.pdf; source code https://github.com/viz27/SEComm
- SENet_kmeans (SENet with kmeans): paper https://www.sciencedirect.com/science/article/pii/S0893608021002227?via%3Dihub
- SUBLIME: paper https://arxiv.org/pdf/2201.06367.pdf; source code https://github.com/GRAND-Lab/SUBLIME
- VGAECD: paper https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8594831
- cc (Contrastive Clustering): paper https://arxiv.org/pdf/2009.09687.pdf; source code https://github.com/Yunfan-Li/Contrastive-Clustering
- clusternet (ClusterNet): paper https://arxiv.org/abs/1905.13732; source code https://github.com/bwilder0/clusternet
- dgi_kmeans (DGI with Kmeans): paper https://arxiv.org/abs/1809.10341; source code https://github.com/PetarV-/DGI
- gae_kmeans (GAE with Kmeans): paper https://arxiv.org/pdf/1611.07308.pdf; source code https://github.com/tkipf/gae
- gmi_kmeans (GMI with Kmeans): paper https://arxiv.org/pdf/1809.10341.pdf; source code https://github.com/zpeng27/GMI
- idec (IDEC): paper https://dl.acm.org/doi/10.5555/3045390.3045442; source code https://github.com/piiswrong/dec
- pca_kmeans (PCA with Kmeans)
- sgc_kmeans (SGC with Kmeans): paper https://arxiv.org/pdf/1902.07153.pdf; source code https://github.com/Tiiiger/SGC
- vgae_kmeans (VGAE with Kmeans): paper https://arxiv.org/pdf/1611.07308.pdf; source code https://github.com/tkipf/gae
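For example, looking up the metadata of one model:
>>> from egc.utils.argparser import models
>>> models['SDCN']['paper url']
'https://arxiv.org/pdf/2002.01633.pdf'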
egc.utils.clustering module
Clustering Methods.
- egc.utils.clustering.sk_clustering(X: Tensor, n_clusters: int, name: str = 'kmeans') ndarray[source]
sklearn clustering.
- Parameters:
X (torch.Tensor) – data embeddings.
n_clusters (int) – num of clusters.
name (str, optional) – clustering method name. Defaults to 'kmeans'.
- Raises:
NotImplementedError – clustering method not implemented.
- Returns:
cluster assignments.
- Return type:
np.ndarray
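A minimal usage sketch on random embeddings:
>>> import torch
>>> X = torch.randn(100, 16)  # 100 nodes with 16-dimensional embeddings
>>> labels = sk_clustering(X, n_clusters=3, name='kmeans')  # np.ndarray of 100 assignments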
- egc.utils.clustering.soft_kmeans_clustering(data: Tensor, miu: Tensor, num_iter: int = 1, cluster_temp: float = 5, dist_type: str = 'cosine_similarity') Tuple[Tensor, Tensor, Tensor][source]
PyTorch (differentiable) implementation of soft k-means clustering.
- Parameters:
data (torch.Tensor) – data embeddings.
miu (torch.Tensor) – initial cluster centers.
num_iter (int, optional) – num of iterations. Defaults to 1.
cluster_temp (float, optional) – softmax temperature. Defaults to 5.
dist_type (str, optional) – distance type. Defaults to ‘cosine_similarity’.
- Returns:
[cluster_centers, soft_assignment_matrix, distance]
- Return type:
Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
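The differentiable step is essentially a temperature-scaled softmax over similarities to the current centers; a minimal sketch of one iteration under the cosine-similarity default (an illustration, not necessarily the exact implementation):
>>> import torch
>>> import torch.nn.functional as F
>>> data = torch.randn(100, 16)
>>> miu = torch.randn(5, 16)                      # 5 initial cluster centers
>>> sim = F.normalize(data, dim=1) @ F.normalize(miu, dim=1).t()  # n x k cosine similarities
>>> r = F.softmax(5 * sim, dim=1)                 # soft assignments, cluster_temp=5
>>> miu = (r.t() @ data) / r.sum(dim=0).unsqueeze(1)  # weighted update of the centers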
egc.utils.common module
common utils
- egc.utils.common.sparse_mx_to_torch_sparse_tensor(sparse_mx: spmatrix) Tensor[source]
Convert a scipy sparse matrix to a torch sparse tensor
- Parameters:
sparse_mx (scipy.sparse.spmatrix) – sparse matrix
- Returns:
torch sparse tensor
- Return type:
(torch.Tensor)
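A minimal sketch of the standard conversion recipe (an assumption about the exact implementation):
>>> import numpy as np
>>> import scipy.sparse as sp
>>> import torch
>>> mx = sp.random(4, 4, density=0.5, format='coo').astype(np.float32)
>>> indices = torch.from_numpy(np.vstack((mx.row, mx.col)).astype(np.int64))
>>> values = torch.from_numpy(mx.data)
>>> tensor = torch.sparse_coo_tensor(indices, values, torch.Size(mx.shape))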
- egc.utils.common.tab_printer(args: Dict, thead: List[str] | None = None) None[source]
Function to print the logs in a nice tabular format.
- Parameters:
args (Dict) – Parameters used for the model.
thead (List[str], optional) – table header. Defaults to None.
- egc.utils.common.make_parent_dirs(target_path: PurePath) None[source]
make all the parent dirs of the target path.
- Parameters:
target_path (PurePath) – target path.
- egc.utils.common.refresh_file(target_path: str | None = None) None[source]
Clear the content of the target file.
- Parameters:
target_path (str) – file path
- egc.utils.common.csv2file(target_path: str, thead: Tuple[str] | None = None, tbody: Tuple | None = None, refresh: bool = False, is_dict: bool = False) None[source]
Save CSV data to target_path.
- Parameters:
target_path (str) – target path
thead (Tuple[str], optional) – csv table header, only written into the file when it is not None and file is empty. Defaults to None.
tbody (Tuple, optional) – csv table content. Defaults to None.
refresh (bool, optional) – whether to clean the file first. Defaults to False.
is_dict (bool, optional) – whether tbody is passed as a dict. Defaults to False.
- egc.utils.common.set_seed(seed: int = 4096) None[source]
Set random seed.
Note: the conv and NeighborSampler operators of DGL are somehow nondeterministic.
Set according to:
- the PyTorch doc: https://pytorch.org/docs/1.9.0/notes/randomness.html
- the cudatoolkit doc: https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
- the DGL issue: https://github.com/dmlc/dgl/issues/3302
- Parameters:
seed (int, optional) – random seed. Defaults to 4096.
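A minimal sketch of the seeding steps such a helper typically performs, following the references above (an assumption, not the exact function body):
>>> import os, random
>>> import numpy as np
>>> import torch
>>> seed = 4096
>>> random.seed(seed)
>>> np.random.seed(seed)
>>> torch.manual_seed(seed)
>>> torch.cuda.manual_seed_all(seed)
>>> os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'  # cuBLAS reproducibility
>>> torch.backends.cudnn.deterministic = True
>>> torch.backends.cudnn.benchmark = False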
- egc.utils.common.set_device(gpu: str = '0') device[source]
Set torch device.
- Parameters:
gpu (str) – args.gpu. Defaults to ‘0’.
- Returns:
torch device: device(type='cuda:x') or device(type='cpu').
- Return type:
torch.device
- egc.utils.common.print_model_parameters(model: Module) None[source]
print model parameters.
- Parameters:
model (torch.nn.Module) – Torch module.
- egc.utils.common.run_subprocess_command(cmd: str, cwd_path: os.path = None) None[source]
run shell command in subprocess.
- Parameters:
cmd (str) – command string.
cwd_path (os.path, optional) – cwd path to run the cmd. Defaults to None.
- egc.utils.common.dump_var(filename: str, variable: Any, relative_path: str = 'tmp') None[source]
dump var using pickle.
- Parameters:
filename (str) – varname.
variable (Any) – variable to dump.
relative_path (str, optional) – relative path of the dir to save the var. Defaults to ‘tmp’.
- egc.utils.common.load_var(filename: str, relative_path: str = 'tmp') Any[source]
load var using pickle.
- Parameters:
filename (str) – varname.
relative_path (str, optional) – relative path of the dir to save the var. Defaults to ‘tmp’.
- Returns:
variable.
- Return type:
Any
- egc.utils.common.load_or_dump(filename: str, func: Callable, args: Dict, relative_path: str = 'tmp') Any[source]
Load and return the variable if it has been dumped before; otherwise compute it with func, dump it, and then return it.
- Parameters:
filename (str) – varname.
func (Callable) – func to calculate the variable.
args (Dict) – parameter dict for the func.
relative_path (str, optional) – relative path of the dir to save the var. Defaults to ‘tmp’.
- Returns:
variable.
- Return type:
Any
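A minimal usage sketch of the load-or-compute pattern (the function and its arguments are hypothetical):
>>> def expensive(n):
...     return list(range(n))
>>> result = load_or_dump('range_10', expensive, {'n': 10})  # first call: computes and dumps
>>> result = load_or_dump('range_10', expensive, {'n': 10})  # later calls: loads the pickle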
egc.utils.construct_DGLgraph module
construct_DGLgraph
egc.utils.danmf_utils module
DANMF implementation. Repository: https://github.com/benedekrozemberczki/DANMF. Author: benedekrozemberczki.
egc.utils.evaluation module
Evaluation metrics for graph clustering: ACC, NMI, ARI, F1 score. Author: Zhou Sheng.
- egc.utils.evaluation.best_mapping(labels_true: list, labels_pred: list) Tuple[array, array][source]
Get best mapping between labels_true and labels_pred.
- Parameters:
labels_true (list or np.array) – ground-truth labels.
labels_pred (list or np.array) – predicted labels.
- Raises:
ValueError – Labels must be in numpy format!
- Returns:
best mapping.
- Return type:
Tuple[np.array,np.array]
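The best mapping is typically computed with the Hungarian algorithm on the confusion matrix; a minimal sketch of that standard recipe (an assumption about this implementation):
>>> import numpy as np
>>> from scipy.optimize import linear_sum_assignment
>>> from sklearn.metrics import confusion_matrix
>>> labels_true = np.array([0, 0, 1, 1, 2, 2])
>>> labels_pred = np.array([1, 1, 2, 2, 0, 0])
>>> cm = confusion_matrix(labels_true, labels_pred)
>>> row_ind, col_ind = linear_sum_assignment(-cm)  # maximize the matched counts
>>> # col_ind[i] is the predicted label best matched to true label row_ind[i]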
- egc.utils.evaluation.evaluation(labels_true: Tensor, labels_pred: Tensor) Tuple[float][source]
Clustering evaluation.
- Parameters:
labels_true (torch.Tensor or np.ndarray) – Ground Truth Community.
labels_pred (torch.Tensor or np.ndarray) – Predicted Community.
- Returns:
(ARI, NMI, AMI, ACC, Micro-F1, Macro-F1, purity)
- Return type:
Tuple[float]
egc.utils.graph_diffusion module
utils of MVGRL
- egc.utils.graph_diffusion.compute_ppr(adj: ndarray, alpha: float = 0.2, self_loop: bool = True)[source]
Compute Personalized PageRank (PPR) matrix
- Parameters:
adj (np.ndarray) – adjacency matrix
alpha (float) – restart probability. Defaults to 0.2.
self_loop (bool) – whether to add self-loops. Defaults to True.
- Returns:
diffusion graph adjacency matrix
- Return type:
(np.ndarray)
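A minimal NumPy sketch of the closed-form PPR diffusion \(\alpha(I-(1-\alpha)D^{-1/2}\hat{A}D^{-1/2})^{-1}\) with \(\hat{A}=A+I\) when self_loop=True (an illustration of the formula, not necessarily the exact implementation):
>>> import numpy as np
>>> adj = np.array([[0., 1.], [1., 0.]])
>>> a_hat = adj + np.eye(2)                        # self_loop=True
>>> d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
>>> at = d_inv_sqrt @ a_hat @ d_inv_sqrt           # normalized adjacency
>>> alpha = 0.2
>>> ppr = alpha * np.linalg.inv(np.eye(2) - (1 - alpha) * at)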
egc.utils.graph_statistics module
Graph Statistics
- egc.utils.graph_statistics.count_label(label: Tensor) Dict[source]
Count the occurrences of each label.
- Parameters:
label (torch.Tensor) – label list Tensor
- Returns:
label count dict
- Return type:
Dict
- egc.utils.graph_statistics.get_intra_class_edges(edges: Tuple[ndarray, ndarray], label: List) Dict[source]
Get the Dict of intra-class edges index list
- Parameters:
edges (Tuple[np.ndarray, np.ndarray]) – edges in the format ((v1, v2, …, vn), (u1, u2, …, un))
label (List or np.ndarray) – label list
- Returns:
edges index list indexed by label
- Return type:
Dict
- egc.utils.graph_statistics.get_intra_class_mean_distance(embedding: Tensor, label: List) Dict[source]
Get intra-class Mean distance between node embeddings and community embeddings
- Parameters:
embedding (torch.Tensor) – node embedding matrix
label (List or np.ndarray) – label
- Returns:
mean distance matrix
- Return type:
torch.Tensor
- egc.utils.graph_statistics.get_neighbor_set(edges: Tuple[Tensor, Tensor]) Dict[source]
get neighbor set from edges tuple
- Parameters:
edges (Tuple[torch.Tensor, torch.Tensor]) – edges list
- Returns:
neighbor set indexed by node id
- Return type:
Dict
- egc.utils.graph_statistics.get_motifs_with_one_more_node(motifs: Set[Tuple], neighbor_set: Dict) Set[Tuple][source]
get motifs recursively
- Parameters:
motifs (Set[Tuple]) – motifs set
neighbor_set (Dict) – neighbor set indexed by node id
- Returns:
motifs set enlarged with one more node for each motif
- Return type:
Set[Tuple]
- egc.utils.graph_statistics.get_undireced_motifs(n_nodes: int, motif_size: int, edges: Tuple[Tensor, Tensor]) Tuple[List[List[Tuple]], Dict, Set[Tuple]][source]
Get motifs (n-cliques) of an undirected graph.
- Parameters:
n_nodes (int) – node num
motif_size (int) – motif size
edges (Tuple[torch.Tensor, torch.Tensor]) – edges tuple
- Returns:
(motif list indexed by node id, neighbor set indexed by node id, set of motifs)
- Return type:
Tuple[List[List[Tuple]], Dict, Set[Tuple]]
egc.utils.initialization module
Initialization
egc.utils.load_data module
Load datasets with DGL for graph clustering. Author: Sheng Zhou.
- egc.utils.load_data.load_data(dataset_name: str, directory='./data') Tuple[DGLGraph, Tensor, int][source]
Load datasets.
- Parameters:
dataset_name (str) – Name of the dataset. Check README.md for supported datasets.
directory (str, optional) – path to save the dataset. Defaults to './data'.
- Raises:
NotImplementedError – dataset not supported
- Returns:
graph, label, n_clusters
- Return type:
Tuple[dgl.DGLGraph, torch.Tensor, int]
- egc.utils.load_data.load_ogb_data(dataset_name, directory='./data')[source]
graph: DGL graph object. label: torch tensor of shape (num_nodes, num_tasks).
- egc.utils.load_data.load_dgl_data(dataset_name, directory='./data')[source]
graph: DGL graph object. label: from graph.ndata['label'].
- egc.utils.load_data.allclose(a: Tensor, b: Tensor, rtol: float = 0.0001, atol: float = 0.0001) bool[source]
This function checks if a and b satisfy the condition: |a - b| <= atol + rtol * |b|
- Parameters:
a (torch.Tensor) – first tensor to compare
b (torch.Tensor) – second tensor to compare
rtol (float, optional) – relative tolerance. Defaults to 1e-4.
atol (float, optional) – absolute tolerance. Defaults to 1e-4.
- Returns:
True for close, False for not
- Return type:
bool
- egc.utils.load_data.is_bidirected(g: DGLGraph) bool[source]
Return whether the graph is a bidirected graph. A graph is bidirected if for any edge \((u, v)\) in \(G\) with weight \(w\), there exists an edge \((v, u)\) in \(G\) with the same weight.
- Parameters:
g (dgl.DGLGraph) – dgl.DGLGraph
- Returns:
True for bidirected, False for not
- Return type:
bool
- egc.utils.load_data.load_mat_data2dgl(data_path, verbose=True)[source]
load data from .mat file
- Parameters:
data_path (str) – the file to read in
verbose (bool, optional) – print info, by default True
- Returns:
graph (dgl.DGLGraph) – the graph read from data_path
label (torch.Tensor) – label of node classes
num_classes (int) – number of node classes
- egc.utils.load_data.bar_progress(current, total, _)[source]
Progress bar callback, invoked automatically by wget.
- egc.utils.load_data.load_BlogCatalog(raw_dir='./data')[source]
load BlogCatalog dgl graph
- Parameters:
raw_dir (str) – Data path. Supports user customization.
- Returns:
graph (dgl.DGLGraph) – the graph read from data_path
label (torch.Tensor) – label of node classes
num_classes (int) – number of node classes
Examples
>>> graph, label, n_clusters = load_BlogCatalog()
- egc.utils.load_data.load_Flickr(raw_dir='./data')[source]
load Flickr dgl graph
- Parameters:
raw_dir (str) – Data path. Supports user customization.
- Returns:
graph (dgl.DGLGraph) – the graph read from data_path
label (torch.Tensor) – label of node classes
num_classes (int) – number of node classes
Examples
>>> graph, label, n_clusters = load_Flickr()
- egc.utils.load_data.load_ACM(raw_dir='./data', verbose=True)[source]
load ACM dgl graph
- Parameters:
raw_dir (str) – Data path. Supports user customization.
verbose (bool, optional) – print info, by default True
- Returns:
graph (dgl.DGLGraph) – the graph read from data_path
label (torch.Tensor) – label of node classes
num_classes (int) – number of node classes
Examples
>>> graph, label, n_clusters = load_ACM()
- egc.utils.load_data.load_DBLP(raw_dir='./data', verbose=True)[source]
load DBLP dgl graph
- Parameters:
raw_dir (str) – Data path. Supports user customization.
verbose (bool, optional) – print info, by default True
- Returns:
graph (dgl.DGLGraph) – the graph read from data_path
label (torch.Tensor) – label of node classes
num_classes (int) – number of node classes
Examples
>>> graph, label, n_clusters = load_DBLP()
egc.utils.metrics module
Metrics
- egc.utils.metrics.get_soft_assignment_matrix(data: Tensor, miu: Tensor, cluster_temp: float = 30, dist_type: str = 'cosine_similarity') Tensor[source]
Get soft assignment matrix from data points and cluster centers.
- Parameters:
data (torch.Tensor) – data embeddings.
miu (torch.Tensor) – cluster center embeddings.
cluster_temp (float, optional) – softmax temperature. Defaults to 30.
dist_type (str, optional) – distance type. Defaults to ‘cosine_similarity’.
- Returns:
soft assignment matrix.
- Return type:
torch.Tensor
- egc.utils.metrics.get_modularity_matrix(adj_nodia: Tensor) Tensor[source]
Get Modularity Matrix.
\[A_{vw} - \frac{k_v k_w}{2m}\]
- Parameters:
adj_nodia (torch.Tensor) – adjacency matrix without diagonal.
- Returns:
modularity matrix.
- Return type:
torch.Tensor
- egc.utils.metrics.get_modularity_value(bin_adj_nodiag: Tensor, r: Tensor, mod: Tensor) Tensor[source]
Get Modularity.
\[Q(r)=\frac{1}{2m}\sum_{u,v\in V}\sum_{k=1}^{K}\left[A_{uv}-\frac{d_u d_v}{2m}\right]r_{uk}r_{vk}\]
- Parameters:
bin_adj_nodiag (torch.Tensor) – n x n. Boolean adj matrix without diag.
r (torch.Tensor) – n x k. Soft assignment probability matrix.
mod (torch.Tensor) – n x n. Modularity matrix.
- Returns:
Modularity value.
- Return type:
torch.Tensor
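A minimal sketch of both quantities on a toy graph, matching the formulas above (Q equals trace(r^T B r) / 2m for modularity matrix B; an illustration, not necessarily the exact implementations):
>>> import torch
>>> A = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])  # binary adj, no diagonal
>>> d = A.sum(dim=1)
>>> m = A.sum() / 2
>>> B = A - torch.outer(d, d) / (2 * m)               # modularity matrix
>>> r = torch.tensor([[1., 0.], [1., 0.], [0., 1.]])  # soft assignment matrix (hard here)
>>> Q = torch.trace(r.t() @ B @ r) / (2 * m)          # modularity value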
egc.utils.model_management module
Model Management
- egc.utils.model_management.save_model(model_filename: str, model: Module, optimizer: Optimizer, current_epoch: int, loss: float) None[source]
Save model, optimizer, current_epoch, and loss to checkpoints/${model_filename}.pt.
- Parameters:
model_filename (str) – filename to save model.
model (torch.nn.Module) – model.
optimizer (torch.optim.Optimizer) – optimizer.
current_epoch (int) – current epoch.
loss (float) – loss.
- egc.utils.model_management.load_model(model_filename: str, model: Module, optimizer: Optimizer) Tuple[Module, Optimizer, int, float][source]
Load model from checkpoints/${model_filename}.pt.
- Parameters:
model_filename (str) – filename to load model.
model (torch.nn.Module) – model.
optimizer (torch.optim.Optimizer) – optimizer.
- Returns:
[model, optimizer, epoch, loss]
- Return type:
Tuple[torch.nn.Module, torch.optim.Optimizer, int, float]
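A minimal sketch of the usual torch checkpoint layout such helpers read and write (the key names are assumptions):
>>> import torch
>>> model = torch.nn.Linear(4, 2)
>>> optimizer = torch.optim.Adam(model.parameters())
>>> ckpt = {'model_state_dict': model.state_dict(),
...         'optimizer_state_dict': optimizer.state_dict(),
...         'epoch': 10, 'loss': 0.5}
>>> torch.save(ckpt, 'checkpoint.pt')
>>> state = torch.load('checkpoint.pt')
>>> _ = model.load_state_dict(state['model_state_dict'])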
egc.utils.normalization module
Normalization Utils
- egc.utils.normalization.normalize_feature(features: lil_matrix) array[source]
Row-normalize feature matrix.
- Parameters:
features (scipy.sparse.lil.lil_matrix) – 2D sparse features
- Returns:
2D row-normalized features
- Return type:
features_norm (numpy.matrix)
- egc.utils.normalization.symmetrically_normalize_adj(adj: csr_matrix) coo_matrix[source]
Symmetrically normalize adjacency matrix.
- Parameters:
adj (scipy.sparse.csr.csr_matrix) – 2D sparse adjacency matrix
- Returns:
2D Symmetrically normalized sparse adjacency matrix
- Return type:
adj_norm (scipy.sparse.coo.coo_matrix)
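Both helpers follow the standard GCN recipes: row normalization \(D^{-1}X\) for features and symmetric normalization \(D^{-1/2}AD^{-1/2}\) for the adjacency. A minimal sketch of the symmetric case (an assumption about the exact implementation):
>>> import numpy as np
>>> import scipy.sparse as sp
>>> adj = sp.csr_matrix(np.array([[0., 1.], [1., 0.]]))
>>> rowsum = np.asarray(adj.sum(axis=1)).flatten()
>>> d_inv_sqrt = np.power(rowsum, -0.5)
>>> d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0         # guard isolated nodes
>>> D = sp.diags(d_inv_sqrt)
>>> adj_norm = (D @ adj @ D).tocoo()               # D^{-1/2} A D^{-1/2}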
egc.utils.sampling module
Sample Method
- egc.utils.sampling.get_repeat_shuffle_nodes_list(n_nodes, sample_times)[source]
Get negative-sample node lists by repeated shuffling.
- Parameters:
n_nodes (int) – total number of nodes.
sample_times (int) – number of sampling rounds.
- Returns:
list of shuffled node-index lists, one per sampling round.
- Return type:
(List)
- egc.utils.sampling.normal_reparameterize(mu: Tensor, logvar: Tensor, training: bool = True) Tensor[source]
Reparameterization trick for normal distribution
- Parameters:
mu (torch.Tensor) – mean of the distribution
logvar (torch.Tensor) – log variance of the distribution
training (bool) – whether in training mode
- Returns:
sampled tensor (torch.Tensor)
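The standard trick draws \(\epsilon \sim \mathcal{N}(0, I)\) and returns \(\mu + \epsilon \cdot e^{0.5\,\mathrm{logvar}}\); a minimal sketch:
>>> import torch
>>> mu, logvar = torch.zeros(4, 8), torch.zeros(4, 8)
>>> std = torch.exp(0.5 * logvar)
>>> eps = torch.randn_like(std)
>>> z = mu + eps * std   # differentiable sample; at eval time, mu alone may be returned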
- egc.utils.sampling.agm(x: ndarray) ndarray[source]
AGM probability
- Parameters:
x (np.ndarray) – 1-d array
- Returns:
AGM probability
- Return type:
np.ndarray
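In the Affiliation Graph Model the generation probability commonly takes the form 1 - exp(-x); a hedged sketch under that assumption:
>>> import numpy as np
>>> x = np.array([0.0, 1.0, 2.0])      # aggregated affiliation strengths
>>> p = 1.0 - np.exp(-x)               # AGM-style probability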
- egc.utils.sampling.choice(samples: List[int], weight: ndarray) int[source]
choose next node
- Parameters:
samples (List[int]) – neighbors
weight (np.ndarray) – weights
- Returns:
node chosen
- Return type:
int
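A minimal sketch of weighted next-node selection (an assumption about the exact sampling rule):
>>> import numpy as np
>>> samples = [3, 7, 9]                           # candidate neighbors
>>> weight = np.array([0.2, 0.5, 0.3])
>>> node = int(np.random.choice(samples, p=weight / weight.sum()))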
- class egc.utils.sampling.CommunityGANSampling(n_threads: int, args: Tuple[int, int, bool], motif_size: int, total_motifs: List[List[Tuple]], theta_g: ndarray, neighbor_set: Dict)[source]
Bases: object
CommunityGAN Sampling.
- Parameters:
n_threads (int) – cores of multiprocessing.
args (Tuple[int, int, bool]) – (root, n_sample, only_neg), where root (int) is the root node id, n_sample (int) is the number of motifs sampled, and only_neg (bool) means only return negative samples.
motif_size (int) – motif size.
total_motifs (List[List[Tuple]]) – list of all motifs indexed by node id.
theta_g (np.ndarray) – node embedding of generator.
neighbor_set (Dict) – neighbor set Dict indexed by node id.
- g_v(roots: List[int]) Tuple[int, List[int]][source]
get next node
- Parameters:
roots (List[int]) – list of nodes sampled before
- Returns:
current_node, path walked
- Return type:
Tuple[int, List[int]]
- g_s(args: Tuple[int, int, bool]) Tuple[List[Tuple], List[List[int]]][source]
Sampling for the CommunityGAN generator.
- Parameters:
args (Tuple[int, int, bool]) – (root, n_sample, only_neg), where root (int) is the root node id, n_sample (int) is the number of motifs sampled, and only_neg (bool) means only return negative samples.
- Returns:
motifs, paths
- Return type:
Tuple[List[Tuple], List[List[int]]]
egc.utils.sublime_utils module
Utils for SUBLIME model
Module contents
Utils