Description
Describe the bug
When I plot a graph with 95,157 nodes and 12,000,000 edges, Graphistry attempts to render it, but just before the render completes it "refreshes" and starts over. This cycle repeats until it reaches the "herding stray GPUs" stage and then times out.
I also tried this with 1,000 nodes and 2,000 edges, and it behaves the same way. If I try again at a different time, it sometimes works.
To Reproduce
Code, including data, that can be run without editing:
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.sparse as sp
import graphistry

# graphistry.register(api=3, username='...', password='...')

num_nodes = 95157
num_edges = 12000000
num_clusters = 744

print(f"Generating a sparse graph with {num_nodes} nodes and {num_edges} edges...")
# Random endpoints for each edge
rows = np.random.randint(0, num_nodes, num_edges)
cols = np.random.randint(0, num_nodes, num_edges)
data = np.ones(num_edges, dtype=int)
# Sparse adjacency matrix; repeated (row, col) pairs collapse into one entry
adj_matrix_csr = sp.csr_matrix((data, (rows, cols)), shape=(num_nodes, num_nodes))
# Drop self-loops
adj_matrix_csr.setdiag(0)
adj_matrix_csr.eliminate_zeros()
source_nodes, target_nodes = adj_matrix_csr.nonzero()

edges_df = pd.DataFrame({
    'source': source_nodes,
    'destination': target_nodes,
    'edge_weight': 1
})
edges_df.drop_duplicates(subset=['source', 'destination'], inplace=True)
print(f"Generated {len(edges_df)} unique edges.")

# Assign each node a random cluster label
cluster_labels = np.random.randint(0, num_clusters, num_nodes)
nodes_df = pd.DataFrame({
    'node': np.arange(num_nodes),
    'type': cluster_labels
})
print(f"Generated {len(nodes_df)} nodes with {num_clusters} cluster labels.")

print("Binding data to PyGraphistry and plotting...")
g = graphistry.edges(edges_df, 'source', 'destination').nodes(nodes_df, 'node')

# cm.get_cmap was removed in Matplotlib 3.9; plt.get_cmap takes the same arguments
cmap = plt.get_cmap('viridis', num_clusters)  # 'hsv' or 'rainbow' are good for max distinctness, but not perceptually uniform
colors_list = [mcolors.rgb2hex(cmap(i)) for i in range(num_clusters)]
custom_cluster_colors_all = {i: colors_list[i] for i in range(num_clusters)}

# Note: .plot() returns the plot output, so g no longer holds the Plotter object
g = g.encode_point_color(
    'type',
    categorical_mapping=custom_cluster_colors_all
).plot()
print("Plotting command issued. Check your browser or Jupyter output for the visualization.")
g
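For the smaller case mentioned in the description (1,000 nodes and 2,000 edges), a reduced reproduction might look like the sketch below. It is illustrative only: the helper names `make_small_graph` and `plot_small_graph`, the cluster count of 10, and the fixed seed are assumptions and were not part of the original run. It also plots without the color encoding, which may help narrow down whether the encoding plays any role.

```python
import random


def make_small_graph(num_nodes=1_000, num_edges=2_000, num_clusters=10, seed=0):
    """Build a small random graph as plain lists of dicts.

    num_clusters and seed are illustrative values, not taken from the report.
    """
    rng = random.Random(seed)
    edge_set = set()
    # Keep sampling until we have the requested number of unique, non-loop edges
    while len(edge_set) < num_edges:
        src = rng.randrange(num_nodes)
        dst = rng.randrange(num_nodes)
        if src != dst:
            edge_set.add((src, dst))
    edges = [
        {"source": src, "destination": dst, "edge_weight": 1}
        for src, dst in sorted(edge_set)
    ]
    nodes = [
        {"node": n, "type": rng.randrange(num_clusters)}
        for n in range(num_nodes)
    ]
    return edges, nodes


def plot_small_graph(edges, nodes):
    """Bind the small graph to PyGraphistry and plot it (needs a registered account)."""
    # Imported here so the generator above runs without these packages installed
    import pandas as pd
    import graphistry

    # graphistry.register(api=3, username='...', password='...')
    g = graphistry.edges(pd.DataFrame(edges), "source", "destination")
    g = g.nodes(pd.DataFrame(nodes), "node")
    return g.plot()


edges, nodes = make_small_graph()
print(f"Generated {len(nodes)} nodes and {len(edges)} unique edges.")
```

Calling `plot_small_graph(edges, nodes)` after registering should then issue the same kind of plot request as the full-size example, just on a graph small enough to rule out data volume as the cause.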
Expected behavior
The graph should render with the colored nodes.
Actual behavior
The rendering process keeps restarting when it's almost completed.
Screenshots
graphistry-timeout.mov
Browser environment (please complete the following information):
- OS: macOS
- Browser: Chrome, Firefox
- Version: Chrome 138
Graphistry GPU server environment
- Where run: Hub
PyGraphistry API client environment
- Where run: Graphistry 2.43.4, Jupyter Notebook 4.3.6
- Version: 0.41.0
- Python version: 3.13.2