Releases: neo4j/graph-data-science
GDS 1.1.7
GDS 1.1.7 is compatible with Neo4j 3.5.x. For a 4.x compatible release, please see GDS 1.7.2.
Bug fixes
- Fixed a bug in Louvain where changes to `maxIterations` were ignored.
- Fixed a bug which caused `gds.graph.list` and `gds.graph.drop` to throw an error when specifying a graph with duplicate property keys by failing early.
- Fixed a bug where `gds.alpha.scc` would sometimes fail with an `ArrayIndexOutOfBoundsException`.
- Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter `nodeLabels`.
GDS 1.7.1
Release Date: October 12, 2021
GDS 1.7.1 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5
- Fixed a bug in Cypher graph loading and subgraph creation which could lead to `ArrayIndexOutOfBounds` errors.
- Fixed an `ArrayIndexOutOfBounds` error caused by running triangleCount on graphs with multiple relationship types.
Graph Data Science 1.7.0
GDS 1.7.0 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5
Breaking changes
- This release does not support Neo4j 4.0.x
- Aligned the returned `modelInfo` entry names of `gds.alpha.ml.linkPrediction.train` and `gds.alpha.ml.nodeClassification.train` with the model catalog. They now contain `modelName` and `modelInfo` instead of `name` and `info`.
- Removed the `sharedUpdater` parameter from `gds.alpha.ml.linkPrediction` and `gds.alpha.ml.nodeClassification`.
- `gds.beta.graph.export.csv` now exports into a subdirectory called `export`. Previously, the exported graphs were written directly into the configured directory.
- Renamed all `graphalgo` packages to `gds`.
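Client code that has to run against both older and newer GDS versions can absorb the `modelInfo` key rename in a small adapter. The following is only a sketch: it assumes the returned `modelInfo` entry has already been fetched as a plain Python dict, and the helper name `normalize_model_info` and the sample values are hypothetical.

```python
def normalize_model_info(entry: dict) -> dict:
    """Return a copy of `entry` that uses the new key names.

    Old keys (`name`, `info`) are mapped to the new ones
    (`modelName`, `modelInfo`); new-style entries pass through unchanged.
    """
    renames = {"name": "modelName", "info": "modelInfo"}
    return {renames.get(key, key): value for key, value in entry.items()}


# Hypothetical entries, made up for illustration only.
old_style = {"name": "my-model", "info": {"classes": [0, 1]}}
new_style = {"modelName": "my-model", "modelInfo": {"classes": [0, 1]}}

normalized_old = normalize_model_info(old_style)
normalized_new = normalize_model_info(new_style)
```

Normalizing at the boundary keeps the rest of the client code on the new names only.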
New features
- New Algorithm: Approximate Maximum K-Cut
  - Includes procedures: `gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate]`.
- Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
  - Includes procedures: `gds.alpha.ml.pipeline.linkPrediction.[create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate]`.
- Introduced support for exporting additional node properties, including strings, from the underlying database.
  - Added `additionalNodeProperties` parameter to `gds.graph.export`
  - Added `additionalNodeProperties` parameter to `gds.graph.export.csv`
- Introduced experimental support for querying the in-memory graph with Cypher
  - Added `gds.alpha.create.cypherdb` to allow Neo4j to recognize the in-memory graph as a database for Cypher queries
- To make it easier to handle multiple concurrent users, we’ve added a system monitoring procedure, `gds.alpha.systemMonitor`, to provide an overview of the system's workload and available resources.
- Progress logging is now turned on by default, and no longer requires changing your configuration settings. Progress can be accessed with `gds.beta.listProgress`.
- GraphSAGE now supports deterministic results with the `randomSeed` configuration parameter to `gds.beta.graphSage.train`.
- Improved performance (up to 20x speedup) of weakly connected components, `gds.wcc`, for undirected graphs by applying a subgraph sampling optimization.
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected `gds.beta.graphSage` and `gds.alpha.spanningTree`.
- Supervised Machine Learning (Node Classification & Link Prediction):
  - Fixed a `NaN` issue in NodeClassification where computations with very small probability values can cause the result to flip to infinity.
  - Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
  - Corrected the training size used in `gds.alpha.ml.linkPrediction.train`. This affects the `penalty` parameter used in logistic regression.
- Progress Logging:
  - Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
- Node Similarity & KNN:
  - Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
  - Fixed a bug which affected `gds.nodeSimilarity.write` and `gds.alpha.knn.write` when being executed in combination with a `nodeLabels` filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
  - Fixed a bug where `gds.nodeSimilarity.[write|mutate]` and `gds.beta.knn.[write|mutate]` wrote duplicate relationships if the input graph is undirected.
- KNN:
  - Fixed a bug in `gds.beta.knn` where negative values in node properties of type float arrays failed when returning the `similarityDistribution`.
- Fast RP:
  - FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
- GraphSAGE:
  - Fixed a bug in weighted GraphSAGE where the `relationshipWeightProperty` was not loaded.
  - Fixed a bug in `gds.beta.graphSage`, where the concurrency parameter was not considered.
- Graph Operations:
  - Fixed a bug in `gds.graph.removeNodeProperties` where `removedPropertiesWritten` was too large for properties shared across multiple labels.
  - Fixed a bug in `gds.beta.graph.generate`, where random graphs with relationship properties could not be generated.
  - Fixed a bug in `gds.create.subgraph` which could lead to undefined behaviour or an `ArrayIndexOutOfBoundsException` when executed on GDS Enterprise Edition.
  - Fixed a bug in `gds.graph.create`, where default values for array properties would throw for convertible types.
Improvements
- Pathfinding: Added existence checks for `sourceNode` and `targetNode` to all shortest path procedures in the product tier.
- Improved runtime of `gds.fastRP` via better workload balancing between threads.
- Lower memory footprint for LinkPrediction and NodeClassification.
- Improved the procedure output of `gds.beta.listProgress`.
- Scaled down scores computed by `gds.articleRank`.
Graph Data Science 1.6.5
GDS 1.6.5 is compatible with Neo4j 4.0, 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in `gds.beta.graph.generate`, where random graphs with relationship properties could not be generated.
- Fixed a bug in `gds.graph.create`, where default values for array properties would throw for convertible types.
- Fixed a bug in `gds.beta.graphSage`, where the concurrency parameter was not considered.
- Fixed a bug where the BitIdMap node mapping builder (on by default in GDS Enterprise Edition) would not correctly count all nodes in certain situations.
- Corrected the training size used in `gds.alpha.ml.linkPrediction.train`. This affects the `penalty` parameter used in logistic regression.
GDS 1.7.0-Preview
GDS 1.7.0-preview is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5.
Breaking changes
- This release does not support Neo4j 4.0.x
- Aligned the returned `modelInfo` entry names of `gds.alpha.ml.linkPrediction.train` and `gds.alpha.ml.nodeClassification.train` with the model catalog. They now contain `modelName` and `modelInfo` instead of `name` and `info`.
- Removed the `sharedUpdater` parameter from `gds.alpha.ml.linkPrediction` and `gds.alpha.ml.nodeClassification`.
- `gds.beta.graph.export.csv` now exports into a subdirectory called `export`. Previously, the exported graphs were written directly into the configured directory.
- Renamed all `graphalgo` packages to `gds`.
New features
- New Algorithm: Approximate Maximum K-Cut
  - Includes procedures: `gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate]`.
- Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
  - Includes procedures: `gds.alpha.ml.pipeline.linkPrediction.[create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate]`.
- Introduced support for exporting additional node properties, including strings, from the underlying database.
  - Added `additionalNodeProperties` parameter to `gds.graph.export`
  - Added `additionalNodeProperties` parameter to `gds.graph.export.csv`
- Introduced experimental support for querying the in-memory graph with Cypher
  - Added `gds.alpha.create.cypherdb` to allow Neo4j to recognize the in-memory graph as a database for Cypher queries
- To make it easier to handle multiple concurrent users, we’ve added a system monitoring procedure, `gds.alpha.systemMonitor`, to provide an overview of the system's workload and available resources.
- Progress logging is now turned on by default, and no longer requires changing your configuration settings. Progress can be accessed with `gds.beta.listProgress`.
- GraphSAGE now supports deterministic results with the `randomSeed` configuration parameter to `gds.beta.graphSage.train`.
- Improved performance (up to 20x speedup) of weakly connected components, `gds.wcc`, for undirected graphs by applying a subgraph sampling optimization.
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected `gds.beta.graphSage` and `gds.alpha.spanningTree`.
- Supervised Machine Learning (Node Classification & Link Prediction):
  - Fixed a `NaN` issue in NodeClassification where computations with very small probability values can cause the result to flip to infinity.
  - Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
  - Corrected the training size used in `gds.alpha.ml.linkPrediction.train`. This affects the `penalty` parameter used in logistic regression.
- Progress Logging:
  - Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
- Node Similarity & KNN:
  - Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
  - Fixed a bug which affected `gds.nodeSimilarity.write` and `gds.alpha.knn.write` when being executed in combination with a `nodeLabels` filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
  - Fixed a bug where `gds.nodeSimilarity.[write|mutate]` and `gds.beta.knn.[write|mutate]` wrote duplicate relationships if the input graph is undirected.
- KNN:
  - Fixed a bug in `gds.beta.knn` where negative values in node properties of type float arrays failed when returning the `similarityDistribution`.
- Fast RP:
  - FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
- GraphSAGE:
  - Fixed a bug in weighted GraphSAGE where the `relationshipWeightProperty` was not loaded.
  - Fixed a bug in `gds.beta.graphSage`, where the concurrency parameter was not considered.
- Graph Operations:
  - Fixed a bug in `gds.graph.removeNodeProperties` where `removedPropertiesWritten` was too large for properties shared across multiple labels.
  - Fixed a bug in `gds.beta.graph.generate`, where random graphs with relationship properties could not be generated.
  - Fixed a bug in `gds.create.subgraph` which could lead to undefined behaviour or an `ArrayIndexOutOfBoundsException` when executed on GDS Enterprise Edition.
  - Fixed a bug in `gds.graph.create`, where default values for array properties would throw for convertible types.
Improvements
- Pathfinding: Added existence checks for `sourceNode` and `targetNode` to all shortest path procedures in the product tier.
- Improved runtime of `gds.fastRP` via better workload balancing between threads.
- Lower memory footprint for LinkPrediction and NodeClassification.
- Improved the procedure output of `gds.beta.listProgress`.
- Scaled down scores computed by `gds.articleRank`.
GDS 1.6.4
GDS 1.6.3
Release Date: July 22, 2021
Breaking changes
- Removed the `sharedUpdater` parameter from `gds.alpha.ml.linkPrediction` and `gds.alpha.ml.nodeClassification`.
New features
Bug fixes
- Fixed a bug which affected `gds.nodeSimilarity.write` and `gds.alpha.knn.write` when being executed in combination with a `nodeLabels` filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
- Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
- Fixed a bug where `gds.nodeSimilarity.[write|mutate]` and `gds.beta.knn.[write|mutate]` wrote duplicate relationships if the input graph is undirected.
- Fixed a bug in `gds.beta.knn` where negative values in node properties of type float arrays failed when returning the `similarityDistribution`.
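The duplicate-relationship issue above is easiest to see on a small example: in an undirected graph, a similarity between nodes `a` and `b` can surface once as `(a, b)` and once as `(b, a)`. The following is a minimal sketch of one way to collapse such pairs by ordering the endpoints. It is not the GDS implementation, and the function name and sample data are made up for illustration.

```python
def dedupe_undirected(pairs):
    """Keep one entry per unordered node pair, preserving first-seen order."""
    seen = set()
    result = []
    for source, target, score in pairs:
        # Order the endpoints so (a, b) and (b, a) share one key.
        key = (min(source, target), max(source, target))
        if key not in seen:
            seen.add(key)
            result.append((key[0], key[1], score))
    return result


# Hypothetical similarity results: (source, target, score).
similar = [(0, 1, 0.5), (1, 0, 0.5), (2, 1, 0.25)]
deduped = dedupe_undirected(similar)
```

Here the second tuple is dropped because it describes the same unordered pair as the first.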
Improvements
- Lower memory footprint for LinkPrediction and NodeClassification.
GDS 1.6.2
Release Date: July 8, 2021
GDS 1.6.2 is compatible with Neo4j 4.3, 4.2, 4.1, and 4.0. It is not compatible with Neo4j 3.5.x - for a compatible release, please see GDS 1.1.6
Bug Fixes
- Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
- Fixed a bug in `gds.beta.node2vec` where using relationship weights would not work when running concurrently.
- Fixed a bug in `gds.graph.create` where the default value could not be changed for array properties.
1.6.1
Release Date: 17 June 2021
GDS 1.6.1 is compatible with Neo4j 4.0, 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected `gds.beta.graphSage` and `gds.alpha.spanningTree`.
- Fixed a `NaN` issue in NodeClassification where computations with very small probability values can cause the result to flip to infinity.
- Progress logging (`gds.beta.listProgress`):
  - Fixed a bug where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug with Pregel algorithms logging where progress events would not be released on algorithm completion.
- Fixed a bug regarding mutated node properties that could cause an `ArrayIndexOutOfBoundsException`.
- Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
- FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
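The `NaN` issue listed above is an instance of a general numerical pitfall: a probability that underflows to `0.0` makes its logarithm infinite (in Java, `Math.log(0.0)` evaluates to `-Infinity`; Python's `math.log` raises `ValueError` instead), and later arithmetic can then produce `NaN`. One common guard, shown here as a minimal sketch and not as the fix GDS applied, is to clamp probabilities to a small floor before taking the log. The names `safe_log` and `cross_entropy` and the floor value are assumptions chosen for illustration.

```python
import math

FLOOR = 1e-15  # hypothetical lower bound, chosen only for this example


def safe_log(p: float, floor: float = FLOOR) -> float:
    """Natural log of p, with p clamped to at least `floor`."""
    return math.log(max(p, floor))


def cross_entropy(true_class_probabilities: list) -> float:
    """Mean negative log-likelihood of the true classes."""
    total = sum(safe_log(p) for p in true_class_probabilities)
    return -total / len(true_class_probabilities)


# Multiplying two tiny probabilities underflows to exactly 0.0.
tiny = 1e-200 * 1e-200
loss = cross_entropy([0.9, tiny])
```

With the clamp in place the loss stays finite even though one of the inputs has underflowed to zero.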
GDS 1.6.0
Release Date: 27 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
  - Added procedures: `gds.degree.stream.estimate`, `gds.degree.write.estimate`, `gds.degree.mutate`, `gds.degree.mutate.estimate`, `gds.degree.stats`, `gds.degree.stats.estimate`
  - Removed alpha procedures: `gds.alpha.degree.stream`, `gds.alpha.degree.write`
- Article Rank has been promoted to the product tier
  - Added procedures: `gds.articleRank.stream`, `gds.articleRank.stream.estimate`, `gds.articleRank.write`, `gds.articleRank.write.estimate`, `gds.articleRank.mutate`, `gds.articleRank.mutate.estimate`, `gds.articleRank.stats`, `gds.articleRank.stats.estimate`
  - Removed alpha procedures: `gds.alpha.articleRank.stream`, `gds.alpha.articleRank.write`
- Eigenvector Centrality has been promoted to the product tier
  - Added procedures: `gds.eigenvector.stream`, `gds.eigenvector.stream.estimate`, `gds.eigenvector.write`, `gds.eigenvector.write.estimate`, `gds.eigenvector.mutate`, `gds.eigenvector.mutate.estimate`, `gds.eigenvector.stats`, `gds.eigenvector.stats.estimate`
  - Removed alpha procedures: `gds.alpha.eigenvector.stream`, `gds.alpha.eigenvector.write`
- AStar has been promoted to the product tier
  - Added procedures: `gds.astar.stream`, `gds.astar.stream.estimate`, `gds.astar.write`, `gds.astar.write.estimate`, `gds.astar.mutate`, `gds.astar.mutate.estimate`
  - Removed beta procedures: `gds.beta.astar.stream`, `gds.beta.astar.stream.estimate`, `gds.beta.astar.write`, `gds.beta.astar.write.estimate`, `gds.beta.astar.mutate`, `gds.beta.astar.mutate.estimate`
  - The parameter `path` was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yens K Shortest Paths has been promoted to the product tier:
  - Added procedures: `gds.yens.stream`, `gds.yens.stream.estimate`, `gds.yens.write`, `gds.yens.write.estimate`, `gds.yens.mutate`, `gds.yens.mutate.estimate`
  - Removed beta procedures: `gds.beta.yens.stream`, `gds.beta.yens.stream.estimate`, `gds.beta.yens.write`, `gds.beta.yens.write.estimate`, `gds.beta.yens.mutate`, `gds.beta.yens.mutate.estimate`
  - The parameter `path` was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
  - Added procedures: `gds.shortestPath.dijkstra.stream`, `gds.shortestPath.dijkstra.stream.estimate`, `gds.shortestPath.dijkstra.write`, `gds.shortestPath.dijkstra.write.estimate`, `gds.shortestPath.dijkstra.mutate`, `gds.shortestPath.dijkstra.mutate.estimate`
  - Removed beta procedures: `gds.beta.shortestPath.dijkstra.stream`, `gds.beta.shortestPath.dijkstra.stream.estimate`, `gds.beta.shortestPath.dijkstra.write`, `gds.beta.shortestPath.dijkstra.write.estimate`, `gds.beta.shortestPath.dijkstra.mutate`, `gds.beta.shortestPath.dijkstra.mutate.estimate`
  - The parameter `path` was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
  - Added procedures: `gds.allShortestPath.dijkstra.stream`, `gds.allShortestPath.dijkstra.stream.estimate`, `gds.allShortestPath.dijkstra.write`, `gds.allShortestPath.dijkstra.write.estimate`, `gds.allShortestPath.dijkstra.mutate`, `gds.allShortestPath.dijkstra.mutate.estimate`
  - Removed beta procedures: `gds.beta.allShortestPath.dijkstra.stream`, `gds.beta.allShortestPath.dijkstra.stream.estimate`, `gds.beta.allShortestPath.dijkstra.write`, `gds.beta.allShortestPath.dijkstra.write.estimate`, `gds.beta.allShortestPath.dijkstra.mutate`, `gds.beta.allShortestPath.dijkstra.mutate.estimate`
  - The parameter `path` was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
  - Added procedures: `gds.beta.node2vec.stream`, `gds.beta.node2vec.stream.estimate`, `gds.beta.node2vec.write`, `gds.beta.node2vec.write.estimate`, `gds.beta.node2vec.mutate`, `gds.beta.node2vec.mutate.estimate`
  - Removed alpha procedures: `gds.alpha.node2vec.stream`, `gds.alpha.node2vec.write`
  - The parameter `centerSamplingFactor` is renamed to `positiveSamplingFactor`
  - The parameter `contextSamplingExponent` is renamed to `negativeSamplingExponent`
- The `maxStreakCount` configuration parameter is renamed to `patience`. It is used in the train modes of Node Classification and Link Prediction.
- The `maxIterations` and `minIterations` configuration parameters are renamed to `maxEpochs` and `minEpochs`. They are used in the train modes of Node Classification and Link Prediction.
- The `windowSize` configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- The `gds.alpha.ml.linkPrediction.train` configuration parameter `classRatio` is renamed to `negativeClassWeight`. It is also mandatory now.
- Removed the `degreeAsProperty` configuration parameter from GraphSAGE.
  - The same effect can be achieved by using `gds.degree.mutate` and using the mutated property as a feature for GraphSAGE training.
- Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
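The training-parameter renames above lend themselves to a mechanical migration of stored configuration maps. The sketch below assumes the train configuration is held client-side as a plain Python dict before being passed to a procedure call; the helper name `migrate_train_config` and the sample values are hypothetical. It only renames or drops keys. It does not check that `negativeClassWeight` is present (it is now mandatory for Link Prediction training), and it should be applied only to Node Classification and Link Prediction train configs, since other algorithms keep their own `maxIterations`.

```python
# Renames and removals taken from the breaking changes listed above.
RENAMED = {
    "maxStreakCount": "patience",
    "maxIterations": "maxEpochs",
    "minIterations": "minEpochs",
    "classRatio": "negativeClassWeight",  # Link Prediction training only
}
REMOVED = {"windowSize"}


def migrate_train_config(config: dict) -> dict:
    """Return a new dict with renamed keys and removed keys dropped."""
    migrated = {}
    for key, value in config.items():
        if key in REMOVED:
            continue
        migrated[RENAMED.get(key, key)] = value
    return migrated


# Hypothetical pre-1.6 train configuration.
old_config = {
    "maxStreakCount": 3,
    "maxIterations": 100,
    "minIterations": 1,
    "windowSize": 5,
    "classRatio": 1.0,
    "concurrency": 4,
}
new_config = migrate_train_config(old_config)
```

Keys that are not mentioned in the breaking changes, such as `concurrency` here, pass through untouched.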
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
  - `gds.alpha.scaleProperties.stream`
  - `gds.alpha.scaleProperties.mutate`
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure `gds.beta.graph.create.subgraph`
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
  - `gds.alpha.influenceMaximization.celf.stream`
  - `gds.alpha.influenceMaximization.greedy.stream`
- Link Prediction:
  - Added support for storing, loading and publishing Link Prediction models.
  - Added progress logging for `gds.alpha.ml.linkPrediction.train` and `gds.alpha.ml.linkPrediction.predict`.
  - Added write and stream modes to `gds.alpha.ml.linkPrediction.predict`: `gds.alpha.ml.linkPrediction.stream`, `gds.alpha.ml.linkPrediction.write`
  - Added estimate mode for Link Prediction: `gds.alpha.ml.linkPrediction.train.estimate`, gds.alpha.ml.lin...