Commit c99d4f1

GagaLP authored and PeterTh committed
Prepare 0.7.0 Release
1 parent ac6024f commit c99d4f1

11 files changed (+67 -11 lines changed)

.hdoc.toml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 [project]
 name = "Celerity"
-version = "0.6.0"
+version = "0.7.0"
 
 # Optional, adding this will enable direct links from the documentation
 # to your source code.

CHANGELOG.md

Lines changed: 57 additions & 1 deletion
@@ -6,7 +6,24 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic
 Versioning](http://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.7.0] - 2025-08-18
+
+This release includes changes that may require adjustments when upgrading:
+- Celerity now requires C++20
+- `celerity::distr_queue` has been replaced by `celerity::queue`.
+Multiple instances of `celerity::queue` are now supported, with behavior more closely aligned with SYCL.
+- Buffer access handling has been refactored: `celerity::access_mode` is now a dedicated enum.
+Using `sycl::access_mode` on Celerity buffers is no longer supported.
+- Coordinate-list constructors of `access::neighborhood` have been deprecated in favor of the `range` overload.
+- We recommend performing a clean build when updating Celerity to ensure all updated submodule dependencies are properly propagated.
+
+We recommend using the following SYCL versions with this release:
+
+- DPC++: ad494e9d or newer
+- AdaptiveCpp (formerly hipSYCL): v24.06
+- SimSYCL: master
+
+See our [platform support guide](docs/platform-support.md) for a complete list of all officially supported configurations.
 
 ### Added
 
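The upgrade note about `access_mode` can be read as a type-system change: an alias and its target are the same type, while a dedicated enum is not interchangeable with the SYCL one. The following is a minimal sketch of that difference only. The namespaces `sycl_like`, `celerity_like_old` and `celerity_like_new`, the reduced enumerator list and the helper `is_producer` are all hypothetical stand-ins, not the real SYCL or Celerity headers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the SYCL enum (reduced enumerator list).
namespace sycl_like {
enum class access_mode { read, write, read_write };
}

// "Before" model: an alias, so both names denote one and the same type.
namespace celerity_like_old {
using access_mode = sycl_like::access_mode;
}

// "After" model: a dedicated enum that is unrelated to the SYCL-like one.
namespace celerity_like_new {
enum class access_mode { read, write, read_write };
}

// With the alias, values of either name can be mixed freely.
static_assert(std::is_same_v<celerity_like_old::access_mode, sycl_like::access_mode>);

// With the dedicated enum, the types differ and there is no implicit conversion,
// so passing the SYCL-like value where the new enum is expected fails to compile.
static_assert(!std::is_same_v<celerity_like_new::access_mode, sycl_like::access_mode>);
static_assert(!std::is_convertible_v<sycl_like::access_mode, celerity_like_new::access_mode>);

// Hypothetical helper that only accepts the dedicated enum.
constexpr bool is_producer(celerity_like_new::access_mode mode) {
    return mode != celerity_like_new::access_mode::read;
}
```

Under this model, code that spelled the SYCL enum explicitly would need to switch to the Celerity name when upgrading, which matches the wording of the note above.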

@@ -24,6 +41,10 @@ Versioning](http://semver.org/spec/v2.0.0.html).
 
 ### Changed
 
+- Update Tracy dependency to v0.11.1 (#281)
+- Update libenvpp dependency to 1.5 (#312)
+- Update fmt dependency to 11.1.2 (#328)
+- Update spdlog dependency to HEAD > 1.15.0 (#328)
 - Celerity now requires C++20 (#291)
 - Automatic runtime shutdown, which was previously triggered by the last queue / buffer / host object going out of scope,
 is now postponed until process termination (`atexit()`). This allows multiple non-overlapping sections of Celerity code
@@ -36,17 +57,52 @@ Versioning](http://semver.org/spec/v2.0.0.html).
 - Overhauled the [installation](docs/installation.md) and [configuration](docs/configuration.md) documentation (#309)
 - Celerity will now queue up several command groups in order to combine allocations and elide resize operations.
 This behavior can be influenced using the new `experimental::set_lookahead` and `experimental::flush` APIs (#298)
+- Reduced small host-buffer allocations in MPI transfers by accumulating touched boxes during `anticipate()` (#313)
+- Celerity internals are no longer exposed to users through installed headers (#308)
+- Buffer `access_mode` is now a dedicated `celerity::access_mode` enum instead of an alias of `sycl::access_mode`, simplifying
+the include tree and removing namespace ambiguity. `sycl::access_mode` can no longer be used with Celerity buffers. (#315)
+- Uninitialized read warnings now provide more helpful information (#321)
+- Improved Tracy integration for executor starvation. Celerity now also prints a warning when execution time exceeds a
+given percentage threshold, indicating that the application might be scheduler-bound (#322)
 
 ### Fixed
 
 - Host-initialized buffers will not read from user-provided memory after the last reference to the buffer has been dropped (#283)
+- Fix a build issue on macOS where moving a `std::function` did not clear the source, causing failing test cases (#285)
+- Fix a path hint for finding AdaptiveCpp when using an installed Celerity (#286)
+- Fix a race condition in unit tests by updating `last_epoch_reached` before signalling the epoch promise, ensuring proper synchronization (#307)
 - Fix a build issue with (rare) configurations which enable both Tracy and OOB-checks (#331)
 
 ### Deprecated
 
 - `celerity::distr_queue` is deprecated in favor of `celerity::queue` (#283)
 - The coordinate-list constructors of `access::neighborhood` are deprecated in favor of the `range` overload (#292)
 
+### Internal
+
+- Command graphs generate a single "fat" push command instead of a separate push for each write and target node. (#290)
+- Event polling now only happens for instructions that are actively executing (#293)
+- Task management now uses epoch-based structures, removes the ring buffer size limit, and handles tasks via
+stable pointers, simplifying scheduler and application thread interactions (#295)
+- Command graph now uses `command` instead of `abstract_command`, moves CDAG-related pruning to the scheduler,
+and maintains command pointers in the CDAG generator (#297)
+- `buffer_access_map` now works in terms of consumed and produced regions instead of access modes.
+This includes various related improvements to task requirements, execution ranges, and graph printing (#300)
+- Use `region_map::update_box` instead of `update_region` where applicable (#302)
+- Improved "system" benchmarks to better capture effects that are highly significant in real-world workloads (#304)
+- Unified thread code, with a single source of truth for thread names and Tracy thread ordering (#310)
+- Optimize `perform_task_buffer_accesses` to skip redundant last-writers updates and transpose loops,
+yielding minor performance improvements in scheduler-bound workloads (#317)
+- The SimSYCL workaround for thread safety has been removed (#318)
+- Prevent unbounded growth in `receive_arbiter` by caching active transfers (#319)
+- Centralize definition of Tracy colors (#320)
+- Change split functions to work on box instead of chunk (#323)
+- Align await-pushes with pushes by computing the union of regions for remote chunks executed on the same node (#324)
+- Celerity now uses `SYCL_IS_*` macros instead of `defined(__SYCL_COMPILER_VERSION)` for checking the SYCL version (#329)
+- Removed internal branches on `CELERITY_FEATURE_UNNAMED_KERNELS`, which now only exists for backwards compatibility in
+applications (#329)
+
+
 ## [0.6.0] - 2024-08-12
 
 This release includes changes that may require adjustments when upgrading:
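
The "Fixed" entry for #285 touches a general C++ point: the standard only guarantees that a moved-from `std::function` is left in a valid but unspecified state, so code that expects it to be empty may behave differently across standard library implementations. A portable pattern, sketched below, is to reset the source explicitly. The helper name `take_callback` is hypothetical and is not claimed to be what the actual fix in Celerity looks like.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (not part of Celerity): moves the callable out of
// `source` and assigns nullptr to it, so `source` is guaranteed to be empty
// afterwards regardless of what the moved-from state happens to be.
template <typename Signature>
std::function<Signature> take_callback(std::function<Signature>& source) {
    // std::exchange move-constructs the return value from `source`,
    // then assigns the replacement value (nullptr) to `source`.
    return std::exchange(source, nullptr);
}
```

After `auto g = take_callback(f);`, the expression `!f` is true on every conforming implementation, whereas after a plain `auto g = std::move(f);` it may or may not be.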

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 The MIT License (MIT)
 
-Copyright (c) 2018-2024 DPS Group, University of Innsbruck, Austria.
+Copyright (c) 2018-2025 DPS Group, University of Innsbruck, Austria.
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.6.0
+0.7.0

examples/convolution/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(convolution LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 
 add_executable(convolution convolution.cc)
 add_celerity_to_target(TARGET convolution SOURCES convolution.cc)

examples/distr_io/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(distr_io LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 if(NOT CELERITY_ENABLE_MPI)
     message(SEND_ERROR "Your Celerity installation is built without MPI support.\nSkip this example.")
 endif()

examples/hello_world/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(hello_world LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 
 add_executable(hello_world hello_world.cc)
 add_celerity_to_target(TARGET hello_world SOURCES hello_world.cc)

examples/matmul/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(matmul LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 
 add_executable(matmul matmul.cc)
 add_celerity_to_target(TARGET matmul SOURCES matmul.cc)

examples/reduction/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(syncing LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 
 add_executable(reduction reduction.cc)
 add_celerity_to_target(TARGET reduction SOURCES reduction.cc)

examples/syncing/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 cmake_minimum_required(VERSION 3.13)
 project(syncing LANGUAGES CXX)
 
-find_package(Celerity 0.6.0 REQUIRED)
+find_package(Celerity 0.7.0 REQUIRED)
 
 add_executable(syncing syncing.cc)
 add_celerity_to_target(TARGET syncing SOURCES syncing.cc)
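
The CHANGELOG.md entry for #307 describes an ordering fix: `last_epoch_reached` is updated before the epoch promise is signalled, so a thread woken by the promise never observes a stale value. The sketch below illustrates that ordering with standard library primitives only. Apart from the name `last_epoch_reached`, every identifier (`epoch_state`, `signal_epoch`, `wait_for_epoch`) is a hypothetical stand-in and does not reflect the actual Celerity implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <future>
#include <thread>

// Hypothetical model of the shared state involved in the fix.
struct epoch_state {
    std::atomic<std::uint64_t> last_epoch_reached{0};
    std::promise<void> reached;
};

// Producer side: publish the epoch number first, then wake the waiter.
// If the two statements were swapped, a waiter could wake up and still
// read the old value of last_epoch_reached.
inline void signal_epoch(epoch_state& state, std::uint64_t epoch) {
    state.last_epoch_reached.store(epoch); // 1. update the counter
    state.reached.set_value();             // 2. signal the promise
}

// Consumer side: block on the future, then read the counter.
inline std::uint64_t wait_for_epoch(std::uint64_t epoch) {
    epoch_state state;
    std::future<void> reached = state.reached.get_future();
    std::thread worker([&state, epoch] { signal_epoch(state, epoch); });
    reached.wait();
    const std::uint64_t observed = state.last_epoch_reached.load();
    worker.join();
    return observed;
}
```

With this ordering, `wait_for_epoch(7)` returns 7, because the store is sequenced before `set_value()` and the return of `wait()` synchronizes with that call.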
