-
Notifications
You must be signed in to change notification settings - Fork 142
deserialize/value: Accept shorter encoded tuples #1453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deserialize/value: Accept shorter encoded tuples #1453
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances tuple deserialization to handle tuples with fewer elements than expected in the database representation, treating missing fields as nulls. This addresses a scenario where tuple columns are altered to add new fields, causing existing rows to have shorter encoded tuples than the metadata indicates.
Key changes:
- Modified tuple deserialization logic to interpret
UnexpectedEoferrors as missing (null) tuple elements - Added regression test verifying behavior when a tuple column is altered to add fields
- Enabled vector type integration test for ScyllaDB
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scylla/tests/integration/types/cql_collections.rs | Added test case test_cql_tuple_db_repr_shorter_than_metadata to verify tuple deserialization with missing fields |
| scylla-cql/src/deserialize/value.rs | Enhanced tuple deserialization to handle UnexpectedEof by treating missing elements as null values |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
2376552 to
5d63ad7
Compare
|
Did you ensure that nested tuples with missing values also work with those changes: ex. While I do not know if any sensible driver would create such value under normal circumstances, it is still possible to receive such value from the database. I assume this should be an easy question to answer, knowing how the deserialization behaves, and it shouldn't be nessesery to test this. |
Writing a test should also not be very problematic, and its better than guessing. |
|
A wrapper like this might be useful for testing shorter tuples, as it allows to easily insert them without ALTER TABLE: struct SerializeValueWithFakeType<T> {
fake_type: ColumnType<'static>,
value: T,
}
impl<T: SerializeValue> SerializeValue for SerializeValueWithFakeType<T> {
fn serialize<'b>(
&self,
_typ: &ColumnType,
writer: scylla_cql::serialize::CellWriter<'b>,
) -> Result<scylla::serialize::writers::WrittenCellProof<'b>, scylla::errors::SerializationError>
{
<T as SerializeValue>::serialize(&self.value, &self.fake_type, writer)
}
}
impl<T> SerializeValueWithFakeType<T> {
fn new(value: T, fake_type: ColumnType<'static>) -> Self {
Self { fake_type, value }
}
}Example test (helper functions defined in cql_collections.rs): #[tokio::test]
async fn test_shorter_tuples() {
use ColumnType::*;
use NativeType::*;
setup_tracing();
let session: Session = connect().await;
let table_name: &str = "test_cql_tuple_tab";
create_table(&session, table_name, "tuple<int, int, text>").await;
// Rust tuple ()
let tuple1: (i32, i32) = (1, 2);
let tuple1_ser = SerializeValueWithFakeType::new(tuple1, Tuple(vec![Native(Int), Native(Int)]));
let tuple1_deser = (1, 2, Option::<String>::None);
insert_and_select(&session, table_name, &tuple1_ser, &tuple1_deser).await;
session
.ddl(format!("DROP KEYSPACE {}", session.get_keyspace().unwrap()))
.await
.unwrap();
} |
When deserializing tuples, if the encoded data is shorter than expected (i.e., it contains fewer elements than the tuple type specifies), we should treat the missing elements as nulls instead of returning an error. This is what other drivers (Python, Java, Node.js) do, and it improves compatibility with ScyllaDB/Cassandra, which allow inserting tuples with fewer elements than the defined type. Fixes: scylladb#1452.
5d63ad7 to
2579e40
Compare
|
Rebased on main. |
After scylladb#1452 is fixed in a previous commit, now we add a regression test to ensure that tuples encoded with fewer elements than in the metadata are accepted and decoded correctly (missing elements are returned as null). The test would fail before the fix. Co-authored-by: Karol Baryła <karol.baryla@scylladb.com>
This helper struct allows serializing a value with a different CQL type than its actual Rust type. This is useful for testing scenarios where we want to force serializing a value to a CQL column of a non-matching type (or, more specifically, a type that is not accepted by the type-checking logic in the driver). In the next commit, this will be used to insert tuples with fewer elements than the CQL type expects, to later verify that reading such tuples does not cause deserialization errors. Co-authored-by: Karol Baryła <karol.baryla@scylladb.com>
This is similar to the existing test for shorter tuples, but: 1) it tests nested tuples to ensure that the logic works recursively, 2) it uses SerializeValueWithFakeType to create the shorter tuple representation, instead of using the `ALTER TABLE` trick. Co-authored-by: Karol Baryła <karol.baryla@scylladb.com>
2579e40 to
79ac7b1
Compare
|
Added a new test for nested tuples. |
|
|
||
| pub(crate) struct SerializeValueWithFakeType<'typ, T> { | ||
| fake_type: ColumnType<'typ>, | ||
| value: T, | ||
| } | ||
|
|
||
| impl<T: SerializeValue> SerializeValue for SerializeValueWithFakeType<'_, T> { | ||
| fn serialize<'b>( | ||
| &self, | ||
| _typ: &ColumnType, | ||
| writer: scylla_cql::serialize::CellWriter<'b>, | ||
| ) -> Result<scylla::serialize::writers::WrittenCellProof<'b>, scylla::errors::SerializationError> | ||
| { | ||
| <T as SerializeValue>::serialize(&self.value, &self.fake_type, writer) | ||
| } | ||
| } | ||
|
|
||
| impl<'typ, T> SerializeValueWithFakeType<'typ, T> { | ||
| pub(crate) fn new(value: T, fake_type: ColumnType<'typ>) -> Self { | ||
| Self { fake_type, value } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted this util before I knew about the serialization mechanism that you found (driver allowing tuples shorter than in db).
Do we need this util if the driver allows shorter serialization itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The util is useful in wider set of cases than just shorter tuples, so I'd keep it anyway.
- I don't want to depend on the lenient behaviour of
CqlValue::Tuple, as I don't know if this behaviour is desirable or not. If not, we'll eventually alter it, and the tests would then break.
adespawn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a bonus, I noticed that the integration test for vector type is not run for ScyllaDB even though ScyllaDB already support vector type, so I enabled it.
While I do not have any comments on the PR, I couldn't find any changes that enable those tests.
I think those were extracted to separate PR. @wprzytula please update PR description. |
As decided in #1452, this PR makes deserialization framework interpret tuple fields that are missing in the encoded data from DB as nulls. It also includes a regression test written by @Lorak-mmk.
Partly fixes: #1452
Pre-review checklist
[ ] I have provided docstrings for the public items that I want to introduce.[ ] I have adjusted the documentation in./docs/source/.Fixes:annotations to PR description.~~