Skip to content

Commit cb8a461

Browse files
authored
[BUG] Default create path with no config or schema does not populate default ef in schema (#5726)
## Description of changes _Summarize the changes made by this PR._ - Improvements & Bug fixes - There was a bug in the create collection path, where if the user provides no configuration and no schema at create time, the default schema (which has no ef) gets written to sysdb. this got hidden in the tests since on deserialization, schema builds with the default ef, so the test thought it correctly was assigned the default ef. the fix for this is to check if both config and schema are default during reconciliation, and in that case use the config to build the schema. - New functionality - ... ## Test plan updated e2e test to catch this bug. validated the test caught the bug, then fixed it _How are these changes tested?_ - [ x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Migration plan _Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?_ ## Observability plan _What is the plan to instrument and monitor this change?_ ## Documentation Changes _Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
1 parent bf02ad7 commit cb8a461

File tree

2 files changed

+18
-3
lines changed

2 files changed

+18
-3
lines changed

chromadb/test/api/test_schema_e2e.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1458,6 +1458,12 @@ def test_default_embedding_function_when_no_schema_provided(
14581458
# Verify it's the DefaultEmbeddingFunction, not legacy
14591459
assert ef.name() == "default"
14601460

1461+
config = collection.configuration
1462+
assert config is not None
1463+
config_ef = config.get("embedding_function")
1464+
assert config_ef is not None
1465+
assert config_ef.name() == "default"
1466+
14611467
# Serialize the schema to JSON and verify the embedding function type
14621468
json_data = schema.serialize_to_json()
14631469
embedding_vector = json_data["keys"]["#embedding"]["float_list"]["vector_index"]

rust/types/src/collection_schema.rs

Lines changed: 12 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1241,8 +1241,17 @@ impl Schema {
12411241
) -> Result<Schema, SchemaError> {
12421242
// 1. Check if collection config is default
12431243
if collection_config.is_default() {
1244-
// Collection config is default → schema is source of truth
1245-
return Ok(schema.clone());
1244+
if schema.is_default() {
1245+
// if both are default, use collection config to create schema
1246+
// this handles the case where user did not provide schema or config.
1247+
// since default schema doesnt have an ef, we need to use the coll config to create
1248+
// a schema with the ef.
1249+
let new_schema = Self::convert_collection_config_to_schema(collection_config)?;
1250+
return Ok(new_schema);
1251+
} else {
1252+
// Collection config is default and schema is non-default → schema is source of truth
1253+
return Ok(schema.clone());
1254+
}
12461255
}
12471256

12481257
// 2. Collection config is non-default, schema must be default (already validated earlier)
@@ -3346,8 +3355,8 @@ mod tests {
33463355
#[test]
33473356
fn test_reconcile_with_collection_config_default_config() {
33483357
// Test that when collection config is default, schema is returned as-is
3349-
let schema = Schema::new_default(KnnIndex::Hnsw);
33503358
let collection_config = InternalCollectionConfiguration::default_hnsw();
3359+
let schema = Schema::convert_collection_config_to_schema(&collection_config).unwrap();
33513360

33523361
let result = Schema::reconcile_with_collection_config(&schema, &collection_config).unwrap();
33533362
assert_eq!(result, schema);

0 commit comments

Comments
 (0)