`retrieval_normalized_dcg` should compute per-query average when given 2D inputs (IR-standard behavior) #3216

## 🐛 Bug
Currently, when given 2D tensors of shape `[num_queries, num_documents]`, `retrieval_normalized_dcg` flattens both `preds` and `target` and computes DCG/IDCG over the concatenated list.
This treats all queries as a single large ranking problem.

In Information Retrieval (IR) and recommender systems, the standard practice for NDCG is:  
- Compute NDCG per query
- Then take the macro average over queries  


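Flattening across queries ranks documents from different queries against each other, even though their scores are generally not comparable across queries. This changes the interpretation of the metric and can produce inflated or otherwise misleading values.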
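For reference, the per-query convention described above can be sketched in plain Python. This is a minimal illustration, not the torchmetrics implementation: the helper names `ndcg` and `mean_ndcg` are hypothetical, and the sketch assumes linear gain, a log2 rank discount, and no special handling of tied scores.

```python
import math


def ndcg(preds, target):
    """NDCG for a single query (assumes linear gain and a log2 discount)."""
    # Order the relevance labels by predicted score, highest score first.
    ranked = [t for _, t in sorted(zip(preds, target), key=lambda pt: pt[0], reverse=True)]
    # The ideal ordering places the most relevant documents first.
    ideal = sorted(target, reverse=True)
    dcg = sum(t / math.log2(rank + 2) for rank, t in enumerate(ranked))
    idcg = sum(t / math.log2(rank + 2) for rank, t in enumerate(ideal))
    # A query with no relevant documents is scored 0.0 here (an assumption).
    return dcg / idcg if idcg > 0 else 0.0


def mean_ndcg(preds_2d, target_2d):
    """Macro average: one NDCG per row (query), then the mean over rows."""
    scores = [ndcg(p, t) for p, t in zip(preds_2d, target_2d)]
    return sum(scores) / len(scores)


preds = [[0.1, 0.2, 0.3], [0.8, 0.1, 0.05]]
target = [[0, 1, 0], [1, 0, 0]]
per_query = mean_ndcg(preds, target)
print(round(per_query, 4))  # 0.8155
```

Each row is normalized by its own ideal DCG before averaging, so a query with many relevant documents does not dominate the result.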


### To Reproduce

```python
from torchmetrics.functional.retrieval import retrieval_normalized_dcg
import torch

# Query 1
p1 = retrieval_normalized_dcg(torch.tensor([0.1, 0.2, 0.3]), torch.tensor([0, 1, 0]))
print(p1)  # tensor(0.6309)

# Query 2
p2 = retrieval_normalized_dcg(torch.tensor([0.8, 0.1, 0.05]), torch.tensor([1, 0, 0]))
print(p2)  # tensor(1.0000)

print("Mean per-query NDCG:", (p1 + p2) / 2)
# tensor(0.8155)

# Batched input (2D)
p_batch = retrieval_normalized_dcg(
    torch.tensor([[0.1, 0.2, 0.3], [0.8, 0.1, 0.05]]),
    torch.tensor([[0, 1, 0], [1, 0, 0]]),
)
print("Batch NDCG:", p_batch)
# tensor(0.9197) <-- Not the mean per-query value
```

The batch value 0.9197 differs from the expected per-query average 0.8155 because the function flattens both queries into a single ranking before computing NDCG.
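The flattened value can be checked by hand, assuming linear gain and a log2 rank discount (an assumption that reproduces the numbers above). Sorting all six documents by score places the two relevant documents at ranks 1 and 3, while the ideal ordering places them at ranks 1 and 2:

$$
\mathrm{DCG} = \frac{1}{\log_2 2} + \frac{1}{\log_2 4} = 1.5, \qquad
\mathrm{IDCG} = \frac{1}{\log_2 2} + \frac{1}{\log_2 3} \approx 1.6309, \qquad
\mathrm{NDCG} = \frac{1.5}{1.6309} \approx 0.9197
$$

Query 1 scores 0.6309 on its own, but in the flattened ranking its miss is diluted by the correctly ranked document from Query 2, which pushes the combined value above the per-query mean.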

### Environment
- macOS 15.4.1 (Sequoia) on Intel MacBook Pro
- Python 3.12.6
- torch==2.2.0
- torchmetrics==1.8.1



