Overly broad robots.txt language disallow rule (/{languageCode}*) #7775

@bambuca

Description

In CommonModelFactory.cs, when generating Disallow rules for disallowed languages, the code currently contains:

sb.AppendLine($"Disallow: /{language.UniqueSeoCode}*");

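While this is syntactically valid, robots.txt rules match by path prefix and * matches any sequence of characters, so the rule applies to every path that begins with the language code and may unintentionally block unrelated paths.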


🔍 Example

Given language.UniqueSeoCode = "en", this line will block:

  • /en/
  • /en/page

(intended)

But it will also block:

  • /enduro
  • /engineer
  • /english-info

(unintended)
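
The over-matching can be reproduced with a small sketch. The Matches helper below is hypothetical (it is not part of CommonModelFactory.cs) and assumes the common wildcard semantics: * matches any run of characters, a trailing $ anchors the end of the path, and everything else is a plain prefix match.

```csharp
using System;
using System.Text.RegularExpressions;

static class RobotsPatternDemo
{
    // Hypothetical helper: translates a robots.txt path pattern into a regex.
    // '*' becomes ".*", a trailing '$' anchors the end, the rest is a literal prefix.
    static bool Matches(string pattern, string path)
    {
        bool anchored = pattern.EndsWith("$");
        string body = anchored ? pattern.Substring(0, pattern.Length - 1) : pattern;
        string regex = "^" + Regex.Escape(body).Replace("\\*", ".*") + (anchored ? "$" : "");
        return Regex.IsMatch(path, regex);
    }

    static void Main()
    {
        string rule = "/en*";
        string[] paths = { "/en", "/en/", "/en/page", "/enduro", "/engineer", "/english-info" };

        foreach (var path in paths)
        {
            // Every path here starts with "/en", so each one is reported as blocked.
            bool blocked = Matches(rule, path);
            Console.WriteLine($"{path,-15} blocked: {blocked}");
        }
    }
}
```

Under these assumptions all six paths are reported as blocked, including the three unrelated ones.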


✅ Recommendation

Replace this:

sb.AppendLine($"Disallow: /{language.UniqueSeoCode}*");

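With a more precise form:

sb.AppendLine($"Disallow: /{language.UniqueSeoCode}$");
sb.AppendLine($"Disallow: /{language.UniqueSeoCode}/");

Simply dropping the * is not enough: a bare Disallow: /en is still a prefix match and would keep matching /enduro. The trailing $ anchors the first rule to the exact path, and the second rule covers everything under the language folder.

This will block:

  • /en
  • /en/
  • /en/...

But allow unrelated paths like /engine, /enduro, etc.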
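
As a rough sketch of how the rule generation could look in isolation: the helper name AppendDisallowRules and the sample codes "en" and "de" are assumptions for illustration, not the actual CommonModelFactory.cs code. It relies on the $ end-of-pattern anchor, because a bare /en rule is a prefix match and would still cover /enduro. The anchor is defined in RFC 9309, though a crawler with an older parser may not honor it.

```csharp
using System;
using System.Text;

static class LanguageDisallowRules
{
    // Hypothetical helper, not the real factory method.
    static void AppendDisallowRules(StringBuilder sb, string seoCode)
    {
        // The final '$' is a literal character in the C# string;
        // in robots.txt it anchors the rule to the exact path "/{seoCode}".
        sb.AppendLine($"Disallow: /{seoCode}$");

        // Covers the language folder and everything below it.
        sb.AppendLine($"Disallow: /{seoCode}/");
    }

    static void Main()
    {
        var sb = new StringBuilder();
        foreach (var code in new[] { "en", "de" })
            AppendDisallowRules(sb, code);

        Console.Write(sb.ToString());
    }
}
```

Neither rule covers the language root with a query string, such as /en?page=2. If that form should be blocked too, it would need a further rule.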


💡 Benefits

  • Avoids unintended over-blocking of unrelated URLs
  • Aligns with standard path-prefix behavior in robots.txt
  • Keeps output minimal and accurate

I will prepare a PR in a while!
