Description
Expected behavior
We are archiving a website that has had a few incarnations: the older archives have pages with a .html extension, and the newer archives have URLs ending in /.
I want any pages whose URL paths are identical except for the file extension (like below) to be treated as if they are the same URL.
What actually happened
If I navigate to https://mydomain.com/segment-1/segment-2.html, I am only able to see dates where this exact page URL was archived, and not pages at https://mydomain.com/segment-1/segment-2/ that were archived at a later date.
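To make the intended equivalence concrete, here is a minimal Python sketch of the lookup key I would like both URL forms to share. It is not pywb code, and `canonical_path` is a hypothetical helper used only for illustration.

```python
from urllib.parse import urlsplit


def canonical_path(url: str) -> str:
    """Hypothetical helper: reduce '.html' and trailing-slash variants to one key."""
    path = urlsplit(url).path
    # Strip a trailing '.html' extension, if present.
    if path.endswith(".html"):
        path = path[: -len(".html")]
    # Drop trailing slashes so '/a/b/' and '/a/b' compare equal.
    path = path.rstrip("/")
    # Keep the site root as '/'.
    return path or "/"


old_style = canonical_path("https://mydomain.com/segment-1/segment-2.html")
new_style = canonical_path("https://mydomain.com/segment-1/segment-2/")
assert old_style == new_style
```

With a key like this, captures of both URL forms would show up on the same calendar of dates.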
Things I have tried
I have tried many variations of filtering and fuzzy matching in config.yaml. I also added a rules.yaml file, but it was ignored.
default_filters:
  url_normalize:
    - match: '.html$'
      replace: '/'
rules:
  - url_prefix: 'com,mydomain)/'
    rewrite:
    fuzzy_lookup:
      - match: '.html'
        replace: '/'
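A side note on the patterns above, assuming the `match` values are interpreted as regular expressions (I have not confirmed how, or whether, pywb reads these keys): an unescaped `.` matches any character, so `.html$` is looser than intended. The short Python sketch below uses only the standard `re` module to show the difference. The `normalize` helper is hypothetical.

```python
import re

# Pattern as written in the config: '.' matches any single character.
unescaped = re.compile(r".html$")
# Escaped form: matches only a literal '.html' at the end of the path.
escaped = re.compile(r"\.html$")


def normalize(pattern: re.Pattern, path: str) -> str:
    """Hypothetical helper: replace a trailing match with '/'."""
    return pattern.sub("/", path)


# Both patterns rewrite a real '.html' page the same way.
assert normalize(escaped, "/segment-1/segment-2.html") == "/segment-1/segment-2/"
assert normalize(unescaped, "/segment-1/segment-2.html") == "/segment-1/segment-2/"

# Only the unescaped pattern also rewrites a path that merely ends in 'xhtml'.
assert normalize(escaped, "/docs/xhtml") == "/docs/xhtml"
assert normalize(unescaped, "/docs/xhtml") == "/docs//"
```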
It would be good to know if I'm in the right ballpark with how to resolve my issue.
Browser
I am running Chrome on Ubuntu, but I have also tested on Firefox.