Description
I'm working on a Flask app that does some markup parsing. Among other things, it parses short English strings such as "Arriving tomorrow by 9pm" or "Delivered on Friday". Today I bumped dateparser from 0.7.6 to 1.0.0, and this is what I saw in the latency distribution (p50, p95, p99) of the function that calls search_dates (function abridged):
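import re
from dateparser.search import search_dates

STATUS_TEXT_DELIVERED = re.compile(r"delivered", re.IGNORECASE)
# Prefer past dates for "delivered" statuses, future dates otherwise.
settings = {
    "PREFER_DATES_FROM": "past"
    if bool(STATUS_TEXT_DELIVERED.search(text))
    else "future",
}
search_results = search_dates(text, languages=["en"], settings=settings)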
What strikes me most is the huge latency spikes when the app restarts on deploy, and how long it takes for latency to settle afterwards. This function is currently called around 20 times per minute, but we expect that to grow to at least 400 calls per minute. On the screenshot you can see three deploys (red stripes).
I have very limited insight into the performance instrumentation you use, so what would be the easiest way to pinpoint what the search is doing right after a cold start? And why does it take so long to reach a steady state?
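One way to separate the cold-start cost from steady-state latency might be to time the first call in a fresh process against later calls, and to run that first call under the standard-library cProfile. The following is only a minimal sketch: `time_calls`, `profile_call`, and `fake_parse` are hypothetical names, and `fake_parse` is a stand-in that simulates a one-off lazy load, not dateparser's actual behaviour.

```python
import cProfile
import io
import pstats
import time


def time_calls(fn, inputs):
    """Return the wall-clock duration in seconds of fn(item) for each input, in order."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        durations.append(time.perf_counter() - start)
    return durations


def profile_call(fn, *args, top=15, **kwargs):
    """Run fn once under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Hypothetical stand-in for the real parser: it pays a setup cost on first use only.
_cache = {}


def fake_parse(text):
    if "table" not in _cache:
        # Simulated lazy initialisation (assumption, for illustration only).
        _cache["table"] = {n: str(n) for n in range(200_000)}
    return text.lower()
```

To apply this to the abridged function above, one would pass a callable that wraps the real `search_dates` call instead of `fake_parse`, run it in a freshly started process, and compare the first entry returned by `time_calls` with the rest. The report from `profile_call` on the very first call should then show where the cold-start time goes.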