
BUG FIX: Using Series.str.fullmatch() and Series.str.match() with a compiled regex fails with arrow strings #61964


Open: 4 commits to be merged into base: main


khemkaran10
Contributor

@khemkaran10 khemkaran10 commented Jul 26, 2025

Fixes: #61952
After the fix:

import re
import pandas as pd
DATA = ["applep", "bananap", "Cherryp", "DATEp", "eGGpLANTp", "123p", "23.45p"]
s = pd.Series(DATA)
# fullmatch() with a compiled pattern on a default-dtype Series
s.str.fullmatch(re.compile(r"applep"))

Output:
0     True
1    False
2    False
3    False
4    False
5    False
6    False
dtype: bool

DATA = ["applep", "bananap", "Cherryp", "DATEp", "eGGpLANTp", "123p", "23.45p"]
sa = pd.Series(DATA, dtype="string[pyarrow]")
# match() with a compiled pattern on an Arrow-backed string Series
sa.str.match(re.compile(r"applep"))

Output:
0     True
1    False
2    False
3    False
4    False
5    False
6    False
dtype: boolean
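
For reference, the expected values above can be cross-checked with the standard-library `re` module alone. This is only a sketch of the semantics (start-anchored `match` versus whole-string `fullmatch`), not the pandas or pyarrow code path; the names `match_result`, `fullmatch_result`, and `prefix_only` are made up for illustration.

```python
import re

DATA = ["applep", "bananap", "Cherryp", "DATEp", "eGGpLANTp", "123p", "23.45p"]
pattern = re.compile(r"applep")

# match() succeeds when the pattern matches at the start of the string.
match_result = [pattern.match(value) is not None for value in DATA]

# fullmatch() succeeds only when the pattern matches the entire string.
fullmatch_result = [pattern.fullmatch(value) is not None for value in DATA]

print(match_result)
print(fullmatch_result)

# The two differ when the pattern covers only a prefix of the string.
prefix_only = re.compile(r"apple")
print(prefix_only.match("applep") is not None)      # prefix matches
print(prefix_only.fullmatch("applep") is not None)  # trailing "p" is left over
```

With this particular pattern and data, both lists have `True` only in the first position, which agrees with the two outputs shown above.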

@jorisvandenbossche jorisvandenbossche added this to the 2.3.2 milestone Jul 26, 2025
@jorisvandenbossche jorisvandenbossche added the "Strings" (String extension data type and string data) and "Arrow" (pyarrow functionality) labels Jul 26, 2025
Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks!

It seems we don't actually document that a compiled regular expression is supported, although it works in practice: the non-arrow version passes pat to re.compile(), which also accepts an already-compiled pattern.
So it would be good to update the documentation and typing to reflect that a compiled pattern is also supported.
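
A minimal illustration of the `re.compile()` behaviour mentioned above, using only the standard library. The alias `PatternLike` is hypothetical and only hints at what a widened type annotation could look like; it is not an existing pandas name.

```python
import re
from typing import Union

# Hypothetical alias for "a pattern string or a compiled pattern".
PatternLike = Union[str, re.Pattern]

compiled = re.compile(r"applep")

# Passing an already-compiled pattern to re.compile() hands it back unchanged,
# provided no flags argument is supplied.
same = re.compile(compiled)
print(same is compiled)

# Combining a compiled pattern with an explicit flags argument is rejected.
try:
    re.compile(compiled, re.IGNORECASE)
    flags_rejected = False
except ValueError:
    flags_rejected = True
print(flags_rejected)
```

The second case suggests that any documentation update may also want to say how `case` and `flags` interact with a compiled pattern.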

@khemkaran10
Contributor Author

@jorisvandenbossche Moved the tests to pandas/tests/strings/test_find_replace.py and made a minor change to the docstring. I'm not sure what changes need to be made in the docs. Could you please provide more details?

Labels: Arrow (pyarrow functionality), Strings (String extension data type and string data)
Development

Successfully merging this pull request may close these issues.

BUG: Using Series.str.fullmatch() and Series.str.match() with a compiled regex fails with arrow strings
2 participants