From 9146d2a7ae9679adb30bdef028a39a79c1294aac Mon Sep 17 00:00:00 2001
From: Joris Van den Bossche
Date: Sat, 26 Jul 2025 11:19:21 +0200
Subject: [PATCH] Backport PR #61921: DOC: explicitly mention new str dtype is no longer a numpy dtype in migration guide

---
 doc/source/user_guide/migration-3-strings.rst | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/doc/source/user_guide/migration-3-strings.rst b/doc/source/user_guide/migration-3-strings.rst
index c415f8f43d3c8..c103b88c1db5d 100644
--- a/doc/source/user_guide/migration-3-strings.rst
+++ b/doc/source/user_guide/migration-3-strings.rst
@@ -118,12 +118,17 @@ through the ``str`` accessor will work the same:
 Overview of behavior differences and how to address them
 ---------------------------------------------------------
 
-The dtype is no longer object dtype
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The dtype is no longer a numpy "object" dtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 When inferring or reading string data, the data type of the resulting DataFrame
 column or Series will silently start being the new ``"str"`` dtype instead of
-``"object"`` dtype, and this can have some impact on your code.
+the numpy ``"object"`` dtype, and this can have some impact on your code.
+
+The new string dtype is a pandas data type ("extension dtype"), and no longer a
+numpy ``np.dtype`` instance. Therefore, passing the dtype of a string column to
+numpy functions will no longer work (e.g. passing it to a ``dtype=`` argument
+of a numpy function, or using ``np.issubdtype`` to check the dtype).
 
 Checking the dtype
 ^^^^^^^^^^^^^^^^^^
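
Note on the paragraph added by this patch: the behavior it describes can be illustrated with a minimal sketch. This is not part of the patch itself. It assumes that ``pd.StringDtype()`` is an acceptable stand-in for the new ``"str"`` dtype, so that the snippet also runs on pandas versions where ``"str"`` is not yet the default. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumption: pd.StringDtype() stands in for the new "str" dtype, so this
# sketch does not depend on the pandas 3 string inference being enabled.
ser = pd.Series(["a", "b", None], dtype=pd.StringDtype())

# The dtype is a pandas extension dtype, not a numpy np.dtype instance.
is_numpy_dtype = isinstance(ser.dtype, np.dtype)
is_extension_dtype = isinstance(ser.dtype, pd.api.extensions.ExtensionDtype)

# numpy cannot interpret the extension dtype, so np.issubdtype raises.
try:
    np.issubdtype(ser.dtype, np.object_)
    issubdtype_raised = False
except TypeError:
    issubdtype_raised = True

# A pandas-level check accepts the extension dtype without going through numpy.
is_string = pd.api.types.is_string_dtype(ser.dtype)
```

Code that previously relied on ``np.issubdtype`` for such columns would likely need a pandas-level check like the last line instead.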