numpy strings
In NumPy 2.0 and later, string operations are primarily handled by the numpy.strings module. This module provides a comprehensive set of universal functions (ufuncs) designed to operate efficiently on arrays of type numpy.str_ or numpy.bytes_. These functions perform vectorized string operations, which improves performance when working with large datasets.
Key Features of numpy.strings:
- Element-wise Operations: Functions like add, multiply, and mod allow for element-wise string concatenation, repetition, and formatting.
- String Manipulation: Utilities such as capitalize, center, decode, encode, expandtabs, ljust, lower, lstrip, replace, rjust, rstrip, strip, swapcase, title, translate, upper, and zfill provide various string manipulation capabilities.
- Comparison Functions: Functions like equal, not_equal, greater_equal, less_equal, greater, and less enable element-wise string comparisons.
- String Information: Functions such as count, endswith, find, index, isalnum, isalpha, isdecimal, isdigit, islower, isnumeric, isspace, istitle, isupper, rfind, rindex, startswith, and str_len assist in retrieving information about string elements.
String Operations
- numpy.strings.add()
add(x1, x2, /[, out, where, casting, order, ...])
- Performs element-wise string concatenation for two arrays of strings.
- numpy.strings.center()
center(a, width[, fillchar])
- Returns a copy of each string element centered within a string of the specified width, with optional fill characters.
- numpy.strings.capitalize()
capitalize(a)
- Returns a copy of each string element with the first character capitalized and the rest lowercased.
- numpy.strings.decode()
decode(a[, encoding, errors])
- Decodes each byte-string element to a string using the specified encoding.
- numpy.strings.encode()
encode(a[, encoding, errors])
- Encodes each string element to a byte-string using the specified encoding.
- numpy.strings.expandtabs()
expandtabs(a[, tabsize])
- Replaces tab characters in each string element with spaces, using the specified tab size.
- numpy.strings.ljust()
ljust(a, width[, fillchar])
- Returns an array with each string element left-justified in a string of the given width, using optional fill characters.
- numpy.strings.lower()
lower(a)
- Returns a copy of each string element converted to lowercase.
- numpy.strings.lstrip()
lstrip(a[, chars])
- Returns a copy of each string element with leading characters removed, using an optional set of characters to strip.
- numpy.strings.mod()
mod(a, values)
- Performs %-style string formatting (the interpolation syntax that predates str.format, introduced in Python 2.6) element-wise on an array of strings.
- numpy.strings.multiply()
multiply(a, i)
- Performs element-wise string multiplication, repeating each string element i times.
- numpy.strings.partition()
partition(a, sep)
- Splits each string element into three parts around the first occurrence of the separator: the part before the separator, the separator itself, and the part after.
- numpy.strings.replace()
replace(a, old, new[, count])
- Returns a copy of each string element where occurrences of old are replaced by new, optionally limiting the number of replacements.
- numpy.strings.rjust()
rjust(a, width[, fillchar])
- Returns an array where each string element is right-justified in a field of the specified width, using optional fill characters.
- numpy.strings.rpartition()
rpartition(a, sep)
- Splits each string element into three parts: the part before the last occurrence of the separator, the separator itself, and the part after.
- numpy.strings.rstrip()
rstrip(a[, chars])
- Returns a copy of each string element with trailing characters removed, using an optional set of characters to strip.
- numpy.strings.strip()
strip(a[, chars])
- Returns a copy of each string element with leading and trailing characters removed, using an optional set of characters to strip.
- numpy.strings.swapcase()
swapcase(a)
- Returns a copy of each string element with uppercase characters converted to lowercase and vice versa.
- numpy.strings.title()
title(a)
- Returns a copy of each string element converted to title case, where the first letter of each word is capitalized.
- numpy.strings.translate()
translate(a, table[, deletechars])
- Returns a copy of each string element where characters in deletechars are removed, and remaining characters are mapped using a translation table.
- numpy.strings.upper()
upper(a)
- Returns a copy of each string element converted to uppercase.
- numpy.strings.zfill()
zfill(a, width)
- Returns a copy of each numeric string element left-filled with zeros to match the specified width.
String Comparison Functions
- numpy.equal()
equal(x1, x2, /[, out, where, casting, ...])
- Performs element-wise comparison, returning True where x1 == x2.
- numpy.not_equal()
not_equal(x1, x2, /[, out, where, casting, ...])
- Performs element-wise comparison, returning True where x1 != x2.
- numpy.greater_equal()
greater_equal(x1, x2, /[, out, where, ...])
- Performs element-wise comparison, returning True where x1 >= x2.
- numpy.less_equal()
less_equal(x1, x2, /[, out, where, casting, ...])
- Performs element-wise comparison, returning True where x1 <= x2.
- numpy.greater()
greater(x1, x2, /[, out, where, casting, ...])
- Performs element-wise comparison, returning True where x1 > x2.
- numpy.less()
less(x1, x2, /[, out, where, casting, ...])
- Performs element-wise comparison, returning True where x1 < x2.
String Information Functions
- numpy.strings.count()
count(a, sub[, start, end])
- Returns an array with the number of non-overlapping occurrences of the substring sub within each string element, optionally within the specified range [start, end).
- numpy.strings.endswith()
endswith(a, suffix[, start, end])
- Returns a boolean array where True indicates that the string element ends with the specified suffix, optionally within the range [start, end).
- numpy.strings.find()
find(a, sub[, start, end])
- Returns the lowest index in each string element where the substring sub is found, or -1 if not found, optionally searching within [start, end).
- numpy.strings.index()
index(a, sub[, start, end])
- Similar to find() but raises a ValueError if the substring is not found.
- numpy.strings.isalnum()
isalnum(x, /[, out, where, casting, order, ...])
- Returns True for each element where all characters are alphanumeric and there is at least one character; otherwise, returns False.
- numpy.strings.isalpha()
isalpha(x, /[, out, where, casting, order, ...])
- Returns True for each element where all characters are alphabetic and there is at least one character; otherwise, returns False.
- numpy.strings.isdecimal()
isdecimal(x, /[, out, where, casting, ...])
- Returns True for each element where all characters are decimal digits; otherwise, returns False.
- numpy.strings.isdigit()
isdigit(x, /[, out, where, casting, order, ...])
- Returns True for each element where all characters are digits and there is at least one character; otherwise, returns False.
- numpy.strings.islower()
islower(x, /[, out, where, casting, order, ...])
- Returns True for each element where all cased characters are lowercase and there is at least one cased character; otherwise, returns False.
- numpy.strings.isnumeric()
isnumeric(x, /[, out, where, casting, ...])
- Returns True for each element where all characters are numeric; otherwise, returns False.
- numpy.strings.isspace()
isspace(x, /[, out, where, casting, order, ...])
- Returns True for each element where all characters are whitespace and there is at least one character; otherwise, returns False.
- numpy.strings.istitle()
istitle(x, /[, out, where, casting, order, ...])
- Returns True for each element where the string is titlecased (i.e., the first letter of each word is uppercase and the rest are lowercase); otherwise, returns False.
- numpy.strings.isupper()
isupper(x, /[, out, where, casting, order, ...])
- Returns True for each element where all cased characters are uppercase and there is at least one cased character; otherwise, returns False.
- numpy.strings.rfind()
rfind(a, sub[, start, end])
- Returns the highest index in each string element where the substring sub is found, or -1 if not found, optionally searching within [start, end).
- numpy.strings.rindex()
rindex(a, sub[, start, end])
- Similar to rfind() but raises a ValueError if the substring sub is not found.
- numpy.strings.startswith()
startswith(a, prefix[, start, end])
- Returns a boolean array where True indicates that the string element starts with the specified prefix, optionally within the range [start, end).
- numpy.strings.str_len()
str_len(x, /[, out, where, casting, order, ...])
- Returns the length of each string element in the array.