String Operations

Built-in methods of str, bytes, and bytearray are provided as functions.

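Some functions accept parameters beyond those of the corresponding built-in methods, such as minsplit and fillvalue in split, or a dict-valued pattern in replace.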

All functions support function composition modes. See Function Composition for details.

List of APIs

calcpy.str.capitalize(value, /)[source]

Capitalize the first character of the string and lowercase the rest.

Parameters:

value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> capitalize("hello world")
'Hello world'
>>> capitalize(["hello", "world"])
['Hello', 'World']
>>> import pandas as pd
>>> capitalize(pd.Series(["hello", "world"]))
0    Hello
1    World
dtype: object
calcpy.str.capwords(value, /)[source]

Capitalize the first character of each word in the string.

Parameters:

value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> capwords("hello world")
'Hello World'
>>> capwords(["hello", "world"])
['Hello', 'World']
>>> import pandas as pd
>>> capwords(pd.Series(["hello", "world"]))
0    Hello
1    World
dtype: object
calcpy.str.casefold(value, /)[source]
Parameters:

value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> casefold("Hello World")
'hello world'
>>> casefold(["Hello", "World"])
['hello', 'world']
>>> import pandas as pd
>>> casefold(pd.Series(["Hello", "World"]))
0    hello
1    world
dtype: object
calcpy.str.center(value, /, width, fillchar=' ')[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • width (int) – total width of the padded string

  • fillchar (str) – padding character

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> center("Hello World", 20)
'    Hello World     '
>>> center(["Hello", "World"], 20)
['       Hello        ', '       World        ']
>>> import pandas as pd
>>> center(pd.Series(["Hello", "World"]), 20)
0      Hello
1      World
dtype: object
calcpy.str.contains(value, /, pat, case=True, flags=0, na=None, regex=True)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

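  • pat (str) – pattern to match

  • case (bool) – whether matching is case sensitive

  • flags (int) – regex flags from the re module, such as re.IGNORECASE

  • na (bool) – value to use for missing entries

  • regex (bool) – whether pat is treated as a regular expression rather than a literal string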

Return type:

bool | (list | tuple | pd.Series)[bool]

Examples

>>> contains("Hello World", "World")
True
>>> contains(["Hello", "World"], "World")
[False, True]
>>> import pandas as pd
>>> contains(pd.Series(["Hello", "World"]), "World")
0    False
1     True
dtype: bool
calcpy.str.count(value, /, sub, start=0, end=None)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • sub (str) – substring to count

  • start (int) – start index

  • end (int) – end index

Return type:

int | (list | tuple | pd.Series)

Examples

>>> count('Hello World', 'o')
2
>>> count(['Hello', 'World'], 'o')
[1, 1]
>>> import pandas as pd
>>> count(pd.Series(['Hello', 'World']), 'o')
0    1
1    1
dtype: int64
calcpy.str.decode(value, /, encoding='utf-8', errors='strict')[source]
Parameters:
  • value (bytes | bytearray | (list | tuple | pd.Series)[bytes | bytearray])

  • encoding (str) – name of the encoding used to decode the bytes

  • errors (str) – error-handling scheme, such as 'strict', 'ignore', or 'replace'

Return type:

str | list | tuple | pd.Series

Examples

>>> decode(b'Hello World', 'utf-8', 'strict')
'Hello World'
>>> decode([b'Hello', b'World'], 'utf-8', 'strict')
['Hello', 'World']
>>> import pandas as pd
>>> decode(pd.Series([b'Hello', b'World']), 'utf-8', 'strict')
0    Hello
1    World
dtype: object
calcpy.str.dedent(text)[source]

Remove any common leading whitespace from every line in text.

This can be used to make triple-quoted strings line up with the left edge of the display, while still presenting them in the source code in indented form.

Note that tabs and spaces are both treated as whitespace, but they are not equal: the lines “  hello” and “\thello” are considered to have no common leading whitespace.

Entirely blank lines are normalized to a newline character.
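
The description above matches textwrap.dedent from the Python standard library. The sketch below therefore calls textwrap.dedent directly; that calcpy.str.dedent behaves identically is an assumption and is not confirmed by this page.

```python
import textwrap

# Assumption: calcpy.str.dedent mirrors textwrap.dedent.
block = """
    first line
      nested line
    last line
"""

# The four spaces shared by every non-blank line are removed;
# the extra indentation of the nested line is preserved.
dedented = textwrap.dedent(block)
print(dedented)

# A leading tab and leading spaces have no common prefix,
# so this text comes back unchanged.
mixed = "  hello\n\thello"
unchanged = textwrap.dedent(mixed)
print(unchanged == mixed)
```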

calcpy.str.encode(value, /, encoding='utf-8', errors='strict')[source]
Parameters:
  • value (str | (list | tuple | pd.Series)[str])

  • encoding (str) – name of the encoding used to encode the string

  • errors (str) – error-handling scheme, such as 'strict', 'ignore', or 'replace'

Return type:

bytes | list | tuple | pd.Series

Examples

>>> encode('Hello World', 'utf-8', 'strict')
b'Hello World'
>>> encode(['Hello', 'World'], 'utf-8', 'strict')
[b'Hello', b'World']
>>> import pandas as pd
>>> encode(pd.Series(['Hello', 'World']), 'utf-8', 'strict')
0    b'Hello'
1    b'World'
dtype: object
calcpy.str.endswith(value, /, suffix, start=0, end=None)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • suffix (str) – suffix to test for

  • start (int) – start index

  • end (int) – end index

Return type:

bool | (list | tuple | pd.Series)[bool]

Examples

>>> endswith('Hello World', 'World')
True
>>> endswith(['Hello', 'World'], 'World')
[False, True]
>>> import pandas as pd
>>> endswith(pd.Series(['Hello', 'World']), 'World')
0    False
1     True
dtype: bool
calcpy.str.expandtabs(value, /, tabsize=8)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • tabsize (int) – number of columns between tab stops

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> expandtabs('Hello\tWorld', 4)
'Hello   World'
>>> expandtabs(['Hello\tWorld', 'Hello\tWorld'], 4)
['Hello   World', 'Hello   World']
>>> import pandas as pd
>>> expandtabs(pd.Series(['Hello\tWorld', 'Hello\tWorld']), 4)
0    Hello   World
1    Hello   World
dtype: object
calcpy.str.fill(text, width=70, **kwargs)[source]

Fill a single paragraph of text, returning a new string.

Reformat the single paragraph in ‘text’ to fit in lines of no more than ‘width’ columns, and return a new string containing the entire wrapped paragraph. As with wrap(), tabs are expanded and other whitespace characters converted to space. See TextWrapper class for available keyword args to customize wrapping behaviour.
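
The docstring above is that of textwrap.fill, so a short illustration with the standard-library function may help. Treating calcpy.str.fill as equivalent to textwrap.fill is an assumption made for this sketch.

```python
import textwrap

# Assumption: calcpy.str.fill mirrors textwrap.fill.
paragraph = "The quick brown fox jumps over the lazy dog."

# Every output line is at most `width` columns wide; lines are joined
# with newline characters into a single string.
filled = textwrap.fill(paragraph, width=20)
print(filled)
```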

calcpy.str.find(value, /, sub, start=0, end=None)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • sub (str) – substring to find

  • start (int) – start index

  • end (int) – end index

Return type:

int | list | tuple | pd.Series

Examples

>>> find('Hello World', 'World')
6
>>> find(['Hello', 'World'], 'World')
[-1, 0]
>>> import pandas as pd
>>> find(pd.Series(['Hello', 'World']), 'World')
0    -1
1     0
dtype: int64
calcpy.str.format_(value, /, *args, **kwargs)[source]

Perform a string formatting operation.

Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • *args

  • **kwargs

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> format_('Hello {0}', 'World')
'Hello World'
>>> format_(['Hello', 'World'], '{0}')
['Hello', 'World']
>>> import pandas as pd
>>> format_(pd.Series(['Hello', 'World']), '{0}')
0    Hello
1    World
dtype: object
calcpy.str.format_map(value, /, mapping)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • mapping (dict) – mapping from replacement-field names to the values substituted for them

Return type:

str | bytes | bytearray | (list | tuple | pd.Series)[str]

Examples

>>> format_map('Hello {w}', {'w': 'World'})
'Hello World'
>>> format_map(['Hello', '{w}'], {'w': 'World'})
['Hello', 'World']
>>> import pandas as pd
>>> format_map(pd.Series(['Hello', '{w}']), {'w': 'World'})
0    Hello
1    World
dtype: object
calcpy.str.index(value, /, sub, start=0, end=None)[source]
Parameters:
  • value (str | bytes | bytearray | (list | tuple | pd.Series)[str])

  • sub (str) – substring to locate

  • start (int) – start index

  • end (int) – end index

Return type:

int | list | tuple | pd.Series

Raises:

ValueError – if sub is not found

Examples

>>> index('Hello World', 'World')
6
>>> index(['Hello', 'World'], 'World')
Traceback (most recent call last):
ValueError: substring not found
>>> import pandas as pd
>>> index(pd.Series(['Hello', 'World']), 'World')
Traceback (most recent call last):
ValueError: substring not found
calcpy.str.indent(text, prefix, predicate=None)[source]

Adds ‘prefix’ to the beginning of selected lines in ‘text’.

If ‘predicate’ is provided, ‘prefix’ will only be added to the lines where ‘predicate(line)’ is True. If ‘predicate’ is not provided, it will default to adding ‘prefix’ to all non-empty lines that do not consist solely of whitespace characters.
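
The default and custom predicate behaviours described above can be seen with textwrap.indent from the standard library. That calcpy.str.indent matches it exactly is an assumption of this sketch.

```python
import textwrap

# Assumption: calcpy.str.indent mirrors textwrap.indent.
text = "alpha\n\n  \nbeta\n"

# Default predicate: empty and whitespace-only lines are left alone.
default_indented = textwrap.indent(text, "> ")

# Custom predicate: prefix every line, including blank ones.
all_indented = textwrap.indent(text, "> ", predicate=lambda line: True)

print(repr(default_indented))
print(repr(all_indented))
```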

calcpy.str.isalnum(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isalnum('Hello World')
False
>>> isalnum(['Hello', 'World'])
[True, True]
>>> import pandas as pd
>>> isalnum(pd.Series(['Hello', 'World']))
0    True
1    True
dtype: bool
calcpy.str.isalpha(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isalpha('Hello World')
False
>>> isalpha(['Hello', 'World'])
[True, True]
>>> import pandas as pd
>>> isalpha(pd.Series(['Hello', 'World']))
0    True
1    True
dtype: bool
calcpy.str.isascii(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isascii('Hello World')
True
>>> isascii(['Hello', 'World'])
[True, True]
>>> import pandas as pd
>>> isascii(pd.Series(['Hello', 'World']))
0    True
1    True
dtype: bool
calcpy.str.isdecimal(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isdecimal('Hello World')
False
>>> isdecimal(['Hello', 'World'])
[False, False]
>>> import pandas as pd
>>> isdecimal(pd.Series(['Hello', 'World']))
0    False
1    False
dtype: bool
calcpy.str.isidentifier(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isidentifier('Hello World')
False
>>> isidentifier(['Hello', 'World'])
[True, True]
>>> import pandas as pd
>>> isidentifier(pd.Series(['Hello', 'World']))
0    True
1    True
dtype: bool
calcpy.str.islower(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> islower('Hello World')
False
>>> islower(['Hello', 'World'])
[False, False]
>>> import pandas as pd
>>> islower(pd.Series(['Hello', 'World']))
0    False
1    False
dtype: bool
calcpy.str.iskeyword(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> iskeyword('Hello World')
False
>>> iskeyword(['Hello', 'World'])
[False, False]
>>> import pandas as pd
>>> iskeyword(pd.Series(['Hello', 'World']))
0    False
1    False
dtype: bool
calcpy.str.isnumeric(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isnumeric('Hello World')
False
>>> isnumeric(['Hello', 'World'])
[False, False]
>>> import pandas as pd
>>> isnumeric(pd.Series(['Hello', 'World']))
0    False
1    False
dtype: bool
calcpy.str.isprintable(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | list | tuple | pd.Series | pd.DataFrame

Examples

>>> isprintable('Hello World')
True
>>> isprintable(['Hello', 'World'])
[True, True]
>>> import pandas as pd
>>> isprintable(pd.Series(['Hello', 'World']))
0    True
1    True
dtype: bool
calcpy.str.isspace(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | pd.Series | pd.DataFrame

calcpy.str.istitle(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | pd.Series | pd.DataFrame

calcpy.str.isupper(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

bool | pd.Series | pd.DataFrame
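
isspace, istitle, and isupper are listed without examples. Since this module exposes the built-in string methods as functions, the sketch below shows the built-in methods on a few scalar inputs; that the calcpy functions return the same values for a single str is an assumption.

```python
# Assumption: on a single str, calcpy.str.isspace / istitle / isupper
# return the same result as the built-in methods used here.
samples = ["   ", "Hello World", "HELLO", "hello"]

# Map each sample to (isspace, istitle, isupper).
checks = {
    s: (s.isspace(), s.istitle(), s.isupper())
    for s in samples
}

for text, (space, title, upper) in checks.items():
    print(f"{text!r}: isspace={space}, istitle={title}, isupper={upper}")
```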

calcpy.str.join(value, /, sep='')[source]
Parameters:
  • value (list[str] | tuple[str] | pd.Series[str]) – strings to join

  • sep (str) – separator

Returns:

str

Examples

>>> join(['Hello', 'World'], ' ')
'Hello World'
>>> import pandas as pd
>>> join(pd.Series(['Hello', 'World']), ' ')
'Hello World'
calcpy.str.ljust(value, /, width, fillchar=' ')[source]
Parameters:
  • value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

  • width (int)

  • fillchar (str)

Return type:

str | pd.Series | pd.DataFrame

calcpy.str.lower(value, /)[source]
Parameters:

value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

Return type:

str | pd.Series | pd.DataFrame

calcpy.str.lstrip(value, /, chars=None)[source]
Parameters:
  • value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

  • chars (str)

Return type:

str | pd.Series | pd.DataFrame
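
ljust, lower, and lstrip have no examples above. The following sketch uses the built-in str methods they are named after; identical scalar behaviour in calcpy is assumed rather than confirmed.

```python
# Assumption: for a single str, the calcpy functions behave like the
# built-in methods of the same name.
padded = "Hello".ljust(8, "*")         # pad on the right up to width 8
lowered = "Hello World".lower()        # lowercase every cased character
stripped_ws = "  xxHello".lstrip()     # chars=None strips leading whitespace
stripped_x = "xxHello".lstrip("x")     # strip leading 'x' characters

# `chars` is a set of characters, not a prefix: every leading
# character found in "abc" is removed.
stripped_set = "abcabc_data".lstrip("abc")

print(padded, lowered, stripped_ws, stripped_x, stripped_set)
```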

calcpy.str.partition(value, /, sep, expand=True)[source]
Parameters:
  • value (str | bytes | bytearray | list | pd.Series[str]) – string to partition

  • sep (str) – separator

  • expand (bool) – whether to expand the result into separate columns. Only used when the input is a pandas object; must be False when value is a pd.DataFrame.

Return type:

tuple | list | pd.DataFrame

Examples

>>> partition('Hello World', ' ')
('Hello', ' ', 'World')
>>> partition(['Hello', 'World'], ' ')
[('Hello', '', ''), ('World', '', '')]
>>> import pandas as pd
>>> partition(pd.Series(['Hello', 'World']), 'l')
     0  1   2
0   He  l  lo
1  Wor  l   d
calcpy.str.removeprefix(value, /, prefix)[source]
Parameters:
  • value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

  • prefix (str)

Return type:

str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame

calcpy.str.removesuffix(value, /, suffix)[source]
Parameters:
  • value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

  • suffix (str)

Return type:

str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame
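
As a brief illustration of removeprefix and removesuffix, the sketch below uses the built-in str methods (available since Python 3.9) and contrasts them with lstrip. That the calcpy functions give the same results for a single str is an assumption.

```python
# Assumption: calcpy.str.removeprefix / removesuffix match the built-in
# str methods (Python 3.9+) for a single str.
name = "test_test_value"

# The prefix is removed once, and only if the string starts with it.
without_prefix = name.removeprefix("test_")

# lstrip treats its argument as a set of characters, so it keeps
# removing leading characters until one falls outside the set.
lstripped = name.lstrip("tes_")

# A suffix that is absent leaves the string unchanged.
without_suffix = "report.csv".removesuffix(".csv")
untouched = "report.csv".removesuffix(".txt")

print(without_prefix, lstripped, without_suffix, untouched)
```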

calcpy.str.replace(value, pattern, new=None, /, count=inf)[source]
Parameters:
  • value (str | list[str] | tuple[str] | pd.Series[str]) – string to replace

  • pattern (str | dict) – old string, or a dict from old string to new string

  • new (str | None) – new string if pattern is the old string

  • count (int | inf) – maximum number of replacements. Default value (inf) means do not limit the number of replacements. 0 means disabling replacements.

Returns:

The string with replacements applied.

Notes

The parameters differ from those of both the built-in str.replace method and the pd.Series.str.replace method.

Examples

>>> replace('Hello World', 'World', 'Earth')
'Hello Earth'
>>> replace(['Hello', 'World'], 'World', 'Earth')
['Hello', 'Earth']
>>> import pandas as pd
>>> replace(pd.Series(['Hello', 'World']), 'World', 'Earth')
0    Hello
1    Earth
dtype: object
>>> replace('aaaa', {'a': 'b'}, count=0)
'aaaa'
>>> replace('aaaa', {'a': 'b'}, count=2)
'bbaa'
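
To make the dict and count semantics concrete, here is a minimal sketch of one way such a replacement could work in plain Python. The helper name replace_many is hypothetical, and the choice to apply count to the total number of replacements (rather than per key) is an assumption that merely reproduces the two outputs shown above; it is not calcpy's implementation.

```python
import math
import re


def replace_many(text, mapping, count=math.inf):
    """Hypothetical helper: replace literal keys of `mapping` in one pass.

    At most `count` replacements are made in total; count=0 disables
    replacement and math.inf means no limit (both are assumptions).
    """
    if count == 0 or not mapping:
        return text
    # Try longer keys first so overlapping keys prefer the longest match.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # re.sub interprets count=0 as "no limit".
    limit = 0 if count == math.inf else int(count)
    return pattern.sub(lambda m: mapping[m.group(0)], text, count=limit)


print(replace_many("aaaa", {"a": "b"}, count=2))
print(replace_many("ab", {"a": "b", "b": "a"}))
```

Because all keys are matched in a single pass, a replacement is never re-replaced by another key, which is why swapping "a" and "b" works in the last call.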
calcpy.str.rfind(value, /, sub, start=0, end=None)[source]
Parameters:
  • value (str | list[str] | tuple[str] | pd.Series[str]) – string to find

  • sub (str) – substring to find

  • start (int) – start index

  • end (int) – end index

Returns:

Found indices.

Examples

>>> rfind('Hello World', 'World')
6
>>> rfind(['Hello', 'World'], 'World')
[-1, 0]
>>> import pandas as pd
>>> rfind(pd.Series(['Hello', 'World']), 'World')
0    -1
1     0
dtype: int64
calcpy.str.rindex(value, /, sub, start=0, end=None)[source]
Parameters:
  • value (str | bytes | bytearray | list | tuple | pd.Series | pd.DataFrame)

  • sub (str) – substring to locate

  • start (int) – start index

  • end (int) – end index

Returns:

indices

Raises:

ValueError – if sub is not found

Examples

>>> rindex('Hello World', 'World')
6
>>> rindex(['Hello', 'World'], 'World')
Traceback (most recent call last):
ValueError: substring not found
>>> import pandas as pd
>>> rindex(pd.Series(['Hello', 'World']), 'World')
Traceback (most recent call last):
ValueError: substring not found
calcpy.str.rjust(value, /, width, fillchar=' ')[source]
calcpy.str.rpartition(s, /, sep)[source]
calcpy.str.rsplit(value, /, sep=None, maxsplit=-1, minsplit=0, fillvalue=None, expand=False)[source]

Split string by separator.

Parameters:
  • value (str | pd.Series) – string to split

  • sep (str, optional) – separator. By default split on whitespace

  • maxsplit (int) – maximum number of splits

  • minsplit (int) – minimum number of splits

  • fillvalue (Optional) – fill value if not enough splits

  • expand (bool) – expand the resulting pd.Series into a pd.DataFrame. Can be True only when the input is a pd.Series

Returns:

Split strings

Examples

>>> rsplit('abc def ghi')
['abc', 'def', 'ghi']
>>> rsplit('abc def ghi', ' ', maxsplit=1)
['abc def', 'ghi']
>>> rsplit('abc def ghi', ' ', minsplit=2)
['abc', 'def', 'ghi']
>>> rsplit('abc def ghi', ' ', minsplit=4, fillvalue="")
['abc', 'def', 'ghi', '']
>>> rsplit(pd.Series(['abc def', 'ABC']), ' ', minsplit=3, fillvalue="", expand=True)
     0    1 2
0  abc  def
1  ABC
calcpy.str.shorten(text, width, **kwargs)[source]

Collapse and truncate the given text to fit in the given width.

The text first has its whitespace collapsed. If it then fits in the width, it is returned as is. Otherwise, as many words as possible are joined and then the placeholder is appended:

>>> textwrap.shorten("Hello  world!", width=12)
'Hello world!'
>>> textwrap.shorten("Hello  world!", width=11)
'Hello [...]'
calcpy.str.split(value, /, sep=None, maxsplit=-1, minsplit=0, fillvalue=None, expand=False)[source]

Split string by separator.

Parameters:
  • value (str | pd.Series) – string to split

  • sep (str, optional) – separator. By default split on whitespace

  • maxsplit (int) – maximum number of splits

  • minsplit (int) – minimum number of splits

  • fillvalue (Optional) – fill value if not enough splits

  • expand (bool) – expand the resulting pd.Series into a pd.DataFrame. Can be True only when the input is a pd.Series

Returns:

Split strings

Examples

>>> split('abc def ghi')
['abc', 'def', 'ghi']
>>> split('abc def ghi', ' ', maxsplit=1)
['abc', 'def ghi']
>>> split('abc def ghi', ' ', minsplit=2)
['abc', 'def', 'ghi']
>>> split('abc def ghi', ' ', minsplit=4, fillvalue="")
['abc', 'def', 'ghi', '']
>>> split(pd.Series(['abc def', 'ABC']), ' ', minsplit=3, fillvalue="", expand=True)
     0    1 2
0  abc  def
1  ABC
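
The minsplit and fillvalue parameters have no counterpart in the built-in str.split. The sketch below shows one plausible reading for scalar input: the result is padded with fillvalue until it has at least minsplit items, which is what the examples above suggest. The helper name split_padded and this interpretation are assumptions, not calcpy's actual code.

```python
def split_padded(text, sep=None, maxsplit=-1, minsplit=0, fillvalue=None):
    """Hypothetical helper: str.split, then pad the result on the right."""
    parts = text.split(sep, maxsplit)
    # A non-positive repeat count yields an empty list, so nothing is
    # added when the split already produced enough items.
    parts.extend([fillvalue] * (minsplit - len(parts)))
    return parts


print(split_padded("abc def ghi", " ", minsplit=4, fillvalue=""))
print(split_padded("abc def ghi", " ", maxsplit=1))
```

Padding to a fixed length is what makes tuple unpacking or column expansion safe when some inputs contain fewer separators than others.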
calcpy.str.splitlines(value, /, keepends=False)[source]
calcpy.str.startswith(value, /, prefix, start=0, end=None)[source]
calcpy.str.strip(value, /, chars=None)[source]
calcpy.str.sub(value, pattern, new=None, count=inf, flags=0)[source]

Replace using regex.

Parameters:
  • value (str | list[str] | tuple[str] | pd.Series[str]) – string to replace

  • pattern (str | dict) – old pattern, or a dict from old string to new string

  • new (str | None) – new string if pattern is the old string

  • count (int | inf) – maximum number of replacements. By default, there is no limit on the number of replacements. 0 means disabling replacements.

  • flags (re.RegexFlag)

Returns:

The string with replacements applied.

Notes

The parameters differ from those of the standard-library re.sub function.

Examples

>>> sub('Hello World', 'World', 'Earth')
'Hello Earth'
>>> sub(['Hello', 'World'], 'World', 'Earth')
['Hello', 'Earth']
>>> import pandas as pd
>>> sub(pd.Series(['Hello', 'World']), 'World', 'Earth')
0    Hello
1    Earth
dtype: object
>>> sub('aaaa', {'a': 'b'}, count=0)
'aaaa'
>>> sub('aaaa', {'a': 'b'}, count=2)
'bbaa'
calcpy.str.swapcase(value, /)[source]
calcpy.str.title(value, /)[source]
calcpy.str.translate(value, /, table)[source]
calcpy.str.upper(value, /)[source]
calcpy.str.wrap(text, width=70, **kwargs)[source]

Wrap a single paragraph of text, returning a list of wrapped lines.

Reformat the single paragraph in ‘text’ so it fits in lines of no more than ‘width’ columns, and return a list of wrapped lines. By default, tabs in ‘text’ are expanded with string.expandtabs(), and all other whitespace characters (including newline) are converted to space. See TextWrapper class for available keyword args to customize wrapping behaviour.
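
The docstring above is that of textwrap.wrap. The sketch below uses the standard-library function and also shows its relationship to fill; that calcpy.str.wrap is equivalent is an assumption.

```python
import textwrap

# Assumption: calcpy.str.wrap mirrors textwrap.wrap.
paragraph = "The quick brown fox\njumps over the lazy dog."

# The embedded newline is converted to a space before wrapping,
# so the text is treated as a single paragraph.
lines = textwrap.wrap(paragraph, width=20)
print(lines)

# fill() is the same operation with the lines joined by newlines.
joined = "\n".join(lines)
print(joined == textwrap.fill(paragraph, width=20))
```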

calcpy.str.zfill(value, /, width)[source]