What is a String?
String in Python is-
Examples of String
Create String
Use the “str” function to create a new string. If no value is passed to the “str” function, an empty string is created-
empty_str = str()
print("empty_str value: ", empty_str)
print("empty_str type: ", type(empty_str))
first_str = str("big box code")
print("first_str value: ", first_str)
print("first_str type: ", type(first_str))
Output:
empty_str value:
empty_str type: <class 'str'>
first_str value: big box code
first_str type: <class 'str'>
We can wrap a sequence of characters in single or double quotes to define a string literal.
# Define a string
big_box_str = "Site name BigBoxCode"
# Check type and other properties
print("Type: ", type(big_box_str))
print("Length: ", len(big_box_str))
print(dir(big_box_str))
Output:
Type: <class 'str'>
Length: 20
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
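As a small illustration of the point about quotes, the sketch below shows that single and double quotes produce equal string objects, and that choosing the other quote style avoids escaping a quote inside the text. The variable names are made up for this example.

```python
# Variable names here are illustrative, not from the examples above.
single_quoted = 'BigBoxCode'
double_quoted = "BigBoxCode"

# Both literals create equal str objects
print(single_quoted == double_quoted)  # True

# Use double quotes outside when the text contains a single quote
with_apostrophe = "It's a string"
print(with_apostrophe)       # It's a string
print(len(with_apostrophe))  # 13
```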
Multi-line String Literal
We can write multi-line string literals in several ways-
Wrap the string in three single quotes (''') or three double quotes ("""). In this case, every newline inside the string is kept as it is-
multiline_str = """The
quick brown fox
jumps over the
lazy dog"""
print(multiline_str)
print(type(multiline_str))
another_multiline_str = '''Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Morbi gravida sapien at lorem pharetra, ut eleifend nisl semper.
Vestibulum sed elit eu purus lobortis posuere.
Etiam erat lacus, rutrum ut sapien eu, mattis laoreet nulla.
Curabitur aliquam, leo eget ullamcorper mattis, arcu metus condimentum neque, vitae feugiat nisi ligula a nisl.
Nunc consectetur diam sem, ut vehicula elit tempor eu. Fusce in feugiat ipsum, in pellentesque nulla.
Suspendisse nec nulla a elit sodales blandit eget posuere odio.
Phasellus a elementum urna. Nunc suscipit ex fermentum ullamcorper hendrerit.
Mauris mattis ligula diam. Phasellus malesuada facilisis imperdiet.
Curabitur leo odio, dapibus vel gravida vel, accumsan nec lacus. Etiam et malesuada velit, id maximus nibh.'''
print(another_multiline_str)
print(type(another_multiline_str))
Output:
The
quick brown fox
jumps over the
lazy dog
<class 'str'>
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Morbi gravida sapien at lorem pharetra, ut eleifend nisl semper.
Vestibulum sed elit eu purus lobortis posuere.
Etiam erat lacus, rutrum ut sapien eu, mattis laoreet nulla.
Curabitur aliquam, leo eget ullamcorper mattis, arcu metus condimentum neque, vitae feugiat nisi ligula a nisl.
Nunc consectetur diam sem, ut vehicula elit tempor eu. Fusce in feugiat ipsum, in pellentesque nulla.
Suspendisse nec nulla a elit sodales blandit eget posuere odio.
Phasellus a elementum urna. Nunc suscipit ex fermentum ullamcorper hendrerit.
Mauris mattis ligula diam. Phasellus malesuada facilisis imperdiet.
Curabitur leo odio, dapibus vel gravida vel, accumsan nec lacus. Etiam et malesuada velit, id maximus nibh.
<class 'str'>
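To make the "newlines are kept" behavior visible, here is a minimal sketch with a shorter triple-quoted string. The name `poem` is only an assumption for this example; `repr`, `count` and `splitlines` are standard Python.

```python
# A short triple-quoted string; the name is illustrative.
poem = """The
quick brown fox"""

# The line break typed in the source is stored as a "\n" character
print(repr(poem))         # 'The\nquick brown fox'
print(poem.count("\n"))   # 1
print(poem.splitlines())  # ['The', 'quick brown fox']
```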
Use a plus(+) sign to concatenate multiple string literals, and add a backslash(\) at the end of each line to continue the statement on the next line-
multi_line_str = "Nunc non interdum augue. Suspendisse sed facilisis sapien." + \
"Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget," + \
"ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu," + \
"vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit." + \
"Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi."
print(multi_line_str)
Output:
Nunc non interdum augue. Suspendisse sed facilisis sapien.Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit.Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi.
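NOTES
Make sure that there is no space after the backslash(\) at the end of the line.
Wrap the string literals in parentheses. Adjacent string literals are joined automatically, so no plus(+) sign or backslash is needed-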
multi_line_str = (
"Nunc non interdum augue. Suspendisse sed facilisis sapien."
"Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,"
"ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,"
"vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit."
"Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi."
)
print(multi_line_str)
Output:
Nunc non interdum augue. Suspendisse sed facilisis sapien.Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit.Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi.
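Notice in the output that the sentences run together, because nothing is added between the pieces when they are joined. The short sketch below, using made-up text, shows one way to keep the sentences apart: put a space at the end of each piece.

```python
# Without a trailing space, the sentences run together
joined_tight = (
    "First sentence."
    "Second sentence."
)

# A space at the end of each piece keeps the sentences apart
joined_spaced = (
    "First sentence. "
    "Second sentence."
)

print(joined_tight)   # First sentence.Second sentence.
print(joined_spaced)  # First sentence. Second sentence.
```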
String Escape
Certain characters have a special meaning or behavior inside a string literal. We write them as escape sequences, which start with a backslash(\).
Here are the escape sequences-
Escape Character | Description |
---|---|
\\ | Backslash(\) |
\' | Single quote(') |
\" | Double quote(") |
\b | Backspace |
\n | Linefeed |
\r | Carriage return |
\t | Tab |
\ooo | Character with octal value ooo |
\uxxxx | Unicode character with 16-bit hex value xxxx |
\Uxxxxxxxx | Unicode character with 32-bit hex value xxxxxxxx |
\v | Vertical tab |
Here are some examples of the escape characters-
escape_str_test_1 = 'Single quote inside \'string literal\' with single quote'
escape_str_test_2 = "Double quote inside \"string literal\" with double quote"
escape_str_test_3 = 'String literal with a new line here\nand this part is after new line'
escape_str_test_4 = 'String with tab \t and this part is after tab'
print(escape_str_test_1)
print(escape_str_test_2)
print(escape_str_test_3)
print(escape_str_test_4)
triple_quote_str = """first line \n part after new line
Next 1
Next 2 \n another new line
next 3
"""
print(triple_quote_str)
Output:
Single quote inside 'string literal' with single quote
Double quote inside "string literal" with double quote
String literal with a new line here
and this part is after new line
String with tab and this part is after tab
first line
part after new line
Next 1
Next 2
another new line
next 3
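The table also lists the backslash, octal and Unicode escapes, which the examples above do not use. The following sketch shows them under the assumption that the letter "A" (code point 65) is a convenient test character; the path string is made up.

```python
# Each escape sequence below stands for exactly one character
print(len("\n"), len("\t"), len("\\"))  # 1 1 1

# Octal, 16-bit hex and 32-bit hex escapes for the letter "A" (code point 65)
octal_a = "\101"
hex16_a = "\u0041"
hex32_a = "\U00000041"
print(octal_a, hex16_a, hex32_a)  # A A A

# An escaped backslash is stored and printed as a single backslash
windows_path = "C:\\temp\\notes.txt"
print(windows_path)  # C:\temp\notes.txt
```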
String Reassignment Internals
Let’s check what happens when we reassign some new value to a variable-
import ctypes
big_box_str = "big box code"
first_id = id(big_box_str)
print("Initial id: ", first_id)
big_box_str = "Changed new string"
print("ID after change: ", id(big_box_str))
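# Read the object back from its old address
# (CPython-specific; only safe while the old object is still alive)
obj = ctypes.cast(first_id, ctypes.py_object)
print("Old string object: ", obj)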
Output:
Initial id: 140117350899824
ID after change: 140117351620384
Old string object: py_object('big box code')
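NOTES
String objects are immutable. Once created, a string object cannot be changed.
If a string variable is reassigned, a new string object is created and the variable is bound to the new object. The old object stays in memory only as long as something still refers to it. In this example the old literal is still alive because the compiled code keeps it as a constant.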
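Immutability can also be checked directly. The sketch below (variable names are only for illustration) tries to change one character in place, which Python rejects, and then shows that a string method returns a new object instead of changing the original.

```python
site = "big box code"

# Item assignment is not allowed on a str
try:
    site[0] = "B"
except TypeError as err:
    print("TypeError:", err)

# String methods return a new object and leave the original untouched
upper_site = site.upper()
print(site)        # big box code
print(upper_site)  # BIG BOX CODE
```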
Unicode Character Representation
We can represent Unicode characters in several ways.
Python source files support Unicode, so we can write the Unicode character directly in the string literal.
Or we can write the 4-digit hexadecimal code of the character, with a “\u” prefix, to represent the character.
uc_latin = "The Latin character: ɸ, unicode representation: \u0278"
print(uc_latin)
kata_latin = "The katakana character: ガ, unicode representation: \u30AC"
print(kata_latin)
uc_devanagri = "The Devanagari character: ॐ, unicode representation: \u0950"
print(uc_devanagri)
uc_bangla = "The Bangla character: ক, unicode representation: \u0995"
print(uc_bangla)
Output:
The Latin character: ɸ, unicode representation: ɸ
The katakana character: ガ, unicode representation: ガ
The Devanagari character: ॐ, unicode representation: ॐ
The Bangla character: ক, unicode representation: ক
We can also represent a Unicode character with an 8-digit hexadecimal code, which is a 32-bit number. We need to prefix that code with “\U”-
uc_latin = "The Latin character: ɸ, unicode representation: \U00000278"
print(uc_latin)
Output:
The Latin character: ɸ, unicode representation: ɸ
We can use the “chr” function to get the Unicode character from its code. Pass the decimal or hex code to the “chr” function and it will return the character-
uc_example = "Unicode copyright character: ©"
uc_hex_example = "Representation in hex:" + chr(0x00A9)
uc_dec_example = "Representation in decimal:" + chr(169)
print(uc_example)
print(uc_hex_example)
print(uc_dec_example)
Output:
Unicode copyright character: ©
Representation in hex:©
Representation in decimal:©
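Going the other way is also possible. The built-in “ord” function is not covered above, but it is the counterpart of “chr”: it takes a single character and returns its code. A minimal sketch:

```python
# ord() is the inverse of chr(): character -> code
code_point = ord("©")
print(code_point)       # 169
print(hex(code_point))  # 0xa9
print(chr(code_point))  # ©

# Round trip for the Bangla character used earlier (\u0995)
bangla_code = ord("ক")
print(hex(bangla_code))          # 0x995
print(chr(bangla_code) == "ক")  # True
```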
String Concatenation
Use the “+” (plus) sign to concatenate one string with another.
full_str = "First part" + "second part" + "last part"
print(full_str)
site_name = "BigBoxCode"
site_url = "https://bigboxcode.com"
print("Name of the site is \"" + site_name + "\", and the URL is " + site_url)
Output:
First partsecond partlast part
Name of the site is "BigBoxCode", and the URL is https://bigboxcode.com
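The “+” sign only works between two strings, so other values have to be converted first. The sketch below compares “+” with two common alternatives, an f-string and the “join” method. The `visit_count` value is hypothetical and exists only for this illustration.

```python
site_name = "BigBoxCode"
visit_count = 3  # hypothetical number, only for illustration

# "+" works only between strings; convert other types with str() first
message_plus = site_name + " visited " + str(visit_count) + " times"

# An f-string formats the values in place
message_fstring = f"{site_name} visited {visit_count} times"

# str.join() puts a separator between the items of a sequence
message_join = " ".join(["First part", "second part", "last part"])

print(message_plus)     # BigBoxCode visited 3 times
print(message_fstring)  # BigBoxCode visited 3 times
print(message_join)     # First part second part last part
```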
Indexing in String
Characters in a string are indexed. A positive index counts from the left, and a negative index counts from the right using negative numbers.
Positive Indexing
A positive index starts from the beginning of the string. Indexing starts from zero(0) and then increases as it goes to the right.
We can directly access the string elements by using the index-
my_str = "Big Box Code"
print(my_str[0])
print(my_str[5])
print(my_str[11])
Output:
B
o
e
Negative Indexing
Negative indexing starts from the right(the last character of the string) and progresses backward. Negative indexing starts from negative one(-1).
We can also access the characters by using the negative index.
my_str = "Big Box Code"
print(my_str[-1])
print(my_str[-7])
print(my_str[-12])
Output:
e
o
B
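The two kinds of index are related: a negative index -k points at the same character as the positive index (length of the string - k). The sketch below checks that relation and also shows what happens when the index is out of range.

```python
my_str = "Big Box Code"
length = len(my_str)  # 12

# A negative index -k refers to the same character as index length - k
print(my_str[-1] == my_str[length - 1])  # True
print(my_str[-7] == my_str[length - 7])  # True

# An index outside the valid range raises IndexError
try:
    my_str[12]
except IndexError as err:
    print("IndexError:", err)
```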
Loop through String Elements
Case #1: Using for Loop
We can use a for loop and traverse through each element of a string.
my_str = "Big Box Code"
for c in my_str:
    print(c)
Output:
B
i
g
B
o
x
C
o
d
e
Case #2: Using while Loop
We can use a while loop that starts at index 0 and runs to the last index to traverse the elements of a string. The index of the last element is (length of the string - 1).
my_str = "Big Box Code"
# Start from index 0
str_index = 0
while str_index < len(my_str):
    char_at_str_index = my_str[str_index]
    print(f"Index={str_index} || Item: {char_at_str_index}")
    # Increase the index to the next
    str_index += 1
Output:
Index=0 || Item: B
Index=1 || Item: i
Index=2 || Item: g
Index=3 || Item:
Index=4 || Item: B
Index=5 || Item: o
Index=6 || Item: x
Index=7 || Item:
Index=8 || Item: C
Index=9 || Item: o
Index=10 || Item: d
Index=11 || Item: e
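When both the index and the character are needed, the built-in “enumerate” function is a common alternative to keeping a counter by hand. This is a sketch of the same traversal, not a replacement for the while loop above. The `pairs` list is an extra name added only to inspect the result.

```python
my_str = "Big Box Code"

# enumerate() yields (index, character) pairs, so no manual counter is needed
for str_index, char_at_str_index in enumerate(my_str):
    print(f"Index={str_index} || Item: {char_at_str_index}")

# The same pairs collected in a list, only to inspect them
pairs = list(enumerate(my_str))
print(pairs[0])   # (0, 'B')
print(pairs[-1])  # (11, 'e')
```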
Slicing String
We can get a specific part of a string by slicing it. We need to provide the starting and ending index of the slice.
The ending index is not included in the returned slice. The slice runs up to the index just before the ending index.
We can also pass a step as the third part of the slice. The slice will then take every nth character, instead of each character.
my_str = "Big Box Code"
# Get the string elements from index 0 to 4(=5-1)
print(my_str[0:5])
# Get the string elements from index 2 to 4(=5-1)
print(my_str[2:5])
# Stops at the end of the string, as the ending index is beyond the last index
print(my_str[4:999])
# Take every 2nd character from index 2 to 8(=9-1)
print(my_str[2:9:2])
# Use negative indexes, from index -7 to -3
print(my_str[-7:-2])
# Slice the string in reverse, from index -2 down to -8
print(my_str[-2:-9:-1])
Output:
Big B
g B
Box Code
gBxC
ox Co
doC xoB
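The starting index, ending index and step can each be left out, in which case Python uses the start of the string, the end of the string and a step of 1. A short sketch of these defaults, using the same string:

```python
my_str = "Big Box Code"

# Omitted start: slice from the beginning
print(my_str[:3])    # Big
# Omitted end: slice up to the end of the string
print(my_str[8:])    # Code
# Omitted start and end with a step of 2
print(my_str[::2])   # BgBxCd
# A step of -1 with no start or end reverses the whole string
print(my_str[::-1])  # edoC xoB giB
```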
Check Sub-String Existence
We can check whether a string contains a certain substring by using the “in” keyword. The result is a boolean-
Case #1: Check if Sub-String Exists
my_str = "Big Box Code"
print("Box" in my_str)
print("Co" in my_str)
print("big" in my_str)
Output:
True
True
False
Case #2: Use Boolean Result from “in”
my_str = "Big Box Code"
# Check if "Box" exists in my_str
if "Box" in my_str:
    print("Box found in my_str")
else:
    print("Box not found")
# Check if "Co" exists in my_str
if "Co" in my_str:
    print("Co found in my_str")
else:
    print("Co not found")
# Check if "big" exists in my_str
# This evaluates to False because the check is case-sensitive
if "big" in my_str:
    print("big found in my_str")
else:
    print("big not found")
Output:
Box found in my_str
Co found in my_str
big not found
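Since the “in” check is case-sensitive, one common way to ignore the case is to convert both sides to the same case before checking. This is a sketch of that idea, and the `needle` name is made up for the example. The “casefold” method appears in the `dir()` output earlier and works like “lower”, but is intended for caseless matching.

```python
my_str = "Big Box Code"

# The plain check fails because "big" and "Big" differ in case
print("big" in my_str)          # False

# Lowercase the string first to ignore the case
print("big" in my_str.lower())  # True

# casefold() both sides when the search term may also be mixed case
needle = "BOX"
found = needle.casefold() in my_str.casefold()
print(found)                    # True
```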
Transform String
my_str = "Big Box Code"
# Convert string to lower case
print(my_str.lower())
# Convert string to upper case
print(my_str.upper())
# Capitalize first letter of the string
print("some string for test".capitalize())
Output:
big box code
BIG BOX CODE
Some string for test
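Two more case methods from the `dir()` output shown earlier are “title” and “swapcase”. The sketch below tries them on a made-up lowercase string and confirms that the original string is left unchanged, since every transformation returns a new string.

```python
lower_str = "big box code"

# Capitalize the first letter of every word
title_str = lower_str.title()
print(title_str)             # Big Box Code

# Swap upper case to lower case and lower case to upper case
print(title_str.swapcase())  # bIG bOX cODE

# The original string is not modified
print(lower_str)             # big box code
```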
Search and Replace
my_str = "Big Box Code"
print(my_str.replace("B", "*"))
print(my_str.replace("ode", ""))
Output:
*ig *ox Code
Big Box C
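The “replace” method replaces every occurrence by default. It also accepts an optional third argument that limits how many occurrences are replaced, and the search is case-sensitive. A minimal sketch of both points:

```python
my_str = "Big Box Code"

# Replace only the first occurrence by passing a count of 1
first_only = my_str.replace("B", "*", 1)
print(first_only)  # *ig Box Code

# The search is case-sensitive, so a lowercase "b" matches nothing here
no_match = my_str.replace("b", "*")
print(no_match)    # Big Box Code
```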
Strip Whitespaces
my_str = " Big Box Code "
print(f"my_str: start|{my_str}|end")
print(f"my_str.lstrip(): start|{my_str.lstrip()}|end")
print(f"my_str.rstrip(): start|{my_str.rstrip()}|end")
print(f"my_str.strip(): start|{my_str.strip()}|end")
Output:
my_str: start| Big Box Code |end
my_str.lstrip(): start|Big Box Code |end
my_str.rstrip(): start| Big Box Code|end
my_str.strip(): start|Big Box Code|end
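The strip methods remove whitespace when called without arguments. They also accept a string of characters to remove instead. The sketch below uses made-up input. Note that the argument is treated as a set of characters, not as a prefix or suffix.

```python
padded = "--Big Box Code--"

# Remove the given character from both ends, the left end, or the right end
print(padded.strip("-"))   # Big Box Code
print(padded.lstrip("-"))  # Big Box Code--
print(padded.rstrip("-"))  # --Big Box Code

# Any mix of the listed characters is removed from the ends
mixed = "xyxBigyx".strip("xy")
print(mixed)               # Big
```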