Python: String [Data Type]

What is a String?

A string in Python is-

A sequence of Unicode characters.
Written between single (') or double (") quotes. The same quote character must be used at both ends of the string literal.
Able to contain any character, and its length is limited only by the available memory.

Examples of String

"Hello! World"
'Name of this site is Big Box Code'
"1234 is a valid positive integer"
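
As a small illustration of the quoting rule, the sketch below checks that both quote styles produce the same value, and that one quote style can hold the other quote character without escaping. The variable names are chosen only for this example.

```python
# Single and double quotes create the same kind of object
single_quoted = 'Big Box Code'
double_quoted = "Big Box Code"

print(single_quoted == double_quoted)

# One quote style can contain the other quote character unescaped
with_apostrophe = "It's a string"
with_double = 'He said "hello"'

print(with_apostrophe)
print(with_double)
```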

Create String

Use the “str” function to create a new string. If no value is passed to the “str” function, an empty string is created-

empty_str = str()

print("empty_str value: ", empty_str)
print("empty_str type: ", type(empty_str))

first_str = str("big box code")

print("first_str value: ", first_str)
print("first_str type: ", type(first_str))
Python

Output:

empty_str value:  
empty_str type:  <class 'str'>

first_str value:  big box code
first_str type:  <class 'str'>
Plaintext
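
The “str” function also accepts values that are not strings and returns their string form. The following is a minimal sketch of that behavior; the variable names are only for illustration.

```python
# str() returns the string form of whatever object it receives
from_int = str(1234)
from_float = str(3.5)
from_none = str(None)
from_list = str([1, 2, 3])

print(from_int, type(from_int))
print(from_float, type(from_float))
print(from_none, type(from_none))
print(from_list, type(from_list))
```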

We can wrap a sequence of characters with single or double quotes to define a string literal.

# Define a string
big_box_str = "Site name BigBoxCode"

# Check type and other properties
print("Type: ", type(big_box_str))
print("Length: ", len(big_box_str))

print(dir(big_box_str))
Python

Output:

Type:  <class 'str'>
Length:  20

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
Plaintext

Multi-line String Literal

We can write multi-line string literals in several ways-

Use either the single (') or double (") quote three (3) times. In this case, every newline inside the string is kept as it is-

multiline_str = """The 
quick brown fox 
jumps over the 
lazy dog"""

print(multiline_str)
print(type(multiline_str))

another_multiline_str = '''Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Morbi gravida sapien at lorem pharetra, ut eleifend nisl semper. 
Vestibulum sed elit eu purus lobortis posuere. 
Etiam erat lacus, rutrum ut sapien eu, mattis laoreet nulla. 
Curabitur aliquam, leo eget ullamcorper mattis, arcu metus condimentum neque, vitae feugiat nisi ligula a nisl. 
Nunc consectetur diam sem, ut vehicula elit tempor eu. Fusce in feugiat ipsum, in pellentesque nulla. 
Suspendisse nec nulla a elit sodales blandit eget posuere odio. 
Phasellus a elementum urna. Nunc suscipit ex fermentum ullamcorper hendrerit. 
Mauris mattis ligula diam. Phasellus malesuada facilisis imperdiet. 
Curabitur leo odio, dapibus vel gravida vel, accumsan nec lacus. Etiam et malesuada velit, id maximus nibh.'''

print(another_multiline_str)
print(type(another_multiline_str))
Python

Output:

The 
quick brown fox
jumps over the
lazy dog


<class 'str'>


Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Morbi gravida sapien at lorem pharetra, ut eleifend nisl semper.
Vestibulum sed elit eu purus lobortis posuere.
Etiam erat lacus, rutrum ut sapien eu, mattis laoreet nulla.
Curabitur aliquam, leo eget ullamcorper mattis, arcu metus condimentum neque, vitae feugiat nisi ligula a nisl.     
Nunc consectetur diam sem, ut vehicula elit tempor eu. Fusce in feugiat ipsum, in pellentesque nulla.
Suspendisse nec nulla a elit sodales blandit eget posuere odio.
Phasellus a elementum urna. Nunc suscipit ex fermentum ullamcorper hendrerit.
Mauris mattis ligula diam. Phasellus malesuada facilisis imperdiet.
Curabitur leo odio, dapibus vel gravida vel, accumsan nec lacus. Etiam et malesuada velit, id maximus nibh.


<class 'str'>
Plaintext
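
To see that the line breaks really are part of a triple-quoted string, we can inspect it with “repr” and split it back into lines. This is a short sketch; the text here is written without trailing spaces so that the results are easy to read.

```python
multiline_str = """The
quick brown fox
jumps over the
lazy dog"""

# The line breaks are stored as "\n" characters inside the string
print(repr(multiline_str))
print(multiline_str.count("\n"))

# splitlines() breaks the text back into individual lines
lines = multiline_str.splitlines()
print(lines)
```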

Use the plus (+) sign to concatenate multiple string literals, and add a backslash at the end of each line to continue the statement on the next line-

multi_line_str = "Nunc non interdum augue. Suspendisse sed facilisis sapien." + \
    "Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,"  + \
    "ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,"  + \
    "vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit."  + \
    "Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi."

print(multi_line_str)
Python

Output:

Nunc non interdum augue. Suspendisse sed facilisis sapien.Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit.Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi.
Plaintext

NOTES

Make sure that there is no space after the backslash (\) at the end of the line. The literals are joined exactly as written, so no space is added between them.

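Wrap the string literals in parentheses. Adjacent string literals inside the parentheses are joined automatically, without a plus (+) sign-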

multi_line_str = (
    "Nunc non interdum augue. Suspendisse sed facilisis sapien."
    "Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,"
    "ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,"
    "vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit."
    "Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi."
)

print(multi_line_str)
Python

Output:

Nunc non interdum augue. Suspendisse sed facilisis sapien.Maecenas condimentum magna et urna egestas congue. Nam quam enim, commodo dictum rhoncus eget,ornare eget lorem. Quisque molestie luctus pulvinar. Pellentesque metus neque, elementum a odio eu,vehicula faucibus odio. In a pulvinar ligula. Maecenas malesuada rutrum eros in hendrerit.Vivamus ante quam, viverra sit amet semper eget, pulvinar et nisi.
Plaintext
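
Notice in the output above that the sentences run together, because no space is inserted between the joined literals. The sketch below, using short placeholder sentences, shows the difference a trailing space makes, along with “join” as one possible alternative.

```python
# Adjacent literals are joined exactly as written: no space is inserted
without_spaces = (
    "First sentence."
    "Second sentence."
)

# A trailing space inside each piece keeps the sentences apart
with_spaces = (
    "First sentence. "
    "Second sentence."
)

# str.join() is another option when the pieces are in a list
joined = " ".join(["First sentence.", "Second sentence."])

print(without_spaces)
print(with_spaces)
print(joined)
```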

String Escape

Certain character sequences have a special meaning or behavior inside a string literal. They start with a backslash (\), which is the escape character.

Here are the escape sequences-

Escape Sequence    Description
\\                 Backslash (\)
\'                 Single quote (')
\"                 Double quote (")
\b                 Backspace
\n                 Linefeed
\r                 Carriage return
\t                 Tab
\ooo               Character with octal value ooo
\uxxxx             Unicode character with 16-bit hex value xxxx
\Uxxxxxxxx         Unicode character with 32-bit hex value xxxxxxxx
\v                 Vertical tab

Here are some examples of the escape characters-

escape_str_test_1 = 'Single quote inside \'string literal\' with single quote'
escape_str_test_2 = "Double quote inside \"string literal\" with double quote"
escape_str_test_3 = 'String literal with a new line here\nand this part is after new line'
escape_str_test_4 = 'String with tab \t and this part is after tab'

print(escape_str_test_1)
print(escape_str_test_2)
print(escape_str_test_3)
print(escape_str_test_4)

triple_quote_str = """first line \n part after new line
Next 1
Next 2 \n another new line
next 3
"""

print(triple_quote_str)
Python

Output:

Single quote inside 'string literal' with single quote
Double quote inside "string literal" with double quote
String literal with a new line here
and this part is after new line
String with tab          and this part is after tab

first line 
 part after new line
Next 1
Next 2
 another new line
next 3
Plaintext
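
The examples above cover quotes, newline and tab. As a small sketch of the remaining kinds of escapes, the code below writes the letter "A" with an octal escape and with both Unicode escapes, and checks how many characters an escape takes up in the string.

```python
# Octal escape: 101 in octal is 65, the code point of "A"
octal_a = "\101"

# 16-bit and 32-bit Unicode escapes for the same character
unicode_16_a = "\u0041"
unicode_32_a = "\U00000041"

# An escaped backslash is stored as a single character
backslash = "\\"

print(octal_a, unicode_16_a, unicode_32_a)
print(backslash, len(backslash))

# An escape sequence counts as one character in the string
tab_str = "a\tb"
print(len(tab_str))
```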

String Reassignment Internals

Let’s check what happens when we reassign some new value to a variable-

import ctypes

big_box_str = "big box code"

first_id = id(big_box_str)

print("Initial id: ", first_id)

big_box_str = "Changed new string"

print("ID after change: ", id(big_box_str))

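# CPython-specific: id() is the memory address of the object,
# so the old object can be read back from that address
obj = ctypes.cast(first_id, ctypes.py_object)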

print("Old string object: ", obj)
Python

Output:

Initial id:  140117350899824
ID after change:  140117351620384

Old string object:  py_object('big box code')
Plaintext

NOTES

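String objects are immutable. Once a string object is created, its content cannot be changed. In this example the old object is still in memory because the literal is kept as a constant of the running code. In general, Python may free an object once nothing refers to it.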

If a string variable is reassigned, a new string object is created and the new object is assigned to the variable.
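
Immutability can be checked directly. In the minimal sketch below, assigning to a single character fails, and a method like “upper” returns a new object instead of changing the original. The “assignment_failed” flag is only a helper for this example.

```python
big_box_str = "big box code"
assignment_failed = False

# Item assignment is not allowed on a string
try:
    big_box_str[0] = "B"
except TypeError as err:
    assignment_failed = True
    print("TypeError:", err)

# Methods such as upper() return a new string and leave the original as it is
upper_str = big_box_str.upper()

print(big_box_str)
print(upper_str)
print(upper_str is big_box_str)
```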

Python String Reassignment

Unicode Character Representation

We can represent Unicode characters in several ways.

Python source files are read as UTF-8 by default, so we can write a Unicode character directly in the string literal.

Or we can write the 4-digit hexadecimal code point of the character, with a “\u” prefix, to represent it.

uc_latin = "The Latin character: ɸ, unicode representation: \u0278"

print(uc_latin)

uc_katakana = "The katakana character: ガ, unicode representation: \u30AC"

print(uc_katakana)

uc_devanagari = "The Devanagari character: ॐ, unicode representation: \u0950"

print(uc_devanagari)

uc_bangla = "The Bangla character: ক, unicode representation: \u0995"

print(uc_bangla)
Python

Output:

The Latin character: ɸ, unicode representation: ɸ

The katakana character: ガ, unicode representation: ガ

The Devanagari character: ॐ, unicode representation: ॐ

The Bangla character: ক, unicode representation: ক
Plaintext

We can also write the code point as 8 hexadecimal digits (a 32-bit value). This form needs the “\U” prefix, with an uppercase U-

uc_latin = "The Latin character: ɸ, unicode representation: \U00000278"

print(uc_latin)
Python

Output:

The Latin character: ɸ, unicode representation: ɸ
Plaintext

We can use the “chr” function to get the Unicode character from its code point. Pass the code point as an integer, written in decimal or hex, to the “chr” function and it will return the character-

uc_example = "Unicode copyright character: ©"
uc_hex_example = "Representation in hex:" + chr(0x00A9)
uc_dec_example = "Representation in decimal:" + chr(169)

print(uc_example)
print(uc_hex_example)
print(uc_dec_example)
Python

Output:

Unicode copyright character: ©
Representation in hex:©
Representation in decimal:©
Plaintext
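
The reverse direction is also possible. The built-in “ord” function is not used in the examples above, but it returns the code point of a character, so it pairs naturally with “chr”. A short sketch:

```python
copyright_char = "©"

# ord() gives the code point of a single character
code_point = ord(copyright_char)

print(code_point)
print(hex(code_point))

# chr() turns the code point back into the character
round_trip = chr(code_point)
print(round_trip == copyright_char)
```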

String Concatenation

Use the “+” (Plus) sign to concatenate a string with another.

full_str = "First part" + "second part" + "last part"

print(full_str)

site_name = "BigBoxCode"
site_url = "https://bigboxcode.com"

print("Name of the site is \"" + site_name + "\", and the URL is " + site_url)
Python

Output:

First partsecond partlast part
Name of the site is "BigBoxCode", and the URL is https://bigboxcode.com
Plaintext
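
The “+” sign joins strings only. When one of the parts is a number, it has to be converted first. The sketch below shows the error and two ways around it; “post_count” is a made-up value used only for illustration.

```python
site_name = "BigBoxCode"
post_count = 25  # hypothetical value, only for illustration

# "+" works only between strings; mixing in an int raises TypeError
try:
    message = site_name + " has " + post_count + " posts"
except TypeError:
    # Convert the number with str() before concatenating
    message = site_name + " has " + str(post_count) + " posts"

# An f-string performs the conversion for us
f_message = f"{site_name} has {post_count} posts"

print(message)
print(f_message)
```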

Indexing in String

Characters in a string are indexed. A positive index counts from the left, starting at 0, and a negative index counts from the right, starting at -1.

String Character Indexing

Positive Indexing

Positive indexing starts from the beginning of the string. The index starts from zero (0) and increases as it goes to the right.

We can directly access the string elements by using the index-

my_str = "Big Box Code"

print(my_str[0])
print(my_str[5])
print(my_str[11])
Python

Output:

B
o
e
Plaintext

Negative Indexing

Negative indexing starts from the right(the last character of the string) and progresses backward. Negative indexing starts from negative one(-1).

We can also access the characters by using the negative index.

my_str = "Big Box Code"

print(my_str[-1])
print(my_str[-7])
print(my_str[-12])
Python

Output:

e
o
B
Plaintext
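
The two kinds of index point at the same characters. As a minimal sketch, a negative index -k matches the positive index (length - k), and an index outside the valid range raises an error. The “out_of_range” flag is only a helper for this example.

```python
my_str = "Big Box Code"
length = len(my_str)

# A negative index -k refers to the same character as index (length - k)
print(my_str[-1] == my_str[length - 1])
print(my_str[-7] == my_str[length - 7])

# An index outside the valid range raises IndexError
out_of_range = False
try:
    my_str[length]
except IndexError as err:
    out_of_range = True
    print("IndexError:", err)
```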

Loop through String Elements

Case #1: Using for Loop

We can use a for loop and traverse through each element of a string.

my_str = "Big Box Code"

for c in my_str:
    print(c)
Python

Output:

B
i
g

B
o
x

C
o
d
e
Plaintext

Case #2: Using while Loop

We can also use a while loop that starts at index 0 and continues up to the last index of the string. The index of the last element is (length of the string - 1).

my_str = "Big Box Code"

# Start from index 0
str_index = 0

while str_index < len(my_str):
    char_at_str_index = my_str[str_index]
    
    print(f"Index={str_index} || Item: {char_at_str_index}")
    
    # Increase the index to the next
    str_index += 1
Python

Output:

Index=0 || Item: B
Index=1 || Item: i
Index=2 || Item: g
Index=3 || Item:
Index=4 || Item: B
Index=5 || Item: o
Index=6 || Item: x
Index=7 || Item:
Index=8 || Item: C
Index=9 || Item: o
Index=10 || Item: d
Index=11 || Item: e
Plaintext
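
When both the index and the character are needed, the built-in “enumerate” function is a common alternative to keeping a counter by hand. The sketch below prints the same lines as the while loop above.

```python
my_str = "Big Box Code"

# enumerate() yields (index, character) pairs, so no manual counter is needed
for str_index, char in enumerate(my_str):
    print(f"Index={str_index} || Item: {char}")

# Collect the pairs in a list to inspect them
pairs = list(enumerate(my_str))
print(pairs[:3])
```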

Slicing String

We can get a specific part of a string by slicing it. We need to provide the starting and ending index of the slice.

The ending index is not included in the returned slice. The slice stops at the character just before the ending index.

We can also provide a step as the third part of the slice. Then the slice takes every nth character, instead of each character.

my_str = "Big Box Code"

# Get the string elements from index 0 to 4(=5-1)
print(my_str[0:5])

# Get the string elements from index 2 to 4(=5-1)
print(my_str[2:5])

# Stops at the end of the string, as the ending index is beyond the last index
print(my_str[4:999])

# Step 2 characters at a time
print(my_str[2:9:2])

# Negative indexes work in a slice too
print(my_str[-7:-2])

# Slice the string in reverse
print(my_str[-2:-9:-1])
Python

Output:

Big B
g B
Box Code
gBxC
ox Co
doC xoB
Plaintext
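
The start and end of a slice are optional. The following sketch shows what happens when they are left out, including a common way to reverse a string.

```python
my_str = "Big Box Code"

# Omitting the start means "from the beginning"
first_part = my_str[:3]

# Omitting the end means "up to the end of the string"
last_part = my_str[8:]

# Omitting both with a step of -1 gives the reversed string
reversed_str = my_str[::-1]

print(first_part)
print(last_part)
print(reversed_str)

# Splitting at any index and joining the parts gives back the original
print(my_str[:4] + my_str[4:] == my_str)
```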

Check Sub-String Existence

We can check if a string contains a certain substring by using the “in” keyword. The result is a boolean (True or False), and the check is case-sensitive-

Case #1: Check if Sub-String Exists

my_str = "Big Box Code"

print("Box" in my_str)

print("Co" in my_str)

print("big" in my_str)
Python

Output:

True
True
False
Plaintext

Case #2: Use Boolean Result from “in”

my_str = "Big Box Code"

# Check if "Box" exists in my_str
if "Box" in my_str:
    print("Box found in my_str")
else:
    print("Box not found")

# Check if "Co" exists in my_str
if "Co" in my_str:
    print("Co found in my_str")
else:
    print("Co not found")

# Check if "big" exists in my_str
# The result is False, as the match is case-sensitive
if "big" in my_str:
    print("big found in my_str")
else:
    print("big not found")
Python

Output:

Box found in my_str
Co found in my_str
big not found
Plaintext
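
Since “in” is case-sensitive, one simple way to ignore the case is to convert the string to lower case first. When the position of the match is needed, the “find” method can be used. A small sketch of both:

```python
my_str = "Big Box Code"

# "in" is case-sensitive, so lower-case the string for a case-insensitive check
found_ignoring_case = "big" in my_str.lower()

# find() returns the index of the first match, or -1 when there is no match
box_position = my_str.find("Box")
missing_position = my_str.find("big")

print(found_ignoring_case)
print(box_position)
print(missing_position)
```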

Transform String

my_str = "Big Box Code"

# Convert string to lower case
print(my_str.lower())

# Convert string to upper case
print(my_str.upper())

# Capitalize first letter of the string
print("some string for test".capitalize())
Python

Output:

big box code
BIG BOX CODE
Some string for test
Plaintext

Search and Replace

my_str = "Big Box Code"

print(my_str.replace("B", "*"))
print(my_str.replace("ode", ""))
Python

Output:

*ig *ox Code
Big Box C
Plaintext
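
By default “replace” changes every match. As a short sketch, the optional third argument limits the number of replacements, and the original string is left untouched.

```python
my_str = "Big Box Code"

# The optional third argument limits how many matches are replaced
first_only = my_str.replace("B", "*", 1)

# replace() returns a new string; the original is unchanged
print(first_only)
print(my_str)

# When the search text is not found, the result equals the original
not_found = my_str.replace("xyz", "-")
print(not_found)
```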

Strip Whitespace

my_str = " Big Box Code "

print(f"my_str: start|{my_str}|end")
print(f"my_str.lstrip(): start|{my_str.lstrip()}|end")
print(f"my_str.rstrip(): start|{my_str.rstrip()}|end")
print(f"my_str.strip(): start|{my_str.strip()}|end")
Python

Output:

my_str: start| Big Box Code |end
my_str.lstrip(): start|Big Box Code |end
my_str.rstrip(): start| Big Box Code|end
my_str.strip(): start|Big Box Code|end
Plaintext
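
The strip methods can also remove characters other than whitespace. The sketch below passes the characters to remove as an argument; the padded sample string is made up for illustration.

```python
padded_str = "**--Big Box Code--**"

# The argument is a set of characters to remove, not an exact prefix or suffix
stripped_both = padded_str.strip("*-")
stripped_left = padded_str.lstrip("*")
stripped_right = padded_str.rstrip("*-")

print(stripped_both)
print(stripped_left)
print(stripped_right)
```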
