Advanced Python 6 | String Methods, Unicode, and Formatting

Series: Advanced Python

  1. String Methods

(1) String Methods

Because strings are immutable, string methods cannot work in place. Instead, each method returns its result as a new value and leaves the original string unchanged. The methods are:

['capitalize',
'casefold',
'center',
'count',
'encode',
'endswith',
'expandtabs',
'find',
'format',
'format_map',
'index',
'isalnum',
'isalpha',
'isascii',
'isdecimal',
'isdigit',
'isidentifier',
'islower',
'isnumeric',
'isprintable',
'isspace',
'istitle',
'isupper',
'join',
'ljust',
'lower',
'lstrip',
'maketrans',
'partition',
'replace',
'rfind',
'rindex',
'rjust',
'rpartition',
'rsplit',
'rstrip',
'split',
'splitlines',
'startswith',
'strip',
'swapcase',
'title',
'translate',
'upper',
'zfill']
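
A minimal sketch of what immutability means in practice. The method list above looks like what dir(str) gives once the underscore names are filtered out (newer Python versions list a few more, such as removeprefix). The variable names and the sample value are only for illustration.

```python
# Names of the public str methods (no leading underscore).
public_methods = [name for name in dir(str) if not name.startswith("_")]
print(len(public_methods))

greeting = "hello world"        # hypothetical sample value
shouted = greeting.upper()      # a new string; greeting itself is untouched
print(greeting)                 # hello world
print(shouted)                  # HELLO WORLD

# To keep a result, rebind the name to the returned value.
greeting = greeting.capitalize()
print(greeting)                 # Hello world

# Changing a char in place is rejected with a TypeError.
error_message = ""
try:
    greeting[0] = "J"
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```
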
  • count how many times a char or a substring occurs in a string (overlapping matches are not counted)
string.count('a')   # count a char 
string.count('ab') # count a string
  • convert all the chars in a string to lower case
string.lower()
  • convert all the chars in a string to upper case
string.upper()
  • capitalize the first char of a string and lower-case the rest
string.capitalize()
  • capitalize the first letter of every word and lower-case the rest (a word starts after any non-letter char, such as a space, a comma, or a digit)
string.title()      # i.e. hH, ma5, 123 to Hh, Ma5, 123
  • replace every occurrence of a substring with another
string.replace("", "?")  # insert a ? before every char and at the end
string.replace("!", "?") # replace every ! with ?
string.replace("!", "")  # delete every ! in the string
  • split a string on whitespace (a run of consecutive whitespace chars counts as one separator) and return a list
string.split()
  • split a string by a given char (the char is removed) and return a list (if the char does not occur, the list contains only the original string)
string.split('e')

Note that if the separator occurs several times in a row, or at the very start or end of the string, the result contains empty strings.
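
A short sketch of that note, using made-up sample values. It contrasts an explicit separator with the no-argument form, and shows one possible way to drop the empty pieces afterwards.

```python
text = "a,,b,c"                  # hypothetical sample with two commas in a row
print(text.split(","))           # ['a', '', 'b', 'c']

spaced = "a  b   c"              # runs of two and three spaces
print(spaced.split(" "))         # ['a', '', 'b', '', '', 'c']
print(spaced.split())            # ['a', 'b', 'c']

# One way to drop the empty pieces when they are not wanted.
pieces = [piece for piece in text.split(",") if piece]
print(pieces)                    # ['a', 'b', 'c']
```
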

  • split a string by a given string (the separator is removed) and return a list (if the separator does not occur, the list contains only the original string)
string.split(', ')
  • split a string at most n times, starting from the left, and return a list of at most n+1 items
string.split('l', n)
  • split a string at most n times, starting from the right, and return a list of at most n+1 items
string.rsplit('l', n)
  • split a string at the first occurrence of a given char and return a 3-tuple (before, separator, after); the separator is kept
string.partition('e')
  • split a string at the first occurrence of a given string and return a 3-tuple (before, separator, after)
string.partition('el')
  • split a string at the last occurrence of a given char and return a 3-tuple (before, separator, after)
string.rpartition('e')
  • split a string at the last occurrence of a given string and return a 3-tuple (before, separator, after)
string.rpartition('el')
  • split a string at line boundaries (such as \n or \r\n) and return a list. If there is no line break, the list contains only the original string.
'hello\nworld'.splitlines()   # ['hello', 'world']
  • join a list of strings into a single string with no separator
"".join(myList)
  • join a list of strings into a single string, separated by a char/string
"?".join(myList)      # join with a char
"hello".join(myList)  # join with a string
  • generate a random shuffle of the chars in a string
from random import sample
''.join(sample(string, k=len(string)))
  • return the string without its last char
string[:-1]
  • reverse a string
string[::-1]
  • get the case-folded form of a string, a more aggressive lower() meant for case-insensitive comparison
string.casefold()    # i.e. Λ to λ, and ß to ss
  • center our string in a field of 50 chars, padded with spaces on both sides (the string is returned unchanged if it is already 50 chars or longer)
string.center(50)
  • left-align our string in a field of 50 chars, padded with spaces on the right
string.ljust(50)
  • right-align our string in a field of 50 chars, padded with spaces on the left
string.rjust(50)
  • delete the whitespace (spaces, \t, \n) before and behind our string
string.strip()
  • delete the whitespace before our string
string.lstrip()
  • delete the whitespace behind our string
string.rstrip()
  • check if all the chars in a string are letters (any Unicode letter counts, not only Latin ones; spaces are not allowed)
string.isalpha()
  • check if all the chars in a string are digits (also accepts digit-like chars such as the superscript ²; bytes have this method too)
string.isdigit()
  • check if all the chars in a string are decimal digits (0 to 9 and their counterparts in other scripts, such as ٣; rejects ²; not available for bytes)
string.isdecimal()
  • check if all the chars in a string are numeric (everything isdigit accepts, plus fractions such as ½ and Roman numerals such as Ⅷ; not available for bytes)
string.isnumeric()
  • check if all the chars are letters or numbers (spaces are not allowed)
string.isalnum()
  • check if there is only whitespace in a string (\n and \t also count as whitespace)
" \n\t ".isspace()      # True
  • check if all the letters in a string are lower case (there must be at least one letter)
string.islower()    # "hello123".islower() will return True
  • check if all the letters in a string are upper case (there must be at least one letter)
string.isupper()    # "HELLO123".isupper() will return True
  • check if a string is in title case
string.istitle()
  • return the index of the first occurrence of a char or substring, or -1 if it does not occur (search from the left, case-sensitive)
string.find('l')    # 2, assuming string = "Hello world!"
string.find('he')   # -1, because the match is case-sensitive
string.find('y')    # -1
  • return the index of the last occurrence of a char or substring, or -1 if it does not occur (search from the right)
string.rfind('l')   # 9
string.rfind('el')  # 1
  • check whether a string starts with a char/string
string.startswith('Hello')
  • check whether a string ends with a char/string
string.endswith('world!')
  • pad a string with zeros on the left until it is 20 chars long
string.zfill(20)
  • swap lower case to upper case, and vice versa (non-letter chars are left as they are)
string.swapcase()
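
The sketch below runs a few of the methods above on one sample value. It assumes string = "Hello world!", which is a guess that matches the indices and the startswith/endswith arguments quoted in the list. The non-ASCII test chars are written as escapes so that the block stays copy-paste safe.

```python
string = "Hello world!"   # assumed sample value, not stated in the article

# split / rsplit with a maximum number of splits
print(string.split("l", 1))    # ['He', 'lo world!']
print(string.rsplit("l", 1))   # ['Hello wor', 'd!']

# partition always returns a 3-tuple: (before, separator, after)
print(string.partition("o"))   # ('Hell', 'o', ' world!')
print(string.rpartition("o"))  # ('Hello w', 'o', 'rld!')
print(string.partition("z"))   # ('Hello world!', '', '')

# find / rfind accept substrings and are case-sensitive
print(string.find("el"))       # 1
print(string.rfind("l"))       # 9
print(string.find("he"))       # -1

# (isdecimal, isdigit, isnumeric) for a few Unicode chars
samples = {
    "5": "ASCII digit",
    "\u00b2": "superscript two",
    "\u00bd": "fraction one half",
    "\u2167": "Roman numeral eight",
    "\u0663": "Arabic-Indic digit three",
}
checks = {}
for ch, label in samples.items():
    checks[label] = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    print(f"{label:<26}{checks[label]}")
```

The three digit checks nest: every char that passes isdecimal also passes isdigit, and every char that passes isdigit also passes isnumeric.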

(2) String Formatting

  • 4 significant digits in total (rounded automatically; a leading 0 before the decimal point does not count)
f"{0.234567:.4}"   # '0.2346'
f"{1.234567:.4}"   # '1.235'
f"{12.34567:.4}"   # '12.35'
  • 4 digits of precision after the decimal point
f"{0.234567:.4f}"       # '0.2346'
f"{1.234567:.4f}" # '1.2346'
f"{12.34567:.4f}" # '12.3457'
  • percentage with 2 digits after the decimal point
f"{0.234567:.2%}"      # '23.46%'
f"{0.999999:.2%}"      # '100.00%'
  • thousands separators for readability
f"{10000000:,}"        # '10,000,000'
  • thousands separators and 2 digits of precision after the decimal point
f"{10000000.1234:,.2f}"    # '10,000,000.12'
  • currency
f"${10000000.1234:,}"      # '$10,000,000.1234'
f"${10000000.1234:,.0f}"   # '$10,000,000'
  • pad with zeros on the left (only for numbers, use zfill for strings)
f"{8:02}"                  # '08'
f"{000.8:05}"              # '000.8'
  • pad to a given width: numbers are right-aligned by default, strings are left-aligned
f"{100:8}"                 # '     100'      # or f"{100:>8}"
f"{'yes':10}"              # 'yes       '    # or f"{'yes':<10}"
  • left-align a number (spaces behind)
f"{100:<8}"                # '100     '
  • right-align a string (spaces before)
f"{'yes':>10}"             # '       yes'
  • center (spaces on both sides, for strings and numbers alike)
f"{8:^9}"                  # '    8    '
  • insert a Unicode char by its name
f'\N{Hatching Chick}'      # '🐣'
  • string format method
"{1} {0} {1}".format("hello", "world")   # 'world hello world'
  • unpack list and format
data = [1, 2, 3, 4]
print("The numbers are {}, {}, {}, and {}".format(*data))
  • convert to binary
f"{878:b}"     # '1101101110'
  • convert to hexadecimal
f"{878:x}"     # '36e'
  • convert to scientific notation
f"{878:e}"     # '8.780000e+02'
  • align columns with format specs
# 'Options:      A      B      C      D'
"Options: {:>6} {:>6} {:>6} {:>6}".format(*"A B C D".split())

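As a rough illustration of how the format specs above combine, the sketch below builds a small text table from made-up data. The column widths, the | separator, and the rows are arbitrary choices, not something the article prescribes.

```python
# Hypothetical rows: (name, revenue, share of total)
rows = [("apples", 1234567.891, 0.23456), ("kiwis", 42.5, 0.9)]

lines = []
for name, revenue, share in rows:
    # <8     left-align the name in 8 chars
    # >14,.2f right-align, thousands separators, 2 decimals
    # >8.1%  right-align as a percentage with 1 decimal
    lines.append(f"{name:<8}|{revenue:>14,.2f}|{share:>8.1%}")

print("\n".join(lines))
# apples  |  1,234,567.89|   23.5%
# kiwis   |         42.50|   90.0%

# The same spec strings also work with the built-in format() function.
print(format(1234567.891, ",.2f"))   # 1,234,567.89
```
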
(3) Unicode

Encoding converts a string (human-readable text) into bytes, while decoding converts the bytes back into a string.

  • encode a string to utf-8 bytes
"Hello World".encode('utf-8')     # b'Hello World'
  • encode a string to utf-16 bytes
"Hello World".encode('utf-16')
  • decode utf-8 bytes to a string
b'caf\xc3\xa9'.decode('utf-8')    # 'café'
  • repair text that was decoded with the wrong codec (ftfy is a third-party package)
import ftfy
ftfy.fix_text(mess) # suppose mess is our garbled string
  • find the Unicode code point of a char as an int
ord('a')                # 97
  • find the Unicode code points of a string as a list of ints
[ord(i) for i in string]
  • check whether all chars in a string are ASCII
"ha34@🐶".isascii()     # False
  • turn an int code point back into a char
chr(128054)             # '🐶'
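
A small round-trip sketch using only the standard library. The sample text is made up. It shows that the same string takes a different number of bytes per codec, and how decoding with the wrong codec produces the kind of garbled text that a tool like ftfy is meant to repair.

```python
text = "café 🐶"                       # hypothetical sample text

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")    # starts with a 2-byte byte order mark
print(len(text), len(utf8_bytes), len(utf16_bytes))   # 6 10 16

# Decoding with the matching codec restores the original string.
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)              # True

# Decoding with the wrong codec gives garbled text instead of an error.
garbled = "café".encode("utf-8").decode("latin-1")
print(garbled)                         # cafÃ©

# If the wrong codec is known, the mistake can be reversed by hand.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)                        # café

# Chars outside the target codec raise unless an error handler is given.
ascii_safe = "café".encode("ascii", errors="replace")
print(ascii_safe)                      # b'caf?'
```
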

2. Examples

  • Example #1. What is the output of the following code?

Output:

Error
  • Example #2. What is the output of the following code?

Output:

True
  • Example #3. What is the output of the following code?

Output:

False
  • Example #4. What is the output of the following code?

Output

Error

This works in Java, but not in Python.

  • Example #5. What is the output of the following code?

Output

Error