[Из песочницы] Top 5 Tips Developers Should Know For Python Codes Optimization10.07.2019 19:05

Optimization of Python codes deals with selecting the best option among a number of possible options that are feasible to use for developers. Python is the most popular, dynamic, versatile, and one of the most sought after languages for mobile application development. Right from the programming projects like machine learning and data mining, Python is still the best and most relevant language for application developers.

Code Optimization Tips for Python Developers

In this post, I will discuss some of the optimization procedures and patterns in Python coding in an effort to improve the overall coding knowledge.

Optimizing Slow Code First

When you run your Python program, the source code .py is compiled using CPython into bytecode (.pyc file) saved in a folder (_pycache_) and at the end interpreted by Python virtual M2M code (machine to machine). Since Python uses an interpreter to execute the bytecode at runtime which makes the process a bit slower.

To make your execution faster on CPython, get PyPy v7.1 that include two different interpreters and the feature of its predecessor PyPy3.6-beta and Python 2.7. Compared to CPython it is 7.6X faster which is critical for coding. Programs in CPython takes a lot of space, but PyPy takes up less space in programs.
Spotting Performing Issues in Python

Choosing data structures in Python can affect the performance of codes. Profiling of codes can be of great help in analyzing how codes perform in different situations and enable to identify the time that the program takes in its operations.

Tools such as PyCallGraph, cProfile, and gProf2dot can be useful in faster and more efficient profiling of Python codes:

cProfile is to identify how long the various parts of Python codes are executed.

PyCallGraph creates visual representations of graphs to represent the relationship between Python codes.

This example demonstrates a simple use of pycallgraph.

from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput


class Juice:

    def drink(self):
        pass


class Person:

    def __init__(self):
        self.no_juices()

    def no_juices(self):
        self.juices = []

    def add_juice(self, juice):
        self.juices.append(juice)

    def drink_juices(self):
        [juice.drink() for juice in self.juices]
        self.no_juices()


def main():
    graphviz = GraphvizOutput()
    graphviz.output_file = 'basic.png'

    with PyCallGraph(output=graphviz):
        person = Person()
        for a in xrange(10):
            person.add_juice(Banana())
        person.drink_juices()


if __name__ == '__main__':
    main()

Concatenation of Strings

Python strings are generally immutable and subsequently perform quite slow. There are several ways to optimize Python strings in different situations. For example, the + (plus) operator is the ideal option for the few Python string objects. And, if you use (+) operator to link multiple strings in Python codes, each time it will create a new object and creation of so many string objects will hamper the utilization of memory.

Iterator such as a List that has multiple Strings, the best way to concatenate them is by utilizing the .join () method.

See an example of how .join () and + (plus) operator works in Python:

import timeit

# create a list of 1000 words
list_of_words = ["foo "] * 1000

def using_join(list_of_words):  
    return "".join(list_of_words)

def using_concat_operator(list_of_words):  
    final_string = ""
    for i in list_of_words:
        final_string += i
    return final_string

print("Using join() takes {} s".format(timeit.timeit('using_join(list_of_words)', 'from __main__ import using_join, list_of_words')))

print("Using += takes {} s".format(timeit.timeit('using_concat_operator(list_of_words)', 'from __main__ import using_concat_operator, list_of_words')))  


Output:
$ python join-vs-concat.py 
Using join() takes 14.0949640274 s  
Using += takes 79.5631570816 s  
$ 
$ python join-vs-concat.py 
Using join() takes 13.3542580605 s  
Using += takes 76.3233859539 s

So, from the above example, it is clear that .join() operator is an ideal way to add more strings to your coding and it is also faster than the concatenation operators.

List Comprehension vs Loops in Python Coding

Loops are common in Python coding that provides a concise way to create new lists that also support various conditions.

For example, if you want to get the list of the cube root of all the odd numbers in a defined range using the for loop;

new_list = []  
for n in range(1, 9):  
    if n % 3 == 0:
        new_list.append(n**3) 

The list comprehension version of the loop would be like:

new_list = [ n**3 for n in range(1,9) if n%3 == 0]

So, it«s clear that list comprehension is more precise, shorter, and notably faster in execution time than for loops.

Optimizing Memory Footprint in Python

Dealing with loops in Python coding, you will need to create a complete list of integers for help in executing the for-loops. Here, in this case, functions such as range and xrange will be useful in your coding. The functionality of both is the same, but they are different in that the range returns a list object but the xrange returns an xrange object.

If you need to generate a large number of integers, then xrange function is an ideal choice for this purpose. Why? It is because xrange works as a generator in that doesn«t provide the final result, but it gives you the ability to generate the values in the final list.

Let«s see the difference between these two functions in terms of memory consumption — xrange vs range:

$ python
Python 2.7.10 (default, Oct 23 2015, 19:19:21)  
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.  
>>> import sys
>>> 
>>> r = range(1000000)
>>> x = xrange(1000000)
>>> 
>>> print(sys.getsizeof(r))
8000072  
>>> 
>>> print(sys.getsizeof(x))
40  
>>> 
>>> print(type(r))
  
>>> print(type(x))

Hence, it«s clear that xrange function is a much better option in saving memory than the range function. The range function consumed 8000072 bytes, on the other hand, xrange function consumed only 40 bytes of memory.

Summary

Hope these optimization tips enables robust and reliable execution of Python codes at multiple levels of granularity from profiling to data structures to string concatenation to memory optimization with xrange function. All of these points aimed at supporting Python programmers in their day to day programming work and to help them write quality code.