Tips and tricks from my Telegram-channel @pythonetc, March 2019
It is a new selection of tips and tricks about Python and programming from my Telegram-channel @pythonetc.
Previous publications.
0_0
is a totally valid Python expression.
Sorting a list with None
values can be challenging:
In [1]: data = [
...: dict(a=1),
...: None,
...: dict(a=-3),
...: dict(a=2),
...: None,
...: ]
In [2]: sorted(data, key=lambda x: x['a'])
...
TypeError: 'NoneType' object is not subscriptable
You may try to remove Nones and put them back after sorting (to the end or the beginning of the list depending on your task):
In [3]: sorted(
...: (d for d in data if d is not None),
...: key=lambda x: x['a']
...: ) + [
...: d for d in data if d is None
...: ]
Out[3]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]
That’s a mouthful. The better solution is to use more complex key
:
In [4]: sorted(data, key=lambda x: float('inf') if x is None else x['a'])
Out[4]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]
For types where no infinity is available you can sort tuples instead:
In [5]: sorted(data, key=lambda x: (1, None) if x is None else (0, x['a']))
Out[5]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]
When you fork your process, the random seed you are using is copying across processes. That may lead to processes producing the same «random» result.
To avoid this, you have to manually call random.seed()
in every process.
However, that is not the case if you are using the multiprocessing
module, it is doing exactly that for you.
Here is an example:
import multiprocessing
import random
import os
import sys
def test(a):
print(random.choice(a), end=' ')
a = [1, 2, 3, 4, 5]
for _ in range(5):
test(a)
print()
for _ in range(5):
p = multiprocessing.Process(
target=test, args=(a,)
)
p.start()
p.join()
print()
for _ in range(5):
pid = os.fork()
if pid == 0:
test(a)
sys.exit()
else:
os.wait()
print()
The result is something like:
4 4 4 5 5
1 4 1 3 3
2 2 2 2 2
Moreover, if you are using Python 3.7 or newer, os.fork
does the same as well, thanks to the new at_fork
hook.
The output of the above code for Python 3.7 is:
1 2 2 1 5
4 4 4 5 5
2 4 1 3 1
It looks like sum([a, b, c])
is equivalent for a + b + c
, while in fact it’s 0 + a + b + c
. That means that it can’t work with types that don’t support adding to 0
:
class MyInt:
def __init__(self, value):
self.value = value
def __add__(self, other):
return type(self)(self.value + other.value)
def __radd__(self, other):
return self + other
def __repr__(self):
class_name = type(self).__name__
return f'{class_name}({self.value})'
In : sum([MyInt(1), MyInt(2)])
...
AttributeError: 'int' object has no attribute 'value'
To fix that you can provide custom start element that is used instead of 0
:
In : sum([MyInt(1), MyInt(2)], MyInt(0))
Out: MyInt(3)
sum
is well-optimized for summation of float
and int
types but can handle any other custom type. However, it refuses to sum bytes
, bytearray
and str
since join
is well-optimized for this operation:
In : sum(['a', 'b'], '')
...
TypeError: sum() can't sum strings [use ''.join(seq) instead]
In : ints = [x for x in range(10_000)]
In : my_ints = [Int(x) for x in ints]
In : %timeit sum(ints)
68.3 µs ± 142 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In : %timeit sum(my_ints, Int(0))
5.81 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
You can customize index completions in Jupyter notebook by providing the _ipython_key_completions_ method
. This way you can control what is displayed when you press tab after something like d["x
:
Note that the method doesn’t get the looked up string as an argument.