Just a few commands without any context:
Profiling with cProfile
This helped me find the slowest functions, because when optimizing I need to focus on those (best ratio of work needed vs. benefit). It also helped me find a function which repeated some unnecessary calculations over and over again:
    $ python -m cProfile -o cProfile-first_try.out ./layout-generate.py ...
    $ python -m pstats cProfile-first_try.out
    Welcome to the profile statistics browser.
    cProfile-first_try.out% sort
    Valid sort keys (unique prefixes are accepted):
    cumulative -- cumulative time
    module -- file name
    ncalls -- call count
    pcalls -- primitive call count
    file -- file name
    line -- line number
    name -- function name
    calls -- call count
    stdname -- standard name
    nfl -- name/file/line
    filename -- file name
    cumtime -- cumulative time
    time -- internal time
    tottime -- internal time
    cProfile-first_try.out% sort tottime
    cProfile-first_try.out% stats 10
    Sat Aug 12 23:19:40 2017    cProfile-first_try.out

             18508294 function calls (18501563 primitive calls) in 8.369 seconds

       Ordered by: internal time
       List reduced from 2447 to 10 due to restriction <10>

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        27837    4.230    0.000    5.015    0.000 ./utils_matrix2layout.py:14(get_distance_matrix_2d)
        10002    1.356    0.000    1.513    0.000 ./utils_matrix2layout.py:244(get_measured_error_2d)
      5674796    0.572    0.000    0.572    0.000 /usr/lib64/python2.7/collections.py:90(__iter__)
      5340664    0.219    0.000    0.219    0.000 {math.sqrt}
      5432768    0.189    0.000    0.189    0.000 {abs}
       230401    0.183    0.000    0.183    0.000 /usr/lib64/python2.7/collections.py:71(__setitem__)
            1    0.178    0.178    0.282    0.282 ./utils_matrix2layout.py:543(count_angles_layout)
        10018    0.119    0.000    0.345    0.000 /usr/lib64/python2.7/_abcoll.py:548(update)
            1    0.102    0.102    6.749    6.749 ./utils_matrix2layout.py:393(iterate_evolution)
         1142    0.092    0.000    0.111    0.000 /usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py:1299(svd)
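The same kind of report can also be produced from inside a script instead of the interactive pstats browser. This is only a minimal sketch, written for Python 3 (the session above ran on Python 2.7); `get_distances` and `profile_top` are hypothetical names made up for illustration and are not taken from the real `utils_matrix2layout.py`.

```python
import cProfile
import io
import math
import pstats


def get_distances(points):
    """Hypothetical workload: pairwise distances between 2D points."""
    return [
        [math.sqrt((ax - bx) ** 2 + (ay - by) ** 2) for (bx, by) in points]
        for (ax, ay) in points
    ]


def profile_top(func, *args, limit=10):
    """Run func under cProfile; return (result, report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Equivalent of "sort tottime" followed by "stats 10" in the browser.
    stats.sort_stats("tottime").print_stats(limit)
    return result, buffer.getvalue()


if __name__ == "__main__":
    points = [(i, i * 2) for i in range(200)]
    _, report = profile_top(get_distances, points)
    print(report)
```

Returning the report as a string rather than printing it directly makes it easy to save it next to the `.out` file or compare two runs.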
To explain the columns, the Instant User’s Manual says:
- tottime
- for the total time spent in the given function (and excluding time made in calls to sub-functions)
- cumtime
- is the cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.
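To make the difference between the two columns concrete, here is a small sketch (Python 3, all names hypothetical). `outer` does almost no work of its own but calls `inner`, so `outer` should show a small tottime and a large cumtime, while for `inner` the two values should be close. It reads the `stats` attribute of `pstats.Stats`, which I am assuming keeps its usual `(cc, nc, tottime, cumtime, callers)` layout; it is not a formally documented interface.

```python
import cProfile
import pstats


def inner():
    """Pure-Python busy loop, so the time is charged to inner itself."""
    total = 0
    for i in range(200000):
        total += i * i
    return total


def outer():
    """Does almost nothing itself, only calls inner() three times."""
    return [inner() for _ in range(3)]


def timings(func):
    """Profile func and return {function name: (tottime, cumtime)}."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    return {
        name: (tt, ct)
        for (_, _, name), (_, _, tt, ct, _) in stats.stats.items()
    }


if __name__ == "__main__":
    for name, (tottime, cumtime) in sorted(timings(outer).items()):
        print("%-40s tottime=%.4f cumtime=%.4f" % (name, tottime, cumtime))
```

In the real output above, `iterate_evolution` shows the same pattern as `outer` here: 0.102 s of its own time but 6.749 s cumulative.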
Let's compile to C with Cython
Simply performing this on a module which does most of the work gave me about 20% speedup:
    # dnf install python2-Cython
    $ cython utils_matrix2layout.py
    $ gcc `python2-config --cflags --ldflags` -shared utils_matrix2layout.c -o utils_matrix2layout.so
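After a build like this, both the `.py` and the `.so` can sit in the same directory, so it may be worth confirming which one actually gets imported. A minimal sketch, assuming Python 3 (`importlib.machinery` is not available on Python 2.7, where looking at `module.__file__` by hand is the closest equivalent); `is_compiled` is a hypothetical helper name:

```python
import importlib
import importlib.machinery


def is_compiled(module_name):
    """Return True if the module was loaded from a C extension file."""
    module = importlib.import_module(module_name)
    path = getattr(module, "__file__", None)
    if path is None:  # built-in modules have no file on disk
        return False
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


if __name__ == "__main__":
    # "json" is a pure-Python package, so this prints False. With the .so
    # built above on sys.path, "utils_matrix2layout" would be the name to check.
    print(is_compiled("json"))
```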
There is much more to do to optimize it, but that would need additional work, so not now :-) Some helpful links:
- use python2-config to get compile and linking options
- when you want to create a *.so instead of an executable, you need to use -shared
- blog with a summary and a nice FAQ
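Regarding the python2-config item in the list above: similar build information can also be queried from Python itself through the standard `sysconfig` module. This is only a sketch for Python 3; `build_settings` is a hypothetical helper, and some of the values (for example `CFLAGS`) may be `None` on platforms that do not define them.

```python
import sysconfig


def build_settings():
    """Collect a few interpreter settings relevant to compiling extensions."""
    return {
        # Directory containing Python.h
        "include_dir": sysconfig.get_paths()["include"],
        # Compiler flags the interpreter was built with (may be None)
        "cflags": sysconfig.get_config_var("CFLAGS"),
        # File name suffix expected for extension modules (may be None)
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    for name, value in build_settings().items():
        print(name, "=", value)
```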