Almost everybody now uses numpy, as it is extremely helpful for data analysis. However, quite often (if not almost always) numpy does not deliver its full strength, because it is installed in a very inefficient way: linked against old-fashioned ATLAS/BLAS libraries that can use only one CPU core, even when your computer is equipped with a multicore processor or even several processors.
This post is intended for Linux users only. All the shell commands below are for Ubuntu, but you will easily find analogues for your distribution. Those from the Windows world might need some additional googling. Another option is to switch to a scientific Python bundle like Anaconda (my choice), Canopy or some other.
You might easily check if you are affected by the single core numpy problem. To do this just create a simple test program:
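The original script is not shown here, so the following is a minimal sketch of such a test program (the filename and matrix sizes are my choice): it multiplies two large random matrices, work that numpy hands off to the underlying BLAS, so with a multithreaded BLAS all cores should light up.

```python
# test.py -- a minimal sketch of a BLAS stress test (name and sizes are assumptions)
import numpy as np

size = 1000  # increase for a longer-running test
a = np.random.random((size, size))
b = np.random.random((size, size))

# matrix multiplication is handed off to the BLAS dgemm routine;
# with a multithreaded BLAS this loads all CPU cores
for _ in range(10):
    c = a.dot(b)

print(c.shape)
```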
And run it as a background process
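Assuming the script was saved as test.py (use python3 if that is your interpreter's name):

```shell
python test.py &
```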
Afterwards, run top to check the performance:
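top shows per-process CPU usage, refreshed every few seconds:

```shell
top
```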
You will see a python process at the very top of your process list. Now pay attention to the %CPU column for that process: a value around 100 means that it is actually using only one CPU core.
If this is the case, you might want to significantly improve numpy’s performance. Fortunately, this is pretty easy.
Check libraries
There are two quite different situations:
- you have some ATLAS/BLAS libraries installed already;
- you don’t have any libraries yet.
To find out what you have, check whether your numpy is linked to BLAS
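One way is to run ldd against numpy’s compiled core module. The exact path depends on your Python version and install location, so the path below is only an example:

```shell
# typical location for a system-wide install on Ubuntu with Python 2.7;
# adjust the path to your own setup
ldd /usr/lib/python2.7/dist-packages/numpy/core/multiarray.so
```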
In earlier numpy versions (before 1.10) you have to check the linkage of _dotblas.so instead of multiarray.so, so you should do:
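Again, the path is only an example and depends on your setup:

```shell
ldd /usr/lib/python2.7/dist-packages/numpy/core/_dotblas.so
```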
The output will look something like:
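For a numpy without any BLAS linkage it might look roughly like this (this is an illustrative sample; addresses, versions and libraries will differ on your machine):

```
	linux-vdso.so.1 =>  (0x00007fff12bff000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f23c8835000)
	libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 (0x00007f23c8250000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f23c8033000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f23c7c6e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f23c8d4f000)
```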
If there is no mention of libblas.so (as in the output above), then you have situation #2 — you don’t have any BLAS library installed. This means that your numpy uses its own internal library of linear algebra functions, which is extremely slow. In this case you will get the greatest performance improvement, but at the expense of recompiling and reinstalling numpy.
If libblas.so is mentioned in your ldd output, you are in situation #1 and may simply reassign the BLAS library, which is fast and easy.
In any case, you need a better BLAS library first.
Install OpenBLAS library
OpenBLAS is a very good library with various linear algebra algorithms and functions, which lie at the core of many modern data analysis methods.
However, to begin with, you need a Fortran compiler, so install the gfortran package; the g77 compiler that you most probably have is incompatible with OpenBLAS.
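On Ubuntu:

```shell
sudo apt-get install gfortran
```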
Now download OpenBLAS sources from Github
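At the time of writing the sources lived in the xianyi/OpenBLAS repository (check Github if it has moved since):

```shell
git clone https://github.com/xianyi/OpenBLAS.git
```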
Enter the directory and compile sources
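Pointing make at gfortran via the FC variable ensures the right Fortran compiler is used:

```shell
cd OpenBLAS
make FC=gfortran
```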
When make has successfully finished, install the library
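Installation needs root rights for the default location:

```shell
sudo make install
```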
The default installation directory is /opt/OpenBLAS. You may choose a different location, though:
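Pass a PREFIX to make install (the path below is a placeholder — substitute your own):

```shell
sudo make install PREFIX=/path/to/dir
```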
Reassign BLAS
If you had another BLAS library at the beginning, you need to make the OpenBLAS library the preferred choice among all installed BLAS libraries.
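On Ubuntu this can be done with update-alternatives. The exact alternative name varies between releases (older ones use libblas.so.3gf), so the line below is a sketch to adapt to your system:

```shell
sudo update-alternatives --install /usr/lib/libblas.so.3 libblas.so.3 /opt/OpenBLAS/lib/libopenblas.so 50
```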
Now run the test again and make sure that all CPU cores are now being used. This is all you have to do. Now enjoy the full speed numpy.
Build the right numpy
Those who did not have any BLAS libraries are left with nothing to do but reinstall numpy.
First of all, get rid of the wrong numpy you already have.
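Depending on how numpy got onto your system, one of these should remove it (run whichever applies):

```shell
sudo pip uninstall numpy          # if it was installed with pip
sudo apt-get remove python-numpy  # if it came from Ubuntu packages
```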
Then create a file .numpy-site.cfg in your home directory with the following content:
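The file is in ini format and should contain something along these lines (adjust the paths if your OpenBLAS lives elsewhere):

```ini
[default]
include_dirs = /opt/OpenBLAS/include
library_dirs = /opt/OpenBLAS/lib

[openblas]
openblas_libs = openblas
library_dirs = /opt/OpenBLAS/lib

[lapack]
lapack_libs = openblas
library_dirs = /opt/OpenBLAS/lib
```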
If you have chosen a different location for your OpenBLAS installation, edit the paths accordingly.
And install numpy again
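pip will compile numpy from source, and numpy’s build system picks up ~/.numpy-site.cfg automatically:

```shell
sudo pip install numpy
```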
If there were no errors during compilation and installation and everything went just fine, run the test again to make sure that all CPU cores are now being used.
If you prefer a manual compilation/installation, like I often do, you may try the following approach. First, download numpy sources.
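Recent pip versions can fetch the source archive without installing it; the --no-binary flag forces the source tarball rather than a prebuilt wheel (older pip used pip install --download instead):

```shell
pip download --no-binary :all: numpy
```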
This will create in the current directory a file named something like numpy-1.10.2.tar.gz (the version will certainly change in the future). Unpack it and enter the source directory.
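For the version named above this would be:

```shell
tar xzf numpy-1.10.2.tar.gz
cd numpy-1.10.2
```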
Now create site.cfg file (notice that the name is a bit different here) with the very same content as .numpy-site.cfg above.
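If you already created ~/.numpy-site.cfg earlier, you can simply copy it into the source directory:

```shell
cp ~/.numpy-site.cfg site.cfg
```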
And finally build and install numpy
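The standard two-step build of that era:

```shell
python setup.py build
sudo python setup.py install
```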
Now you can run the test to see how fast your numpy is.