This meta-project provides an easy way to install all of the python tools I typically use. It also serves as a fairly minimal example of setting up a package that pip can install, and of specifying dependencies.
In particular, I structure it for the following use-cases:
One can use and install python in several ways, but I strongly suggest using one of the complete package managers. These provide a convenient way of quickly getting up and running, and can simplify dependency issues. Presently I recommend using Anaconda in the form of a minimal Miniconda installation with custom environments.
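For concreteness, here is a minimal sketch of the Miniconda workflow (the environment name and package list are just placeholders):

# Create a named environment with the packages you need (names are examples):
conda create -n work python numpy scipy matplotlib ipython
# Activate it, install additional packages, and deactivate when done:
source activate work
conda install sympy
source deactivate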
Here are some old notes: I don't think this information is relevant any more, but have not checked carefully, so I am keeping it here.
Anaconda, provided by Continuum Analytics, is a complete python distribution. There are free and professional versions, with the latter being free for academic use. Here are some features:
One caveat is that Anaconda does not work on all systems (it did not work on 32-bit Mac, for example) and lacks some of the GUI tools like Mayavi provided with the Enthought Python Distribution. (It seems to me that lack of Qt support is the issue.)
The Enthought Python Distribution provided by Enthought is another complete python installation including NumPy, SciPy, matplotlib, and many other useful tools. There is both a free version and a professional version -- the latter is free for academic use and includes some additional tools for analysis and visualization.
I do not know the details, but it seems that Continuum Analytics was set up by some developers who left Enthought and aims to provide a set of high-performance tools, whereas Enthought is moving more towards convenience tools with their Canopy platform.
It is perfectly reasonable to install both Anaconda and the Enthought Python Distribution: just adjust your path to make sure you execute the appropriate version of python when you want to switch between the two.
My suggestion is to install one (or both) of these rather than a custom python installation -- they smooth over difficulties with compiling various libraries required especially by matplotlib. The only real difficulty with maintaining both is that you will need to install your needed libraries in each.
If you can, I would recommend creating appropriate virtual environments for managing any additional packages you need to install. In this way, you keep your system python clean, have access to the latest tools in a complete distribution (which is also kept clean), and can play with various package combinations. This is important, for example, if you want to make sure you understand all of the dependencies of your code.
Note that Anaconda has its own environment manager, Conda, instead of virtualenv, so the setups are different. We describe both flavours here.
I install Anaconda in /data/apps/anaconda/1.3.1 which I symlink to /data/apps/anaconda/current. Add /data/apps/anaconda/current/bin to your path. Then use Conda to manage the equivalents of virtual environments, but for now I am just using a "global" environment. I needed to do the following to get to a working state:
conda update anaconda conda ipython pip sympy numexpr
conda pip ipdb winpdb zope.interface mercurial
conda pip psutil memory_profiler
conda pip scikits.bvp1lg theano
conda pip pp
conda pip
Here is the executive summary based on the Enthought Python Distribution:
Install the Enthought Python Distribution (or Anaconda), git, and GSL.
Install virtualenv (and pip, which is not provided by the Enthought Python Distribution):
sudo easy_install pip
sudo pip install virtualenv
or download virtualenv.py and replace virtualenv with python virtualenv.py below if you want to keep your base python installation pure.
Install the virtual environments and set up some aliases:
virtualenv --system-site-packages --distribute ~/.python_environments/epd
virtualenv --no-site-packages --distribute ~/.python_environments/clean
virtualenv -p /usr/bin/python --system-site-packages --distribute \
    ~/.python_environments/sys
virtualenv -p ~/usr/apps/anaconda/Current/bin/python \
    --system-site-packages --distribute \
    ~/.python_environments/anaconda
cat >> ~/.bashrc <<EOF
alias v.epd=". ~/.python_environments/epd/bin/activate"
alias v.sys=". ~/.python_environments/sys/bin/activate"
alias v.clean=". ~/.python_environments/clean/bin/activate"
v.epd
EOF
Install Mercurial:
pip install mercurial
If on a Mac, then fix pythonw:
mkdir -p ~/src/python/git
cd ~/src/python/git
#git clone http://github.com/gldnspud/virtualenv-pythonw-osx.git
git clone http://github.com/nicholsn/virtualenv-pythonw-osx.git
cd virtualenv-pythonw-osx
deactivate; v.epd   # Make sure you use the appropriate virtualenv
python install_pythonw.py /Users/mforbes/.python_environments/epd
Activate your desired virtual environment and choose the set of requirements to install:
v.epd
pip install -r all.txt
Here is a list of various requirements obtained by running pip freeze. These were intended to be used with virtualenv, so I am not sure yet about their relevance when using Anaconda. These are all disjoint, so you can pick and choose.
Here are some additional requirement files:
Installs NumPy, SciPy, and matplotlib from source. Note: this does not work for some reason because pip fails to install some compiled libraries. (The NumPy install will look fine, but SciPy will then fail.) Here is a discussion. To deal with this, first use pip to install this developmental version of NumPy. This will install the source. Then go into the source directory and run python setup.py install --prefix=/path/to/virtualenv. I.e.:
pip install --upgrade -r bleading-edge.txt
cd ~/.python_environments/epd/src/numpy
python setup.py install --prefix=~/.python_environments/epd
Here are detailed instructions using the Enthought Python Distribution:
Install a version of python. Many systems have a version preinstalled, so this step is optional. However, if you plan to do serious development, then I strongly recommend installing the Enthought Python Distribution. There is a free version, and an almost full-featured free version for academic use; you can also pay for a commercial version and receive support. The EPD is very complete and just works on most common platforms, and I highly recommend it. Make sure you can run the version of python you desire.
If you install the EPD, then it will typically add something like the following to your ~/.bash_login or ~/.profile files:
# Setting PATH for EPD-7.3-2
# The orginal version is saved in .bash_login.pysave
PATH="/Library/Frameworks/Python.framework/Versions/Current/bin:${PATH}"
export PATH
MKL_NUM_THREADS=1
export MKL_NUM_THREADS
(If you want to use a multithreaded version of numpy, you will need to change the value of MKL_NUM_THREADS. See this discussion.)
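For example, to let MKL use four threads, change the corresponding lines to something like (the value 4 is just an illustration):

MKL_NUM_THREADS=4   # example value; pick the number of threads you want
export MKL_NUM_THREADS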
Create a virtualenv. This will allow you to install new packages in a controlled manner that will not mess with the system version (or the EPD version). You can create multiple virtual environments for different projects or associated with different versions of python. Again, this is highly recommended. There are several ways of doing this.
Note
Methods 1) and 2) will install virtualenv to the location specified by the current version of python. This means that you might need root access, and it will slightly "muck up" your pristine system install. This is generally not a problem, but if it bothers you, see method 3).
If you have pip (the new python packaging system), then you can use it to install virtualenv as follows:
pip install virtualenv
If you do not have pip, you might have easy_install:
easy_install virtualenv
If you do not want to muck up your system version of python at all, then you can simply download the file virtualenv.py. In the commands that follow, replace virtualenv with python virtualenv.py.
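For example (a hypothetical invocation, assuming virtualenv.py was downloaded to the current directory), the first command of the next step would become:

# Same effect as "virtualenv ..." but without installing virtualenv itself:
python virtualenv.py --system-site-packages --distribute ~/.python_environments/epd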
Set up a virtual environment for your work. You can have many different environments, so you will need to choose a meaningful name. I use "epd" for the EPD version of python, "sys" for the system version of python, and "clean" for a version using EPD but without the site-packages:
virtualenv --system-site-packages --distribute ~/.python_environments/epd
virtualenv --no-site-packages --distribute ~/.python_environments/clean
virtualenv -p /usr/bin/python --system-site-packages --distribute \
    ~/.python_environments/sys
Once this virtualenv is activated, installing packages with pip will place all of the installed files in the ~/.python_environments/epd directory. (You can change this to any convenient location.) The --system-site-packages option allows the virtualenv access to the system libraries (in my case, all of the EPD goodies). If you want to test a system for deployment, making sure that it does not have any external dependencies, then you would use the --no-site-packages option instead. Run virtualenv --help for more information.
Add some aliases to help you activate virtualenv sessions. I include the following in my .bashrc file:
# Some virtualenv related macros
alias v.epd=". ~/.python_environments/epd/bin/activate"
alias v.sys=". ~/.python_environments/sys/bin/activate"
alias v.clean=". ~/.python_environments/clean/bin/activate"
v.epd
You can activate your chosen environment with one of the commands v.epd, v.clean, or v.sys. The default activation script will insert "(epd)" etc. into your prompt:
~ mforbes$ v.epd
(epd)~ mforbes$ v.sys
(sys)~ mforbes$ deactivate
~ mforbes$
To get out of the environments, just type deactivate as shown above.
Note
If you have an older version of IPython (pre 0.13), then you may need to call ipython from a function like this:
# Remap ipython if VIRTUAL_ENV is defined
function ipython {
   if [ -n "${VIRTUAL_ENV}" -a -x "${VIRTUAL_ENV}/bin/python" ]; then
      START_IPYTHON='\
import sys; \
from IPython.frontend.terminal.ipapp import launch_new_instance;\
sys.exit(launch_new_instance())'
      "${VIRTUAL_ENV}/bin/python" -c "${START_IPYTHON}" "$@"
   else
      command ipython "$*"
   fi
}
This deals with the issue that IPython was not virtualenv-aware. The recommended solution is still to install IPython in the virtualenv using pip install ipython, but then you will need one in each environment. As of IPython 0.13, this support is included. (See this PR.)
If you have not used IPython before, then you should have a look. It has some fantastic features like %paste and the IPython notebook interface.
Install mercurial. You may already have this (try hg --version). If not, either install a native distribution (which might have some GUI tools) or install with:
pip install mercurial
Install git. This may not be as easy, but some packages are only available from github.
On Mac OS X you may need to install pythonw for some GUI applications (like RunSnakeRun). You can do this using this solution:
mkdir -p ~/src/python/git
cd ~/src/python/git
git clone http://github.com/gldnspud/virtualenv-pythonw-osx.git
cd virtualenv-pythonw-osx
python install_pythonw.py /Users/mforbes/.python_environments/epd
You will have to do this in each virtualenv you want to use the GUI apps from.
Non-python prerequisites. These need to be installed outside of the python environment for some of the required libraries to work.
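As a hedged sketch (the package managers and package names below are assumptions; adjust them for your system), the git and GSL prerequisites mentioned earlier could be installed with something like:

# Mac OS X with homebrew (package names are assumptions):
brew install git gsl
# or on a Debian/Ubuntu system:
sudo apt-get install git libgsl0-dev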
Install various requirements as follows:
pip install -r requirements/all.txt
Here are some notes about using pip that I did not find obvious.
It is clear from the documentation about requirements that you can specify version-controlled repositories with pip; however, the exact syntax for specifying revisions etc. is not so clear. Examining the source shows that you can specify revisions, tags, etc. as follows:
# Get the "tip"
hg+http://bitbucket.org/mforbes/pymmf#egg=pymmf

# Get the revision with tag "v1.0" or at the tip of branch "v1.0"
hg+https://bitbucket.org/mforbes/pymmf@v1.0#egg=pymmf

# Get the specified revision exactly
hg+https://bitbucket.org/mforbes/pymmf@633be89a#egg=pymmf
What appears after the "@" sign is any valid revision (for mercurial see hg help revision for various options). Unfortunately, I see no way of specifying something like ">=1.1", or ">=633be89a" (i.e. a descendent of a particular revision). (See issue 782)
The EPD is built using the Intel MKL. Here are some instructions on how to compile your own version of NumPy and SciPy with the MKL.
Checkout the source code:
pip install --no-install -e git+http://github.com/numpy/numpy#egg=numpy-dev
pip install --no-install -e git+http://github.com/scipy/scipy#egg=scipy-dev
Setup the environment to use the Intel compilers:
. /usr/local/bin/intel64.sh
. /opt/intel/Compiler/11.1/069/mkl/tools/environment/mklvarsem64t.sh
Edit the site.cfg file in the NumPy source directory. I am not sure exactly which libraries to include. See these discussions:
- http://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl
- Check the site.cfg in your EPD installation.
cd ~/.python_environments/epd/src/numpy
cp site.cfg.example site.cfg
vi site.cfg
Here is what I used:
[mkl]
library_dirs = /opt/intel/Compiler/11.1/069/mkl/lib/em64t/
include_dirs = /opt/intel/Compiler/11.1/069/mkl/include
lapack_libs = mkl_lapack95_lp64
mkl_libs = mkl_def, mkl_intel_lp64, mkl_intel_thread, mkl_core, mkl_mc
I also needed to modify numpy/distutils/intelccompiler.py as follows:
     cc_args = "-fPIC"

     def __init__ (self, verbose=0, dry_run=0, force=0):
         UnixCCompiler.__init__ (self, verbose, dry_run, force)
-        self.cc_exe = 'icc -m64 -fPIC'
+        self.cc_exe = 'icc -O3 -g -openmp -m64 -fPIC'
         compiler = self.cc_exe
         self.set_executables(compiler=compiler,
                              compiler_so=compiler,
Build both NumPy and SciPy with the following:
cd ~/.python_environments/epd/src/numpy
python setup.py config --compiler=intelem --fcompiler=intelem\
                build_clib --compiler=intelem --fcompiler=intelem\
                build_ext --compiler=intelem --fcompiler=intelem\
                install
cd ~/.python_environments/epd/src/scipy
# Repeat the same build/install command for scipy:
python setup.py config --compiler=intelem --fcompiler=intelem\
                build_clib --compiler=intelem --fcompiler=intelem\
                build_ext --compiler=intelem --fcompiler=intelem\
                install
Run and check the build configuration:
$ python -c "import numpy;print numpy.__file__;print numpy.show_config()"
/phys/users/mforbes/.python_environments/epd/lib/python2.7/site-packages/numpy/__init__.pyc
lapack_opt_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
    library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
blas_opt_info:
    libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
    library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
lapack_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
    library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
blas_mkl_info:
    libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
    library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
mkl_info:
    libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
    library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
None
Note
You will need to set up the environment to run with the MKL libraries. The EPD avoids this by distributing the libraries. I suggest that you add the following to the activation script:
cat >> ~/.python_environments/epd/bin/activate <<EOF
# This adds the MKL libraries to the path for use with my custom numpy
# and scipy builds.
. /usr/local/bin/intel64.sh
. /opt/intel/Compiler/11.1/069/mkl/tools/environment/mklvarsem64t.sh
EOF
See also:
http://math.nju.edu.cn/help/mathhpc/doc/intel/mkl/mklgs_lnx.htm
http://blog.sun.tc/2010/11/numpy-and-scipy-with-intel-mkl-on-linux.html
http://www.scipy.org/Installing_SciPy/Linux
This suggests maybe using the runtime libraries instead (just mkl_libs = mkl_rt). I have not yet tried this.
http://cournape.github.com/Bento/
It looks like it might be easier to use Bento rather than distutils.
This section describes various other pieces of software that I use that interact with python.
pyaudio is a python interface to the PortAudio library for generating sounds and sound files. To do real-time sound generation, one really needs the non-blocking interface (otherwise, the delay between blocking calls will affect the signal in a manner that is difficult to compensate for). Unfortunately, the default builds require Mac OS X 10.7 or higher.
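If the binary builds do not work on your system, here is a rough, untested sketch of installing pyaudio against a locally installed PortAudio (the homebrew package name is an assumption, and you may need to point the compiler at your PortAudio location):

# Install the PortAudio C library first (package name is an assumption):
brew install portaudio
# Then build pyaudio against it (set CFLAGS/LDFLAGS if PortAudio lives in a
# non-standard location):
pip install --upgrade pyaudio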
I like to write my local documentation in reStructuredText (such as this file). As I often use math, I make the default role :math: and use MathJax. Here is an example:
.. default-role:: math

Now I can type math like this: `E=mc^2` or in an equation like this

.. math:: \int_0^1 e^{x} = e - 1
Note
Now I can type math like this: `E=mc^2` or in an equation like this
In order to work offline, I install MathJax locally using IPython as described here:
from IPython.external.mathjax import install_mathjax
install_mathjax()
This installs it in ~/.python_environments/epd/lib/python2.7/site-packages/IPython/frontend/html/notebook/static/mathjax which can be used locally. I symlink it to ~/.mathjax, but you must find a way to inject the stylesheet into your HTML. One way is with the .. raw:: html directive:
.. raw:: html

   <script type="text/javascript"
           src="/Users/mforbes/.mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
   </script>
This page has a great discussion of line and memory profiling:
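Independent of that discussion, here is a minimal sketch of how these profilers are typically invoked (assuming line_profiler and memory_profiler are installed, that my_script.py is a placeholder for your code, and that the functions of interest are decorated with @profile):

pip install line_profiler memory_profiler psutil
# Line-by-line timing of @profile-decorated functions
# (the script is called kernprof, without the .py, in newer versions):
kernprof.py -l -v my_script.py
# Line-by-line memory usage of @profile-decorated functions:
python -m memory_profiler my_script.py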
I use Emacs as my principal editor and like to have access to syntax highlighting, auto-completion, etc. Thus, I typically install the following packages, but these are not completely straightforward.
Pymacs allows Emacs to access the python interpreter and is used by Ropemacs to provide some nice features like code checking. The source appears not to be pip installable, so you must download it and run make as follows:
git clone http://github.com/pinard/Pymacs.git
cd Pymacs
make
pip install -e .
Anaconda provides a very nice python system, especially with the Conda package management tool, but there are a few problems:
As an example, here we create a Conda package for installing the FFTW and related software. We start with a fresh Anaconda installation (the following command would show any installed packages that are not managed by Conda):
$ conda package --untracked
prefix: /data/apps/anaconda/1.3.1
Now we manually install the FFTW etc.:
cd ~/src
wget http://www.fftw.org/fftw-3.3.3.tar.gz
wget http://www.fftw.org/fftw-3.3.3.tar.gz.md5sum
md5 fftw-3.3.3.tar.gz       # Check that this is okay
tar -zxvf fftw-3.3.3.tar.gz
cd fftw-3.3.3

# Build and install the single, double, long-double
# and quad-precision versions
PREFIX=/data/apps/anaconda/current/
for opt in " " "--enable-sse2 --enable-single" \
           "--enable-long-double" "--enable-quad-precision"; do
  ./configure --prefix="${PREFIX}"\
              --enable-threads\
              --enable-shared\
              $opt
  make -j8 install
done

# Note: this needs a patch to work on Mac OS X
# https://code.google.com/p/anfft/issues/detail?id=4
export FFTW_PATH=/data/apps/anaconda/current/lib/
pip install --upgrade anfft pyfftw
These are untracked:
$ conda package --untracked
prefix: /data/apps/anaconda/1.3.1
bin/fftw-wisdom
...
include/fftw3.f
...
lib/libfftw3.3.dylib
...
lib/pkgconfig/fftw3.pc
...
lib/python2.7/site-packages/Mako-0.7.3-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/anfft-0.2-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/pyFFTW-0.9.0-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/pyfftw/__init__.py
...
share/info/fftw3.info
...
share/man/man1/fftw-wisdom-to-conf.1
...
These can be bundled into a new package that can later be installed directly:
$ conda package --pkg-name=fftw --pkg-version=3.3.3
prefix: /data/apps/anaconda/1.3.1
Number of files: 82
fftw-3.3.3-py27_0.tar.bz2 created successfully
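I have not verified this, but installing the bundled package into another conda installation should then be something like the following (assuming this version of conda accepts a local package file):

# Untested sketch: install the locally built package from the tarball
conda install ./fftw-3.3.3-py27_0.tar.bz2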