Materials to brush up your Python skills
Python dependency management is a serious topic, as you want to be able to:
During your classes (and future professional life), your most common need will be to import libraries which are not necessarily installed by default, and on computers where you do not necessarily have administrator rights.
import instruction
The import instruction allows you to access code located in separate files (modules). A bundled tree of modules is called a package.
When you type import foo, the Python interpreter will look for:
foo.py, or
foo/ with a file called __init__.py inside.
It will look, in order, in the directories listed in sys.path.
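To see which file the interpreter would pick for a given name, the standard importlib.util.find_spec function can be queried without actually importing the module. The sketch below is only illustrative: the helper name where_is is hypothetical, and json is used simply because it ships with the standard library as a package.

```python
import importlib.util


def where_is(name):
    """Return the file Python would load for `import name`, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # nothing called `name` was found in sys.path
    return spec.origin


# json is a package (a folder), so the path points to its __init__.py file
print(where_is("json"))
# a name that is presumably not installed on your machine
print(where_is("surely_not_an_installed_module"))
```

The first call prints a path located in one of the sys.path folders; the second prints None because the lookup went through every folder without success.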
Try running in your Python interpreter:
>>> import sys
>>> sys.path
[
'',
'/usr/lib/python312.zip',
'/usr/lib/python3.12',
'/usr/lib/python3.12/lib-dynload',
'/home/xo/.local/lib/python3.12/site-packages',
'/usr/local/lib/python3.12/dist-packages',
'/usr/lib/python3/dist-packages'
]
This is just an example with the default Python interpreter shipped with Ubuntu 20.04, and the result will probably be different for you. What is important here is to note that import resolution proceeds in the following order:
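the current directory comes first ('' in the current example);
folders listed in the PYTHONPATH environment variable come next (they would sit between '' and '/usr/lib/python312.zip' in the current example);
the default folders of the Python installation come last.
Separately, the LD_LIBRARY_PATH variable lets you extend the list of folders to check for shared (non-Python) libraries.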
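The effect of PYTHONPATH can be observed by starting a second interpreter with the variable set. This is a minimal sketch, assuming a throwaway temporary folder; the helper name sys_path_with_pythonpath is hypothetical.

```python
import json
import os
import subprocess
import sys
import tempfile


def sys_path_with_pythonpath(folder):
    """Start a fresh interpreter with PYTHONPATH=folder and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=folder)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


with tempfile.TemporaryDirectory() as folder:
    child_path = sys_path_with_pythonpath(folder)
    # the extra folder should now be listed in the child interpreter's search path
    print(folder in child_path)
    # the first entries show where it was inserted
    print(child_path[:3])
```

Setting the variable for one process only, as done here, avoids changing the behaviour of every Python program started from your shell.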
pip
Dealing with dependencies in custom folders does not scale well. Some people like to extend the sys.path list programmatically, but in most cases it is a poor idea.
pip
is an excellent tool provided to install packages (and their dependencies!) from:
pip install .
;pip install pandas
.pip
installs packages in system folders first, but you can use the --user
option to install in local folders. Remember '/home/xo/.local/lib/python3.12/site-packages'
. It is important to ensure that such folder where pip
installs packages is part of your Python sys.path
list of folders, but pip
will warn you if that’s not the case.
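You can check by yourself where the --user option would put packages, and whether that folder is searched by your interpreter. The sketch below relies on the standard site module; note that the folder is only added to sys.path when it exists, and that it is disabled inside virtual environments.

```python
import site
import sys

# folder used by `pip install --user`, e.g. ~/.local/lib/pythonX.Y/site-packages
user_site = site.getusersitepackages()

print("user folder:", user_site)
# False (or None) when the user folder is disabled, e.g. in a virtual environment
print("enabled:", site.ENABLE_USER_SITE)
print("in sys.path:", user_site in sys.path)
```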
pip
installs only Python dependencies. Publishing a package on PyPI (out of the scope of this seminar) is pretty flexible. Also, it does not ensure the global consistency of the Python ecosystem you have installed on your computer, which can become problematic when you install, update, deinstall, reinstall a lot of packages. It is very likely that you will reach a point where your Python is “broken” because of dependency requirement conflicts.
A common solution is to use virtual environments to avoid such issues. It is possible to use those without Anaconda, but since Anaconda solves other dependency problems as well, let’s move to that topic.
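For reference, a virtual environment without Anaconda can be created with the standard venv module. This is only a sketch: the folder name demo-env is hypothetical, the environment is created in a temporary folder that is deleted afterwards, and pip is deliberately left out (with_pip=False) to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"  # hypothetical name, pick your own
    # create an isolated environment with its own site-packages folder
    venv.create(env_dir, with_pip=False)
    # pyvenv.cfg records which base interpreter the environment relies on
    config = (env_dir / "pyvenv.cfg").read_text()
    print(config)
```

Packages installed from within such an environment land in its own folder, so a conflict there does not break the Python used by your other projects.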
You should already have installed Anaconda by now. Anaconda is a distribution providing Python with a number of optional libraries, tools and their dependencies. Where pip is only focused on Python dependencies and requires YOU to provide other non-Python dependencies, Anaconda lets you install other dependencies compiled in C, C++, R or more. Moreover, the global consistency of dependency versions is strictly enforced.
You may then install any missing package as follows:
conda install pandas
If you have administrator rights on your computer, we recommend that you install mamba, a faster implementation of conda:
conda install -c conda-forge -n base mamba
Be careful with pip installing libraries in your Anaconda environment, as versions of dependencies are no longer enforced.
The conda-forge channel is very comprehensive, and even more strict about the consistency of your environment, which is why we recommend that you set the following options once and for all.
conda config --add channels conda-forge
In addition to this strict mechanism, it is considered good practice to work in custom environments. A conda environment sets a specific Python version with chosen dependencies.
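When several environments coexist, it helps to check which one your interpreter belongs to. The sketch below assumes the CONDA_DEFAULT_ENV and CONDA_PREFIX variables, which conda sets when an environment is activated; both are simply missing otherwise.

```python
import os
import sys

# set by `conda activate`; None when no conda environment is active
env_name = os.environ.get("CONDA_DEFAULT_ENV")
env_prefix = os.environ.get("CONDA_PREFIX")

print("active conda environment:", env_name)
print("environment folder:", env_prefix)
print("interpreter in use:", sys.executable)
print("Python version:", sys.version.split()[0])
```

If the interpreter path is not located inside the environment folder, the Python you are running is probably not the one the environment provides.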
Pixi is a tool that makes conda environment management simpler, faster and more efficient.