Python on CS Lab and shared research computers
Why Python Packages are problematic on shared computers
Python has its own way of downloading, storing, and resolving packages. This has advantages, but some of the decisions made about package storage and resolution have led to problems, particularly with how and where packages are stored.
There are a few different locations where these packages can be installed on your system. For example, most system packages are stored in a child directory of the path stored in sys.prefix. Third-party packages installed using pip are typically placed in one of the directories returned by site.getsitepackages().
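To see these locations on a particular machine, a short script along the following lines can help. It is only a sketch: the output depends on the system and on whether a virtual environment is active, and the hasattr fallback is an assumption made for environments built by older versions of the third-party virtualenv tool, where site.getsitepackages() may be missing.

```python
import site
import sys

# Root of the Python installation (or of the active virtual environment).
prefix = sys.prefix

# Directories searched for third-party ("site") packages.
# Fallback to an empty list if this interpreter's site module lacks the call.
site_dirs = site.getsitepackages() if hasattr(site, "getsitepackages") else []

print("sys.prefix:", prefix)
for directory in site_dirs:
    print("site-packages:", directory)
```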
It’s important to know this because, by default, every project on your system will use these same directories to store and retrieve site packages (third party libraries). At first glance, this may not seem like a big deal, and it isn’t really, for system packages (packages that are part of the standard Python library), but it does matter for site packages.
Consider the following scenario where you have two projects: ProjectA and ProjectB, both of which have a dependency on the same package, PackageX. The problem becomes apparent when we start requiring different versions of PackageX. Maybe ProjectA needs v1.0.0, while ProjectB requires the newer v2.0.0, for example.
This is a real problem for Python since it can’t differentiate between versions in the site-packages directory. Both v1.0.0 and v2.0.0 would reside in the same directory under the same name.
Since packages are stored according to just their name, there is no differentiation between versions. Thus, both projects, ProjectA and ProjectB, would be required to use the same version, which is unacceptable in many cases.
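The clash can be illustrated with a toy model. The sketch below is not how pip works internally; toy_install and the VERSION file are hypothetical stand-ins that only mimic the "one directory per package name" layout described above, using a temporary directory in place of a real site-packages folder.

```python
import tempfile
from pathlib import Path


def toy_install(site_packages: Path, name: str, version: str) -> None:
    """Toy stand-in for an installer: one directory per package *name*."""
    package_dir = site_packages / name
    package_dir.mkdir(exist_ok=True)
    # The version is recorded inside the package directory, so installing a
    # second version overwrites the first instead of sitting beside it.
    (package_dir / "VERSION").write_text(version)


with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp)
    toy_install(site_packages, "PackageX", "1.0.0")  # needed by ProjectA
    toy_install(site_packages, "PackageX", "2.0.0")  # needed by ProjectB

    installed_dirs = sorted(p.name for p in site_packages.iterdir())
    visible_version = (site_packages / "PackageX" / "VERSION").read_text()

print(installed_dirs)   # ['PackageX']
print(visible_version)  # 2.0.0
```

Only one PackageX directory exists after both installs, so ProjectA would silently be running against v2.0.0.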
This is where virtual environments and the virtualenv/venv tools come into play.
What Is a Python Virtual Environment
The main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of the dependencies used for other projects.
In the example above, you’d create a separate virtual environment for each of ProjectA and ProjectB, and each environment would be able to depend on whatever version of PackageX it needs, independent of the other.
There are no limits to the number of environments you can have since they’re just directories containing a few scripts. Plus, they’re easily created using the virtualenv or venv command line tools.
Using Virtual Environments
CS Lab and shared research computers should already have the venv module from the standard library installed.
Start by making a new directory to work with:
$ mkdir python-virtual-environments && cd python-virtual-environments
Create a new virtual environment inside the directory:
$ python3 -m venv project_1_env
Note: By default, this will not include any of your existing site packages.
In the above example, this command creates a directory called project_1_env, which contains a directory structure similar to this:
├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── pip
│   ├── pip3
│   ├── pip3.8
│   ├── python -> python3.
│   └── python3 -> /usr/bin/python3
├── include
├── lib
│   └── python3.8
│       └── site-packages
├── pyvenv.cfg
└── share
Here’s what each folder contains:
- bin: files that interact with the virtual environment
- include: C headers used when compiling Python packages
- lib: a copy of the Python version along with a site-packages folder where each dependency is installed
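The same layout can be inspected from Python itself, since the venv module also exposes a create function. The following is a minimal sketch that builds a throwaway environment in a temporary directory and lists its top-level entries. Passing with_pip=False is a choice made here to keep the example quick; it means this environment has no pip, unlike one made with the command above.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "project_1_env"
    # with_pip=False skips bootstrapping pip so the sketch runs quickly.
    venv.create(env_dir, with_pip=False)

    # Collect the names of the folders and files at the top of the env.
    top_level = sorted(p.name for p in env_dir.iterdir())
    has_config = (env_dir / "pyvenv.cfg").is_file()

print(top_level)
print("pyvenv.cfg present:", has_config)
```

The exact entries vary by platform and Python version, so the listing may not match the tree shown above line for line.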
In order to use this environment’s packages/resources in isolation, you need to “activate” it. To do this, just run the following:
$ source project_1_env/bin/activate
(project_1_env) $
Notice how your prompt is now prefixed with the name of your environment (project_1_env, in this case). This is the indicator that project_1_env is currently active, which means the python executable will only use this environment’s packages and settings.
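Besides the prompt, the interpreter itself can report whether it is running from a virtual environment. A small sketch, assuming an environment created with venv (the helper name in_virtual_environment is made up for this example):

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the installation it was created from.
    return sys.prefix != sys.base_prefix


print("inside a virtual environment:", in_virtual_environment())
```

Run with python after activating, this should report True; after deactivating, the system interpreter should report False.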
Now you can use pip to install packages (large packages may take a while to install).
To go back to the “system” context, run deactivate:
(project_1_env) $ deactivate
$
Now your shell session is back to normal, and the python command refers to the global Python install. Remember to do this whenever you’re done using a specific virtual environment.
Using virtual environments makes it easy to reset your Python environment if you need to use a different set of packages. Just delete the old environment and create a new one.
Be sure to clean up and delete unused temp files and environments when you are done using them.
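The delete-and-recreate step can also be scripted. The sketch below uses the clear option of venv.create, which empties an existing environment directory before rebuilding it. The function name reset_environment and the leftover.txt marker file are hypothetical, and with_pip=False is used only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def reset_environment(env_dir: Path) -> None:
    """Empty env_dir if it exists, then build a fresh environment there."""
    # clear=True deletes the directory's existing contents first.
    # with_pip=False keeps this sketch quick; use with_pip=True for real work.
    venv.create(env_dir, clear=True, with_pip=False)


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "project_1_env"
    reset_environment(env_dir)          # first creation
    marker = env_dir / "leftover.txt"   # stand-in for old installed files
    marker.write_text("stale")
    reset_environment(env_dir)          # wipe and recreate
    leftover_survived = marker.exists()
    recreated = (env_dir / "pyvenv.cfg").is_file()

print("leftover survived:", leftover_survived)
print("environment recreated:", recreated)
```

Any packages installed in the old environment are gone after the reset and need to be reinstalled with pip.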
Tips for Using a Python3 IDE with Virtual Environments
Using Spyder in the 244 Lab
- Activate (start) your virtual environment.
- Run spyder
- Go to Tools -> Preferences -> Python interpreter and select the python executable from the virtual environment you want to link to Spyder, e.g. /home/you/envs/your_env/bin/python
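If you are unsure which path to enter in the last step, one option is to ask the environment's own interpreter. This is a small sketch, meant to be run with python while the environment is active:

```python
import sys

# Full path of the interpreter running this script. With the environment
# active, this is the file to select in Spyder's interpreter preferences.
interpreter_path = sys.executable
print(interpreter_path)
```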
Preferred Method on Research Computers: Spyder Method 1 (Use with TensorFlow)
To use Spyder:
- Activate (start) your virtual environment.
- Install the needed packages with pip
- Install Spyder into your environment using pip install spyder.
- Then start spyder from the environment command line with the command spyder.
Spyder Method 2 (Doesn't work well with TensorFlow)
- Activate (start) your virtual environment.
- Run spyder from the environment (after source activate)
- Go to Tools -> Preferences -> Python interpreter and select the python executable from the environment you want to link to Spyder, e.g. /home/you/envs/your_env/bin/python
See Using Python Virtual Environments for more detail.