Python on CS Lab and shared research computers

Why Python Packages are problematic on shared computers

Python has its own way of downloading, storing, and resolving packages. This has advantages, but some of the decisions made about package storage and resolution have led to problems, particularly with how and where packages are stored.

There are a few different locations where these packages can be installed on your system. For example, most system packages are stored in a child directory of the path stored in sys.prefix. Third-party packages installed using pip are typically placed in one of the directories returned by site.getsitepackages().
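To see these locations on a particular machine, something like the following sketch can be run with python3. The variable name site_dirs is only illustrative, and the paths printed will differ from one computer to another.

```python
import site
import sys

# Root of the running Python installation (or of the active virtual environment).
print("sys.prefix:", sys.prefix)

# Directories that hold third-party packages for this interpreter.
site_dirs = site.getsitepackages()
for path in site_dirs:
    print("site-packages:", path)
```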

It’s important to know this because, by default, every project on your system uses these same directories to store and retrieve site packages (third-party libraries). At first glance this may not seem like a big deal, and for system packages (packages that are part of the Python standard library) it isn’t, but it does matter for site packages.

Consider the following scenario where you have two projects: ProjectA and ProjectB, both of which have a dependency on the same package, PackageX. The problem becomes apparent when we start requiring different versions of PackageX. Maybe ProjectA needs v1.0.0, while ProjectB requires the newer v2.0.0, for example.

This is a real problem for Python since it can’t differentiate between versions in the site-packages directory. Both v1.0.0 and v2.0.0 would have to reside in the same directory under the same name.

Since packages are stored according to just their name, there is no differentiation between versions. Thus, both projects, ProjectA and ProjectB, would be required to use the same version, which is unacceptable in many cases.
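The collision can be illustrated with a toy model. The install function and the VERSION file below are hypothetical stand-ins, not how pip actually lays out files; the sketch only assumes what the text says, namely that a package is stored under its name alone.

```python
import tempfile
from pathlib import Path


def install(site_packages: Path, name: str, version: str) -> None:
    """Toy stand-in for an installer: one directory per package *name*."""
    package_dir = site_packages / name
    package_dir.mkdir(exist_ok=True)
    # Recording the version overwrites whatever was recorded before.
    (package_dir / "VERSION").write_text(version)


with tempfile.TemporaryDirectory() as tmp:
    shared_site_packages = Path(tmp)
    install(shared_site_packages, "PackageX", "1.0.0")  # needed by ProjectA
    install(shared_site_packages, "PackageX", "2.0.0")  # needed by ProjectB
    installed_dirs = sorted(p.name for p in shared_site_packages.iterdir())
    visible_version = (shared_site_packages / "PackageX" / "VERSION").read_text()

print(installed_dirs)    # ['PackageX']
print(visible_version)   # 2.0.0
```

Only one PackageX directory exists after both installs, so ProjectA would silently end up with the version ProjectB asked for.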

This is where virtual environments and the virtualenv/venv tools come into play.

What Is a Python Virtual Environment

The main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of the dependencies used for other projects.

In the example above, you’d create separate virtual environments for ProjectA and ProjectB, and each environment would be able to depend on whatever version of PackageX it needs, independent of the other.

There is no limit to the number of environments you can have, since they’re just directories containing a few scripts. Plus, they’re easily created using the virtualenv or venv command line tools.

Using Virtual Environments

CS Lab and shared research computers should already have the venv module from the standard library installed.

Start by making a new directory to work with:

$ mkdir python-virtual-environments && cd python-virtual-environments

Create a new virtual environment inside the directory:

$ python3 -m venv project_1_env

Note: By default, this will not include any of your existing site packages.
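The same step can also be done from Python with the standard library venv module. The sketch below is only an illustration: create_env is a hypothetical helper, the environment is built in a temporary directory, and with_pip=False is used to keep it quick (the command line form installs pip by default). It reads back pyvenv.cfg to show that system site packages are excluded.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> dict:
    """Create a virtual environment and return its pyvenv.cfg as a dict."""
    # with_pip=False skips installing pip so the sketch runs quickly.
    venv.create(env_dir, with_pip=False)
    settings = {}
    for line in (env_dir / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / "project_1_env")

print(cfg.get("include-system-site-packages"))  # false
```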

In the above example, this command creates a directory called project_1_env, which contains a directory structure similar to this:

├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── pip
│   ├── pip3
│   ├── pip3.8
│   ├── python -> python3
│   └── python3 -> /usr/bin/python3
├── include
├── lib
│   └── python3.8
│       └── site-packages
├── pyvenv.cfg
└── share

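Here’s what each folder contains: bin holds the scripts and executables used to interact with the virtual environment (activate, pip, and the python links); include holds C headers used to compile Python packages; lib holds a python3.8 directory whose site-packages folder is where each dependency is installed. The pyvenv.cfg file records settings such as the location of the base Python installation.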

In order to use this environment’s packages/resources in isolation, you need to “activate” it. To do this, just run the following:

$ source project_1_env/bin/activate
(project_1_env) $

Notice how your prompt is now prefixed with the name of your environment (project_1_env, in this case). This is the indicator that project_1_env is currently active, which means the python executable will only use this environment’s packages and settings.
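Besides the prompt, Python itself can report whether it is running inside a virtual environment. A minimal sketch, assuming an environment created with venv; the function name in_virtual_environment is just illustrative:

```python
import sys


def in_virtual_environment() -> bool:
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
```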

Now you can use pip to install packages (large packages may take a while to install).
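After installing, one way to check which packages the active interpreter can see is the standard library importlib.metadata module (Python 3.8 or newer). This is a rough sketch; installed_packages is a hypothetical helper, and in a fresh environment the list will be short.

```python
from importlib import metadata


def installed_packages() -> list:
    """Return (name, version) pairs visible to the running interpreter, sorted by name."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata is incomplete
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name}=={version}")
```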

To go back to the “system” context, run deactivate:

(project_1_env) $ deactivate
$

Now your shell session is back to normal, and the python command refers to the global Python install. Remember to do this whenever you’re done using a specific virtual environment.

Using virtual environments makes it easy to reset your Python environment if you need to use a different set of packages. Just delete the old environment and create a new one.
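The delete-and-recreate step might be sketched as follows. The helper name reset_env and the leftover.txt file are illustrative assumptions, and the sketch works in a temporary directory rather than a real project; deleting the directory by hand and running python3 -m venv again has the same effect.

```python
import shutil
import tempfile
import venv
from pathlib import Path


def reset_env(env_dir: Path) -> None:
    """Delete env_dir if it exists, then create a fresh, empty environment."""
    if env_dir.exists():
        shutil.rmtree(env_dir)
    # with_pip=False keeps the sketch quick; drop it to get pip installed.
    venv.create(env_dir, with_pip=False)


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "project_1_env"
    reset_env(env_dir)
    leftover = env_dir / "leftover.txt"  # stand-in for old installed files
    leftover.write_text("stale")
    reset_env(env_dir)
    leftover_survived = leftover.exists()
    env_recreated = (env_dir / "pyvenv.cfg").exists()

print(leftover_survived, env_recreated)  # False True
```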

Be sure to clean up (delete) unused temporary files and environments when you are done using them.

Tips for Using a Python3 IDE with Virtual Environments

Using Spyder in the 244 Lab

  1. Activate (start) your virtual environment.
  2. Run spyder.
  3. Go to Tools -> Preferences -> Python interpreter and select the python executable from the virtual environment you want to link to Spyder, for example /home/you/envs/your_env/bin/python
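If you are unsure which path to select in the last step, one option is to ask the environment's own interpreter. A small sketch, assuming it is run with python after the environment has been activated; the variable name interpreter_path is illustrative:

```python
import sys
from pathlib import Path

# Full path of the interpreter running this code. With an environment
# activated, this is the python inside that environment's bin directory.
interpreter_path = Path(sys.executable)
print(interpreter_path)
```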

Preferred Method on Research Computers: Spyder Method 1 (Use with TensorFlow)

To use Spyder with this method:

  1. Activate (start) your virtual environment.
  2. Install the needed packages with pip.
  3. Install Spyder into your environment using pip install spyder.
  4. Start Spyder from the environment command line with the command spyder.

Spyder Method 2 (Doesn't work well with TensorFlow)

  1. Activate (start) your virtual environment.
  2. Run spyder from the environment (after sourcing the activate script).
  3. Go to Tools -> Preferences -> Python interpreter and select the python executable from the environment you want to link to Spyder, for example /home/you/envs/your_env/bin/python

See Using Python Virtual Environments for more detail.