====== Python on CS Lab and shared research computers ======

===== Why Python Packages are problematic on shared computers =====

Python has its own way of downloading, storing, and resolving packages. While this has its advantages, some of the decisions made about package storage and resolution have led to problems, particularly with how and where packages are stored.

There are a few different locations where these packages can be installed on your system. For example, most system packages are stored in a child directory of the path stored in ''sys.prefix''. Third party packages installed using pip are typically placed in one of the directories returned by ''site.getsitepackages()''.

It’s important to know this because, by default, every project on your system will use these same directories to store and retrieve site packages (third party libraries). At first glance, this may not seem like a big deal, and it isn’t really for system packages (packages that are part of the standard Python library), but it does matter for site packages.

Consider the following scenario where you have two projects, **ProjectA** and **ProjectB**, both of which depend on the same package, **PackageX**. The problem becomes apparent when the projects require different versions of **PackageX**. Maybe **ProjectA** needs v1.0.0, while **ProjectB** requires the newer v2.0.0. This is a real problem for Python since it can’t differentiate between versions in the site-packages directory: both v1.0.0 and v2.0.0 would reside in the same directory under the same name.

Since packages are stored according to just their name, there is no differentiation between versions. Thus both projects, **ProjectA** and **ProjectB**, would be required to use the same version, which is unacceptable in many cases.

This is where virtual environments and the virtualenv/venv tools come into play.
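To see the locations described above on a particular machine, the following is a minimal sketch that prints them. The helper name ''package_locations'' is made up for this illustration; ''sys.prefix'', ''site.getsitepackages()'' and ''site.getusersitepackages()'' are standard library calls. The exact paths printed will differ from system to system.

```python
import site
import sys


def package_locations():
    """Collect the interpreter prefix and the site-packages directories.

    `package_locations` is a hypothetical helper name used only here.
    """
    return {
        # Root directory of the running Python installation (or environment).
        "prefix": sys.prefix,
        # Directories where pip normally installs third party packages.
        "site_packages": site.getsitepackages(),
        # Per-user directory used by `pip install --user`.
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for name, value in package_locations().items():
        print(f"{name}: {value}")
```

Running this once with the system Python and once inside an activated virtual environment should show different values for ''prefix'' and ''site_packages''.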
===== What Is a Python Virtual Environment =====

The main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of the dependencies used by other projects.

In the example above, you’d create a separate virtual environment for each of **ProjectA** and **ProjectB**, and each environment would be able to depend on whatever version of **PackageX** it needs, independent of the other.

There are no limits to the number of environments you can have, since they’re just directories containing a few scripts. Plus, they’re easily created using the virtualenv or ''venv'' command line tools.

===== Using Virtual Environments =====

CS Lab and shared research computers should already have the ''venv'' module from the standard library installed.

Start by making a new directory to work with:

  $ mkdir python-virtual-environments && cd python-virtual-environments

Create a new virtual environment inside the directory:

  $ python3 -m venv project_1_env

**Note:** By default, this will not include any of your existing site packages.

In the above example, this command creates a directory called ''project_1_env'', which contains a directory structure similar to this:

  ├── bin
  │   ├── activate
  │   ├── activate.csh
  │   ├── activate.fish
  │   ├── easy_install
  │   ├── easy_install-3.8
  │   ├── pip
  │   ├── pip3
  │   ├── pip3.8
  │   ├── python -> python3.
  │   └── python3 -> /usr/bin/python3
  ├── include
  ├── lib
  │   └── python3.8
  │       └── site-packages
  ├── pyvenv.cfg
  └── share

Here’s what each folder contains:

  * ''bin'': files that interact with the virtual environment
  * ''include'': C headers used to compile Python packages
  * ''lib'': a copy of the Python version along with a ''site-packages'' folder where each dependency is installed

In order to use this environment’s packages/resources in isolation, you need to “activate” it.
To do this, just run the following:

  $ source project_1_env/bin/activate
  (project_1_env) $

Notice how your prompt is now prefixed with the name of your environment (''project_1_env'', in this case). This is the indicator that ''project_1_env'' is currently active, which means the ''python'' executable will only use this environment’s packages and settings.

Now you can use pip to install packages (large packages may take a while to install).

To go back to the “system” context, run ''deactivate'':

  (project_1_env) $ deactivate
  $

Now your shell session is back to normal, and the ''python'' command refers to the global Python install. Remember to do this whenever you’re done using a specific virtual environment.

Using virtual environments makes it easy to reset your Python environment if you need a different set of packages: just delete the old environment and create a new one.

**Be sure to clean up and delete unused temp files and environments when you are done using them.**

===== Tips for Using a Python3 IDE with Virtual Environments =====

==== Using Spyder in the 244 Lab ====

  - Activate (start) your virtual environment.
  - Run ''spyder''.
  - Go to Tools --> Preferences --> Python interpreter and select the python file from the virtual environment you want to link to Spyder, e.g. ''/home/you/envs/your_env/bin/python''

==== Preferred Method on Research Computers: Spyder Method 1 (Use with tensorflow) ====

To use Spyder:

  - Activate (start) your virtual environment.
  - Install the needed packages with pip.
  - Install Spyder into your environment using ''pip install spyder''.
  - Then start Spyder from the environment command line with the command ''spyder''.

==== Spyder Method 2 (Doesn't work well with tensorflow) ====

  - Activate (start) your virtual environment.
  - Run ''spyder'' from the environment (after sourcing the ''activate'' script).
  - Go to Tools --> Preferences --> Python interpreter and select the python file from the environment you want to link to Spyder, e.g. ''/home/you/envs/your_env/bin/python''

See [[https://realpython.com/python-virtual-environments-a-primer/|Using Python Virtual Environments]] for more detail.
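
When linking Spyder to an environment, it can help to confirm which interpreter is actually running. The following is a minimal sketch that can be pasted into the Spyder console or run from a shell. The helper name ''in_virtual_env'' is made up for this illustration; it relies on the fact that, inside an environment created with ''venv'', ''sys.prefix'' points at the environment while ''sys.base_prefix'' points at the underlying Python install.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a venv.

    `in_virtual_env` is a hypothetical helper name used only here.
    Outside a virtual environment, sys.prefix equals sys.base_prefix.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Path of the interpreter executing this code.
    print("interpreter:", sys.executable)
    # Environment (or system) root that packages are resolved against.
    print("prefix:", sys.prefix)
    print("virtual environment active:", in_virtual_env())
```

If the interpreter path shown is not the one inside your environment's ''bin'' directory, Spyder is still using a different Python.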