conda – Tumurtogtokh Davaakhuu

Reading Time: 6 minutes

Nowadays Python is everywhere – academics, data science, machine learning, enterprise application, web application, scripting you name it python is everywhere. Whatever you do, python is there to help you or give you a headache. Let us say, you have learnt python programming and ready to use to develop applications that wow your future employers and make your future glorious. Surely, as that is great, you jump into coding python scrips and eventually start installing python packages. From there one follows a dangerous path into developer’s nightmare.

Package installation may lead to having incompatibility issues or make other application unworkable. And you may discover that your code does not work on some machines, while your code just works flawlessly on your local machine. Even though it gives a classic excuse for not writing good code, we, as developers, should make sure our code works everywhere. So what is a problem here? Python environment.

So, not to have a daily nightmare about incompatibility issues, individual python environment needs to be created for a project. A virtual environment is a bunch of scripts and directories that can run python isolated. By using a virtual environment, each python projects can have its own dependencies regardless of other project and system python environment. Here I will write about how to create and use virtual environment in python2 and python3. I will use virualenv for python2 and venv, virtualenvwrapper, and conda for python3.

Starting with python2, you need to install virtualenv package.

See, I already have installed it. After installing it, you can see its commands by:

virtualenv
-h

We do not need to use most of those. To create a virtual environment, we use virtualenv [options] destination. Specifically, call

virtualenv –python=python2 path/to/virtual/environment/dir

Obviously, we can create a python3 virtual environment by –python=python3 assuming you have python3 installed on your machine. This command creates an isolated virtual environment as same as a given python path. Instead of using python2 and python3, we can give a direct path.

Once a virtual environment is created, we can use it by calling:

source path/to/virtual/environment/bin/activate

And calling deactivate when a virtual environment is activated, leaves the virtual environment. On python3, it has venv package as default for a virtual environment. To use it, we just need to call

python3 -m venv path/to/virtual/environment

Everything else is same. What we have done is that we created a fresh python environment and copied dependencies to a specified directory. There we can install any packages without affecting other environment and local python by activating this environment. That is an absolute breeze.

However, we do not want to push it git repository but we do need to have a way of knowing required dependencies since some of us work on a project from different machines. I, for one, work from three different machines with windows and linux system.

One way to solve this problem is to create a folder for the virtual environment and ignore it for git. That way we can work on a project from different machines and on different os. We just have to keep a track of required dependencies. Using pip, we can save information of installed dependencies into a file and install those dependencies on a different environment when need to. Pip freeze command shows packages installed on the python environment. We need to save it to a file. It is a good practice to save it as ‘requirement.txt’. Later, on a different machine, we install required dependencies by using

pip install -r requirement.txt

Now we are in a very convenient position where we can work on a python project with any machines in any os. But what is the .env on my terminal and what about the occupied disk space of virtual environment? Obviously, those virtual environment files are necessary. However, when you have a number of completed python project and you want to release some spaces for some reason, would you check each projects and delete those files manually? Or wouldn’t it be easier if all virtual environment files are in one directory? And .env is a directory name where virtual environment is created. Ok, cool. But wouldn’t it be more convenient to see an actual python project name when the virtual environment is being used?

Virtual environment wrapper

To solve those minor inconveniences, we can use virtualenvwrapper package. Install this package with:

pip install virtualenvwrapper

after installing it, we need to activate its shell functions and create an environment variable that indicates a location of virtual environments. By running which virtualenvwrapper.sh, we get a path of its shell functions. We need to add it to shell startup. Since I am using ubuntu, I added it to ~/.bashrc. I also created environment variable for a location where I save virtual environments. So far, I added those lines to .bashrc:

export WORKON=$HOME/.virtualenvs 
source /usr/local/bin/virtualenvwrapper.sh

So now, you can create a virtual environment with:

mkvirtualenv -p python3 name/of/virtual/env

And activate it and use it. Also, virtualenvwrapper provides a few useful shell functions. One of those is workon. See, it makes life easier. To see a complete guide, visit this page.

Because it saves all files of every virtual environment created, it is easy to delete those if need to. And seeing the project name on terminal is pretty cool actually. That is it for creating and managing virtual environments using pip.

Using Conda

Now let’s see how to do it with conda. Conda is a package management system for Python, R, Lua, Scala, Java, JavaScript, and Fortran. Widely used in academic fields, we may know it by Anaconda. And people claim that conda is better tool for data science projects (I don’t know why. Read it here). Anaconda distribution is used by universities and it has its own GUI environment management tool with its navigator. However, not every developer loves to use gui-based tools, right?

Assuming conda is installed (if not install it from here), to create a conda environment

conda create --name name/of/env python=3.x

and activate the created environment with

conda activate name/of/env

Conda creates virtual environment and stores all related files to its installation location. Which makes it easy to manage. To see all conda environments, we can call:

conda info --envs

To remove particular environment, we call

conda remove -- name name/of/env -- all

Conda stores all environment names inside environment.txt in home directory/.conda folder. A content of it is same as conda info –envs command. On the other hand, all executables and packages are stored inside anaconda installation directory/envs folder. It is useful to know it since anaconda stores GBs of files (unbelievable, right?).

It is not the only way to create a conda virtual environment. We can save required information inside a yml file and use it to create conda environment. Generally, people create environment.yml and put python info, environment name, and dependencies there. With that yml file, we can create a virtual environment easier by calling

conda env create -f environment.yml

Updating environment.yml is easy, just need to write what needs to put there and call

conda env update -f environment.yml

You can confirm it by conda list command. I know it prints a tremendous amount of packages. For love of the god, I hope conda does use all of them effectively. To distribute our project, we need to save dependencies information. Calling conda env export shows you all required information to save. We need to copy and paste it into requirement.yml (for ubuntu, simply run conda env export > environment.yml).

So that is it. Now I hope it becomes easier to manage python virtual environment. From here, we can write our application with ease and eventually package it and distribute it. That is another story to tell some other time.

Tag: conda

Developing Python application: Virtual environment

Virtual environment wrapper

Using Conda