4.3. Libraries in Python

Dr. W.J.B. Mattingly
Smithsonian Data Science Lab and United States Holocaust Memorial Museum
January 2022

4.3.1. Covered in this Chapter

  1. External Libraries

  2. How to Install Libraries

  3. How to Import Libraries

4.3.2. Introduction

Libraries are common across all programming languages. They allow you to import large amounts of code containing functions and classes that you can leverage in your own code. A good way to think about a library is through the old expression “don’t reinvent the wheel”. Use what others have done! Open-source is open for a reason! People who make these libraries do it so that you don’t have to solve certain problems yourself. They’ve done it for you.

In this chapter, we will learn how to install and import libraries. We are doing this now because the remainder of this textbook will require external libraries, specifically pandas, requests, and BeautifulSoup.

4.3.3. How to Install Python Libraries

Python comes with pip preinstalled. Pip is a package manager. It downloads, installs, and manages different versions of software on your machine. If you are coming to this textbook from Windows, this concept might be a bit foreign. Think of pip as something that will manage all your libraries for you. If you have installed Python via Anaconda, as I recommended, then pip will be automatically put into your system’s Path. Path on Windows is the list of folders that your computer searches when you type a command, which is how it finds the exe files on your system. To check that pip is available, open the terminal and type “pip --version”. On Windows, the terminal is Command Prompt.

Because I am using Jupyter Notebook for this textbook, I can run terminal commands by putting “!” before a command in a code cell. When you type “pip --version”, you should see something like the following output.

!pip --version
pip 21.2.4 from /home/wjbmattingly/anaconda3/lib/python3.9/site-packages/pip (python 3.9)
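If the command is not recognized, it can help to ask Python itself where it finds pip. The following is a minimal sketch using only the standard library; the function name describe_pip_setup is my own and not part of any library. It looks up the pip command on your Path and reports which Python interpreter is currently running.

```python
import shutil
import sys


def describe_pip_setup():
    """Report where the pip command and the running Python interpreter live.

    describe_pip_setup is a hypothetical helper written for this example.
    """
    return {
        # shutil.which searches the folders on Path, as the terminal does.
        # It returns None when no pip command can be found there.
        "pip_on_path": shutil.which("pip"),
        # The full path of the Python interpreter running this code.
        "python_executable": sys.executable,
    }


info = describe_pip_setup()
print(info)
```

If "pip_on_path" is None, pip is probably not on your Path, which would explain why the terminal cannot find the command.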

If pip is working, then it means we can install the libraries that we will need for the remainder of the textbook. We will require three libraries:

  • pandas

  • requests

  • BeautifulSoup

To install a library, you use the pip command “pip install library_name”, like so:

!pip install pandas

If pip finishes without reporting an error, then pandas has installed correctly. Let’s now do the same thing for requests and BeautifulSoup.

!pip install requests
Requirement already satisfied: requests in /home/wjbmattingly/anaconda3/lib/python3.9/site-packages (2.27.1)
Requirement already satisfied: idna<4,>=2.5 in /home/wjbmattingly/anaconda3/lib/python3.9/site-packages (from requests) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/wjbmattingly/anaconda3/lib/python3.9/site-packages (from requests) (1.26.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/wjbmattingly/anaconda3/lib/python3.9/site-packages (from requests) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /home/wjbmattingly/anaconda3/lib/python3.9/site-packages (from requests) (2021.10.8)

If you have already installed a library, the output will look something like the output above.
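The same install step can also be run from inside a Python script rather than the terminal. The sketch below is one possible approach, not the only one: it builds the command “python -m pip install library_name” using the interpreter that is currently running, so the library lands in that same Python installation. The helper names pip_install_command and install are hypothetical and chosen for this example.

```python
import subprocess
import sys


def pip_install_command(library_name):
    """Build the command list for installing one library with pip.

    Using sys.executable with "-m pip" runs the pip that belongs to the
    Python interpreter executing this script.
    """
    return [sys.executable, "-m", "pip", "install", library_name]


def install(library_name):
    """Run the install command; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(library_name))


# Show the command without running it.
command = pip_install_command("pandas")
print(command)

# Uncomment the next line to actually install pandas.
# install("pandas")
```

The call to install is left commented out so that running the example does not change anything on your machine.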

For BeautifulSoup, we need to say “pip install beautifulsoup4” (I will explain why later in this textbook).

!pip install beautifulsoup4
Collecting beautifulsoup4
  Downloading beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
     |████████████████████████████████| 97 kB 2.5 MB/s eta 0:00:011
Collecting soupsieve>1.2
  Downloading soupsieve-2.3.1-py3-none-any.whl (37 kB)
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.10.0 soupsieve-2.3.1
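A quick way to confirm that all three libraries are available is to ask Python whether it can find them, without importing them. The sketch below assumes the standard library's importlib.util.find_spec; the dictionary and function names are my own. Note that the name you install with pip is not always the name you import: beautifulsoup4 is imported as bs4.

```python
import importlib.util

# Maps the name used with "pip install" to the name used with "import".
# This mapping is written out by hand for this example.
PIP_TO_IMPORT_NAME = {
    "pandas": "pandas",
    "requests": "requests",
    "beautifulsoup4": "bs4",
}


def check_installed(pip_to_import_name):
    """Return a dict of pip name -> True/False for whether it can be imported."""
    results = {}
    for pip_name, import_name in pip_to_import_name.items():
        # find_spec returns None when no module with this name can be found.
        results[pip_name] = importlib.util.find_spec(import_name) is not None
    return results


status = check_installed(PIP_TO_IMPORT_NAME)
for pip_name, is_installed in status.items():
    print(pip_name, "installed" if is_installed else "missing")
```

Any library reported as missing can be installed with the pip command shown earlier in this section.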

And that’s it! You have installed three new Python libraries! We need to learn one last thing, though, before we conclude this chapter. We need to learn how to import libraries within a Python script.

4.3.4. How to Import a Library

To import a library, we use the statement “import library_name”, like so:

import requests

Sometimes, when we work with libraries, there is a conventional, Pythonic way to import the library. A classic case is pandas. With pandas, we import it “as pd”. The “as pd” means that the name in the script will not be “pandas”, but rather “pd”.

import pandas as pd
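To see what “as” does, here is a minimal sketch that uses the statistics module from Python's standard library in place of pandas, so that it runs even if nothing has been installed yet. The alias “stats” is simply a name I picked for this example; it is not a community convention in the way that “pd” is.

```python
import statistics            # plain import: use the full module name
import statistics as stats   # aliased import: "stats" is a second name for the same module

values = [1, 2, 3]

# Both names point to the same module, so both calls do the same thing.
full_name_mean = statistics.mean(values)
alias_mean = stats.mean(values)

# "is" checks that the two names refer to one and the same module object.
same_module = stats is statistics

print(full_name_mean, alias_mean, same_module)
```

The alias changes only the name you type in your script. The library itself is the same either way.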

4.3.5. Conclusion

That’s all we need to cover in this chapter about libraries. You should feel a bit more comfortable about what libraries are, why they are useful, how to install them, and how to import them. As we continue through the final parts of this textbook, you will become more comfortable with libraries.