4.2. Graphing Network Data with Pandas

4.2.1. Getting the Data from Pandas to NetworkX

Pandas on its own cannot plot network data. Instead, we must rely on two other libraries, NetworkX and Matplotlib. NetworkX is the standard Python library for working with networks. I have a forthcoming textbook, like this one, that walks users through NetworkX. Matplotlib is one of the standard plotting libraries. The purpose of this brief section is to provide the code needed to make Pandas work with NetworkX and Matplotlib, so that relationships stored in a Pandas DataFrame can be turned into network graphs. We will address social networks in greater detail in Part 4 of this textbook.

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

Let’s now load our data and see what it looks like.

df = pd.read_csv("data/network.csv")
df
   source  target
0     Tom    Rose
1    Rose  Rosita
2   Jerry    Jeff
3    Jeff   Larry
4  Carmen  Carmen
5  Rosita  Rosita
6   Larry  Carmen
7   Larry   Jerry

This is a fairly standard format for network data. We have two columns, a source and a target. Imagine drawing a line between two people: the source is where you start drawing the line and the target is where the line ends. In network theory this is known as direction, and it is important for understanding the relationship between nodes, the individual points in a network graph. The lines that connect nodes are called edges.

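We can use NetworkX’s built-in function from_pandas_edgelist() to turn this edge list directly into a NetworkX graph. By default the result is an undirected graph, so the direction from source to target is not kept.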

G = nx.from_pandas_edgelist(df, "source", "target")
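Before plotting, it can help to check what the graph actually contains. The sketch below rebuilds the same rows inline so that it runs without data/network.csv, then counts the nodes, edges, and self-loops. The second graph, D, and the create_using=nx.DiGraph argument are not part of the original code. They are shown only as one way to keep the source-to-target direction if you need it.

```python
import pandas as pd
import networkx as nx

# Same rows as data/network.csv, rebuilt inline so the example is self-contained.
df = pd.DataFrame({
    "source": ["Tom", "Rose", "Jerry", "Jeff", "Carmen", "Rosita", "Larry", "Larry"],
    "target": ["Rose", "Rosita", "Jeff", "Larry", "Carmen", "Rosita", "Carmen", "Jerry"],
})

# Default behavior: an undirected nx.Graph, so source/target order is not kept.
G = nx.from_pandas_edgelist(df, "source", "target")
print(G.number_of_nodes(), G.number_of_edges())  # 7 8
print(nx.number_of_selfloops(G))                 # 2 (Carmen-Carmen, Rosita-Rosita)
print(G.has_edge("Rose", "Tom"))                 # True, even though the row is Tom -> Rose

# Illustrative variant (an addition, not from the original): keep edge direction.
D = nx.from_pandas_edgelist(df, "source", "target", create_using=nx.DiGraph)
print(D.has_edge("Tom", "Rose"), D.has_edge("Rose", "Tom"))  # True False
```

Note that two rows in the data point a node at itself. NetworkX keeps these as self-loops rather than discarding them.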

4.2.2. Graphing the Data

With just two more lines of code, we can plot that data.

nx.draw(G)
plt.show()
[Figure: the network drawn with nx.draw(G); the nodes are unlabeled]
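If you run the two lines above more than once, the nodes will likely land in different places each time, because the default layout starts from random positions. A minimal sketch of one way to get a repeatable picture is to compute the positions yourself with nx.spring_layout() and a fixed seed, then pass them to nx.draw(). The seed value 42 and the variable name pos are arbitrary choices for illustration, and the graph is rebuilt inline so the example is self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import networkx as nx

# The same edges as in the DataFrame above, written out inline.
G = nx.Graph([
    ("Tom", "Rose"), ("Rose", "Rosita"), ("Jerry", "Jeff"), ("Jeff", "Larry"),
    ("Carmen", "Carmen"), ("Rosita", "Rosita"), ("Larry", "Carmen"), ("Larry", "Jerry"),
])

# Compute node positions once, with a fixed seed so the layout is repeatable.
pos = nx.spring_layout(G, seed=42)

# Reuse the same positions every time the graph is drawn.
nx.draw(G, pos=pos)
plt.show()
```

Passing the same pos to later calls also keeps the nodes in place while you experiment with labels and colors.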

4.2.3. Customize the Graph

We have a problem with the image above, however: it is difficult to tell whom the nodes represent. Let’s give them some labels.

nx.draw(G, with_labels=True)
plt.show()
[Figure: the same network with a name label on each node]

Now that we have labels, we need to make them a bit easier to read. We can do this by changing the font color to “whitesmoke” and setting the background to gray. To achieve this, we first create a fig object so that we can set attributes on the figure. Next, we draw the network graph and pass it the font_color we want. Finally, we set the figure’s facecolor to gray and plot it. We set the facecolor after drawing because nx.draw() resets the figure background to white.

fig = plt.figure()
nx.draw(G, with_labels=True, font_color="whitesmoke")
fig.set_facecolor('gray')
plt.show()
[Figure: the labeled network with whitesmoke label text on a gray background]

What if we want each node in our network to have its own color? We can do that too, by giving each node a number that Matplotlib maps to a color.

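# Give each node a distinct integer; Matplotlib maps these numbers to colors.
val = list(range(len(G.nodes)))
# set_node_attributes expects a dict keyed by node, so pair each node with its value.
nx.set_node_attributes(G, dict(zip(G.nodes, val)), 'val')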
fig = plt.figure()
nx.draw(G, with_labels=True, node_color=val, font_color="whitesmoke")
fig.set_facecolor('gray')
plt.show()
[Figure: the labeled network with each node drawn in a different color]
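The numbers passed to node_color are matched to nodes by position, so they need to follow the order in which G.nodes lists the nodes. The sketch below makes that pairing explicit with a dictionary and also names the colormap via the cmap argument. The choice of plt.cm.viridis and the variable name node_color are illustrative assumptions, and the graph is rebuilt inline so the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import networkx as nx

# The same edges as in the DataFrame above, written out inline.
G = nx.Graph([
    ("Tom", "Rose"), ("Rose", "Rosita"), ("Jerry", "Jeff"), ("Jeff", "Larry"),
    ("Carmen", "Carmen"), ("Rosita", "Rosita"), ("Larry", "Carmen"), ("Larry", "Jerry"),
])

# Map each node name to an integer, then list the values in G.nodes order.
val = {node: i for i, node in enumerate(G.nodes)}
node_color = [val[node] for node in G.nodes]

fig = plt.figure()
# cmap selects which Matplotlib colormap turns the integers into colors.
nx.draw(G, with_labels=True, node_color=node_color,
        cmap=plt.cm.viridis, font_color="whitesmoke")
fig.set_facecolor("gray")
plt.show()
```

Keeping the values in a dictionary keyed by node name makes it easy to change one node’s value later without worrying about list positions.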