2.3. Streamlit Cache and Session States

Building more modular, custom, and dynamic applications requires knowing how to use Streamlit's more advanced features for working with data stored in memory. Streamlit offers two ways to keep data in memory: caching it with the @st.cache() decorator, or storing it in st.session_state.

2.3.1. Caching Data with st.cache

In large data-driven projects, run time becomes an issue with Streamlit, because Streamlit reruns the Python file each time something changes in the application. With large datasets, this means that every user interaction forces Streamlit to reload all the data. For this reason, it is essential to know how to store large datasets (or models) in cache so that Streamlit does not need to reload memory-intensive data or models each time it reruns.

We can cache our data by placing the @st.cache() decorator above a function that loads the data. To load our Titanic dataset and keep it in memory, we would use the following code snippet.

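import pandas as pd
import streamlit as st

@st.cache()
def load_df():
    # Runs on the first call only; later reruns reuse the cached DataFrame.
    df = pd.read_csv("./data/titanic.csv")
    return df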

This is precisely the code that we will walk through when we create our first application in Streamlit later in this part of the textbook.
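The effect of caching can be illustrated without Streamlit. The sketch below is only an analogy: it uses functools.lru_cache from the Python standard library as a stand-in for @st.cache(), and the names load_rows, run_script, and load_calls are hypothetical. No file is actually read; the placeholder tuple stands in for the loaded dataset. Note also that recent Streamlit releases deprecate st.cache in favor of st.cache_data and st.cache_resource, which follow the same decorator pattern.

```python
from functools import lru_cache

load_calls = 0  # counts how often the expensive load really runs


@lru_cache(maxsize=None)
def load_rows(path):
    """Pretend to read a large file; the body only runs on a cache miss."""
    global load_calls
    load_calls += 1
    return tuple(range(5))  # placeholder for the loaded data


def run_script():
    """Simulate one top-to-bottom rerun of the app script."""
    rows = load_rows("./data/titanic.csv")
    return len(rows)


# Three simulated reruns, as if the user interacted three times.
for _ in range(3):
    run_script()

print(load_calls)  # the loader body ran once, not three times
```

Without the decorator, the loader body would run on every simulated rerun, which is the cost that caching is meant to avoid.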

2.4. Storing Data with st.session_state

Aside from caching large data, we can also preserve values between reruns with st.session_state. Session State gives an application greater flexibility. It functions like a dictionary whose contents persist across reruns within a user's session. This means that if the app reruns because the user interacted with it, a variable stored in the session state keeps its value instead of being reset.
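The difference between an ordinary variable and a value kept in the session state can be sketched in plain Python. This is not Streamlit's implementation: FakeSessionState is a hypothetical stand-in that, like st.session_state, accepts both key-style and attribute-style access, and run_script simulates a single rerun of an app.

```python
class FakeSessionState(dict):
    """Hypothetical stand-in for st.session_state: a dict with attribute access."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


def run_script(session_state):
    """Simulate one top-to-bottom rerun of an app script."""
    local_counter = 0  # ordinary variable: recreated on every rerun
    if "reruns" not in session_state:
        session_state["reruns"] = 0  # initialise the key once
    local_counter += 1
    session_state.reruns += 1  # attribute-style access to the same key
    return local_counter, session_state["reruns"]


state = FakeSessionState()  # lives for the whole simulated session
results = [run_script(state) for _ in range(3)]
print(results)  # [(1, 1), (1, 2), (1, 3)]
```

The ordinary variable resets on every run, while the value held in the stand-in state keeps accumulating.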

This is essential for more complex data-driven applications. Let’s consider the simple example that we saw earlier in this chapter when we examined the st.metric() widget.

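import streamlit as st

# Create the key on the first run only; 5 is the word count of the default text.
if "prev_word_count" not in st.session_state:
    st.session_state["prev_word_count"] = 5
text = st.text_area("Paste text here to get word count.", "This is some default text.")
word_count = len(text.split())
# Compare against the count stored during the previous run.
change = word_count - st.session_state.prev_word_count
st.metric("Word Count", word_count, change)
# Save the current count for the next rerun.
st.session_state.prev_word_count = word_count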

In this sample, we start off with a conditional:

if "prev_word_count" not in st.session_state:

This line checks whether the key we want to use already exists in the session state. If it does not, we create it with the following line:

    st.session_state["prev_word_count"] = 5

Here we set the prev_word_count key to 5, which matches the five words in the default text, so the first run reports a change of 0.

Next, we let the user input the text for which they want a word count. For st.metric() to show whether the new count is higher or lower than the previous one, we need the previous text’s total word count. We read that stored value from the session state in the final line of the snippet below.

text = st.text_area("Paste text here to get word count.", "This is some default text.")
word_count = len(text.split())
change = word_count - st.session_state.prev_word_count

Once we have computed those results, we can update the st.session_state.prev_word_count value to the new word count. This means we always know the previous word count, so that when we display the change value, we know precisely how much our metric has changed.

st.metric("Word Count", word_count, change)
st.session_state.prev_word_count = word_count
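To see how the stored value evolves, the same logic can be traced over several reruns in plain Python. In this sketch an ordinary dict stands in for st.session_state, the text is passed in as an argument instead of coming from st.text_area(), and the function returns the two values that would be handed to st.metric(). The name run_script and the sample inputs are hypothetical.

```python
def run_script(session_state, text):
    """Simulate one rerun of the word-count app with a dict as the session state."""
    if "prev_word_count" not in session_state:
        session_state["prev_word_count"] = 5  # word count of the default text
    word_count = len(text.split())
    change = word_count - session_state["prev_word_count"]
    session_state["prev_word_count"] = word_count  # remember for the next rerun
    return word_count, change


state = {}  # persists across the simulated reruns
inputs = [
    "This is some default text.",
    "This is some default text. Now with more words.",
    "Short text.",
]
trace = [run_script(state, text) for text in inputs]
print(trace)  # [(5, 0), (9, 4), (2, -7)]
```

Each change is measured against the count from the run immediately before it, not against the original default, because the stored value is overwritten at the end of every run.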