Build a ChatGPT Clone with LangChain and OpenAI in 5 Steps

Build your own ChatGPT clone with LangChain, OpenAI, and Streamlit in five steps. Create a conversational AI with memory and real-time streaming.

Daniel Evershaw (ML Engineer & Technical Writer) | May 15, 2026 | 4 min read

Last updated: May 15, 2026

Quick Answer

You will build a ChatGPT clone using LangChain for memory and streaming, OpenAI for the model, and Streamlit for the UI. The tutorial covers setup, conversation chain creation, and a real-time chat interface.

Have you ever wanted to create your own conversational AI assistant, complete with memory, streaming responses, and a polished chat interface? In this tutorial, you will build a ChatGPT clone from scratch using LangChain for orchestration, OpenAI for the language model, and Streamlit for the user interface. By the end, you will have a fully functional chatbot that maintains conversation context and streams responses in real time. This project is perfect for understanding the core components behind modern conversational agents and serves as a foundation for more advanced systems.

Prerequisites

Before you start, make sure you have the following:

  • Python 3.9 or newer installed on your machine
  • An OpenAI API key (set as an environment variable OPENAI_API_KEY)
  • Basic familiarity with Python (the code here is synchronous, so async experience is not required)
  • A terminal and a code editor

You will also need to install these Python packages:

pip install langchain langchain-openai streamlit python-dotenv

Architecture Overview

The system consists of three main layers:

  1. UI Layer (Streamlit): Handles user input, displays messages, and manages session state.
  2. Orchestration Layer (LangChain): Manages conversation memory, chains prompts, and streams responses.
  3. Model Layer (OpenAI): Generates replies using the GPT-4 or GPT-3.5 model.

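During a single user query, Streamlit passes the input to the LangChain chain, the chain adds the stored conversation history to the prompt and calls the OpenAI model, and the reply streams back to the UI token by token.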
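
The interaction between the three layers can be sketched in plain Python. This is only an illustration: `fake_model` and `handle_query` are hypothetical stand-ins invented for this sketch, not LangChain or OpenAI APIs, and the real app replaces them with the chain and model built in the steps that follow.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # {"role": "user" | "assistant", "content": "..."}


def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for the model layer: echoes the latest user message."""
    return f"You said: {messages[-1]['content']}"


def handle_query(history: List[Message], user_input: str,
                 model: Callable[[List[Message]], str]) -> str:
    """Orchestration layer: add the input to history, call the model, store the reply."""
    history.append({"role": "user", "content": user_input})
    reply = model(history)  # the model sees the full history, not just the last turn
    history.append({"role": "assistant", "content": reply})
    return reply


# UI layer: in the real app Streamlit collects the input and renders the reply.
history: List[Message] = []
reply = handle_query(history, "Hello", fake_model)
```

The point of the split is that the UI never talks to the model directly; everything goes through the orchestration step, which is where memory is applied.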

Step-by-Step Implementation

Step 1: Set Up Environment Variables

Create a .env file in your project root and add your OpenAI API key:

OPENAI_API_KEY=sk-your-key-here

Then create a file named chatbot.py and load the environment variables at the top:

import os
from dotenv import load_dotenv
 
load_dotenv()
 
openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    raise ValueError("OPENAI_API_KEY not found in .env file")

Step 2: Create the LangChain Chain with Memory

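LangChain provides a ConversationBufferMemory and a ConversationChain that handle prompt history automatically. Both belong to LangChain's legacy chain API and are deprecated in recent releases, so if the imports below fail with your installed version, you may need to pin an older langchain release or import them from the langchain-classic package instead. We will configure the chain with streaming enabled: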

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
 
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.7,
    streaming=True,
    openai_api_key=openai_api_key
)
 
memory = ConversationBufferMemory(return_messages=True)
 
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

Notice that we set streaming=True on the LLM. This allows us to receive tokens one by one instead of waiting for the full response. Setting verbose=False stops the chain from logging its prompts to the console.
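
To make the memory behaviour concrete, here is a minimal plain-Python sketch of what a buffer memory does conceptually: it keeps every turn and prepends the transcript to the next prompt. The `BufferMemory` class and `build_prompt` helper are illustrative names invented for this sketch, not LangChain's actual implementation, and the prompt layout is an assumption.

```python
class BufferMemory:
    """Illustrative only: keeps every turn of the conversation in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def save_turn(self, user_text, ai_text):
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))

    def as_text(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory, new_input):
    """Prepend the stored transcript so the model can see earlier turns."""
    transcript = memory.as_text()
    prefix = transcript + "\n" if transcript else ""
    return f"{prefix}Human: {new_input}\nAI:"


sketch_memory = BufferMemory()
sketch_memory.save_turn("My name is Ada.", "Nice to meet you, Ada.")
prompt = build_prompt(sketch_memory, "What is my name?")
```

Because the whole transcript is resent on every turn, the prompt (and its token cost) grows with the length of the conversation.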

Step 3: Build the Streamlit User Interface

Streamlit makes it easy to create a chat interface. We will use session state to store the conversation history and a callback to handle streaming:

import streamlit as st
from langchain_core.callbacks import BaseCallbackHandler

class StreamHandler(BaseCallbackHandler):
    """Redraw the reply in a Streamlit placeholder as tokens arrive."""

    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)

st.set_page_config(page_title="ChatGPT Clone", page_icon="🤖")
st.title("ChatGPT Clone with LangChain")

# Streamlit reruns this script on every interaction, so keep the chain
# (and its memory) in session state instead of using a fresh one each time.
if "conversation" not in st.session_state:
    st.session_state.conversation = conversation
conversation = st.session_state.conversation

if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the stored transcript on each rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

if prompt := st.chat_input("Type your message..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.chat_message("assistant"):
        stream_handler = StreamHandler(st.empty())
        response = conversation.predict(input=prompt, callbacks=[stream_handler])
    st.session_state.messages.append({"role": "assistant", "content": response})

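Key points: The StreamHandler callback redraws the UI container with the accumulated text each time a new token arrives, so the reply appears as it is generated. The displayed history lives in st.session_state.messages so that it survives Streamlit's reruns. The chain's own memory is separate from that list and must also be kept in session state to survive them.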
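
The accumulate-and-redraw pattern in StreamHandler can be exercised without Streamlit or an API call. In this sketch, `FakeContainer` is a hypothetical stand-in for the placeholder returned by st.empty(), and `TokenAccumulator` mirrors the handler's logic without subclassing BaseCallbackHandler; the token list is made up for the demo.

```python
class FakeContainer:
    """Hypothetical stand-in for st.empty(): records every render call."""

    def __init__(self):
        self.renders = []

    def markdown(self, text):
        self.renders.append(text)


class TokenAccumulator:
    """Same logic as StreamHandler, minus the LangChain base class."""

    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token, **kwargs):
        self.text += token                  # grow the full reply so far
        self.container.markdown(self.text)  # redraw the whole reply each time


container = FakeContainer()
handler = TokenAccumulator(container)
for token in ["Hel", "lo", " world"]:
    handler.on_llm_new_token(token)
```

Each call replaces the placeholder's content with the cumulative text rather than appending to it, which is why the handler keeps the running string.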

Step 4: Run the Application

Save the file and run it from the terminal:

streamlit run chatbot.py

Your browser will open at http://localhost:8501. You can now chat with your clone. Type a message and watch the response stream in.

Step 5: Add Conversation Persistence (Optional)

By default, memory resets when you refresh the page. To persist conversations across sessions, you can save the memory to a file or database. Here is a simple JSON-based approach:

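import json
from langchain_core.messages import messages_from_dict, messages_to_dict

def save_memory(memory, filepath="memory.json"):
    # Message objects are not JSON-serializable, so convert them to dicts first
    data = {"history": messages_to_dict(memory.chat_memory.messages)}
    with open(filepath, "w") as f:
        json.dump(data, f)

def load_memory(memory, filepath="memory.json"):
    try:
        with open(filepath, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        return  # Nothing saved yet; start with empty memory
    # Rebuild message objects from the stored dicts
    memory.chat_memory.messages = messages_from_dict(data["history"])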

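Call load_memory(conversation.memory) once, when the chain is first created, and save_memory(conversation.memory) after each response. The chain's history then survives page refreshes and app restarts. This restores only what the model sees; the messages rendered from st.session_state.messages need to be saved separately if you want the old transcript to reappear on screen.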
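
The helpers above cover the chain's memory only. The display history in st.session_state.messages is a list of plain dictionaries, so it can be written to JSON directly. A minimal sketch, assuming the hypothetical helper names `save_display_history` and `load_display_history` and a file path of your choosing:

```python
import json
import os
import tempfile


def save_display_history(messages, filepath):
    """Write the list of {"role", "content"} dicts to a JSON file."""
    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(messages, f, ensure_ascii=False, indent=2)


def load_display_history(filepath):
    """Return the saved messages, or an empty list if no file exists yet."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return []


# Round-trip demo in a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "display_history.json")
demo_messages = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
save_display_history(demo_messages, demo_path)
restored = load_display_history(demo_path)
```

In the app, the loaded list could be assigned to st.session_state.messages at the point where that key is first initialized.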

Common Pitfalls

  • API key errors: Make sure the .env file is in the same directory as your script and that the variable is named exactly OPENAI_API_KEY. Restart Streamlit after changing the file.
  • Streaming not working: Verify that streaming=True is set on the ChatOpenAI instance and that you pass the callbacks list to predict(). If you call invoke() instead, pass the handler through its config argument, for example config={"callbacks": [stream_handler]}.
  • Memory not persisting: Streamlit reruns the script on every interaction. Use st.session_state to store objects like the conversation chain itself. Otherwise, a new chain is created each time, losing memory.
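  • Rate limiting: OpenAI applies rate limits to every account, and they are tightest on free and low usage tiers. If you get 429 errors, retry after a delay that grows with each attempt (exponential backoff), or switch to a model with more generous limits for your tier, which is often the case for smaller models such as gpt-3.5-turbo.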
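
For the rate-limiting case, a common remedy is to retry with exponential backoff. The sketch below is generic: `RateLimited` is a hypothetical exception standing in for whatever rate-limit error your client raises, and `call_with_backoff` is an illustrative helper, not part of LangChain or the OpenAI SDK.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the client's 429 / rate-limit error."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)


# Demo: a function that fails twice, then succeeds. Sleeps are recorded, not performed.
attempts = {"count": 0}
waits = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return "ok"


result = call_with_backoff(flaky, sleep=waits.append)
```

In the app, you would wrap the call to conversation.predict() in a function like this and catch the rate-limit exception your installed client actually raises.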

Next Steps

You now have a working ChatGPT clone with streaming and memory. To take it further, consider adding retrieval-augmented generation (RAG) so your chatbot can answer questions based on your own documents. You could also switch to an open-source model like LLaMA 3 running locally via Ollama to avoid API costs. Another improvement is to add a system prompt that gives your chatbot a specific personality or role. The LangChain documentation is an excellent resource for exploring these extensions.

Frequently Asked Questions

Do I need a paid OpenAI account?

You need an OpenAI API key, and API usage is billed separately from any ChatGPT subscription. New accounts have at times come with free trial credits, but check the billing page of your account rather than relying on them. Alternatively, you can swap the model for a local one like LLaMA 3 via Ollama.

Can I use a different LLM provider?

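Absolutely. LangChain supports many providers. Replace `ChatOpenAI` with `ChatAnthropic` (from langchain-anthropic), `ChatGoogleGenerativeAI` (from langchain-google-genai), or a local model via `ChatOllama` (from langchain-ollama). Each lives in its own integration package and expects its own model name and credentials. The rest of the code remains largely unchanged.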

How do I clear the conversation memory?

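You can clear the memory by calling `memory.clear()` in your code. In the Streamlit UI, add a button that triggers this method and also empties `st.session_state.messages`, so the on-screen transcript is reset along with the model's context. Alternatively, restart the Streamlit app by pressing Ctrl+C and running it again.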
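
A reset helper along these lines could back such a button. The function name `reset_chat` is hypothetical, and `StubMemory` exists only so the sketch runs without LangChain; in the app you would pass `st.session_state` and the real memory object.

```python
def reset_chat(state, memory):
    """Clear both the chain's memory and the messages shown in the UI."""
    memory.clear()          # forget the history sent to the model
    state["messages"] = []  # empty the transcript rendered on screen


class StubMemory:
    """Stand-in for the LangChain memory object; only clear() is needed here."""

    def __init__(self):
        self.cleared = False

    def clear(self):
        self.cleared = True


# In the Streamlit app this might be wired up as:
#     if st.button("Clear chat"):
#         reset_chat(st.session_state, conversation.memory)
state = {"messages": [{"role": "user", "content": "Hi"}]}
stub = StubMemory()
reset_chat(state, stub)
```

Clearing both places matters: resetting only the memory leaves the old transcript on screen, while resetting only the message list leaves the model with context the user can no longer see.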

Why is my response not streaming?

Make sure you set `streaming=True` on the LLM object and pass `callbacks=[stream_handler]` to the `predict()` method. Also verify that you are using a model that supports streaming, such as GPT-3.5 or GPT-4.
