LLNL Hackathon: ChatGPT, GraphRAG, & Vector Database Pipeline with LangChain

Overview

Created an LLM pipeline using LangChain, GraphRAG, and vector databases for LivChat, the data-secure local instance of ChatGPT at Lawrence Livermore National Laboratory (LLNL), so that users can access LLNL-specific information through LivChat. The pipeline builds clearance-level-dependent indexes of knowledge graphs and vector databases, then queries both to synthesize a response for the user.

Step-by-Step Process

Lawrence Livermore National Laboratory (LLNL) runs a local instance of ChatGPT 3.5 and 4o (LivChat) on site to minimize the sensitive-data risks that arise when user queries are sent off site to OpenAI (ChatGPT), Microsoft (Copilot), etc. We proposed an improvement that lets users ask LivChat questions requiring internal LLNL knowledge that OpenAI does not have access to when training ChatGPT. The improvement uses LangChain🦜🔗 to let LivChat query clearance-level-specific data from two sources: knowledge graphs, via GraphRAG, and a vector database, via Pinecone. LivChat then reads both query results and synthesizes a response for the user. A flow chart of the overall process is shown below.
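The control flow described above can be illustrated with a minimal, pure-Python sketch. It is not the hackathon implementation and does not call LangChain, GraphRAG, or Pinecone: `vector_search` and `graph_search` are simple keyword-matching stand-ins for those services. The clearance tiers, sample passages, and triples are invented for illustration, and the final LLM call is left out, so the sketch stops at the prompt that would be handed to LivChat.

```python
import re
from dataclasses import dataclass

# Hypothetical clearance tiers, lowest to highest (not LLNL's actual levels).
CLEARANCE_ORDER = ["public", "internal", "restricted"]


@dataclass(frozen=True)
class Passage:
    text: str
    clearance: str


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str
    clearance: str


def build_indexes(items):
    """Return one index per clearance level, holding items at or below that level."""
    indexes = {}
    for rank, level in enumerate(CLEARANCE_ORDER):
        allowed = set(CLEARANCE_ORDER[: rank + 1])
        indexes[level] = [item for item in items if item.clearance in allowed]
    return indexes


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def vector_search(index, query, k=2):
    """Stand-in for a vector database query: rank passages by word overlap."""
    words = tokenize(query)
    scored = [(len(words & tokenize(p.text)), p.text) for p in index]
    hits = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [text for _, text in hits[:k]]


def graph_search(index, query):
    """Stand-in for a knowledge graph query: facts about entities named in the query."""
    words = tokenize(query)
    return [
        f"{t.subject} {t.relation} {t.obj}"
        for t in index
        if t.subject.lower() in words or t.obj.lower() in words
    ]


def build_prompt(query, user_clearance, graph_indexes, vector_indexes):
    """Query both indexes at the user's clearance level and assemble the synthesis prompt."""
    facts = graph_search(graph_indexes[user_clearance], query)
    passages = vector_search(vector_indexes[user_clearance], query)
    return "\n".join(
        ["Answer using only the context below.", "Graph facts:"]
        + [f"- {fact}" for fact in facts]
        + ["Passages:"]
        + [f"- {passage}" for passage in passages]
        + [f"Question: {query}"]
    )


# Invented sample data, for illustration only.
passages = [
    Passage("The cafeteria opens at 7 am.", "public"),
    Passage("Project Falcon uses the east cluster.", "restricted"),
]
triples = [
    Triple("Falcon", "is led by", "Team A", "restricted"),
    Triple("Cafeteria", "is located in", "Building 1", "public"),
]
vector_indexes = build_indexes(passages)
graph_indexes = build_indexes(triples)

question = "Which cluster does Falcon use?"
public_prompt = build_prompt(question, "public", graph_indexes, vector_indexes)
restricted_prompt = build_prompt(question, "restricted", graph_indexes, vector_indexes)
print(restricted_prompt)
```

In this sketch the clearance filter is applied when the indexes are built, not at query time, so a user at a lower level can never retrieve higher-level content. Filtering at query time (for example, with metadata filters in the vector store) is an alternative design that the source does not describe.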