What is LangChain? Easier development around LLMs – InfoWorld
By Martin Heller
Contributor, InfoWorld
Using large language models (LLMs) is generally easy, although there’s an art to constructing effective prompts for them. On the other hand, programming with language models can be challenging. Enter LangChain.
LangChain is a framework for developing applications powered by language models. You can use LangChain to build chatbots or personal assistants, to summarize, analyze, or generate Q&A over documents or structured data, to write or understand code, to interact with APIs, and to create other applications that take advantage of generative AI. There are currently two versions of LangChain, one in Python and one in TypeScript/JavaScript.
LangChain enables language models to connect to sources of data, and also to interact with their environments. LangChain components are modular abstractions and collections of implementations of the abstractions. LangChain off-the-shelf chains are structured assemblies of components for accomplishing specific higher-level tasks. You can use components to customize existing chains and to build new chains.
Note that there are two kinds of language models, LLMs and chat models. LLMs take a string as input and return a string. Chat models take a list of messages as input and return a chat message. Chat messages contain two components, the content and a role. Roles specify where the content came from: a human, an AI, the system, a function call, or a generic input.
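The distinction can be sketched in plain Python. The dict shapes below follow the common role/content convention for chat messages; the actual LangChain classes differ, and `reply` is a toy stand-in for a real chat model:

```python
# A chat model's input is a list of role-tagged messages: each message
# pairs a role (where the content came from) with the content itself.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "human", "content": "Translate 'good morning' to French."},
]

def reply(messages: list) -> dict:
    """Toy stand-in for a chat model: takes a list of messages,
    returns a single chat message (an LLM, by contrast, maps
    string to string)."""
    return {"role": "ai", "content": "Bonjour."}
```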
In general, LLMs use prompt templates for their input. A prompt template allows you to specify the role that you want the LLM or chat model to take, for example “a helpful assistant that translates English to French.” It also allows you to apply the template to many instances of content, such as a list of phrases that you want translated.
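As a minimal sketch of the idea, using plain Python string formatting in place of LangChain's prompt template classes (the names here are illustrative):

```python
# The template fixes the role and instructions once...
template = (
    "You are a helpful assistant that translates English to French.\n"
    "Translate this phrase: {phrase}"
)

# ...and can then be applied to many instances of content.
phrases = ["Good morning", "See you tomorrow"]
prompts = [template.format(phrase=p) for p in phrases]
```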
LangChain has six modules:
Model I/O lets you manage prompts, call language models through common interfaces, and extract information from model outputs.
Data connection gives you the building blocks to load, transform, store and query your data.
Complex applications require chaining LLMs, either with each other or with other components. LangChain provides the Chain interface for such “chained” applications.
A conversational system should be able to access some window of past messages directly. LangChain calls this ability memory.
As opposed to chains, which hard-code sequences, agents use a language model as a reasoning engine to determine which actions to take and in which order.
Callbacks allow you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
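The callback idea can be sketched in a few lines of plain Python; the handler and hook names below are hypothetical, not LangChain's actual callback API:

```python
class LoggingCallback:
    """Toy callback handler that records lifecycle events."""
    def __init__(self):
        self.events = []

    def on_start(self, prompt):
        self.events.append(("start", prompt))

    def on_end(self, output):
        self.events.append(("end", output))

def run_llm(prompt, callbacks=()):
    """Stand-in for a model call that fires hooks at each stage."""
    for cb in callbacks:
        cb.on_start(prompt)
    output = prompt.upper()  # pretend this is the model's reply
    for cb in callbacks:
        cb.on_end(output)
    return output

cb = LoggingCallback()
run_llm("hello", callbacks=[cb])
```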
LangSmith helps you trace and evaluate your LangChain language model applications and intelligent agents to help you move from prototype to production. As of this writing, it is still a closed beta. You can view a walkthrough of LangSmith and read the LangSmith docs without joining the beta test.
Use cases for LangChain include Q&A over documents, analyzing structured data, interacting with APIs, code understanding, agent simulations, agents, autonomous (long-running) agents, chatbots, code writing, extraction, analyzing graph data, multi-modal outputs, self-checking, summarization, and tagging.
Some of these use cases have many examples, such as Q&A, which has about 17. Others have only one, such as web scraping.
There are roughly 163 LangChain integrations as of this writing. These include five callbacks, nine chat models, 115 document loaders, six document transformers, 54 LLMs, 11 ways of implementing memory (mostly with databases), 22 retrievers (mostly search methods), 31 text embedding models, 21 agent toolkits, 34 tools, and 42 vector stores. The integrations are also available grouped by provider.
LangChain essentially acts as a neutral hub for all of these capabilities.
To install LangChain for Python, use pip or conda. The best practice is to install Python packages in virtual environments so that they don’t have version conflicts over dependencies. I’ll show pip commands; for conda commands, consult the installation page and click on Conda.

The basic, minimal installation is the bare langchain package. For the record, that’s what I used. It does not include the modules for model providers, data stores, or other integrations; I plan to install whichever of those I need, when I need them.
To install LangChain along with the common language model providers, add the llms extra; to install LangChain with all integrations, add the all extra. If you’re using zsh, which is the default shell on recent versions of macOS, you’ll need to quote expressions containing square brackets. Without the quotes, zsh treats the brackets as pattern-matching characters rather than as part of the package name.
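Putting that together, the pip commands from LangChain’s Python installation docs look like this (the single quotes guard the brackets under zsh):

```shell
# Minimal install: the core langchain package, no integrations
pip install langchain

# Core plus the common LLM provider packages
pip install 'langchain[llms]'

# Core plus all integrations
pip install 'langchain[all]'
```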
To install LangChain for JavaScript, use npm, Yarn, or pnpm.
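For example, any one of these installs the JavaScript package (per the LangChain.js installation docs):

```shell
npm install -S langchain
# or
yarn add langchain
# or
pnpm add langchain
```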
You can use LangChain for JavaScript in Node.js, Cloudflare Workers, Vercel / Next.js (Browser, Serverless, and Edge functions), Supabase Edge Functions, web browsers, and Deno.
I won’t show you more about LangChain for JavaScript; I suggest that you consult the LangChain for JavaScript installation page to get started.
While there are hundreds of examples in the LangChain documentation, I only have room to discuss one. The Python example at the end of the Quickstart demonstrates an LLMChain: it takes input variables, passes them to a prompt template to create a prompt, passes the prompt to an LLM (ChatOpenAI), and then passes the comma-separated output through an (optional) output parser to create a Python array of strings.
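The shape of that chain can be sketched without the library. The fake model below stands in for ChatOpenAI, and the function names are illustrative rather than LangChain’s actual API:

```python
def make_prompt(topic: str) -> str:
    """Prompt template: fill the input variable into fixed instructions."""
    return f"List three {topic}, comma separated, and nothing else."

def fake_chat_model(prompt: str) -> str:
    """Stand-in for ChatOpenAI; a real call would hit the OpenAI API."""
    return "red, green, blue"

def parse_comma_separated(text: str) -> list:
    """Output parser: turn the model's comma-separated reply into a list."""
    return [item.strip() for item in text.split(",")]

def run_chain(topic: str) -> list:
    # input variables -> prompt template -> model -> output parser
    return parse_comma_separated(fake_chat_model(make_prompt(topic)))

colors = run_chain("colors")
```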
LangChain Expression Language is a declarative way to compose chains and get streaming, batch, and async support out of the box. LCEL makes using LangChain easier. You can use all the same existing LangChain constructs to create chains as you would when composing them in code, since LCEL is essentially a high-level alternative to creating chains in Python or TypeScript/JavaScript.
There’s a LangChain Teacher you can run interactively to learn LCEL, although you’ll need to install LangChain for Python first. Note that I wasn’t able to run the teacher. It seems to have a version-dependent bug.
LCEL expressions use pipe characters (|) to connect components into chains. For example, a basic chain pipes a prompt template into a model; the LCEL documentation shows that pattern in the context of a complete Python program, along with its output.
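The pipe syntax can be imitated in plain Python by overloading `__or__`. This toy version shows only the composition idea; LCEL’s real Runnable objects do the same wiring while also providing the streaming, batch, and async support mentioned above:

```python
class Runnable:
    """Wrap a function so instances compose with the | operator."""
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # prompt | model means: run self first, feed the result to other
        return Runnable(lambda x: other.func(self.func(x)))

    def invoke(self, x):
        return self.func(x)

# Toy prompt template and toy model
prompt = Runnable(lambda topic: f"Tell me a joke about {topic}.")
model = Runnable(lambda text: f"(model reply to: {text})")

chain = prompt | model
result = chain.invoke("bears")
```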
As you’ve seen, LangChain offers a powerful way to create generative AI applications powered by language models and data, connected into chains. I’ve shown you a few Python examples, and given you a link to the JavaScript examples. You can also program LangChain in R using a Python shim, as my InfoWorld colleague Sharon Machlis explains in Generative AI with LangChain, RStudio, and just enough Python. Another useful resource is the LangChain blog, which publishes a short article most weekdays.
Martin Heller is a contributing editor and reviewer for InfoWorld. Formerly a web and Windows programming consultant, he developed databases, software, and websites from 1986 to 2010. More recently, he has served as VP of technology and education at Alpha Software and chairman and CEO at Tubifi.
Copyright © 2023 IDG Communications, Inc.