What are large language models used for? – Android Police

Large language models (LLMs) are the basis for AI chatbots and much more. Here’s what’s going on behind the scenes
With the online world obsessing over chatbots, some AI (artificial intelligence) phrases are entering the mainstream conversation. One of them is large language models, or LLMs, and you can't read about OpenAI or chatbots like ChatGPT and Google Bard for long without running into it. But outside of computer science, not many people know what this technology is.
Large language models power the AI chatbot tech that's getting so much discussion these days. And whether you're eager to see if AI can help you write an email on your Android (it probably already is) or are worried about students cheating with chatbots (there's a lot to unpack there), it's important to understand how they work. So let's dive into the mathematical morass that's large language models and see what's going on!
These models, which are based on machine learning and neural network technology, analyze and label different parts of a language so that they can "talk" like a person. In the case of ChatGPT and the rest of the GPT family (like GPT-3 and GPT-4), they can also imitate different tones and styles of conversation, some better than others. This is part of the discipline known as natural language processing, or NLP. Large language models are a key part of chatbot AI and are built to keep learning as long as they can process more examples of human language.
LLMs don't learn grammar like humans do. Instead, they follow a unique process that labels different parts — say, words in a sentence — so LLMs can make a good mathematical guess at how to write or talk. With enough study, these deep learning models can make a good guess, good enough to imitate a college essay or a helpful customer service representative.
LLMs are inherently complex, and we don't have the time to give an entire college course on them (although that would be fun). Instead, let's break down some of their important parts and how these machine-learning models work.
Tokenization is the process of turning everyday human language into sequences that LLMs can understand, a part of natural language processing. It involves assigning sections of text (usually words or parts of words) numeric values and encoding them for rapid analysis. Think of it as the AI version of teaching phonetics. Tokenization's goal is to create context vectors, which act like cheat sheets or formulas the AI uses to guess how a sentence goes.
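To make that concrete, here's a deliberately simplified sketch of the idea. Real LLM tokenizers split text into subword pieces using schemes like byte-pair encoding, but the core "text in, numbers out" step looks something like this:

```python
# Toy tokenization: map each word to a numeric ID.
# Real tokenizers work on subword pieces, but the principle is the same.

def build_vocab(text):
    """Assign each unique word a numeric ID, in order of first appearance."""
    vocab = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Encode a sentence as the sequence of IDs a model actually sees."""
    return [vocab[word] for word in text.lower().split()]

sentence = "the cat sat on the mat"
vocab = build_vocab(sentence)
print(vocab)                      # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4}
print(tokenize(sentence, vocab))  # [0, 1, 2, 3, 0, 4]
```

Notice that "the" gets the same ID both times it appears. That repetition is exactly the kind of pattern the model learns from.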
The more the AI studies language and gets information about how language fits together, the better guesses it can make about what words come next in certain kinds of sentences. Add that together over and over again, and you get models that can reproduce different ways humans talk on the internet.
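The "guessing what comes next" idea can be sketched in a few lines. This toy version just counts which word follows which; real LLMs learn vastly richer statistics with neural networks, but the prediction principle is similar:

```python
from collections import Counter, defaultdict

# Toy next-word prediction: count which word follows which in some text,
# then "predict" the most frequent follower.

def train_bigrams(text):
    """Record, for each word, how often every other word follows it."""
    follow_counts = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follow_counts[current][nxt] += 1
    return follow_counts

def predict_next(word, follow_counts):
    """Return the word most often seen after `word`."""
    return follow_counts[word].most_common(1)[0][0]

corpus = "i like coffee and i like tea and i drink coffee"
model = train_bigrams(corpus)
print(predict_next("i", model))    # 'like' (seen twice, vs 'drink' once)
print(predict_next("and", model))  # 'i'
```

Feed it more text and the counts get more reliable, which is the article's point: more examples mean better guesses.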
A transformer model is a neural network that analyzes sequential data and looks for signs about how that data fits together, like which words are most likely to follow other words. These models are made of blocks or layers, each focusing on a different type of analysis to help understand what words fit together and what words don't. Sometimes, they get their own names, like the open source BERT. They form foundation models that are the basis for all LLMs.
Credit for the transformer's creation is generally given to Google engineers in the late 2010s. As mentioned above, transformer models don't learn a language the way people do. Instead, they use algorithms to understand how humans put words together. Feed a transformer model a bunch of hipster blogs about coffee, and it quickly learns how to write a generic hipster piece about coffee. Meanwhile, machine learning techniques like reinforcement learning give the model feedback when it's wrong. Transformer models are the foundation of LLM language generation, and they can grow increasingly complex depending on their purpose, so complex that they need banks of servers to hold the large-scale model.
Their creators devise creative ways to classify words so that the models understand how they work. For example, positional encoding embeds the order of words in sentences so that the model always knows what order they come in, even if words are provided randomly. Attention mechanisms like self-attention assign more importance to some parts of sentences than others, allowing the models to recognize what humans emphasize when writing.
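The attention idea can be illustrated with a tiny sketch. Each word scores every other word, and a softmax turns those scores into weights that say "how much should I pay attention to you?" The vectors below are made-up numbers for illustration; in a real model they're learned during training, and the full mechanism involves separate query, key, and value projections:

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Dot-product attention: more similar vectors get higher weight."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

# Hypothetical 2-d vectors for the words in "the cat sat".
vectors = {"the": [0.1, 0.0], "cat": [0.9, 0.3], "sat": [0.4, 0.8]}

# How much does "cat" attend to each word (including itself)?
weights = attention_weights(vectors["cat"], list(vectors.values()))
print([round(w, 2) for w in weights])
```

The weights always sum to 1, and here "cat" weights itself most heavily because its vector is most similar to itself, which is the sense in which attention lets a model decide which parts of a sentence matter most.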
Prompts are the inputs that LLMs tokenize and analyze. During development, they're essentially training data for different use cases, and they can be nearly anything. For chatbots, for example, that means tons of online writing: essays, articles, and books (which is why some authors are suing). The more material an LLM sees during the training process, the better it can predict the next word and build sentences. That's because language, especially the casual language humans use online, is redundant and frequently predictable.
Prompts also influence how the AI sounds and responds to things, which can cause trouble. Many of us remember that one of Microsoft's early attempts at an online AI was quickly shut down after it became a Nazi because its inputs came from Twitter users. Choosing the proper prompts for a deep learning AI is essential. Services like ChatGPT try to cast a wide net while setting important parameters about what not to include. Lots of fine-tuning is involved, and frequent tweaks to learning algorithms help AI models learn how to treat data or perform specific tasks.
Today's new AI chatbots certainly do use LLMs; it's how they generate text and answer questions. Basic chatbots, like those found in brand website pop-ups or Facebook stores, don't use this technology and barely qualify as AI in many cases. But services like the GPT family, Google's Bard, Bing AI, Pi, and others use different kinds of LLMs. A growing number of apps use simpler models to imitate human speech, like the latest therapy apps (with mixed results).
By this point, you may be wondering if LLMs are responsible for AI-generated art from tools like DALL-E 2 and Midjourney. Basically, yes: visual generative AIs are versions of LLMs that use similar models to analyze visual features instead of written language. That's how they can roughly understand objects, subjects, and different art styles.
But art and text data aren't the only things LLMs are useful for. These are just the beginning. State-of-the-art AI systems are learning molecular structures and protein sequencing in the same way, which helps scientists and pharmaceutical companies discover new solutions. They're handling minor coding tasks and metadata work to make websites better and more accessible. And general-purpose models are helping humans communicate better in different languages. Even everyday uses, like summarizing long reports for busy readers, have large-scale advantages.
Are LLMs dangerous in the sense that they're going to create murderbots to take over Earth? No. They're also probably not going to take over many jobs unless those jobs are simple and repetitive. Context is difficult for LLMs: they can't easily incorporate new information, and they don't reflect on what to say before saying it.
Still, these models have other issues that can make them dangerous. Most issues boil down to a few root causes:
They can spread disinformation or biased opinions: LLMs and their chatbots don't know if the information is accurate. They only know, in a literal way, what they've been told and how to repeat it in different ways. Chatbots have been caught spreading disinformation and been accused of political bias, among other problems. It's hard to create a large language model with internet information that doesn't run into disinformation issues. And sometimes, LLM-powered chatbots produce fake information, including fake financial numbers for corporations and fake cases for lawyers. AI developers sometimes call these hallucinations, and it's hard to optimize them away.
They can enable dangerous behavior: You may have heard the stories about how ChatGPT's grandma exploit had granny giving away activation keys for licensed software or explaining how to make napalm, despite filters meant to prevent behavior like this. When a large language model consumes enough information, it can teach you how to do just about anything, and for today's chatbots, that can include a lot of Dark Web material. So far, creators haven't found an effective way to stop it entirely.
They're a privacy risk: The massive datasets fed into large language models can contain sensitive information, including yours. That includes info about you that's easy to find online or on public social media. It can even include conversations you've had online or browsing activity that advertisers track. Since LLM AIs are new, there's not much robust privacy protection preventing this.
They're energy hogs: LLMs are huge and consume a ton of energy. That's bad news for companies trying to lower their carbon footprints and leads to many associated environmental costs.
They don't have any ethics: People can use tools like ChatGPT to create almost anything they want. Now, schools need new ways to detect fake AI essays. People can ask ChatGPT to generate insulting or hateful content or ask it to impersonate someone for catfishing, blackmail, or other purposes. It may even write code for malware or whip up fake research about why vaccines don't work. These are issues that can't be solved with a simple filter, and we're only beginning to see the ramifications.
Large language models use advanced AI tools to categorize language (or other data) in such a way that lets them understand how humans communicate. That's largely influenced by the parameters set and the prompts fed into LLMs, which is how tools like ChatGPT are formed.
The future of LLMs can seem scary, and for good reason. But despite the pitfalls this new technology creates, LLMs have many powerful and positive uses. We have a lot to learn about how to use them, how to structure them so that they're safe to use, and how to feed them the correct prompts. Deepfakes and essay cheats are only a couple of the results when we get things wrong. Welcome to the wild west of language-based AI. It's going to be quite a ride.
Tyler Lacoma has spent more than 10 years testing tech and studying the latest web tools to help keep readers current. He's here for you when you need a how-to guide, explainer, review, or list of the best solutions for your Android life.