Welcome! As a new user of NovelAI, there might be some terminology and concepts that are unfamiliar to you. The goal of this glossary is to briefly explain the terms used in our documentation.
- AI Model
- AI Modules
- Banned Tokens
- Config Preset
- Phrase Bias
- Token Probabilities
Also known as Large Language Models or LLMs. In simple terms, AI models are trained on large amounts of text using deep learning algorithms. The sheer volume of training data allows the AI to model language and predict how a piece of written text is likely to continue.
After this initial training, AI Models go through a process known as fine-tuning, in which a curated dataset is used to further improve the AI's capabilities. In NovelAI's case, our fine-tuning focuses on making the AI better at storytelling.
Also known as Soft Prompts. Modules steer the AI's behavior and generations toward a particular style or genre. You can use the Default Modules we provide, or train your own Custom Modules on your own training data.
NovelAI's premium currency. You get Anlas by being subscribed to NovelAI. The amount received depends on your subscription tier. Additionally, you can purchase Paid Anlas as long as you have a renewing subscription.
Used to prevent certain token sequences from being generated by the AI.
Banned Tokens can be set by the user in the Advanced Settings Sidebar.
A specific set of generation settings to adjust how the AI behaves. These parameters include things like randomness, repetition penalty, sampling methods and the order in which they're applied.
Each AI Model comes with its own set of default config presets. You can also make your own custom preset to suit your needs.
The range of tokens the AI sees before generating an output. Context is built starting from the current point of the story, and goes back until the maximum context size is reached. Everything outside of context effectively never happened, as far as the AI is concerned. Your maximum context size depends on your current subscription tier and chosen AI Model.
When context is built, Memory, Author's Note and Lorebook entries are injected into the context depending on their insertion settings.
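The way context is filled from the most recent text backward can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `build_context` is hypothetical, tokens are plain list items, and Memory is always placed at the top, whereas real placement depends on each entry's insertion settings.

```python
def build_context(story_tokens, memory_tokens, max_context):
    """Sketch: reserve room for Memory, then keep the newest story tokens.

    Assumption: Memory is inserted at the very top of the context.
    Actual placement depends on the insertion settings.
    """
    budget = max_context - len(memory_tokens)
    if budget < 0:
        raise ValueError("Memory alone exceeds the maximum context size")
    # A slice of [-0:] would return the whole list, so handle zero explicitly.
    recent = story_tokens[-budget:] if budget > 0 else []
    return memory_tokens + recent


story = list(range(10))  # stand-in for ten story tokens, oldest first
context = build_context(story, ["MEMORY"], max_context=4)
print(context)
```

With a maximum context of 4 and one Memory token, only the three newest story tokens fit. Everything older is dropped, which is why the AI cannot take it into account.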
A special token consisting of three asterisks in a row (***), usually used to indicate a scene or chapter break.
A tool to increase or decrease the likelihood of certain words or phrases being generated by the AI. Negative Phrase Bias values reduce the chances while positive values increase them.
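As a rough illustration of how a bias value can shift likelihoods, the sketch below adds a bias to the raw score of selected tokens before the scores are turned into probabilities. The function names, the example tokens and the scores are all hypothetical, and the approach shown (adding to raw scores, then applying softmax) is an assumption about how such a feature is commonly built, not a description of NovelAI's implementation.

```python
import math


def apply_phrase_bias(scores, biases):
    """Return a copy of the raw token scores with each bias added."""
    return {tok: s + biases.get(tok, 0.0) for tok, s in scores.items()}


def to_probabilities(scores):
    """Softmax: convert raw scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {tok: math.exp(s - top) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical raw scores for three candidate tokens.
scores = {" dragon": 1.0, " knight": 1.0, " castle": 1.0}
before = to_probabilities(scores)
after = to_probabilities(
    apply_phrase_bias(scores, {" dragon": 1.5, " castle": -1.5})
)
print(before)
print(after)
```

In this sketch the positively biased token becomes more likely and the negatively biased one less likely, while the probabilities still sum to 1.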
The initial written text in a story before any AI generation takes place. The prompt serves as the foundation for what the AI will generate. By default, the prompt appears in cream color in the "NovelAI Dark" theme.
The .scenario file that contains the initial prompt to start a story along with all information inside Memory, Author's Note and the Lorebook. Scenarios also contain the author's chosen AI Model and Config Preset. When loading a scenario you may be asked to fill out any placeholders left by the author.
The basic unit of text that the AI uses to process and generate language. AI models don’t see words as individual letters. Instead, the text is broken down into tokens, which are words or word fragments.
The way tokens are arranged and their Token ID depends on the Tokenizer used by the AI Model.
For example, the sentence “The quick brown fox jumps over the goblin.” would tokenize as “The| quick| brown| fox| jumps| over| the| go|bl|in.” in the Pile tokenizer used by GPT-NeoX 20B and Krake, with each | signifying a boundary between tokens.
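To make the boundaries easier to count, the short Python sketch below splits the rendering above on the | marker. It does not run a real tokenizer. It only processes the example string as written, so the token list is exactly the one given in the text.

```python
# The tokenized rendering from the example, with | marking token boundaries.
rendered = "The| quick| brown| fox| jumps| over| the| go|bl|in."

tokens = rendered.split("|")
sentence = "".join(tokens)  # joining the tokens restores the original text

print(tokens)
print(len(tokens), "tokens for", len(sentence.split()), "words")
```

Note that most tokens carry their leading space, and that a less common word such as "goblin" is broken into several fragments, so the token count is higher than the word count.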
Often loosely referred to as Logits, although strictly speaking logits are the raw scores from which the probabilities are calculated. When the AI generates an output it chooses from a pool of tokens. Token Probabilities refer to the chance each token had of being used by the AI. These percentages are influenced by generation settings, biasing, banning, and modules.
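The effect of a generation setting on these probabilities can be sketched with a temperature-scaled softmax, a common way to control randomness in language models. The function name, the candidate tokens and their scores are hypothetical, and treating the randomness setting as a temperature is an assumption made for illustration.

```python
import math


def token_probabilities(raw_scores, temperature=1.0):
    """Turn raw token scores into probabilities, scaled by a temperature.

    Lower temperature sharpens the distribution; higher temperature
    flattens it, giving unlikely tokens more of a chance.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {tok: s / temperature for tok, s in raw_scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical raw scores for the next token.
raw = {" sword": 2.0, " shield": 1.0, " spoon": -1.0}
for t in (0.5, 1.0, 2.0):
    probs = token_probabilities(raw, temperature=t)
    print(t, {tok: round(p, 3) for tok, p in probs.items()})
```

At every temperature the probabilities sum to 1, but the share given to the top-scoring token shrinks as the temperature rises.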