Tokens

The smallest units, or chunks, of text that a model processes

What is a token?

In the context of large language models (LLMs), a token is the smallest unit, or chunk, of text that a model processes. LLMs use tokens to process and generate language. A token can be as short as a single character, as long as a word, or an even larger chunk of text, such as a phrase, depending on the model and its configuration.
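To make the idea of token granularity concrete, here is a minimal sketch that splits the same sentence at the character level and at the word level. The sample sentence and the regular expression are illustrative assumptions; production tokenizers typically learn their vocabulary from data rather than using a fixed rule like this.

```python
import re

text = "Tokens help models read text."

# Character-level: every character (including spaces) is its own token.
char_tokens = list(text)

# Word-level: runs of word characters, plus each punctuation mark on its own.
word_tokens = re.findall(r"\w+|[^\w\s]", text)

print(len(char_tokens), "character tokens")
print(len(word_tokens), "word tokens:", word_tokens)
```

The same text yields many more tokens at the character level than at the word level, which is one reason the choice of granularity matters.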

Tokens serve as a bridge between human language and a structure that AI models can work with. Many modern language models, such as the GPT models, operate on tokens. Each model is designed to handle a limited number of tokens at a time, a limit often called the context window.
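One common consequence of a token limit is that input exceeding it has to be shortened. The sketch below shows one simple strategy, keeping only the most recent tokens. The helper name `truncate_to_window` and the limit of 4 are hypothetical and chosen for illustration; real systems use much larger limits and may apply other strategies.

```python
def truncate_to_window(tokens, max_tokens):
    """Keep only the most recent max_tokens tokens (hypothetical helper)."""
    if max_tokens <= 0:
        return []
    return tokens[-max_tokens:]

# Whitespace splitting stands in for a real tokenizer here.
tokens = "the quick brown fox jumps over the lazy dog".split()
window = truncate_to_window(tokens, 4)
print(window)
```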

Each input provided to the model is broken down into tokens and analyzed, and the resulting representation is used to create a response. The response is produced in tokens as well: the model generates one token at a time, each based on the tokens that came before it.
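Generating a response one token at a time can be sketched as a simple loop. In the toy example below, a lookup table stands in for the model: it maps the tokens seen so far to the next token. The table, the `<eos>` end marker, and the `generate` function are all assumptions made for illustration; a real LLM computes a probability distribution over its vocabulary at each step instead.

```python
# Toy stand-in for a model: maps the tokens seen so far to the next token.
TOY_MODEL = {
    ("the",): "cat",
    ("the", "cat"): "sat",
    ("the", "cat", "sat"): "down",
    ("the", "cat", "sat", "down"): "<eos>",
}

def generate(prompt_tokens, max_new_tokens=10):
    """Append one token per step until the end marker or the step limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # Unknown contexts fall back to the end marker so the loop stops.
        next_token = TOY_MODEL.get(tuple(tokens), "<eos>")
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens

generated = generate(["the"])
print(generated)
```

Each newly generated token becomes part of the input for the next step, which is why generation proceeds token by token.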

Types of tokens:

Here are some types of tokens used in large language models:

Word Tokens: These represent individual words or phrases in the text, like "house."
Sub-word Tokens: Words can be divided into smaller sub-word units. For instance, "speaking" can be segmented into "speak" and "ing."
Punctuation Tokens: Tokens that signify various punctuation marks, such as commas (","), periods ("."), and others.
Special Tokens: Unique symbols like "[CLS]" (classification token), "[SEP]" (separator token), or "[MASK]" (mask token) have specific roles within the model.
Number Tokens: Numbers written in the text are represented as tokens of their own. For example, "10" might be represented as a single numerical token.
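The token types listed above can be illustrated with a small rule-based tokenizer that labels each piece it produces. This is a minimal sketch under stated assumptions: the regular expression, the fixed "ing" suffix rule, and the `tokenize` function are hypothetical. Real tokenizers generally learn sub-word splits from data instead of using hand-written rules.

```python
import re

SPECIAL = {"[CLS]", "[SEP]", "[MASK]"}

# Order matters: bracketed special tokens, then digits, then words,
# then any single non-space, non-word character (punctuation).
TOKEN_RE = re.compile(r"\[[A-Z]+\]|\d+|\w+|[^\w\s]")

def tokenize(text):
    """Return (token, type) pairs using simple hand-written rules."""
    labeled = []
    for piece in TOKEN_RE.findall(text):
        if piece in SPECIAL:
            labeled.append((piece, "special"))
        elif piece.isdigit():
            labeled.append((piece, "number"))
        elif re.fullmatch(r"[^\w\s]", piece):
            labeled.append((piece, "punctuation"))
        elif piece.endswith("ing") and len(piece) > 5:
            # Toy sub-word rule: split off the "ing" suffix.
            labeled.append((piece[:-3], "sub-word"))
            labeled.append(("ing", "sub-word"))
        else:
            labeled.append((piece, "word"))
    return labeled

result = tokenize("[CLS] She is speaking, house 10. [SEP]")
for token, kind in result:
    print(f"{token!r}: {kind}")
```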


