Tokens In Foundational Models

Tokens are the smallest units of text that a model processes in Natural Language Processing (NLP). Depending on the granularity of the model's tokenizer, a token can be a word, a character, a subword, or even a whole sentence.

Areas of application

  • Natural Language Processing (NLP)
  • Language Modeling
  • Text Analysis
  • Information Retrieval

Example

In a language model, tokens could be individual words (‘dog’, ‘cat’, etc.), subwords (e.g., ‘play’ and ‘ing’ for ‘playing’), or individual characters (e.g., ‘d’, ‘o’, ‘g’).
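
The difference between these granularities can be illustrated with a minimal Python sketch. It is not how any particular model tokenizes text: the function names and the toy vocabulary `TOY_VOCAB` are hypothetical, and the subword splitter uses a simple greedy longest-match rule as a stand-in for learned schemes such as BPE or WordPiece.

```python
def word_tokens(text):
    """Word-level: split on whitespace, one token per word."""
    return text.split()


def char_tokens(text):
    """Character-level: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def subword_tokens(word, vocab):
    """Subword-level: greedy longest-match split of one word.

    `vocab` is a set of known subword strings (an assumption of this
    sketch; real tokenizers learn their vocabulary from data).
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # Nothing matched: fall back to a single character.
            end = start + 1
        tokens.append(word[start:end])
        start = end
    return tokens


# Hypothetical toy vocabulary, chosen only for this illustration.
TOY_VOCAB = {"dog", "cat", "play", "ing", "s"}

print(word_tokens("dogs playing"))           # ['dogs', 'playing']
print(char_tokens("dog"))                    # ['d', 'o', 'g']
print(subword_tokens("playing", TOY_VOCAB))  # ['play', 'ing']
print(subword_tokens("dogs", TOY_VOCAB))     # ['dog', 's']
```

The same text yields a different number of tokens at each granularity, which is why token counts depend on the tokenizer a model uses.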