Glossary

MMLU

The Massive Multitask Language Understanding (MMLU) benchmark is a comprehensive assessment tool for language models, evaluating their knowledge and problem-solving ability across diverse fields. Its test set contains 14,079 multiple-choice questions spanning 57 subjects, measuring how well a model handles complex topics ranging from elementary mathematics to law.

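To make the evaluation format concrete, here is a minimal sketch of how MMLU-style multiple-choice items can be scored as plain accuracy. The two items and the model predictions are hypothetical, and this is not the official evaluation harness.

```python
# Minimal sketch of MMLU-style scoring: each item is a multiple-choice
# question with four options (A-D), and the metric is plain accuracy.
# The example items below are hypothetical, not taken from the benchmark.

items = [
    {
        "question": "Which planet is known as the Red Planet?",
        "choices": {"A": "Venus", "B": "Mars", "C": "Jupiter", "D": "Mercury"},
        "answer": "B",
    },
    {
        "question": "What is the derivative of x^2 with respect to x?",
        "choices": {"A": "x", "B": "2", "C": "2x", "D": "x^2"},
        "answer": "C",
    },
]

def score(predictions, items):
    """Fraction of items where the predicted letter matches the gold answer."""
    correct = sum(pred == item["answer"] for pred, item in zip(predictions, items))
    return correct / len(items)

# A hypothetical model's answer letters for the two items above.
model_predictions = ["B", "A"]
print(f"Accuracy: {score(model_predictions, items):.2f}")  # Accuracy: 0.50
```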

Model Checking

Model checking is an automated technique for verifying that a model of a system satisfies a formal specification. The system is typically represented as a transition system, and the specification is expressed as properties such as safety (nothing bad ever happens) and liveness (something good eventually happens), often written in temporal logic.

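As an illustration, here is a minimal sketch of explicit-state model checking for a safety property: a breadth-first search over the reachable states of a small, hypothetical traffic-light transition system that returns a counterexample path if a "bad" state is reachable. Liveness properties require additional machinery (such as cycle detection) and are not shown.

```python
# Minimal sketch of explicit-state safety checking: explore all reachable
# states of a small transition system and confirm that no "bad" state is
# reachable. The traffic-light model here is hypothetical.

from collections import deque

# Transition system: each state maps to its possible successor states.
transitions = {
    "red": ["red_yellow"],
    "red_yellow": ["green"],
    "green": ["yellow"],
    "yellow": ["red"],
}
initial_state = "red"

# Safety property: the system never enters a state in this set.
bad_states = {"green_and_red"}  # e.g. conflicting signals

def check_safety(initial, transitions, bad_states):
    """Breadth-first reachability: return a counterexample path to a bad
    state if one exists, otherwise None (the safety property holds)."""
    queue = deque([[initial]])
    visited = {initial}
    while queue:
        path = queue.popleft()
        state = path[-1]
        if state in bad_states:
            return path  # counterexample trace
        for succ in transitions.get(state, []):
            if succ not in visited:
                visited.add(succ)
                queue.append(path + [succ])
    return None

counterexample = check_safety(initial_state, transitions, bad_states)
print("Safe" if counterexample is None else f"Violation: {counterexample}")
```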