BIG-Bench: Capabilities and biases of large language models
BIG-Bench is a large-scale benchmark for assessing LLMs, offering challenging, diverse tasks designed to remain difficult as models improve and to evaluate their capabilities and biases comprehensively.
AGIEval is a human-centric benchmark that evaluates the general abilities of foundation models on tasks pertinent to human cognition and problem-solving.
Abductive logic programming (ALP) helps solve complex AI problems, enhancing efficiency and accuracy when combined with conventional methods.