HumanEval is a dataset designed to evaluate the code generation capabilities of large language models (LLMs).
The benchmark consists of 164 hand-crafted programming problems, each including a function signature, docstring, body, and several unit tests.
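
As a quick illustration, here is a minimal sketch of inspecting one problem via the Hugging Face `datasets` library, assuming the `openai_humaneval` dataset ID and its published field names (`task_id`, `prompt`, `canonical_solution`, `test`, `entry_point`):

```python
# Minimal sketch: load HumanEval and look at a single problem.
# Assumes the Hugging Face dataset ID "openai_humaneval" and its standard fields.
from datasets import load_dataset

dataset = load_dataset("openai_humaneval", split="test")

problem = dataset[0]
print(problem["task_id"])             # e.g. "HumanEval/0"
print(problem["prompt"])              # function signature + docstring shown to the model
print(problem["canonical_solution"])  # reference function body
print(problem["entry_point"])         # name of the function under test
print(problem["test"])                # unit tests used to verify a generated solution
```

In a typical evaluation loop, a model is prompted with `prompt`, its completion is appended to form a full function, and the code in `test` is executed against that function (in a sandbox) to decide pass or fail.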