HarpIA is a platform that aims to assist in the evaluation of large language models (LLMs). These models have evolved rapidly in recent years, and the practice of evaluating them needs to keep pace. The HarpIA platform seeks to contribute to this end by promoting productivity, standardization, reproducibility, and transparency in current LLM evaluation practices.
The platform is composed of modules that operate independently. Each module offers a set of tools that can be chained together to satisfy different evaluation requirements, because the tools share input and output interfaces. For example, the output of the metrics tool of the HarpIA Lab serves as the input for the statistical analysis tool or the graphical analysis tool that will be offered by the HarpIA Lab. Finally, all modules of the HarpIA platform are developed under an open-source policy so that they can be adapted or extended to satisfy any particular evaluation requirements the user may have.
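To make the idea of shared interfaces concrete, here is a minimal Python sketch of a metrics tool whose output is consumed directly by a statistical analysis tool. It is only an illustration: the names `MetricResult`, `metrics_tool`, and `statistics_tool`, the exact-match metric, and the sample data are all assumptions made for this example and are not HarpIA's actual API or file formats.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MetricResult:
    """Hypothetical shared record: what one tool emits and the next one consumes."""
    model: str
    metric: str
    scores: list[float]


def metrics_tool(model: str, predictions: list[str], references: list[str]) -> MetricResult:
    """Hypothetical metrics tool: per-example exact match (1.0 or 0.0)."""
    scores = [float(p.strip() == r.strip()) for p, r in zip(predictions, references)]
    return MetricResult(model=model, metric="exact_match", scores=scores)


def statistics_tool(result: MetricResult) -> dict:
    """Hypothetical statistical analysis tool: summarizes a MetricResult."""
    return {
        "model": result.model,
        "metric": result.metric,
        "n": len(result.scores),
        "mean": mean(result.scores),
    }


# Illustrative data only; the output of one tool is passed straight to the next.
result = metrics_tool("model-a", ["Paris", "4", "blue"], ["Paris", "5", "blue"])
summary = statistics_tool(result)
print(summary)
```

Because both functions agree on the `MetricResult` shape, a different downstream tool (for instance, one that plots the scores) could be swapped in without changing the metrics step.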
Learn more about the tools offered by each module and their current development stage:
“Perhaps more than a platform, HarpIA can be seen as an ecosystem for evaluating large language models.”
Team, 2025

The name HarpIA refers to the harpy eagle, one of the most powerful birds of prey. It is found in Brazil, especially in the Amazon and the Atlantic Forest, as well as in other regions of South America. It is a strong, astute, and observant bird. One of its traits is to scan its prey, assess it, and only then decide whether to capture it.