HarpIA Twin

The HarpIA Twin module is designed to support the evaluation of LLMs integrated into sociotechnical information systems, drawing inspiration from the concept of digital twins. Unlike the HarpIA Survey and HarpIA Lab modules, which focus on human interaction and controlled experimentation, HarpIA Twin aims to link the evaluation activities (whether automated or human-driven) supported by the platform to the complete usage context of the LLMs.

This module, currently in its early development stage, is intended to help assess whether an LLM, when embedded within a system, meets functional and non-functional requirements that depend on the model’s outputs.

Some functionalities under development in this module include:

Simulation of real-world environments, enabled by APIs that allow the platform to communicate with the LLM embedded in the business logic implemented in the information system;
Integration and evaluation of different LLMs, based on their interaction with the same sociotechnical system;
Registration of system requirements and association of the LLM’s output and behavior with those predefined functional and non-functional requirements.
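
As an illustration of the last two items, the sketch below shows one way requirements could be registered and then checked against the outputs of several LLM-backed systems given the same inputs. It is not the HarpIA Twin API, which is still under development: the names `Requirement`, `evaluate`, `terse_system`, and `verbose_system` are hypothetical, and the two toy functions stand in for calls that would reach a real information system through its API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for "the LLM embedded in the information system":
# a callable that receives a user input and returns the system's response.
SystemUnderTest = Callable[[str], str]


@dataclass(frozen=True)
class Requirement:
    """A registered requirement plus an automated check on (input, output)."""
    req_id: str
    kind: str  # "functional" or "non-functional"
    description: str
    check: Callable[[str, str], bool]


def evaluate(
    systems: Dict[str, SystemUnderTest],
    inputs: List[str],
    requirements: List[Requirement],
) -> Dict[str, Dict[str, float]]:
    """Return, per system and requirement, the fraction of inputs that pass."""
    if not inputs:
        raise ValueError("at least one input is required")
    report: Dict[str, Dict[str, float]] = {}
    for name, system in systems.items():
        # Every system receives exactly the same inputs.
        pairs = [(text, system(text)) for text in inputs]
        report[name] = {
            req.req_id: sum(req.check(i, o) for i, o in pairs) / len(pairs)
            for req in requirements
        }
    return report


# Toy systems (assumptions for illustration only).
def terse_system(text: str) -> str:
    return "OK"


def verbose_system(text: str) -> str:
    return f"You asked: {text}. " + "x" * 200


requirements = [
    Requirement("F1", "functional", "Response is non-empty",
                lambda i, o: bool(o.strip())),
    Requirement("NF1", "non-functional", "Response has at most 100 characters",
                lambda i, o: len(o) <= 100),
]

report = evaluate(
    {"terse": terse_system, "verbose": verbose_system},
    ["What are the opening hours?", "How do I reset my password?"],
    requirements,
)
```

Keeping each requirement as data with an attached check means the same set of requirements can be reused unchanged when a different LLM is plugged into the system, which is the comparison the second item describes.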

Once implemented, the HarpIA Twin module is expected to make it easier to operationalize activities such as continuous monitoring of LLM performance and structured data collection to support continuous improvement cycles.

Development stage

Release 1:

Release 1 will introduce functionalities for communicating with external systems via API and for integrating with different LLMs.