i'm afraid i don't use those. nonetheless, i think a good llm eval tool should make it easy to
• examine & label data (and calibrate annotators)
• evaluate & align llm-evaluators to labels
• optimize evaluators
wrote more about it here: eugeneyan.com/writing/alig...
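
to make the "evaluate & align llm-evaluators to labels" step concrete, here's a rough sketch (not from the linked post) of how one might score an llm-evaluator against human labels. the function names, the "pass"/"fail" label scheme, and the toy data are all hypothetical; it assumes binary labels and uses cohen's kappa plus precision/recall on the "fail" class:

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty, equal-length label lists")
    n = len(a)
    # Fraction of items where both raters gave the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def precision_recall(human, predicted, positive="fail"):
    """Precision and recall of `predicted` on the `positive` class, vs. human labels."""
    pairs = list(zip(human, predicted))
    tp = sum(h == positive and p == positive for h, p in pairs)
    fp = sum(h != positive and p == positive for h, p in pairs)
    fn = sum(h == positive and p != positive for h, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: human labels vs. llm-evaluator verdicts.
human_labels = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
llm_labels = ["pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass"]

kappa = cohen_kappa(human_labels, llm_labels)
precision, recall = precision_recall(human_labels, llm_labels)
print(f"kappa={kappa:.3f} precision={precision:.3f} recall={recall:.3f}")
```

the same `cohen_kappa` can be run between two human annotators first (the "calibrate annotators" step), so you know how much agreement is realistic before holding the llm-evaluator to it.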