Evaluation Code

#4
by liyongkang - opened

Hi,

Thank you very much for your work. I have been trying to replicate the results on BRIGHT using this model, but the scores I obtained are significantly lower than the ones you reported. Could you share the evaluation code? I would also greatly appreciate it if you could provide the prompts you used for each dataset.

Thank you so much for your help!

Beijing Academy of Artificial Intelligence org

Hello, @liyongkang. I just updated FlagEmbedding to support evaluation on the BRIGHT benchmark. You can refer to this script to reproduce the evaluation results: https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/evaluation/bright/eval_bright_short.sh
