Evaluation Code #4
Opened by liyongkang
Hi,
Thank you very much for your work. I have been trying to replicate the results on BRIGHT using this model, but the results I obtained were significantly lower than the ones you reported. Would it be possible for you to share the evaluation code? I would also greatly appreciate it if you could provide the prompts you used for each dataset.
Thank you so much for your help!
Hello, @liyongkang. I just updated FlagEmbedding to support evaluation on the BRIGHT benchmark. You can refer to this script to reproduce the evaluation results: https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/evaluation/bright/eval_bright_short.sh