Hi, I have a question regarding BLEU and METEOR.
The function evaluate_metrics_from_lists() returns metrics and per_file_metrics, where metrics covers the whole dataset while per_file_metrics covers each file. I found that the means of the BLEU and METEOR scores in per_file_metrics are not equal to the corresponding values in metrics.
For example, bleu1 in metrics doesn't equal mean(bleu1) over all files in per_file_metrics. The other metrics (cider, spice, and spider) do match.
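To illustrate the kind of discrepancy I mean: if corpus-level BLEU pools clipped n-gram counts across all files before dividing (the standard BLEU definition), it will generally differ from the mean of per-file scores. A minimal toy sketch (BLEU-1 precision only, brevity penalty omitted; the data and function here are made up for illustration, not taken from the library):

```python
from collections import Counter

def clipped_unigram_counts(candidate, references):
    # Clipped unigram matches for one candidate against its references.
    cand_counts = Counter(candidate)
    max_ref = Counter()
    for ref in references:
        for tok, c in Counter(ref).items():
            max_ref[tok] = max(max_ref[tok], c)
    clipped = sum(min(c, max_ref[tok]) for tok, c in cand_counts.items())
    return clipped, len(candidate)

# Two toy "files": one short candidate matches perfectly, one long one not at all.
data = [
    (["a", "cat"], [["a", "cat"]]),                     # 2/2 matches
    (["the", "dog", "ran", "far"], [["dogs", "run"]]),  # 0/4 matches
]

per_file = []
total_clipped, total_len = 0, 0
for cand, refs in data:
    clipped, length = clipped_unigram_counts(cand, refs)
    per_file.append(clipped / length)   # per-file BLEU-1 precision
    total_clipped += clipped            # pooled counts for corpus-level score
    total_len += length

mean_of_per_file = sum(per_file) / len(per_file)  # (1.0 + 0.0) / 2 = 0.5
corpus_level = total_clipped / total_len          # 2 / 6 ≈ 0.333

print(mean_of_per_file, corpus_level)
```

The two numbers disagree because the corpus score weights files by their token counts, while the mean weights every file equally. CIDEr and SPICE, by contrast, are typically reported as averages of per-sentence scores, which would explain why those do match.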
Do you know the reason? I noticed someone mentioned this in cococaption's issues, but it was never resolved.
Thank you very much!