ILSUM 2024

Indian Language Summarization

Task 1

For Task 1, standard ROUGE metrics will be used to evaluate automatic summarization in the live submission system. Additionally, once all submissions have been received, BERTScore will be computed offline to gain insight into abstractive summarization methods. BERTScore will not be part of the leaderboard scores because of its high computational requirements. We also plan to manually evaluate part of the dataset.

Task 2

Macro F1 Score will be used for Task 2.
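
For reference, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below is a minimal illustration in plain Python, not the official scorer. The function name `macro_f1` and the labels "A" and "B" are hypothetical placeholders, and averaging over the union of gold and predicted labels is an assumption about how the scorer handles classes.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (illustrative, not the official scorer)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Assumption: average over every label seen in either the gold or predicted list.
    labels = set(y_true) | set(y_pred)
    pairs = list(zip(y_true, y_pred))

    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Hypothetical labels, for illustration only.
gold = ["A", "A", "A", "B"]
pred = ["A", "A", "B", "B"]
print(round(macro_f1(gold, pred), 4))  # 0.7333 (class A: 0.8, class B: 0.6667)
```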

Note: While we will use a maximum length of 75 words for evaluation, participants are expected to predict an appropriate summary length for each article. A summary that is much longer than the ground truth summary can adversely affect ROUGE precision, and one that is much shorter can adversely affect ROUGE recall.
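
The effect of summary length on ROUGE can be illustrated with a simplified ROUGE-1 computation. This is only a sketch under stated assumptions: it uses whitespace tokenization (which is a simplification for Indian languages), it assumes the 75-word limit is applied by truncating the predicted summary, and the function name `rouge1` is hypothetical. It is not the evaluation script used by the submission system.

```python
from collections import Counter


def rouge1(candidate, reference, max_words=75):
    """Simplified ROUGE-1 precision, recall, and F1 over whitespace tokens."""
    # Assumption: the length limit is enforced by truncating the candidate.
    cand_tokens = candidate.split()[:max_words]
    ref_tokens = reference.split()

    # Clipped unigram overlap: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


reference = "the cat sat on the mat"

# Too short: every word matches, but most of the reference is missed.
print(rouge1("the cat", reference))  # precision 1.0, recall about 0.33

# Too long: the reference is fully covered, but half the words are extra.
print(rouge1("the cat sat on the mat and then it slept all day", reference))  # precision 0.5, recall 1.0
```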