BANKING77 Benchmark Sees New Best Score of 94.61%

A new best score of 94.61% has been reached on the BANKING77 test set, 0.13 percentage points above the previous best of 94.48%.
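Because both scores are already percentages, the gap between them is a difference in percentage points, not a relative change. The short sketch below works through that arithmetic and also expresses the same gain as a reduction in error rate. The function name `compare_scores` is hypothetical and chosen only for illustration, and the code assumes the scores can be read as percent-correct figures.

```python
def compare_scores(new_score: float, old_score: float) -> dict:
    """Compare two scores given in percent (e.g. 94.61 and 94.48)."""
    # Absolute gain, measured in percentage points.
    gain_points = new_score - old_score

    # Error rates are whatever is left over from 100%.
    old_error = 100.0 - old_score
    new_error = 100.0 - new_score

    # Share of the old error that the new result removes, in percent.
    relative_error_reduction = (old_error - new_error) / old_error * 100.0

    return {
        "gain_points": round(gain_points, 2),
        "relative_error_reduction_pct": round(relative_error_reduction, 2),
    }


if __name__ == "__main__":
    print(compare_scores(94.61, 94.48))
```

Read this way, a 0.13-point gain is small in absolute terms but still removes a measurable share of the remaining errors.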

The key details of this achievement include:

  • A gain of 0.13 percentage points over the previous best score
  • A gain of 0.78 percentage points over the widely cited 93.83% baseline
  • No test leakage: the recipe was frozen using 5-fold cross-validation on the official train set, then the model was retrained on 100% of the train data and evaluated on the test set once
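The leakage-free protocol in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: `kfold_indices`, `select_recipe`, and `run_protocol` are hypothetical names, and `fit` and `score` stand in for whatever training and scoring routines are really used. The point it shows is that recipe selection sees only training data, and the test set is scored exactly once at the end.

```python
import random
from statistics import mean


def kfold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val


def select_recipe(recipes, train_set, fit, score, k=5):
    """Pick the recipe with the best mean cross-validation score.

    Only train_set is used here, so the test set cannot influence the choice.
    """
    def cv_score(recipe):
        scores = []
        for tr, va in kfold_indices(len(train_set), k):
            model = fit(recipe, [train_set[i] for i in tr])
            scores.append(score(model, [train_set[i] for i in va]))
        return mean(scores)

    return max(recipes, key=cv_score)


def run_protocol(recipes, train_set, test_set, fit, score):
    """Freeze a recipe via CV, retrain on all train data, test once."""
    frozen = select_recipe(recipes, train_set, fit, score)
    final_model = fit(frozen, train_set)         # retrain on 100% of train
    return frozen, score(final_model, test_set)  # single final evaluation
```

Keeping the test-set call in one place, after the recipe is frozen, makes it easy to check that no tuning decision ever depended on test results.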

The model remains relatively compact, with a footprint of approximately 68 MiB and an inference time of around 216 ms. The gain came from multiview encoder adaptation on the last layers, a relatively lightweight change that delivered the improvement after many smaller tweaks failed to transfer from the holdout set to the test set.

The result shows how hard it is to carry gains from a holdout set over to the true held-out test set, and how much persistence and inventive problem-solving a new best score can require.
