Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization

Volume: 58, Issue: 5, Pages: 916 - 932
Published: Apr 26, 2018
Abstract
Undetected overfitting can occur when there are significant redundancies between training and validation data. We describe AVE, a new measure of training-validation redundancy for ligand-based classification problems that accounts for the similarity among inactive molecules as well as among active ones. We investigated seven widely used benchmarks for virtual screening and classification, and show that the amount of AVE bias strongly correlates with the...