Oral in Affinity Workshop: New In ML
Unlocking Chemical Insights: Superior Molecular Representations from Intermediate Encoder Layers
Tue 15 Jul 8 a.m. PDT — 5:30 p.m. PDT
Pretrained molecular encoders have become indispensable in computational chemistry for tasks such as property prediction and molecular generation. However, the standard practice of relying solely on final-layer embeddings for downstream tasks may discard valuable information. In this work, we challenge this convention with a comprehensive layer-wise analysis of five diverse molecular encoders across 22 ADMET (absorption, distribution, metabolism, excretion, and toxicity) property prediction tasks. Our results show that embeddings from intermediate layers consistently outperform final-layer representations. Using fixed embeddings from the optimal intermediate layers improved downstream performance by an average of 5.4%, with gains of up to 28.6%. Finetuning up to these intermediate layers yielded even greater average improvements of 8.5%, with gains as high as 40.8%, and achieved new state-of-the-art results on several benchmarks. These findings highlight the importance of exploring the full representational depth of molecular encoders to achieve substantial performance improvements and computational efficiency.
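As a rough illustration of the layer-wise analysis with fixed embeddings described above, the sketch below probes every layer of a frozen encoder and keeps the layer with the lowest validation error. It is not the authors' pipeline: the encoder is a hypothetical toy stack of random `tanh` layers, the data are synthetic, and the names `toy_encoder`, `ridge_fit_predict`, and `select_best_layer` are assumptions made for this example. With a real molecular encoder, the per-layer hidden states would presumably take the place of the toy outputs.

```python
import numpy as np


def toy_encoder(x, weights):
    """Return the hidden state after every layer of a toy frozen encoder.

    Hypothetical stand-in for a pretrained molecular encoder.
    """
    hidden_states = []
    h = x
    for w in weights:
        h = np.tanh(h @ w)
        hidden_states.append(h)
    return hidden_states


def ridge_fit_predict(train_x, train_y, test_x, alpha=1.0):
    """Fit a closed-form ridge regression probe and predict on test_x."""
    dim = train_x.shape[1]
    gram = train_x.T @ train_x + alpha * np.eye(dim)
    coef = np.linalg.solve(gram, train_x.T @ train_y)
    return test_x @ coef


def select_best_layer(x_train, y_train, x_val, y_val, weights):
    """Probe each layer's fixed embeddings; return (best index, per-layer MAE)."""
    train_states = toy_encoder(x_train, weights)
    val_states = toy_encoder(x_val, weights)
    errors = []
    for h_train, h_val in zip(train_states, val_states):
        pred = ridge_fit_predict(h_train, y_train, h_val)
        errors.append(float(np.mean(np.abs(pred - y_val))))
    return int(np.argmin(errors)), errors


# Synthetic data only: random inputs and targets, not ADMET measurements.
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.5, size=(16, 16)) for _ in range(4)]
x = rng.normal(size=(200, 16))
y = rng.normal(size=200)

best_layer, errors = select_best_layer(x[:150], y[:150], x[150:], y[150:], weights)
print("validation MAE per layer:", [round(e, 3) for e in errors])
print("selected layer index:", best_layer)
```

The layer is chosen on a held-out validation split rather than on the test set, so the reported comparison against the final layer is not biased by the selection step. The last entry of `errors` corresponds to the usual final-layer baseline.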