Document Type


Publication Date



Dye aggregates are of interest for excitonic applications, including biomedical imaging, organic photovoltaics, and quantum information systems. Dyes with large transition dipole moments (μ) are necessary to optimize coupling within dye aggregates. Because the extinction coefficient (ε) can be used to determine the μ of a dye, dyes with a large ε (>150,000 M⁻¹ cm⁻¹) should be engineered or identified. However, the dye properties that lead to a large ε are not fully understood, and low-throughput screening methods, such as experimental measurements or density functional theory (DFT) calculations, can be time-consuming. To screen large datasets of molecules for desirable properties (i.e., large ε and μ), a computational workflow was established using machine learning (ML), DFT, time-dependent (TD-) DFT, and molecular dynamics (MD). ML models were trained and validated on a dataset of 8802 dyes using structural features. A Classifier was developed with an accuracy of 97%, and a Regressor was constructed with an R² above 0.9 between experimental and ML-predicted values. Using the Regressor, the ε values of over 18,000 dyes were predicted. The top 100 dyes were further screened using DFT and TD-DFT to identify 15 dyes with a large μ relative to a reference dye, the pentamethine indocyanine dye Cy5. Two benchmark MD simulations were performed on Cy5 and Cy5.5 dimers, and MD was found to accurately capture experimental results. The results of this study demonstrate that the computational workflow is effective for identifying dyes with a large μ and can be used as a tool to develop new dyes for excitonic applications.
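As an illustration of how ε relates to μ, the sketch below estimates a transition dipole moment by integrating ε(ν̃)/ν̃ over an absorption band. It is not the method used in the article. It assumes the commonly quoted dipole-strength relation |μ|² ≈ 9.18 × 10⁻³ ∫ ε(ν̃)/ν̃ dν̃ (in debye², with ε in M⁻¹ cm⁻¹ and ν̃ in cm⁻¹) and omits any refractive-index or local-field correction. The single Gaussian band and its parameters (peak height, position, width) are hypothetical values chosen only to resemble a cyanine-like dye, and the function names are illustrative.

```python
import math

# Commonly quoted prefactor (debye^2 per M^-1 cm^-1); assumption: no
# refractive-index or local-field correction is applied.
DIPOLE_STRENGTH_PREFACTOR = 9.18e-3


def gaussian_band(nu, eps_max, nu0, fwhm):
    """Hypothetical single-Gaussian absorption band, epsilon in M^-1 cm^-1."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return eps_max * math.exp(-0.5 * ((nu - nu0) / sigma) ** 2)


def transition_dipole_debye(wavenumbers, epsilons):
    """Estimate |mu| in debye by trapezoidal integration of eps(nu)/nu."""
    integral = 0.0
    for i in range(1, len(wavenumbers)):
        f_prev = epsilons[i - 1] / wavenumbers[i - 1]
        f_curr = epsilons[i] / wavenumbers[i]
        integral += 0.5 * (f_prev + f_curr) * (wavenumbers[i] - wavenumbers[i - 1])
    return math.sqrt(DIPOLE_STRENGTH_PREFACTOR * integral)


# Illustrative spectrum: 12,000 to 19,000 cm^-1 in 5 cm^-1 steps.
nu_grid = [12000.0 + 5.0 * i for i in range(1401)]
# Hypothetical band: eps_max = 250,000 M^-1 cm^-1 near 15,400 cm^-1 (~650 nm).
eps = [gaussian_band(nu, 250000.0, 15400.0, 800.0) for nu in nu_grid]

mu = transition_dipole_debye(nu_grid, eps)
print(f"Estimated transition dipole moment: {mu:.1f} D")
```

Because μ scales with the square root of the integrated band, doubling ε at every wavenumber raises μ by a factor of √2, which is why peak ε alone is only a proxy for μ and band width also matters.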


For a complete list of authors, please see the article.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.