Efficiency Prediction for Organic Photovoltaic Cells Using Molecular Fingerprints and Machine Learning Regression Models
ZHENG Yujie1,†, LIANG Xinbin1,†, ZHANG Qi1, SUN Wenbo1, SHI Tongchao2,3, DU Juan2,3, SUN Kuan1
1 MOE Key Laboratory of Low-grade Energy Utilization Technologies and Systems, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, China
2 State Key Laboratory of High Field Laser Physics, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China
3 Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: The development of organic photovoltaics (OPV) relies heavily on the discovery of efficient new OPV materials. In recent years, machine learning-assisted OPV material development has received wide attention as a way to overcome the inefficiency of the traditional development mode. Herein, we propose a method that combines molecular fingerprints with regression models to rapidly predict the power conversion efficiency of newly designed OPV donor materials. Using the latest donor material database collected from the Web of Science, we systematically compared the prediction accuracies of different combinations of molecular fingerprints and machine learning regression models. The combination of the Morgan fingerprint and the random forest model performs best under the R-squared metric, whereas the combination of the Hybridization fingerprint and the support vector machine model performs best under the mean absolute error metric. Moreover, as a general trend, the prediction accuracy of all models increases with the length of the molecular fingerprint. This method can be used for fast preliminary screening of new OPV materials, and thus promotes the development of high-performance OPVs by accelerating the development of new OPV materials.
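The abstract reports that different fingerprint/model combinations rank first under R-squared and under mean absolute error. The minimal sketch below illustrates how that can happen. It uses the standard textbook definitions of the two metrics, and the efficiency values and the two sets of "model" predictions are hypothetical numbers chosen for illustration. They are not data or results from this work.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute deviation between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured efficiencies (%), for illustration only.
y_true = [2.0, 4.0, 6.0, 8.0, 10.0]

# Hypothetical model A: a moderate error of 1.0 on every sample.
pred_a = [3.0, 3.0, 7.0, 7.0, 11.0]

# Hypothetical model B: exact on four samples, one large miss of 3.0.
pred_b = [2.0, 4.0, 6.0, 8.0, 7.0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, r_squared(y_true, pred), mean_absolute_error(y_true, pred))
```

In this toy case, model A has the higher R-squared while model B has the lower mean absolute error. R-squared is built on squared residuals and so penalizes a single large miss more heavily than the mean absolute error does. This is one reason the two metrics need not select the same model.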