Abstract
We studied the theoretical foundations of artificial neural networks, specifically the possibility of approximating functions of many variables by superpositions of functions of one variable. We considered the most important universal approximation theorems. We also studied approximation theorems that constrain the required number of neurons in a layer (width constraint) or the number of layers in a neural network (depth constraint), and theorems whose authors prove the existence of lower bounds both for the number of layers and for the number of neurons per layer.
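For orientation, the central representation surveyed here is Kolmogorov's superposition theorem (Kolmogorov, 1957), which states that every continuous function of $n$ variables on the unit cube can be built from continuous univariate functions and addition alone; in a standard formulation (notation ours, for illustration),

\[
  f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right),
\]

while the universal approximation theorems (e.g., Cybenko, 1989) concern the density in $C([0,1]^n)$ of finite sums of the form

\[
  G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left( w_j^{\top} x + b_j \right)
\]

with a suitable (e.g., sigmoidal) activation function $\sigma$.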
References
Betelin V. B. On the Problem of Trust in Artificial Intelligence Technologies. Russian Journal of Cybernetics. 2021;2(3):6—7. DOI: 10.51790/2712-9942-2021-2-3-1.
Kolmogorov A. N. On the Representation of Continuous Functions of Several Variables as Superpositions of Continuous Functions of One Variable and Addition. Dokl. Akad. Nauk SSSR. 1957;114(5):953—956.
Lorentz G. G. Metric Entropy, Widths, and Superpositions of Functions. American Mathematical Monthly. 1962;69(6):469—485. DOI: 10.1080/00029890.1962.11989915.
Kolmogorov A. N., Tikhomirov V. M. ε-Entropy and ε-Capacity of Sets in Function Spaces. Uspekhi Mat. Nauk. 1959;14(2):3—86.
Sprecher D. On the Structure of Continuous Functions of Several Variables. Transactions of the American Mathematical Society. 1965;115(3):340—355. DOI: 10.2307/1994273.
Ostrand P. A. Dimension of Metric Spaces and Hilbert’s Problem 13. Bulletin of the American Mathematical Society. 1965;71(4):619—623. DOI: 10.1090/s0002-9904-1965-11363-5.
Akashi S. Application of ε-entropy Theory to Kolmogorov—Arnold Representation Theorem. Reports on Mathematical Physics. 2001;48(1—2):19—26. DOI: 10.1016/s0034-4877(01)80060-4.
Girosi F., Poggio T. Representation Properties of Networks: Kolmogorov’s Theorem is Irrelevant. Neural Computation. 1989;1(4):465—469. DOI: 10.1162/neco.1989.1.4.465.
Cybenko G. Approximation by Superpositions of a Sigmoidal Function. Mathematics of Control, Signals, and Systems. 1989;2(4):303—314. CiteSeerX: 10.1.1.441.7873. DOI: 10.1007/BF02551274.
Funahashi K.-I. On the Approximate Realization of Continuous Mappings by Neural Networks. Neural Networks. 1989;2(3):183—192. DOI: 10.1016/0893-6080(89)90003-8.
Hornik K., Stinchcombe M., White H. Multilayer Feedforward Networks are Universal Approximators. Neural Networks. 1989;2(5):359—366. DOI: 10.1016/0893-6080(89)90020-8.
Hornik K. Approximation Capabilities of Multilayer Feedforward Networks. Neural Networks. 1991;4(2):251—257. DOI: 10.1016/0893-6080(91)90009-T.
Husaini N. A., Ghazali R., Nazri M. N., Lokman H. I., Mustafa M. D., Tutut H. Pi-Sigma Neural Network for a One-Step-Ahead Temperature Forecasting. International Journal of Computational Intelligence and Applications. 2014;13(4):1450023. DOI: 10.1142/S1469026814500230.
Lu Z., Pu H., Wang F., Hu Z., Wang L. The Expressive Power of Neural Networks: A View from the Width. Available at: https://doi.org/10.48550/arXiv.1709.02540.
Eldan R., Shamir O. The Power of Depth for Feedforward Neural Networks. Proceedings of Machine Learning Research. 2016;49:907—940.
Cohen N., Sharir O., Shashua A. On the Expressive Power of Deep Learning: A Tensor Analysis. Proceedings of Machine Learning Research. 2016;49:698—728.
Telgarsky M. Benefits of Depth in Neural Networks. Proceedings of Machine Learning Research. 2016;49:1517—1539.
Park S., Yun C., Lee J., Shin J. Minimum Width for Universal Approximation. Available at: https://arxiv.org/abs/2006.08859.
Kidger P., Lyons T. Universal Approximation with Deep Narrow Networks. Proceedings of Machine Learning Research. 2020;125:2306—2327.
Leshno M., Lin V. Ya., Pinkus A., Schocken S. Multilayer Feedforward Networks with a Nonpolynomial Activation Function Can Approximate Any Function. Neural Networks. 1993;6(6):861—867.
Hanin B., Sellke M. Approximating Continuous Functions by ReLU Nets of Minimal Width. Available at: https://arxiv.org/abs/1710.11278.
Johnson J. Deep, Skinny Neural Networks are not Universal Approximators. Available at: https://arxiv.org/abs/1810.00393.
Kidger P., Lyons T. Universal Approximation with Deep Narrow Networks. Available at: https://arxiv.org/abs/1905.08539.
Maiorov V., Pinkus A. Lower Bounds for Approximation by MLP Neural Networks. Neurocomputing. 1999;25(1—3):81—91. DOI: 10.1016/S0925-2312(98)00111-8.
Guliyev N., Ismailov V. Approximation Capability of Two Hidden Layer Feedforward Neural Networks with Fixed Weights. Neurocomputing. 2018;316:262—269.
Guliyev N., Ismailov V. On the Approximation by Single Hidden Layer Feedforward Neural Networks with Fixed Weights. Neural Networks. 2018;98:296—304.