Evaluating Adan vs. Adam: an analysis of optimizer performance in deep learning
Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Proceeding Paper |
| Language: | en |
| Published: | Springer Nature Link, 2025 |
| Subjects: | |
| Online Access: | http://irep.iium.edu.my/120526/7/120526_Evaluating%20Adan%20vs.%20Adam.pdf http://irep.iium.edu.my/120526/ https://link.springer.com/chapter/10.1007/978-3-031-82931-4_19 https://doi.org/10.1007/978-3-031-82931-4_19 |
| Summary: | Choosing a suitable optimization algorithm in deep learning is essential for effective model development, as it significantly influences convergence speed, model performance, and the success of the training process. Optimizers play an essential role in adjusting the model's parameters to minimize errors, assisting the learning process during model development. With various optimization algorithms available, choosing the one that best suits the deep learning model and dataset can make a substantial difference in achieving optimal results. Adaptive Moment Estimation (Adam) and Adaptive Nesterov Accelerated Gradient (Adan), two well-known optimizers, are widely used in deep learning for their ability to handle large-scale data and complex models efficiently. While Adam is known for its balance between speed and reliability, Adan builds on this by incorporating momentum and lookahead mechanisms to enhance the model's performance. However, choosing the right optimizer for different tasks can be challenging, as each optimizer offers various advantages and disadvantages. This paper therefore explores the comparative effectiveness of the Adam and Adan optimizers, analyzing their impact on convergence speed, model performance, and overall training success on different classification tasks, namely image and text classification. The results show that Adam performs better initially but is prone to overfitting, whereas for image classification tasks Adan provides more consistent optimization across extended training periods. Based on these results, this paper aims to provide insights into the strengths and limitations of each optimizer, highlighting the importance of considering task-specific requirements when selecting an optimization algorithm for deep learning models. |
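To make the comparison in the abstract concrete, the following is a minimal sketch of the two update rules on a toy one-dimensional objective f(x) = x². The Adam update follows the standard first/second-moment form; the Adan-style update additionally tracks the gradient *difference* as a Nesterov-like correction term. The hyperparameter values, the omission of bias correction and weight decay in the Adan branch, and the scalar setting are all simplifying assumptions for illustration, not the paper's experimental configuration.

```python
import math

def adam(x, steps=500, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Standard scalar Adam on f(x) = x^2 (gradient g = 2x)."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = 2 * x
        m = b1 * m + (1 - b1) * g        # first moment: EMA of gradients
        v = b2 * v + (1 - b2) * g * g    # second moment: EMA of squared gradients
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def adan(x, steps=500, lr=0.1, b1=0.02, b2=0.08, b3=0.01, eps=1e-8):
    """Simplified scalar Adan-style update (assumption: no weight decay,
    no bias correction). Tracks the gradient difference d = g_k - g_{k-1}
    as a lookahead/Nesterov-style correction."""
    m = v = n = 0.0
    g_prev = 0.0
    for _ in range(steps):
        g = 2 * x
        d = g - g_prev                   # gradient difference
        m = (1 - b1) * m + b1 * g        # EMA of gradients
        v = (1 - b2) * v + b2 * d        # EMA of gradient differences
        n = (1 - b3) * n + b3 * (g + (1 - b2) * d) ** 2  # second-moment estimate
        x -= lr * (m + (1 - b2) * v) / (math.sqrt(n) + eps)
        g_prev = g
    return x

x_adam = adam(5.0)
x_adan = adan(5.0)
print(abs(x_adam), abs(x_adan))  # both should end near the minimum at 0
```

Both variants normalize the step by a running second-moment estimate, so on this toy problem they behave similarly; the practical differences the abstract reports (Adam's early speed versus Adan's consistency over long training) only emerge on real, non-convex models.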
