Generative AI: A Comprehensive Review of Foundational Models and Emerging Methods

Noper Ardi; Isnayanti

Authors

Noper Ardi, Politeknik Negeri Batam
Isnayanti, Institut Teknologi Batam

Keywords:

Generative AI, Foundational Models, Generative Adversarial Networks, Variational Autoencoders, Large Language Models

Abstract

Generative Artificial Intelligence (AI) has emerged as a transformative field within computer science, heralding a new era of content creation and problem-solving. This comprehensive review charts the evolution of generative models, from the foundational pillars to the cutting-edge methods that are reshaping industries. We begin by examining the seminal architectures that laid the groundwork for the field: Generative Adversarial Networks (GANs), with their unique adversarial training paradigm; Variational Autoencoders (VAEs), which leverage probabilistic graphical models for generation; and the early instantiations of Transformer models that revolutionized sequence-to-sequence tasks. Subsequently, we transition to the current vanguard of generative AI, providing an in-depth analysis of Large Language Models (LLMs). These models have demonstrated unprecedented capabilities in understanding and generating human-like text, leading to a paradigm shift in natural language processing. Concurrently, we explore the rise of Diffusion Models, which have set new benchmarks in high-fidelity image synthesis through a process of iterative denoising. This review synthesizes the theoretical underpinnings, architectural innovations, and practical applications of these models. We also present a comparative analysis, highlighting their respective strengths, limitations, and the evolutionary trajectory of the field. Finally, we discuss the prominent challenges and ethical considerations that accompany the proliferation of generative AI and conclude with a perspective on future research directions that will continue to propel this remarkable domain forward.

