Benchmarking Transformer Models Against Classical Approaches for Fake Review Detection on the Deceptive Opinion Spam Corpus
Abstract
Online reviews strongly influence customer decisions, particularly in e-commerce, travel and hospitality, where buyers rely on the shared experiences of others before making a choice. The growing volume of fake or fabricated reviews undermines the reliability of online platforms and misleads consumers. Detecting such reviews is difficult because their language closely resembles that of genuine opinions. This work compares traditional machine learning techniques with transformer-based deep learning models for identifying fake reviews. As baselines, Logistic Regression and Linear SVM were trained on TF-IDF features, while BERT, RoBERTa and XLNet were fine-tuned on the Deceptive Opinion Spam Corpus. The classical models reached accuracies in the mid-80 percent range, whereas the transformer-based models performed much better, approaching or exceeding 90 percent. Among the transformer models, RoBERTa showed the most balanced performance across precision and recall, XLNet gave the highest recall, which matters most when sensitivity is the main concern, and BERT achieved competitive results with lower demand on computing resources.
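The comparison above rests on accuracy, precision and recall, so it may help to see how these metrics can diverge. The following is a minimal sketch in plain Python, not the evaluation code used in the study. The function name `binary_metrics`, the label convention (1 = deceptive, 0 = truthful) and the toy label lists are assumptions made only for illustration and do not come from the Deceptive Opinion Spam Corpus.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1, treating `positive` as the target class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels invented for illustration only (1 = deceptive, 0 = truthful).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical "balanced" classifier: one missed fake review, one false alarm.
balanced_pred = [1, 1, 1, 0, 1, 0, 0, 0]

# Hypothetical "high-recall" classifier: catches every fake review but raises two false alarms.
high_recall_pred = [1, 1, 1, 1, 1, 1, 0, 0]

balanced = binary_metrics(y_true, balanced_pred)
high_recall = binary_metrics(y_true, high_recall_pred)

print("balanced:   ", balanced)
print("high recall:", high_recall)
```

In this toy setup both hypothetical classifiers have the same accuracy, yet one trades precision for recall. That is the kind of difference the abstract points to when it contrasts a balanced model with one that favours sensitivity.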
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.