
The Effect of Recursive Feature Elimination with Cross-Validation Method on Classification Performance with Different Sizes of Datasets

Author
Akkaya, Berke
Abstract
The high-dimensionality problem, one of the challenges encountered in classification tasks, arises when a dataset contains too many features. It degrades the performance of classification models and increases training time. Feature selection is one of the methods used to mitigate this problem. It is defined as selecting the subset of features that best represents the original dataset, reducing dimensionality by retaining only the features that are most useful and relevant to the problem at hand. In this study, the performance of various classification algorithms on datasets of different sizes was compared using recursive feature elimination with cross-validation, a feature selection method that iteratively eliminates the least important features and uses cross-validation to find the subset yielding the most accurate result. The study used datasets containing balanced binary classification problems. Accuracy, ROC-AUC score, and fit time served as evaluation metrics, and Logistic Regression, Support Vector Machines, Naive Bayes, k-Nearest Neighbors, Stochastic Gradient Descent, Decision Tree, AdaBoost, Multilayer Perceptron, and XGBoost classifiers were used as classification algorithms. Examining the results obtained after recursive feature elimination with cross-validation, accuracy increased by 5% on average, the ROC-AUC score increased by 5.3% on average, and the fit time decreased by about 5.1 seconds on average. Naive Bayes and Multilayer Perceptron proved the most sensitive to feature selection, as their classification performance improved the most after feature selection.
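The procedure described in the abstract is available in scikit-learn as `RFECV`. The following is a minimal illustrative sketch, not the authors' exact experimental setup: the synthetic dataset, the choice of Logistic Regression as the estimator, and all parameter values are assumptions made for the example.

```python
# Sketch of recursive feature elimination with cross-validation (RFECV).
# Dataset, estimator, and parameters are illustrative assumptions, not the
# study's actual configuration.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Balanced binary classification data, mirroring the study's setting.
X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           n_redundant=10, weights=[0.5, 0.5], random_state=0)

estimator = LogisticRegression(max_iter=1000)

# RFECV drops the least important feature at each step and keeps the
# feature count that maximizes cross-validated accuracy.
selector = RFECV(estimator, step=1, cv=StratifiedKFold(5), scoring="accuracy")
selector.fit(X, y)
print("Selected features:", selector.n_features_)

# Re-evaluate the estimator on the reduced feature set.
accuracy = cross_val_score(estimator, selector.transform(X), y,
                           cv=5, scoring="accuracy").mean()
print(f"CV accuracy on selected features: {accuracy:.3f}")
```

The same `selector` could be refit with any of the other classifiers named in the abstract (e.g. a tree-based estimator exposing `feature_importances_`), which is how the study's per-classifier sensitivity comparison would be carried out.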
URI
http://hdl.handle.net/20.500.12627/169394
https://avesis.istanbul.edu.tr/api/publication/2febfa84-2e47-4e77-a1ee-177d9f9ef7b3/file
Collections
  • Bildiri [1228]

Creative Commons License

İstanbul Üniversitesi Akademik Arşiv Sistemi is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (unless otherwise stated in the relevant content).

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by Atmire NV
 

 

