Show simple item record

dc.contributor.author: Kursun, Olcay
dc.contributor.author: Sakar, C. Okan
dc.date.accessioned: 2021-03-06T08:22:48Z
dc.date.available: 2021-03-06T08:22:48Z
dc.date.issued: 2012
dc.identifier.citation: Sakar C. O., Kursun O., "A method for combining mutual information and canonical correlation analysis: Predictive Mutual Information and its use in feature selection", EXPERT SYSTEMS WITH APPLICATIONS, vol. 39, pp. 3333-3344, 2012
dc.identifier.issn: 0957-4174
dc.identifier.other: vv_1032021
dc.identifier.other: av_e09d932c-cf06-42a3-8550-9130969d8701
dc.identifier.uri: http://hdl.handle.net/20.500.12627/147920
dc.identifier.uri: https://doi.org/10.1016/j.eswa.2011.09.020
dc.description.abstract: Feature selection is a critical step in many artificial intelligence and pattern recognition problems. Shannon's Mutual Information (MI) is a classical and widely used measure of dependence that serves as a good feature selection criterion. However, because it measures dependence on average, under-sampled classes (rare events) can be overlooked by this measure, which can cause critical false negatives (missing a relevant feature that is very predictive of some rare but important classes). Shannon's mutual information requires a well-sampled database, which is not typical of many fields of modern science (such as biomedicine), in which there is a limited number of samples to learn from, or at least not all classes of the target function (such as certain phenotypes in biomedicine) are well-sampled. On the other hand, Kernel Canonical Correlation Analysis (KCCA) is a nonlinear correlation measure used effectively to detect independence, but its use for feature selection or ranking is limited because its formulation is not intended to measure the amount of information (entropy) of the dependence. In this paper, we propose a hybrid measure of relevance, Predictive Mutual Information (PMI), based on MI, which also accounts for the predictability of signals from each other in its calculation, as in KCCA. We show that PMI has better feature detection capability than MI, especially in catching suspicious coincidences that are rare but potentially important not only for experimental studies but also for building computational models. We demonstrate the usefulness of PMI, and its superiority over MI, on both toy and real datasets. (C) 2011 Elsevier Ltd. All rights reserved.
dc.language.iso: eng
dc.subject: Engineering
dc.subject: OPERATIONS RESEARCH AND MANAGEMENT SCIENCE
dc.subject: Economics and Business
dc.subject: Social Sciences (SOC)
dc.subject: Social Sciences and Humanities
dc.subject: Econometrics
dc.subject: Operations Research
dc.subject: Information Systems, Communications and Control Engineering
dc.subject: Signal Processing
dc.subject: Computer Sciences
dc.subject: Algorithms
dc.subject: Engineering and Technology
dc.subject: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
dc.subject: Engineering, Computing and Technology (ENG)
dc.subject: Computer Science
dc.subject: ENGINEERING, ELECTRICAL AND ELECTRONIC
dc.title: A method for combining mutual information and canonical correlation analysis: Predictive Mutual Information and its use in feature selection
dc.type: Article
dc.relation.journal: EXPERT SYSTEMS WITH APPLICATIONS
dc.contributor.department: Bahçeşehir Üniversitesi
dc.identifier.volume: 39
dc.identifier.issue: 3
dc.identifier.startpage: 3333
dc.identifier.endpage: 3344
dc.contributor.firstauthorID: 74450


