Efficient Feature Selection Using Global Redundancy Minimization and Constraint Score
Akansha A. Tandon1, Sujata Tuppad2
1Akansha A. Tandon, Department of Computer Science & Engineering, BAMU Matsyodari Shikshan Sanstha’s College of Engineering and Technology, Jalna, Aurangabad (Maharashtra), India.
2Sujata Tuppad, Assistant Professor, Matsyodari Shikshan Sanstha’s College of Engineering and Technology, Jalna, Aurangabad (Maharashtra), India.
Manuscript received on October 10, 2016. | Revised Manuscript received on October 16, 2016. | Manuscript published on October 31, 2016. | PP: 18-20 | Volume-3 Issue-6, October 2016. | Retrieval Number: F0387103616
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A central problem in machine learning is the identification of a representative set of features from which to construct a classification model for a particular task. This paper addresses the problem of feature selection for machine learning through a correlation-based approach. The central hypothesis is that good feature sets contain features that are highly correlated with the class yet uncorrelated with each other. A feature-evaluation formula, based on ideas from test theory, provides an operational definition of this hypothesis. CFS (Correlation-based Feature Selection) is an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy. Further experiments compared CFS with the wrapper, a well-known feature-selection approach that uses the target learning algorithm to evaluate feature sets. In many cases CFS gave results comparable to the wrapper and, in general, outperformed the wrapper on small datasets. CFS runs much faster than the wrapper, which allows it to scale to larger datasets.
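The sketch below illustrates the CFS idea described in the abstract: a subset's merit grows with the average feature–class correlation and shrinks with the average feature–feature correlation, and a greedy forward search uses this merit to pick features. It is a minimal illustration only; Pearson correlation stands in for the correlation measure used in the paper, and the function names (cfs_merit, forward_selection) are hypothetical.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS-style merit of a feature subset:
    k * avg(feature-class corr) / sqrt(k + k*(k-1) * avg(feature-feature corr)).
    Pearson correlation is used here as a stand-in for the paper's measure."""
    k = len(subset)
    if k == 0:
        return 0.0
    # average absolute feature-class correlation
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    # average absolute pairwise feature-feature correlation (redundancy term)
    if k > 1:
        pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1]) for a, b in pairs])
    else:
        r_ff = 0.0
    return (k * r_cf) / np.sqrt(k + k * (k - 1) * r_ff)

def forward_selection(X, y, max_features=10):
    """Greedy forward search: repeatedly add the feature that most improves the merit."""
    remaining = list(range(X.shape[1]))
    selected, best_merit = [], 0.0
    while remaining and len(selected) < max_features:
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best_merit:
            break  # no remaining candidate improves the merit; stop searching
        selected.append(j)
        remaining.remove(j)
        best_merit = merit
    return selected
```

For example, forward_selection(X, y) on a small numeric dataset returns the indices of features that correlate strongly with the class label while remaining weakly correlated with one another, which is the behavior the abstract attributes to CFS.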
Keywords: Feature selection, feature ranking, redundancy minimization, Radial Basis Function, Kernel.