Wrapper-Filter Feature Selection Algorithm Using a Memetic Framework

2024-05-09 00:58


#Citation

##BibTeX

@ARTICLE{4067093,
author={Z. Zhu and Y. S. Ong and M. Dash},
journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
title={Wrapper--Filter Feature Selection Algorithm Using a Memetic Framework},
year={2007},
volume={37},
number={1},
pages={70-76},
keywords={biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
doi={10.1109/TSMCB.2006.883267},
ISSN={1083-4419},
month={Feb},}

##Normal

Z. Zhu, Y. S. Ong and M. Dash, “Wrapper–Filter Feature Selection Algorithm Using a Memetic Framework,” in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 37, no. 1, pp. 70-76, Feb. 2007.
doi: 10.1109/TSMCB.2006.883267
keywords: {biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4067093&isnumber=4067063


#Abstract

Proposes a novel hybrid wrapper and filter feature selection algorithm for classification problems using a memetic framework.

Incorporates a filter ranking method into the traditional genetic algorithm: features are added to or deleted from a candidate feature subset based on univariate feature ranking information.

Evaluated on data sets from the University of California, Irvine (UCI) repository and on microarray data sets.

Compared in terms of classification accuracy, number of selected features, and computational efficiency.

Also studies memetic algorithm (MA) design issues: finding a balance between local search and genetic search that maximizes search quality and efficiency.


#Main Content

  1. filter methods
  2. wrapper methods

##wrapper–filter feature selection algorithm (WFFSA) using a memetic framework


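WFFSA combines a genetic algorithm with local search (LS) in a memetic framework.

Lamarckian learning: the chromosome produced by the local improvement step replaces the original in the population.

Genetic operators are then applied to the population.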
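The interplay of local improvement, Lamarckian replacement, and genetic operators can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `memetic_search`, the one-bit-flip local search, the truncation selection, the uniform crossover, the mutation rate, and the toy fitness function.

```python
import random

def toy_fitness(chrom):
    # Hypothetical criterion: the first three features are useful,
    # every other selected feature costs a small penalty.
    return sum(chrom[:3]) - 0.1 * sum(chrom[3:])

def memetic_search(fitness, n_features, pop_size=10, generations=20,
                   w=2, p_mut=0.05, seed=0):
    """Toy memetic loop: GA generations with Lamarckian LS on the w elites."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]

    def local_improve(chrom):
        # Stand-in LS: flip one random bit, keep the change only if it helps.
        trial = chrom[:]
        i = rng.randrange(n_features)
        trial[i] = 1 - trial[i]
        return trial if fitness(trial) > fitness(chrom) else chrom

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        # Lamarckian learning: improved elites overwrite the originals.
        for i in range(w):
            pop[i] = local_improve(pop[i])
        pop.sort(key=fitness, reverse=True)
        # Genetic operators: keep the best, fill the rest with offspring.
        children = [pop[0][:]]
        while len(children) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = memetic_search(toy_fitness, n_features=8, seed=1)
print(best, toy_fitness(best))
```

Because the best chromosome is always copied into the next generation, the best fitness in this sketch never decreases from one generation to the next.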


###A Encoding and Initialization


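A chromosome is a binary string whose length equals the total number of features; a bit value of 1 means the corresponding feature is selected and 0 means it is excluded.

The population is randomly initialized.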
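A small sketch of this encoding, assuming plain Python lists as chromosomes. The helper names `init_population` and `decode` are hypothetical and chosen only for this example.

```python
import random

def init_population(pop_size, n_features, seed=None):
    """Create pop_size random binary chromosomes of length n_features."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(pop_size)]

def decode(chromosome):
    """Return the indices of the selected features (bits equal to 1)."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]

population = init_population(pop_size=30, n_features=12, seed=42)
print(decode(population[0]))
```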


###B Objective Function

The objective function is the classification accuracy:

$$\mathrm{Fitness}(c) = J(S_c)$$

$S_c$: the selected feature subset encoded in chromosome $c$
$J(S_c)$: the criterion function, here the classification accuracy obtained with $S_c$


###C LS Improvement Procedure

LS makes use of domain knowledge and heuristics.

Filter ranking methods serve as the memes (LS heuristics).

three different filter ranking methods, namely:

  1. ReliefF;
  2. gain ratio;
  3. chi-square.

based on different criteria:

  1. Euclidean distance,
  2. information entropy,
  3. chi-square statistics
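As an illustration of a univariate filter ranking, the sketch below scores each feature with the chi-square statistic of its contingency table against the class labels and sorts the features by that score. It assumes discrete feature values (continuous features would need to be discretized first), and the names `chi_square_score` and `rank_features` are hypothetical. ReliefF and gain ratio would plug into the same ranking step with a different scoring function.

```python
from collections import Counter

def chi_square_score(feature_values, labels):
    """Chi-square statistic between one discrete feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    value_totals = Counter(feature_values)
    class_totals = Counter(labels)
    score = 0.0
    for value, value_count in value_totals.items():
        for cls, class_count in class_totals.items():
            expected = value_count * class_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score

def rank_features(columns, labels):
    """Return feature indices ordered from most to least relevant."""
    scores = [chi_square_score(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)

labels = [0, 0, 1, 1]
noise = [0, 1, 0, 1]        # independent of the class
informative = [0, 0, 1, 1]  # identical to the class
print(rank_features([noise, informative], labels))
```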

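basic LS operators ($X$ is the set of selected features and $Y$ the set of excluded features, both ranked by the filter method):

  1. “Add”: select a feature from $Y$ using linear ranking selection and move it to $X$.
  2. “Del”: select a feature from $X$ using linear ranking selection and move it to $Y$.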
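The two operators can be sketched as follows, taking X as the selected features and Y as the excluded ones. Several details are assumptions rather than the paper's exact definition: linear ranking selection is implemented with weights m, m-1, ..., 1 over the ordered candidates, Add orders Y from most to least relevant, Del orders X from least to most relevant, and `scores` is a hypothetical list of filter scores (higher means more relevant).

```python
import random

def linear_rank_pick(ordered, rng):
    """Pick one item; position 0 has the largest weight, decreasing linearly."""
    m = len(ordered)
    weights = [m - r for r in range(m)]
    return rng.choices(ordered, weights=weights, k=1)[0]

def add_op(chrom, scores, rng):
    """Move one excluded feature (from Y) into the selected set X."""
    y = [i for i, bit in enumerate(chrom) if bit == 0]
    if not y:
        return chrom[:]
    y.sort(key=lambda i: scores[i], reverse=True)  # most relevant first
    new = chrom[:]
    new[linear_rank_pick(y, rng)] = 1
    return new

def del_op(chrom, scores, rng):
    """Move one selected feature (from X) into the excluded set Y."""
    x = [i for i, bit in enumerate(chrom) if bit == 1]
    if not x:
        return chrom[:]
    x.sort(key=lambda i: scores[i])  # least relevant first
    new = chrom[:]
    new[linear_rank_pick(x, rng)] = 0
    return new

rng = random.Random(0)
chrom = [1, 0, 0, 1]
scores = [0.9, 0.1, 0.8, 0.2]
print(add_op(chrom, scores, rng), del_op(chrom, scores, rng))
```

Both functions return a new chromosome and leave the input untouched, which makes it easy to discard a move that does not improve fitness.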


The intensity of LS is controlled by the LS length $l$ and the interval $w$.
LS length $l$: the maximum number of Del and Add operations in each LS, which gives $l^2$ possible combinations of Add and Del operations
interval $w$: LS is applied to the $w$ elite chromosomes in the population

LS runs until a local optimum or an improvement is reached

  1. Improvement First Strategy: makes a random choice from the $l^2$ combinations and stops once an improvement is obtained, either in classification accuracy or as a reduction in the number of selected features without a deterioration in accuracy greater than $\varepsilon$.
  2. Greedy Strategy: carries out all $l^2$ possible combinations and keeps the best improved solution.
  3. Sequential Strategy: the Add operation searches for the most significant feature $y$ in $Y$ in a sequential manner; the Del operation searches for the least significant feature $x$ in $X$ in a sequential manner.
  4. Evolutionary Operators: [figures omitted]
  5. Computational Complexity:
    The ranking of features by the filter methods has linear time complexity and is a one-time offline cost, so it is negligible.
    The computational cost of a single fitness evaluation is taken as the basic unit of computational cost.
    GA: $O(pg)$, where $p$ is the population size and $g$ is the number of search generations
    + improvement first strategy: $O(pg + l^2wg/2)$
    + greedy strategy ($l^2w/p$): $O(pg + l^2wg)$
    + sequential strategy ($(2|Y| - l)l/2$ and $(2|X| + l)l/2$ for the Add and Del operations, $nlw$ in total per generation, with $n = |X| + |Y|$): $O(pg + nlwg)$
    Since $n \gg l$, $nlw \gg l^2w > l^2w/2$: the sequential LS strategy requires significantly more computations.
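The improvement first strategy from item 1, together with the fitness-evaluation count used as the cost unit in item 5, can be sketched as below. This is an illustrative reading, not the paper's pseudocode: the ranges of the Add and Del counts (each taken from 0 to l-1 here so that there are l² pairs), the linear ranking weights, and the names `improvement_first_ls`, `scores`, and `eps` are all assumptions.

```python
import random

def improvement_first_ls(chrom, fitness, scores, l, eps=0.0, seed=0):
    """Try random (n_add, n_del) pairs until one improves the chromosome.

    Returns (chromosome, fitness value, number of fitness evaluations used).
    """
    rng = random.Random(seed)

    def pick(ordered):
        # Linear ranking: position 0 is the most likely choice.
        m = len(ordered)
        return rng.choices(ordered, weights=list(range(m, 0, -1)), k=1)[0]

    def apply_ops(base, n_add, n_del):
        new = base[:]
        for _ in range(n_add):
            y = sorted((i for i, b in enumerate(new) if b == 0),
                       key=lambda i: scores[i], reverse=True)
            if y:
                new[pick(y)] = 1
        for _ in range(n_del):
            x = sorted((i for i, b in enumerate(new) if b == 1),
                       key=lambda i: scores[i])
            if x:
                new[pick(x)] = 0
        return new

    base_fit, evals = fitness(chrom), 1
    combos = [(a, d) for a in range(l) for d in range(l)]  # l * l pairs
    rng.shuffle(combos)
    for n_add, n_del in combos:
        trial = apply_ops(chrom, n_add, n_del)
        if trial == chrom:
            continue  # nothing changed, so no evaluation is spent
        trial_fit = fitness(trial)
        evals += 1
        better = trial_fit > base_fit
        smaller = sum(trial) < sum(chrom) and trial_fit >= base_fit - eps
        if better or smaller:
            return trial, trial_fit, evals
    return chrom, base_fit, evals

def toy_fitness(c):
    # Hypothetical criterion standing in for classification accuracy.
    return sum(c[:3]) - 0.1 * sum(c[3:])

start = [0, 0, 0, 1, 1, 1]
scores = [0.9, 0.8, 0.7, 0.1, 0.1, 0.1]
print(improvement_first_ls(start, toy_fitness, scores, l=3))
```

One LS run in this sketch spends at most l² evaluations beyond the baseline evaluation, and usually fewer because it stops at the first improvement. A greedy variant would evaluate every pair and keep the best one.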

##Experiments

UCI data sets
microarray data sets: ALL/AML, Colon, NCI60, and SRBCT

population size: 30 (UCI data sets) and 50 or 100 (microarray data sets)
fitness function calls: 6000 (UCI data sets) and 10000 or 20000 (microarray data sets)


the one nearest neighbor (1NN) classifier
the leave-one-out cross validation (LOOCV)
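A minimal sketch of how a feature subset could be scored with 1NN under LOOCV, written in plain Python. The function name `loocv_1nn_accuracy`, the use of squared Euclidean distance, and the tiny data set are assumptions for illustration only.

```python
def loocv_1nn_accuracy(X, y, selected):
    """Leave-one-out accuracy of a 1NN classifier using only `selected` features."""
    if not selected:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue  # the held-out sample cannot be its own neighbor
            d = sum((xi[k] - xj[k]) ** 2 for k in selected)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

# Hypothetical data: feature 0 separates the classes, feature 1 misleads.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9]]
y = [0, 0, 1, 1]
print(loocv_1nn_accuracy(X, y, [0]), loocv_1nn_accuracy(X, y, [1]))
```

Each call costs on the order of N² distance computations for N samples, which is one reason the number of fitness function calls is the natural budget for the search.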

