Wrapper-Filter Feature Selection Algorithm Using a Memetic Framework

2024-05-09 00:58

This post introduces a wrapper-filter feature selection algorithm based on a memetic framework.

#Citation

##BibTeX

@ARTICLE{4067093,
author={Z. Zhu and Y. S. Ong and M. Dash},
journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
title={Wrapper--Filter Feature Selection Algorithm Using a Memetic Framework},
year={2007},
volume={37},
number={1},
pages={70-76},
keywords={biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
doi={10.1109/TSMCB.2006.883267},
ISSN={1083-4419},
month={Feb},}

##Normal

Z. Zhu, Y. S. Ong and M. Dash, “Wrapper–Filter Feature Selection Algorithm Using a Memetic Framework,” in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 37, no. 1, pp. 70-76, Feb. 2007.
doi: 10.1109/TSMCB.2006.883267
keywords: {biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4067093&isnumber=4067063


#Abstract

a novel hybrid wrapper and filter feature selection algorithm for a classification problem using a memetic framework

a filter ranking method
genetic algorithm
univariate feature ranking information

the University of California, Irvine repository and microarray data sets

classification accuracy, number of selected features, and computational efficiency.

memetic algorithm (MA) — balance between local search and genetic search
to maximize search quality and efficiency


#Main Content

  1. filter methods
  2. wrapper methods

##wrapper–filter feature selection algorithm (WFFSA) using a memetic framework


WFFSA

Lamarckian learning

local improvement
Genetic operators


###A Encoding and Initialization


a chromosome is a binary string of length equal to the total number of features

randomly initialized
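
The encoding described above can be sketched in a few lines of Python. This is only an illustration: the function name `init_population`, the uniform 0/1 draw per bit, and the sizes used in the example are assumptions, not details taken from the paper.

```python
import random

def init_population(pop_size, n_features, rng=None):
    """Return pop_size random binary chromosomes, one bit per feature.

    Assumption: each bit is drawn uniformly from {0, 1}.
    """
    rng = rng or random.Random()
    return [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(pop_size)]

# Hypothetical sizes, chosen only for the example.
population = init_population(pop_size=30, n_features=8, rng=random.Random(0))
print(len(population), len(population[0]))
```

A bit value of 1 at position i is read as "feature i is selected"; that reading is the usual convention for this encoding and is assumed here.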


###B Objective Function

the classification accuracy


$S_c$: the corresponding selected feature subset encoded in chromosome $c$
$J(S_c)$: criterion function
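
A minimal sketch of how a chromosome could be decoded into $S_c$ and scored, assuming the fitness of $c$ is simply the criterion $J$ evaluated on $S_c$. The helper names `decode`, `fitness` and `toy_criterion` are hypothetical, and the toy criterion stands in for a real classification accuracy.

```python
def decode(chromosome):
    """Return S_c: the indices of the features whose bit is 1."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]

def fitness(chromosome, criterion):
    """Score a chromosome by applying the criterion J to its subset S_c."""
    return criterion(decode(chromosome))

def toy_criterion(subset, useful=frozenset({0, 2})):
    """Hypothetical stand-in for J: share of selected features that are useful."""
    if not subset:
        return 0.0
    return len(useful.intersection(subset)) / len(subset)

print(fitness([1, 0, 1, 0], toy_criterion))
```

Passing the criterion in as a function keeps the GA code independent of the classifier used to measure accuracy.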


###C Local Search (LS) Improvement Procedure

domain knowledge and heuristics

filter ranking methods as memes or LS heuristics

three different filter ranking methods, namely:

  1. ReliefF;
  2. gain ratio;
  3. chi-square.

based on different criteria:

  1. Euclidean distance,
  2. information entropy,
  3. chi-square statistics
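
As an illustration of a univariate filter ranking, the sketch below scores each feature with the standard chi-square statistic of independence between feature value and class label, then sorts features by score. It assumes discrete-valued features (continuous features would need discretization first); the function names are hypothetical and this is not the paper's implementation.

```python
from collections import Counter

def chi_square_score(feature_values, labels):
    """Chi-square statistic between one discrete feature and the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    value_totals = Counter(feature_values)
    class_totals = Counter(labels)
    score = 0.0
    for value, value_count in value_totals.items():
        for cls, class_count in class_totals.items():
            expected = value_count * class_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score

def rank_features(columns, labels):
    """Return feature indices sorted from highest to lowest chi-square score."""
    scores = [chi_square_score(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)

labels = [0, 0, 1, 1]
columns = [[0, 1, 0, 1],   # unrelated to the label
           [0, 0, 1, 1]]   # perfectly aligned with the label
print(rank_features(columns, labels))
```

The ranking is computed once from the training data and can then be reused by every local search step.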

basic LS operators:

  1. “Add”: select a feature from Y using the linear ranking selection and move it to X.
  2. “Del”: select a feature from X using the linear ranking selection and move it to Y.
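
The two operators can be sketched as follows. Several things here are assumptions rather than details from the paper: X is taken to be the set of selected features and Y the set of excluded ones, `score` is a precomputed filter score per feature, Add favours high-scoring features while Del favours low-scoring ones, and the linear ranking weights are simply n, n-1, ..., 1.

```python
import random

def linear_ranking_pick(ordered, rng):
    """Pick one item; the item at position i (0-based) gets weight n - i."""
    n = len(ordered)
    weights = [n - i for i in range(n)]
    return rng.choices(ordered, weights=weights, k=1)[0]

def add_op(X, Y, score, rng):
    """Move one feature from Y to X, favouring features with a high score."""
    if not Y:
        return
    ordered = sorted(Y, key=lambda f: score[f], reverse=True)
    f = linear_ranking_pick(ordered, rng)
    Y.remove(f)
    X.add(f)

def del_op(X, Y, score, rng):
    """Move one feature from X to Y, favouring features with a low score."""
    if not X:
        return
    ordered = sorted(X, key=lambda f: score[f])
    f = linear_ranking_pick(ordered, rng)
    X.remove(f)
    Y.add(f)

rng = random.Random(0)
score = {0: 0.9, 1: 0.1, 2: 0.8, 3: 0.5, 4: 0.2}  # hypothetical filter scores
X, Y = {0, 1}, {2, 3, 4}
add_op(X, Y, score, rng)
del_op(X, Y, score, rng)
print(sorted(X), sorted(Y))
```

Because selection is probabilistic, a low-ranked feature can still be added occasionally, which keeps the search from collapsing onto the top of the filter ranking.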


The intensity of LS: the LS length $l$ and interval $w$
LS length $l$: the maximum number of Del and Add operations in each LS, giving $l^2$ possible combinations of Add and Del operations
interval $w$: the $w$ elite chromosomes in the population

until a local optimum or an improvement is reached
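
The stopping rule can be illustrated with a rough sketch of an improvement-first local search. The assumptions are considerable and should be read as such: the $l^2$ combinations are taken to be "a Adds followed by d Dels" with 1 ≤ a, d ≤ l; features are moved uniformly at random instead of by linear ranking selection, to keep the example short; and `evaluate` is a hypothetical function returning the accuracy of a feature subset.

```python
import random

def improvement_first_ls(X, Y, evaluate, l, eps, rng):
    """Try (adds, dels) combinations in random order; stop at the first improvement.

    Returns new (X, Y) sets; the inputs are left unchanged.
    """
    base = evaluate(X)
    # Assumption: l^2 combinations = a Adds and d Dels with 1 <= a, d <= l.
    combos = [(a, d) for a in range(1, l + 1) for d in range(1, l + 1)]
    rng.shuffle(combos)
    for adds, dels in combos:
        X2, Y2 = set(X), set(Y)
        for _ in range(adds):
            if Y2:
                f = rng.choice(sorted(Y2))  # simplification: uniform pick
                Y2.remove(f)
                X2.add(f)
        for _ in range(dels):
            if X2:
                f = rng.choice(sorted(X2))  # simplification: uniform pick
                X2.remove(f)
                Y2.add(f)
        acc = evaluate(X2)
        more_accurate = acc > base
        smaller_ok = len(X2) < len(X) and acc >= base - eps
        if more_accurate or smaller_ok:
            return X2, Y2
    return set(X), set(Y)  # no improvement found

useful = {0, 1}  # hypothetical "good" features for the toy accuracy
toy_accuracy = lambda subset: len(useful & subset) / len(useful)
new_X, new_Y = improvement_first_ls({2, 3}, {0, 1}, toy_accuracy,
                                    l=2, eps=0.0, rng=random.Random(0))
print(sorted(new_X), sorted(new_Y))
```

A greedy variant would evaluate every combination and keep the best one instead of returning at the first acceptable result, at the cost of more fitness evaluations.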

  1. Improvement First Strategy: a random choice from the $l^2$ combinations. Stops once an improvement is obtained, either in terms of classification accuracy or as a reduction in the number of selected features without a deterioration in accuracy greater than $\varepsilon$.
  2. Greedy Strategy: carries out all $l^2$ possible combinations and keeps the best improved solution.
  3. Sequential Strategy: the Add operation searches for the most significant feature $y$ in $Y$ in a sequential manner; the Del operation searches for the least significant feature $x$ in $X$ in a sequential manner.
  4. Evolutionary Operators:
    [Figures: evolutionary operators]
  5. Computational Complexity:
    The ranking of features based on the filter methods: linear time complexity, a one-time offline cost, negligible.
    The computational cost of a single fitness evaluation: the basic unit of computational cost.
    GA: $O(pg)$, where $p$ is the size of the population and $g$ is the number of search generations.
    + improvement first strategy: $O(pg + l^2wg/2)$
    + the greedy strategy ($l^2w/p$): $O(pg + l^2wg)$
    + the sequential strategy ($(2|Y| - l)l/2$ and $(2|X| + l)l/2$ for the Add and Del operations, $nlw$): $O(pg + nlwg)$
    Since $n \gg l$, $nlw \gg l^2w > l^2w/2$: the sequential LS strategy requires significantly more computations.
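
To get a feel for these bounds, the expressions inside the $O(\cdot)$ terms can be evaluated directly as rough counts of fitness evaluations. The parameter values below are hypothetical, constants hidden by the $O$ notation are ignored, and $n$ is taken to be the total number of features, which the notes do not define explicitly.

```python
def ga_cost(p, g):
    """Plain GA: p * g fitness evaluations."""
    return p * g

def improvement_first_cost(p, g, l, w):
    """GA plus improvement-first LS: pg + l^2 * w * g / 2."""
    return p * g + l ** 2 * w * g / 2

def greedy_cost(p, g, l, w):
    """GA plus greedy LS: pg + l^2 * w * g."""
    return p * g + l ** 2 * w * g

def sequential_cost(p, g, l, w, n):
    """GA plus sequential LS: pg + n * l * w * g."""
    return p * g + n * l * w * g

# Hypothetical setting, chosen only for illustration.
p, g, l, w, n = 50, 200, 4, 10, 2000
print(ga_cost(p, g))                       # 10000
print(improvement_first_cost(p, g, l, w))  # 26000.0
print(greedy_cost(p, g, l, w))             # 42000
print(sequential_cost(p, g, l, w, n))      # 16010000
```

With thousands of features, as in microarray data, the $nlwg$ term dominates by orders of magnitude, which is the point of the comparison above.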

##Experiments

UCI data sets
ALL/AML, Colon, NCI60, and SRBCT

population size: 30, and 50 or 100 (microarray data sets)
fitness function calls: 6000, and 10000 or 20000 (microarray data sets)


the one nearest neighbor (1NN) classifier
the leave-one-out cross validation (LOOCV)
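
The evaluation protocol above (1NN with LOOCV) can be sketched in plain Python as a criterion for a given feature subset. This is an illustrative sketch, not the authors' code: squared Euclidean distance, first-index tie-breaking, and the tiny data set are all assumptions.

```python
def loocv_1nn_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1NN classifier restricted to `subset`.

    Assumptions: squared Euclidean distance, ties go to the lowest index,
    and there are at least two samples.
    """
    if not subset:
        return 0.0

    def dist2(a, b):
        return sum((a[j] - b[j]) ** 2 for j in subset)

    correct = 0
    for i, row in enumerate(rows):
        # Nearest neighbour among all samples except the held-out one.
        others = (k for k in range(len(rows)) if k != i)
        nearest = min(others, key=lambda k: dist2(row, rows[k]))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

# Hypothetical data: feature 0 separates the classes, feature 1 misleads.
rows = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9]]
labels = [0, 0, 1, 1]
print(loocv_1nn_accuracy(rows, labels, [0]))  # 1.0
print(loocv_1nn_accuracy(rows, labels, [1]))  # 0.0
```

Each call costs on the order of the squared number of samples in distance computations, which is why the number of fitness function calls is the natural budget in the experiments.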
