ぴかりんの頭の中味 (What's Inside Pikarin's Head)

Mostly a record of eating out. Living in Muroran, Hokkaido.

[Paper] Duan, 2005, Multiple SVM-RFE for gene selection~

September 12, 2007, 22:09:48 | Paper notes
Kai-Bo Duan, Jagath C. Rajapakse, Haiying Wang, Francisco Azuaje
Multiple SVM-RFE for gene selection in cancer classification with expression data
IEEE Transactions on NanoBioscience, Volume 4, Issue 3, pages 228-234
[PDF][Web Site]

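・Introduces Multiple SVM-RFE (MSVM-RFE), an improved version of the gene selection method SVM-RFE
・Data
1. Breast cancer [Hedenfalk]
2. Colon Tumor [Alon]
3. ALL-AML Leukemia [Golub]
4. Lung Cancer [Gavin]
・Experiments
1. Comparison and evaluation of SVM-RFE and MSVM-RFE by cross validation
2. Evaluation of the selected genes with the Gene Ontology (GO)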

・Method: "This paper proposes a new feature selection method that uses a backward elimination procedure similar to that implemented in support vector machine recursive feature elimination (SVM-RFE). Unlike the SVM-RFE method, at each step, the proposed approach computes the feature ranking score from a statistical analysis of weight vectors of multiple linear SVMs trained on subsamples of the original training data."
・What SVM-RFE is: "Nested subsets of features are selected in a sequential backward elimination manner, which starts with all the feature variables and removes one feature variable at a time. At each step, the coefficients of the weight vector w of a linear SVM are used to compute the feature ranking score."
・Problem: "Due to computing efficiency reasons, the algorithm can be generalized to remove more than one feature per step [9]. However, the removal of several features at a time may degrade the performance of the feature selection method."
・Key idea: "The bootstrap stabilization idea can be applied to SVM-RFE. However, instead of applying this idea on SVM-RFE as a whole, we may apply it on each step of the recursive procedure of SVM-RFE."
・Characteristics: "The proposed MSVM-RFE is computationally more expensive than SVM-RFE. However, as feature selection is a prestep for building a good classifier, it is worthwhile to go through a computationally more expensive way if a better feature subset can be selected."
・About GO: "It comprises three hierarchies, sometimes referred to as taxonomies or 'aspects,' that respectively hold terms describing the molecular function (MF), biological process (BP), and cellular component (CC). The vocabularies (one for each ontology) and their relationships are represented in the form of directed acyclic graphs (DAGs),"
・Significance of GO: "Thus, relationships between GO-based similarity and gene expression correlation may offer a new approach to assessing the relevance of a set of genes selected."
・Conclusions: "We conclude that: 1) the proposed MSVM-RFE method can select better gene subsets than SVM-RFE and improve the cancer classification accuracy; 2) gene selection also improves the performance of SVMs and is a necessary step for cancer classification with gene expression data; and 3) GO-based similarity values of pairs of genes belonging to subsets selected by MSVM-RFE are significantly low, which may be seen as an indicator of functional diversity (or redundancy reduction)."
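
To make the difference between the two ranking schemes quoted above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. Several things are assumptions on my part: the linear SVM is a small Pegasos-style subgradient solver standing in for a real SVM package, the subsamples are drawn without replacement at 80% of the training set, and the "statistical analysis of weight vectors" is taken to be the mean divided by the standard deviation of the squared, length-normalized weights (the excerpts do not give the formula). All function names and the toy data are hypothetical.

```python
import math
import random
import statistics


def train_linear_svm(X, y, lam=0.1, epochs=50, seed=0):
    """Linear SVM without bias, fitted by Pegasos-style subgradient descent.

    A stand-in for a real SVM solver; labels y must be -1 or +1.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularizer)
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def single_svm_score(X, y):
    """SVM-RFE score: squared weight of one SVM trained on all samples."""
    return [wj * wj for wj in train_linear_svm(X, y)]


def multiple_svm_score(X, y, n_subsamples=10, frac=0.8, seed=0):
    """MSVM-RFE-style score from several SVMs trained on subsamples.

    Assumption: score = mean / standard deviation of the squared,
    length-normalized weights. The notes do not state the formula.
    """
    rng = random.Random(seed)
    k = max(2, int(frac * len(X)))
    squared = [[] for _ in X[0]]
    for s in range(n_subsamples):
        rows = rng.sample(range(len(X)), k)  # subsample without replacement
        w = train_linear_svm([X[i] for i in rows], [y[i] for i in rows], seed=s)
        norm = math.sqrt(sum(wj * wj for wj in w)) or 1.0
        for j, wj in enumerate(w):
            squared[j].append((wj / norm) ** 2)
    return [statistics.mean(v) / (statistics.pstdev(v) + 1e-12) for v in squared]


def rfe_ranking(X, y, score_fn):
    """Backward elimination, one feature per step; returns features best-first."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        sub = [[row[j] for j in remaining] for row in X]
        scores = score_fn(sub, y)
        worst = min(range(len(remaining)), key=scores.__getitem__)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


def make_toy_data(n_per_class=20, seed=1):
    """Hypothetical data: feature 0 carries the class signal, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            signal = 3.0 * label + rng.gauss(0, 0.5)
            X.append([signal, rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(label)
    return X, y


X, y = make_toy_data()
print("SVM-RFE ranking: ", rfe_ranking(X, y, single_svm_score))
print("MSVM-RFE ranking:", rfe_ranking(X, y, multiple_svm_score))
```

The only difference between the two runs is the scoring function passed to `rfe_ranking`, which mirrors the "key idea" excerpt: the resampling is applied inside each elimination step rather than around the whole procedure. It also shows where the extra cost comes from, since every step trains `n_subsamples` SVMs instead of one.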