Jianjun Yu, Jindan Yu, Arpit A Almal, Saravana M Dhanasekaran, Debashis Ghosh, William P Worzel, and Arul M Chinnaiyan
Feature Selection and Molecular Classification of Cancer Using Genetic Programming
Neoplasia. 2007 April; 9(4): 292-303.
・Applies GP (Genetic Programming) as a method for both gene selection and sample classification. Classification results are fed back repeatedly to extract the optimal gene set.
・Data
1. SRBCT data (NB, RMS, BL, EWS) [Khan]
2. Lung adenocarcinoma data (high-risk group and low-risk group) [Beer]
3. Three prostate cancer data sets (benign prostate samples (BENIGN) and PCA) [Lapointe, Dhanasekaran, Yu]
4. Two prostate cancer data sets (PCA and MET) [LaTulippe, Yu]
・Methods compared
1. Compound covariate predictor
2. 3-nearest neighbors
3. Nearest centroid
4. SVMs
5. DLDA
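
For orientation, two of the simpler baselines in the comparison list above (nearest centroid and 3-nearest neighbors) can be sketched in a few lines. This is a minimal illustration on made-up two-gene data, not the setup used in the paper; the function names, the Euclidean distance, and the toy values are all assumptions.

```python
import math
from collections import Counter

def nearest_centroid(train, labels, x):
    """Assign x to the class whose mean expression profile is closest."""
    groups = {}
    for row, lab in zip(train, labels):
        groups.setdefault(lab, []).append(row)
    # Per-class centroid: column-wise mean over that class's samples.
    centroids = {lab: [sum(col) / len(rows) for col in zip(*rows)]
                 for lab, rows in groups.items()}
    return min(centroids, key=lambda lab: math.dist(centroids[lab], x))

def knn(train, labels, x, k=3):
    """Majority vote among the k training samples closest to x."""
    ranked = sorted(zip(train, labels), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): rows are samples, columns are two genes.
train = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3],
         [0.1, 1.1], [0.2, 0.9], [0.0, 1.0]]
labels = ["PCA", "PCA", "PCA", "BENIGN", "BENIGN", "BENIGN"]
sample = [1.1, 0.2]

print(nearest_centroid(train, labels, sample), knn(train, labels, sample))
```

Both baselines use every gene they are given, which is the contrast with GP drawn in the notes below: GP picks a few genes on its own.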
・What GP is: "Genetic programming (GP) is a type of machine learning technique that uses evolutionary algorithm to simulate natural selection as well as population dynamics, hence leading to simple and comprehensible classifiers."
・Problem: "However, the potential of GP in cancer classification has not been fully explored. For example, GP classifiers identified from one data set have not been validated in independent data sets."
・Feature: "Examination of classifier genes have revealed that GP classifiers (Table 4 and 5) are much simpler than predictors reported by other approaches, where more than 10 genes are often required to build an effective predictor. GP, by contrast, can use only 2 to 5 genes to produce effective classifiers and achieve high prediction power."
・Feature: "A major difference between GP and other machine learning techniques is its mathematical connections between genes within a classifier."
・Feature: "An intrinsic advantage of GP is that it automatically selects a small number of feature genes during 'evolution'."
・Feature: "However, GP has added advantages over other algorithms. Its special features include the following: 1) the ability to automatically select a small number of genes as potential discriminative genes, 2) the ability to combine such genes and construct a simple and comprehensible classifier, and 3) the capability to generate multiple candidate classifiers."
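
To make "mathematical connections between genes" concrete, here is a minimal sketch of what a small GP-style classifier could look like: an arithmetic expression over a handful of genes, stored as a tree and thresholded. The gene names (GENE_A and so on), the expression, the threshold, and the class labels are hypothetical and are not rules taken from the paper's Tables 4 and 5.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, sample):
    """Evaluate an expression tree on one sample.

    A tree is a gene name (str), a numeric constant, or a tuple
    (op, left, right) with op being a key of OPS.
    """
    if isinstance(tree, str):
        return sample[tree]
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, sample), evaluate(right, sample))

def genes_used(tree):
    """Collect the genes that actually appear in the rule."""
    if isinstance(tree, str):
        return {tree}
    if isinstance(tree, tuple):
        return genes_used(tree[1]) | genes_used(tree[2])
    return set()

# Hypothetical rule: (GENE_A * GENE_B) - GENE_C > 0.5 means "PCA".
rule = ("-", ("*", "GENE_A", "GENE_B"), "GENE_C")

def classify(sample, tree=rule, threshold=0.5):
    return "PCA" if evaluate(tree, sample) > threshold else "BENIGN"

# GENE_D is measured but never referenced by the rule.
sample = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": 0.4, "GENE_D": 9.9}
print(classify(sample), sorted(genes_used(rule)))
```

The whole classifier is one readable formula over three genes, which is the sense in which such rules are "simple and comprehensible".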
・I do not yet understand the evolutionary algorithm [Figure]. What is novel is that cDNA and Affy (Affymetrix) data are combined in the computation.
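
Since the evolutionary step is the unclear part, a toy version of the loop may help. The sketch below evolves arithmetic expressions over gene indices using selection, crossover, and mutation, with classification accuracy fed back as fitness. It is not the authors' implementation: the function set, the "value > 0" decision rule, the population size, the rates, the depth limit, and the synthetic data are all assumptions chosen for illustration.

```python
import operator
import random

OPS = [operator.add, operator.sub, operator.mul]

def random_tree(n_genes, max_depth, rng):
    """Grow a random expression; a leaf is a gene index (int)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_genes)
    return (rng.choice(OPS),
            random_tree(n_genes, max_depth - 1, rng),
            random_tree(n_genes, max_depth - 1, rng))

def evaluate(tree, x):
    if isinstance(tree, int):
        return x[tree]
    op, left, right = tree
    return op(evaluate(left, x), evaluate(right, x))

def depth(tree):
    return 0 if isinstance(tree, int) else 1 + max(depth(tree[1]), depth(tree[2]))

def fitness(tree, X, y):
    """Accuracy of the rule 'expression value > 0' on the samples."""
    hits = sum((evaluate(tree, x) > 0) == label for x, label in zip(X, y))
    return hits / len(y)

def random_subtree(tree, rng):
    """Walk down random branches and return the subtree reached."""
    while isinstance(tree, tuple) and rng.random() < 0.5:
        tree = rng.choice(tree[1:])
    return tree

def graft(tree, donor, rng):
    """Return a copy of tree with one random position replaced by donor."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return donor
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, graft(left, donor, rng), right)
    return (op, left, graft(right, donor, rng))

def evolve(X, y, pop_size=40, generations=20, max_depth=6, seed=0):
    rng = random.Random(seed)
    n_genes = len(X[0])
    pop = [random_tree(n_genes, 3, rng) for _ in range(pop_size)]
    history = []  # best accuracy at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, X, y), reverse=True)
        history.append(fitness(pop[0], X, y))
        survivors = pop[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = graft(a, random_subtree(b, rng), rng)  # crossover
            if rng.random() < 0.2:  # mutation: splice in a fresh random subtree
                child = graft(child, random_tree(n_genes, 2, rng), rng)
            if depth(child) <= max_depth:  # discard oversized trees
                children.append(child)
        pop = survivors + children
    best = max(pop, key=lambda t: fitness(t, X, y))
    return best, history

# Synthetic data: 60 samples, 5 genes; the class depends on genes 0 and 3 only.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [x[0] - x[3] > 0 for x in X]

best, history = evolve(X, y)
print(history[0], history[-1], fitness(best, X, y))
```

Because the fitter half of each generation is carried over unchanged, the best accuracy never drops from one generation to the next. Genes that do not help simply fail to appear in the surviving expressions, which is how feature selection falls out of the search.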