
changeset 29:5e2e4732b647

.
author bshanks@bshanks.dyndns.org
date Mon Apr 13 03:43:51 2009 -0700 (16 years ago)
parents 01c118d1074b
children 6ec3230fe1dc
files grant.html grant.odt grant.pdf grant.txt
line diff
--- a/grant.html Mon Apr 13 03:31:42 2009 -0700
+++ b/grant.html Mon Apr 13 03:43:51 2009 -0700
@@ -56,7 +56,7 @@
a mapping from instances to labels, and the training data consists of a set of
instances (voxels) for which the labels (subregions) are known.
Each gene expression level is called a feature, and the selection of which
- genes to include is called feature selection. Feature selection is one component
+ genes1 to include is called feature selection. Feature selection is one component
of the task of learning a classifier. Some methods for learning classifiers start
out with a separate feature selection phase, whereas other methods combine
feature selection with other aspects of training.
@@ -66,9 +66,11 @@
case, a dynamic procedure may be used in which features are added and sub-
tracted from the selected set depending on how much they raise the score. Such
procedures are called “stepwise” or “greedy”.
+__________________________
+ 1Strictly speaking, the features are gene expression levels, but we’ll call them genes.
+ 2
+
Although the classifier itself may only look at the gene expression data within
- 2
-
each voxel before classifying that voxel, the learning algorithm which constructs
the classifier may look over the entire dataset. We can categorize score-based
feature selection methods depending on how the score is calculated. Often
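
A minimal sketch of the stepwise ("greedy") score-based feature selection described in the hunk above. The data shapes, the use of scikit-learn, and the choice of cross-validated logistic-regression accuracy as the set score are assumptions made for this illustration, not choices taken from the proposal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def greedy_forward_selection(X, y, max_genes=5):
    """Stepwise ("greedy") selection: repeatedly add the gene that most raises
    the score of the currently selected set.

    X: (n_voxels, n_genes) expression matrix; y: binary subregion labels.
    """
    selected, best_score = [], -np.inf
    candidates = list(range(X.shape[1]))
    while candidates and len(selected) < max_genes:
        # Score each candidate *set* (the selected genes plus one more), not the gene alone.
        scores = [
            (cross_val_score(LogisticRegression(max_iter=1000),
                             X[:, selected + [g]], y, cv=3).mean(), g)
            for g in candidates
        ]
        score, gene = max(scores)
        if score <= best_score:  # stop once no remaining gene raises the score
            break
        best_score = score
        selected.append(gene)
        candidates.remove(gene)
    return selected, best_score
```
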
@@ -81,41 +83,38 @@
Key questions when choosing a learning method are: What are the instances?
What are the features? How are the features chosen? Here are four principles
that outline our answers to these questions.
- Principle 1: Combinatorial gene expression
- Above, we defined an “instance” as the combination of a voxel with the
- “associated gene expression data”. In our case this refers to the expression level
- of genes within the voxel, but should we include the expression levels of all
- genes, or only a few of them?
- It is too much to hope that every anatomical region of interest will be iden-
- tified by a single gene. For example, in the cortex, there are some areas which
- are not clearly delineated by any gene included in the Allen Brain Atlas (ABA)
- dataset. However, at least some of these areas can be delineated by looking
- at combinations of genes (an example of an area for which multiple genes are
- necessary and sufficient is provided in Preliminary Results).
+ Principle 1: Combinatorial gene expression It is too much to hope
+ that every anatomical region of interest will be identified by a single gene. For
+ example, in the cortex, there are some areas which are not clearly delineated
+ by any gene included in the Allen Brain Atlas (ABA) dataset. However, at
+ least some of these areas can be delineated by looking at combinations of genes
+ (an example of an area for which multiple genes are necessary and sufficient
+ is provided in Preliminary Results). Therefore, each instance should contain
+ multiple features (genes).
Principle 2: Only look at combinations of small numbers of genes
- When the classifier classifies a voxel, it is only allowed to look at the expres-
- sion of the genes which have been selected as features. The more data that is
- available to a classifier, the better that it can do. For example, perhaps there
- are weak correlations over many genes that add up to a strong signal. So, why
- not include every gene as a feature? The reason is that we wish to employ
- the classifier in situations in which it is not feasible to gather data about every
- gene. For example, if we want to use the expression of marker genes as a trigger
- for some regionally-targeted intervention, then our intervention must contain a
- molecular mechanism to check the expression level of each marker gene before
- it triggers. It is currently infeasible to design a molecular trigger that checks
- the level of more than a handful of genes. Similarly, if the goal is to develop a
- procedure to do ISH on tissue samples in order to label their anatomy, then it
- is infeasible to label more than a few genes. Therefore, we must select only a
- few genes as features.
+ When the classifier classifies a voxel, it is only allowed to look at the expression of
+ the genes which have been selected as features. The more data that is available
+ to a classifier, the better that it can do. For example, perhaps there are weak
+ correlations over many genes that add up to a strong signal. So, why not include
+ every gene as a feature? The reason is that we wish to employ the classifier in
+ situations in which it is not feasible to gather data about every gene. For
+ example, if we want to use the expression of marker genes as a trigger for some
+ regionally-targeted intervention, then our intervention must contain a molecular
+ mechanism to check the expression level of each marker gene before it triggers.
+ It is currently infeasible to design a molecular trigger that checks the level of
+ more than a handful of genes. Similarly, if the goal is to develop a procedure to
+ do ISH on tissue samples in order to label their anatomy, then it is infeasible
+ to label more than a few genes. Therefore, we must select only a few genes as
+ features.
Principle 3: Use geometry in feature selection
When doing feature selection with score-based methods, the simplest thing
to do would be to score the performance of each voxel by itself and then com-
bine these scores (pointwise scoring). A more powerful approach is to also use
information about the geometric relations between each voxel and its neighbors;
- 3
-
this requires non-pointwise, local scoring methods. See Preliminary Results for
evidence of the complementary nature of pointwise and local scoring methods.
+ 3
+
Principle 4: Work in 2-D whenever possible
There are many anatomical structures which are commonly characterized in
terms of a two-dimensional manifold. When it is known that the structure that
@@ -154,12 +153,12 @@
special type of clustering task because we have an additional constraint on
clusters; voxels grouped together into a cluster must be spatially contiguous.
In Preliminary Results, we show that one can get reasonable results without
- 4
-
enforcing this constraint, however, we plan to compare these results against
other methods which guarantee contiguous clusters.
Perhaps the biggest source of contiguous clustering algorithms is the field
of computer vision, which has produced a variety of image segmentation algo-
+ 4
+
rithms. Image segmentation is the task of partitioning the pixels in a digital
image into clusters, usually contiguous clusters. Aim 2 is similar to an image
segmentation task. There are two main differences; in our task, there are thou-
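
A sketch of how a spatial-contiguity constraint can be built into clustering, in the spirit of the image-segmentation methods mentioned above. The grid shape, the synthetic data, and the choice of connectivity-constrained agglomerative clustering from scikit-learn are assumptions for illustration, not the proposal's method.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.image import grid_to_graph

# Hypothetical stand-in: expression of n_genes genes over a 2-D cortical grid.
rng = np.random.default_rng(0)
n_rows, n_cols, n_genes = 40, 60, 20
expr = rng.random((n_rows, n_cols, n_genes))

# Restricting merges to grid neighbors guarantees spatially contiguous clusters.
connectivity = grid_to_graph(n_rows, n_cols)
X = expr.reshape(n_rows * n_cols, n_genes)
labels = AgglomerativeClustering(
    n_clusters=8, connectivity=connectivity, linkage="ward"
).fit_predict(X)
segmentation = labels.reshape(n_rows, n_cols)  # cluster id per pixel
```
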
Image segmentation is the task of partitioning the pixels in a digital 1.105 image into clusters, usually contiguous clusters. Aim 2 is similar to an image 1.106 segmentation task. There are two main differences; in our task, there are thou- 1.107 @@ -200,17 +199,17 @@ 1.108 Clustering genes rather than voxels 1.109 Although the ultimate goal is to cluster the instances (voxels or pixels), one 1.110 strategy to achieve this goal is to first cluster the features (genes). There are 1.111 - 5 1.112 - 1.113 two ways that clusters of genes could be used. 1.114 Gene clusters could be used as part of dimensionality reduction: rather than 1.115 have one feature for each gene, we could have one reduced feature for each gene 1.116 cluster. 1.117 + 5 1.118 + 1.119 Gene clusters could also be used to directly yield a clustering on instances. 1.120 This is because many genes have an expression pattern which seems to pick 1.121 out a single, spatially continguous subregion. Therefore, it seems likely that an 1.122 anatomically interesting subregion will have multiple genes which each individ- 1.123 - ually pick it out1. This suggests the following procedure: cluster together genes 1.124 + ually pick it out2. This suggests the following procedure: cluster together genes 1.125 which pick out similar subregions, and then to use the more popular common 1.126 subregions as the final clusters. In the Preliminary Data we show that a num- 1.127 ber of anatomically recognized cortical regions, as well as some “superregions” 1.128 @@ -240,17 +239,17 @@ 1.129 Significance 1.130 The method developed in aim (1) will be applied to each cortical area to find 1.131 a set of marker genes such that the combinatorial expression pattern of those 1.132 + genes uniquely picks out the target area. Finding marker genes will be useful 1.133 + for drug discovery as well as for experimentation because marker genes can be 1.134 + used to design interventions which selectively target individual cortical areas. 1.135 __________________________ 1.136 - 1This would seem to contradict our finding in aim 1 that some cortical areas are combina- 1.137 + 2This would seem to contradict our finding in aim 1 that some cortical areas are combina- 1.138 torially coded by multiple genes. However, it is possible that the currently accepted cortical 1.139 maps divide the cortex into subregions which are unnatural from the point of view of gene 1.140 expression; perhaps there is some other way to map the cortex for which each subregion can 1.141 be identified by single genes. 1.142 6 1.143 1.144 - genes uniquely picks out the target area. Finding marker genes will be useful 1.145 - for drug discovery as well as for experimentation because marker genes can be 1.146 - used to design interventions which selectively target individual cortical areas. 1.147 The application of the marker gene finding algorithm to the cortex will 1.148 also support the development of new neuroanatomical methods. In addition to 1.149 finding markers for each individual cortical areas, we will find a small panel 1.150 @@ -292,18 +291,18 @@ 1.151 anatomy through computational methods. 1.152 [?] describes an analysis of the anatomy of the hippocampus using the ABA 1.153 dataset. In addition to manual analysis, two clustering methods were employed, 1.154 - 7 1.155 - 1.156 a modified Non-negative Matrix Factorization (NNMF), and a hierarchial bifur- 1.157 cation clustering scheme based on correlation as the similarity score. 
@@ -320,14 +319,14 @@
delineate some cortical areas
Here we give an example of a cortical area which is not marked by any
single gene, but which can be identified combinatorially. According to logistic
- regression, gene wwc13 is the best-fit single gene for predicting whether or not a
+ regression, gene wwc14 is the best-fit single gene for predicting whether or not a
pixel on the cortical surface belongs to the motor area (area MO). The upper-left
picture in Figure shows wwc1’s spatial expression pattern over the cortex. The
lower-right boundary of MO is represented reasonably well by this gene, however
the gene overshoots the upper-left boundary. This flattened 2-D representation
does not show it, but the area corresponding to the overshoot is the medial
surface of the cortex. MO is only found on the lateral surface (todo).
- Gene mtif24 is shown in the upper-right of Fig. . Mtif2 captures MO’s
+ Gene mtif25 is shown in the upper-right of Fig. . Mtif2 captures MO’s
upper-left boundary, but not its lower-right boundary. Mtif2 does not express
very much on the medial surface. By adding together the values at each pixel
in these two figures, we get the lower-left of Figure . This combination captures
@@ -339,17 +338,17 @@
information
To show that local geometry can provide useful information that cannot be
detected via pointwise analyses, consider Fig. . The top row of Fig. displays the
- 3 genes which most match area AUD, according to a pointwise method5. The
+ 3 genes which most match area AUD, according to a pointwise method6. The
bottom row displays the 3 genes which most match AUD according to a method
- which considers local geometry6. The pointwise method in the top row identifies
+ which considers local geometry7. The pointwise method in the top row identifies
__________________________
- 3“WW, C2 and coiled-coil domain containing 1”; EntrezGene ID 211652
- 4“mitochondrial translational initiation factor 2”; EntrezGene ID 76784
- 5For each gene, a logistic regression was fit in which the response variable was whether or not a
+ 4“WW, C2 and coiled-coil domain containing 1”; EntrezGene ID 211652
+ 5“mitochondrial translational initiation factor 2”; EntrezGene ID 76784
+ 6For each gene, a logistic regression was fit in which the response variable was whether or not a
surface pixel was within area AUD, and the predictor variable was the value of the expression
of the gene underneath that pixel. The resulting scores were used to rank the genes in terms
of how well they predict area AUD.
- 6For each gene, the gradient similarity (see section ??) between (a) a map of the expression
+ 7For each gene, the gradient similarity (see section ??) between (a) a map of the expression
of that gene on the cortical surface and (b) the shape of area AUD was calculated, and this
was used to rank the genes.
9
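
The two gene-ranking scores described in the footnotes above can be sketched as follows. The pointwise score follows the logistic-regression description in the footnote; the "gradient similarity" used for the local score is defined in a section not reproduced here, so the mean cosine-of-gradients form below is only an assumed stand-in. Array names and shapes are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pointwise_score(expr_map, region_mask):
    """Pointwise score: per-gene logistic regression predicting area membership
    of each surface pixel from that pixel's expression value alone."""
    x = expr_map.reshape(-1, 1).astype(float)
    y = region_mask.ravel().astype(int)
    return LogisticRegression(max_iter=1000).fit(x, y).score(x, y)

def local_score(expr_map, region_mask):
    """Local score: mean cosine similarity between the spatial gradients of the
    expression map and of the area mask (assumed stand-in for gradient similarity)."""
    gx1, gy1 = np.gradient(expr_map.astype(float))
    gx2, gy2 = np.gradient(region_mask.astype(float))
    dot = gx1 * gx2 + gy1 * gy2
    norms = np.hypot(gx1, gy1) * np.hypot(gx2, gy2) + 1e-12
    return float((dot / norms).mean())

# Ranking genes by either score, e.g.:
# ranked = sorted(range(len(maps)), key=lambda g: local_score(maps[g], aud_mask), reverse=True)
```
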
@@ -389,11 +388,11 @@
SVM on all genes at once
In order to see how well one can do when looking at all genes at once, we
ran a support vector machine to classify cortical surface pixels based on their
- gene expression profiles. We achieved classification accuracy of about 81%7.
+ gene expression profiles. We achieved classification accuracy of about 81%8.
As noted above, however, a classifier that looks at all the genes at once isn’t
practically useful.
____________
- 75-fold cross-validation.
+ 85-fold cross-validation.
11

The requirement to find combinations of only a small number of genes limits
@@ -489,7 +488,7 @@
app2 has examples of genetic targeting to specific anatomical regions
—
note:
- do we need to cite: no known markers? impressive results?
+ do we need to cite: no known markers, impressive results?
14
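
A sketch of the kind of all-genes SVM run reported above (roughly 81% under 5-fold cross-validation). Only the classifier family and the 5-fold cross-validation come from the text; the synthetic data, kernel, and regularization settings below are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in: expression profiles of cortical surface pixels and their area labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 300))          # (n_pixels, n_genes)
y = rng.integers(0, 10, size=1000)   # cortical-area label per pixel

clf = SVC(kernel="rbf", C=1.0)
accuracy = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
print(accuracy.mean())
```
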
Binary file grant.odt has changed
Binary file grant.pdf has changed
--- a/grant.txt Mon Apr 13 03:31:42 2009 -0700
+++ b/grant.txt Mon Apr 13 03:43:51 2009 -0700
@@ -31,7 +31,7 @@

In the machine learning literature, this sort of procedure may be thought of as a __supervised learning task__, defined as a task in which the goal is to learn a mapping from instances to labels, and the training data consists of a set of instances (voxels) for which the labels (subregions) are known.

-Each gene expression level is called a __feature__, and the selection of which genes to include is called __feature selection__. Feature selection is one component of the task of learning a classifier. Some methods for learning classifiers start out with a separate feature selection phase, whereas other methods combine feature selection with other aspects of training.
+Each gene expression level is called a __feature__, and the selection of which genes\footnote{Strictly speaking, the features are gene expression levels, but we'll call them genes.} to include is called __feature selection__. Feature selection is one component of the task of learning a classifier. Some methods for learning classifiers start out with a separate feature selection phase, whereas other methods combine feature selection with other aspects of training.

One class of feature selection methods assigns some sort of score to each candidate gene. The top-ranked genes are then chosen. Some scoring measures can assign a score to a set of selected genes, not just to a single gene; in this case, a dynamic procedure may be used in which features are added and subtracted from the selected set depending on how much they raise the score. Such procedures are called "stepwise" or "greedy".

@@ -41,14 +41,10 @@


\vspace{0.3cm}**Principle 1: Combinatorial gene expression**
-
-Above, we defined an "instance" as the combination of a voxel with the "associated gene expression data". In our case this refers to the expression level of genes within the voxel, but should we include the expression levels of all genes, or only a few of them?
-
-It is too much to hope that every anatomical region of interest will be identified by a single gene. For example, in the cortex, there are some areas which are not clearly delineated by any gene included in the Allen Brain Atlas (ABA) dataset. However, at least some of these areas can be delineated by looking at combinations of genes (an example of an area for which multiple genes are necessary and sufficient is provided in Preliminary Results).
+It is too much to hope that every anatomical region of interest will be identified by a single gene. For example, in the cortex, there are some areas which are not clearly delineated by any gene included in the Allen Brain Atlas (ABA) dataset. However, at least some of these areas can be delineated by looking at combinations of genes (an example of an area for which multiple genes are necessary and sufficient is provided in Preliminary Results). Therefore, each instance should contain multiple features (genes).


\vspace{0.3cm}**Principle 2: Only look at combinations of small numbers of genes**
-
When the classifier classifies a voxel, it is only allowed to look at the expression of the genes which have been selected as features. The more data that is available to a classifier, the better that it can do. For example, perhaps there are weak correlations over many genes that add up to a strong signal. So, why not include every gene as a feature? The reason is that we wish to employ the classifier in situations in which it is not feasible to gather data about every gene. For example, if we want to use the expression of marker genes as a trigger for some regionally-targeted intervention, then our intervention must contain a molecular mechanism to check the expression level of each marker gene before it triggers. It is currently infeasible to design a molecular trigger that checks the level of more than a handful of genes. Similarly, if the goal is to develop a procedure to do ISH on tissue samples in order to label their anatomy, then it is infeasible to label more than a few genes. Therefore, we must select only a few genes as features.

@@ -317,4 +313,4 @@

note:

-do we need to cite: no known markers? impressive results?
+do we need to cite: no known markers, impressive results?