cg

changeset 34:c435e5da5211
author: bshanks@bshanks.dyndns.org
date: Mon Apr 13 19:38:30 2009 -0700
parents: 6d023f15572e
children: 99e5d268bab0
files: grant.doc grant.html grant.odt grant.pdf grant.txt
--- a/grant.html	Mon Apr 13 14:53:12 2009 -0700
+++ b/grant.html	Mon Apr 13 19:38:30 2009 -0700
@@ -195,8 +195,8 @@
-usefulness of such research. We have run NNMF on the cortical dataset and while the results are promising (see Preliminary
-Data), we think that it will be possible to find a better method3 (we also think that more automation of the parts that this
+usefulness of such research. We have run NNMF on the cortical dataset3 and while the results are promising (see Preliminary
+Data), we think that it will be possible to find a better method (we also think that more automation of the parts that this
@@ -206,19 +206,33 @@
-At first glance AGEA seems similar to this proposal, but in fact it is different.
-Gene Finder is different from our Aim 1 in at least four ways. First, although the user chooses a seed voxel, Gene Finder,
-not the user, chooses the cluster for which genes will be found, and in our experience it never chooses cortical areas, instead
-preferring cortical layers. Therefore, Gene Finder cannot be used to find marker genes for cortical areas. Second, Gene Finder
-finds only single genes, whereas we will also look for combinations of genes. Third, gene finder can only use overexpression
-as a marker, whereas we will also look for underexpression. Fourth, Gene Finder uses a simple pointwise metric (&#8220;expression
-energy ratio&#8221;, which captures overexpression), whereas we will also use geometric metrics such as gradient similarity.
-The hierarchial clustering is different from our Aim 2 in at least two ways. todo
-_________________________________________
+Gene Finder is different from our Aim 1 in at least four ways.  First, although the user chooses a seed voxel, Gene
+Finder, not the user, chooses the cluster for which genes will be found, and in our experience it never chooses cortical areas,
+instead preferring cortical layers4.  Therefore, Gene Finder cannot be used to find marker genes for cortical areas.  Second,
+Gene Finder finds only single genes, whereas we will also look for combinations of genes5.  Third, gene finder can only use
+overexpression as a marker, whereas in the Preliminary Data we show that underexpression can also be used. Fourth, Gene
+Finder uses a simple pointwise score6, whereas we will also use geometric metrics such as gradient similarity.
+The hierarchial clustering is different from our Aim 2 in at least three ways.  First, the clustering finds clusters cor-
+responding to layers, but no clusters corresponding to areas7  8  Our Aim 2 will not be accomplished until a clustering is
+produced which yields areas.  Second, AGEA uses perhaps the simplest possible similarity score (correlation), and does no
+dimensionality reduction before calculating similarity. While it is possible that a more complex system will not do any better
+than this, we believe further exploration of alternative methods of scoring and dimensionality reduction is warranted. Third,
+AGEA did not look at clusters of genes; in Preliminary Data we have shown that clusters of genes may identify intersting
+spatial subregions such as cortical areas.
+_______
+    4Because of the way in which Gene Finder chooses a cluster, layers will always be preferred to areas if pairwise correlations between the gene
+expression of voxels in different areas but the same layer are stronger than pairwise correlatios between the gene expression of voxels in different
+layers but the same area. This appears to be the case.
+    5See Preliminary Data for an example of an area which cannot be marked by any single gene in the dataset, but which can be marked by a
+combination.
+    6&#8220;Expression energy ratio&#8221;, which captures overexpression.
+    7This is for the same reason as in footnote 4.
+    8There are clusters which presumably correspond to the intersection of a layer and an area, but since one area will have many layer-area
+intersection clusters, further work is needed to make sense of these.
@@ -234,12 +248,12 @@
-natorially.  according to logistic regression, gene wwc14  is the best fit single gene for predicting whether or not a pixel on
+natorially.  according to logistic regression, gene wwc19  is the best fit single gene for predicting whether or not a pixel on
-Gnee mtif25 is shown in figure the upper-right of Fig. . Mtif2 captures MO&#8217;s upper-left boundary, but not its lower-right
+Gnee mtif210 is shown in figure the upper-right of Fig. . Mtif2 captures MO&#8217;s upper-left boundary, but not its lower-right
@@ -247,16 +261,16 @@
-. The top row of Fig.   displays the 3 genes which most match area AUD, according to a pointwise method6.  The bottom
-row displays the 3 genes which most match AUD according to a method which considers local geometry7  The pointwise
+. The top row of Fig.  displays the 3 genes which most match area AUD, according to a pointwise method11. The bottom
+row displays the 3 genes which most match AUD according to a method which considers local geometry12  The pointwise
-   4&#8220;WW, C2 and coiled-coil domain containing 1&#8221;; EntrezGene ID 211652
-    5&#8220;mitochondrial translational initiation factor 2&#8221;; EntrezGene ID 76784
-    6For each gene, a logistic regression in which the response variable was whether or not a surface pixel was within area AUD, and the predictor
+   9&#8220;WW, C2 and coiled-coil domain containing 1&#8221;; EntrezGene ID 211652
+   10&#8220;mitochondrial translational initiation factor 2&#8221;; EntrezGene ID 76784
+   11For each gene, a logistic regression in which the response variable was whether or not a surface pixel was within area AUD, and the predictor
-    7For each gene the gradient similarity (see section ??) between (a) a map of the expression of each gene on the cortical surface and (b) the
+   12For each gene the gradient similarity (see section ??) between (a) a map of the expression of each gene on the cortical surface and (b) the
@@ -275,7 +289,7 @@
-surface pixels based on their gene expression profiles. We achieved classification accuracy of about 81%8. As noted above,
+surface pixels based on their gene expression profiles. We achieved classification accuracy of about 81%13. As noted above,
@@ -291,7 +305,7 @@
-   85-fold cross-validation.
+  135-fold cross-validation.
--- a/grant.txt	Mon Apr 13 14:53:12 2009 -0700
+++ b/grant.txt	Mon Apr 13 19:38:30 2009 -0700
@@ -147,7 +147,7 @@
-Factorization (NNMF), and a hierarchial recursive bifurcation clustering scheme based on correlation as the similarity score. The paper yielded impressive results, proving the usefulness of such research. We have run NNMF on the cortical dataset and while the results are promising (see Preliminary Data), we think that it will be possible to find a better method\footnote{We ran "vanilla" NNMF, whereas the paper under discussion used a modified method. Their main modification consisted of adding a soft spatial contiguity constraint. However, on our dataset, NNMF naturally produced spatially contiguous clusters, so no additional constraint was needed. The paper under discussion mentions that they also tried a hierarchial variant of NNMF, but since they didn't report its results, we assume that those result were not any more impressive than the results of the non-hierarchial variant.} (we also think that more automation of the parts that this paper's authors did manually will be possible).
+Factorization (NNMF), and a hierarchial recursive bifurcation clustering scheme based on correlation as the similarity score. The paper yielded impressive results, proving the usefulness of such research. We have run NNMF on the cortical dataset\footnote{We ran "vanilla" NNMF, whereas the paper under discussion used a modified method. Their main modification consisted of adding a soft spatial contiguity constraint. However, on our dataset, NNMF naturally produced spatially contiguous clusters, so no additional constraint was needed. The paper under discussion mentions that they also tried a hierarchial variant of NNMF, but since they didn't report its results, we assume that those result were not any more impressive than the results of the non-hierarchial variant.} and while the results are promising (see Preliminary Data), we think that it will be possible to find a better method (we also think that more automation of the parts that this paper's authors did manually will be possible).
@@ -164,11 +164,9 @@
-At first glance AGEA seems similar to this proposal, but in fact it is different. 
-
-Gene Finder is different from our Aim 1 in at least four ways. First, although the user chooses a seed voxel, Gene Finder, not the user, chooses the cluster for which genes will be found, and in our experience it never chooses cortical areas, instead preferring cortical layers. Therefore, Gene Finder cannot be used to find marker genes for cortical areas. Second, Gene Finder finds only single genes, whereas we will also look for combinations of genes. Third, gene finder can only use overexpression as a marker, whereas we will also look for underexpression. Fourth, Gene Finder uses a simple pointwise metric ("expression energy ratio", which captures overexpression), whereas we will also use geometric metrics such as gradient similarity. 
-
-The hierarchial clustering is different from our Aim 2 in at least two ways.  todo
+Gene Finder is different from our Aim 1 in at least four ways. First, although the user chooses a seed voxel, Gene Finder, not the user, chooses the cluster for which genes will be found, and in our experience it never chooses cortical areas, instead preferring cortical layers\footnote{\label{layersNotAreas}Because of the way in which Gene Finder chooses a cluster, layers will always be preferred to areas if pairwise correlations between the gene expression of voxels in different areas but the same layer are stronger than pairwise correlatios between the gene expression of voxels in different layers but the same area. This appears to be the case.}. Therefore, Gene Finder cannot be used to find marker genes for cortical areas. Second, Gene Finder finds only single genes, whereas we will also look for combinations of genes\footnote{See Preliminary Data for an example of an area which cannot be marked by any single gene in the dataset, but which can be marked by a combination.}. Third, gene finder can only use overexpression as a marker, whereas in the Preliminary Data we show that underexpression can also be used. Fourth, Gene Finder uses a simple pointwise score\footnote{"Expression energy ratio", which captures overexpression.}, whereas we will also use geometric metrics such as gradient similarity. 
+
+The hierarchial clustering is different from our Aim 2 in at least three ways. First, the clustering finds clusters corresponding to layers, but no clusters corresponding to areas\footnote{This is for the same reason as in footnote \ref{layersNotAreas}.} \footnote{There are clusters which presumably correspond to the intersection of a layer and an area, but since one area will have many layer-area intersection clusters, further work is needed to make sense of these.} Our Aim 2 will not be accomplished until a clustering is produced which yields areas. Second, AGEA uses perhaps the simplest possible similarity score (correlation), and does no dimensionality reduction before calculating similarity. While it is possible that a more complex system will not do any better than this, we believe further exploration of alternative methods of scoring and dimensionality reduction is warranted. Third, AGEA did not look at clusters of genes; in Preliminary Data we have shown that clusters of genes may identify intersting spatial subregions such as cortical areas.
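
Note on the `layersNotAreas` footnote added in this changeset: it says layers will be preferred to areas whenever voxels in the same layer but different areas correlate more strongly than voxels in the same area but different layers. The sketch below illustrates that condition on synthetic data only. It is not Gene Finder's or AGEA's algorithm, and every name and parameter in it (`make_synthetic_voxels`, `compare_correlations`, `layer_strength`, `area_strength`) is hypothetical. It assumes each voxel's expression is a strong layer-specific profile plus a weaker area-specific profile plus noise, then compares the two mean pairwise correlations.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def make_synthetic_voxels(n_layers=3, n_areas=3, per_cell=3, n_genes=100,
                          layer_strength=2.0, area_strength=0.5, seed=0):
    """Build (layer, area, expression) tuples from made-up data.

    Assumption for illustration: the layer signal is stronger than the
    area signal (layer_strength > area_strength).
    """
    rng = random.Random(seed)
    layer_profiles = [[rng.gauss(0, 1) for _ in range(n_genes)]
                      for _ in range(n_layers)]
    area_profiles = [[rng.gauss(0, 1) for _ in range(n_genes)]
                     for _ in range(n_areas)]
    voxels = []
    for layer in range(n_layers):
        for area in range(n_areas):
            for _ in range(per_cell):
                expr = [layer_strength * layer_profiles[layer][g]
                        + area_strength * area_profiles[area][g]
                        + rng.gauss(0, 1)
                        for g in range(n_genes)]
                voxels.append((layer, area, expr))
    return voxels


def compare_correlations(voxels):
    """Return (mean r for same layer / different area,
               mean r for different layer / same area)."""
    same_layer, same_area = [], []
    for (l1, a1, e1), (l2, a2, e2) in combinations(voxels, 2):
        if l1 == l2 and a1 != a2:
            same_layer.append(pearson(e1, e2))
        elif l1 != l2 and a1 == a2:
            same_area.append(pearson(e1, e2))
    return (sum(same_layer) / len(same_layer),
            sum(same_area) / len(same_area))


if __name__ == "__main__":
    same_layer, same_area = compare_correlations(make_synthetic_voxels())
    print("same layer, different area:", round(same_layer, 3))
    print("different layer, same area:", round(same_area, 3))
```

When the first mean exceeds the second, a correlation-based clustering would be expected to group voxels by layer before area, which is the situation the footnote describes. Setting `area_strength` above `layer_strength` in the synthetic generator reverses the ordering.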