diff grant.txt @ 87:f04ea2784509
author | bshanks@bshanks.dyndns.org |
---|---|
date | Tue Apr 21 05:34:25 2009 -0700 |
parents | aafe6f8c3593 |
children | ae1e1da359d2 |
line diff
1.1 --- a/grant.txt Tue Apr 21 04:05:54 2009 -0700
1.2 +++ b/grant.txt Tue Apr 21 05:34:25 2009 -0700
1.3 @@ -23,7 +23,13 @@
1.4
1.5 \newpage
1.6
1.7 -== Background and significance ==
1.8 +== The challenge topic ==
1.9 +
1.10 +This proposal addresses challenge topic 06-HG-101. Massive new datasets obtained with techniques such as in situ hybridization (ISH), immunohistochemistry, in situ transgenic reporter, microarray voxelation, and others, allow the expression levels of many genes at many locations to be compared. Our goal is to develop automated methods to relate spatial variation in gene expression to anatomy. We want to find marker genes for specific anatomical regions, and also to draw new anatomical maps based on gene expression patterns.
1.11 +
1.12 +== The Challenge and Potential impact ==
1.13 +
1.14 +Now we will discuss each of our three aims in turn. For each aim, we will develop a conceptual framework for thinking about the task, and we will present our strategy for solving it. Next we will discuss related work. At the conclusion of each section, we will summarize why our strategy is different from what has been done before. At the end of this section, we will describe the potential impact.
1.15
1.16 === Aim 1: Given a map of regions, find genes that mark the regions ===
1.17
1.18 @@ -201,7 +207,20 @@
1.19
1.20
1.21
1.22 -\vspace{0.3cm}**Significance**
1.23 +
1.24 +=== Related work ===
1.25 +
1.26 +\cite{ng_anatomic_2009} describes the application of AGEA to the cortex. The paper describes interesting results on the structure of correlations between voxel gene expression profiles within a handful of cortical areas. However, this sort of analysis is not related to either of our aims, as it neither finds marker genes, nor does it suggest a cortical map based on gene expression data. Neither of the other components of AGEA can be applied to cortical areas; AGEA's Gene Finder cannot be used to find marker genes for the cortical areas; and AGEA's hierarchical clustering does not produce clusters corresponding to the cortical areas\footnote{In both cases, the cause is that pairwise correlations between the gene expression of voxels in different areas but the same layer are often stronger than pairwise correlations between the gene expression of voxels in different layers but the same area. Therefore, a pairwise voxel correlation clustering algorithm will tend to create clusters representing cortical layers, not areas (there may be clusters which presumably correspond to the intersection of a layer and an area, but since one area will have many layer-area intersection clusters, further work is needed to make sense of these). The reason that Gene Finder cannot find marker genes for cortical areas is that, although the user chooses a seed voxel, Gene Finder chooses the ROI for which genes will be found, and it creates that ROI by (pairwise voxel correlation) clustering around the seed.}.
1.27 +
1.28 +
1.29 +%% Most of the projects which have been discussed have been done by the same groups that develop the public datasets. Although these projects make their algorithms available for use on their own website, none of them have released an open-source software toolkit; instead, users are restricted to using the provided algorithms only on their own dataset.
1.30 +
1.31 +In summary, for all three aims, (a) only one of the previous projects explores combinations of marker genes, (b) there has been almost no comparison of different algorithms or scoring methods, and (c) there has been no work on computationally finding marker genes for cortical areas, or on finding a hierarchical clustering that will yield a map of cortical areas de novo from gene expression data.
1.32 +
1.33 +Our project is guided by a concrete application with a well-specified criterion of success (how well we can find marker genes for \begin{latex}/\end{latex} reproduce the layout of cortical areas), which will provide a solid basis for comparing different methods.
1.34 +
1.35 +
1.36 +== Significance ==
1.37
1.38 The method developed in aim (1) will be applied to each cortical area to find a set of marker genes such that the combinatorial expression pattern of those genes uniquely picks out the target area. Finding marker genes will be useful for drug discovery as well as for experimentation because marker genes can be used to design interventions which selectively target individual cortical areas.
1.39
1.40 @@ -209,28 +228,16 @@
1.41
1.42
1.43 %% Since the number of classes of stains is small compared to the number of genes,
1.44 +
1.45 The method developed in aim (2) will provide a genoarchitectonic viewpoint that will contribute to the creation of a better map. The development of present-day cortical maps was driven by the application of histological stains. If a different set of stains had been available which identified a different set of features, then today's cortical maps may have come out differently. It is likely that there are many repeated, salient spatial patterns in the gene expression which have not yet been captured by any stain. Therefore, cortical anatomy needs to incorporate what we can learn from looking at the patterns of gene expression.
1.46
1.47 -
1.48 -While we do not here propose to analyze human gene expression data, it is conceivable that the methods we propose to develop could be used to suggest modifications to the human cortical map as well.
1.49 -
1.50 -
1.51 -=== Related work ===
1.52 -
1.53 -\cite{ng_anatomic_2009} describes the application of AGEA to the cortex. The paper describes interesting results on the structure of correlations between voxel gene expression profiles within a handful of cortical areas. However, this sort of analysis is not related to either of our aims, as it neither finds marker genes, nor does it suggest a cortical map based on gene expression data. Neither of the other components of AGEA can be applied to cortical areas; AGEA's Gene Finder cannot be used to find marker genes for the cortical areas; and AGEA's hierarchical clustering does not produce clusters corresponding to the cortical areas\footnote{In both cases, the cause is that pairwise correlations between the gene expression of voxels in different areas but the same layer are often stronger than pairwise correlations between the gene expression of voxels in different layers but the same area. Therefore, a pairwise voxel correlation clustering algorithm will tend to create clusters representing cortical layers, not areas (there may be clusters which presumably correspond to the intersection of a layer and an area, but since one area will have many layer-area intersection clusters, further work is needed to make sense of these). The reason that Gene Finder cannot find marker genes for cortical areas is that, although the user chooses a seed voxel, Gene Finder chooses the ROI for which genes will be found, and it creates that ROI by (pairwise voxel correlation) clustering around the seed.}.
1.54 -
1.55 -
1.56 -%% Most of the projects which have been discussed have been done by the same groups that develop the public datasets. Although these projects make their algorithms available for use on their own website, none of them have released an open-source software toolkit; instead, users are restricted to using the provided algorithms only on their own dataset.
1.57 -
1.58 -In summary, for all three aims, (a) only one of the previous projects explores combinations of marker genes, (b) there has been almost no comparison of different algorithms or scoring methods, and (c) there has been no work on computationally finding marker genes for cortical areas, or on finding a hierarchical clustering that will yield a map of cortical areas de novo from gene expression data.
1.59 -
1.60 -Our project is guided by a concrete application with a well-specified criterion of success (how well we can find marker genes for \begin{latex}/\end{latex} reproduce the layout of cortical areas), which will provide a solid basis for comparing different methods.
1.61 +While we do not here propose to analyze human gene expression data, it is conceivable that the methods we propose to develop could be used to suggest modifications to the human cortical map as well. In fact, the methods we will develop will be applicable to datasets beyond the brain. We will provide an open-source toolbox to allow other researchers to easily use our methods. With these methods, researchers with gene expression data for any area of the body will be able to efficiently find marker genes for anatomical regions, or to use gene expression to discover new anatomical patterning. As described above, marker genes have a variety of uses in the development of drugs and experimental manipulations, and in the anatomical characterization of tissue samples. The discovery of new ways to carve up anatomical structures into regions will widely impact all areas of biology.
1.62 +
1.63
1.64
1.65
1.66 \newpage
1.67 -
1.68 -== Preliminary Studies ==
1.69 +== The approach: Preliminary Studies ==
1.70 \begin{wrapfigure}{L}{0.35\textwidth}\centering
1.71 %%\includegraphics[scale=.27]{singlegene_SS_corr_top_1_2365_jet.eps}\includegraphics[scale=.27]{singlegene_SS_corr_top_2_242_jet.eps}\includegraphics[scale=.27]{singlegene_SS_corr_top_3_654_jet.eps}
1.72 %%\\
1.73 @@ -445,10 +452,10 @@
1.74
1.75
1.76 \newpage
1.77 -== Research Design and Methods ==
1.78 -
1.79 -
1.80 -\vspace{0.3cm}**Flatmapping and segmentation of cortical layers**
1.81 +== The approach: what we plan to do ==
1.82 +
1.83 +
1.84 +\vspace{0.3cm}**Flatmap and segment cortical layers**
1.85
1.86 %%In anatomy, the manifold of interest is usually either defined by a combination of two relevant anatomical axes (todo), or by the surface of the structure (as is the case with the cortex). In the former case, the manifold of interest is a plane, but in the latter case it is curved. If the manifold is curved, there are various methods for mapping the manifold into a plane.
1.87
1.88 @@ -514,6 +521,7 @@
1.89
1.90
1.91 \vspace{0.3cm}**Apply these algorithms to the cortex**
1.92 +
1.93 Using the methods developed in Aim 1, we will present, for each cortical area, a short list of markers to identify that area; and we will also present lists of "panels" of genes that can be used to delineate many areas at once. Using the methods developed in Aim 2, we will present one or more hierarchical cortical maps. We will identify and explain how the statistical structure in the gene expression data led to any unexpected or interesting features of these maps.
1.94
1.95
1.96 @@ -523,6 +531,34 @@
1.97 %%Presently, we do not have a probabilistic atlas which is registered to the ABA space. However, in anticipation of the availability of such maps, we would like to explore extensions to our Aim 1 techniques which can handle probabilistic maps.
1.98
1.99
1.100 +== Timeline and milestones ==
1.101 +
1.102 +=== Aim 1 ===
1.103 +
1.104 +* Oct-Nov 2009: develop an automated mechanism for segmenting the cortical voxels into layers
1.105 +* Nov 2009 (milestone): a preliminary automated mechanism for segmenting the cortical voxels into layers
1.106 +* Oct 2009-Feb 2010: develop scoring methods and test them in various supervised learning frameworks. Also test out various dimensionality reduction schemes in combination with supervised learning.
1.107 +* Dec 2009-April 2010: create or extend supervised learning frameworks which use multivariate versions of the best scoring methods
1.108 +* January 2010 (milestone): submit a publication on single marker genes for cortical areas
1.109 +* February-June 2010: explore the best way to integrate radial profiles with supervised learning. Explore the best way to make supervised learning techniques robust against incorrect labels (i.e. when the areas drawn on the input cortical map are slightly off). Quantitatively compare the performance of different supervised learning techniques.
1.110 +* May-July 2010: Validate marker genes found in the ABA dataset by checking against other gene expression datasets
1.111 +* June 2010: submit a paper describing a method fulfilling Aim 1
1.112 +* July 2010: submit a paper describing combinations of marker genes for each cortical area, and a small number of marker genes that can, in combination, define most of the areas at once
1.113 +* April-July 2010: create documentation and unit tests for software toolbox for Aim 1.
1.114 +* August 2010-: respond to user bug reports for Aim 1 software toolbox.
1.115 +
1.116 +=== Aim 2 ===
1.117 +* April-September 2010: explore dimensionality reduction algorithms for Aim 2
1.118 +* June-November 2010: explore standard hierarchical clustering algorithms, used in combination with dimensionality reduction, for Aim 2
1.119 +* July-December 2010: explore co-clustering algorithms. Think about how radial profile information can be used for Aim 2. Adapt clustering algorithms to use radial profile information.
1.120 +* January-March 2011: Quantitatively compare the performance of different dimensionality reduction and clustering techniques. Quantitatively compare the value of different flatmapping methods and ways of representing radial profiles.
1.121 +* January-June 2011: using the methods developed for Aim 2, explore the genomic anatomy of the cortex. Read the literature and talk to people to learn about research related to unexpected and interesting discoveries.
1.122 +* February-May 2011: create documentation and unit tests for software toolbox for Aim 2.
1.123 +* June 2011-: respond to user bug reports for Aim 2 software toolbox.
1.124 +* March 2011: submit a paper describing a method fulfilling Aim 2
1.125 +* May 2011: submit a paper on the genomic anatomy of the cortex, using the methods developed in Aim 2
1.126 +* May-August 2011: revisit Aim 1 to see if what was learned during Aim 2 can improve the methods for Aim 1.
1.127 +
1.128 \newpage
1.129
1.130 \bibliographystyle{plain}