cg

diff grant.html @ 30:6ec3230fe1dc
.
author: bshanks@bshanks.dyndns.org
date: Mon Apr 13 03:52:58 2009 -0700 (16 years ago)
parents: 5e2e4732b647
children: 95910357b4ac
--- a/grant.html	Mon Apr 13 03:43:51 2009 -0700
+++ b/grant.html	Mon Apr 13 03:52:58 2009 -0700
@@ -1,494 +1,359 @@
+Principal Investigator/Program Director(Last, First, Middle):              Stevens, Charles F.___
-            Massive new datasets obtained with techniques such as in situ hybridization
-            (ISH) and BAC-transgenics allow the expression levels of many genes at many
-            locations to be compared. Our goal is to develop automated methods to relate
-            spatial variation in gene expression to anatomy. We want to find marker genes
-            for specific anatomical regions, and also to draw new anatomical maps based on
-            gene expression patterns. We have three specific aims:
-               (1) develop an algorithm to screen spatial gene expression data for combi-
-            nations of marker genes which selectively target anatomical regions
-               (2) develop an algorithm to suggest new ways of carving up a structure into
-            anatomical subregions, based on spatial patterns in gene expression
-               (3) create a 2-D &#8220;flat map&#8221; dataset of the mouse cerebral cortex that con-
-            tains a flattened version of the Allen Mouse Brain Atlas ISH data, as well as
-            the boundaries of cortical anatomical areas.  Use this dataset to validate the
-            methods developed in (1) and (2).
-               In addition to validating the usefulness of the algorithms, the application of
-            these methods to cerebral cortex will produce immediate benefits, because there
-            are currently no known genetic markers for many cortical areas.  The results
-            of the project will support the development of new ways to selectively target
-            cortical areas, and it will support the development of a method for identifying
-            the cortical areal boundaries present in small tissue samples.
-               All algorithms that we develop will be implemented in an open-source soft-
-            ware toolkit.  The toolkit, as well as the machine-readable datasets developed
-            in aim (3), will be published and freely available for others to use.
-                                            1
+Massive new datasets obtained with techniques such as in situ hybridization (ISH) and BAC-transgenics allow the expression
+levels of many genes at many locations to be compared. Our goal is to develop automated methods to relate spatial variation
+in gene expression to anatomy. We want to find marker genes for specific anatomical regions, and also to draw new anatomical
+maps based on gene expression patterns. We have three specific aims:
+(1) develop an algorithm to screen spatial gene expression data for combinations of marker genes which selectively target
+anatomical regions
+(2) develop an algorithm to suggest new ways of carving up a structure into anatomical subregions, based on spatial
+patterns in gene expression
+(3) create a 2-D &#8220;flat map&#8221; dataset of the mouse cerebral cortex that contains a flattened version of the Allen Mouse Brain
+Atlas ISH data, as well as the boundaries of cortical anatomical areas.  Use this dataset to validate the methods developed
+in (1) and (2).
+In addition to validating the usefulness of the algorithms, the application of these methods to cerebral cortex will produce
+immediate benefits, because there are currently no known genetic markers for many cortical areas. The results of the project
+will support the development of new ways to selectively target cortical areas, and it will support the development of a method
+for identifying the cortical areal boundaries present in small tissue samples.
+All algorithms that we develop will be implemented in an open-source software toolkit.   The toolkit, as well as the
+machine-readable datasets developed in aim (3), will be published and freely available for others to use.
+_______________________________________________________________________________________________________
+PHS 398/2590 (Rev. 09/04)                                                    Page    1 ___                                                      Continuation Format Page
+Background and significance
+Aim 1
+Machine learning terminology: supervised learning
+The task of looking for marker genes for anatomical subregions means that one is looking for a set of genes such that, if
+the expression level of those genes is known, then the locations of the subregions can be inferred.
+If we define the subregions so that they cover the entire anatomical structure to be divided, then instead of saying that we
+are using gene expression to find the locations of the subregions, we may say that we are using gene expression to determine
+to which subregion each voxel within the structure belongs.  We call this a classification task, because each voxel is being
+assigned to a class (namely, its subregion).
+Therefore, an understanding of the relationship between the combination of gene expression levels and the locations of
+the subregions may be expressed as a function.  The input to this function is a voxel, along with the gene expression levels
+within that voxel; the output is the subregional identity of the target voxel, that is, the subregion to which the target voxel
+belongs. We call this function a classifier. In general, the input to a classifier is called an instance, and the output is called
+a label (or a class label).
+The object of aim 1 is not to produce a single classifier, but rather to develop an automated method for determining a
+classifier for any known anatomical structure.  Therefore, we seek a procedure by which a gene expression dataset may be
+analyzed in concert with an anatomical atlas in order to produce a classifier. Such a procedure is a type of machine learning
+procedure. The construction of the classifier is called training (also learning), and the initial gene expression dataset used in
+the construction of the classifier is called training data.
+In the machine learning literature, this sort of procedure may be thought of as a supervised learning task, defined as a
+task in which the goal is to learn a mapping from instances to labels, and the training data consists of a set of instances
+(voxels) for which the labels (subregions) are known.
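To make the terminology concrete, here is a minimal sketch of a supervised learning procedure. It is only an illustration, not the method proposed in this document: it assumes a nearest-centroid rule, and the function name `train_nearest_centroid`, the toy expression values, and the labels `area_1` and `area_2` are all hypothetical.

```python
from collections import defaultdict
from math import dist


def train_nearest_centroid(instances, labels):
    """Training: learn one mean expression vector (centroid) per subregion label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(instances, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def classifier(instance):
        # The classifier maps an instance to the label of the closest centroid.
        return min(centroids, key=lambda y: dist(instance, centroids[y]))

    return classifier


# Hypothetical training data: each instance holds the levels of two genes in one voxel.
training_instances = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.7)]
training_labels = ["area_1", "area_1", "area_2", "area_2"]

classify = train_nearest_centroid(training_instances, training_labels)
print(classify((0.7, 0.3)))
```

The training step sees instances together with their known labels; the returned classifier sees only the features of a single voxel.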
+Each gene expression level is called a feature, and the selection of which genes1  to include is called feature selection.
+Feature selection is one component of the task of learning a classifier. Some methods for learning classifiers start out with a
+separate feature selection phase, whereas other methods combine feature selection with other aspects of training.
+One class of feature selection methods assigns some sort of score to each candidate gene. The top-ranked genes are then
+chosen. Some scoring measures can assign a score to a set of selected genes, not just to a single gene; in this case, a dynamic
+procedure may be used in which features are added to and removed from the selected set depending on how much they raise
+the score. Such procedures are called &#8220;stepwise&#8221; or &#8220;greedy&#8221;.
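A stepwise procedure of this kind might be sketched as follows. This is a simplified, forward-only variant (it adds genes but never removes them), and the scoring rule is an assumption chosen for illustration: a voxel is predicted to lie inside the target region when every selected gene is expressed there, and the score is the fraction of voxels predicted correctly. The gene names and binarized expression values are hypothetical.

```python
def greedy_select(candidates, score, max_genes):
    """Forward stepwise selection: repeatedly add the gene that raises the score most."""
    selected = []
    best = score(selected)
    while len(selected) < max_genes:
        gains = [(score(selected + [g]), g) for g in candidates if g not in selected]
        if not gains:
            break
        new_best, gene = max(gains)
        if new_best <= best:
            break  # no remaining gene improves the score
        selected.append(gene)
        best = new_best
    return selected, best


# Hypothetical binarized expression (1 = expressed) of three genes in six voxels.
expression = {
    "geneA": [1, 1, 1, 1, 0, 0],
    "geneB": [1, 1, 0, 0, 1, 1],
    "geneC": [0, 1, 0, 1, 0, 1],
}
target = [1, 1, 0, 0, 0, 0]  # 1 = voxel belongs to the target region


def accuracy(genes):
    """Score a gene set: predict 'in region' where all selected genes are expressed."""
    predicted = [int(all(expression[g][v] for g in genes)) for v in range(len(target))]
    return sum(p == t for p, t in zip(predicted, target)) / len(target)


selected, best = greedy_select(list(expression), accuracy, max_genes=2)
print(selected, best)
```

In this toy dataset no single gene matches the target region, but the pair of genes chosen by the greedy search does.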
+Although the classifier itself may only look at the gene expression data within each voxel before classifying that voxel, the
+learning algorithm which constructs the classifier may look over the entire dataset.  We can categorize score-based feature
+selection methods depending on how the score is calculated. Often the score calculation consists of assigning a sub-score to
+each voxel, and then aggregating these sub-scores into a final score (the aggregation is often a sum or a sum of squares). If
+only information from nearby voxels is used to calculate a voxel&#8217;s sub-score, then we say it is a local scoring method. If only
+information from the voxel itself is used to calculate a voxel&#8217;s sub-score, then we say it is a pointwise scoring method.
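The distinction can be illustrated with a small sketch. Both functions below aggregate per-voxel sub-scores by a sum of squares, as described above, but the particular sub-scores are assumptions made for this example: the pointwise one compares a gene's expression with a region mask voxel by voxel, and the local one compares differences between each voxel and its right and lower neighbors. Lower values mean a better match.

```python
def pointwise_score(expr, mask):
    """Sum of sub-scores that each use only the voxel's own expression and mask value."""
    return sum((e - m) ** 2
               for e_row, m_row in zip(expr, mask)
               for e, m in zip(e_row, m_row))


def local_score(expr, mask):
    """Sum of sub-scores that also use the voxel's right and lower neighbors."""
    total = 0.0
    rows, cols = len(expr), len(expr[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    grad_e = expr[r2][c2] - expr[r][c]  # change in expression
                    grad_m = mask[r2][c2] - mask[r][c]  # change in region membership
                    total += (grad_e - grad_m) ** 2
    return total


# Hypothetical 2 x 3 slice: the gene follows the region boundary but with a constant offset.
mask = [[1, 1, 0],
        [1, 1, 0]]
expr = [[1.5, 1.5, 0.5],
        [1.5, 1.5, 0.5]]

print(pointwise_score(expr, mask), local_score(expr, mask))
```

Here the constant offset is penalized by the pointwise score but not by the local score, which only looks at how expression changes between neighbors.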
+Key questions when choosing a learning method are:  What are the instances?  What are the features?  How are the
+features chosen? Here are four principles that outline our answers to these questions.
+Principle 1: Combinatorial gene expression It is too much to hope that every anatomical region of interest will be
+identified by a single gene.  For example, in the cortex, there are some areas which are not clearly delineated by any gene
+included in the Allen Brain Atlas (ABA) dataset.  However, at least some of these areas can be delineated by looking at
+combinations of genes (an example of an area for which multiple genes are necessary and sufficient is provided in Preliminary
+Results). Therefore, each instance should contain multiple features (genes).
+Principle 2:  Only look at combinations of small numbers of genes When the classifier classifies a voxel, it is
+only allowed to look at the expression of the genes which have been selected as features. The more data that is available to
+a classifier, the better that it can do.  For example, perhaps there are weak correlations over many genes that add up to a
+strong signal. So, why not include every gene as a feature? The reason is that we wish to employ the classifier in situations
+in which it is not feasible to gather data about every gene. For example, if we want to use the expression of marker genes as
+a trigger for some regionally-targeted intervention, then our intervention must contain a molecular mechanism to check the
+expression level of each marker gene before it triggers. It is currently infeasible to design a molecular trigger that checks the
+_________________________________________
+   1Strictly speaking, the features are gene expression levels, but we&#8217;ll call them genes.
+level of more than a handful of genes.  Similarly, if the goal is to develop a procedure to do ISH on tissue samples in order
+to label their anatomy, then it is infeasible to label more than a few genes.  Therefore, we must select only a few genes as
+features.
+Principle 3: Use geometry in feature selection
+When doing feature selection with score-based methods, the simplest thing to do would be to score the performance of
+each voxel by itself and then combine these scores (pointwise scoring). A more powerful approach is to also use information
+about the geometric relations between each voxel and its neighbors; this requires non-pointwise, local scoring methods. See
+Preliminary Results for evidence of the complementary nature of pointwise and local scoring methods.
+Principle 4: Work in 2-D whenever possible
+There are many anatomical structures which are commonly characterized in terms of a two-dimensional manifold. When
+it is known that the structure that one is looking for is two-dimensional, the results may be improved by allowing the analysis
+algorithm to take advantage of this prior knowledge. In addition, it is easier for humans to visualize and work with 2-D data.
+Therefore, when possible, the instances should represent pixels, not voxels.
+Aim 2
+Machine learning terminology: clustering
+If one is given a dataset consisting merely of instances, with no class labels, then analysis of the dataset is referred to as
+unsupervised learning in the jargon of machine learning. One thing that you can do with such a dataset is to group instances
+together.  A set of similar instances is called a cluster, and the activity of grouping the data into clusters is called
+clustering or cluster analysis.
+The task of deciding how to carve up a structure into anatomical subregions can be put into these terms. The instances
+are once again voxels (or pixels) along with their associated gene expression profiles.  We make the assumption that voxels
+from the same subregion have similar gene expression profiles, at least compared to the other subregions.  This means that
+clustering voxels is the same as finding potential subregions; we seek a partitioning of the voxels into subregions, that is, into
+clusters of voxels with similar gene expression.
+It is desirable to determine not just one set of subregions, but also how these subregions relate to each other, if at all;
+perhaps some of the subregions are more similar to each other than to the rest, suggesting that, although at a fine spatial scale
+they could be considered separate, on a coarser spatial scale they could be grouped together into one large subregion. This
+suggests the outcome of clustering may be a hierarchical tree of clusters, rather than a single set of clusters which partition
+the voxels. This is called hierarchical clustering.
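As a rough illustration of how such a tree can be built, the sketch below repeatedly merges the two closest clusters. It assumes single-linkage (closest-pair) Euclidean distance as the similarity measure, which is only one of many possible choices, and the five expression profiles are hypothetical.

```python
from math import dist


def agglomerate(points):
    """Single-linkage agglomerative clustering; returns the tree as nested tuples of indices."""
    # Each cluster is (tree, member_indices); start with one cluster per instance.
    clusters = [(i, [i]) for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Distance between clusters = distance between their closest members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a][1] for j in clusters[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]


# Hypothetical two-gene expression profiles for five voxels.
profiles = [(0, 0), (1, 0), (10, 10), (12, 10), (50, 50)]
tree = agglomerate(profiles)
print(tree)
```

Cutting the resulting tree at different depths gives partitions at different spatial scales: near the leaves the voxels form many small clusters, and nearer the root they are lumped into a few large ones.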
+Similarity scores
+A crucial choice when designing a clustering method is how to measure similarity, across either pairs of instances, or
+clusters, or both.  There is much overlap between scoring methods for feature selection (discussed above under Aim 1) and
+scoring methods for similarity.
+Spatially contiguous clusters; image segmentation
+We have shown that aim 2 is a type of clustering task. In fact, it is a special type of clustering task because we have an
+additional constraint on clusters; voxels grouped together into a cluster must be spatially contiguous. In Preliminary Results,
+we show that one can get reasonable results without enforcing this constraint; however, we plan to compare these results
+against other methods which guarantee contiguous clusters.
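Whether a given cluster satisfies the contiguity constraint can be checked after the fact. The sketch below assumes 2-D pixels addressed as (row, column) pairs and 4-connectivity (pixels touching along an edge count as neighbors); both are assumptions made for this example.

```python
def is_contiguous(cells):
    """Return True if the set of (row, col) pixels forms one 4-connected component."""
    cells = set(cells)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    stack = [start]
    while stack:
        r, c = stack.pop()
        # Visit the four edge-adjacent neighbors that belong to the cluster.
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == cells


print(is_contiguous({(0, 0), (0, 1), (1, 1)}))  # an L-shaped patch
print(is_contiguous({(0, 0), (2, 2)}))          # two separated pixels
```

A check like this could be used to report how often an unconstrained clustering method happens to produce contiguous clusters.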
+Perhaps the biggest source of contiguous clustering algorithms is the field of computer vision, which has produced a
+variety of image segmentation algorithms.  Image segmentation is the task of partitioning the pixels in a digital image into
+clusters, usually contiguous clusters.  Aim 2 is similar to an image segmentation task.  There are two main differences.  First, in
+our task, there are thousands of color channels (one for each gene), rather than just three.  However, there are imaging tasks which
+use more than three colors, for example multispectral imaging and hyperspectral imaging, which are often used to
+process satellite imagery.  Second, and more crucially, there are various cues which are appropriate for detecting sharp
+object boundaries in a visual scene but which are not appropriate for segmenting abstract spatial data such as gene expression.
+Although many image segmentation algorithms can be expected to work well for segmenting other sorts of spatially arranged
+data, some of these algorithms are specialized for visual images.
+Dimensionality reduction
+Unlike in aim 1, there is no externally-imposed need to select only a handful of informative genes for inclusion in the
+instances.  However, some clustering algorithms perform better on small numbers of features.  There are techniques which
+&#8220;summarize&#8221; a larger number of features using a smaller number of features; these techniques go by the name of feature
+extraction or dimensionality reduction.  The small set of features that such a technique yields is called the reduced feature
+set. After the reduced feature set is created, the instances may be replaced by reduced instances, which have as their features
+the reduced feature set rather than the original feature set of all gene expression levels. Note that the features in the reduced
+feature set do not necessarily correspond to genes; each feature in the reduced set may be any function of the set of gene
+expression levels.
+Another use for dimensionality reduction is to visualize the relationships between subregions.  For example, one might
+want to make a 2-D plot upon which each subregion is represented by a single point, and with the property that subregions
+with similar gene expression profiles should be nearby on the plot (that is, the property that distance between pairs of points
+in the plot should be proportional to some measure of dissimilarity in gene expression).  It is likely that no arrangement of
+the points on a 2-D plane will exactly satisfy this property &#8211; however, dimensionality reduction techniques allow one to find
+arrangements of points that approximately satisfy that property.  Note that in this application, dimensionality reduction
+is being applied after clustering; whereas in the previous paragraph, we were talking about using dimensionality reduction
+before clustering.
+Clustering genes rather than voxels
+Although the ultimate goal is to cluster the instances (voxels or pixels), one strategy to achieve this goal is to first cluster
+the features (genes). There are two ways that clusters of genes could be used.
+Gene clusters could be used as part of dimensionality reduction:  rather than have one feature for each gene, we could
+have one reduced feature for each gene cluster.
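For example, one simple way to form such reduced features is to average the expression levels within each gene cluster. The averaging rule, the gene names, and the values below are assumptions for illustration only; any other function of the cluster's expression levels could be used instead.

```python
def reduce_features(instance, gene_clusters):
    """Replace per-gene features by one feature per gene cluster (the cluster's mean level)."""
    return [sum(instance[g] for g in cluster) / len(cluster) for cluster in gene_clusters]


# Hypothetical expression levels of four genes in a single voxel.
voxel = {"gene_a": 0.75, "gene_b": 0.25, "gene_c": 0.5, "gene_d": 0.0}
gene_clusters = [["gene_a", "gene_b"], ["gene_c", "gene_d"]]

reduced = reduce_features(voxel, gene_clusters)
print(reduced)
```

The reduced instance has two features instead of four, and neither feature corresponds to a single gene.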
+Gene clusters could also be used to directly yield a clustering on instances. This is because many genes have an expression
+pattern which seems to pick out a single, spatially contiguous subregion.  Therefore, it seems likely that an anatomically
+interesting subregion will have multiple genes which each individually pick it out2.  This suggests the following procedure:
+cluster together genes which pick out similar subregions, and then use the more popular common subregions as the
+final clusters.  In the Preliminary Data we show that a number of anatomically recognized cortical regions, as well as some
+&#8220;superregions&#8221; formed by lumping together a few regions, are associated with gene clusters in this fashion.
+Aim 3
+Background
+The cortex is divided into areas and layers. To a first approximation, the parcellation of the cortex into areas can be drawn
+as a 2-D map on the surface of the cortex.  In the third dimension, the boundaries between the areas continue downwards
+into the cortical depth, perpendicular to the surface.  The layer boundaries run parallel to the surface.  One can picture an
+area of the cortex as a slice of many-layered cake.
+Although it is known that different cortical areas have distinct roles in both normal functioning and in disease processes,
+there are no known marker genes for many cortical areas. When it is necessary to divide a tissue sample into cortical areas,
+this is a manual process that requires a skilled human to combine multiple visual cues and interpret them in the context of
+their approximate location upon the cortical surface.
+Even the questions of how many areas should be recognized in cortex,  and what their arrangement is,  are still not
+completely settled.   A proposed division of the cortex into areas is called a cortical map.   In the rodent, the lack of a
+single agreed-upon map can be seen by contrasting the recent maps given by Swanson?? on the one hand, and Paxinos
+and Franklin?? on the other. While the maps are certainly very similar in their general arrangement, significant differences
+remain in the details.
+Significance
+The method developed in aim (1) will be applied to each cortical area to find a set of marker genes such that the
+combinatorial expression pattern of those genes uniquely picks out the target area.  Finding marker genes will be useful for
+drug discovery as well as for experimentation because marker genes can be used to design interventions which selectively
+target individual cortical areas.
+_______________
+   2This would seem to contradict our finding in aim 1 that some cortical areas are combinatorially coded by multiple genes. However, it is possible
+that the currently accepted cortical maps divide the cortex into subregions which are unnatural from the point of view of gene expression; perhaps
+there is some other way to map the cortex for which each subregion can be identified by single genes.
+The application of the marker gene finding algorithm to the cortex will also support the development of new neuroanatomical
+methods.  In addition to finding markers for each individual cortical area, we will find a small panel of genes that can
+find many of the areal boundaries at once.  This panel of marker genes will allow the development of an ISH protocol that
+will allow experimenters to more easily identify which anatomical areas are present in small samples of cortex.
+The method developed in aim (3) will provide a genoarchitectonic viewpoint that will contribute to the creation of a better
+map. The development of present-day cortical maps was driven by the application of histological stains.  It is conceivable
+that if a different set of stains had been available which identified a different set of features, then today&#8217;s cortical maps
+would have come out differently. Since the number of classes of stains is small compared to the number of genes, it is likely
+that there are many repeated, salient spatial patterns in the gene expression which have not yet been captured by any stain.
+Therefore, current ideas about cortical anatomy need to incorporate what we can learn from looking at the patterns of gene
+expression.
+While we do not here propose to analyze human gene expression data, it is conceivable that the methods we propose to
+develop could be used to suggest modifications to the human cortical map as well.
+Related work
+There does not appear to be much work on the automated analysis of spatial gene expression data.
+There is a substantial body of work on the analysis of gene expression data; however, most of this concerns gene expression
+data which is not fundamentally spatial.
+As noted above, there has been much work on both supervised learning and clustering, and there are many available
+algorithms for each.  However, the completion of Aims 1 and 2 involves more than just choosing between a set of existing
+algorithms,  and will constitute a substantial contribution to biology.   The algorithms require the scientist to provide a
+framework for representing the problem domain, and the way that this framework is set up has a large impact on performance.
+Creating a good framework can require creatively reconceptualizing the problem domain, and is not merely a mechanical
+&#8220;fine-tuning&#8221; of numerical parameters.  For example, we believe that domain-specific scoring measures (such as gradient
+similarity, which is discussed in Preliminary Work) may be necessary in order to achieve the best results in this application.
+We are aware of two existing efforts to relate spatial gene expression data to anatomy through computational methods.
+[? ] describes an analysis of the anatomy of the hippocampus using the ABA dataset.  In addition to manual analysis,
+two clustering methods were employed, a modified Non-negative Matrix Factorization (NNMF), and a hierarchical bifurcation
+clustering scheme based on correlation as the similarity score.  The paper yielded impressive results, proving the usefulness
+of such research. We have run NNMF on the cortical dataset and while the results are promising (see Preliminary Data), we
+think that it will be possible to find a better method3  (we also think that more automation of the parts that this paper&#8217;s
+authors did manually will be possible).
+and [?] describes AGEA. todo
+_____________
+   3We ran &#8220;vanilla&#8221; NNMF, whereas the paper under discussion used a modified method.  Their main modification consisted of adding a soft
+spatial contiguity constraint.  However, on our dataset, NNMF naturally produced spatially contiguous clusters, so no additional constraint was
+needed. The paper under discussion mentions that they also tried a hierarchical variant of NNMF, but since they didn&#8217;t report its results, we assume
+that those results were not any more impressive than the results of the non-hierarchical variant.
+                   
-             Background and significance
-             Aim 1
-            Machine learning terminology: supervised learning
-               The task of looking for marker genes for anatomical subregions means that
-            one is looking for a set of genes such that, if the expression level of those genes
-            is known, then the locations of the subregions can be inferred.
-               If we define the subregions so that they cover the entire anatomical structure
-            to be divided, then instead of saying that we are using gene expression to find
-            the locations of the subregions, we may say that we are using gene expression to
-            determine to which subregion each voxel within the structure belongs. We call
-            this a classification task, because each voxel is being assigned to a class (namely,
-            its subregion).
-               Therefore, an understanding of the relationship between the combination of
-            their expression levels and the locations of the subregions may be expressed as
-            a function. The input to this function is a voxel, along with the gene expression
-            levels within that voxel;  the output is the subregional identity of the target
-            voxel, that is, the subregion to which the target voxel belongs.  We call this
-            function a classifier.  In general, the input to a classifier is called an instance,
-            and the output is called a label (or a class label).
-               The object of aim 1 is not to produce a single classifier, but rather to develop
-            an automated method for determining a classifier for any known anatomical
-            structure.  Therefore, we seek a procedure by which a gene expression dataset
-            may be analyzed in concert with an anatomical atlas in order to produce a
-            classifier.  Such a procedure is a type of a machine learning procedure.  The
-            construction of the classifier is called training (also learning), and the initial
-            gene expression dataset used in the construction of the classifier is called training
-            data.
-               In the machine learning literature, this sort of procedure may be thought
-            of as a supervised learning task, defined as a task in which the goal is to learn
-            a mapping from instances to labels, and the training data consists of a set of
-            instances (voxels) for which the labels (subregions) are known.
-               Each gene expression level is called a feature, and the selection of which
-            genes1 to include is called feature selection. Feature selection is one component
-            of the task of learning a classifier.  Some methods for learning classifiers start
-            out with a separate feature selection phase, whereas other methods combine
-            feature selection with other aspects of training.
-               One class of feature selection methods assigns some sort of score to each
-            candidate gene. The top-ranked genes are then chosen. Some scoring measures
-            can assign a score to a set of selected genes, not just to a single gene; in this
-            case, a dynamic procedure may be used in which features are added and sub-
-            tracted from the selected set depending on how much they raise the score. Such
-            procedures are called &#8220;stepwise&#8221; or &#8220;greedy&#8221;.
-__________________________
-   1Strictly speaking, the features are gene expression levels, but we&#8217;ll call them genes.
-                                            2
-
-               Although the classifier itself may only look at the gene expression data within
-            each voxel before classifying that voxel, the learning algorithm which constructs
-            the classifier may look over the entire dataset.  We can categorize score-based
-            feature selection methods depending on how the score of calculated.   Often
-            the score calculation consists of assigning a sub-score to each voxel, and then
-            aggregating these sub-scores into a final score (the aggregation is often a sum or
-            a sum of squares). If only information from nearby voxels is used to calculate a
-            voxel&#8217;s sub-score, then we say it is a local scoring method.  If only information
-            from the voxel itself is used to calculate a voxel&#8217;s sub-score, then we say it is a
-            pointwise scoring method.
-               Key questions when choosing a learning method are: What are the instances?
-            What are the features?  How are the features chosen?  Here are four principles
-            that outline our answers to these questions.
-               Principle 1:  Combinatorial gene expression It is too much to hope
-            that every anatomical region of interest will be identified by a single gene. For
-            example, in the cortex, there are some areas which are not clearly delineated
-            by any gene included in the Allen Brain Atlas (ABA) dataset.  However, at
-            least some of these areas can be delineated by looking at combinations of genes
-            (an example of an area for which multiple genes are necessary and sufficient
-            is provided in Preliminary Results).  Therefore, each instance should contain
-            multiple features (genes).
-               Principle 2: Only look at combinations of small numbers of genes
-            When the classifier classifies a voxel, it is only allowed to look at the expression of
-            the genes which have been selected as features. The more data that is available
-            to a classifier, the better that it can do.  For example, perhaps there are weak
-            correlations over many genes that add up to a strong signal. So, why not include
-            every gene as a feature? The reason is that we wish to employ the classifier in
-            situations in which it is not feasible to gather data about every gene.   For
-            example, if we want to use the expression of marker genes as a trigger for some
-            regionally-targeted intervention, then our intervention must contain a molecular
-            mechanism to check the expression level of each marker gene before it triggers.
-            It is currently infeasible to design a molecular trigger that checks the level of
-            more than a handful of genes. Similarly, if the goal is to develop a procedure to
-            do ISH on tissue samples in order to label their anatomy, then it is infeasible
-            to label more than a few genes.  Therefore, we must select only a few genes as
-            features.
-               Principle 3: Use geometry in feature selection
-               When doing feature selection with score-based methods, the simplest thing
-            to do would be to score the performance of each voxel by itself and then com-
-            bine these scores (pointwise scoring).  A more powerful approach is to also use
-            information about the geometric relations between each voxel and its neighbors;
-            this requires non-pointwise, local scoring methods. See Preliminary Results for
-            evidence of the complementary nature of pointwise and local scoring methods.
-                                            3
-
-               Principle 4: Work in 2-D whenever possible
-               There are many anatomical structures which are commonly characterized in
-            terms of a two-dimensional manifold. When it is known that the structure that
-            one is looking for is two-dimensional, the results may be improved by allowing
-            the analysis algorithm to take advantage of this prior knowledge.  In addition,
-            it is easier for humans to visualize and work with 2-D data.
-               Therefore, when possible, the instances should represent pixels, not voxels.
-             Aim 2
-            Machine learning terminology: clustering
-               If one is given a dataset consisting merely of instances, with no class labels,
-            then analysis of the dataset is referred to as unsupervised learning in the jargon
-            of machine learning. One thing that you can do with such a dataset is to group
-            instances together. A set of similar instances is called a cluster, and the activity
-            of grouping the data into clusters is called clustering or cluster analysis.
-               The task of deciding how to carve up a structure into anatomical subregions
-            can be put into these terms.  The instances are once again voxels (or pixels)
-            along with their associated gene expression profiles.  We make the assumption
-            that voxels from the same subregion have similar gene expression profiles, at
-            least compared to the other subregions.  This means that clustering voxels is
-            the same as finding potential subregions; we seek a partitioning of the voxels
-            into subregions, that is, into clusters of voxels with similar gene expression.
-               It is desirable to determine not just one set of subregions,  but also how
-            these subregions relate to each other, if at all; perhaps some of the subregions
-            are more similar to each other than to the rest, suggesting that, although at a
-            fine spatial scale they could be considered separate, on a coarser spatial scale
-            they could be grouped together into one large subregion.  This suggests the
-            outcome of clustering may be a hierarchical tree of clusters, rather than a single
-            set of clusters which partition the voxels. This is called hierarchical clustering.
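The idea of a cluster tree can be illustrated with a minimal sketch of agglomerative clustering. This is not the method proposed in the grant: the average-linkage rule, the Euclidean distance, the function names, and the four-voxel toy dataset are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, profiles):
    """Mean Euclidean distance over all cross-cluster pairs of voxels."""
    total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hierarchical_clusters(profiles):
    """Agglomerative clustering; returns the merges, finest scale first.

    profiles: one gene expression vector per voxel (hypothetical data).
    Each merge is a pair (members_of_cluster_a, members_of_cluster_b).
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters whose members are, on average, most similar.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Hypothetical data: 4 voxels x 3 genes. Voxels 0 and 1 have similar
# profiles, as do voxels 2 and 3.
toy_profiles = [
    (1.0, 0.1, 0.0),
    (0.9, 0.2, 0.0),
    (0.0, 0.9, 1.0),
    (0.1, 1.0, 0.9),
]
merges = hierarchical_clusters(toy_profiles)
```

Read from first to last, the merges describe the tree from fine to coarse scale: small subregions are formed first, and later merges lump them into larger ones.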
-               Similarity scores
-               A crucial choice when designing a clustering method is how to measure
-            similarity, across either pairs of instances, or clusters, or both.  There is much
-            overlap between scoring methods for feature selection (discussed above under
-            Aim 1) and scoring methods for similarity.
-               Spatially contiguous clusters; image segmentation
-               We have shown that aim 2 is a type of clustering task.   In fact,  it is a
-            special type of clustering task because we have an additional constraint on
-            clusters; voxels grouped together into a cluster must be spatially contiguous.
-            In Preliminary Results, we show that one can get reasonable results without
-            enforcing this constraint; however, we plan to compare these results against
-            other methods which guarantee contiguous clusters.
-               Perhaps the biggest source of contiguous clustering algorithms is the field
-            of computer vision, which has produced a variety of image segmentation algorithms.
-            Image segmentation is the task of partitioning the pixels in a digital
-            image into clusters, usually contiguous clusters.  Aim 2 is similar to an image
-            segmentation task. There are two main differences. First, in our task there are
-            thousands of color channels (one for each gene), rather than just three. There are
-            imaging tasks which use more than three colors, however; examples are multispectral
-            imaging and hyperspectral imaging, which are often used to process satellite
-            imagery. A more crucial difference is that there are various cues which are ap-
-            propriate for detecting sharp object boundaries in a visual scene but which are
-            not appropriate for segmenting abstract spatial data such as gene expression.
-            Although many image segmentation algorithms can be expected to work well
-            for segmenting other sorts of spatially arranged data, some of these algorithms
-            are specialized for visual images.
-               Dimensionality reduction
-               Unlike aim 1, there is no externally-imposed need to select only a handful
-            of informative genes for inclusion in the instances.  However, some clustering
-            algorithms perform better on small numbers of features.  There are techniques
-            which &#8220;summarize&#8221; a larger number of features using a smaller number of fea-
-            tures; these techniques go by the name of feature extraction or dimensionality
-            reduction.  The small set of features that such a technique yields is called the
-            reduced feature set. After the reduced feature set is created, the instances may
-            be replaced by reduced instances, which have as their features the reduced fea-
-            ture set rather than the original feature set of all gene expression levels.  Note
-            that the features in the reduced feature set do not necessarily correspond to
-            genes; each feature in the reduced set may be any function of the set of gene
-            expression levels.
-               Another use for dimensionality reduction is to visualize the relationships
-            between subregions.  For example, one might want to make a 2-D plot upon
-            which each subregion is represented by a single point, and with the property
-            that subregions with similar gene expression profiles should be nearby on the
-            plot (that is, the property that distance between pairs of points in the plot
-            should be proportional to some measure of dissimilarity in gene expression). It
-            is likely that no arrangement of the points on a 2-D plane will exactly satisfy
-            this property &#8211; however, dimensionality reduction techniques allow one to find
-            arrangements of points that approximately satisfy that property.   Note that
-            in this application, dimensionality reduction is being applied after clustering;
-            whereas in the previous paragraph, we were talking about using dimensionality
-            reduction before clustering.
-               Clustering genes rather than voxels
-               Although the ultimate goal is to cluster the instances (voxels or pixels), one
-            strategy to achieve this goal is to first cluster the features (genes).  There are
-            two ways that clusters of genes could be used.
-               Gene clusters could be used as part of dimensionality reduction: rather than
-            have one feature for each gene, we could have one reduced feature for each gene
-            cluster.
-                                            5
-
-               Gene clusters could also be used to directly yield a clustering on instances.
-            This is because many genes have an expression pattern which seems to pick
-            out a single, spatially contiguous subregion. Therefore, it seems likely that an
-            anatomically interesting subregion will have multiple genes which each individually
-            pick it out2. This suggests the following procedure: cluster together genes
-            which pick out similar subregions, and then use the more popular common
-            subregions as the final clusters. In the Preliminary Data we show that a number
-            of anatomically recognized cortical regions, as well as some &#8220;superregions&#8221;
-            formed by lumping together a few regions, are associated with gene clusters in
-            this fashion.
-             Aim 3
-            Background
-               The cortex is divided into areas and layers.  To a first approximation, the
-            parcellation of the cortex into areas can be drawn as a 2-D map on the surface of
-            the cortex.  In the third dimension, the boundaries between the areas continue
-            downwards into the cortical depth,  perpendicular to the surface.   The layer
-            boundaries run parallel to the surface. One can picture an area of the cortex as
-            a slice of many-layered cake.
-               Although it is known that different cortical areas have distinct roles in both
-            normal functioning and in disease processes, there are no known marker genes
-            for many cortical areas.  When it is necessary to divide a tissue sample into
-            cortical areas, this is a manual process that requires a skilled human to combine
-            multiple visual cues and interpret them in the context of their approximate
-            location upon the cortical surface.
-               Even the questions of how many areas should be recognized in cortex, and
-            what their arrangement is, are still not completely settled. A proposed division
-            of the cortex into areas is called a cortical map.  In the rodent, the lack of a
-            single agreed-upon map can be seen by contrasting the recent maps given by
-            Swanson?? on the one hand, and Paxinos and Franklin?? on the other. While
-            the maps are certainly very similar in their general arrangement, significant
-            differences remain in the details.
-               Significance
-               The method developed in aim (1) will be applied to each cortical area to find
-            a set of marker genes such that the combinatorial expression pattern of those
-            genes uniquely picks out the target area.  Finding marker genes will be useful
-            for drug discovery as well as for experimentation because marker genes can be
-            used to design interventions which selectively target individual cortical areas.
-__________________________
-   2This would seem to contradict our finding in aim 1 that some cortical areas are combina-
-torially coded by multiple genes.  However, it is possible that the currently accepted cortical
-maps divide the cortex into subregions which are unnatural from the point of view of gene
-expression; perhaps there is some other way to map the cortex for which each subregion can
-be identified by single genes.
-                                            6
-
-               The application of the marker gene finding algorithm to the cortex will
-            also support the development of new neuroanatomical methods. In addition to
-            finding markers for each individual cortical area, we will find a small panel
-            of genes that can find many of the areal boundaries at once.  This panel of
-            marker genes will allow the development of an ISH protocol that will allow
-            experimenters to more easily identify which anatomical areas are present in
-            small samples of cortex.
-               The method developed in aim (3) will provide a genoarchitectonic viewpoint
-            that will contribute to the creation of a better map. The development of present-
-            day cortical maps was driven by the application of histological stains.   It is
-            conceivable that if a different set of stains had been available which identified
-            a different set of features, then today&#8217;s cortical maps would have come out
-            differently. Since the number of classes of stains is small compared to the number
-            of genes, it is likely that there are many repeated, salient spatial patterns in
-            the gene expression which have not yet been captured by any stain. Therefore,
-            current ideas about cortical anatomy need to incorporate what we can learn
-            from looking at the patterns of gene expression.
-               While we do not here propose to analyze human gene expression data, it is
-            conceivable that the methods we propose to develop could be used to suggest
-            modifications to the human cortical map as well.
-             Related work
-            There does not appear to be much work on the automated analysis of spatial
-            gene expression data.
-               There is a substantial body of work on the analysis of gene expression data;
-            however, most of this concerns gene expression data which is not fundamentally
-            spatial.
-               As noted above, there has been much work on both supervised learning and
-            clustering,  and there are many available algorithms for each.   However,  the
-            completion of Aims 1 and 2 involves more than just choosing between a set of
-            existing algorithms, and will constitute a substantial contribution to biology.
-            The algorithms require the scientist to provide a framework for representing the
-            problem domain, and the way that this framework is set up has a large impact
-            on performance.  Creating a good framework can require creatively reconcep-
-            tualizing the problem domain, and is not merely a mechanical &#8220;fine-tuning&#8221;
-            of numerical parameters. For example, we believe that domain-specific scoring
-            measures (such as gradient similarity, which is discussed in Preliminary Work)
-            may be necessary in order to achieve the best results in this application.
-               We are aware of two existing efforts to relate spatial gene expression data to
-            anatomy through computational methods.
-               [?] describes an analysis of the anatomy of the hippocampus using the ABA
-            dataset. In addition to manual analysis, two clustering methods were employed,
-            a modified Non-negative Matrix Factorization (NNMF), and a hierarchical bifurcation
-            clustering scheme based on correlation as the similarity score. The paper
-            yielded impressive results, proving the usefulness of such research. We have run
-                                            7
-
-            NNMF on the cortical dataset and while the results are promising (see Prelim-
-            inary Data), we think that it will be possible to find a better method3 (we also
-            think that more automation of the parts that this paper&#8217;s authors did manually
-            will be possible).
-               and [?] describes AGEA. todo
-__________________________
-   3We ran &#8220;vanilla&#8221; NNMF, whereas the paper under discussion used a modified method.
-Their main modification consisted of adding a soft spatial contiguity constraint.  However,
-on our dataset,  NNMF naturally produced spatially contiguous clusters,  so no additional
-constraint was needed. The paper under discussion mentions that they also tried a hierarchical
-variant of NNMF, but since they didn&#8217;t report its results, we assume that those results were
-not any more impressive than the results of the non-hierarchical variant.
-                                            8
-
-             Preliminary work
-             Format conversion between SEV, MATLAB, NIFTI
-            todo
-             Flatmap of cortex
-            todo
-               Using combinations of multiple genes is necessary and sufficient to
-            delineate some cortical areas
-               Here we give an example of a cortical area which is not marked by any
-            single gene, but which can be identified combinatorially.  According to logistic
-            regression, gene wwc1 (footnote 4) is the best fit single gene for predicting whether or not a
-            pixel on the cortical surface belongs to the motor area (area MO). The upper-left
-            picture in Figure 1 shows wwc1&#8217;s spatial expression pattern over the cortex. The
-            lower-right boundary of MO is represented reasonably well by this gene; however,
-            the gene overshoots the upper-left boundary. This flattened 2-D representation
-            does not show it, but the area corresponding to the overshoot is the medial
-            surface of the cortex. MO is only found on the lateral surface (todo).
-               Gene mtif2 (footnote 5) is shown in the upper-right of Fig. 1. Mtif2 captures MO&#8217;s
-            upper-left boundary, but not its lower-right boundary.  Mtif2 does not express
-            very much on the medial surface.  By adding together the values at each pixel
-            in these two figures, we get the lower-left of Figure 1. This combination captures
-            area MO much better than any single gene.
-               Correlation todo
-               Conditional entropy todo
-               Gradient similarity todo
-               Geometric and pointwise scoring methods provide complementary
-            information
-               To show that local geometry can provide useful information that cannot be
-            detected via pointwise analyses, consider Fig. 2. The top row of Fig. 2 displays the
-            3 genes which most match area AUD, according to a pointwise method6.  The
-            bottom row displays the 3 genes which most match AUD according to a method
-            which considers local geometry7. The pointwise method in the top row identifies
-__________________________
+Figure 1: Upper left: wwc1. Upper right: mtif2. Lower left: wwc1 + mtif2 (each pixel&#8217;s value on the lower left is the sum
+of the corresponding pixels in the upper row). Within each picture, the vertical axis roughly corresponds to anterior at the
+top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right.
+The red outline is the boundary of region MO. Pixels are colored approximately according to the density of expressing cells
+underneath each pixel, with red meaning a lot of expression and blue meaning little.
+Preliminary work
+Format conversion between SEV, MATLAB, NIFTI
+todo
+Flatmap of cortex
+todo
+Using combinations of multiple genes is necessary and sufficient to delineate some cortical areas
+Here we give an example of a cortical area which is not marked by any single gene, but which can be identified combinatorially.
+According to logistic regression, gene wwc1 (footnote 4) is the best fit single gene for predicting whether or not a pixel on
+the cortical surface belongs to the motor area (area MO). The upper-left picture in Figure 1 shows wwc1&#8217;s spatial expression
+pattern over the cortex.  The lower-right boundary of MO is represented reasonably well by this gene; however, the gene
+overshoots the upper-left boundary.  This flattened 2-D representation does not show it, but the area corresponding to the
+overshoot is the medial surface of the cortex. MO is only found on the lateral surface (todo).
+Gene mtif2 (footnote 5) is shown in the upper-right of Fig. 1. Mtif2 captures MO&#8217;s upper-left boundary, but not its lower-right
+boundary. Mtif2 does not express very much on the medial surface. By adding together the values at each pixel in these two
+figures, we get the lower-left of Figure 1. This combination captures area MO much better than any single gene.
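The effect of summing two complementary genes can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 8-pixel strip, the expression values, and the use of Pearson correlation against the region mask as a stand-in score (the analysis described above used logistic regression).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical 1-D strip of 8 pixels; the target region covers pixels 3..5.
in_region = [0, 0, 0, 1, 1, 1, 0, 0]

# gene_a respects the right boundary but overshoots on the left;
# gene_b respects the left boundary but overshoots on the right.
gene_a = [1, 1, 1, 1, 1, 1, 0, 0]
gene_b = [0, 0, 0, 1, 1, 1, 1, 1]

# Pixelwise sum, analogous to the lower-left panel of Figure 1.
combined = [a + b for a, b in zip(gene_a, gene_b)]

score_a = pearson(gene_a, in_region)
score_b = pearson(gene_b, in_region)
score_combined = pearson(combined, in_region)
```

In this toy case each gene alone matches only one boundary, while the sum is highest exactly inside the region, so the combined score exceeds either single-gene score.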
+Correlation todo
+Conditional entropy todo
+Gradient similarity todo
+Geometric and pointwise scoring methods provide complementary information
+To show that local geometry can provide useful information that cannot be detected via pointwise analyses, consider Fig. 2.
+The top row of Fig. 2 displays the 3 genes which most match area AUD, according to a pointwise method6. The bottom row
+_________________________________________
-    6For each gene, a logistic regression in which the response variable was whether or not a
-surface pixel was within area AUD, and the predictor variable was the value of the expression
-of the gene underneath that pixel. The resulting scores were used to rank the genes in terms
-of how well they predict area AUD.
-    7For each gene the gradient similarity (see section ??) between (a) a map of the expression
-of each gene on the cortical surface and (b) the shape of area AUD, was calculated, and this
-was used to rank the genes.
-                                            9
-
-                                        
-            
-            Figure 1:  Upper left:  wwc1.  Upper right:  mtif2.  Lower left:  wwc1 + mtif2
-            (each pixel&#8217;s value on the lower left is the sum of the corresponding pixels in
-            the upper row).  Within each picture, the vertical axis roughly corresponds to
-            anterior at the top and posterior at the bottom, and the horizontal axis roughly
-            corresponds to medial at the left and lateral at the right.  The red outline is
-            the boundary of region MO. Pixels are colored approximately according to the
-            density of expressing cells underneath each pixel, with red meaning a lot of
-            expression and blue meaning little.
-                                            10
-
-                                                        
-                                                        
-            Figure 2: The top row shows the three genes which (individually) best predict
-            area AUD, according to logistic regression.  The bottom row shows the three
-            genes which (individually) best match area AUD, according to gradient similar-
-            ity. From left to right and top to bottom, the genes are Ssr1, Efcbp1, Aph1a,
-            Ptk7, Aph1a again, and Lepr.
-            genes which express more strongly in AUD than outside of it; its weakness is that
-            this includes many areas which don&#8217;t have a salient border matching the areal
-            border. The geometric method identifies genes whose salient expression border
-            seems to partially line up with the border of AUD; its weakness is that this
-            includes genes which don&#8217;t express over the entire area. Genes which have high
-            rankings using both pointwise and border criteria, such as Aph1a in the example,
-            may be particularly good markers.   None of these genes are,  individually,  a
-            perfect marker for AUD; we deliberately chose a &#8220;difficult&#8221; area in order to
-            better contrast pointwise with geometric methods.
-               Areas which can be identified by single genes
-               todo
-             Specific to Aim 1 (and Aim 3)
-            Forward stepwise logistic regression todo
-               SVM on all genes at once
-               In order to see how well one can do when looking at all genes at once, we
-            ran a support vector machine to classify cortical surface pixels based on their
-            gene expression profiles.  We achieved classification accuracy of about 81%8.
-            As noted above, however, a classifier that looks at all the genes at once isn&#8217;t
-            practically useful.
-____________
-   85-fold cross-validation.
-                                            11
-
-               The requirement to find combinations of only a small number of genes limits
-            us from straightforwardly applying many of the most simple techniques from
-            the field of supervised machine learning.  In the parlance of machine learning,
-            our task combines feature selection with supervised learning.
-               Decision trees
-               todo
-             Specific to Aim 2 (and Aim 3)
-            Raw dimensionality reduction results
-               todo
-               (might want to include NNMF since mentioned above)
-               Dimensionality reduction plus K-means or spectral clustering
-               Many areas are captured by clusters of genes
-               todo
-               todo
-                                            12
-
-             Research plan
-            todo amongst other things:
-               Develop algorithms that find genetic markers for anatomical re-
-            gions
-              1. Develop scoring measures for evaluating how good individual genes are at
-                 marking areas:  we will compare pointwise, geometric, and information-
-                 theoretic measures.
-              2. Develop a procedure to find single marker genes for anatomical regions: for
-                 each cortical area, by using or combining the scoring measures developed,
-                 we will rank the genes by their ability to delineate each area.
-              3. Extend the procedure to handle difficult areas by using combinatorial cod-
-                 ing: for areas that cannot be identified by any single gene, identify them
-                 with a handful of genes. We will consider both (a) algorithms that incre-
-                 mentally/greedily combine single gene markers into sets, such as forward
-                 stepwise regression and decision trees, and also (b) supervised learning
-                 techniques which use soft constraints to minimize the number of features,
-                 such as sparse support vector machines.
-              4. Extend the procedure to handle difficult areas by combining or redrawing
-                 the boundaries:  An area may be difficult to identify because the bound-
-                 aries are misdrawn, or because it does not &#8220;really&#8221; exist as a single area,
-                 at least on the genetic level. We will develop extensions to our procedure
-                 which (a) detect when a difficult area could be fit if its boundary were
-                 redrawn slightly, and (b) detect when a difficult area could be combined
-                 with adjacent areas to create a larger area which can be fit.
-               Apply these algorithms to the cortex
-              1. Create open source format conversion tools:  we will create tools to bulk
-                 download the ABA dataset and to convert between SEV, NIFTI and MAT-
-                 LAB formats.
-              2. Flatmap the ABA cortex data: map the ABA data onto a plane and draw
-                 the cortical area boundaries onto it.
-              3. Find layer boundaries:  cluster similar voxels together in order to auto-
-                 matically find the cortical layer boundaries.
-              4. Run the procedures that we developed on the cortex: we will present, for
-                 each area, a short list of markers to identify that area; and we will also
-                 present lists of &#8220;panels&#8221; of genes that can be used to delineate many areas
-                 at once.
-                                            13
-
-               Develop algorithms to suggest a division of a structure into anatom-
-            ical parts
-              1. Explore dimensionality reduction algorithms applied to pixels:  including
-                 TODO
-              2. Explore dimensionality reduction algorithms applied to genes:  including
-                 TODO
-              3. Explore clustering algorithms applied to pixels: including TODO
-              4. Explore clustering algorithms applied to genes:  including gene shaving,
-                 TODO
-              5. Develop an algorithm to use dimensionality reduction and/or hierarchical
-                 clustering to create anatomical maps
-              6. Run this algorithm on the cortex: present a hierarchical, genoarchitectonic
-                 map of the cortex
-______________________________________________
-    stuff  i  dunno  where  to  put  yet  (there  is  more  scattered  through  grant-
-oldtext):
+    6For each gene, a logistic regression in which the response variable was whether or not a surface pixel was within area AUD, and the predictor
+variable was the value of the expression of the gene underneath that pixel. The resulting scores were used to rank the genes in terms of how well
+_______________________________________________________________________________________________________
+PHS 398/2590 (Rev. 09/04)                                                    Page    6 ___                                                      Continuation Format Page
+                                     
+                                     
+Figure 2:  The top row shows the three genes which (individually) best predict area AUD, according to logistic regression.
+The bottom row shows the three genes which (individually) best match area AUD, according to gradient similarity.  From
+left to right and top to bottom, the genes are Ssr1, Efcbp1, Aph1a, Ptk7, Aph1a again, and Lepr.
+displays the 3 genes which most match AUD according to a method which considers local geometry7. The pointwise method
+in the top row identifies genes which express more strongly in AUD than outside of it; its weakness is that this includes many
+areas which don&#8217;t have a salient border matching the areal border.  The geometric method identifies genes whose salient
+expression border seems to partially line up with the border of AUD; its weakness is that this includes genes which don&#8217;t
+express over the entire area. Genes which have high rankings using both pointwise and border criteria, such as Aph1a in the
+example, may be particularly good markers. None of these genes are, individually, a perfect marker for AUD; we deliberately
+chose a &#8220;difficult&#8221; area in order to better contrast pointwise with geometric methods.
+Areas which can be identified by single genes
+todo
+Specific to Aim 1 (and Aim 3)
+Forward stepwise logistic regression todo
+SVM on all genes at once
+In order to see how well one can do when looking at all genes at once, we ran a support vector machine to classify cortical
+surface pixels based on their gene expression profiles.  We achieved classification accuracy of about 81%8.  As noted above,
+however, a classifier that looks at all the genes at once isn&#8217;t practically useful.
+The requirement to find combinations of only a small number of genes limits us from straightforwardly applying many
+of the most simple techniques from the field of supervised machine learning.  In the parlance of machine learning, our task
+combines feature selection with supervised learning.
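One simple way to combine feature selection with supervised learning is greedy forward selection under a cap on the number of genes. The sketch below is only an illustration of that idea: the gene names, the toy data, the `max_genes` parameter, and the scoring rule (Pearson correlation between the summed expression of the selected genes and the region mask) are all assumptions, standing in for the forward stepwise logistic regression named above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / denom if denom else 0.0


def forward_select(genes, in_region, max_genes=3):
    """Greedily add the gene that most improves the score, up to max_genes."""
    selected, best_score = [], float("-inf")
    while len(selected) < max_genes:
        scores = {}
        for name, expr in genes.items():
            if name in selected:
                continue
            # Score the pixelwise sum of this gene and the genes chosen so far.
            combined = [sum(v) for v in zip(expr, *(genes[s] for s in selected))]
            scores[name] = pearson(combined, in_region)
        if not scores:
            break
        candidate = max(scores, key=scores.get)
        if scores[candidate] <= best_score:
            break  # no remaining gene improves the fit
        selected.append(candidate)
        best_score = scores[candidate]
    return selected, best_score


# Hypothetical data: 8 pixels; the target region covers pixels 3..5.
in_region = [0, 0, 0, 1, 1, 1, 0, 0]
toy_genes = {
    "gene_a": [1, 1, 1, 1, 1, 1, 0, 0],
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],
    "gene_c": [1, 0, 1, 0, 1, 0, 1, 0],  # uninformative pattern
}
selected, score = forward_select(toy_genes, in_region)
```

On this toy input the procedure picks the best single gene first, adds the complementary gene second, and stops before adding the uninformative third gene, so the selected set stays small.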
+Decision trees
+todo
+Specific to Aim 2 (and Aim 3)
+Raw dimensionality reduction results
+todo
+(might want to include nnMF since mentioned above)
+_________________________________________
+they predict area AUD.
+    7. For each gene, the gradient similarity (see section ??) between (a) a map of that gene&#8217;s expression on the cortical surface and (b) the
+shape of area AUD was calculated, and this was used to rank the genes.
+    8. 5-fold cross-validation.
+_______________________________________________________________________________________________________
+PHS 398/2590 (Rev. 09/04)                                                    Page    7 ___                                                      Continuation Format Page
+Dimensionality reduction plus K-means or spectral clustering
+Many areas are captured by clusters of genes
+todo
+todo
+Research plan
+todo amongst other things:
+Develop algorithms that find genetic markers for anatomical regions
+1. Develop scoring measures for evaluating how good individual genes are at marking areas: we will compare pointwise,
+geometric, and information-theoretic measures.
+2. Develop a procedure to find single marker genes for anatomical regions: for each cortical area, by using or combining
+the scoring measures developed, we will rank the genes by their ability to delineate each area.
+3. Extend the procedure to handle difficult areas by using combinatorial coding: for areas that cannot be identified by any
+single gene, identify them with a handful of genes.  We will consider both (a) algorithms that incrementally/greedily
+combine single gene markers into sets, such as forward stepwise regression and decision trees, and also (b) supervised
+learning techniques which use soft constraints to minimize the number of features, such as sparse support vector
+machines.
+4. Extend the procedure to handle difficult areas by combining or redrawing the boundaries: An area may be difficult to
+identify because the boundaries are misdrawn, or because it does not &#8220;really&#8221; exist as a single area, at least on the
+genetic level.  We will develop extensions to our procedure which (a) detect when a difficult area could be fit if its
+boundary were redrawn slightly, and (b) detect when a difficult area could be combined with adjacent areas to create
+a larger area which can be fit.
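As an illustration of what an information-theoretic scoring measure from item 1 might look like, the sketch below scores a gene by the mutual information between "gene is expressed at this pixel" and "pixel lies inside the area". This is only one possible reading of "information-theoretic", not a measure the proposal has committed to; the binarization threshold, the function names `mutual_information` and `score_gene`, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(expressed, in_area):
    """Mutual information, in bits, between two equal-length binary sequences."""
    n = len(expressed)
    joint = Counter(zip(expressed, in_area))
    p_expr = Counter(expressed)
    p_area = Counter(in_area)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((p_expr[x] / n) * (p_area[y] / n)))
    return mi


def score_gene(expression, in_area, threshold):
    """Binarize expression at a (hypothetical) threshold, then score by MI."""
    expressed = [value > threshold for value in expression]
    return mutual_information(expressed, in_area)


# Toy data: 8 pixels, the first 4 of which lie inside the area.
in_area = [True] * 4 + [False] * 4
marker_like = [0.9, 0.8, 0.7, 0.9, 0.1, 0.2, 0.1, 0.0]   # high only inside
uninformative = [0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1]  # unrelated to area

good_score = score_gene(marker_like, in_area, threshold=0.5)
bad_score = score_gene(uninformative, in_area, threshold=0.5)
print(good_score, bad_score)
```

Note that mutual information is symmetric: a gene expressed everywhere except the area scores as highly as one expressed only inside it, so a practical ranking would probably also need to record the direction of the association.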
+Apply these algorithms to the cortex
+1. Create open source format conversion tools:  we will create tools to bulk download the ABA dataset and to convert
+between SEV, NIfTI and MATLAB formats.
+2. Flatmap the ABA cortex data: map the ABA data onto a plane and draw the cortical area boundaries onto it.
+3. Find layer boundaries: cluster similar voxels together in order to automatically find the cortical layer boundaries.
+4. Run the procedures that we developed on the cortex: we will present, for each area, a short list of markers to identify
+that area; and we will also present lists of &#8220;panels&#8221; of genes that can be used to delineate many areas at once.
+Develop algorithms to suggest a division of a structure into anatomical parts
+1. Explore dimensionality reduction algorithms applied to pixels: including TODO
+2. Explore dimensionality reduction algorithms applied to genes: including TODO
+3. Explore clustering algorithms applied to pixels: including TODO
+4. Explore clustering algorithms applied to genes: including gene shaving, TODO
+5. Develop an algorithm to use dimensionality reduction and/or hierarchical clustering to create anatomical maps
+6. Run this algorithm on the cortex: present a hierarchical, genoarchitectonic map of the cortex
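To make the "dimensionality reduction plus clustering applied to pixels" idea concrete, here is a minimal sketch that reduces each pixel's gene expression profile with PCA and then groups pixels with K-means. It is not the algorithm the proposal will develop: the specific choice of PCA and K-means, the farthest-point initialization, the function names `pca_reduce` and `kmeans`, and the synthetic three-area dataset are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X (pixels) onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def kmeans(Z, k, n_iter=50, seed=0):
    """Lloyd's algorithm; returns one integer cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialization: start from a random pixel, then repeatedly
    # add the pixel that is farthest from all centers chosen so far.
    centers = [Z[rng.integers(len(Z))]]
    for _ in range(k - 1):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in data (not ABA): 3 hypothetical areas of 50 pixels each,
# 30 genes, every area having its own mean expression profile plus noise.
rng = np.random.default_rng(1)
true_area = np.repeat(np.arange(3), 50)
profiles = rng.normal(0.0, 3.0, size=(3, 30))
X = profiles[true_area] + rng.normal(0.0, 0.5, size=(150, 30))

labels = kmeans(pca_reduce(X, 2), k=3)
print(labels)
```

On real data the cluster labels would be painted back onto the flat map, so that contiguous patches of a single label can be compared against known areal boundaries.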
+_____________________
+    stuff i dunno where to put yet (there is more scattered through grant-oldtext):
-    In anatomy, the manifold of interest is usually either defined by a combina-
-tion of two relevant anatomical axes (todo), or by the surface of the structure
-(as is the case with the cortex).  In the former case, the manifold of interest is
-a plane, but in the latter case it is curved. If the manifold is curved, there are
-various methods for mapping the manifold into a plane.
-    The method that we will develop will begin by mapping the data into a
-2-D plane.  Although the manifold that characterized cortical areas is known
-to be the cortical surface, it remains to be seen which method of mapping the
-manifold into a plane is optimal for this application. We will compare mappings
-which attempt to preserve size (such as the one used by Caret??) with mappings
-which preserve angle (conformal maps).
-    Although there is much 2-D organization in anatomy, there are also struc-
-tures whose shape is fundamentally 3-dimensional.  If possible, we would like
-the method we develop to include a statistical test that warns the user if the
-assumption of 2-D structure seems to be wrong.
-    if we need citations for aim 3 significance,  http://www.sciencedirect.
-com/science?_ob=ArticleURL&amp;_udi=B6WSS-4V70FHY-9&amp;_user=4429&amp;_coverDate=
-12%2F26%2F2008&amp;_rdoc=1&amp;_fmt=full&amp;_orig=na&amp;_cdi=7054&amp;_docanchor=&amp;_acct=
-C000059602&amp;_version=1&amp;_urlVersion=0&amp;_userid=4429&amp;md5=551eccc743a2bfe6e992eee0c3194203#
-app2 has examples of genetic targeting to specific anatomical regions
-    &#8212;
-    note:
-    do we need to cite: no known markers, impressive results?
-                                            14
+    In anatomy, the manifold of interest is usually either defined by a combination of two relevant anatomical axes (todo), or
+by the surface of the structure (as is the case with the cortex). In the former case, the manifold of interest is a plane, but in
+the latter case it is curved. If the manifold is curved, there are various methods for mapping the manifold into a plane.
+    The method that we will develop will begin by mapping the data into a 2-D plane. Although the manifold that
+characterizes cortical areas is known to be the cortical surface, it remains to be seen which method of mapping the manifold into
+a plane is optimal for this application. We will compare mappings which attempt to preserve size (such as the one used by
+Caret??) with mappings which preserve angle (conformal maps).
+Although there is much 2-D organization in anatomy, there are also structures whose shape is fundamentally 3-dimensional.
+If possible, we would like the method we develop to include a statistical test that warns the user if the assumption of 2-D
+structure seems to be wrong.
+if we need citations for aim 3 significance, http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WSS-4V70FHY-9&amp;_user=4429&amp;_coverDate=12%2F26%2F2008&amp;_rdoc=1&amp;_fmt=full&amp;_orig=na&amp;_cdi=7054&amp;_docanchor=&amp;_acct=C000059602&amp;_version=1&amp;_urlVersion=0&amp;_userid=4429&amp;md5=551eccc743a2bfe6e992eee0c3194203#app2
+has examples of genetic targeting to specific anatomical regions
+&#8212;
+note:
+do we need to cite: no known markers, impressive results?