nsf

diff grant.txt @ 123:5f792263405e

.
author bshanks@bshanks.dyndns.org
date Wed Jul 08 14:21:46 2009 -0700
parents dad49a6f95b6
children
--- a/grant.txt Fri Jul 03 05:17:28 2009 -0700
+++ b/grant.txt Wed Jul 08 14:21:46 2009 -0700
@@ -266,7 +266,7 @@
 
 %%As noted above, there has been much work in the machine learning literature on both supervised and unsupervised learning and there are many available algorithms for each. However, the algorithms require the scientist to provide a framework for representing the problem domain, and the way that this framework is set up has a large impact on performance. Creating a good framework can require creatively reconceptualizing the problem domain, and is not merely a mechanical "fine-tuning" of numerical parameters. For example, we believe that domain-specific scoring measures (such as gradient similarity, which is discussed in Preliminary Results) may be necessary in order to achieve the best results in this application. So, the project involves more than the blind application of existing machine learning analysis programs to a new dataset.
 
-As noted above, the GIS community has developed tools for supervised classification and unsupervised clustering in the context of the analysis of hyperspectral imaging data. One tool is Spectral Python\footnote{http://spectralpython.sourceforge.net/}. Spectral Python implements various supervised and unsupervised classification methods, as well as utility functions for loading, viewing, and saving spatial data. Although Spectral Python has feature extraction methods (such as principal components analysis) which create a small set of new features computed based on the original features, it does not have feature selection methods, that is, methods to select a small subset out of the original features (although feature selection in hyperspectral imaging has been investigated by others\cite{serpico_new_2001}. %%We intend to extend Spectral Python's reportoire of supervised and unsupervised machine learning methods, as well as to add feature selection methods.
+As noted above, the GIS community has developed tools for supervised classification and unsupervised clustering in the context of the analysis of hyperspectral imaging data. One tool is Spectral Python\cite{boggs_spectral_2008}. Spectral Python implements various supervised and unsupervised classification methods, as well as utility functions for loading, viewing, and saving spatial data. Although Spectral Python has feature extraction methods (such as principal components analysis) which create a small set of new features computed based on the original features, it does not have feature selection methods, that is, methods to select a small subset out of the original features (although feature selection in hyperspectral imaging has been investigated by others\cite{serpico_new_2001}. %%We intend to extend Spectral Python's reportoire of supervised and unsupervised machine learning methods, as well as to add feature selection methods.
 
 There is a substantial body of work on the analysis of gene expression data. Most of this concerns gene expression data which are not fundamentally spatial\footnote{By "__fundamentally__ spatial" we mean that there is information from a large number of spatial locations indexed by spatial coordinates; not just data which have only a few different locations or which is indexed by anatomical label.}. Here we review only that work which concerns the automated analysis of spatial gene expression data with respect to anatomy.
 
@@ -368,7 +368,7 @@
 \caption{Upper left: $wwc1$. Upper right: $mtif2$. Lower left: wwc1 + mtif2 (each pixel's value on the lower left is the sum of the corresponding pixels in the upper row).}
 \label{MOcombo}\end{wrapfigure}
 
-We are enthusiastic about the sharing of methods and data, and at the conclusion of the project, we will make all of our data and computer source code publically available, either in supplemental attachments to publications, or on a website. The source code will be released under the GNU Public License. We intend to include a software program which, when run, will take as input the Allen Brain Atlas raw data, and produce as output all numbers and charts found in publications resulting from the project. Source code to be released will include extensions to Caret\cite{van_essen_integrated_2001}, an existing open-source scientific imaging program, and to Spectral Python. Data to be released will include the 2-D "flat map" dataset. This dataset will be submitted to a machine learning dataset repository.
+We are enthusiastic about the sharing of methods and data, and at the conclusion of the project, we will make all of our data and computer source code publicly available, either in supplemental attachments to publications, or on a website. The source code will be released under the GNU Public License. We intend to include a software program which, when run, will take as input the Allen Brain Atlas raw data, and produce as output all numbers and charts found in publications resulting from the project. Source code to be released will include extensions to Caret\cite{van_essen_integrated_2001}, an existing open-source scientific imaging program, and to Spectral Python. Data to be released will include the 2-D "flat map" dataset. This dataset will be submitted to a machine learning dataset repository.
 
 %% Our goal is that replicating our results, or applying the methods we develop to other targets, will be quick and easy for other investigators.
 