diff grant.txt @ 70:5cdbbf86e10b
.
author: bshanks@bshanks.dyndns.org
date: Mon Apr 20 16:23:22 2009 -0700 (16 years ago)
parents: 60d7c1c1b94f
children: 48dae6cb2c09
--- a/grant.txt	Mon Apr 20 15:08:40 2009 -0700
+++ b/grant.txt	Mon Apr 20 16:23:22 2009 -0700
@@ -1,6 +1,9 @@
+%%\usepackage{floatflt}
+\usepackage{wrapfig}
+
@@ -231,6 +234,13 @@
+\begin{wrapfigure}{L}{0.2\textwidth}\centering
+\includegraphics[scale=.31]{holeExample_2682_SS_jet.eps} 
+\caption{Gene Pitx2 is selectively underexpressed in area SS (somatosensory).}
+\label{hole}\end{wrapfigure}
+
+
+
@@ -244,6 +254,21 @@
+\begin{wrapfigure}{L}{0.4\textwidth}\centering
+%%\includegraphics[scale=.31]{singlegene_SS_corr_top_1_2365_jet.eps}\includegraphics[scale=.31]{singlegene_SS_corr_top_2_242_jet.eps}\includegraphics[scale=.31]{singlegene_SS_corr_top_3_654_jet.eps}
+%%\\
+%%\includegraphics[scale=.31]{singlegene_SS_lr_top_1_654_jet.eps}\includegraphics[scale=.31]{singlegene_SS_lr_top_2_685_jet.eps}\includegraphics[scale=.31]{singlegene_SS_lr_top_3_724_jet.eps}
+%%\caption{Top row: Genes Nfic, A930001M12Rik, C130038G02Rik are the most correlated with area SS (somatosensory cortex). Bottom row: Genes C130038G02Rik, Cacna1i, Car10 are those with the best fit using logistic regression. Within each picture, the vertical axis roughly corresponds to anterior at the top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right. The red outline is the boundary of region MO. Pixels are colored according to correlation, with red meaning high correlation and blue meaning low.}
+
+\includegraphics[scale=.31]{singlegene_SS_corr_top_1_2365_jet.eps}\includegraphics[scale=.31]{singlegene_SS_corr_top_2_242_jet.eps}
+\\
+\includegraphics[scale=.31]{singlegene_SS_lr_top_1_654_jet.eps}\includegraphics[scale=.31]{singlegene_SS_lr_top_2_685_jet.eps}
+
+\caption{Top row: Genes Nfic and A930001M12Rik are the most correlated with area SS (somatosensory cortex). Bottom row: Genes C130038G02Rik and Cacna1i are those with the best fit using logistic regression. Within each picture, the vertical axis roughly corresponds to anterior at the top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right. The red outline is the boundary of region MO. Pixels are colored according to correlation, with red meaning high correlation and blue meaning low.}
+\label{SScorrLr}\end{wrapfigure}
+
+
+
@@ -260,15 +285,12 @@
+
+
-\begin{figure}\centering
-\includegraphics[scale=.31]{holeExample_2682_SS_jet.eps} 
-\caption{Gene Pitx2 is selectively underexpressed in area SS (somatosensory).}
-\label{hole}\end{figure}
-
@@ -277,20 +299,17 @@
-\begin{figure}\centering
-\includegraphics[scale=.31]{singlegene_SS_corr_top_1_2365_jet.eps}
-\includegraphics[scale=.31]{singlegene_SS_corr_top_2_242_jet.eps}
-\includegraphics[scale=.31]{singlegene_SS_corr_top_3_654_jet.eps}
+
+\begin{wrapfigure}{L}{0.4\textwidth}\centering
+%%\includegraphics[scale=.31]{singlegene_AUD_lr_top_1_3386_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_lr_top_2_1258_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_lr_top_3_420_jet.eps}
+%%
+%%\includegraphics[scale=.31]{singlegene_AUD_gr_top_1_2856_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_gr_top_2_420_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_gr_top_3_2072_jet.eps}
+%%\caption{The top row shows the three genes which (individually) best predict area AUD, according to logistic regression. The bottom row shows the three genes which (individually) best match area AUD, according to gradient similarity. From left to right and top to bottom, the genes are $Ssr1$, $Efcbp1$, $Aph1a$, $Ptk7$, $Aph1a$ again, and $Lepr$}
+\includegraphics[scale=.31]{singlegene_AUD_lr_top_1_3386_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_lr_top_2_1258_jet.eps}
-\includegraphics[scale=.31]{singlegene_SS_lr_top_1_654_jet.eps}
-\includegraphics[scale=.31]{singlegene_SS_lr_top_2_685_jet.eps}
-\includegraphics[scale=.31]{singlegene_SS_lr_top_3_724_jet.eps}
-
-
-\caption{Top row: Genes Nfic, A930001M12Rik, C130038G02Rik are the most correlated with area SS (somatosensory cortex). Bottom row: Genes C130038G02Rik, Cacna1i, Car10 are those with the best fit using logistic regression. Within each picture, the vertical axis roughly corresponds to anterior at the top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right. The red outline is the boundary of region MO. Pixels are colored according to correlation, with red meaning high correlation and blue meaning low.}
-\label{SScorrLr}\end{figure}
-
-
+\includegraphics[scale=.31]{singlegene_AUD_gr_top_1_2856_jet.eps}\includegraphics[scale=.31]{singlegene_AUD_gr_top_2_420_jet.eps}
+\caption{The top row shows the two genes which (individually) best predict area AUD, according to logistic regression. The bottom row shows the two genes which (individually) best match area AUD, according to gradient similarity. From left to right and top to bottom, the genes are $Ssr1$, $Efcbp1$, $Ptk7$, and $Aph1a$.}
+\label{AUDgeometry}\end{wrapfigure}
@@ -302,6 +321,14 @@
+\begin{wrapfigure}{L}{0.4\textwidth}\centering
+\includegraphics[scale=.31]{MO_vs_Wwc1_jet.eps}\includegraphics[scale=.31]{MO_vs_Mtif2_jet.eps} 
+
+\includegraphics[scale=.31]{MO_vs_Wwc1_plus_Mtif2_jet.eps} 
+\caption{Upper left: $wwc1$. Upper right: $mtif2$. Lower left: wwc1 + mtif2 (each pixel's value on the lower left is the sum of the corresponding pixels in the upper row).}
+\label{MOcombo}\end{wrapfigure}
+
+
@@ -322,17 +349,15 @@
-\begin{figure}\centering
-\includegraphics[scale=.31]{singlegene_AUD_lr_top_1_3386_jet.eps}
-\includegraphics[scale=.31]{singlegene_AUD_lr_top_2_1258_jet.eps}
-\includegraphics[scale=.31]{singlegene_AUD_lr_top_3_420_jet.eps}
-
-\includegraphics[scale=.31]{singlegene_AUD_gr_top_1_2856_jet.eps}
-\includegraphics[scale=.31]{singlegene_AUD_gr_top_2_420_jet.eps}
-\includegraphics[scale=.31]{singlegene_AUD_gr_top_3_2072_jet.eps}
-\caption{The top row shows the three genes which (individually) best predict area AUD, according to logistic regression. The bottom row shows the three genes which (individually) best match area AUD, according to gradient similarity. From left to right and top to bottom, the genes are $Ssr1$, $Efcbp1$, $Aph1a$, $Ptk7$, $Aph1a$ again, and $Lepr$}
-\label{AUDgeometry}\end{figure}
-
+
+\begin{wrapfigure}{L}{0.4\textwidth}\centering
+\includegraphics[scale=.31]{singlegene_example_2682_Pitx2_SS_jet.eps}\includegraphics[scale=.31]{singlegene_example_371_Aldh1a2_SSs_jet.eps}
+\includegraphics[scale=.31]{singlegene_example_2759_Ppfibp1_PIR_jet.eps}\includegraphics[scale=.31]{singlegene_example_3310_Slco1a5_FRP_jet.eps}
+\includegraphics[scale=.31]{singlegene_example_3709_Tshz2_RSP_jet.eps}\includegraphics[scale=.31]{singlegene_example_3674_Trhr_COApm_jet.eps}
+\includegraphics[scale=.31]{singlegene_example_925_Col12a1_ACA+PL+ILA+DP+ORB+MO_jet.eps}\includegraphics[scale=.31]{singlegene_example_1334_Ets1_post_lat_vis_jet.eps}
+
+\caption{From left to right and top to bottom, single genes which roughly identify areas SS (somatosensory primary + supplemental), SSs (supplemental somatosensory), PIR (piriform), FRP (frontal pole), RSP (retrosplenial), COApm (cortical amygdalar, posterior part, medial zone). Grouping some areas together, we have also found genes to identify the groups ACA+PL+ILA+DP+ORB+MO (anterior cingulate, prelimbic, infralimbic, dorsal peduncular, orbital, motor), posterior and lateral visual (VISpm, VISpl, VISl, VISp; posteromedial, posterolateral, lateral, and primary visual; the posterior and lateral visual area is distinguished from its neighbors, but not from the entire rest of the cortex). The genes are $Pitx2$, $Aldh1a2$, $Ppfibp1$, $Slco1a5$, $Tshz2$, $Trhr$, $Col12a1$, $Ets1$.}
+\label{singleSoFar}\end{wrapfigure}
@@ -341,24 +366,13 @@
-\begin{figure}\centering
-\includegraphics[scale=.31]{singlegene_example_2682_Pitx2_SS_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_371_Aldh1a2_SSs_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_2759_Ppfibp1_PIR_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_3310_Slco1a5_FRP_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_3709_Tshz2_RSP_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_3674_Trhr_COApm_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_925_Col12a1_ACA+PL+ILA+DP+ORB+MO_jet.eps}
-\includegraphics[scale=.31]{singlegene_example_1334_Ets1_post_lat_vis_jet.eps}
-
-\caption{From left to right and top to bottom, single genes which roughly identify areas SS (somatosensory primary + supplemental), SSs (supplemental somatosensory), PIR (piriform), FRP (frontal pole), RSP (retrosplenial), COApm (Cortical amygdalar, posterior part, medial zone). Grouping some areas together, we have also found genes to identify the groups ACA+PL+ILA+DP+ORB+MO (anterior cingulate, prelimbic, infralimbic, dorsal peduncular, orbital, motor), posterior and lateral visual (VISpm, VISpl, VISI, VISp; posteromedial, posterolateral, lateral, and primary visual; the posterior and lateral visual area is distinguished from its neighbors, but not from the entire rest of the cortex). The genes are $Pitx2$, $Aldh1a2$, $Ppfibp1$, $Slco1a5$, $Tshz2$, $Trhr$, $Col12a1$, $Ets1$.}
-\label{singleSoFar}\end{figure}
-
-In Figure \ref{MOcombo}, we give an example of a cortical area which is not marked by any single gene, but which can be identified combinatorially. This shows that our proposal to develop a method to find combinations of marker genes is both possible and necessary.
+In Figure \ref{MOcombo}, we give an example of a cortical area which is not marked by any single gene, but which can be identified combinatorially. According to logistic regression, gene wwc1 is the best-fitting single gene for predicting whether or not a pixel on the cortical surface belongs to the motor area (area MO). The upper-left picture in Figure \ref{MOcombo} shows wwc1's spatial expression pattern over the cortex. The lower-right boundary of MO is represented reasonably well by this gene, but the gene overshoots the upper-left boundary. This flattened 2-D representation does not show it, but the area corresponding to the overshoot is the medial surface of the cortex, whereas MO is only found on the lateral surface. Gene mtif2 is shown in the upper-right picture. Mtif2 captures MO's upper-left boundary, but not its lower-right boundary, and it is not expressed very much on the medial surface. By adding together the values at each pixel in these two pictures, we get the lower-left image. This combination captures area MO much better than any single gene.
+
+This shows that developing a method to find combinations of marker genes, as we propose, is both possible and necessary.
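
The pixelwise combination described above can be sketched in a few lines. This is only an illustration on made-up toy data, not the analysis behind Figure \ref{MOcombo}: the 1-D profiles `gene_a` and `gene_b`, the mask `in_region`, and the helper `pearson` are hypothetical, and Pearson correlation is used here as a simple stand-in for the logistic-regression fit mentioned in the text.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: 8 pixels along a line; the target region is pixels 3..5.
in_region = [0, 0, 0, 1, 1, 1, 0, 0]
# gene_a respects the right-hand boundary but overshoots on the left.
gene_a = [1, 1, 1, 1, 1, 1, 0, 0]
# gene_b respects the left-hand boundary but overshoots on the right.
gene_b = [0, 0, 0, 1, 1, 1, 1, 1]

# Each pixel of the combined map is the sum of the two single-gene values.
combined = [a + b for a, b in zip(gene_a, gene_b)]

score_a = pearson(gene_a, in_region)
score_b = pearson(gene_b, in_region)
score_combined = pearson(combined, in_region)
```

In this toy case each gene alone matches only one boundary of the region, while the sum is highest exactly where both genes are expressed, so `score_combined` exceeds both single-gene scores.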
@@ -367,13 +381,6 @@
-\begin{figure}\centering
-\includegraphics[scale=.36]{MO_vs_Wwc1_jet.eps} 
-\includegraphics[scale=.36]{MO_vs_Mtif2_jet.eps} 
-
-\includegraphics[scale=.36]{MO_vs_Wwc1_plus_Mtif2_jet.eps} 
-\caption{Upper left: $wwc1$. Upper right: $mtif2$. Lower left: wwc1 + mtif2 (each pixel's value on the lower left is the sum of the corresponding pixels in the upper row). Acccording to logistic regression, gene wwc1 is the best fit single gene for predicting whether or not a pixel on the cortical surface belongs to the motor area (area MO). The upper-left picture in Figure \ref{MOcombo} shows wwc1's spatial expression pattern over the cortex. The lower-right boundary of MO is represented reasonably well by this gene, however the gene overshoots the upper-left boundary. This flattened 2-D representation does not show it, but the area corresponding to the overshoot is the medial surface of the cortex. MO is only found on the lateral surface. Gene mtif2 is shown in the upper-right. Mtif2 captures MO's upper-left boundary, but not its lower-right boundary. Mtif2 does not express very much on the medial surface. By adding together the values at each pixel in these two figures, we get the lower-left image. This combination captures area MO much better than any single gene. }
-\label{MOcombo}\end{figure}
@@ -382,40 +389,37 @@
+
-
-\vspace{0.3cm}**SVM on all genes at once**
-
-In order to see how well one can do when looking at all genes at once, we ran a support vector machine to classify cortical surface pixels based on their gene expression profiles. We achieved classification accuracy of about 81%\footnote{5-fold cross-validation.}. This shows that the genes included in the ABA dataset are sufficient to define much of cortical anatomy. As noted above, however, a classifier that looks at all the genes at once isn't as practically useful as a classifier that uses only a few genes. 
-
-
-
-
-
-=== Data-driven redrawing of the cortical map ===
-
-We have applied the following dimensionality reduction algorithms to reduce the dimensionality of the gene expression profile associated with each voxel: Principal Components Analysis (PCA), Simple PCA (SPCA), Multi-Dimensional Scaling (MDS), Isomap, Landmark Isomap, Laplacian eigenmaps, Local Tangent Space Alignment (LTSA), Hessian locally linear embedding, Diffusion maps, Stochastic Neighbor Embedding (SNE), Stochastic Proximity Embedding (SPE), Fast Maximum Variance Unfolding (FastMVU), Non-negative Matrix Factorization (NNMF). Space constraints prevent us from showing many of the results, but as a sample, PCA, NNMF, and landmark Isomap are shown in the second, third, and fourth rows of Figure \ref{dimReduc}.
-
-After applying the dimensionality reduction, we ran clustering algorithms on the reduced data. To date we have tried k-means and spectral clustering. The results of k-means after PCA, NNMF, and landmark Isomap are shown in the last row of Figure \ref{dimReduc}. To compare, the first row of Figure \ref{dimReduc} shows some of the major subdivisions of cortex. These results clearly show that different dimensionality reduction techniques capture different aspects of the data and lead to different clusterings, indicating the utility of our proposal to produce a detailed comparion of these techniques as applied to the domain of genomic anatomy.
-
-\begin{figure}\centering
-\includegraphics[scale=.31]{paint_merge3_major.eps}
-\\
+\begin{wrapfigure}{L}{0.6\textwidth}\centering
-\includegraphics[scale=.31]{merge3_norm_hv_PCA_ndims50_kmeans_7clust.eps}
-\includegraphics[scale=.31]{norm_hv_NNMF_3_norm_kmeans_4clust.eps}
-\includegraphics[scale=.31]{merge3_norm_hv_k150_LandmarkIsomap_ndims7_kmeans_7clust.eps}
-\caption{Top row: 19 of the major subdivisions of the cortex. Second row: the first 6 reduced dimensions, using PCA. Third row: the first 6 reduced dimensions, using NNMF. Fourth row: the first six reduced dimensions, using landmark Isomap. Bottom row: examples of kmeans clustering applied to reduced datasets to find 7 clusters. Left: PCA. Middle: NNMF. Right: Landmark Isomap. Additional details: In the third and fourth rows, 7 dimensions were found, but only 6 displayed. In the last row: for PCA, 50 dimensions were used; for NNMF, 6 dimensions were used; for landmark Isomap, 7 dimensions were used.}
-\label{dimReduc}\end{figure}
-
-todo: nnmf 7
+\includegraphics[scale=.24]{paint_merge3_major.eps}\includegraphics[scale=.22]{merge3_norm_hv_PCA_ndims50_kmeans_7clust.eps}\includegraphics[scale=.24]{norm_hv_NNMF_6_norm_kmeans_7clust.eps}\includegraphics[scale=.22]{merge3_norm_hv_k150_LandmarkIsomap_ndims7_kmeans_7clust.eps}
+\caption{First row: the first 6 reduced dimensions, using PCA. Second row: the first 6 reduced dimensions, using NNMF. Third row: the first 6 reduced dimensions, using landmark Isomap. Bottom row: 19 of the major subdivisions of the cortex (left), followed by examples of k-means clustering applied to reduced datasets to find 7 clusters, using PCA (second from left), NNMF (third from left), and landmark Isomap (right). Additional details: In the second and third rows, 7 dimensions were found, but only 6 displayed. In the last row: for PCA, 50 dimensions were used; for NNMF, 6 dimensions were used; for landmark Isomap, 7 dimensions were used.}
+\label{dimReduc}\end{wrapfigure}
+
+
+\vspace{0.3cm}**SVM on all genes at once**
+
+In order to see how well one can do when looking at all genes at once, we ran a support vector machine to classify cortical surface pixels based on their gene expression profiles. We achieved classification accuracy of about 81%\footnote{5-fold cross-validation.}. This shows that the genes included in the ABA dataset are sufficient to define much of cortical anatomy. As noted above, however, a classifier that looks at all the genes at once isn't as practically useful as a classifier that uses only a few genes. 
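
The 5-fold cross-validation protocol mentioned in the footnote can be sketched as follows. This is a hedged illustration only: the data are made up, all function names are hypothetical, and a nearest-centroid classifier is swapped in for the support vector machine to keep the example dependency-free. It does not reproduce the reported accuracy.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nearest_centroid_fit(X, y):
    """Return {label: centroid} for feature rows X with labels y."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest to row."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))


def cross_val_accuracy(X, y, k=5):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = nearest_centroid_fit(train_X, train_y)
        correct += sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Hypothetical toy data: 10 pixels, 2 genes each, two areas "A" and "B".
X = [[0.9, 0.1], [0.1, 0.8], [0.8, 0.2], [0.2, 0.9], [1.0, 0.0],
     [0.0, 1.0], [0.7, 0.3], [0.3, 0.7], [0.9, 0.2], [0.1, 0.9]]
y = ["A", "B", "A", "B", "A", "B", "A", "B", "A", "B"]

accuracy = cross_val_accuracy(X, y, k=5)
```

Every pixel is used for testing exactly once and is never part of the training set of the model that classifies it, which is what makes the resulting accuracy an estimate of performance on unseen pixels.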
+
+
+
+
+
+=== Data-driven redrawing of the cortical map ===
+
+We have applied the following dimensionality reduction algorithms to reduce the dimensionality of the gene expression profile associated with each voxel: Principal Components Analysis (PCA), Simple PCA (SPCA), Multi-Dimensional Scaling (MDS), Isomap, Landmark Isomap, Laplacian eigenmaps, Local Tangent Space Alignment (LTSA), Hessian locally linear embedding, Diffusion maps, Stochastic Neighbor Embedding (SNE), Stochastic Proximity Embedding (SPE), Fast Maximum Variance Unfolding (FastMVU), Non-negative Matrix Factorization (NNMF). Space constraints prevent us from showing many of the results, but as a sample, PCA, NNMF, and landmark Isomap are shown in the first, second, and third rows of Figure \ref{dimReduc}.
+
+After applying the dimensionality reduction, we ran clustering algorithms on the reduced data. To date we have tried k-means and spectral clustering. The results of k-means after PCA, NNMF, and landmark Isomap are shown in the last row of Figure \ref{dimReduc}. For comparison, the leftmost picture in the bottom row of Figure \ref{dimReduc} shows some of the major subdivisions of cortex. These results show that different dimensionality reduction techniques capture different aspects of the data and lead to different clusterings, indicating the utility of our proposal to produce a detailed comparison of these techniques as applied to the domain of genomic anatomy.
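
The clustering step can be sketched as below. This is a minimal version of k-means (Lloyd's algorithm) under stated assumptions: the `reduced` coordinates are invented stand-ins for the output of a dimensionality reduction, the function name `kmeans` is hypothetical, and the deterministic initialization from the first k points is a simplification chosen for reproducibility rather than the initialization used for Figure \ref{dimReduc}.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical reduced coordinates (2 dimensions per voxel) forming two groups.
reduced = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]

labels, centers = kmeans(reduced, k=2)
```

Because the cluster assignment depends only on distances in the reduced space, different reduction methods applied to the same voxels can yield different clusterings.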
+
+