cg

changeset 40:cb2ac88dd526
author: bshanks@bshanks.dyndns.org
date: Tue Apr 14 02:50:49 2009 -0700 (16 years ago)
parents: 9365a696c0b8
children: 34e681823d3a
files: grant.doc grant.html grant.odt grant.pdf grant.txt
--- a/grant.html	Tue Apr 14 02:31:37 2009 -0700
+++ b/grant.html	Tue Apr 14 02:50:49 2009 -0700
@@ -269,6 +269,8 @@
+The features and the target area are both functions on the surface pixels.  They can be referred to as scalar fields over
+the space of surface pixels; alternately, they can be thought of as images which can be displayed on the flatmapped surface.
@@ -279,10 +281,8 @@
-The features and the target area are both functions on the surface pixels; alternately, they can be thought of as images
-which can be displayed on the flatmapped surface. One class of feature selection scoring method are those which calculate
-some sort of &#8220;match&#8221; between each gene image and the target image. Those genes which match the best are good candidates
-for features.
+One class of feature selection scoring methods consists of those which calculate some sort of &#8220;match&#8221; between each gene image
+and the target image. Those genes which match the best are good candidates for features.
@@ -292,8 +292,8 @@
-levels over pixels using each of these thresholds: the mean of that gene, the mean minus one standard deviation, the mean
-minus two standard deviations, the mean plus one standard deviation, the mean plus two standard deviations.
+levels using each of these thresholds: the mean of that gene, the mean minus one standard deviation, the mean minus two
+standard deviations, the mean plus one standard deviation, the mean plus two standard deviations.
@@ -311,9 +311,24 @@
-to the shape of the target region.
-had shape of the pattern of expression did not seem to match the shape of the target area.
-todo
+to the shape of the target region. We call this scoring method &#8220;gradient similarity&#8221;.
+One might say that gradient similarity attempts to measure how much the border of the area of gene expression and
+the border of the target region overlap.  However, since gene expression falls off continuously rather than jumping from its
+maximum value to zero, the spatial pattern of a gene&#8217;s expression often does not have a discrete border. Therefore, instead
+of looking for a discrete border, we look for large gradients.  Gradient similarity is a symmetric function over two images
+(i.e. two scalar fields). It is high to the extent that matching pixels which have large values and large gradients also have
+gradients which are oriented in a similar direction. The formula is:
+&#x2211;_{pixel &#x2208; pixels} cos(abs(&#x2220;&#x2207;1 - &#x2220;&#x2207;2)) &#x22C5; (|&#x2207;1| + |&#x2207;2|) / 2 &#x22C5; (pixel_value1 + pixel_value2) / 2
+where &#x2207;1  and &#x2207;2  are the gradient vectors of the two images at the current pixel; &#x2220;&#x2207;i is the angle of the gradient of
+image i at the current pixel; |&#x2207;i| is the magnitude of the gradient of image i at the current pixel; and pixel_valuei is the
+value of the current pixel in image i.
+The intuition is that we want to see if the borders of the pattern in the two images are similar; if the borders are similar,
+then both images will have corresponding pixels with large gradients (because this is a border) which are oriented in a
+similar direction (because the borders are similar).
@@ -327,6 +342,17 @@
+_________________________________________
+   9&#8220;WW, C2 and coiled-coil domain containing 1&#8221;; EntrezGene ID 211652
+   10&#8220;mitochondrial translational initiation factor 2&#8221;; EntrezGene ID 76784
+   11For each gene, a logistic regression in which the response variable was whether or not a surface pixel was within area AUD, and the predictor
+variable was the value of the expression of the gene underneath that pixel. The resulting scores were used to rank the genes in terms of how well
+
+                                     
+                                     
+Figure 2: The top row shows the three genes which (individually) best predict area AUD, according to logistic regression.
+The bottom row shows the three genes which (individually) best match area AUD, according to gradient similarity.  From
+left to right and top to bottom, the genes are Ssr1, Efcbp1, Aph1a, Ptk7, Aph1a again, and Lepr
@@ -335,20 +361,6 @@
-_________________________________________
-   9&#8220;WW, C2 and coiled-coil domain containing 1&#8221;; EntrezGene ID 211652
-   10&#8220;mitochondrial translational initiation factor 2&#8221;; EntrezGene ID 76784
-   11For each gene, a logistic regression in which the response variable was whether or not a surface pixel was within area AUD, and the predictor
-variable was the value of the expression of the gene underneath that pixel. The resulting scores were used to rank the genes in terms of how well
-they predict area AUD.
-   12For each gene the gradient similarity (see section ??) between (a) a map of the expression of each gene on the cortical surface and (b) the
-shape of area AUD, was calculated, and this was used to rank the genes.
-
-                                     
-                                     
-Figure 2: The top row shows the three genes which (individually) best predict area AUD, according to logistic regression.
-The bottom row shows the three genes which (individually) best match area AUD, according to gradient similarity.  From
-left to right and top to bottom, the genes are Ssr1, Efcbp1, Aph1a, Ptk7, Aph1a again, and Lepr
@@ -369,10 +381,13 @@
-todo
-todo
-  135-fold cross-validation.
+they predict area AUD.
+   12For each gene the gradient similarity (see section ??) between (a) a map of the expression of each gene on the cortical surface and (b) the
+shape of area AUD, was calculated, and this was used to rank the genes.
+   135-fold cross-validation.
+todo
+todo
--- a/grant.txt	Tue Apr 14 02:31:37 2009 -0700
+++ b/grant.txt	Tue Apr 14 02:50:49 2009 -0700
@@ -202,6 +202,8 @@
+The features and the target area are both functions on the surface pixels. They can be referred to as scalar fields over the space of surface pixels; alternately, they can be thought of as images which can be displayed on the flatmapped surface. 
+
@@ -218,7 +220,7 @@
-The features and the target area are both functions on the surface pixels; alternately, they can be thought of as images which can be displayed on the flatmapped surface. One class of feature selection scoring method are those which calculate some sort of "match" between each gene image and the target image. Those genes which match the best are good candidates for features.
+One class of feature selection scoring methods consists of those which calculate some sort of "match" between each gene image and the target image. Those genes which match the best are good candidates for features.
@@ -227,7 +229,7 @@
-The simplest way to use information theory is on discrete data, so we discretized our gene expression data by creating, for each gene, five thresholded binary masks of the gene data. For each gene, we created a binary mask of its expression levels over pixels using each of these thresholds: the mean of that gene, the mean minus one standard deviation, the mean minus two standard deviations, the mean plus one standard deviation, the mean plus two standard deviations.
+The simplest way to use information theory is on discrete data, so we discretized our gene expression data by creating, for each gene, five thresholded binary masks of the gene data. For each gene, we created a binary mask of its expression levels using each of these thresholds: the mean of that gene, the mean minus one standard deviation, the mean minus two standard deviations, the mean plus one standard deviation, the mean plus two standard deviations.
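The discretization described above can be sketched as follows. This is a minimal illustration, not the code used for the grant: the function name `threshold_masks` is hypothetical, and two details that the text does not specify are assumed here, namely that a mask pixel is set when expression is strictly greater than the threshold, and that the standard deviation is the population one (`ddof=0`).

```python
import numpy as np

def threshold_masks(expression):
    """Return five binary masks of one gene's expression image.

    Thresholds are mean + k * std for k in (-2, -1, 0, 1, 2).
    Assumption: a pixel is "on" when its value is strictly above the threshold.
    """
    x = np.asarray(expression, dtype=float)
    mu = x.mean()
    sigma = x.std()  # population standard deviation (assumed)
    return {k: x > mu + k * sigma for k in (-2, -1, 0, 1, 2)}

# Toy example: a 1-D "image" of five pixels.
masks = threshold_masks([0.0, 1.0, 2.0, 3.0, 4.0])
for k in sorted(masks):
    print(k, masks[k].astype(int))
```

Each mask can then be compared with the binary target-area image using any discrete information-theoretic score.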
@@ -236,32 +238,15 @@
-We noticed that the previous two scoring methods, which are pointwise, often found genes whose pattern of expression did not look similar in shape to the target region. Fort his reason we designed a non-pointwise local scoring method to detect when a gene had a pattern of expression which looked like it had a boundary whose shape is similar to the shape of the target region.
-
-
-
-had shape of the pattern of expression did not seem to match the shape of the target area. 
-
-todo
-
-
-
-
-\vspace{0.3cm}**Using combinations of multiple genes is necessary and sufficient to delineate some cortical areas**
-
-Here we give an example of a cortical area which is not marked by any single gene, but which can be identified combinatorially. according to logistic regression, gene wwc1\footnote{"WW, C2 and coiled-coil domain containing 1"; EntrezGene ID 211652} is the best fit single gene for predicting whether or not a pixel on the cortical surface belongs to the motor area (area MO). The upper-left picture in Figure \ref{MOcombo} shows wwc1's spatial expression pattern over the cortex. The lower-right boundary of MO is represented reasonably well by this gene, however the gene overshoots the upper-left boundary. This flattened 2-D representation does not show it, but the area corresponding to the overshoot is the medial surface of the cortex. MO is only found on the lateral surface (todo).
-
-Gene mtif2\footnote{"mitochondrial translational initiation factor 2"; EntrezGene ID 76784} is shown in figure the upper-right of Fig. \ref{MOcombo}. Mtif2 captures MO's upper-left boundary, but not its lower-right boundary. Mtif2 does not express very much on the medial surface. By adding together the values at each pixel in these two figures, we get the lower-left of Figure \ref{MOcombo}. This combination captures area MO much better than any single gene. 
-
-\begin{figure}\label{MOcombo}
-\includegraphics[scale=.36]{MO_vs_Wwc1_jet.eps} 
-\includegraphics[scale=.36]{MO_vs_Mtif2_jet.eps} 
-
-\includegraphics[scale=.36]{MO_vs_Wwc1_plus_Mtif2_jet.eps} 
-\caption{Upper left: $wwc1$. Upper right: $mtif2$. Lower left: wwc1 + mtif2 (each pixel's value on the lower left is the sum of the corresponding pixels in the upper row). Within each picture, the vertical axis roughly corresponds to anterior at the top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right. The red outline is the boundary of region MO. Pixels are colored approximately according to the density of expressing cells underneath each pixel, with red meaning a lot of expression and blue meaning little.}
-\end{figure}
-
-
+We noticed that the previous two scoring methods, which are pointwise, often found genes whose pattern of expression did not look similar in shape to the target region. For this reason we designed a non-pointwise local scoring method to detect when a gene has a pattern of expression with a boundary whose shape is similar to the shape of the target region. We call this scoring method "gradient similarity".
+
+One might say that gradient similarity attempts to measure how much the border of the area of gene expression and the border of the target region overlap. However, since gene expression falls off continuously rather than jumping from its maximum value to zero, the spatial pattern of a gene's expression often does not have a discrete border. Therefore, instead of looking for a discrete border, we look for large gradients. Gradient similarity is a symmetric function over two images (i.e. two scalar fields). It is high to the extent that matching pixels which have large values and large gradients also have gradients which are oriented in a similar direction. The formula is:
+
+$$\sum_{pixel \in pixels} \cos(\mathrm{abs}(\angle \nabla_1 - \angle \nabla_2)) \cdot \frac{\vert \nabla_1 \vert + \vert \nabla_2 \vert}{2} \cdot \frac{pixel\_value_1 + pixel\_value_2}{2}$$
+
+where $\nabla_1$ and $\nabla_2$ are the gradient vectors of the two images at the current pixel; $\angle \nabla_i$ is the angle of the gradient of image $i$ at the current pixel; $\vert \nabla_i \vert$ is the magnitude of the gradient of image $i$ at the current pixel; and $pixel\_value_i$ is the value of the current pixel in image $i$.
+
+The intuition is that we want to see if the borders of the pattern in the two images are similar; if the borders are similar, then both images will have corresponding pixels with large gradients (because this is a border) which are oriented in a similar direction (because the borders are similar).
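The gradient similarity formula above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions rather than the implementation used for the results: the function name `gradient_similarity` is hypothetical, gradients are approximated with `np.gradient` (central finite differences, which the text does not specify), and both images are assumed to be full rectangular arrays with no masked pixels.

```python
import numpy as np

def gradient_similarity(img1, img2):
    """Sum over pixels of
    cos(abs(angle1 - angle2)) * (|grad1| + |grad2|) / 2 * (value1 + value2) / 2.
    """
    a = np.asarray(img1, dtype=float)
    b = np.asarray(img2, dtype=float)
    if a.shape != b.shape or a.ndim != 2:
        raise ValueError("both images must be 2-D arrays of the same shape")

    # np.gradient returns the derivative along axis 0 (rows), then axis 1 (columns).
    gy1, gx1 = np.gradient(a)
    gy2, gx2 = np.gradient(b)

    # Gradient magnitude and angle at every pixel of each image.
    mag1, mag2 = np.hypot(gx1, gy1), np.hypot(gx2, gy2)
    ang1, ang2 = np.arctan2(gy1, gx1), np.arctan2(gy2, gx2)

    cos_term = np.cos(np.abs(ang1 - ang2))
    return float(np.sum(cos_term * (mag1 + mag2) / 2.0 * (a + b) / 2.0))

# Toy example: a left-to-right ramp compared with itself and with its mirror image.
ramp = np.tile(np.arange(5.0), (4, 1))
print(gradient_similarity(ramp, ramp))           # gradients aligned: positive score
print(gradient_similarity(ramp, ramp[:, ::-1]))  # gradients opposed: negative score
```

In the toy example the ramp and its mirror image have gradients of equal magnitude pointing in opposite directions, so the cosine term is -1 at every pixel and the score is negative, whereas comparing the ramp with itself gives a positive score.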
@@ -282,6 +267,24 @@
+\vspace{0.3cm}**Using combinations of multiple genes is necessary and sufficient to delineate some cortical areas**
+
+Here we give an example of a cortical area which is not marked by any single gene, but which can be identified combinatorially. According to logistic regression, gene wwc1\footnote{"WW, C2 and coiled-coil domain containing 1"; EntrezGene ID 211652} is the best-fit single gene for predicting whether or not a pixel on the cortical surface belongs to the motor area (area MO). The upper-left picture in Figure \ref{MOcombo} shows wwc1's spatial expression pattern over the cortex. The lower-right boundary of MO is represented reasonably well by this gene; however, the gene overshoots the upper-left boundary. This flattened 2-D representation does not show it, but the area corresponding to the overshoot is the medial surface of the cortex. MO is only found on the lateral surface (todo).
+
+Gene mtif2\footnote{"mitochondrial translational initiation factor 2"; EntrezGene ID 76784} is shown in the upper-right of Figure \ref{MOcombo}. Mtif2 captures MO's upper-left boundary, but not its lower-right boundary. Mtif2 is not expressed very much on the medial surface. By adding together the values at each pixel in these two figures, we get the lower-left of Figure \ref{MOcombo}. This combination captures area MO much better than any single gene.
+
+\begin{figure}
+\includegraphics[scale=.36]{MO_vs_Wwc1_jet.eps} 
+\includegraphics[scale=.36]{MO_vs_Mtif2_jet.eps} 
+
+\includegraphics[scale=.36]{MO_vs_Wwc1_plus_Mtif2_jet.eps} 
+\caption{Upper left: $wwc1$. Upper right: $mtif2$. Lower left: wwc1 + mtif2 (each pixel's value on the lower left is the sum of the corresponding pixels in the upper row). Within each picture, the vertical axis roughly corresponds to anterior at the top and posterior at the bottom, and the horizontal axis roughly corresponds to medial at the left and lateral at the right. The red outline is the boundary of region MO. Pixels are colored approximately according to the density of expressing cells underneath each pixel, with red meaning a lot of expression and blue meaning little.}
+\label{MOcombo}
+\end{figure}
+
+
+
+
+