{"id":194,"date":"2013-07-13T00:10:31","date_gmt":"2013-07-13T00:10:31","guid":{"rendered":"http:\/\/blogs.discovery.wisc.edu\/compsysbio\/?p=194"},"modified":"2013-07-13T00:10:31","modified_gmt":"2013-07-13T00:10:31","slug":"random-walk-method-to-find-causal-pathway","status":"publish","type":"post","link":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/2013\/07\/13\/random-walk-method-to-find-causal-pathway\/","title":{"rendered":"random walk method to find causal pathway"},"content":{"rendered":"<p><a href=\"http:\/\/bioinformatics.oxfordjournals.org\/content\/22\/14\/e489.abstract\">This<\/a> paper describes an algorithm to find the paths of causal interactions between two genes. Let&#8217;s say we found a locus that is associated with changes in the expression level of a gene, but the region might contain multiple genes, and even if we know the right gene, we don&#8217;t know how it affects the target gene. The authors use a random walk method to find a path from a possible source gene to the target gene in a biological network.<\/p>\n<p>They build the biological network from protein-protein interactions (represented as undirected edges) and from protein phosphorylation and TF-DNA binding (represented as directed edges). For a target gene g<sub>t<\/sub>, we have the list of TFs, T = (t<sub>1<\/sub>, t<sub>2<\/sub>, &#8230;, t<sub>n<\/sub>), and the candidate causal genes, C = (g<sub>C<sub>1<\/sub><\/sub>, &#8230;, g<sub>C<sub>m<\/sub><\/sub>). For each t<sub>k<\/sub> in T, they initiate a search. 
In the search, when they are at node g, they want to select a gene g<sub>i<\/sub> from the neighborhood of g that is more likely to have a causal relation with the target gene g<sub>t<\/sub>. They therefore use the correlation between\u00a0g<sub>i<\/sub> and\u00a0g<sub>t<\/sub> as the probability of selecting g<sub>i<\/sub>. (They set an upper limit on the correlation value so that genes that might have a causal relation with the target gene, but show lower correlation because of post-translational regulation mechanisms, still have a chance of being selected.) Note that they use the correlation between g<sub>i<\/sub> and the target gene, not between g<sub>i<\/sub> and\u00a0g. Using this method, they start at\u00a0t<sub>k<\/sub>, randomly visit one of its neighbors, and continue the search until they reach a gene in the causal set C, hit a dead end (a gene all of whose neighbors have already been visited), or reach the maximum number of transitions.<\/p>\n<p>If the search reaches a gene g<sub>c<\/sub> in C, they have a path from\u00a0g<sub>c<\/sub> to\u00a0g<sub>t<\/sub>. They repeat this procedure N times for each t<sub>k<\/sub>. They approximate the probability of reaching g<sub>c<\/sub> from t<sub>k<\/sub>, p(g<sub>c<\/sub>, t<sub>k<\/sub>), using the number of times that g<sub>c<\/sub> was visited in searches initiated from t<sub>k<\/sub>. They then calculate the probability of each candidate causal gene g<sub>c<\/sub> as a weighted sum of p(g<sub>c<\/sub>, t<sub>k<\/sub>) over the t<sub>k<\/sub> (weighted by the probability of\u00a0t<sub>k<\/sub> causing g<sub>t<\/sub>). From the set of all candidate causal genes C, they select the gene g<sub>c<\/sub> with the highest probability.<\/p>\n<p>To test their method, they used knock-out experiments to create simulated QTLs. For each knock-out experiment, they select the genes with a significant change in expression level as target genes, and then create a QTL region by grouping the knocked-out gene with 9 other genes. 
Then they run their procedure to select the causal gene. If the algorithm selects the knocked-out gene, the prediction is correct; if it selects another gene, the prediction is incorrect. They reached 46% accuracy. Randomly selecting one of the 10 genes as the causal gene would be expected to succeed 10% of the time, so their method is more than 4 times better than random. (They also ran their method on yeast QTL data from <a href=\"http:\/\/www.pnas.org\/content\/102\/5\/1572.long\">Brem and Kruglyak<\/a> and reported their findings.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This paper describes an algorithm to find the paths of causal interactions between two genes. Let&#8217;s say we found a locus that is associated with changes in the expression level of a gene, but the region might contain multiple genes &hellip; <a href=\"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/2013\/07\/13\/random-walk-method-to-find-causal-pathway\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/posts\/194"}],"collection":[{"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/comments?post=194"}],"version-history":[{"count":4,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/posts\/194\/revisions"}],"predecessor-version":[{"id":196,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/posts\/194\/revisions\/196"}],"wp:attachment":[{"href":"https
:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/media?parent=194"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/categories?post=194"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.discovery.wisc.edu\/compsysbio\/wp-json\/wp\/v2\/tags?post=194"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}