Search Results

Search found 22292 results on 892 pages for 'image optimization'.

  • Faster way to split a string and count characters using R?

    - by chrisamiller
    I'm looking for a faster way to calculate GC content for DNA strings read in from a FASTA file. This boils down to taking a string and counting the number of times that the letter 'G' or 'C' appears. I also want to specify the range of characters to consider. I have a working function that is fairly slow, and it's causing a bottleneck in my code. It looks like this:

        ## count the number of GCs in the characters between start and stop
        gcCount <- function(line, st, sp) {
          chars = strsplit(as.character(line), "")[[1]]
          numGC = 0
          for (j in st:sp) {
            ## nested ifs faster than an OR (|) construction
            if (chars[[j]] == "g") {
              numGC <- numGC + 1
            } else if (chars[[j]] == "G") {
              numGC <- numGC + 1
            } else if (chars[[j]] == "c") {
              numGC <- numGC + 1
            } else if (chars[[j]] == "C") {
              numGC <- numGC + 1
            }
          }
          return(numGC)
        }

    Running Rprof gives me the following output:

        > a = "GCCCAAAATTTTCCGGatttaagcagacataaattcgagg"
        > Rprof(filename="Rprof.out")
        > for(i in 1:500000){gcCount(a,1,40)};
        > Rprof(NULL)
        > summaryRprof(filename="Rprof.out")

        $by.self
                       self.time self.pct total.time total.pct
        "gcCount"          77.36     76.8     100.74     100.0
        "=="               18.30     18.2      18.30      18.2
        "strsplit"          3.58      3.6       3.64       3.6
        "+"                 1.14      1.1       1.14       1.1
        ":"                 0.30      0.3       0.30       0.3
        "as.logical"        0.04      0.0       0.04       0.0
        "as.character"      0.02      0.0       0.02       0.0

        $by.total
                       total.time total.pct self.time self.pct
        "gcCount"          100.74     100.0     77.36     76.8
        "=="                18.30      18.2     18.30     18.2
        "strsplit"           3.64       3.6      3.58      3.6
        "+"                  1.14       1.1      1.14      1.1
        ":"                  0.30       0.3      0.30      0.3
        "as.logical"         0.04       0.0      0.04      0.0
        "as.character"       0.02       0.0      0.02      0.0

        $sampling.time
        [1] 100.74

    Any advice for making this code faster?
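
    One direction worth benchmarking (a sketch, not the poster's code): let R's vectorized operators do the comparison in one pass instead of looping over characters, or skip strsplit entirely with a regex. Both variants below use only base R:

        ## vectorized: one %in% over the whole character vector
        gcCount2 <- function(line, st, sp) {
          chars <- strsplit(as.character(line), "")[[1]][st:sp]
          sum(chars %in% c("g", "G", "c", "C"))
        }

        ## regex: delete every non-GC character from the substring and
        ## count what is left, with no per-character loop at R level
        gcCount3 <- function(line, st, sp) {
          s <- substr(as.character(line), st, sp)
          nchar(gsub("[^gcGC]", "", s))
        }

    The profile above shows "==" dominating after gcCount itself, one scalar comparison per character per branch; both sketches replace those with a single vectorized operation.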

  • extracting a quadrilateral image to a rectangle

    - by Will
    In the image below, the sign on the side of the van is not face-on to the camera. I want to calculate, as best I can with the pixels I have, what it'd look like face-on. I imagine this is some kind of loop through the x and y axes, doing a Bresenham's line on both dimensions at once, with some kind of mixing when pixels in the source image overlap - some sub-pixel mixing of some sort? What approaches are there, and how do you mix the pixels? Is there a standard approach for this?
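
    What the question describes is a perspective (homography) warp: map the four corners of the quadrilateral onto the four corners of a rectangle and let the interpolation flag do the sub-pixel mixing. A minimal sketch with OpenCV - the filename and corner coordinates here are made up for illustration:

        import cv2
        import numpy as np

        img = cv2.imread("van.jpg")  # hypothetical source photo

        # four corners of the sign in the photo (clockwise from top-left),
        # and the rectangle they should map onto
        src = np.float32([[57, 135], [290, 113], [292, 244], [59, 270]])
        dst = np.float32([[0, 0], [240, 0], [240, 140], [0, 140]])

        M = cv2.getPerspectiveTransform(src, dst)
        flat = cv2.warpPerspective(img, M, (240, 140), flags=cv2.INTER_LINEAR)
        cv2.imwrite("sign_flat.png", flat)

    INTER_LINEAR blends the four nearest source pixels for each output pixel; INTER_CUBIC mixes a 4x4 neighbourhood if a smoother result is worth the extra cost.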

  • Code runs 6 times slower with 2 threads than with 1

    - by Edward Bird
    So I have written some code to experiment with threads and do some testing. The code should create some numbers and then find the mean of those numbers. I think it is just easier to show you what I have so far. I was expecting with two threads that the code would run about 2 times as fast. Measuring it with a stopwatch I think it runs about 6 times slower!

        #include <cstdint>
        #include <cstdlib>
        #include <iostream>
        #include <thread>
        #include <vector>

        void findmean(std::vector<double>*, std::size_t, std::size_t, double*);

        int main(int argn, char** argv)
        {
            // Program entry point
            std::cout << "Generating data..." << std::endl;

            // Create a vector containing many variables
            std::vector<double> data;
            for(uint32_t i = 1; i <= 1024 * 1024 * 128; i ++)
                data.push_back(i);

            // Calculate mean using 1 core
            double mean = 0;
            std::cout << "Calculating mean, 1 Thread..." << std::endl;
            findmean(&data, 0, data.size(), &mean);
            mean /= (double)data.size();

            // Print result
            std::cout << "  Mean=" << mean << std::endl;

            // Repeat, using two threads
            std::vector<std::thread> thread;
            std::vector<double> result;
            result.push_back(0.0);
            result.push_back(0.0);
            std::cout << "Calculating mean, 2 Threads..." << std::endl;

            // Run threads
            uint32_t halfsize = data.size() / 2;
            uint32_t A = 0;
            uint32_t B, C, D;

            // Split the data into two blocks
            if(data.size() % 2 == 0)
            {
                B = C = D = halfsize;
            }
            else if(data.size() % 2 == 1)
            {
                B = C = halfsize;
                D = halfsize + 1;
            }

            // Run with two threads
            thread.push_back(std::thread(findmean, &data, A, B, &(result[0])));
            thread.push_back(std::thread(findmean, &data, C, D, &(result[1])));

            // Join threads
            thread[0].join();
            thread[1].join();

            // Calculate result
            mean = result[0] + result[1];
            mean /= (double)data.size();

            // Print result
            std::cout << "  Mean=" << mean << std::endl;

            // Return
            return EXIT_SUCCESS;
        }

        void findmean(std::vector<double>* datavec, std::size_t start, std::size_t length, double* result)
        {
            for(uint32_t i = 0; i < length; i ++)
            {
                *result += (*datavec).at(start + i);
            }
        }

    I don't think this code is exactly wonderful, so if you could suggest ways of improving it I would be grateful for that also.
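
    The usual explanation for a slowdown of this shape (offered as a likely cause, not a verified diagnosis) is that both threads write through result[0] and result[1] on every iteration; the two doubles sit on the same cache line, so the cores continually invalidate each other's caches (false sharing), and writing through the pointer also stops the compiler from keeping the running sum in a register. Accumulating locally and writing once avoids both problems:

        void findmean(std::vector<double>* datavec, std::size_t start,
                      std::size_t length, double* result)
        {
            double local = 0.0;                  // lives in a register, not a shared cache line
            for(std::size_t i = 0; i < length; i ++)
                local += (*datavec)[start + i];  // operator[] also skips at()'s bounds check
            *result = local;                     // single write at the end
        }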

  • Google Web Optimizer -- How long until winning combination?

    - by Django Reinhardt
    I've had an A/B test running in Google Web Optimizer for six weeks now, and there's still no end in sight. Google is still saying:

        "We have not gathered enough data yet to show any significant results.
        When we collect more data we should be able to show you a winning
        combination."

    Is there any way of telling how close Google is to making up its mind? (Does anyone know what algorithm it uses to decide if there have been any "high confidence winners"?) According to the Google help documentation:

        Sometimes we simply need more data to be able to reach a level of high
        confidence. A tested combination typically needs around 200 conversions
        for us to judge its performance with certainty.

    But all of our combinations have over 200 conversions at the moment:

        230 / 4061 (Original)
        223 / 3937 (Variation 1)
        205 / 3984 (Variation 2)
        205 / 4007 (Variation 3)

    How much longer is it going to have to run?? Thanks for any help.
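
    You can sanity-check the numbers yourself with a two-proportion z-test (a rough sketch; Google's actual stopping rule is not public):

        from math import sqrt

        def z_score(c1, n1, c2, n2):
            """Two-proportion z-test: |z| >= 1.96 is roughly 95% confidence."""
            p1, p2 = c1 / float(n1), c2 / float(n2)
            p = (c1 + c2) / float(n1 + n2)               # pooled rate
            se = sqrt(p * (1 - p) * (1.0 / n1 + 1.0 / n2))
            return (p1 - p2) / se

        print(z_score(230, 4061, 223, 3937))  # original vs. variation 1: ~0.0
        print(z_score(230, 4061, 205, 3984))  # original vs. variation 2: ~1.0

    The original and variation 1 both convert at about 5.66%, and even the largest gap (5.66% vs. about 5.12%) only reaches a z of roughly 1, well short of 1.96. On these numbers the combinations are statistically indistinguishable, which would explain why no winner gets declared however long the test runs.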

  • Rewriting a for loop in pure NumPy to decrease execution time

    - by Statto
    I recently asked about trying to optimise a Python loop for a scientific application, and received an excellent, smart way of recoding it within NumPy which reduced execution time by a factor of around 100 for me! However, calculation of the B value is actually nested within a few other loops, because it is evaluated at a regular grid of positions. Is there a similarly smart NumPy rewrite to shave time off this procedure? I suspect the performance gain for this part would be less marked, and the disadvantages would presumably be that it would not be possible to report back to the user on the progress of the calculation, that the results could not be written to the output file until the end of the calculation, and possibly that doing this in one enormous step would have memory implications? Is it possible to circumvent any of these?

        import numpy as np
        import time

        def reshape_vector(v):
            b = np.empty((3,1))
            for i in range(3):
                b[i][0] = v[i]
            return b

        def unit_vectors(r):
            return r / np.sqrt((r*r).sum(0))

        def calculate_dipole(mu, r_i, mom_i):
            relative = mu - r_i
            r_unit = unit_vectors(relative)
            A = 1e-7
            num = A*(3*np.sum(mom_i*r_unit, 0)*r_unit - mom_i)
            den = np.sqrt(np.sum(relative*relative, 0))**3
            B = np.sum(num/den, 1)
            return B

        N = 20000                        # number of dipoles
        r_i = np.random.random((3,N))    # positions of dipoles
        mom_i = np.random.random((3,N))  # moments of dipoles
        a = np.random.random((3,3))      # three basis vectors for this crystal
        n = [10,10,10]                   # points at which to evaluate sum
        gamma_mu = 135.5                 # a constant

        t_start = time.clock()
        for i in range(n[0]):
            r_frac_x = np.float(i)/np.float(n[0])
            r_test_x = r_frac_x * a[0]
            for j in range(n[1]):
                r_frac_y = np.float(j)/np.float(n[1])
                r_test_y = r_frac_y * a[1]
                for k in range(n[2]):
                    r_frac_z = np.float(k)/np.float(n[2])
                    r_test = r_test_x + r_test_y + r_frac_z * a[2]
                    r_test_fast = reshape_vector(r_test)
                    B = calculate_dipole(r_test_fast, r_i, mom_i)
                    omega = gamma_mu*np.sqrt(np.dot(B,B))
                    # write r_test, B and omega to a file
            frac_done = np.float(i+1)/(n[0]+1)
            t_elapsed = (time.clock()-t_start)
            t_remain = (1-frac_done)*t_elapsed/frac_done
            print frac_done*100,'% done in',t_elapsed/60.,'minutes...approximately',t_remain/60.,'minutes remaining'
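
    One approach (a sketch reusing r_i, mom_i, a, n and gamma_mu from the code above, and assuming the grid is small enough to batch): build all grid points up front, then broadcast the dipole calculation over a whole batch of points at once, chunking to bound the size of the (3, P, N) temporaries. Progress reporting and file writes can happen per chunk, which addresses two of the three worries:

        def calculate_dipole_batch(mus, r_i, mom_i):
            # mus: (3, P) grid points; r_i, mom_i: (3, N) dipoles
            rel = mus[:, :, None] - r_i[:, None, :]          # (3, P, N)
            r_len = np.sqrt((rel * rel).sum(axis=0))         # (P, N)
            r_unit = rel / r_len
            A = 1e-7
            dot = (mom_i[:, None, :] * r_unit).sum(axis=0)   # (P, N)
            num = A * (3 * dot * r_unit - mom_i[:, None, :])
            return (num / r_len**3).sum(axis=2)              # (3, P)

        # all fractional grid coordinates at once, mapped through the basis a
        fracs = np.mgrid[0:n[0], 0:n[1], 0:n[2]].reshape(3, -1) \
                / np.array(n, dtype=float)[:, None]
        points = np.dot(a.T, fracs)                          # (3, P) test positions

        for chunk in np.array_split(points, 10, axis=1):     # cap the (3, P, N) temporaries
            B = calculate_dipole_batch(chunk, r_i, mom_i)    # (3, chunk_size)
            omega = gamma_mu * np.sqrt((B * B).sum(axis=0))  # one omega per grid point
            # write this chunk's points, B and omega to file; report progress here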

  • Optimize a MySQL query that counts each duplicate

    - by Onema
    I have the following query that gets the city name, city ID, the region name, and a count of duplicate names for that record:

        SELECT Country_CA.City AS currentCity, Country_CA.CityID, globe_region.region_name,
               (SELECT count(Country_CA.City)
                  FROM Country_CA
                 WHERE City LIKE currentCity) as counter
          FROM Country_CA
          LEFT JOIN globe_region
            ON globe_region.region_id = Country_CA.RegionID
           AND globe_region.country_code = Country_CA.CountryCode
         ORDER BY City

    This example is for Canada, and the cities will be displayed on a dropdown list. There are a few towns in Canada, and in other countries, that have the same names. Therefore, if there is more than one town with the same name, the region name will be appended to the town name. Region names are found in the globe_region table. Country_CA and globe_region look similar to this (I have changed a few things for visualization purposes):

        CREATE TABLE IF NOT EXISTS `Country_CA` (
          `City` varchar(75) NOT NULL DEFAULT '',
          `RegionID` varchar(10) NOT NULL DEFAULT '',
          `CountryCode` varchar(10) NOT NULL DEFAULT '',
          `CityID` int(11) NOT NULL DEFAULT '0',
          PRIMARY KEY (`City`,`RegionID`),
          KEY `CityID` (`CityID`)
        ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

    AND

        CREATE TABLE IF NOT EXISTS `globe_region` (
          `country_code` char(2) COLLATE utf8_unicode_ci NOT NULL,
          `region_code` char(2) COLLATE utf8_unicode_ci NOT NULL,
          `region_name` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
          PRIMARY KEY (`country_code`,`region_code`)
        ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    The query on the top does exactly what I want it to do, but it takes way too long to generate a list for 5000 records. I would like to know if there is a way to optimize the sub-query in order to obtain the same results faster. The results should look like this:

        City        CityID   region_name       counter
        sheraton    2349269  British Columbia  1
        sherbrooke  2349270  Quebec            2
        sherbrooke  2349271  Nova Scotia       2
        shere       2349273  British Columbia  1
        sherridon   2349274  Manitoba          1
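
    A common rewrite (sketched here against the tables above) is to compute all the per-city counts once in a derived table with GROUP BY and join it in, instead of running one correlated subquery per output row:

        SELECT c.City AS currentCity, c.CityID, r.region_name, d.counter
          FROM Country_CA c
          LEFT JOIN globe_region r
            ON r.region_id = c.RegionID
           AND r.country_code = c.CountryCode
          JOIN (SELECT City, COUNT(*) AS counter
                  FROM Country_CA
                 GROUP BY City) d
            ON d.City = c.City
         ORDER BY c.City;

    The derived table is built in one pass over Country_CA (the PRIMARY KEY on (City, RegionID) can satisfy the GROUP BY), and the equality join replaces the per-row LIKE comparison.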

  • How to index a date column with null values?

    - by Heinz Z.
    How should I index a date column when some rows have null values? We have to select rows between a date range and rows with null dates. We use Oracle 9.2 and higher.

    Options I found:

    1. Using a bitmap index on the date column
    2. Using an index on the date column and an index on a state field whose value is 1 when the date is null
    3. Using an index on the date column and another guaranteed not-null column

    My thoughts on the options are:

    On 1: the column has too many different values to use a bitmap index.
    On 2: I would have to add a field just for this purpose and change the query whenever I want to retrieve the null date rows.
    On 3: it looks tricky to add a field to an index which is not really needed.

    What is the best practice for this case? Thanks in advance.

    Some info I have read:

    - Oracle Date Index
    - When does Oracle index null column values?
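
    One Oracle idiom worth knowing here (a sketch; the table and column names are made up): only entirely-null keys are left out of a B-tree index, so a composite index on the date column plus a constant indexes every row, including those with null dates. It serves both the range scan and the IS NULL lookup without adding any column to the table:

        CREATE INDEX orders_ship_date_ix ON orders (ship_date, 1);

        -- range scan between dates works as with a plain index:
        SELECT * FROM orders
         WHERE ship_date BETWEEN DATE '2010-01-01' AND DATE '2010-12-31';

        -- and rows with null dates are now reachable through the index too:
        SELECT * FROM orders WHERE ship_date IS NULL;

    Option 3 from the list achieves the same effect when a reliable not-null column already exists; the constant simply removes the dependency on finding one.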

  • Database indexes - what should they be

    - by WebweaverD
    Most of my database tables have a clear unique index through which lookups are done 90% of the time, but I am a bit unsure on this one. I have a table which keeps track of user rating totals for items in my database. I now want to add another table to track individual ratings, with an IP address column to make sure no one can rate something twice. Since I can see this becoming a big, high-use table, it is important to optimize it correctly. (MySQL table)

    This table will have the following fields:

    - rating_id (always - unique)
    - item_id (always - not unique)
    - user_id (optional - not unique)
    - ip_address (always - not unique)
    - rating_value (always - not unique)
    - has_review (bool)

    Now I envision 90% of the queries going something like this:

    - When a user rates something: select where item_id = x and ip_address = y, (if rows = 0) insert rating
    - When in user account pages: select where ip_address = x or username = y

    Now none of the fields searched on are unique. Can I still use them as indexes (for example item_id and ip_address), can I have two indexes, and will this still improve performance over a non-indexed table?
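
    Non-unique columns index perfectly well. A sketch of the DDL the query list above suggests (the table name is made up): a composite key on (item_id, ip_address) covers the first query, and since that pair should occur at most once, it can be declared UNIQUE, which also enforces the no-double-rating rule in the database itself:

        ALTER TABLE item_ratings
          ADD UNIQUE KEY uq_item_ip (item_id, ip_address),
          ADD KEY ix_ip_address (ip_address),   -- account-page lookups by IP
          ADD KEY ix_user_id (user_id);         -- account-page lookups by user

    With the unique key in place, the select-then-insert pair can also collapse into a single INSERT IGNORE (or INSERT ... ON DUPLICATE KEY UPDATE) statement.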

  • Should not a tail-recursive function also be faster?

    - by Balint Erdi
    I have the following Clojure code to calculate a number with a certain "factorable" property. (What exactly the code does is secondary.)

        (defn factor-9
          ([]
            (let [digits (take 9 (iterate #(inc %) 1))
                  nums (map (fn [x] (Integer. (apply str x))) (permutations digits))]
              (some (fn [x] (and (factor-9 x) x)) nums)))
          ([n]
            (or (= 1 (count (str n)))
                (and (divisible-by-length n) (factor-9 (quot n 10))))))

    Now, I'm into TCO and realize that Clojure can only provide tail-recursion if explicitly told so using the recur keyword. So I've rewritten the code to do that (replacing factor-9 with recur being the only difference):

        (defn factor-9
          ([]
            (let [digits (take 9 (iterate #(inc %) 1))
                  nums (map (fn [x] (Integer. (apply str x))) (permutations digits))]
              (some (fn [x] (and (factor-9 x) x)) nums)))
          ([n]
            (or (= 1 (count (str n)))
                (and (divisible-by-length n) (recur (quot n 10))))))

    To my knowledge, TCO has a double benefit. The first one is that it does not use the stack as heavily as a non-tail-recursive call and thus does not blow it on larger recursions. The second, I think, is that consequently it's faster since it can be converted to a loop. Now, I've made a very rough benchmark and have not seen any difference between the two implementations. Am I wrong in my second assumption, or does this have something to do with running on the JVM (which does not have automatic TCO) and recur using a trick to achieve it? Thank you.
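
    The benchmark result makes more sense once the two styles are isolated on a deep recursion: recur compiles to a jump, so its win is stack safety much more than raw speed, and at this function's recursion depth (one frame per digit) there is little to measure. A minimal pair that shows the real difference (a sketch using nothing beyond clojure.core):

        (defn count-down [n]          ; self-call: one stack frame per step
          (if (zero? n) :done (count-down (dec n))))

        (defn count-down-r [n]        ; recur: compiled to a loop, constant stack
          (if (zero? n) :done (recur (dec n))))

        (time (count-down-r 100000000))   ; runs fine
        ;; (count-down 100000000)         ; throws StackOverflowError long before finishing

    Per iteration the two bodies do essentially the same work; the JVM's lack of automatic TCO mostly costs stack, not cycles, which would explain why the rough benchmark saw no speed difference.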

  • most efficient method of turning multiple 1D arrays into columns of a 2D array

    - by Ty W
    As I was writing a for loop earlier today, I thought that there must be a neater way of doing this... so I figured I'd ask. I looked briefly for a duplicate question but didn't see anything obvious.

    The problem: given N arrays of length M, turn them into an M-row by N-column 2D array.

    Example:

        $id = [1,5,2,8,6]
        $name = [a,b,c,d,e]
        $result = [[1,a], [5,b], [2,c], [8,d], [6,e]]

    My solution: pretty straightforward and probably not optimal, but it does work:

        <?php
        // $row is returned from a DB query
        // $row['<var>'] is a comma separated string of values
        $categories = array();
        $ids = explode(",", $row['ids']);
        $names = explode(",", $row['names']);
        $titles = explode(",", $row['titles']);

        for($i = 0; $i < count($ids); $i++) {
            $categories[] = array("id" => $ids[$i], "name" => $names[$i], "title" => $titles[$i]);
        }
        ?>

    Note: I didn't put the name = value bit in the spec, but it'd be awesome if there was some way to keep that as well.
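
    For the plain zip (no string keys), PHP has this built in: array_map with null as the callback pairs the arrays up element by element. Keeping the keys needs a callback; a sketch of both (the second form requires PHP 5.3+ for the anonymous function):

        <?php
        // [[1,'a'], [5,'b'], ...] -- the transpose in one call
        $result = array_map(null, $ids, $names, $titles);

        // same shape as the for loop above, keys included
        $categories = array_map(function ($id, $name, $title) {
            return array("id" => $id, "name" => $name, "title" => $title);
        }, $ids, $names, $titles);
        ?>

    Whether this beats the for loop on speed is worth benchmarking; its main advantage is that the intent is visible at a glance.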

  • when is Java faster than C++ (or when is JIT faster than precompiled)?

    - by kostja
    I have heard that under certain circumstances, Java programs, or rather parts of Java programs, are able to be executed faster than the "same" code in C++ (or other precompiled code) due to JIT optimizations. This is due to the compiler being able to determine the scope of some variables, avoid some conditionals and pull similar tricks at runtime. Could you give an (or better - some) example where this applies? And maybe outline the exact conditions under which the compiler is able to optimize the bytecode beyond what is possible with precompiled code?

    NOTE: This question is not about comparing Java to C++. It's about the possibilities of JIT compiling. Please no flaming. I am also not aware of any duplicates. Please point them out if you are.
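
    The textbook example (offered as an illustration, not a benchmark claim) is devirtualization: when a polymorphic call site only ever sees one concrete type at runtime, HotSpot can inline the virtual call, something an ahead-of-time compiler can only do when it proves the type statically:

        // the call site in run() is virtual in the bytecode, but if Doubler is
        // the only Op the JIT ever observes there, HotSpot can devirtualize
        // and inline apply() -- and deoptimize later if a second type shows up
        interface Op { int apply(int x); }

        class Doubler implements Op {
            public int apply(int x) { return x * 2; }
        }

        public class JitDemo {
            static long run(Op op, int n) {
                long acc = 0;
                for (int i = 0; i < n; i++) acc += op.apply(i);  // monomorphic site
                return acc;
            }
            public static void main(String[] args) {
                System.out.println(run(new Doubler(), 100000000));
            }
        }

    The general condition is the same in every such case: the JIT optimizes for what the program actually does in this run (observed types, taken branches, loaded classes) and keeps a deoptimization escape hatch, while a precompiled binary must stay correct for everything the source might do.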

  • Oracle SQL query taking a day to return results using dblink

    - by Suresh S
    Guys, I have the following Oracle SQL query that gives me a month-wise report between two dates. Basically, for the month of November I want the sum of values between the dates 01 Nov and 30 Nov. The table being queried resides in another database and is accessed using a dblink. The DT column is of NUMBER type (for example, 20101201). The query has been executing for a day and has not completed. Kindly suggest any optimization that can be passed to my DBA for the dblink, any tuning that can be done on the query, or a rewrite of the same.

        SELECT /*+ PARALLEL (A 8) */
               TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM')-1,'MM'),'MONYYYY') "MONTH",
               TYPE AS "TYPE",
               COLUMN,
               COUNT (DISTINCT A) AS "A_COUNT",
               COUNT (COLUMN) AS NO_OF_COLS,
               SUM (DURATION) AS "SUM_DURATION",
               SUM (COST) AS "COST"
          FROM A@LN_PROD A
         WHERE DT >= TO_NUMBER(TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM')-1,'MM'),'YYYYMMDD'))
           AND DT <  TO_NUMBER(TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM'),'MM'),'YYYYMMDD'))
         GROUP BY TYPE, COLUMN
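
    One thing worth trying before anything else (a sketch, not a guaranteed fix): when every table in the query lives on the remote database, the DRIVING_SITE hint asks Oracle to run the aggregation remotely and ship back only the grouped result, instead of dragging raw rows across the link first. Hints like PARALLEL may also be ignored across a dblink unless they take effect on the driving site:

        SELECT /*+ DRIVING_SITE(A) */
               TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM')-1,'MM'),'MONYYYY') "MONTH",
               TYPE AS "TYPE",
               COLUMN,
               COUNT (DISTINCT A) AS "A_COUNT",
               COUNT (COLUMN) AS NO_OF_COLS,
               SUM (DURATION) AS "SUM_DURATION",
               SUM (COST) AS "COST"
          FROM A@LN_PROD A
         WHERE DT >= TO_NUMBER(TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM')-1,'MM'),'YYYYMMDD'))
           AND DT <  TO_NUMBER(TO_CHAR(TRUNC(TRUNC(SYSDATE,'MM'),'MM'),'YYYYMMDD'))
         GROUP BY TYPE, COLUMN

    Beyond that, an index on DT at the remote site (or confirming one is being used) matters more than anything local, since the predicate is a simple range on a NUMBER column.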

  • Does the <script> tag position in HTML affect performance of the webpage?

    - by Rahul Joshi
    If the script tag is above or below the body in an HTML page, does it matter for the performance of a website? And what if it is used in between, like this:

        <body>
        ..blah..blah..
        <script language="JavaScript" src="JS_File_100_KiloBytes">
        function f1() {
           .. some logic reqd. for manipulating contents in a webpage
        }
        </script>
        ... some text here too ...
        </body>

    Or is this better?:

        <script language="JavaScript" src="JS_File_100_KiloBytes">
        function f1() {
           .. some logic reqd. for manipulating contents in a webpage
        }
        </script>
        <body>
        ..blah..blah..
        ..call above functions on some events like onclick,onfocus,etc..
        </body>

    Or this one?:

        <body>
        ..blah..blah..
        ..call above functions on some events like onclick,onfocus,etc..
        <script language="JavaScript" src="JS_File_100_KiloBytes">
        function f1() {
           .. some logic reqd. for manipulating contents in a webpage
        }
        </script>
        </body>

    Needless to say, everything is again inside the <html> tag! How does it affect performance of the webpage while loading? Does it really? Which one is the best, either out of these 3 or some other which you know?

    And one more thing: I googled a bit on this, which took me to Best Practices for Speeding Up Your Web Site, and it suggests putting scripts at the bottom. But traditionally many people put them in the <head> tag, which is above the <body> tag. I know it's NOT a rule, but many prefer it that way. If you don't believe it, just view the source of this page! And tell me what's the better style for best performance.
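
    A side note on the modern options (a sketch; whether they fit depends on what the script does): script elements block HTML parsing where they appear, which is why bottom-of-body loads faster, but the defer and async attributes give the same effect while keeping the tag in the head:

        <head>
          <!-- downloads in parallel with parsing; executes in order,
               after the document is parsed (like bottom-of-body) -->
          <script src="JS_File_100_KiloBytes" defer></script>

          <!-- downloads in parallel; executes as soon as it arrives, in no
               guaranteed order -- only for independent scripts (name made up) -->
          <script src="analytics.js" async></script>
        </head>

    Also worth noting: a script tag with a src attribute ignores any inline code between its opening and closing tags, so the f1 in the snippets above would never be defined; the inline code needs its own script element.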

  • Completely remove ViewState for specific pages

    - by Kerido
    Hi everybody, I have a site that features some pages which do not require any post-back functionality. They simply display static HTML and don't even have any associated code. However, since the Master Page has a <form runat="server"> tag which wraps all ContentPlaceHolders, the resulting HTML always contains the ViewState field, i.e.:

        <input type="hidden" id="__VIEWSTATE" value="/wEPDwUKMjEwNDQyMTMxM2Rk0XhpfvawD3g+fsmZqmeRoPnb9kI=" />

    I realize that, when decrypted, this string corresponds to the <form> tag which I cannot remove. However, I would still like to remove the ViewState field for pages that only display static HTML. Is it possible?
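
    A sketch of one approach (assumes .NET 2.0+; the class name is made up): pages that never post back can override the two persistence hooks on System.Web.UI.Page so that nothing is saved or loaded, and static pages inherit from this class instead of Page:

        using System.Web.UI;

        // base class for static pages: drops the persisted view state entirely
        public class StaticPage : Page
        {
            protected override object LoadPageStateFromPersistenceMedium()
            {
                return null;  // nothing to restore on (nonexistent) postbacks
            }

            protected override void SavePageStateToPersistenceMedium(object state)
            {
                // intentionally empty: nothing gets serialized into __VIEWSTATE
            }
        }

    The runtime may still emit a small hidden field for control state, but the long Base64 payload disappears. (Strictly speaking the value is Base64-encoded and optionally MAC-signed rather than encrypted, which is why tools can decode it.)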

  • How to optimize a database suggestion engine

    - by Dimitar Vouldjeff
    Hi, I'm making an online engine for item-to-item movie recommendations. I have done some research, and I think the best way to implement it is using Pearson correlation, with a table holding item1, item2 and correlation fields. The problem is that after each rating of an item, I have to regenerate correlations for, in the worst case, N records (where N is the number of items). Another thing I read is the following article, but I haven't thought of a way to implement it. So what is your suggestion for optimizing this process? Or do you have any other suggestions? Thanks.
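
    One observation that bounds the work (a sketch, assuming a users-by-items rating matrix): a new rating of item i only changes correlations in which i participates, so only row i of the item-item table needs recomputing, not all N*(N-1)/2 pairs:

        import numpy as np

        def pearson_row(R, i):
            """Recompute item i's correlation against every other item.

            R: users x items matrix with np.nan for missing ratings.
            Returns a length-n_items vector (nan where too little overlap).
            """
            n_items = R.shape[1]
            out = np.full(n_items, np.nan)
            rated_i = ~np.isnan(R[:, i])
            for j in range(n_items):
                both = rated_i & ~np.isnan(R[:, j])
                if both.sum() < 2:
                    continue  # need at least two co-raters
                out[j] = np.corrcoef(R[both, i], R[both, j])[0, 1]
            return out

    Keeping per-item running sums (sum, sum of squares, co-rating counts) shrinks this further: each sum updates in O(1) per new rating, and the correlation can be recomputed from the sums on demand.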

  • "Anagram solver" based on statistics rather than a dictionary/table?

    - by James M.
    My problem is conceptually similar to solving anagrams, except I can't just use a dictionary lookup. I am trying to find plausible words rather than real words. I have created an N-gram model (for now, N=2) based on the letters in a bunch of text. Now, given a random sequence of letters, I would like to permute them into the most likely sequence according to the transition probabilities. I thought I would need the Viterbi algorithm when I started this, but as I look deeper, the Viterbi algorithm optimizes a sequence of hidden random variables based on the observed output. I am trying to optimize the output sequence. Is there a well-known algorithm for this that I can read about? Or am I on the right track with Viterbi and I'm just not seeing how to apply it?
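
    For short letter sequences, brute force over permutations with a bigram log-probability score is a workable baseline (a sketch; bigram_p is a hypothetical model mapping a letter pair to P(second | first)):

        from itertools import permutations
        from math import log

        def most_plausible(letters, bigram_p, floor=1e-9):
            """Return the ordering with the highest summed bigram log-probability."""
            def score(seq):
                return sum(log(bigram_p.get(pair, floor))
                           for pair in zip(seq, seq[1:]))
            return max(permutations(letters), key=score)

    The intuition about Viterbi is right as far as it goes: Viterbi optimizes over hidden states given fixed observations, whereas here the order of the observations themselves is the unknown. Maximizing the product of transition probabilities over orderings is a highest-weight Hamiltonian-path problem over the letters, which is why exhaustive search (fine up to roughly 9 or 10 letters) or heuristic search is the usual route rather than position-by-position dynamic programming.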

  • JavaScript library more efficient than Rickshaw for realtime visualizations

    - by dan kutz
    I want to visualize data as time-series graphs on mobile devices (tablets) and therefore stumbled upon Rickshaw, which is based on D3. First I must say I was a little bit confused when I realized that realtime in web design is defined totally differently from realtime in engineering, which has fixed (and often very short) timeframes. Anyway, my aim is to visualize the data as fast as possible, and on older tablets visualization with Rickshaw is quite slow. Can anybody recommend another library which may be more efficient at rendering? Or is there no way out, and I have to go native? Regards, Dan.
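
    Before going native, it may be worth measuring a raw canvas baseline (a sketch; assumes samples normalized to 0..1 in a rolling buffer): D3-based libraries like Rickshaw keep an SVG node per data point, and on older tablets that DOM work tends to dominate, while a canvas redraw is a single rasterization pass:

        // <canvas id="chart" width="600" height="200"></canvas>
        var ctx = document.getElementById('chart').getContext('2d');
        var data = [];                          // rolling buffer of samples, 0..1

        function push(sample) {
          data.push(sample);
          if (data.length > 600) data.shift(); // keep one sample per pixel
        }

        function draw() {
          ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
          ctx.beginPath();
          for (var i = 0; i < data.length; i++) {
            var x = i * ctx.canvas.width / data.length;
            var y = ctx.canvas.height * (1 - data[i]);
            i === 0 ? ctx.moveTo(x, y) : ctx.lineTo(x, y);
          }
          ctx.stroke();
          window.requestAnimationFrame(draw);
        }
        window.requestAnimationFrame(draw);

    If this is fast enough on the target tablets, any canvas-based charting library should be too, which narrows the search.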

  • Is SQL DATEDIFF(year, ..., ...) an Expensive Computation?

    - by rlb.usa
    I'm trying to optimize some horrendously complicated SQL queries because they take too long to finish. In my queries, I have dynamically created SQL statements with lots of the same functions, so I created a temporary table where each function is only called once instead of many, many times - this cut my execution time by 3/4. So my question is: can I expect to see much of a difference if, say, 1,000 DATEDIFF computations are narrowed to 100?
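
    DATEDIFF(year, ...) itself is cheap integer arithmetic; the savings from the temp-table trick usually come from not re-evaluating expressions row by row in many predicates. A sketch of the pattern in T-SQL, with a made-up table:

        -- evaluate the function once per row...
        SELECT p.PersonID,
               DATEDIFF(year, p.BirthDate, GETDATE()) AS AgeYears
        INTO #ages
        FROM People AS p;

        -- ...then every later query reads the precomputed column
        SELECT COUNT(*) FROM #ages WHERE AgeYears >= 18;

    By that logic, cutting 1,000 calls to 100 should help roughly in proportion to how many of those calls sat inside per-row predicates, rather than because any single DATEDIFF is expensive.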

  • How to display an image when an asynchronous postback is happening

    - by Nasser Hajloo
    I have an AJAX-enabled application which has an UpdateProgress for each postback event. Everything is good and working great. I want to display another indicator in my SiteMapPath bar which shows that a postback is happening, no matter what triggered it. For example, when the user clicks a button, an UpdateProgress displays an indicator image in the central part of the application. I want to display another image (for every postback, not only for one) at the top of the application toolbar. Currently I use a simple flag out there, and I want to use an indicator instead of that flag whenever a postback happens. Thank you in advance.
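
    A sketch of one way to do this with the ASP.NET AJAX client API (the image ID and file name are made up): PageRequestManager raises beginRequest and endRequest for every asynchronous postback on the page, regardless of which UpdatePanel triggered it, so a single pair of handlers can toggle the toolbar image:

        <img id="globalSpinner" src="spinner.gif" alt="working..."
             style="display:none" />

        <script type="text/javascript">
          var prm = Sys.WebForms.PageRequestManager.getInstance();
          prm.add_beginRequest(function () {
              document.getElementById('globalSpinner').style.display = 'inline';
          });
          prm.add_endRequest(function () {
              document.getElementById('globalSpinner').style.display = 'none';
          });
        </script>

    This needs a ScriptManager on the page (which an application using UpdateProgress already has), and the script block must come after the ScriptManager so that Sys is defined.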

  • What does ER_WARN_FIELD_RESOLVED mean?

    - by VolkerK
    When SHOW WARNINGS after an EXPLAIN EXTENDED shows:

        Note 1276 Field or reference 'test.foo.bar' of SELECT #2 was resolved in SELECT #1

    what exactly does that mean, and what impact does it have? In my case it prevents MySQL from using what seems to be a perfectly good index. But it's not about fixing that specific query (as it is an irrelevant test). I found http://dev.mysql.com/doc/refman/5.0/en/error-messages-server.html, but

        Error: 1276 SQLSTATE: HY000 (ER_WARN_FIELD_RESOLVED)
        Message: Field or reference '%s%s%s%s%s' of SELECT #%d was resolved in SELECT #%d

    isn't much of an explanation.
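
    As for what the note itself means: it fires when a column used inside one SELECT could not be resolved there and was matched against a table in an enclosing SELECT - in other words, the subquery is correlated. A minimal reproduction (hypothetical tables) would be:

        EXPLAIN EXTENDED
        SELECT * FROM foo
        WHERE EXISTS (SELECT 1 FROM baz WHERE baz.x = foo.bar);
        SHOW WARNINGS;
        -- Note 1276: Field or reference 'test.foo.bar' of SELECT #2
        --            was resolved in SELECT #1

    The note is informational, but what it reveals matters for performance: a correlated subquery is, in principle, re-evaluated once per outer row (the DEPENDENT SUBQUERY plan in MySQL 5.x), which can change which indexes the optimizer considers useful - consistent with a seemingly good index going unused.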
