Search Results

Search found 33242 results on 1330 pages for 'database optimization'.

Page 231/1330 | < Previous Page | 227 228 229 230 231 232 233 234 235 236 237 238  | Next Page >

  • Why is processing a sorted array faster than an unsorted array?

    - by GManNickG
    Here is a piece of code that shows some very peculiar performance. For some strange reason, sorting the data miraculously speeds up the code by almost 6x: #include <algorithm> #include <ctime> #include <iostream> int main() { // generate data const unsigned arraySize = 32768; int data[arraySize]; for (unsigned c = 0; c < arraySize; ++c) data[c] = std::rand() % 256; // !!! with this, the next loop runs faster std::sort(data, data + arraySize); // test clock_t start = clock(); long long sum = 0; for (unsigned i = 0; i < 100000; ++i) { // primary loop for (unsigned c = 0; c < arraySize; ++c) { if (data[c] >= 128) sum += data[c]; } } double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC; std::cout << elapsedTime << std::endl; std::cout << "sum = " << sum << std::endl; } Without std::sort(data, data + arraySize);, the code runs in 11.54 seconds. With the sorted data, the code runs in 1.93 seconds. Initially I thought this might be just a language or compiler anomaly. So I tried it Java... import java.util.Arrays; import java.util.Random; public class Main { public static void main(String[] args) { // generate data int arraySize = 32768; int data[] = new int[arraySize]; Random rnd = new Random(0); for (int c = 0; c < arraySize; ++c) data[c] = rnd.nextInt() % 256; // !!! with this, the next loop runs faster Arrays.sort(data); // test long start = System.nanoTime(); long sum = 0; for (int i = 0; i < 100000; ++i) { // primary loop for (int c = 0; c < arraySize; ++c) { if (data[c] >= 128) sum += data[c]; } } System.out.println((System.nanoTime() - start) / 1000000000.0); System.out.println("sum = " + sum); } } with a similar but less extreme result. My first thought was that sorting brings the data into cache, but my next thought was how silly that is because the array was just generated. What is going on? Why is a sorted array faster than an unsorted array? The code is summing up some independent terms, the order should not matter.

    Read the article

  • Most flexible minimizer/compressor for ASP.NET MVC 2?

    - by AlexanderN
    From your experience, what's the most flexible minimizer/compressor (JS+CSS) for ASP.NET MVC you've dealt with? So far mbcompress doesn't seem to be too MVC friendly weboptimizer.codeplex.com lacks documentation clientdependency.codeplex.com is still in beta compress2 seems like a good candidate, but haven't tried it yet mvcscriptmanager only combines and compresses javascript but not CSS By flexible I mean Choose what should be compressed, minified, and combined Add exceptions. E.g. if debug don't compress XYZ.JS or don't minify ABC.CSS Caching In the end, it should help offer the best YSLOW score. If you know of any other assemblies out there, please list them also.

    Read the article

  • Should i use HttpResponse.End() for a fast webapp?

    - by acidzombie24
    HttpResponse.End() seems to throw an exception according to msdn. Right now i have the choice of returning a value to say end thread (it only goes 2 functions deep) or i can call end(). I know that throwing exceptions is significantly slower (read the comment for a C#/.NET test) so if i want a fast webapp should i consider not calling it when it is trivially easy to not call it? -edit- I do have a function call in certain functions and in the constructor in classes to ensure the user is logged in. So i call HttpResponse.End() in enough places although hopefully in regular site usage it doesn't occur too often.

    Read the article

  • Algorithm for optimally choosing actions to perform a task

    - by Jules
    There are two data types: tasks and actions. An action costs a certain time to complete, and a set of tasks this actions consists of. A task has a set of actions, and our job is to choose one of them. So: class Task { Set<Action> choices; } class Action { float time; Set<Task> dependencies; } For example the primary task could be "Get a house". The possible actions for this task: "Buy a house" or "Build a house". The action "Build a house" costs 10 hours and has the dependencies "Get bricks" and "Get cement", etcetera. The total time is the sum of all the times of the actions required to perform. We want to choose actions such that the total time is minimal. Note that the dependencies can be diamond shaped. For example "Get bricks" could require "Get a car" (to transport the bricks) and "Get cement" would also require a car. Even if you do "Get bricks" and "Get cement" you only have to count the time it takes to get a car once. Note also that the dependencies can be circular. For example "Money" - "Job" - "Car" - "Money". This is no problem for us, we simply select all of "Money", "Job" and "Car". The total time is simply the sum of the time of these 3 things. Mathematical description: Let actions be the chosen actions. valid(task) = ?action ? task.choices. (action ? actions ? ?tasks ? action.dependencies. valid(task)) time = sum {action.time | action ? actions} minimize time subject to valid(primaryTask)

    Read the article

  • Theory: Can JIT Compiler be used to parse the whole program first, then execute later?

    - by unknownthreat
    Normally, JIT Compiler works by reads the byte code, translate it into machine code, and execute it. This is what I understand, but in theory, is it possible to make the JIT Compiler parses the whole program first, then execute the program later as machine code? I do not know how JIT Compiler works technically and exactly, so I don't know any feasibility in this case. But theoretically, is it possible? Or am I doing it wrong?

    Read the article

  • Design for a Debate club assignment application

    - by Amir Rachum
    Hi all, For my university's debate club, I was asked to create an application to assign debate sessions and I'm having some difficulties as to come up with a good design for it. I will do it in Java. Here's what's needed: What you need to know about BP debates: There are four teams of 2 debaters each and a judge. The four groups are assigned a specific position: gov1, gov2, op1, op2. There is no significance to the order within a team. The goal of the application is to get as input the debaters who are present (for example, if there are 20 people, we will hold 2 debates) and assign them to teams and roles with regards to the history of each debater so that: Each debater should debate with (be on the same team) as many people as possible. Each debater should uniformly debate in different positions. The debate should be fair - debaters have different levels of experience and this should be as even as possible - i.e., there shouldn't be a team of two very experienced debaters and a team of junior debaters. There should be an option for the user to restrict the assignment in various ways, such as: Specifying that two people should debate together, in a specific position or not. Specifying that a single debater should be in a specific position, regardless of the partner. etc... If anyone can try to give me some pointers for a design for this application, I'll be so thankful! Also, I've never implemented a GUI before, so I'd appreciate some pointers on that as well, but it's not the major issue right now.

    Read the article

  • How can i get rid of 'ORA-01489: result of string concatenation is too long' in this query?

    - by core_pro
    this query gets the dominating sets in a network. so for example given a network A<----->B B<----->C B<----->D C<----->E D<----->C D<----->E F<----->E it returns B,E B,F A,E but it doesn't work for large data because i'm using string methods in my result. i have been trying to remove the string methods and return a view or something but to no avail With t as (select 'A' as per1, 'B' as per2 from dual union all select 'B','C' from dual union all select 'B','D' from dual union all select 'C','B' from dual union all select 'C','E' from dual union all select 'D','C' from dual union all select 'D','E' from dual union all select 'E','C' from dual union all select 'E','D' from dual union all select 'F','E' from dual) ,t2 as (select distinct least(per1, per2) as per1, greatest(per1, per2) as per2 from t union select distinct greatest(per1, per2) as per1, least(per1, per2) as per1 from t) ,t3 as (select per1, per2, row_number() over (partition by per1 order by per2) as rn from t2) ,people as (select per, row_number() over (order by per) rn from (select distinct per1 as per from t union select distinct per2 from t) ) ,comb as (select sys_connect_by_path(per,',')||',' as p from people connect by rn > prior rn ) ,find as (select p, per2, count(*) over (partition by p) as cnt from ( select distinct comb.p, t3.per2 from comb, t3 where instr(comb.p, ','||t3.per1||',') > 0 or instr(comb.p, ','||t3.per2||',') > 0 ) ) ,rnk as (select p, rank() over (order by length(p)) as rnk from find where cnt = (select count(*) from people) order by rnk ) select distinct trim(',' from p) as p from rnk where rnk.rnk = 1`

    Read the article

  • Speeding up inner joins between a large table and a small table

    - by Zaid
    This may be a silly question, but it may shed some light on how joins work internally. Let's say I have a large table L and a small table S (100K rows vs. 100 rows). Would there be any difference in terms of speed between the following two options?: OPTION 1: OPTION 2: --------- --------- SELECT * SELECT * FROM L INNER JOIN S FROM S INNER JOIN L ON L.id = S.id; ON L.id = S.id; Notice that the only difference is the order in which the tables are joined. I realize performance may vary between different SQL languages. If so, how would MySQL compare to Access?

    Read the article

  • How can I optimize retrieving lowest edit distance from a large table in SQL?

    - by Matt
    Hey, I'm having troubles optimizing this Levenshtein Distance calculation I'm doing. I need to do the following: Get the record with the minimum distance for the source string as well as a trimmed version of the source string Pick the record with the minimum distance If the min distances are equal (original vs trimmed), choose the trimmed one with the lowest distance If there are still multiple records that fall under the above two categories, pick the one with the highest frequency Here's my working version: DECLARE @Results TABLE ( ID int, [Name] nvarchar(200), Distance int, Frequency int, Trimmed bit ) INSERT INTO @Results SELECT ID, [Name], (dbo.Levenshtein(@Source, [Name])) As Distance, Frequency, 'False' As Trimmed FROM MyTable INSERT INTO @Results SELECT ID, [Name], (dbo.Levenshtein(@SourceTrimmed, [Name])) As Distance, Frequency, 'True' As Trimmed FROM MyTable SET @ResultID = (SELECT TOP 1 ID FROM @Results ORDER BY Distance, Trimmed, Frequency) SET @Result = (SELECT TOP 1 [Name] FROM @Results ORDER BY Distance, Trimmed, Frequency) SET @ResultDist = (SELECT TOP 1 Distance FROM @Results ORDER BY Distance, Trimmed, Frequency) SET @ResultTrimmed = (SELECT TOP 1 Trimmed FROM @Results ORDER BY Distance, Trimmed, Frequency) I believe what I need to do here is to.. Not dumb the results to a temporary table Do only 1 select from `MyTable` Setting the results right in the select from the initial select statement. (Since select will set variables and you can set multiple variables in one select statement) I know there has to be a good implementation to this but I can't figure it out... this is as far as I got: SELECT top 1 @ResultID = ID, @Result = [Name], (dbo.Levenshtein(@Source, [Name])) As distOrig, (dbo.Levenshtein(@SourceTrimmed, [Name])) As distTrimmed, Frequency FROM MyTable WHERE /* ... yeah I'm lost */ ORDER BY distOrig, distTrimmed, Frequency Any ideas?

    Read the article

  • Fast read of certain bytes of multiple files in C/C++

    - by Alejandro Cámara
    I've been searching in the web about this question and although there are many similar questions about read/write in C/C++, I haven't found about this specific task. I want to be able to read from multiple files (256x256 files) only sizeof(double) bytes located in a certain position of each file. Right now my solution is, for each file: Open the file (read, binary mode): fstream fTest("current_file", ios_base::out | ios_base::binary); Seek the position I want to read: fTest.seekg(position*sizeof(test_value), ios_base::beg); Read the bytes: fTest.read((char *) &(output[i][j]), sizeof(test_value)); And close the file: fTest.close(); This takes about 350 ms to run inside a for{ for {} } structure with 256x256 iterations (one for each file). Q: Do you think there is a better way to implement this operation? How would you do it?

    Read the article

  • How to improve performance of map that loads new overlay images

    - by anthonysomerset
    I have inherited a website to maintain that uses a html map overlaying a real map to link specific countries to specific pages. previously it loaded the default map image, then with some javascript it would change the image src to an image with that particular country in a different colour on mouseover and reset the image source back to the original image on mouse out to make maintenance (adding new countries) easier i made the initial map a background image by utilising some CSS for the div tag, and then created new images for each country which only had that countries hightlight so that the images remain fairly small. this works great but theres one issue which is particularly noticeable on slower internet connections when you hover over a country if you dont have the image file in your browser cache or downloaded it wont load the image unless you hover over another country and then back onto the first country - i guess this is due to the image having to manually be downloaded on first hover. My question: is it possible to force the load of these extra images AFTER the page and all the other assets have finished loading so that this behaviour is all but eliminated? the html code for the MAP is as follows: <div class="gtmap"><img id="Image-Maps_6200909211657061" src="<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png" usemap="#Image-Maps_6200909211657061" alt="We offer Guided Motorcycle Tours all around the world" width="615" height="296" /> <map id="_Image-Maps_6200909211657061" name="Image-Maps_6200909211657061"> <area shape="poly" coords="511,134,532,107,542,113,520,141" href="/guided-motorcycle-tours-japan/" alt="Guided Japan Motorcycle Tours" title="Japan" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-japan.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="252,61,266,58,275,64,262,68" href="/guided-motorcycle-tour.php?iceland-motorcycle-adventure-39" alt="Guided Iceland Motorcycle Tours" title="Iceland" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-iceland.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="587,246,597,256,577,279,568,270" href="/guided-motorcycle-tour.php?new-zealand-south-island-adventure-10" alt="New Zealand Guided Motorcycle Tours" title="New Zealand" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-nz.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="418,133,412,145,412,154,421,178,430,180,430,166,443,154,443,145,438,144,433,142,430,138,431,130,430,129,425,128" href="/guided-motorcycle-tours-india/" alt="India Guided Motorcycle Tours" title="India" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-india.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="460,152,466,149,474,165,470,171,466,161" href="/guided-motorcycle-tours-laos/" alt="Laos Guided Motorcycle Tours" title="Laos" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-laos.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="468,179,475,166,468,152,475,152,482,169" href="/guided-motorcycle-tour.php?indochina-motorcycle-adventure-tour-32" onClick="javascript: pageTracker._trackPageview('/internal-links/guided-tours/map/vietnam');" alt="Vietnam Guided Motorcycle Tours" title="Vietnam" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-viet.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="330,239,337,235,347,226,352,233,351,243,344,250,335,253,327,255,323,249,322,242,323,241" href="/guided-motorcycle-tours-southafrica/" alt="South Africa Guided Motorcycle Tours" title="South Africa" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-sa.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="290,77,293,86,298,96,286,102,285,97,285,89,282,84,282,79" href="/guided-motorcycle-tour.php?great-britain-isle-of-man-scotland-wales-uk-18" alt="United Kingdom" title="United Kingdom Guided Motorcycle Tours" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-uk.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="357,118,368,118,369,126,345,129,338,125,338,117,342,115,348,116" href="/guided-motorcycle-tour.php?explore-turkey-adventure-45" alt="Turkey" title="Turkey Guided Motorcycle Tours" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-turkey.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="206,95,193,101,185,101,178,106,165,111,157,109,147,105,134,103,121,103,107,103,96,103,86,104,81,99,77,91,70,83,62,79,60,72,61,64,59,57,60,51,71,50,83,49,95,50,107,54,117,53,129,47,137,36,148,37,163,38,177,44,187,54,195,60,184,72,191,80,200,87" href="/guided-motorcycle-tours-canada/" alt="Guided Canada Motorcycle Tours" title="Canada" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-canada.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="61,75,60,62,60,55,59,44,51,44,43,43,36,42,28,43,23,48,17,51,15,62,19,74,27,79,19,83,16,93,35,83,43,77,50,75,55,75" href="/guided-motorcycle-tours-alaska/" alt="Guided Alaska Motorcycle Tours" title="Alaska" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-alaska.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="82,101,99,101,133,101,148,105,161,110,172,106,187,100,180,113,171,122,165,131,159,149,147,141,137,140,129,147,120,141,112,138,103,137,93,132,86,122,86,112,86,106" href="/guided-motorcycle-tours-usa/" alt="USA Guided Motorcycle Tours" title="USA" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-usa.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="178,225,180,214,175,208,174,204,178,198,174,193,167,192,157,199,158,204,164,211,167,218" href="/guided-motorcycle-tour.php?peru-machu-picchu-adventure-25" alt="Peru Guided Motorcycle Tours" title="Peru" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-peru.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="172,226,169,239,166,256,166,267,164,279,171,277,174,262,175,250,179,234,180,225,176,224" href="/guided-motorcycle-tours-chile/" alt="Guided Chile Motorcycle Tours" title="Chile" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-chile.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> <area shape="poly" coords="199,260,194,261,187,265,184,276,183,296,170,292,168,282,174,270,174,257,177,245,180,230,190,228,205,237,199,245" href="/guided-motorcycle-tours-argentina/" alt="Guided Argentina Motorcycle Tours" title="Argentina" onmouseover="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-arg.png';" onmouseout="if(document.images) document.getElementById('Image-Maps_6200909211657061').src='<?php echo cdnhttpsCheck(); ?>assets/wmap/a-guided-tours-map-blank.png';" /> </map> </div> The <?php echo cdnhttpsCheck(); ?> is just a site specific function that gets the correct web domain/url from a config file to load resources from CDN where possible (eg all non HTTPS requests) We are loading Jquery at the bottom of the HTML if anybody wonders why it is missing from the code snippet for reference, the page with the map in question is found here: http://www.motoquest.com/guided-motorcycle-tours/

    Read the article

  • How to correctly cache images

    - by James Simpson
    I just installed Google's Page Speed plugin to Firebug, and everything looks good except for caching. I have set headers to cache my JS and CSS files, but it says the images aren't being cached. How can I make sure the images get cached for 30 days? These are static images, so I can't just add the headers with PHP like I did with the other files.

    Read the article

  • help me improve my sse yuv to rgb ssse3 code

    - by David McPaul
    Hello, I am looking to optimise some sse code I wrote for converting yuv to rgb (both planar and packed yuv functions). i am using SSSE3 at the moment but if there are useful functions from later sse versions thats ok. I am mainly interested in how I would work out processor stalls and the like. Anyone know of any tools that do static analysis of sse code? ; ; Copyright (C) 2009-2010 David McPaul ; ; All rights reserved. Distributed under the terms of the MIT License. ; ; A rather unoptimised set of ssse3 yuv to rgb converters ; does 8 pixels per loop ; inputer: ; reads 128 bits of yuv 8 bit data and puts ; the y values converted to 16 bit in xmm0 ; the u values converted to 16 bit and duplicated into xmm1 ; the v values converted to 16 bit and duplicated into xmm2 ; conversion: ; does the yuv to rgb conversion using 16 bit integer and the ; results are placed into the following registers as 8 bit clamped values ; r values in xmm3 ; g values in xmm4 ; b values in xmm5 ; outputer: ; writes out the rgba pixels as 8 bit values with 0 for alpha ; xmm6 used for scratch ; xmm7 used for scratch %macro cglobal 1 global _%1 %define %1 _%1 align 16 %1: %endmacro ; conversion code %macro yuv2rgbsse2 0 ; u = u - 128 ; v = v - 128 ; r = y + v + v >> 2 + v >> 3 + v >> 5 ; g = y - (u >> 2 + u >> 4 + u >> 5) - (v >> 1 + v >> 3 + v >> 4 + v >> 5) ; b = y + u + u >> 1 + u >> 2 + u >> 6 ; subtract 16 from y movdqa xmm7, [Const16] ; loads a constant using data cache (slower on first fetch but then cached) psubsw xmm0,xmm7 ; y = y - 16 ; subtract 128 from u and v movdqa xmm7, [Const128] ; loads a constant using data cache (slower on first fetch but then cached) psubsw xmm1,xmm7 ; u = u - 128 psubsw xmm2,xmm7 ; v = v - 128 ; load r,b with y movdqa xmm3,xmm0 ; r = y pshufd xmm5,xmm0, 0xE4 ; b = y ; r = y + v + v >> 2 + v >> 3 + v >> 5 paddsw xmm3, xmm2 ; add v to r movdqa xmm7, xmm1 ; move u to scratch pshufd xmm6, xmm2, 0xE4 ; move v to scratch psraw xmm6,2 ; divide v by 4 paddsw xmm3, xmm6 ; and add to r psraw xmm6,1 ; divide v by 2 paddsw xmm3, xmm6 ; and add to r psraw xmm6,2 ; divide v by 4 paddsw xmm3, xmm6 ; and add to r ; b = y + u + u >> 1 + u >> 2 + u >> 6 paddsw xmm5, xmm1 ; add u to b psraw xmm7,1 ; divide u by 2 paddsw xmm5, xmm7 ; and add to b psraw xmm7,1 ; divide u by 2 paddsw xmm5, xmm7 ; and add to b psraw xmm7,4 ; divide u by 32 paddsw xmm5, xmm7 ; and add to b ; g = y - u >> 2 - u >> 4 - u >> 5 - v >> 1 - v >> 3 - v >> 4 - v >> 5 movdqa xmm7,xmm2 ; move v to scratch pshufd xmm6,xmm1, 0xE4 ; move u to scratch movdqa xmm4,xmm0 ; g = y psraw xmm6,2 ; divide u by 4 psubsw xmm4,xmm6 ; subtract from g psraw xmm6,2 ; divide u by 4 psubsw xmm4,xmm6 ; subtract from g psraw xmm6,1 ; divide u by 2 psubsw xmm4,xmm6 ; subtract from g psraw xmm7,1 ; divide v by 2 psubsw xmm4,xmm7 ; subtract from g psraw xmm7,2 ; divide v by 4 psubsw xmm4,xmm7 ; subtract from g psraw xmm7,1 ; divide v by 2 psubsw xmm4,xmm7 ; subtract from g psraw xmm7,1 ; divide v by 2 psubsw xmm4,xmm7 ; subtract from g %endmacro ; outputer %macro rgba32sse2output 0 ; clamp values pxor xmm7,xmm7 packuswb xmm3,xmm7 ; clamp to 0,255 and pack R to 8 bit per pixel packuswb xmm4,xmm7 ; clamp to 0,255 and pack G to 8 bit per pixel packuswb xmm5,xmm7 ; clamp to 0,255 and pack B to 8 bit per pixel ; convert to bgra32 packed punpcklbw xmm5,xmm4 ; bgbgbgbgbgbgbgbg movdqa xmm0, xmm5 ; save bg values punpcklbw xmm3,xmm7 ; r0r0r0r0r0r0r0r0 punpcklwd xmm5,xmm3 ; lower half bgr0bgr0bgr0bgr0 punpckhwd xmm0,xmm3 ; upper half bgr0bgr0bgr0bgr0 ; write to output ptr movntdq [edi], xmm5 ; output first 4 pixels bypassing cache movntdq [edi+16], xmm0 ; output second 4 pixels bypassing cache %endmacro SECTION .data align=16 Const16 dw 16 dw 16 dw 16 dw 16 dw 16 dw 16 dw 16 dw 16 Const128 dw 128 dw 128 dw 128 dw 128 dw 128 dw 128 dw 128 dw 128 UMask db 0x01 db 0x80 db 0x01 db 0x80 db 0x05 db 0x80 db 0x05 db 0x80 db 0x09 db 0x80 db 0x09 db 0x80 db 0x0d db 0x80 db 0x0d db 0x80 VMask db 0x03 db 0x80 db 0x03 db 0x80 db 0x07 db 0x80 db 0x07 db 0x80 db 0x0b db 0x80 db 0x0b db 0x80 db 0x0f db 0x80 db 0x0f db 0x80 YMask db 0x00 db 0x80 db 0x02 db 0x80 db 0x04 db 0x80 db 0x06 db 0x80 db 0x08 db 0x80 db 0x0a db 0x80 db 0x0c db 0x80 db 0x0e db 0x80 ; void Convert_YUV422_RGBA32_SSSE3(void *fromPtr, void *toPtr, int width) width equ ebp+16 toPtr equ ebp+12 fromPtr equ ebp+8 ; void Convert_YUV420P_RGBA32_SSSE3(void *fromYPtr, void *fromUPtr, void *fromVPtr, void *toPtr, int width) width1 equ ebp+24 toPtr1 equ ebp+20 fromVPtr equ ebp+16 fromUPtr equ ebp+12 fromYPtr equ ebp+8 SECTION .text align=16 cglobal Convert_YUV422_RGBA32_SSSE3 ; reserve variables push ebp mov ebp, esp push edi push esi push ecx mov esi, [fromPtr] mov edi, [toPtr] mov ecx, [width] ; loop width / 8 times shr ecx,3 test ecx,ecx jng ENDLOOP REPEATLOOP: ; loop over width / 8 ; YUV422 packed inputer movdqa xmm0, [esi] ; should have yuyv yuyv yuyv yuyv pshufd xmm1, xmm0, 0xE4 ; copy to xmm1 movdqa xmm2, xmm0 ; copy to xmm2 ; extract both y giving y0y0 pshufb xmm0, [YMask] ; extract u and duplicate so each u in yuyv becomes u0u0 pshufb xmm1, [UMask] ; extract v and duplicate so each v in yuyv becomes v0v0 pshufb xmm2, [VMask] yuv2rgbsse2 rgba32sse2output ; endloop add edi,32 add esi,16 sub ecx, 1 ; apparently sub is better than dec jnz REPEATLOOP ENDLOOP: ; Cleanup pop ecx pop esi pop edi mov esp, ebp pop ebp ret cglobal Convert_YUV420P_RGBA32_SSSE3 ; reserve variables push ebp mov ebp, esp push edi push esi push ecx push eax push ebx mov esi, [fromYPtr] mov eax, [fromUPtr] mov ebx, [fromVPtr] mov edi, [toPtr1] mov ecx, [width1] ; loop width / 8 times shr ecx,3 test ecx,ecx jng ENDLOOP1 REPEATLOOP1: ; loop over width / 8 ; YUV420 Planar inputer movq xmm0, [esi] ; fetch 8 y values (8 bit) yyyyyyyy00000000 movd xmm1, [eax] ; fetch 4 u values (8 bit) uuuu000000000000 movd xmm2, [ebx] ; fetch 4 v values (8 bit) vvvv000000000000 ; extract y pxor xmm7,xmm7 ; 00000000000000000000000000000000 punpcklbw xmm0,xmm7 ; interleave xmm7 into xmm0 y0y0y0y0y0y0y0y0 ; extract u and duplicate so each becomes 0u0u punpcklbw xmm1,xmm7 ; interleave xmm7 into xmm1 u0u0u0u000000000 punpcklwd xmm1,xmm7 ; interleave again u000u000u000u000 pshuflw xmm1,xmm1, 0xA0 ; copy u values pshufhw xmm1,xmm1, 0xA0 ; to get u0u0 ; extract v punpcklbw xmm2,xmm7 ; interleave xmm7 into xmm1 v0v0v0v000000000 punpcklwd xmm2,xmm7 ; interleave again v000v000v000v000 pshuflw xmm2,xmm2, 0xA0 ; copy v values pshufhw xmm2,xmm2, 0xA0 ; to get v0v0 yuv2rgbsse2 rgba32sse2output ; endloop add edi,32 add esi,8 add eax,4 add ebx,4 sub ecx, 1 ; apparently sub is better than dec jnz REPEATLOOP1 ENDLOOP1: ; Cleanup pop ebx pop eax pop ecx pop esi pop edi mov esp, ebp pop ebp ret SECTION .note.GNU-stack noalloc noexec nowrite progbits

    Read the article

  • Does replacing statements by expressions using the C++ comma operator could allow more compiler opti

    - by Gabriel Cuvillier
    The C++ comma operator is used to chain individual expressions, yielding the value of the last executed expression as the result. For example the skeleton code (6 statements, 6 expressions): step1; step2; if (condition) step3; return step4; else return step5; May be rewritten to: (1 statement, 6 expressions) return step1, step2, condition? step3, step4 : step5; I noticed that it is not possible to perform step-by-step debugging of such code, as the expression chain seems to be executed as a whole. Does it means that the compiler is able to perform special optimizations which are not possible with the traditional statement approach (specially if the steps are const or inline)? Note: I'm not talking about the coding style merit of that way of expressing sequence of expressions! Just about the possible optimisations allowed by replacing statements by expressions.

    Read the article

  • Find point which sum of distances to set of other points is minimal

    - by Pawel Markowski
    I have one set (X) of points (not very big let's say 1-20 points) and the second (Y), much larger set of points. I need to choose some point from Y which sum of distances to all points from X is minimal. I came up with an idea that I would treat X as a vertices of a polygon and find centroid of this polygon, and then I will choose a point from Y nearest to the centroid. But I'm not sure whether centroid minimizes sum of its distances to the vertices of polygon, so I'm not sure whether this is a good way? Is there any algorithm for solving this problem? Points are defined by geographical coordinates.

    Read the article

  • How to optimize this user ranking query

    - by James Simpson
    I have 2 databases (users, userRankings) for a system that needs to have rankings updated every 10 minutes. I use the following code to update these rankings which works fairly well, but there is still a full table scan involved which slows things down with a few hundred thousand users. mysql_query("TRUNCATE TABLE userRankings"); mysql_query("INSERT INTO userRankings (userid) SELECT id FROM users ORDER BY score DESC"); mysql_query("UPDATE users a, userRankings b SET a.rank = b.rank WHERE a.id = b.userid"); In the userRankings table, rank is the primary key and userid is an index. Both tables are MyISAM (I've wondered if it might be beneficial to make userRankings InnoDB).

    Read the article

  • How can I strip Python logging calls without commenting them out?

    - by cdleary
    Today I was thinking about a Python project I wrote about a year back where I used logging pretty extensively. I remember having to comment out a lot of logging calls in inner-loop-like scenarios (the 90% code) because of the overhead (hotshot indicated it was one of my biggest bottlenecks). I wonder now if there's some canonical way to programmatically strip out logging calls in Python applications without commenting and uncommenting all the time. I'd think you could use inspection/recompilation or bytecode manipulation to do something like this and target only the code objects that are causing bottlenecks. This way, you could add a manipulator as a post-compilation step and use a centralized configuration file, like so: [Leave ERROR and above] my_module.SomeClass.method_with_lots_of_warn_calls [Leave WARN and above] my_module.SomeOtherClass.method_with_lots_of_info_calls [Leave INFO and above] my_module.SomeWeirdClass.method_with_lots_of_debug_calls Of course, you'd want to use it sparingly and probably with per-function granularity -- only for code objects that have shown logging to be a bottleneck. Anybody know of anything like this? Note: There are a few things that make this more difficult to do in a performant manner because of dynamic typing and late binding. For example, any calls to a method named debug may have to be wrapped with an if not isinstance(log, Logger). In any case, I'm assuming all of the minor details can be overcome, either by a gentleman's agreement or some run-time checking. :-)

    Read the article

  • SQL Server indexed view matching of views with joins not working

    - by usr
    Does anyone have experience of when SQL Servr 2008 R2 is able to automatically match indexed view (also known as materialized views) that contain joins to a query? for example the view select dbo.Orders.Date, dbo.OrderDetails.ProductID from dbo.OrderDetails join dbo.Orders on dbo.OrderDetails.OrderID = dbo.Orders.ID cannot be automatically matched to the same exact query. When I select directly from this view ith (noexpand) I actually get a much faster query plan that does a scan on the clustered index of the indexed view. Can I get SQL Server to do this matching automatically? I have quite a few queries and views... I am on enterprise edition of SQL Server 2008 R2.

    Read the article

  • Better way to summarize data about stop times?

    - by Vimvq1987
    This question is close to this: http://stackoverflow.com/questions/2947963/find-the-period-of-over-speed Here's my table: Longtitude Latitude Velocity Time 102 401 40 2010-06-01 10:22:34.000 103 403 50 2010-06-01 10:40:00.000 104 405 0 2010-06-01 11:00:03.000 104 405 0 2010-06-01 11:10:05.000 105 406 35 2010-06-01 11:15:30.000 106 403 60 2010-06-01 11:20:00.000 108 404 70 2010-06-01 11:30:05.000 109 405 0 2010-06-01 11:35:00.000 109 405 0 2010-06-01 11:40:00.000 105 407 40 2010-06-01 11:50:00.000 104 406 30 2010-06-01 12:00:00.000 101 409 50 2010-06-01 12:05:30.000 104 405 0 2010-06-01 11:05:30.000 I want to summarize times when vehicle had stopped (velocity = 0), include: it had stopped since "when" to "when" in how much minutes, how many times it stopped and how much time it stopped. I wrote this query to do it: select longtitude, latitude, MIN(time), MAX(time), DATEDIFF(minute, MIN(Time), MAX(time)) as Timespan from table_1 where velocity = 0 group by longtitude,latitude select DATEDIFF(minute, MIN(Time), MAX(time)) as minute into #temp3 from table_1 where velocity = 0 group by longtitude,latitude select COUNT(*) as [number]from #temp select SUM(minute) as [totaltime] from #temp3 drop table #temp This query return: longtitude latitude (No column name) (No column name) Timespan 104 405 2010-06-01 11:00:03.000 2010-06-01 11:10:05.000 10 109 405 2010-06-01 11:35:00.000 2010-06-01 11:40:00.000 5 number 2 totaltime 15 You can see, it works fine, but I really don't like the #temp table. Is there anyway to query this without use a temp table? Thank you.

    Read the article

  • Trying to reduce the speed overhead of an almost-but-not-quite-int number class

    - by Fumiyo Eda
    I have implemented a C++ class which behaves very similarly to the standard int type. The difference is that it has an additional concept of "epsilon" which represents some tiny value that is much less than 1, but greater than 0. One way to think of it is as a very wide fixed point number with 32 MSBs (the integer parts), 32 LSBs (the epsilon parts) and a huge sea of zeros in between. The following class works, but introduces a ~2x speed penalty in the overall program. (The program includes code that has nothing to do with this class, so the actual speed penalty of this class is probably much greater than 2x.) I can't paste the code that is using this class, but I can say the following: +, -, +=, <, > and >= are the only heavily used operators. Use of setEpsilon() and getInt() is extremely rare. * is also rare, and does not even need to consider the epsilon values at all. Here is the class: #include <limits> struct int32Uepsilon { typedef int32Uepsilon Self; int32Uepsilon () { _value = 0; _eps = 0; } int32Uepsilon (const int &i) { _value = i; _eps = 0; } void setEpsilon() { _eps = 1; } Self operator+(const Self &rhs) const { Self result = *this; result._value += rhs._value; result._eps += rhs._eps; return result; } Self operator-(const Self &rhs) const { Self result = *this; result._value -= rhs._value; result._eps -= rhs._eps; return result; } Self operator-( ) const { Self result = *this; result._value = -result._value; result._eps = -result._eps; return result; } Self operator*(const Self &rhs) const { return this->getInt() * rhs.getInt(); } // XXX: discards epsilon bool operator<(const Self &rhs) const { return (_value < rhs._value) || (_value == rhs._value && _eps < rhs._eps); } bool operator>(const Self &rhs) const { return (_value > rhs._value) || (_value == rhs._value && _eps > rhs._eps); } bool operator>=(const Self &rhs) const { return (_value >= rhs._value) || (_value == rhs._value && _eps >= rhs._eps); } Self &operator+=(const Self &rhs) { this->_value += rhs._value; this->_eps += rhs._eps; return *this; } Self &operator-=(const Self &rhs) { this->_value -= rhs._value; this->_eps -= rhs._eps; return *this; } int getInt() const { return(_value); } private: int _value; int _eps; }; namespace std { template<> struct numeric_limits<int32Uepsilon> { static const bool is_signed = true; static int max() { return 2147483647; } } }; The code above works, but it is quite slow. Does anyone have any ideas on how to improve performance? There are a few hints/details I can give that might be helpful: 32 bits are definitely insufficient to hold both _value and _eps. In practice, up to 24 ~ 28 bits of _value are used and up to 20 bits of _eps are used. I could not measure a significant performance difference between using int32_t and int64_t, so memory overhead itself is probably not the problem here. Saturating addition/subtraction on _eps would be cool, but isn't really necessary. Note that the signs of _value and _eps are not necessarily the same! This broke my first attempt at speeding this class up. Inline assembly is no problem, so long as it works with GCC on a Core i7 system running Linux!

    Read the article

  • Quickest way to write to file in java

    - by user1097772
    I'm writing an application which compares directory structure. First I wrote an application which writes gets info about files - one line about each file or directory. My soulution is: calling method toFile Static PrintWriter pw = new PrintWriter(new BufferedWriter( new FileWriter("DirStructure.dlis")), true); String line; // info about file or directory public void toFile(String line) { pw.println(line); } and of course pw.close(), at the end. My question is, can I do it quicker? What is the quickest way? Edit: quickest way = quickest writing in the file

    Read the article

  • Should not a tail-recursive function also be faster?

    - by Balint Erdi
    I have the following Clojure code to calculate a number with a certain "factorable" property. (what exactly the code does is secondary). (defn factor-9 ([] (let [digits (take 9 (iterate #(inc %) 1)) nums (map (fn [x] ,(Integer. (apply str x))) (permutations digits))] (some (fn [x] (and (factor-9 x) x)) nums))) ([n] (or (= 1 (count (str n))) (and (divisible-by-length n) (factor-9 (quot n 10)))))) Now, I'm into TCO and realize that Clojure can only provide tail-recursion if explicitly told so using the recur keyword. So I've rewritten the code to do that (replacing factor-9 with recur being the only difference): (defn factor-9 ([] (let [digits (take 9 (iterate #(inc %) 1)) nums (map (fn [x] ,(Integer. (apply str x))) (permutations digits))] (some (fn [x] (and (factor-9 x) x)) nums))) ([n] (or (= 1 (count (str n))) (and (divisible-by-length n) (recur (quot n 10)))))) To my knowledge, TCO has a double benefit. The first one is that it does not use the stack as heavily as a non tail-recursive call and thus does not blow it on larger recursions. The second, I think is that consequently it's faster since it can be converted to a loop. Now, I've made a very rough benchmark and have not seen any difference between the two implementations although. Am I wrong in my second assumption or does this have something to do with running on the JVM (which does not have automatic TCO) and recur using a trick to achieve it? Thank you.

    Read the article

< Previous Page | 227 228 229 230 231 232 233 234 235 236 237 238  | Next Page >