Search Results

Search found 33242 results on 1330 pages for 'database optimization'.

Page 228/1330 | < Previous Page | 224 225 226 227 228 229 230 231 232 233 234 235  | Next Page >

  • Is there a way to optimize this mysql query...?

    - by SpikETidE
    Hi Everyone... Say, I got these two tables.... Table 1 : Hotels hotel_id hotel_name 1 abc 2 xyz 3 efg Table 2 : Payments payment_id payment_date hotel_id total_amt comission p1 23-03-2010 1 100 10 p2 23-03-2010 2 50 5 p3 23-03-2010 2 200 25 p4 23-03-2010 1 40 2 Now, I need to get the following details from the two tables Given a particular date (say, 23-03-2010), the sum of the total_amt for each of the hotel for which a payment has been made on that particular date. All the rows that has the date 23-03-2010 ordered according to the hotel name A sample output is as follows... +------------+------------+------------+---------------+ | hotel_name | date | total_amt | commission | +------------+------------+------------+---------------+ | * abc | 23-03-2010 | 140 | 12 | +------------+------------+------------+---------------+ |+-----------+------------+------------+--------------+| || paymt_id | date | total_amt | commission || |+-----------+------------+------------+--------------+| || p1 | 23-03-2010 | 100 | 10 || |+-----------+------------+------------+--------------+| || p4 | 23-03-2010 | 40 | 2 || |+-----------+------------+------------+--------------+| +------------+------------+------------+---------------+ | * xyz | 23-03-2010 | 250 | 30 | +------------+------------+------------+---------------+ |+-----------+------------+------------+--------------+| || paymt_id | date | total_amt | commission || |+-----------+------------+------------+--------------+| || p2 | 23-03-2010 | 50 | 5 || |+-----------+------------+------------+--------------+| || p3 | 23-03-2010 | 200 | 25 || |+-----------+------------+------------+--------------+| +------------------------------------------------------+ Above the sample of the table that has to be printed... The idea is first to show the consolidated detail of each hotel, and when the '*' next to the hotel name is clicked the breakdown of the payment details will become visible... But that can be done by some jquery..!!! The table itself can be generated with php... Right now i am using two separate queries : One to get the sum of the amount and commission grouped by the hotel name. The next is to get the individual row for each entry having that date in the table. This is, of course, because grouping the records for calculating sum() returns only one row for each of the hotel with the sum of the amounts... Is there a way to combine these two queries into a single one and do the operation in a more optimized way...?? Hope i am being clear.. Thanks for your time and replies...

    Read the article

  • Optimize code performance when odd/even threads are doing different things in CUDA

    - by Orion Nebula
    Hi all! I have two large vectors, I am trying to do some sort of element multiplication, where an even-numbered element in the first vector is multiplied by the next odd-numbered element in the second vector .... and where the odd-numbered element in the first vector is multiplied by the preceding even-numbered element in the second vector Ex. vector 1 is V1(1) V1(2) V1(3) V1(4) vector 2 is V2(1) V2(2) V2(3) V2(4) V1(1) * V2(2) V1(3) * V2(4) V1(2) * V2(1) V1(4) * V2(3) I have written a Cuda code to do this: (Pds has the elements of the first vector in shared memory, Nds the second Vector) //instead of using %2 .. i check for the first bit to decide if number is odd/even -- faster if ((tx & 0x0001) == 0x0000) Nds[tx+1] = Pds[tx] * Nds[tx+1]; else Nds[tx-1] = Pds[tx] * Nds[tx-1]; __syncthreads(); Is there anyway to further accelerate this code or avoid divergence ? Thanks

    Read the article

  • Combine static files or load in parallel

    - by Niall Collins
    I am at present introducing code to my site to combine css and javascript files. Is there a way without having to include an external library to load javascript asynchronously or in parallel? I have read on some blogs that combining of files can be counter productive as the load of the http request can be large and its better to load multiple files in parallel. Opinions on this? I am caching my javascript/css. And would have thought it was better to combine rather than have multiple http requests.

    Read the article

  • Strange: Planner takes decision with lower cost, but (very) query long runtime

    - by S38
    Facts: PGSQL 8.4.2, Linux I make use of table inheritance Each Table contains 3 million rows Indexes on joining columns are set Table statistics (analyze, vacuum analyze) are up-to-date Only used table is "node" with varios partitioned sub-tables Recursive query (pg = 8.4) Now here is the explained query: WITH RECURSIVE rows AS ( SELECT * FROM ( SELECT r.id, r.set, r.parent, r.masterid FROM d_storage.node_dataset r WHERE masterid = 3533933 ) q UNION ALL SELECT * FROM ( SELECT c.id, c.set, c.parent, r.masterid FROM rows r JOIN a_storage.node c ON c.parent = r.id ) q ) SELECT r.masterid, r.id AS nodeid FROM rows r QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=2742105.92..2862119.94 rows=6000701 width=16) (actual time=0.033..172111.204 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..2742105.92 rows=6000701 width=28) (actual time=0.029..172111.183 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.025..0.027 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Hash Join (cost=0.33..262208.33 rows=600070 width=28) (actual time=40628.371..57370.361 rows=1 loops=3) Hash Cond: (c.parent = r.id) -> Append (cost=0.00..211202.04 rows=12001404 width=20) (actual time=0.011..46365.669 rows=12000004 loops=3) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.002..0.002 rows=0 loops=3) -> Seq Scan on node_dataset c (cost=0.00..55001.01 rows=3000001 width=20) (actual time=0.007..3426.593 rows=3000001 loops=3) -> Seq Scan on node_stammdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=0.008..9049.189 rows=3000001 loops=3) -> Seq Scan on node_stammdaten_adresse c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=3.455..8381.725 rows=3000001 loops=3) -> Seq Scan on node_testdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=1.810..5259.178 rows=3000001 loops=3) -> Hash (cost=0.20..0.20 rows=10 width=16) (actual time=0.010..0.010 rows=1 loops=3) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.002..0.004 rows=1 loops=3) Total runtime: 172111.371 ms (16 rows) (END) So far so bad, the planner decides to choose hash joins (good) but no indexes (bad). Now after doing the following: SET enable_hashjoins TO false; The explained query looks like that: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=15198247.00..15318261.02 rows=6000701 width=16) (actual time=0.038..49.221 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..15198247.00 rows=6000701 width=28) (actual time=0.032..49.201 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.028..0.031 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Nested Loop (cost=0.00..1507822.44 rows=600070 width=28) (actual time=10.384..16.382 rows=1 loops=3) Join Filter: (r.id = c.parent) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.001..0.003 rows=1 loops=3) -> Append (cost=0.00..113264.67 rows=3001404 width=20) (actual time=8.546..12.268 rows=1 loops=4) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.001..0.001 rows=0 loops=4) -> Bitmap Heap Scan on node_dataset c (cost=58213.87..113214.88 rows=3000001 width=20) (actual time=1.906..1.906 rows=0 loops=4) Recheck Cond: (c.parent = r.id) -> Bitmap Index Scan on node_dataset_parent (cost=0.00..57463.87 rows=3000001 width=0) (actual time=1.903..1.903 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_parent on node_stammdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=3.272..3.273 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_adresse_parent on node_stammdaten_adresse c (cost=0.00..8.60 rows=1 width=20) (actual time=4.333..4.333 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_testdaten_parent on node_testdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=2.745..2.746 rows=0 loops=4) Index Cond: (c.parent = r.id) Total runtime: 49.349 ms (21 rows) (END) - incredibly faster, because indexes were used. Notice: Cost of the second query ist somewhat higher than for the first query. So the main question is: Why does the planner make the first decision, instead of the second? Also interesing: Via SET enable_seqscan TO false; i temp. disabled seq scans. Than the planner used indexes and hash joins, and the query still was slow. So the problem seems to be the hash join. Maybe someone can help in this confusing situation? thx, R.

    Read the article

  • Position of least significant bit that is set

    - by peterchen
    I am looking for an efficient way to determine the position of the least significant bit that is set in an integer, e.g. for 0x0FF0 it would be 4. A trivial implementation is this: unsigned GetLowestBitPos(unsigned value) { assert(value != 0); // handled separately unsigned pos = 0; while (!(value & 1)) { value >>= 1; ++pos; } return pos; } Any ideas how to squeeze some cycles out of it? (Note: this question is for people that enjoy such things, not for people to tell me xyzoptimization is evil.) [edit] Thanks everyone for the ideas! I've learnt a few other things, too. Cool!

    Read the article

  • What is the Fastest Way to Check for a Keyword in a List of Keywords in Delphi?

    - by lkessler
    I have a small list of keywords. What I'd really like to do is akin to: case MyKeyword of 'CHIL': (code for CHIL); 'HUSB': (code for HUSB); 'WIFE': (code for WIFE); 'SEX': (code for SEX); else (code for everything else); end; Unfortunately the CASE statement can't be used like that for strings. I could use the straight IF THEN ELSE IF construct, e.g.: if MyKeyword = 'CHIL' then (code for CHIL) else if MyKeyword = 'HUSB' then (code for HUSB) else if MyKeyword = 'WIFE' then (code for WIFE) else if MyKeyword = 'SEX' then (code for SEX) else (code for everything else); but I've heard this is relatively inefficient. What I had been doing instead is: P := pos(' ' + MyKeyword + ' ', ' CHIL HUSB WIFE SEX '); case P of 1: (code for CHIL); 6: (code for HUSB); 11: (code for WIFE); 17: (code for SEX); else (code for everything else); end; This, of course is not the best programming style, but it works fine for me and up to now didn't make a difference. So what is the best way to rewrite this in Delphi so that it is both simple, understandable but also fast? (For reference, I am using Delphi 2009 with Unicode strings.) Followup: Toby recommended I simply use the If Then Else construct. Looking back at my examples that used a CASE statement, I can see how that is a viable answer. Unfortunately, my inclusion of the CASE inadvertently hid my real question. I actually don't care which keyword it is. That is just a bonus if the particular method can identify it like the POS method can. What I need is to know whether or not the keyword is in the set of keywords. So really I want to know if there is anything better than: if pos(' ' + MyKeyword + ' ', ' CHIL HUSB WIFE SEX ') > 0 then The If Then Else equivalent does not seem better in this case being: if (MyKeyword = 'CHIL') or (MyKeyword = 'HUSB') or (MyKeyword = 'WIFE') or (MyKeyword = 'SEX') then In Barry's comment to Kornel's question, he mentions the TDictionary Generic. I've not yet picked up on the new Generic collections and it looks like I should delve into them. My question here would be whether they are built for efficiency and how would using TDictionary compare in looks and in speed to the above two lines? In later profiling, I have found that the concatenation of strings as in: (' ' + MyKeyword + ' ') is VERY expensive time-wise and should be avoided whenever possible. Almost any other solution is better than doing this.

    Read the article

  • FxCop giving a warning on private constructor CA1823 and CA1053

    - by Luis Sánchez
    I have a class that looks like the following: Public Class Utilities Public Shared Function blah(userCode As String) As String 'doing some stuff End Function End Class I'm running FxCop 10 on it and it says: "Because type 'Utilities' contains only 'static' ( 'Shared' in Visual Basic) members, add a default private constructor to prevent the compiler from adding a default public constructor." Ok, you're right Mr. FxCop, I'll add a private constructor: Private Utilities() Now I'm having: "It appears that field 'Utilities.Utilities' is never used or is only ever assigned to. Use this field or remove it." Any ideas of what should I do to get rid of both warnings?

    Read the article

  • cheapest way to draw a fullscreen quad

    - by Soubok
    I wondering if there is a faster way to draw a full-screen quad in OpenGL: NewList(); PushMatrix(); LoadIdentity(); MatrixMode(PROJECTION); PushMatrix(); LoadIdentity(); Begin(QUADS); Vertex(-1,-1,0); Vertex(1,-1,0); Vertex(1,1,0); Vertex(-1,1,0); End(); PopMatrix(); MatrixMode(MODELVIEW); PopMatrix(); EndList();

    Read the article

  • Optimizing processing and management of large Java data arrays

    - by mikera
    I'm writing some pretty CPU-intensive, concurrent numerical code that will process large amounts of data stored in Java arrays (e.g. lots of double[100000]s). Some of the algorithms might run millions of times over several days so getting maximum steady-state performance is a high priority. In essence, each algorithm is a Java object that has an method API something like: public double[] runMyAlgorithm(double[] inputData); or alternatively a reference could be passed to the array to store the output data: public runMyAlgorithm(double[] inputData, double[] outputData); Given this requirement, I'm trying to determine the optimal strategy for allocating / managing array space. Frequently the algorithms will need large amounts of temporary storage space. They will also take large arrays as input and create large arrays as output. Among the options I am considering are: Always allocate new arrays as local variables whenever they are needed (e.g. new double[100000]). Probably the simplest approach, but will produce a lot of garbage. Pre-allocate temporary arrays and store them as final fields in the algorithm object - big downside would be that this would mean that only one thread could run the algorithm at any one time. Keep pre-allocated temporary arrays in ThreadLocal storage, so that a thread can use a fixed amount of temporary array space whenever it needs it. ThreadLocal would be required since multiple threads will be running the same algorithm simultaneously. Pass around lots of arrays as parameters (including the temporary arrays for the algorithm to use). Not good since it will make the algorithm API extremely ugly if the caller has to be responsible for providing temporary array space.... Allocate extremely large arrays (e.g. double[10000000]) but also provide the algorithm with offsets into the array so that different threads will use a different area of the array independently. Will obviously require some code to manage the offsets and allocation of the array ranges. Any thoughts on which approach would be best (and why)?

    Read the article

  • How to optimize neural network by using genetic algorithm?

    - by Billy Coen
    I'm quite new with this topic so any help would be great. What i need is to optimize a neural network in MATLAB by using GA. My network has [2x98] input and [1x98] target, i've tried consulting matlab help but im still kind of clueless about what to do :( so, any help would be appreciated. Thanks in advance. edit: i guess i didn't say what is there to be optimized as Dan said in the 1st answer. I guess most important thing is number of hidden neurons. And maybe number of hidden layers and training parameters like number of epochs or so. Sorry for not providing enough info, i'm still learning about this.

    Read the article

  • Whats faster in Javascript a bunch of small setInterval loops, or one big one?

    - by RobertWHurst
    Just wondering if its worth it to make a monolithic loop function or just add loops were they're needed. The big loop option would just be a loop of callbacks that are added dynamically with an add function. adding a function would look like this setLoop(function(){ alert('hahaha! I\'m a really annoying loop that bugs you every tenth of a second'); }); setLoop would add the function to the monolithic loop. so is the is worth anything in performance or should I just stick to lots of little loops using setInterval?

    Read the article

  • explicit copy constructor or implicit parameter by value

    - by R Samuel Klatchko
    I recently read (and unfortunately forgot where), that the best way to write operator= is like this: foo &operator=(foo other) { swap(*this, other); return *this; } instead of this: foo &operator=(const foo &other) { foo copy(other); swap(*this, copy); return *this; } The idea is that if operator= is called with an rvalue, the first version can optimize away construction of a copy. So when called with a rvalue, the first version is faster and when called with an lvalue the two are equivalent. I'm curious as to what other people think about this? Would people avoid the first version because of lack of explicitness? Am I correct that the first version can be better and can never be worse?

    Read the article

  • cached schwartzian transform

    - by davidk01
    I'm going through "Intermediate Perl" and it's pretty cool. I just finished the section on "The Schwartzian Transform" and after it sunk in I started to wonder why the transform doesn't use a cache. In lists that have several repeated values the transform recomputes the value for each one so I thought why not use a hash to cache results. Here' some code: # a place to keep our results my %cache; # the transformation we are interested in sub foo { # expensive operations } # some data my @unsorted_list = ....; # sorting with the help of the cache my @sorted_list = sort { ($cache{$a} or $cache{$a} = &foo($a)) <=> ($cache{$b} or $cache{$b} = &foo($b)) } @unsorted_list; Am I missing something? Why isn't the cached version of the Schwartzian transform listed in books and in general just better circulated because on first glance I think the cached version should be more efficient?

    Read the article

  • mysql subselect alternative

    - by Arnold
    Hi, Lets say I am analyzing how high school sports records affect school attendance. So I have a table in which each row corresponds to a high school basketball game. Each game has an away team id and a home team id (FK to another "team table") and a home score and an away score and a date. I am writing a query that matches attendance with this seasons basketball games. My sample output will be (#_students_missed_class, day_of_game, home_team, away_team, home_team_wins_this_season, away_team_wins_this_season) I now want to add how each team did the previous season to my analysis. Well, I have their previous season stored in the game table but i should be able to accomplish that with a subselect. So in my main select statement I add the subselect: SELECT COUNT(*) FROM game_table WHERE game_table.date BETWEEN 'start of previous season' AND 'end of previous season' AND ( (game_table.home_team = team_table.id AND game_table.home_score > game_table.away_score) OR (game_table.away_team = team_table.id AND game_table.away_score > game_table.home_score)) In this case team-table.id refers to the id of the home_team so I now have all their wins calculated from the previous year. This method of calculation is neither time nor resource intensive. The Explain SQL shows that I have ALL in the Type field and I am not using a Key and the query times out. I'm not sure how I can accomplish a more efficient query with a subselect. It seems proposterously inefficient to have to write 4 of these queries (for home wins, home losses, away wins, away losses). I am sure this could be more lucid. I'll absolutely add color tomorrow if anyone has questions

    Read the article

  • ASP.NET 4.5 Bundling in Debug Mode - Stale Resources

    - by RPM1984
    Is there any way we can make the ASP.NET 4.5 Bundling functionality generate GUID's as part of the querystring when running in debug mode (e.g bundling turned OFF). The problem is when developing locally, the scripts/CSS files are generated like this: <script type="text/javascript" src="/Content/Scripts/myscript.js" /> So if i change that file, i need to do a hard-refresh (sometimes a few times) to get the file to be picked up by the browser - annoying. Is there any way we can make it render out like this: <script type="text/javascript" src="/Content/Scripts/myscript.js?v=x" /> Where x is a GUID (e.g always unique). Ideas? I'm on ASP.NET MVC 4.

    Read the article

  • set difference in SQL query

    - by TheObserver
    I'm trying to select records with a statement SELECT * FROM A WHERE LEFT(B, 5) IN (SELECT * FROM (SELECT LEFT(A.B,5), COUNT(DISTINCT A.C) c_count FROM A GROUP BY LEFT(B,5) ) p1 WHERE p1.c_count = 1 ) AND C IN (SELECT * FROM (SELECT A.C , COUNT(DISTINCT LEFT(A.B,5)) b_count FROM A GROUP BY C ) p2 WHERE p2.b_count = 1) which takes a long time to run ~15 sec. Is there a better way of writing this SQL?

    Read the article

  • Benefits of 'Optimize code' option in Visual Studio build

    - by gt
    Much of our C# release code is built with the 'Optimize code' option turned off. I believe this is to allow code built in Release mode to be debugged more easily. Given that we are creating fairly simple desktop software which connects to backend Web Services, (ie. not a particularly processor-intensive application) then what if any sort of performance hit might be expected? And is any particular platform likely to be worse affected? Eg. multi-processor / 64 bit.

    Read the article

  • Why better isolation level means better performance in SQL Server

    - by Oleg Zhylin
    When measuring performance on my query I came up with a dependency between isolation level and elapsed time that was surprising to me READUNCOMMITTED - 409024 READCOMMITTED - 368021 REPEATABLEREAD - 358019 SERIALIZABLE - 348019 Left column is table hint, and the right column is elapsed time in microseconds (sys.dm_exec_query_stats.total_elapsed_time). Why better isolation level gives better performance? This is a development machine and no concurrency whatsoever happens. I would expect READUNCOMMITTED to be the fasted due to less locking overhead. Update: I did measure this with DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE issued and Profiler confirms there're no cache hits happening. Update2: The query in question is an OLAP one and we need to run it as fast as possible. Closing the production server from outside world to get the computation done is not out of question if this gives performance benefits.

    Read the article

  • Complicated idea - how to create car racing for my RPG game's players

    - by Donator
    So, I want to create car racing for my RPG game's players. Player can create race and choose how many participants can participate in race. After race is being created, other people can join it. When the maximum participants are collected, race begins. My idea, when the last participant joins, then instantly choose the winner (who's car is the best, that person wins), but how can I do it? If I choose to pick the winner after the last participant joins, then I have to put many queries in one page (select data from table, then delete the race, then select players' cars' statistics and pick the winner and then again, using mysql, send message to everyone). But this idea is really not optimal and it will lag cruelly for that last person. Maybe you have any ideas how I can avoid lag and make it more optimal. Thank you very much.

    Read the article

  • Access cost of dynamically created objects with dynamically allocated members

    - by user343547
    I'm building an application which will have dynamic allocated objects of type A each with a dynamically allocated member (v) similar to the below class class A { int a; int b; int* v; }; where: The memory for v will be allocated in the constructor. v will be allocated once when an object of type A is created and will never need to be resized. The size of v will vary across all instances of A. The application will potentially have a huge number of such objects and mostly need to stream a large number of these objects through the CPU but only need to perform very simple computations on the members variables. Could having v dynamically allocated could mean that an instance of A and its member v are not located together in memory? What tools and techniques can be used to test if this fragmentation is a performance bottleneck? If such fragmentation is a performance issue, are there any techniques that could allow A and v to allocated in a continuous region of memory? Or are there any techniques to aid memory access such as pre-fetching scheme? for example get an object of type A operate on the other member variables whilst pre-fetching v. If the size of v or an acceptable maximum size could be known at compile time would replacing v with a fixed sized array like int v[max_length] lead to better performance? The target platforms are standard desktop machines with x86/AMD64 processors, Windows or Linux OSes and compiled using either GCC or MSVC compilers.

    Read the article

  • Is it possible to do A/B testing by page rather than by individual?

    - by mojones
    Lets say I have a simple ecommerce site that sells 100 different t-shirt designs. I want to do some a/b testing to optimise my sales. Let's say I want to test two different "buy" buttons. Normally, I would use AB testing to randomly assign each visitor to see button A or button B (and try to ensure that that the user experience is consistent by storing that assignment in session, cookies etc). Would it be possible to take a different approach and instead, randomly assign each of my 100 designs to use button A or B, and measure the conversion rate as (number of sales of design n) / (pageviews of design n) This approach would seem to have some advantages; I would not have to worry about keeping the user experience consistent - a given page (e.g. www.example.com/viewdesign?id=6) would always return the same html. If I were to test different prices, it would be far less distressing to the user to see different prices for different designs than different prices for the same design on different computers. I also wonder whether it might be better for SEO - my suspicion is that Google would "prefer" that it always sees the same html when crawling a page. Obviously this approach would only be suitable for a limited number of sites; I was just wondering if anyone has tried it?

    Read the article

  • Jruby rspec to be run parallely

    - by Priyank
    Hi. Is there something like Spork for Jruby too? We want to parallelize our specs to run faster and pre-load the classes while running the rake task; however we have not been able to do so. Since our project is considerable in size, specs take about 15 minutes to complete and this poses a serious challenge to quick turnaround. Any ideas are more than welcome. Cheers

    Read the article

  • Grand Central Strategy for Opening Multiple Files

    - by user276632
    I have a working implementation using Grand Central dispatch queues that (1) opens a file and computes an OpenSSL DSA hash on "queue1", (2) writing out the hash to a new "side car" file for later verification on "queue2". I would like to open multiple files at the same time, but based on some logic that doesn't "choke" the OS by having 100s of files open and exceeding the hard drive's sustainable output. Photo browsing applications such as iPhoto or Aperture seem to open multiple files and display them, so I'm assuming this can be done. I'm assuming the biggest limitation will be disk I/O, as the application can (in theory) read and write multiple files simultaneously. Any suggestions? TIA

    Read the article

  • How does loop address alignment affect the speed on Intel x86_64?

    - by Alexander Gololobov
    I'm seeing 15% performance degradation of the same C++ code compiled to exactly same machine instructions but located on differently aligned addresses. When my tiny main loop starts at 0x415220 it's faster then when it is at 0x415250. I'm running this on Intel Core2 Duo. I use gcc 4.4.5 on x86_64 Ubuntu. Can anybody explain the cause of slowdown and how I can force gcc to optimally align the loop? Here is the disassembly for both cases with profiler annotation: 415220 576 12.56% |XXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415224 110 2.40% |XX 0f b6 c3 movzbl %bl,%eax 415227 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41522c 40 0.87% | 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415230 806 17.58% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415233 186 4.06% |XXXX 48 c1 e8 20 shr $0x20,%rax 415237 102 2.22% |XX 4c 01 f9 add %r15,%rcx 41523a 414 9.03% |XXXXXXXXXX a8 0f test $0xf,%al 41523c 680 14.83% |XXXXXXXXXXXXXXXX 74 45 je 415283 ::Run(char const*, char const*)+0x4b3 41523e 0.00% | 41 89 c7 mov %eax,%r15d 415241 0.00% | 41 83 e7 01 and $0x1,%r15d 415245 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415249 0.00% | 41 89 c7 mov %eax,%r15d 415250 679 13.05% |XXXXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415254 124 2.38% |XX 0f b6 c3 movzbl %bl,%eax 415257 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41525c 43 0.83% |X 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415260 828 15.91% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415263 388 7.46% |XXXXXXXXX 48 c1 e8 20 shr $0x20,%rax 415267 141 2.71% |XXX 4c 01 f9 add %r15,%rcx 41526a 634 12.18% |XXXXXXXXXXXXXXX a8 0f test $0xf,%al 41526c 749 14.39% |XXXXXXXXXXXXXXXXXX 74 45 je 4152b3 ::Run(char const*, char const*)+0x4c3 41526e 0.00% | 41 89 c7 mov %eax,%r15d 415271 0.00% | 41 83 e7 01 and $0x1,%r15d 415275 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415279 0.00% | 41 89 c7 mov %eax,%r15d

    Read the article

< Previous Page | 224 225 226 227 228 229 230 231 232 233 234 235  | Next Page >