Search Results

Search found 3438 results on 138 pages for 'nonlinear optimization'.

Page 54/138 | < Previous Page | 50 51 52 53 54 55 56 57 58 59 60 61 | Next Page >

Movies recommendation engine conceptual database design

- by Supyxy

I am working at an movie recommendations engine and i'm facing a DB design issue. My actual database looks like this: MOVIES [ID,TITLE] KEYWORDS_TABLE [ID,KEY_ID] - where ID is Foreign Key for MOVIES.id and KEY_ID is a key for a text keywords table This is not the entire DB, but i showed here what's important for my problem. I have about 50,000 movies and about 1,3 milion keywords correlations, and basically my algorithm consists in extracting all the who have the same keywords with a given movie, then ordering them by the number of keywords correlations. For example i looked for a movie similar to 'Cast away' and it returned 'Six days and six nights' because it had the most keywords correlations (4 keywords): Island Airplane crash Stranded Pilot The algorithm is based on more factors, but this one is the most important and the most difficult for the approach. Basically what i do now is getting all the movies that have at least one keyword similar to the given movie and then ordering them by other factors which are not important for a moment. There wouldn't be any problem if there weren't so many records, a query lasts in many cases up to 10-20 seconds and some of them return even over 5000 movies. Someone already helped me on here (thanks Mark Byers) with optimizing the query but that's not enough because it takes too longer SELECT DISTINCT M.title FROM keywords_table K1 JOIN keywords_table K2 ON K2.key_id = K1.key_id JOIN movies M ON K2.id = M.id WHERE K1.id = 4 So i thought it would be better if i pre-made those lists with movies recommendations for each movie, but i'm not sure how to design the tables.. whatever is it a good idea or how would you take this approach?

Read the article
Can we optimize code to reduce power consumption?

- by Manik Mahajan

Is there any techniques to optimize code in order to ensure lesser power consumption.Architecture is ARM.language is C

Read the article
Is a red-black tree my ideal data structure?

- by Hugo van der Sanden

I have a collection of items (big rationals) that I'll be processing. In each case, processing will consist of removing the smallest item in the collection, doing some work, and then adding 0-2 new items (which will always be larger than the removed item). The collection will be initialised with one item, and work will continue until it is empty. I'm not sure what size the collection is likely to reach, but I'd expect in the range 1M-100M items. I will not need to locate any item other than the smallest. I'm currently planning to use a red-black tree, possibly tweaked to keep a pointer to the smallest item. However I've never used one before, and I'm unsure whether my pattern of use fits its characteristics well. 1) Is there a danger the pattern of deletion from the left + random insertion will affect performance, eg by requiring a significantly higher number of rotations than random deletion would? Or will delete and insert operations still be O(log n) with this pattern of use? 2) Would some other data structure give me better performance, either because of the deletion pattern or taking advantage of the fact I only ever need to find the smallest item? Update: glad I asked, the binary heap is clearly a better solution for this case, and as promised turned out to be very easy to implement. Hugo

Read the article
Lazy loading the addthis script? (or lazy loading external js content dependent on already fired eve

- by Keith Bentrup

I want to have the addthis widget available for my users, but I want to lazy load it so that my page loads as quickly as possible. However, after trying it via a script tag and then via my lazy loading method, it appears to only work via the script tag. In the obfuscated code, I see something that looks like it's dependent on the DOMContentLoaded event (at least for firefox). Since the DOMContentLoaded event has already fired, the widget doesn't render properly. What to do? I could just use a script tag (slower)... or could I fire (in a cross browser way) the DOMContentLoaded (or equivalent) event? I have a feeling this may not be possible b/c I believe that (like jQuery) there are multiple tests of the content ready event, and so multiple simulated events would have to occur. Nonetheless, this is an interesting problem b/c I have seen a couple widgets now assume that you are including their stuff via static script tags. It would be nice if they wrote code that was more useful to developers concerned about speed, but until then, is there a work around?? And/or are any of my assumptions wrong? Edit: Because the 1st answer to the question seemed to miss the point of my problem, I wanted to clarify the situation. This is about a specific problem. I'm not looking for yet another lazy load script or check if some dependencies are loaded script. Specifically this problem deals with external widgets that you do not have control over and may or may not be obfuscated delaying the load of the external widgets until they are needed or at least, til substantially after everything else has been loaded including other deferred elements b/c of the how the widget was written, precludes existing, typical lazy loading paradigms While it's esoteric, I have seen it happen with a couple widgets - where the widget developers assume that you're just willing to throw in another script tag at the bottom of the page. I'm looking to save those 500-1000 ms** though as numerous studies by yahoo, google, and amazon show it to be important to your user's experience. **My testing with hammerhead and personal experience indicates that this will be my savings in this case.

Read the article
ASP.NET 4.5 Bundling in Debug Mode - Stale Resources

- by RPM1984

Is there any way we can make the ASP.NET 4.5 Bundling functionality generate GUID's as part of the querystring when running in debug mode (e.g bundling turned OFF). The problem is when developing locally, the scripts/CSS files are generated like this: <script type="text/javascript" src="/Content/Scripts/myscript.js" /> So if i change that file, i need to do a hard-refresh (sometimes a few times) to get the file to be picked up by the browser - annoying. Is there any way we can make it render out like this: <script type="text/javascript" src="/Content/Scripts/myscript.js?v=x" /> Where x is a GUID (e.g always unique). Ideas? I'm on ASP.NET MVC 4.

Read the article
PostgreSQL - fetch the row which has the Max value for a column

- by Joshua Berry

I'm dealing with a Postgres table (called "lives") that contains records with columns for time_stamp, usr_id, transaction_id, and lives_remaining. I need a query that will give me the most recent lives_remaining total for each usr_id There are multiple users (distinct usr_id's) time_stamp is not a unique identifier: sometimes user events (one by row in the table) will occur with the same time_stamp. trans_id is unique only for very small time ranges: over time it repeats remaining_lives (for a given user) can both increase and decrease over time example: time_stamp|lives_remaining|usr_id|trans_id ----------------------------------------- 07:00 | 1 | 1 | 1 09:00 | 4 | 2 | 2 10:00 | 2 | 3 | 3 10:00 | 1 | 2 | 4 11:00 | 4 | 1 | 5 11:00 | 3 | 1 | 6 13:00 | 3 | 3 | 1 As I will need to access other columns of the row with the latest data for each given usr_id, I need a query that gives a result like this: time_stamp|lives_remaining|usr_id|trans_id ----------------------------------------- 11:00 | 3 | 1 | 6 10:00 | 1 | 2 | 4 13:00 | 3 | 3 | 1 As mentioned, each usr_id can gain or lose lives, and sometimes these timestamped events occur so close together that they have the same timestamp! Therefore this query won't work: SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM (SELECT usr_id, max(time_stamp) AS max_timestamp FROM lives GROUP BY usr_id ORDER BY usr_id) a JOIN lives b ON a.max_timestamp = b.time_stamp Instead, I need to use both time_stamp (first) and trans_id (second) to identify the correct row. I also then need to pass that information from the subquery to the main query that will provide the data for the other columns of the appropriate rows. This is the hacked up query that I've gotten to work: SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM (SELECT usr_id, max(time_stamp || '*' || trans_id) AS max_timestamp_transid FROM lives GROUP BY usr_id ORDER BY usr_id) a JOIN lives b ON a.max_timestamp_transid = b.time_stamp || '*' || b.trans_id ORDER BY b.usr_id Okay, so this works, but I don't like it. It requires a query within a query, a self join, and it seems to me that it could be much simpler by grabbing the row that MAX found to have the largest timestamp and trans_id. The table "lives" has tens of millions of rows to parse, so I'd like this query to be as fast and efficient as possible. I'm new to RDBM and Postgres in particular, so I know that I need to make effective use of the proper indexes. I'm a bit lost on how to optimize. I found a similar discussion here. Can I perform some type of Postgres equivalent to an Oracle analytic function? Any advice on accessing related column information used by an aggregate function (like MAX), creating indexes, and creating better queries would be much appreciated! P.S. You can use the following to create my example case: create TABLE lives (time_stamp timestamp, lives_remaining integer, usr_id integer, trans_id integer); insert into lives values ('2000-01-01 07:00', 1, 1, 1); insert into lives values ('2000-01-01 09:00', 4, 2, 2); insert into lives values ('2000-01-01 10:00', 2, 3, 3); insert into lives values ('2000-01-01 10:00', 1, 2, 4); insert into lives values ('2000-01-01 11:00', 4, 1, 5); insert into lives values ('2000-01-01 11:00', 3, 1, 6); insert into lives values ('2000-01-01 13:00', 3, 3, 1);

Read the article
How can I further optimize this color difference function?

- by aLfa

I have made this function to calculate color differences in the CIE Lab colorspace, but it lacks speed. Since I'm not a Java expert, I wonder if any Java guru around has some tips that can improve the speed here. The code is based on the matlab function mentioned in the comment block. /** * Compute the CIEDE2000 color-difference between the sample color with * CIELab coordinates 'sample' and a standard color with CIELab coordinates * 'std' * * Based on the article: * "The CIEDE2000 Color-Difference Formula: Implementation Notes, * Supplementary Test Data, and Mathematical Observations,", G. Sharma, * W. Wu, E. N. Dalal, submitted to Color Research and Application, * January 2004. * available at http://www.ece.rochester.edu/~gsharma/ciede2000/ */ public static double deltaE2000(double[] lab1, double[] lab2) { double L1 = lab1[0]; double a1 = lab1[1]; double b1 = lab1[2]; double L2 = lab2[0]; double a2 = lab2[1]; double b2 = lab2[2]; // Cab = sqrt(a^2 + b^2) double Cab1 = Math.sqrt(a1 * a1 + b1 * b1); double Cab2 = Math.sqrt(a2 * a2 + b2 * b2); // CabAvg = (Cab1 + Cab2) / 2 double CabAvg = (Cab1 + Cab2) / 2; // G = 1 + (1 - sqrt((CabAvg^7) / (CabAvg^7 + 25^7))) / 2 double CabAvg7 = Math.pow(CabAvg, 7); double G = 1 + (1 - Math.sqrt(CabAvg7 / (CabAvg7 + 6103515625.0))) / 2; // ap = G * a double ap1 = G * a1; double ap2 = G * a2; // Cp = sqrt(ap^2 + b^2) double Cp1 = Math.sqrt(ap1 * ap1 + b1 * b1); double Cp2 = Math.sqrt(ap2 * ap2 + b2 * b2); // CpProd = (Cp1 * Cp2) double CpProd = Cp1 * Cp2; // hp1 = atan2(b1, ap1) double hp1 = Math.atan2(b1, ap1); // ensure hue is between 0 and 2pi if (hp1 < 0) { // hp1 = hp1 + 2pi hp1 += 6.283185307179586476925286766559; } // hp2 = atan2(b2, ap2) double hp2 = Math.atan2(b2, ap2); // ensure hue is between 0 and 2pi if (hp2 < 0) { // hp2 = hp2 + 2pi hp2 += 6.283185307179586476925286766559; } // dL = L2 - L1 double dL = L2 - L1; // dC = Cp2 - Cp1 double dC = Cp2 - Cp1; // computation of hue difference double dhp = 0.0; // set hue difference to zero if the product of chromas is zero if (CpProd != 0) { // dhp = hp2 - hp1 dhp = hp2 - hp1; if (dhp > Math.PI) { // dhp = dhp - 2pi dhp -= 6.283185307179586476925286766559; } else if (dhp < -Math.PI) { // dhp = dhp + 2pi dhp += 6.283185307179586476925286766559; } } // dH = 2 * sqrt(CpProd) * sin(dhp / 2) double dH = 2 * Math.sqrt(CpProd) * Math.sin(dhp / 2); // weighting functions // Lp = (L1 + L2) / 2 - 50 double Lp = (L1 + L2) / 2 - 50; // Cp = (Cp1 + Cp2) / 2 double Cp = (Cp1 + Cp2) / 2; // average hue computation // hp = (hp1 + hp2) / 2 double hp = (hp1 + hp2) / 2; // identify positions for which abs hue diff exceeds 180 degrees if (Math.abs(hp1 - hp2) > Math.PI) { // hp = hp - pi hp -= Math.PI; } // ensure hue is between 0 and 2pi if (hp < 0) { // hp = hp + 2pi hp += 6.283185307179586476925286766559; } // LpSqr = Lp^2 double LpSqr = Lp * Lp; // Sl = 1 + 0.015 * LpSqr / sqrt(20 + LpSqr) double Sl = 1 + 0.015 * LpSqr / Math.sqrt(20 + LpSqr); // Sc = 1 + 0.045 * Cp double Sc = 1 + 0.045 * Cp; // T = 1 - 0.17 * cos(hp - pi / 6) + // + 0.24 * cos(2 * hp) + // + 0.32 * cos(3 * hp + pi / 30) - // - 0.20 * cos(4 * hp - 63 * pi / 180) double hphp = hp + hp; double T = 1 - 0.17 * Math.cos(hp - 0.52359877559829887307710723054658) + 0.24 * Math.cos(hphp) + 0.32 * Math.cos(hphp + hp + 0.10471975511965977461542144610932) - 0.20 * Math.cos(hphp + hphp - 1.0995574287564276334619251841478); // Sh = 1 + 0.015 * Cp * T double Sh = 1 + 0.015 * Cp * T; // deltaThetaRad = (pi / 3) * e^-(36 / (5 * pi) * hp - 11)^2 double powerBase = hp - 4.799655442984406; double deltaThetaRad = 1.0471975511965977461542144610932 * Math.exp(-5.25249016001879 * powerBase * powerBase); // Rc = 2 * sqrt((Cp^7) / (Cp^7 + 25^7)) double Cp7 = Math.pow(Cp, 7); double Rc = 2 * Math.sqrt(Cp7 / (Cp7 + 6103515625.0)); // RT = -sin(delthetarad) * Rc double RT = -Math.sin(deltaThetaRad) * Rc; // de00 = sqrt((dL / Sl)^2 + (dC / Sc)^2 + (dH / Sh)^2 + RT * (dC / Sc) * (dH / Sh)) double dLSl = dL / Sl; double dCSc = dC / Sc; double dHSh = dH / Sh; return Math.sqrt(dLSl * dLSl + dCSc * dCSc + dHSh * dHSh + RT * dCSc * dHSh); }

Read the article
Python performance improvement request for winkler

- by Martlark

I'm a python n00b and I'd like some suggestions on how to improve the algorithm to improve the performance of this method to compute the Jaro-Winkler distance of two names. def winklerCompareP(str1, str2): """Return approximate string comparator measure (between 0.0 and 1.0) USAGE: score = winkler(str1, str2) ARGUMENTS: str1 The first string str2 The second string DESCRIPTION: As described in 'An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census' by William E. Winkler and Yves Thibaudeau. Based on the 'jaro' string comparator, but modifies it according to whether the first few characters are the same or not. """ # Quick check if the strings are the same - - - - - - - - - - - - - - - - - - # jaro_winkler_marker_char = chr(1) if (str1 == str2): return 1.0 len1 = len(str1) len2 = len(str2) halflen = max(len1,len2) / 2 - 1 ass1 = '' # Characters assigned in str1 ass2 = '' # Characters assigned in str2 #ass1 = '' #ass2 = '' workstr1 = str1 workstr2 = str2 common1 = 0 # Number of common characters common2 = 0 #print "'len1', str1[i], start, end, index, ass1, workstr2, common1" # Analyse the first string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len1): start = max(0,i-halflen) end = min(i+halflen+1,len2) index = workstr2.find(str1[i],start,end) #print 'len1', str1[i], start, end, index, ass1, workstr2, common1 if (index > -1): # Found common character common1 += 1 #ass1 += str1[i] ass1 = ass1 + str1[i] workstr2 = workstr2[:index]+jaro_winkler_marker_char+workstr2[index+1:] #print "str1 analyse result", ass1, common1 #print "str1 analyse result", ass1, common1 # Analyse the second string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len2): start = max(0,i-halflen) end = min(i+halflen+1,len1) index = workstr1.find(str2[i],start,end) #print 'len2', str2[i], start, end, index, ass1, workstr1, common2 if (index > -1): # Found common character common2 += 1 #ass2 += str2[i] ass2 = ass2 + str2[i] workstr1 = workstr1[:index]+jaro_winkler_marker_char+workstr1[index+1:] if (common1 != common2): print('Winkler: Wrong common values for strings "%s" and "%s"' % \ (str1, str2) + ', common1: %i, common2: %i' % (common1, common2) + \ ', common should be the same.') common1 = float(common1+common2) / 2.0 ##### This is just a fix ##### if (common1 == 0): return 0.0 # Compute number of transpositions - - - - - - - - - - - - - - - - - - - - - # transposition = 0 for i in range(len(ass1)): if (ass1[i] != ass2[i]): transposition += 1 transposition = transposition / 2.0 # Now compute how many characters are common at beginning - - - - - - - - - - # minlen = min(len1,len2) for same in range(minlen+1): if (str1[:same] != str2[:same]): break same -= 1 if (same > 4): same = 4 common1 = float(common1) w = 1./3.*(common1 / float(len1) + common1 / float(len2) + (common1-transposition) / common1) wn = w + same*0.1 * (1.0 - w) return wn

Read the article
Combine static files or load in parallel

- by Niall Collins

I am at present introducing code to my site to combine css and javascript files. Is there a way without having to include an external library to load javascript asynchronously or in parallel? I have read on some blogs that combining of files can be counter productive as the load of the http request can be large and its better to load multiple files in parallel. Opinions on this? I am caching my javascript/css. And would have thought it was better to combine rather than have multiple http requests.

Read the article
Is there a way to optimize this mysql query...?

- by SpikETidE

Hi Everyone... Say, I got these two tables.... Table 1 : Hotels hotel_id hotel_name 1 abc 2 xyz 3 efg Table 2 : Payments payment_id payment_date hotel_id total_amt comission p1 23-03-2010 1 100 10 p2 23-03-2010 2 50 5 p3 23-03-2010 2 200 25 p4 23-03-2010 1 40 2 Now, I need to get the following details from the two tables Given a particular date (say, 23-03-2010), the sum of the total_amt for each of the hotel for which a payment has been made on that particular date. All the rows that has the date 23-03-2010 ordered according to the hotel name A sample output is as follows... +------------+------------+------------+---------------+ | hotel_name | date | total_amt | commission | +------------+------------+------------+---------------+ | * abc | 23-03-2010 | 140 | 12 | +------------+------------+------------+---------------+ |+-----------+------------+------------+--------------+| || paymt_id | date | total_amt | commission || |+-----------+------------+------------+--------------+| || p1 | 23-03-2010 | 100 | 10 || |+-----------+------------+------------+--------------+| || p4 | 23-03-2010 | 40 | 2 || |+-----------+------------+------------+--------------+| +------------+------------+------------+---------------+ | * xyz | 23-03-2010 | 250 | 30 | +------------+------------+------------+---------------+ |+-----------+------------+------------+--------------+| || paymt_id | date | total_amt | commission || |+-----------+------------+------------+--------------+| || p2 | 23-03-2010 | 50 | 5 || |+-----------+------------+------------+--------------+| || p3 | 23-03-2010 | 200 | 25 || |+-----------+------------+------------+--------------+| +------------------------------------------------------+ Above the sample of the table that has to be printed... The idea is first to show the consolidated detail of each hotel, and when the '*' next to the hotel name is clicked the breakdown of the payment details will become visible... But that can be done by some jquery..!!! The table itself can be generated with php... Right now i am using two separate queries : One to get the sum of the amount and commission grouped by the hotel name. The next is to get the individual row for each entry having that date in the table. This is, of course, because grouping the records for calculating sum() returns only one row for each of the hotel with the sum of the amounts... Is there a way to combine these two queries into a single one and do the operation in a more optimized way...?? Hope i am being clear.. Thanks for your time and replies...

Read the article
Optimizing processing and management of large Java data arrays

- by mikera

I'm writing some pretty CPU-intensive, concurrent numerical code that will process large amounts of data stored in Java arrays (e.g. lots of double[100000]s). Some of the algorithms might run millions of times over several days so getting maximum steady-state performance is a high priority. In essence, each algorithm is a Java object that has an method API something like: public double[] runMyAlgorithm(double[] inputData); or alternatively a reference could be passed to the array to store the output data: public runMyAlgorithm(double[] inputData, double[] outputData); Given this requirement, I'm trying to determine the optimal strategy for allocating / managing array space. Frequently the algorithms will need large amounts of temporary storage space. They will also take large arrays as input and create large arrays as output. Among the options I am considering are: Always allocate new arrays as local variables whenever they are needed (e.g. new double[100000]). Probably the simplest approach, but will produce a lot of garbage. Pre-allocate temporary arrays and store them as final fields in the algorithm object - big downside would be that this would mean that only one thread could run the algorithm at any one time. Keep pre-allocated temporary arrays in ThreadLocal storage, so that a thread can use a fixed amount of temporary array space whenever it needs it. ThreadLocal would be required since multiple threads will be running the same algorithm simultaneously. Pass around lots of arrays as parameters (including the temporary arrays for the algorithm to use). Not good since it will make the algorithm API extremely ugly if the caller has to be responsible for providing temporary array space.... Allocate extremely large arrays (e.g. double[10000000]) but also provide the algorithm with offsets into the array so that different threads will use a different area of the array independently. Will obviously require some code to manage the offsets and allocation of the array ranges. Any thoughts on which approach would be best (and why)?

Read the article
Optimizing spacing of mesh containing a given set of points

- by Feynman

I tried to summarize the this as best as possible in the title. I am writing an initial value problem solver in the most general way possible. I start with an arbitrary number of initial values at arbitrary locations (inside a boundary.) The first part of my program creates a mesh/grid (I am not sure which is the correct nuance), with N points total, that contains all the initial values. My goal is to optimize the mesh such that the spacing is as uniform as possible. My solver seems to work half decently (it needs some more obscure debugging that is not relevant here.) I am starting with one dimension. I intend to generalize the algorithm to an arbitrary number of dimensions once I get it working consistently. I am writing my code in fortran, but feel free to reply with pseudocode or the language of your choice. Allow me to elaborate with an example: Say I am working on a closed interval [1,10] xmin=1 xmax=10 Say I have 3 initial points: xmin, 5 and xmax num_ivc=3 known(num_ivc)=[xmin,5,xmax] //my arrays start at 1. Assume "known" starts sorted I store my mesh/grid points in an array called coord. Say I want 10 points total in my mesh/grid. N=10 coord(10) Remember, all this is arbitrary--except the variable names of course. The algorithm should set coord to {1,2,3,4,5,6,7,8,9,10} Now for a less trivial example: num_ivc=3 known(num_ivc)=[xmin,5.5,xmax or just num_ivc=1 known(num_ivc)=[5.5] Now, would you have 5 evenly spaced points on the interval [1, 5.5] and 5 evenly spaced points on the interval (5.5, 10]? But there is more space between 1 and 5.5 than between 5.5 and 10. So would you have 6 points on [1, 5.5] followed by 4 on (5.5 to 10]. The key is to minimize the difference in spacing. I have been working on this for 2 days straight and I can assure you it is a lot trickier than it sounds. I have written code that only works if N is large only works if N is small only works if it the known points are close together only works if it the known points are far apart only works if at least one of the known points is near a boundary only works if none of the known points are near a boundary So as you can see, I have coded the gamut of almost-solutions. I cannot figure out a way to get it to perform equally well in all possible scenarios (that is, create the optimum spacing.)

Read the article
MySQL efficiency as it relates to the database/table size

- by mlissner

I'm building a system using django, Sphinx and MySQL that's very quickly becoming quite large. The database currently has about 2000 rows, and I've written a program that's going to populate it with another 40,000 rows in a couple days. Since the database is live right now, and since I've never had a database with this much information in it, I'm worried about some things: Is adding all these rows going to seriously degrade the efficiency of my django app? Will I need to go back through it and optimize all my database calls so they're doing things more cleverly? Or will this make the database slow all around to the extent that I can't do anything about it at all? If you scoff at my 40k rows, then, my next question is, at what point SHOULD I be concerned? I will likely be adding another couple hundred thousand soon, so I worry, and I fret. How is sphinx going to feel about all this? Is it going to freak out when it realizes it has to index all this data? Or will it be fine? Is this normal for it? If it is, at what point should I be concerned that it's too much data for Sphinx? Thanks for any thoughts.

Read the article
What is the best algorithm for this problem?

- by mark

What is the most efficient algorithm to solve the following problem? Given 6 arrays, D1,D2,D3,D4,D5 and D6 each containing 6 numbers like: D1[0] = number D2[0] = number ...... D6[0] = number D1[1] = another number D2[1] = another number .... ..... .... ...... .... D1[5] = yet another number .... ...... .... Given a second array ST1, containing 1 number: ST1[0] = 6 Given a third array ans, containing 6 numbers: ans[0] = 3, ans[1] = 4, ans[2] = 5, ......ans[5] = 8 Using as index for the arrays D1,D2,D3,D4,D5 and D6, the number that goes from 0, to the number stored in ST1[0] minus one, in this example 6, so from 0 to 6-1, compare each res array against each D array My algorithm so far is: I tried to keep everything unlooped as much as possible. EML := ST1[0] //number contained in ST1[0] EML1 := 0 //start index for the arrays D While EML1 < EML if D1[ELM1] = ans[0] goto two if D2[ELM1] = ans[0] goto two if D3[ELM1] = ans[0] goto two if D4[ELM1] = ans[0] goto two if D5[ELM1] = ans[0] goto two if D6[ELM1] = ans[0] goto two ELM1 = ELM1 + 1 return 0 //bad row of numbers, if while ends two: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[1] goto two if D2[ELM1] = ans[1] goto two if D3[ELM1] = ans[1] goto two if D4[ELM1] = ans[1] goto two if D5[ELM1] = ans[1] goto two if D6[ELM1] = ans[1] goto two ELM1 = ELM1 + 1 return 0 three: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[2] goto two if D2[ELM1] = ans[2] goto two if D3[ELM1] = ans[2] goto two if D4[ELM1] = ans[2] goto two if D5[ELM1] = ans[2] goto two if D6[ELM1] = ans[2] goto two ELM1 = ELM1 + 1 return 0 four: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[3] goto two if D2[ELM1] = ans[3] goto two if D3[ELM1] = ans[3] goto two if D4[ELM1] = ans[3] goto two if D5[ELM1] = ans[3] goto two if D6[ELM1] = ans[3] goto two ELM1 = ELM1 + 1 return 0 five: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[4] goto two if D2[ELM1] = ans[4] goto two if D3[ELM1] = ans[4] goto two if D4[ELM1] = ans[4] goto two if D5[ELM1] = ans[4] goto two if D6[ELM1] = ans[4] goto two ELM1 = ELM1 + 1 return 0 six: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[0] return 1 //good row of numbers if D2[ELM1] = ans[0] return 1 if D3[ELM1] = ans[0] return 1 if D4[ELM1] = ans[0] return 1 if D5[ELM1] = ans[0] return 1 if D6[ELM1] = ans[0] return 1 ELM1 = ELM1 + 1 return 0 As language of choice, it would be pure c

Read the article
Access cost of dynamically created objects with dynamically allocated members

- by user343547

I'm building an application which will have dynamic allocated objects of type A each with a dynamically allocated member (v) similar to the below class class A { int a; int b; int* v; }; where: The memory for v will be allocated in the constructor. v will be allocated once when an object of type A is created and will never need to be resized. The size of v will vary across all instances of A. The application will potentially have a huge number of such objects and mostly need to stream a large number of these objects through the CPU but only need to perform very simple computations on the members variables. Could having v dynamically allocated could mean that an instance of A and its member v are not located together in memory? What tools and techniques can be used to test if this fragmentation is a performance bottleneck? If such fragmentation is a performance issue, are there any techniques that could allow A and v to allocated in a continuous region of memory? Or are there any techniques to aid memory access such as pre-fetching scheme? for example get an object of type A operate on the other member variables whilst pre-fetching v. If the size of v or an acceptable maximum size could be known at compile time would replacing v with a fixed sized array like int v[max_length] lead to better performance? The target platforms are standard desktop machines with x86/AMD64 processors, Windows or Linux OSes and compiled using either GCC or MSVC compilers.

Read the article
optimal memory layout for read-only/write memory segments.

- by aaa

hello. Suppose I have two memory segments (equal size each, approximately 1kb in size) , one is read-only (after initialization), and other is read/write. what is the best layout in memory for such segments in terms of memory performance? one allocation, contiguous segments or two allocations (in general not contiguous). my primary architecture is linux Intel 64-bit. my feeling is former (cache friendlier) case is better. is there circumstances, where second layout is preferred? Thanks

Read the article
Grand Central Strategy for Opening Multiple Files

- by user276632

I have a working implementation using Grand Central dispatch queues that (1) opens a file and computes an OpenSSL DSA hash on "queue1", (2) writing out the hash to a new "side car" file for later verification on "queue2". I would like to open multiple files at the same time, but based on some logic that doesn't "choke" the OS by having 100s of files open and exceeding the hard drive's sustainable output. Photo browsing applications such as iPhoto or Aperture seem to open multiple files and display them, so I'm assuming this can be done. I'm assuming the biggest limitation will be disk I/O, as the application can (in theory) read and write multiple files simultaneously. Any suggestions? TIA

Read the article
Position of least significant bit that is set

- by peterchen

I am looking for an efficient way to determine the position of the least significant bit that is set in an integer, e.g. for 0x0FF0 it would be 4. A trivial implementation is this: unsigned GetLowestBitPos(unsigned value) { assert(value != 0); // handled separately unsigned pos = 0; while (!(value & 1)) { value >>= 1; ++pos; } return pos; } Any ideas how to squeeze some cycles out of it? (Note: this question is for people that enjoy such things, not for people to tell me xyzoptimization is evil.) [edit] Thanks everyone for the ideas! I've learnt a few other things, too. Cool!

Read the article
Function-Local Static Const variable Initialization semantics.

- by Hassan Syed

The questions are in bold, for those that cannot be bothered reading a question in depth. This is a followup to this question. It is to do with the initialization semantics of static variables in functions. Static variables should be initialized once, and their internal state might be altered later - as I (currently) do in the linked question. However, the code in question does not require the feature to change the state of the variable later. Let me clarrify my position, since I don't require the string object's internal state to change. The code is for a trait class for meta programming, and as such would would benifit from a const char * const ptr -- thus Ideally a local cost static const variable is needed. My educated guess is that in this case the string in question will be optimally placed in memory by the link-loader, and that the code is more secure and maps to the intended semantics. This leads to the semantics of such a variable "The C++ Programming language Third Edition -- Stroustrup" does not have anything (that I could find) to say about this matter. All that is said is that the variable is initialized once when the flow of control of the thread first reaches the code. This leads me to ponder if the following code would be sensible, and if not what are the intended semantics ?. #include <iostream> const char * const GetString(const char * x_in) { static const char * const x = x_in; return x; } int main() { const char * const temp = GetString("yahoo"); std::cout << temp << std::endl; const char * const temp2 = GetString("yahoo2"); std::cout << temp2 << std::endl; } The following compiles on GCC and prints "yahoo" twice. Which is what I want -- However it might not be standards compliant (which is why I post this question). It might be more elegant to have two functions, "SetString" and "String" where the latter forwards to the first. If it is standards compliant does someone know of a templates implementation in boost (or elsewhere) ?

Read the article
Multiple ParticleSystems in cocos2d

- by Mattias Akerman

I wonder about what road I should go with ParticleSystem. In this particular case I want to create 1-20 small explosions at the same time but with different positions. Right now I'm creating a new ParticleSystem for each explosion and then release it, but of course this is very punishing to the performance. My question is: Is there a way to create one ParticleSystem with multiple emitting sources. If not should I create an array of ParticleSystem in init and then use a free one when an explosion is needed? Or is there another approach I haven't thought of?

Read the article
Strange: Planner takes decision with lower cost, but (very) query long runtime

- by S38

Facts: PGSQL 8.4.2, Linux I make use of table inheritance Each Table contains 3 million rows Indexes on joining columns are set Table statistics (analyze, vacuum analyze) are up-to-date Only used table is "node" with varios partitioned sub-tables Recursive query (pg = 8.4) Now here is the explained query: WITH RECURSIVE rows AS ( SELECT * FROM ( SELECT r.id, r.set, r.parent, r.masterid FROM d_storage.node_dataset r WHERE masterid = 3533933 ) q UNION ALL SELECT * FROM ( SELECT c.id, c.set, c.parent, r.masterid FROM rows r JOIN a_storage.node c ON c.parent = r.id ) q ) SELECT r.masterid, r.id AS nodeid FROM rows r QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=2742105.92..2862119.94 rows=6000701 width=16) (actual time=0.033..172111.204 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..2742105.92 rows=6000701 width=28) (actual time=0.029..172111.183 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.025..0.027 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Hash Join (cost=0.33..262208.33 rows=600070 width=28) (actual time=40628.371..57370.361 rows=1 loops=3) Hash Cond: (c.parent = r.id) -> Append (cost=0.00..211202.04 rows=12001404 width=20) (actual time=0.011..46365.669 rows=12000004 loops=3) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.002..0.002 rows=0 loops=3) -> Seq Scan on node_dataset c (cost=0.00..55001.01 rows=3000001 width=20) (actual time=0.007..3426.593 rows=3000001 loops=3) -> Seq Scan on node_stammdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=0.008..9049.189 rows=3000001 loops=3) -> Seq Scan on node_stammdaten_adresse c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=3.455..8381.725 rows=3000001 loops=3) -> Seq Scan on node_testdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=1.810..5259.178 rows=3000001 loops=3) -> Hash (cost=0.20..0.20 rows=10 width=16) (actual time=0.010..0.010 rows=1 loops=3) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.002..0.004 rows=1 loops=3) Total runtime: 172111.371 ms (16 rows) (END) So far so bad, the planner decides to choose hash joins (good) but no indexes (bad). Now after doing the following: SET enable_hashjoins TO false; The explained query looks like that: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=15198247.00..15318261.02 rows=6000701 width=16) (actual time=0.038..49.221 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..15198247.00 rows=6000701 width=28) (actual time=0.032..49.201 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.028..0.031 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Nested Loop (cost=0.00..1507822.44 rows=600070 width=28) (actual time=10.384..16.382 rows=1 loops=3) Join Filter: (r.id = c.parent) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.001..0.003 rows=1 loops=3) -> Append (cost=0.00..113264.67 rows=3001404 width=20) (actual time=8.546..12.268 rows=1 loops=4) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.001..0.001 rows=0 loops=4) -> Bitmap Heap Scan on node_dataset c (cost=58213.87..113214.88 rows=3000001 width=20) (actual time=1.906..1.906 rows=0 loops=4) Recheck Cond: (c.parent = r.id) -> Bitmap Index Scan on node_dataset_parent (cost=0.00..57463.87 rows=3000001 width=0) (actual time=1.903..1.903 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_parent on node_stammdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=3.272..3.273 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_adresse_parent on node_stammdaten_adresse c (cost=0.00..8.60 rows=1 width=20) (actual time=4.333..4.333 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_testdaten_parent on node_testdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=2.745..2.746 rows=0 loops=4) Index Cond: (c.parent = r.id) Total runtime: 49.349 ms (21 rows) (END) - incredibly faster, because indexes were used. Notice: Cost of the second query ist somewhat higher than for the first query. So the main question is: Why does the planner make the first decision, instead of the second? Also interesing: Via SET enable_seqscan TO false; i temp. disabled seq scans. Than the planner used indexes and hash joins, and the query still was slow. So the problem seems to be the hash join. Maybe someone can help in this confusing situation? thx, R.

Read the article
Custom View - Avoid redrawing when non-interactive

- by MasterGaurav

I have a complex custom view - photo collage. What is observed is whenever any UI interaction happens, the view is redrawn. How can I avoid complete redrawing (for example, use a cached UI) of the view specially when I click the "back" button to go back to previous activity because that also causes redrawing of the view. While exploring the API and web, I found a method - getDrawingCache() - but don't know how to use it effectively. How do I use it effectively? I've had other issues with Custom Views that I outline here.

Read the article
Any difference between lazy loading Javascript files vs. placing just before </body>

- by mhr

Looked around, couldn't find this specific question discussed. Pretty sure the difference is negligible, just curious as to your thoughts. Scenario: All Javascript that doesn't need to be loaded before page render has been placed just before the closing </body> tag. Are there any benefits or detriments to lazy loading these instead through some Javascript code in the head that executes when the DOM load/ready event is fired? Let's say that this only concerns downloading one entire .js file full of functions and not lazy loading several individual files as needed upon usage. Hope that's clear, thanks.

Read the article
Improving the speed of php

- by cast01

I'm currently working on a website in PHP, and I'm wondering what the best practices/methods are to reduce the time requests take. I've build the site in a modular way, so a page would consist of a number of modules, and each of these would need to request information. For example, I have a cart module, that (if a cart is set) will fetch the cart with the id (stored in a session variable) from the database and return its contents. I have another module that lists categories and this needs to fetch the categories from the database. My system is built with models, and each model might also make a request, for example a category model will make a request to get products in that category.

Read the article
Whats faster in Javascript a bunch of small setInterval loops, or one big one?

- by RobertWHurst

Just wondering if its worth it to make a monolithic loop function or just add loops were they're needed. The big loop option would just be a loop of callbacks that are added dynamically with an add function. adding a function would look like this setLoop(function(){ alert('hahaha! I\'m a really annoying loop that bugs you every tenth of a second'); }); setLoop would add the function to the monolithic loop. so is the is worth anything in performance or should I just stick to lots of little loops using setInterval?

Read the article

< Previous Page | 50 51 52 53 54 55 56 57 58 59 60 61 | Next Page >