Search Results

Search found 3438 results on 138 pages for 'nonlinear optimization'.

Page 70/138 | < Previous Page | 66 67 68 69 70 71 72 73 74 75 76 77  | Next Page >

  • Optimizing near-duplicate value search

    - by GApple
    I'm trying to find near duplicate values in a set of fields in order to allow an administrator to clean them up. There are two criteria that I am matching on One string is wholly contained within the other, and is at least 1/4 of its length The strings have an edit distance less than 5% of the total length of the two strings The Pseudo-PHP code: foreach($values as $value){ foreach($values as $match){ if( ( $value['length'] < $match['length'] && $value['length'] * 4 > $match['length'] && stripos($match['value'], $value['value']) !== false ) || ( $match['length'] < $value['length'] && $match['length'] * 4 > $value['length'] && stripos($value['value'], $match['value']) !== false ) || ( abs($value['length'] - $match['length']) * 20 < ($value['length'] + $match['length']) && 0 < ($match['changes'] = levenshtein($value['value'], $match['value'])) && $match['changes'] * 20 <= ($value['length'] + $match['length']) ) ){ $matches[] = &$match; } } } I've tried to reduce calls to the comparatively expensive stripos and levenshtein functions where possible, which has reduced the execution time quite a bit. However, as an O(n^2) operation this just doesn't scale to the larger sets of values and it seems that a significant amount of the processing time is spent simply iterating through the arrays. Some properties of a few sets of values being operated on Total | Strings | # of matches per string | | Strings | With Matches | Average | Median | Max | Time (s) | --------+--------------+---------+--------+------+----------+ 844 | 413 | 1.8 | 1 | 58 | 140 | 593 | 156 | 1.2 | 1 | 5 | 62 | 272 | 168 | 3.2 | 2 | 26 | 10 | 157 | 47 | 1.5 | 1 | 4 | 3.2 | 106 | 48 | 1.8 | 1 | 8 | 1.3 | 62 | 47 | 2.9 | 2 | 16 | 0.4 | Are there any other things I can do to reduce the time to check criteria, and more importantly are there any ways for me to reduce the number of criteria checks required (for example, by pre-processing the input values), since there is such low selectivity?

    Read the article

  • How to optimize this script

    - by marks34
    I have written the following script. It opens a file, reads each line from it splitting by new line character and deleting first character in line. If line exists it's being added to array. Next each element of array is splitted by whitespace, sorted alphabetically and joined again. Every line is printed because script is fired from console and writes everything to file using standard output. I'd like to optimize this code to be more pythonic. Any ideas ? import sys def main(): filename = sys.argv[1] file = open(filename) arr = [] for line in file: line = line[1:].replace("\n", "") if line: arr.append(line) for line in arr: lines = line.split(" ") lines.sort(key=str.lower) line = ''.join(lines) print line if __name__ == '__main__': main()

    Read the article

  • Graph search problem with route restrictions

    - by Darcara
    I want to calculate the most profitable route and I think this is a type of traveling salesman problem. I have a set of nodes that I can visit and a function to calculate cost for traveling between nodes and points for reaching the nodes. The goal is to reach a fixed known score while minimizing the cost. This cost and rewards are not fixed and depend on the nodes visited before. The starting node is fixed. There are some restrictions on how nodes can be visited. Some simplified examples include: Node B can only be visited after A After node C has been visited, D or E can be visited. Visiting at least one is required, visiting both is permissible. Z can only be visited after at least 5 other nodes have been visited Once 50 nodes have been visited, the nodes A-M will no longer reward points Certain nodes can (and probably must) be visited multiple times Currently I can think of only two ways to solve this: a) Genetic Algorithms, with the fitness function calculating the cost/benefit of the generated route b) Dijkstra search through the graph, since the starting node is fixed, although the large number of nodes will probably make that not feasible memory wise. Are there any other ways to determine the best route through the graph? It doesn't need to be perfect, an approximated path is perfectly fine, as long as it's error acceptable. Would TSP-solvers be an option here?

    Read the article

  • PHP shell_exec() - Run directly, or perform a cron (bash/php) and include MySQL layer?

    - by Jimbo
    Sorry if the title is vague - I wasn't quite sure how to word it! What I'm Doing I'm running a Linux command to output data into a variable, parse the data, and output it as an array. Array values will be displayed on a page using PHP, and this PHP page output is requested via AJAX every 10 seconds so, in effect, the data will be retrieved and displayed/updated every 10 seconds. There could be as many as 10,000 characters being parsed on every request, although this is usually much lower. Alternative Idea I want to know if there is a better* alternative method of retrieving this data every 10 seconds, as multiple users (<10) will be having this command executed automatically for them. A cronjob running on the server could execute either bash or php (which is faster?) to grab the data and store it in a MySQL database. Then, any AJAX calls to the PHP output would return values in the MySQL database rather than making a direct call to execute server code every 10 seconds. Why? I know there are security concerns with running execs directly from PHP, and (I hope this isn't micro-optimisation) I'm worried about CPU usage on the server. The server is running a sempron processor. Yes, they do still exist. Having this only execute when the user is on the page (idea #1) means that the server isn't running code that doesn't need to be run. However, is this slow and insecure? Just in case the type of linux command may be of assistance in determining it's efficiency: shell_exec("transmission-remote $host:$port --auth $username:$password -l"); I'm hoping that there are differences in efficiency and level of security with the two methods I have outlined above, and that this isn't just micro-micro-optimisation. If there are alternative methods that are better*, I'd love to learn about these! :)

    Read the article

  • SQL -- How is DISTINCT so fast without an index?

    - by Jonathan
    Hi, I have a database with a table called 'links' with 600 million rows in it in SQLite. There are 2 columns in the database - a "src" column and a "dest" column. At present there are no indices. There are a fair number of common values between src and dest, but also a fair number of duplicated rows. The first thing I'm trying to do is remove all the duplicate rows, and then perform some additional processing on the results, however I've been encountering some weird issues. Firstly, SELECT * FROM links WHERE src=434923 AND dest=5010182. Now this returns one result fairly quickly and then takes quite a long time to run as I assume it's performing a tablescan on the rest of the 600m rows. However, if I do SELECT DISTINCT * FROM links, then it immediately starts returning rows really quickly. The question is: how is this possible?? Surely for each row, the row must be compared against all of the other rows in the table, but this would require a tablescan of the remaining rows in the table which SHOULD takes ages! Any ideas why SELECT DISTINCT is so much quicker than a standard SELECT?

    Read the article

  • Help in optimizing a for loop in matlab

    - by HH
    I have a 1 by N double array consisting of 1 and 0. I would like to map all the 1 to symbol '-3' and '3' and all the 0 to symbol '-1' and '1' equally. Below is my code. As my array is approx 1 by 8 million, it is taking a very long time. How to speed things up? [row,ll] = size(Data); sym_zero = -1; sym_one = -3; for loop = 1 : row if Data(loop,1) == 0 Data2(loop,1) = sym_zero; if sym_zero == -1 sym_zero = 1; else sym_zero = -1; end else Data2(loop,1) = sym_one; if sym_one == -3 sym_zero = 3; else sym_zero = -3; end end end

    Read the article

  • C#, Using Custom Generic Collection faster with objects than List

    - by Kaminari
    Hello, I'm using for now List< to iterate through some object collection and find matching element, The problem is that object has only 2 significant values Name and Link (strings) but has some other values wich I dont want to compare. I'm thinkig about using something like HashSet (wich is exactly what I'm searching for - fast) from .NET 3.5 but target framework has to be 2.0. There is something called Power Collections here: http://powercollections.codeplex.com/ But maybe there is other way? If not, can you suggest me a suitable custom collection?

    Read the article

  • optimizing oracle query

    - by deming
    I'm having a hard time wrapping my head around this query. it is taking almost 200+ seconds to execute. I've pasted the execution plan as well. SELECT user_id , ROLE_ID , effective_from_date , effective_to_date , participant_code , ACTIVE FROM CMP_USER_ROLE E WHERE ACTIVE = 0 AND (SYSDATE BETWEEN effective_from_date AND effective_to_date OR TO_CHAR(effective_to_date,'YYYY-Q') = '2010-2') AND participant_code = 'NY005' AND NOT EXISTS ( SELECT 1 FROM CMP_USER_ROLE r WHERE r.USER_ID= E.USER_ID AND r.role_id = E.role_id AND r.ACTIVE = 4 AND E.effective_to_date <= (SELECT MAX(last_update_date) FROM CMP_USER_ROLE S WHERE S.role_id = r.role_id AND S.role_id = r.role_id AND S.ACTIVE = 4 )) Explain plan ----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 37 | 154 (2)| 00:00:02 | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID | USER_ROLE | 1 | 37 | 30 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | N_USER_ROLE_IDX6 | 27 | | 3 (0)| 00:00:01 | |* 4 | FILTER | | | | | | | 5 | HASH GROUP BY | | 1 | 47 | 124 (2)| 00:00:02 | |* 6 | TABLE ACCESS BY INDEX ROWID | USER_ROLE | 159 | 3339 | 119 (1)| 00:00:02 | | 7 | NESTED LOOPS | | 11 | 517 | 123 (1)| 00:00:02 | |* 8 | TABLE ACCESS BY INDEX ROWID| USER_ROLE | 1 | 26 | 4 (0)| 00:00:01 | |* 9 | INDEX RANGE SCAN | N_USER_ROLE_IDX5 | 1 | | 3 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | N_USER_ROLE_IDX2 | 957 | | 74 (2)| 00:00:01 | -----------------------------------------------------------------------------------------------------

    Read the article

  • Does a c/c++ compiler optimize constant divisions by power-of-two value into shifts?

    - by porgarmingduod
    Question says it all. Does anyone know if the following... size_t div(size_t value) { const size_t x = 64; return value / x; } ...is optimized into? size_t div(size_t value) { return value >> 6; } Do compilers do this? (My interest lies in GCC). Are there situations where it does and others where it doesn't? I would really like to know, because every time I write a division that could be optimized like this I spend some mental energy wondering about whether precious nothings of a second is wasted doing a division where a shift would suffice.

    Read the article

  • Is there a way to optimise finding text items on a page (not regex)

    - by Jeepstone
    After seeing several threads rubbishing the regexp method of finding a term to match within an HTML document, I've used the Simple HTML DOM PHP parser (http://simplehtmldom.sourceforge.net/) to get the bits of text I'm after, but I want to know if my code is optimal. It feels like I'm looping too many times. Is there a way to optimise the following loop? //Get the HTML and look at the text nodes $html = str_get_html($buffer); //First we match the <body> tag as we don't want to change the <head> items foreach($html->find('body') as $body) { //Then we get the text nodes, rather than any HTML foreach($body->find('text') as $text) { //Then we match each term foreach ($terms as $term) { //Match to the terms within the text nodes $text->outertext = str_replace($term, '<span class="highlight">'.$term.'</span>', $text->outertext); } } } For example, would it make a difference to determine check if I have any matches before I start the loop maybe?

    Read the article

  • WPF Logical Tree - bottom up vs. top down

    - by Dor Rotman
    Hello, I've read the MSDN article about the layouts pass, that states: When a node is added or removed from the logical tree, property invalidations are raised on the node's parent and all its children. As a result, a top-down construction pattern should always be followed to avoid the cost of unnecessary invalidations on nodes that have already been validated. Now lets assume I do this. Won't the users see the control tree populate itself and the layout change several times during the control creation process? I want the whole control tree to just appear completely full. Thanks!

    Read the article

  • Best way to reverse a string in C# 2.0

    - by Guy
    I've just had to write a string reverse function in C# 2.0 (i.e. LINQ not available) and came up with this: public string Reverse(string text) { char[] cArray = text.ToCharArray(); string reverse = String.Empty; for (int i = cArray.Length - 1; i > -1; i--) { reverse += cArray[i]; } return reverse; } Personally I'm not crazy about the function and am convinced that there's a better way to do it. Is there?

    Read the article

  • [C++] Is it possible to roll a significantly faster version of sqrt

    - by John
    In an app I'm profiling, I found that in some scenarios this functions are able to take over 10% of total execution time. I've seen discussion over the years of faster sqrt implementations using sneaky floating-point trickery, but I don't know if such things are outdated on modern CPUs. MSVC++ 2008 compiler is being used, for reference... though I'd assume sqrt is not going to add much overhead though.

    Read the article

  • serving cached files based upon cookie?

    - by matthewsteiner
    So I realized something today. In my application, you really can't get anywhere (except the front page) unless you're logged in. And you can't be logged in without a cookie. So my front page could be cached, except the problem is if you are logged in (have a cookie set) then it should just redirect into the application. Is there a way for nginx to look for a cookie and if it finds it then deliver a cached file? Just an idea...

    Read the article

  • Storing information in the DOM?

    - by John
    Im making a small private message application in the form of a phone. Ten messages are shown at the time. And the list of messages are scrolled up/down by hiding them. Just how bad is it to use the DOM to store information in this way. My main goal for doing this is to reduce calls to the database. And instead of making a new call all the time, it only checks if any new messages has arrived and ads the new message(s). Whats the alternative, cookies anyone? Thank you for the time

    Read the article

  • Getting confused why i dont get expected amount ?

    - by Stackfan
    I have 1 result and which i will receive in Bank account, Based on that account i have to Put a balance to user account. How can you find the Handling cost from total tried 491.50 / 0.95 = 517.36 which is wrong ? It should be 500.00 (to my expectation) User balance requires 500.00 When 500.00 selected he gets 5% discount There is a handling cost for this ex: 1) Discount: 500.00 - 5% = 475.00 2) Handling cost: (475.00 x 0.034) + 0.35 = 16.50 3) Total: 475.00 + 16.50 = 491.50 So problem is from 491.50, i have to find atleast handling cost to get promised Balance. Any solution ? Cant figure it out myself...

    Read the article

  • Nested loop traversing arrays

    - by alecco
    There are 2 very big series of elements, the second 100 times bigger than the first. For each element of the first series, there are 0 or more elements on the second series. This can be traversed and processed with 2 nested loops. But the unpredictability of the amount of matching elements for each member of the first array makes things very, very slow. The actual processing of the 2nd series of elements involves logical and (&) and a population count. I couldn't find good optimizations using C but I am considering doing inline asm, doing rep* mov* or similar for each element of the first series and then doing the batch processing of the matching bytes of the second series, perhaps in buffers of 1MB or something. But the code would be get quite messy. Does anybody know of a better way? C preferred but x86 ASM OK too. Many thanks! Sample/demo code with simplified problem, first series are "people" and second series are "events", for clarity's sake. (the original problem is actually 100m and 10,000m entries!) #include <stdio.h> #include <stdint.h> #define PEOPLE 1000000 // 1m struct Person { uint8_t age; // Filtering condition uint8_t cnt; // Number of events for this person in E } P[PEOPLE]; // Each has 0 or more bytes with bit flags #define EVENTS 100000000 // 100m uint8_t P1[EVENTS]; // Property 1 flags uint8_t P2[EVENTS]; // Property 2 flags void init_arrays() { for (int i = 0; i < PEOPLE; i++) { // just some stuff P[i].age = i & 0x07; P[i].cnt = i % 220; // assert( sum < EVENTS ); } for (int i = 0; i < EVENTS; i++) { P1[i] = i % 7; // just some stuff P2[i] = i % 9; // just some other stuff } } int main(int argc, char *argv[]) { uint64_t sum = 0, fcur = 0; int age_filter = 7; // just some init_arrays(); // Init P, P1, P2 for (int64_t p = 0; p < PEOPLE ; p++) if (P[p].age < age_filter) for (int64_t e = 0; e < P[p].cnt ; e++, fcur++) sum += __builtin_popcount( P1[fcur] & P2[fcur] ); else fcur += P[p].cnt; // skip this person's events printf("(dummy %ld %ld)\n", sum, fcur ); return 0; } gcc -O5 -march=native -std=c99 test.c -o test

    Read the article

  • Java API Method Run Times

    - by Mike
    Is there a good resource to get run times for standard API functions? It's somewhat confusing when trying to optimize your program. I know Java isn't made to be particularly speedy but I can't seem to find much info on this at all. Example Problem: If I am looking for a certain token in a file is it faster to scan each line using string.contains(...) or to bring in say 100 or so lines putting them to a local string them performing contains on that chunk.

    Read the article

  • What is microbenchmarking?

    - by polygenelubricants
    I've heard this term used, but I'm not entirely sure what it means, so: What DOES it mean and what DOESN'T it mean? What are some examples of what IS and ISN'T microbenchmarking? What are the dangers of microbenchmarking and how do you avoid it? (or is it a good thing?)

    Read the article

  • Optimal (Time paradigm) solution to check variable within boundary

    - by kumar_m_kiran
    Hi All, Sorry if the question is very naive. I will have to check the below condition in my code 0 < x < y i.e code similar to if(x > 0 && x < y) The basic problem at system level is - currently, for every call (Telecom domain terminology), my existing code is hit (many times). So performance is very very critical, Now, I need to add a check for boundary checking (at many location - but different boundary comparison at each location). At very normal level of coding, the above comparison would look very naive without any issue. However, when added over my statistics module (which is dipped many times), performance will go down. So I would like to know the best possible way to handle the above scenario (kind of optimal way for limits checking technique). Like for example, if bit comparison works better than normal comparison or can both the comparison be evaluation in shorter time span? Other Info x is unsigned integer (which must be checked to be greater than 0 and less than y). y is unsigned integer. y is a non-const and varies for every comparison. Here time is the constraint compared to space. Language - C++. Now, later if I need to change the attribute of y to a float/double, would there be another way to optimize the check (i.e will the suggested optimal technique for integer become non-optimal solution when y is changed to float/double). Thanks in advance for any input. PS : OS used is SUSE 10 64 bit x64_64, AIX 5.3 64 bit, HP UX 11.1 A 64.

    Read the article

  • Calling a method on an object a bunch of times versus constructing an object a bunch of times

    - by Ami
    I have a List called myData and I want to apply a particular method (someFunction) to every element in the List. Is calling a method through an object's constructor slower than calling the same method many times for one particular object instantiation? In other words, is this: for(int i = 0; i < myData.Count; i++) myClass someObject = new myClass(myData[i]); slower than this: myClass someObject = new myClass(); for(int i = 0; i < myData.Count; i++) someObject.someFunction(myData[i]); ? If so, how much slower?

    Read the article

< Previous Page | 66 67 68 69 70 71 72 73 74 75 76 77  | Next Page >