Search Results

Search found 21107 results on 845 pages for 'size optimization'.


  • Word frequency tally script is too slow

    - by Dave Jarvis
    Background: I created a script to count the frequency of words in a plain text file. The script performs the following steps:
      1. Count the frequency of words from a corpus.
      2. Retain each word in the corpus that is found in a dictionary.
      3. Create a comma-separated file of the frequencies.
    The script is at: http://pastebin.com/VAZdeKXs
    Problem: The following lines continually cycle through the dictionary to match words:
      for i in $(awk '{if( $2 ) print $2}' frequency.txt); do
        grep -m 1 ^$i\$ dictionary.txt >> corpus-lexicon.txt;
      done
    It works, but it is slow: to drop the words that are not in the dictionary, it scans the dictionary from the top for every single word it found. (The -m 1 parameter stops the scan once a match is found.)
    Question: How would you optimize the script so that the dictionary is not scanned from start to finish for every single word? The majority of the words will not be in the dictionary. Thank you!
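
    One way to avoid the per-word rescans (a sketch, not from the original post) is to load the dictionary into an awk hash once and test each word with an O(1) lookup:

      # First pass reads dictionary.txt into a hash; second pass keeps
      # only the frequency.txt words that appear in it.
      awk 'NR == FNR { dict[$1]; next }
           $2 in dict { print $2 }' dictionary.txt frequency.txt > corpus-lexicon.txt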


  • Specific template for the first element.

    - by Kalinin
    I have a template:
      <xsl:template match="paragraph"> ... </xsl:template>
    I call it:
      <xsl:apply-templates select="paragraph"/>
    For the first element I need to do:
      <xsl:template match="paragraph[1]">
        ...
        <xsl:apply-templates select="."/><!-- I understand that this does not work -->
        ...
      </xsl:template>
    How do I call <xsl:apply-templates select="paragraph"/> (for the first paragraph element) from the template <xsl:template match="paragraph[1]">? So far the best I have is something like a loop. I solve the problem this way (but I do not like it):
      <xsl:for-each select="paragraph">
        <xsl:choose>
          <xsl:when test="position() = 1">
            ...
            <xsl:apply-templates select="."/>
            ...
          </xsl:when>
          <xsl:otherwise>
            <xsl:apply-templates select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
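
    One common workaround (a sketch, not part of the original question) is to put the shared rendering in a separate mode, so the paragraph[1] template can delegate to it without re-matching itself:

      <!-- Shared rendering for every paragraph. -->
      <xsl:template match="paragraph" mode="body">
        ...
      </xsl:template>

      <!-- First paragraph: extra output, then delegate to the shared mode. -->
      <xsl:template match="paragraph[1]">
        ...
        <xsl:apply-templates select="." mode="body"/>
        ...
      </xsl:template>

      <!-- Remaining paragraphs just delegate. -->
      <xsl:template match="paragraph">
        <xsl:apply-templates select="." mode="body"/>
      </xsl:template>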


  • MySQL, delete and index hint

    - by Manuel Darveau
    I have to delete about 10K rows from a table that has more than 100 million rows, based on some criteria. When I execute the query, it takes about 5 minutes. I ran an explain plan (the delete query converted to select *, since MySQL does not support explain delete) and found that MySQL uses the wrong index. My question is: is there any way to tell MySQL which index to use during a delete? If not, what can I do? Select into a temp table, then delete via the temp table? Thank you!
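
    For reference, a sketch of the temp-table route (big_table, idx_criteria, id, and created_at are placeholders, not from the original question): materialize the matching primary keys with an explicit index hint on the SELECT, then delete by primary key with a multi-table DELETE.

      CREATE TEMPORARY TABLE doomed AS
        SELECT id FROM big_table FORCE INDEX (idx_criteria)
        WHERE created_at < '2010-01-01';
      DELETE big_table FROM big_table JOIN doomed USING (id);
      DROP TEMPORARY TABLE doomed;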


  • C++ performance, optimizing compiler, empty function in .cpp

    - by Dodo
    I have a very basic class, name it Basic, used in nearly all other files in a bigger project. In some cases there needs to be debug output, but in release mode this should not be enabled and should be a NOOP. Currently there is a define in the header which switches a macro on or off depending on the setting, so it is definitely a NOOP when switched off. I'm wondering whether, with the following code, a compiler (MSVS / gcc) is able to optimize out the function call, so that it is again a NOOP. (That way the switch could live in the .cpp, and switching would be much faster, compile/link time wise.)
      -- Header --
      void printDebug(const Basic* p);

      class Basic {
          Basic() {
              simpleSetupCode;
              // this should be a NOOP in release,
              // but the constructor could be inlined
              printDebug(this);
          }
      };

      -- Source --
      // PRINT_DEBUG defined somewhere else or here
      #if PRINT_DEBUG
      void printDebug(const Basic* p) {
          // Lengthy debug print
      }
      #else
      void printDebug(const Basic* p) {}
      #endif
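
    Without help, compilers generally cannot see across translation units, so the call to the empty body survives. One way to get the cross-TU inlining (an aside with the usual flags, not from the original question) is link-time optimization:

      # GCC: compile and link with LTO so the empty body can be folded away
      g++ -O2 -flto basic.cpp main.cpp
      # MSVC: whole-program optimization plus link-time code generation
      cl /O2 /GL basic.cpp main.cpp /link /LTCG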


  • Optimizing sparse dot-product in C#

    - by Haggai
    Hello. I'm trying to calculate the dot-product of two very sparse associative arrays. The arrays contain an ID and a value, so the calculation should be done only on those IDs that are common to both arrays, e.g.
      <(1, 0.5), (3, 0.7), (12, 1.3)> * <(2, 0.4), (3, 2.3), (12, 4.7)> = 0.7*2.3 + 1.3*4.7
    My implementation (call it "dict") currently uses Dictionaries, but it is too slow for my taste.
      double dot_product(IDictionary<int, double> arr1, IDictionary<int, double> arr2)
      {
          double res = 0;
          double val2;
          foreach (KeyValuePair<int, double> p in arr1)
              if (arr2.TryGetValue(p.Key, out val2))
                  res += p.Value * val2;
          return res;
      }
    The full arrays have about 500,000 entries each, while the sparse ones have only tens to hundreds of entries each. I did some experiments with toy versions of dot products. First I tried to multiply just two double arrays to see the ultimate speed I could get (let's call this "flat"). Then I tried changing the implementation of the associative-array multiplication to use an int[] ID array and a double[] values array, walking both ID arrays together and multiplying when the IDs are equal (let's call this "double"). I then ran all three versions in debug and release, with F5 or Ctrl-F5. The results are as follows:
      debug   F5:  dict: 5.29s  double: 4.18s (79% of dict)  flat: 0.99s (19% of dict, 24% of double)
      debug  ^F5:  dict: 5.23s  double: 4.19s (80% of dict)  flat: 0.98s (19% of dict, 23% of double)
      release F5:  dict: 5.29s  double: 3.08s (58% of dict)  flat: 0.81s (15% of dict, 26% of double)
      release ^F5: dict: 4.62s  double: 1.22s (26% of dict)  flat: 0.29s ( 6% of dict, 24% of double)
    I don't understand these results. Why isn't the dictionary version optimized in release F5 as the double and flat versions are? Why is it only slightly optimized in the release ^F5 run, while the other two are heavily optimized? Also, since converting my code to the "double" scheme would mean lots of work - do you have any suggestions for optimizing the dictionary version? Thanks! Haggai
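
    For reference, a sketch of the "double" scheme described above (assuming both ID arrays are kept sorted), which walks the two arrays in step:

      // Merge-style sparse dot product: advance whichever index lags.
      static double DotProduct(int[] ids1, double[] vals1,
                               int[] ids2, double[] vals2)
      {
          double res = 0;
          int i = 0, j = 0;
          while (i < ids1.Length && j < ids2.Length)
          {
              if (ids1[i] == ids2[j]) { res += vals1[i] * vals2[j]; i++; j++; }
              else if (ids1[i] < ids2[j]) i++;
              else j++;
          }
          return res;
      }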


  • negative values in integer programming model

    - by Lucia
    I'm new at using the glpk tool, and after writing a model for a certain integer problem and running the solver (glpsol) I get negative values in some constraint rows that shouldn't be negative at all:
      No.  Row name  Activity  Lower bound  Upper bound
        8  act[1]           0                        -0
        9  act[2]          -3                        -0
       10  act[2]          -2                        -0
    That constraint is defined like this:
      act{j in J}: sum{i in I} d[i,j] <= y[j]*m;
    where the sets and variables used are like this:
      param m, integer, > 0;
      param n, integer, > 0;
      set I := 1..m;
      set J := 1..n;
      var y{j in J}, binary;
    As the upper bound is negative, I think the problem may be in the y[j]*m part on the right side of the inequality... perhaps something with the multiplication of binaries? Or is the j on that side of the constraint undefined? I don't know... I would be greatly grateful if someone can help me with this! :) And excuse my bad English. Thanks in advance!
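
    One detail that may explain the display (an observation about glpsol's report, not from the original post): because the right-hand side contains a variable, the solver normalizes the row by moving y[j]*m to the left, so the row it actually solves and reports is

      /* all variable terms on the left, the constant as the bound */
      act{j in J}: sum{i in I} d[i,j] - m*y[j] <= 0;

    which makes a negative Activity (and a "-0" bound) unremarkable whenever y[j] = 1.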


  • Why isn't the copy constructor elided here?

    - by Jesse Beder
    (I'm using gcc with -O2.) This seems like a straightforward opportunity to elide the copy constructor, since there are no side-effects to accessing the value of a field in a bar's copy of a foo; but the copy constructor is called, since I get the output "meep meep!".
      #include <iostream>

      struct foo {
          foo(): a(5) { }
          foo(const foo& f): a(f.a) { std::cout << "meep meep!\n"; }
          int a;
      };

      struct bar {
          foo F() const { return f; }
          foo f;
      };

      int main() {
          bar b;
          int a = b.F().a;
          return 0;
      }
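
    A note on why (reasoning, not from the original post): return-value optimization applies to temporaries and locals that are about to die, but F() returns a copy of a member that lives on after the call, so that copy cannot be constructed in place and cannot legally disappear. If callers only need read access, returning a reference sidesteps the copy entirely - a sketch:

      struct bar {
          const foo& F() const { return f; }   // no temporary, no copy
          foo f;
      };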


  • Tree iterator, can you optimize this any further?

    - by Ron
    As a follow-up to my original question about a small piece of this code, I decided to ask a follow-up to see if you can do better than what we came up with so far. The code below iterates over a binary tree (left/right = child/next). I do believe there is room for one less conditional in here (the down boolean). The fastest answer wins!
      - The cnt statement can be multiple statements, so let's make sure it appears only once.
      - The child() and next() member functions are about 30x as slow as the hasChild() and hasNext() operations.
      - Keep it iterative <-- dropped this requirement, as the recursive solution presented was faster.
      - This is C++ code.
      - The visit order of the nodes must stay as it is in the example below (hit parents first, then the children, then the 'next' nodes).
      - BaseNodePtr is a boost::shared_ptr, thus assignments are slow; avoid any temporary BaseNodePtr variables.
    Currently this code takes 5897 ms to visit 62,200,000 nodes in a test tree, calling this function 200,000 times.
      void processTree (BaseNodePtr current, unsigned int & cnt )
      {
          bool down = true;
          while ( true )
          {
              if ( down )
              {
                  while (true)
                  {
                      cnt++; // this can/will be multiple statements
                      if (!current->hasChild()) break;
                      current = current->child();
                  }
              }
              if ( current->hasNext() )
              {
                  down = true;
                  current = current->next();
              }
              else
              {
                  down = false;
                  current = current->parent();
                  if (!current) return; // done.
              }
          }
      }


  • Sorting a list of numbers with modified cost

    - by David
    First, this was one of the four problems we had to solve in a project last year, and I couldn't find a suitable algorithm, so we handed in a brute-force solution.
    Problem: The numbers are in a list that is not sorted and supports only one type of operation. The operation is defined as follows: Given a position i and a position j, the operation moves the number at position i to position j without altering the relative order of the other numbers. If i > j, the positions of the numbers between positions j and i-1 increase by 1; otherwise, if i < j, the positions of the numbers between positions i+1 and j decrease by 1. The operation requires i steps to find the number to move and j steps to locate the position to move it to, so moving a number from position i to position j costs i+j steps. We need to design an algorithm that, given a list of numbers, determines the optimal (in terms of cost) sequence of moves to rearrange the sequence.
    Attempts: Part of our investigation was around NP-completeness. We made it a decision problem and tried to find a suitable transformation to any of the problems listed in Garey and Johnson's book Computers and Intractability, with no results. There is also no direct reference (from our point of view) to this kind of variation in Donald E. Knuth's The Art of Computer Programming, Vol. 3: Sorting and Searching. We also analyzed algorithms for sorting linked lists, but none of them gives a good idea for finding the optimal sequence of movements. Note that the idea is not to find an algorithm that orders the sequence, but one that tells me the optimal sequence of movements, in terms of cost, that organizes the sequence. You can make a copy and sort it to analyze the final positions of the elements if you want; in fact, we may assume that the list contains the numbers from 1 to n, so we know where we want to put each number - we are just concerned with minimizing the total cost of the steps. We tested several greedy approaches, but all of them failed. Divide-and-conquer sorting algorithms can't be used, because they swap portions of the list with no cost, and our dynamic programming approaches had to consider too many cases. The brute-force recursive algorithm takes all the possible movements from i to j, and then again all the possible movements of the rest of the elements; at the end it returns the sequence with the least total cost that sorted the list. As you can imagine, the cost of this algorithm is brutal and makes it impracticable for more than 8 elements.
    Our observations:
      - n movements is not necessarily cheaper than n+1 movements (unlike swaps in arrays, which are O(1)).
      - There are basically two ways of moving an element from position i to position j: move it directly, or move other elements around i in a way that makes it reach position j.
      - At most you make n-1 movements (the untouched element reaches its position alone).
      - In an optimal sequence of movements, you never move the same element twice.


  • Get count matches in query on large table very slow

    - by Roy Roes
    I have a MySQL table "items" with 2 integer fields: seid and tiid. The table has about 35,000,000 records, so it's very large.
      seid  tiid
      ----  ----
         1     1
         2     2
         2     3
         2     4
         3     4
         4     1
         4     2
    The table has a primary key across both fields, an index on seid, and an index on tiid. Someone types in 1 or more tiid values, and I would like to get the seid with the most matches. For example, when someone types 1,2,3, I would like to get seid 2 and 4 as the result: they both have 2 matches on the tiid values. My query so far:
      SELECT COUNT(*) as c, seid FROM items
      WHERE tiid IN (1,2,3)
      GROUP BY seid
      HAVING c = (SELECT COUNT(*) as c FROM items
                  WHERE tiid IN (1,2,3)
                  GROUP BY seid
                  ORDER BY c DESC LIMIT 1)
    But this query is extremely slow because of the large table. Does anyone know how to construct a better query for this purpose?
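
    One thing that may help (a suggestion, not from the original post): a composite index that covers the query, so MySQL can satisfy both the tiid filter and the seid grouping from the index without touching the 35M rows:

      ALTER TABLE items ADD INDEX idx_tiid_seid (tiid, seid);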


  • Counting context switches per thread

    - by Sarmun
    Is there a way to see how many context switches each thread generates? (Both in and out, if possible.) Either as a rate per second, or by letting it run and getting aggregated data after some time. (Either on Linux or on Windows.) I have only found tools that give an aggregated context-switch number for the whole OS or per process. My program makes many context switches (50k/s), probably a lot of them unnecessary, but I am not sure where to start optimizing - where do most of those happen?
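
    On Linux, the kernel does expose per-thread counters (a pointer, not from the original question): each thread's status file reports its voluntary and involuntary switches, and sysstat's pidstat can sample per-thread rates:

      # cumulative counters for every thread of $PID
      grep -H ctxt_switches /proc/$PID/task/*/status
      # per-thread switch rates, sampled every second (sysstat package)
      pidstat -wt -p $PID 1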


  • how to avoid temporaries when copying weakly typed object

    - by Truncheon
    Hi. I'm writing a series of classes that inherit from a base class with virtual functions. They are INT, FLOAT and STRING objects that I want to use in a scripting language. I'm trying to implement weak typing, but I don't want STRING objects to return copies of themselves when used in the following way (instead I would prefer to have a reference returned, which can be used in copying):
      a = "hello ";
      b = "world";
      c = a + b;
    I have written the following code as a mock example:
      #include <iostream>
      #include <string>
      #include <cstdio>
      #include <cstdlib>

      std::string dummy("<int object cannot return string reference>");

      struct BaseImpl
      {
          virtual bool is_string() = 0;
          virtual int get_int() = 0;
          virtual std::string get_string_copy() = 0;
          virtual std::string const& get_string_ref() = 0;
      };

      struct INT : BaseImpl
      {
          int value;
          INT(int i = 0) : value(i) { std::cout << "constructor called\n"; }
          INT(BaseImpl& that) : value(that.get_int()) { std::cout << "copy constructor called\n"; }
          bool is_string() { return false; }
          int get_int() { return value; }
          std::string get_string_copy() { char buf[33]; sprintf(buf, "%i", value); return buf; }
          std::string const& get_string_ref() { return dummy; }
      };

      struct STRING : BaseImpl
      {
          std::string value;
          STRING(std::string s = "") : value(s) { std::cout << "constructor called\n"; }
          STRING(BaseImpl& that)
          {
              if (that.is_string())
                  value = that.get_string_ref();
              else
                  value = that.get_string_copy();
              std::cout << "copy constructor called\n";
          }
          bool is_string() { return true; }
          int get_int() { return atoi(value.c_str()); }
          std::string get_string_copy() { return value; }
          std::string const& get_string_ref() { return value; }
      };

      struct Base
      {
          BaseImpl* impl;
          Base(BaseImpl* p = 0) : impl(p) {}
          ~Base() { delete impl; }
      };

      int main()
      {
          Base b1(new INT(1));
          Base b2(new STRING("Hello world"));
          Base b3(new INT(*b1.impl));
          Base b4(new STRING(*b2.impl));

          std::cout << "\n";
          std::cout << b1.impl->get_int() << "\n";
          std::cout << b2.impl->get_int() << "\n";
          std::cout << b3.impl->get_int() << "\n";
          std::cout << b4.impl->get_int() << "\n";

          std::cout << "\n";
          std::cout << b1.impl->get_string_ref() << "\n";
          std::cout << b2.impl->get_string_ref() << "\n";
          std::cout << b3.impl->get_string_ref() << "\n";
          std::cout << b4.impl->get_string_ref() << "\n";

          std::cout << "\n";
          std::cout << b1.impl->get_string_copy() << "\n";
          std::cout << b2.impl->get_string_copy() << "\n";
          std::cout << b3.impl->get_string_copy() << "\n";
          std::cout << b4.impl->get_string_copy() << "\n";

          return 0;
      }
    It was necessary to add an if check in the STRING class to determine whether it is safe to request a reference instead of a copy:
    Script code:
      a = "test";
      b = a;
      c = 1;
      d = "" + c; /* not safe to request reference by standard */
    C++ code:
      STRING(BaseImpl& that)
      {
          if (that.is_string())
              value = that.get_string_ref();
          else
              value = that.get_string_copy();
          std::cout << "copy constructor called\n";
      }
    I was hoping there's a way of moving that if check to compile time, rather than run time.
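
    With runtime polymorphism the decision can only move to compile time where the static type is known, but the explicit branch can be folded into the virtual dispatch that is already being paid for. A sketch (copy_to and the *2 class names are hypothetical, not from the original post): let each implementation copy itself into the target string, so the converting constructor needs no type test.

      #include <cstdio>
      #include <string>

      struct BaseImpl2
      {
          virtual ~BaseImpl2() {}
          virtual void copy_to(std::string& out) const = 0;  // hypothetical helper
      };

      struct INT2 : BaseImpl2   // illustrative stand-in for INT
      {
          int value;
          INT2(int i = 0) : value(i) {}
          void copy_to(std::string& out) const
          { char buf[33]; sprintf(buf, "%i", value); out = buf; }
      };

      struct STRING2 : BaseImpl2   // illustrative stand-in for STRING
      {
          std::string value;
          STRING2(std::string s = "") : value(s) {}
          STRING2(const BaseImpl2& that) { that.copy_to(value); }  // branch-free
          void copy_to(std::string& out) const { out = value; }
      };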


  • C++ DWORD* to BYTE*

    - by NomeSkavinski
    My issue: I am trying to convert an array of dynamic memory of type DWORD to BYTE. Fair enough - I can loop through it and convert each DWORD entry into a BYTE. But is there a faster way to do this: to take a pointer to DWORD data and convert the whole block of data into a pointer to BYTE data, such as with a memcpy operation? I feel this is not possible. I'm not requesting an answer, just an experienced opinion on my approach, as I have tried testing both approaches but seem to fail at getting the second one to work. Thanks for any input - again, no answers needed, just a point in the right direction. Nor is this a homework question; I felt that had to be mentioned.
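
    For what it's worth, the two operations differ (an illustration, assuming "convert" means keeping the low 8 bits of each DWORD): memcpy reinterprets raw bytes - four per DWORD - while converting values needs a per-element narrowing, as in this sketch:

      #include <windows.h>
      #include <vector>

      std::vector<BYTE> narrow(const DWORD* src, size_t n)
      {
          std::vector<BYTE> dst(n);
          for (size_t i = 0; i < n; ++i)
              dst[i] = static_cast<BYTE>(src[i]);   // keeps the low byte
          return dst;
      }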


  • Has anyone ever successfully made index merge work for MySQL?

    - by user198729
    Setup:
      mysql> create table t(a integer unsigned, b integer unsigned);
      mysql> insert into t(a,b) values (1,2),(1,3),(2,4);
      mysql> create index i_t_a on t(a);
      mysql> create index i_t_b on t(b);
      mysql> explain select * from t where a=1 or b=4;
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |  1 | SIMPLE      | t     | ALL  | i_t_a,i_t_b   | NULL | NULL    | NULL |    3 | Using where |
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    Is there something I'm missing?
    Update (same query after loading more rows):
      mysql> explain select * from t where a=1 or b=4;
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |  1 | SIMPLE      | t     | ALL  | i_t_a,i_t_b   | NULL | NULL    | NULL | 1863 | Using where |
      +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    Version:
      mysql> select version();
      +----------------------+
      | version()            |
      +----------------------+
      | 5.1.36-community-log |
      +----------------------+
    Has anyone ever successfully made index merge work for MySQL? I'll be glad to see success stories here :)
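
    A couple of checks worth trying (suggestions, not from the original post): refresh the optimizer's statistics, and see whether hinting both indexes changes the plan -

      mysql> analyze table t;
      mysql> explain select * from t force index (i_t_a, i_t_b) where a=1 or b=4;

    With very few rows, or when the two OR sides match a large fraction of the table, a full scan is genuinely cheaper, so the optimizer skips index_merge even though it is available.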


  • Overhead of serving pages - JSPs vs. PHP vs. ASPXs vs. C

    - by John Shedletsky
    I am interested in writing my own internet ad server. I want to serve billions of impressions with as little hardware as possible. Which server-side technologies are best suited for this task? I am asking about the relative overhead of serving my ad pages as pages rendered by PHP, or Java, or .NET, or coding HTTP responses directly in C and writing some multi-socket IO monster to serve requests (I assume this one wins, but if my assumption is wrong, that would actually be most interesting). Obviously all the most effective optimizations are done at the algorithm level, but I figure there have got to be some speed differences at the end of the day that make one method of serving ads better than another. How much overhead does something like Apache or IIS introduce? There's got to be a ton of extra junk in there I don't need. At some point I guess this is more a question of which platform/language combo is best suited - please excuse the inadroitly posed question; hopefully you understand what I am trying to get at.


  • Will an optimizing compiler remove calls to a method whose result will be multiplied by zero?

    - by Tim R.
    Suppose you have a computationally expensive method, Compute(p), which returns some float, and another method, Falloff(p), which returns another float from zero to one. If you compute Falloff(p) * Compute(p), will Compute(p) still run when Falloff(p) returns zero? Or would you need to write a special case to prevent Compute(p) from running unnecessarily? Theoretically, an optimizing compiler could determine that omitting Compute when Falloff returns zero would have no effect on the program. However, this is hard to test: if you have Compute output some debug data to determine whether it is running, the compiler can no longer omit it, precisely because of that debug output - a sort of Schrödinger's cat situation. I know the safe solution to this problem is just to add the special case, but I'm just curious.
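
    In practice the call survives (an explanatory sketch, not from the original post): unless Compute is provably free of side effects, the compiler must assume it has them, and even for a pure function the transformation is invalid under IEEE floating-point rules, since 0.0 * x is not 0.0 when x is infinite or NaN. So the special case stays explicit:

      float f = Falloff(p);
      float result = (f == 0.0f) ? 0.0f : f * Compute(p);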


  • Calculating Growth-Rates by applying log-differences

    - by mropa
    I am trying to transform my data.frame by calculating the log-differences of each column while controlling for the row id. So basically I'd like to calculate the growth rates of each id's variables. Here is a random df with an id column, a time-period column p, and three variable columns:
      df <- data.frame(id = c("a","a","a","c","c","d","d","d","d","d"),
                       p  = c(1,2,3,1,2,1,2,3,4,5),
                       var1 = rnorm(10, 5),
                       var2 = rnorm(10, 5),
                       var3 = rnorm(10, 5))
      df
         id p     var1     var2     var3
      1   a 1 5.375797 4.110324 5.773473
      2   a 2 4.574700 6.541862 6.116153
      3   a 3 3.029428 4.931924 5.631847
      4   c 1 5.375855 4.181034 5.756510
      5   c 2 5.067131 6.053009 6.746442
      6   d 1 3.846438 4.515268 6.920389
      7   d 2 4.910792 5.525340 4.625942
      8   d 3 6.410238 5.138040 7.404533
      9   d 4 4.637469 3.522542 3.661668
      10  d 5 5.519138 4.599829 5.566892
    Now I have written a function which does exactly what I want, BUT I had to take a detour which is possibly unnecessary and can be removed. However, somehow I am not able to locate the shortcut. Here is the function (it uses plyr) and the output for the posted data frame:
      library(plyr)
      fct.logDiff <- function (df) {
          df.log <- dlply(df, "id", function(x) data.frame(p = x$p, log(x[, -c(1,2)])))
          list.nalog <- llply(df.log, function(x) data.frame(p = x$p, rbind(NA, sapply(x[,-1], diff))))
          ldply(list.nalog, data.frame)
      }
      fct.logDiff(df)
         id p        var1        var2        var3
      1   a 1          NA          NA          NA
      2   a 2 -0.16136569  0.46472004  0.05765945
      3   a 3 -0.41216720 -0.28249264 -0.08249587
      4   c 1          NA          NA          NA
      5   c 2 -0.05914281  0.36999681  0.15868378
      6   d 1          NA          NA          NA
      7   d 2  0.24428771  0.20188025 -0.40279188
      8   d 3  0.26646102 -0.07267311  0.47041227
      9   d 4 -0.32372771 -0.37748866 -0.70417351
      10  d 5  0.17405309  0.26683625  0.41891802
    The trouble comes from the added NA rows. I don't want to collapse the frame and reduce it, which the diff() function would do automatically: I had 10 rows in my original frame and I keep the same number of rows after the transformation, so to keep the same length I had to add some NAs. I took a detour by transforming the data.frame into a list, adding the NAs, and then transforming the list back into a data.frame. That looks tedious. Any ideas to avoid the data.frame-list-data.frame class transformation and optimize the function?
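
    A possible shortcut (a sketch using the same plyr package): ddply() does the split-apply-combine in one call, and padding each group's differences with a leading NA keeps the original length:

      library(plyr)
      fct.logDiff2 <- function(df) {
          ddply(df, "id", function(x)
              data.frame(p = x$p, rbind(NA, apply(log(x[, -c(1, 2)]), 2, diff))))
      }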


  • SQL Database dilemma : Optimize for Querying or Writing?

    - by Harry
    I'm working on a personal project (a search engine) and have a bit of a dilemma. At the moment it is optimized for writing data to the search index and is significantly slower for search queries. The DTA (Database Engine Tuning Advisor) recommends adding a couple of indexed views in order to speed up search queries. But this comes at the detriment of writing new data to the DB. It seems I can't have one without the other! This is obviously not a new problem. What is a good strategy for this issue?


  • Ternary Operators in JavaScript Without an "Else"

    - by Oscar Godson
    I've been using them forever, and I love them. To me they seem cleaner and I can scan them faster, but ever since I've been using them I've always had to put null in the else branch when there's nothing to do. Is there any way around it? E.g.
      condition ? x = true : null;
    Basically, is there a way to do:
      condition ? x = true;
    Right now that shows up as a syntax error...
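
    Two equivalent one-liners that drop the dummy branch (standard JavaScript, though not from the original question):

      if (condition) x = true;      // a plain if needs no else
      condition && (x = true);      // short-circuit && as a guard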


  • SimpleDB as Denormalized DB

    - by Max
    In an environment where a relational database handles all business transactions, is it a good idea to utilise SimpleDB for all data queries, to get faster and more lightweight search? The master data storage would be a relational DB which is "replicated"/"transformed" into SimpleDB to provide very fast read-only queries, since no JOINs and complicated subselects are needed.


  • Optimize SELECT DISTINCT CONCAT query in MySQL

    - by L. Cosio
    Hello! I'm running this query:
      SELECT DISTINCT CONCAT(ALFA_CLAVE, FECHA_NACI)
      FROM listado
      GROUP BY ALFA_CLAVE
      HAVING COUNT(CONCAT(ALFA_CLAVE, FECHA_NACI)) > 1
    Is there any way to optimize it? Queries are taking 2-3 hours on a table with 850,000 rows. Would adding an index on ALFA_CLAVE and FECHA_NACI work? Thanks in advance
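
    A composite index over both columns would be the first thing to try, and - if the goal is to find ALFA_CLAVE values with more than one FECHA_NACI (an assumption about the intent) - counting distinct dates avoids the CONCAT entirely. A sketch:

      ALTER TABLE listado ADD INDEX idx_clave_naci (ALFA_CLAVE, FECHA_NACI);

      SELECT ALFA_CLAVE, COUNT(DISTINCT FECHA_NACI) AS c
      FROM listado
      GROUP BY ALFA_CLAVE
      HAVING c > 1;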


  • Merging and splitting overlapping rectangles to produce non-overlapping ones

    - by uj
    I am looking for an algorithm as follows: given a set of possibly overlapping rectangles (all of which are "not rotated" and can be uniformly represented as (left, top, right, bottom) tuples, etc.), it returns a minimal set of (non-rotated) non-overlapping rectangles that occupy the same area. It seems simple enough at first glance, but proves to be tricky (at least to do efficiently). Are there some known methods for this, or ideas, or pointers? Methods that produce not necessarily minimal but heuristically small sets are interesting as well, as are methods that produce any valid output set at all.
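
    As a baseline, a sketch of one standard valid-but-not-minimal decomposition: cut the plane into vertical slabs at every distinct x-edge, merge the covered y-intervals inside each slab, and emit one rectangle per merged interval:

      # rects: iterable of (left, top, right, bottom), with top < bottom
      def decompose(rects):
          xs = sorted({x for l, t, r, b in rects for x in (l, r)})
          out = []
          for x0, x1 in zip(xs, xs[1:]):
              # y-intervals of every rectangle spanning this slab
              ys = sorted([t, b] for l, t, r, b in rects if l <= x0 and r >= x1)
              merged = []
              for t, b in ys:
                  if merged and t <= merged[-1][1]:
                      merged[-1][1] = max(merged[-1][1], b)   # overlap: extend
                  else:
                      merged.append([t, b])
              out.extend((x0, t, x1, b) for t, b in merged)
          return out

    Adjacent slabs that end up with identical interval sets can then be merged to shrink the output toward the minimal set.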


  • Does the compiler optimize the function parameters passed by value?

    - by Naveen
    Let's say I have a function where the parameter is passed by value instead of by const reference. Further, let's assume that only the value is used inside the function, i.e. the function doesn't try to modify it. In that case, will the compiler be able to figure out that it can pass the value by const reference (for performance reasons) and generate the code accordingly? Is there any compiler which does that?


  • Reduce Processing Time of accessing database

    - by medma
    Hello all, I'm making an app which requires a remote database connection. I want to fill a picker with values from the database, but when I click the button to invoke the picker, it takes some time to fetch and display the values. Is there any way to make this faster? Also, is there any way to reduce the transition time between two views? Thanks

