Search Results

Search found 3828 results on 154 pages for 'mathematical optimization'.

Page 66/154 | < Previous Page | 62 63 64 65 66 67 68 69 70 71 72 73  | Next Page >

  • How to optimize a postgreSQL server for a "write once, read many"-type infrastructure ?

    - by mhu
    Greetings, I am working on a piece of software that logs entries (and related tagging) in a PostgreSQL database for storage and retrieval. We never update any data once it has been inserted; we might remove it when the entry gets too old, but this is done at most once a day. Stored entries can be retrieved by users. The insertion of new entries can happen rather fast and regularly, thus the database will commonly hold several millions elements. The tables used are pretty simple : one table for ids, raw content and insertion date; and one table storing tags and their values associated to an id. User search mostly concern tags values, so SELECTs usually consist of JOIN queries on ids on the two tables. To sum it up : 2 tables Lots of INSERT no UPDATE some DELETE, once a day at most some user-generated SELECT with JOIN huge data set What would an optimal server configuration (software and hardware, I assume for example that RAID10 could help) be for my PostgreSQL server, given these requirements ? By optimal, I mean one that allows SELECT queries taking a reasonably little amount of time. I can provide more information about the current setup (like tables, indexes ...) if needed.

    Read the article

  • Eliminate full table scan due to BETWEEN (and GROUP BY)

    - by Dave Jarvis
    Description According to the explain command, there is a range that is causing a query to perform a full table scan (160k rows). How do I keep the range condition and reduce the scanning? I expect the culprit to be: Y.YEAR BETWEEN 1900 AND 2009 AND Code Here is the code that has the range condition (the STATION_DISTRICT is likely superfluous). SELECT COUNT(1) as MEASUREMENTS, AVG(D.AMOUNT) as AMOUNT, Y.YEAR as YEAR, MAKEDATE(Y.YEAR,1) as AMOUNT_DATE FROM CITY C, STATION S, STATION_DISTRICT SD, YEAR_REF Y FORCE INDEX(YEAR_IDX), MONTH_REF M, DAILY D WHERE -- For a specific city ... -- C.ID = 10663 AND -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) + (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) * POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND -- Get the station district identification for the matching station. -- S.STATION_DISTRICT_ID = SD.ID AND -- Gather all known years for that station ... -- Y.STATION_DISTRICT_ID = SD.ID AND -- The data before 1900 is shaky; insufficient after 2009. -- Y.YEAR BETWEEN 1900 AND 2009 AND -- Filtered by all known months ... -- M.YEAR_REF_ID = Y.ID AND -- Whittled down by category ... -- M.CATEGORY_ID = '003' AND -- Into the valid daily climate data. -- M.ID = D.MONTH_REF_ID AND D.DAILY_FLAG_ID <> 'M' GROUP BY Y.YEAR Update The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here: +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | 1 | SIMPLE | C | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | Y | range | YEAR_IDX | YEAR_IDX | 4 | NULL | 160422 | Using where | | 1 | SIMPLE | SD | eq_ref | PRIMARY | PRIMARY | 4 | climate.Y.STATION_DISTRICT_ID | 1 | Using index | | 1 | SIMPLE | S | eq_ref | PRIMARY | PRIMARY | 4 | climate.SD.ID | 1 | Using where | | 1 | SIMPLE | M | ref | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8 | climate.Y.ID | 54 | Using where | | 1 | SIMPLE | D | ref | INDEX | INDEX | 8 | climate.M.ID | 11 | Using where | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ Related http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html http://stackoverflow.com/questions/557425/optimize-sql-that-uses-between-clause Thank you!

    Read the article

  • Limit CPU usage of a process

    - by jb
    I have a service running which periodically checks a folder for a file and then processes it. (Reads it, extracts the data, stores it in sql) So I ran it on a test box and it took a little longer thaan expected. The file had 1.6 million rows, and it was still running after 6 hours (then I went home). The problem is the box it is running on is now absolutely crippled - remote desktop was timing out so I cant even get on it to stop the process, or attach a debugger to see how far through etc. It's solidly using 90%+ CPU, and all other running services or apps are suffering. The code is (from memory, may not compile): List<ItemDTO> items = new List<ItemDTO>(); using (StreamReader sr = fileInfo.OpenText()) { while (!sr.EndOfFile) { string line = sr.ReadLine() try { string s = line.Substring(0,8); double y = Double.Parse(line.Substring(8,7)); //If the item isnt already in the collection, add it. if (items.Find(delegate(ItemDTO i) { return (i.Item == s); }) == null) items.Add(new ItemDTO(s,y)); } catch { /*Crash*/ } } return items; } - So I am working on improving the code (any tips appreciated). But it still could be a slow affair, which is fine, I've no problems with it taking a long time as long as its not killing my server. So what I want from you fine people is: 1) Is my code hideously un-optimized? 2) Can I limit the amount of CPU my code block may use? Cheers all

    Read the article

  • Which is quicker? Memcache or file query? (using maxmind geoip.dat file)

    - by tomcritchlow
    Hi, I'm using Python on Appengine and am looking up the geolocation of an IP address like this: import pygeoip gi = pygeoip.GeoIP('GeoIP.dat') Location = gi.country_code_by_addr(self.request.remote_addr) (pygeoip can be found here: http://code.google.com/p/pygeoip/) I want to geolocate each page of my app for a user so currently I lookup the IP address once then store it in memcache. My question - which is quicker? Looking up the IP address each time from the .dat file or fetching it from memcache? Are there any other pros/cons I need to be aware of? For general queries like this, is there a good guide to teach me how to optimise my code and run speed tests myself? I'm new to python and coding in general so apologies if this is a basic concept. Thanks! Tom

    Read the article

  • Where is the bottleneck in this code?

    - by Mikhail
    I have the following tight loop that makes up the serial bottle neck of my code. Ideally I would parallelize the function that calls this but that is not possible. //n is about 60 for (int k = 0;k < n;k++) { double fone = z[k*n+i+1]; double fzer = z[k*n+i]; z[k*n+i+1]= s*fzer+c*fone; z[k*n+i] = c*fzer-s*fone; } Are there any optimizations that can be made such as vectorization or some evil inline that can help this code? I am looking into finding eigen solutions of tridiagonal matrices. http://www.cimat.mx/~posada/OptDoglegGraph/DocLogisticDogleg/projects/adjustedrecipes/tqli.cpp.html

    Read the article

  • Does anybody have any suggestions on which of these two approaches is better for large delete?

    - by RPS
    Approach #1: DECLARE @count int SET @count = 2000 DECLARE @rowcount int SET @rowcount = @count WHILE @rowcount = @count BEGIN DELETE TOP (@count) FROM ProductOrderInfo WHERE ProductId = @product_id AND bCopied = 1 AND FileNameCRC = @localNameCrc SELECT @rowcount = @@ROWCOUNT WAITFOR DELAY '000:00:00.400' Approach #2: DECLARE @count int SET @count = 2000 DECLARE @rowcount int SET @rowcount = @count WHILE @rowcount = @count BEGIN DELETE FROM ProductOrderInfo WHERE ProductId = @product_id AND FileNameCRC IN ( SELECT TOP(@count) FileNameCRC FROM ProductOrderInfo WITH (NOLOCK) WHERE bCopied = 1 AND FileNameCRC = @localNameCrc ) SELECT @rowcount = @@ROWCOUNT WAITFOR DELAY '000:00:00.400' END

    Read the article

  • Java: how to read BufferedReader faster

    - by Cata
    Hello, I want to optimize this code: InputStream is = rp.getEntity().getContent(); BufferedReader reader = new BufferedReader(new InputStreamReader(is)); String text = ""; String aux = ""; while ((aux = reader.readLine()) != null) { text += aux; } The thing is that i don't know how to read the content of the bufferedreader and copy it in a String faster than what I have above. I need to spend as little time as possible. Thank you

    Read the article

  • In SQL Server what is most efficient way to compare records to other records for duplicates with in

    - by Glenn
    We have an SQL Server that gets daily imports of data files from clients. This data is interrelated and we are always scrubbing it and having to look for suspect duplicate records between these files. Finding and tagging suspect records can get pretty complicated. We use logic that requires some field values to be the same, allows some field values to differ, and allows a range to be specified for how different certain field values can be. The only way we've found to do it is by using a cursor based process, and it places a heavy burden on the database. So I wanted to ask if there's a more efficient way to do this. I've heard it said that there's almost always a more efficient way to replace cursors with clever JOINS. But I have to admit I'm having a lot of trouble with this one. For a concrete example suppose we have 1 table, an "orders" table, with the following 6 fields. order_id, customer_id product_id, quantity, sale_date, price We want to look through the records to find suspect duplicates on the following example criteria. These get increasingly harder. 1. Records that have the same product_id, sale_date, and quantity but different customer_id's should be marked as suspect duplicates for review. 2. Records that have the same customer_id, product_id, quantity and have sale_dates within five days of each other should be marked as suspect duplicates for review 3. Records that have the same customer_id, product_id, but different quantities within 20 units, and sales dates within five days of each other should be considered suspect. Is it possible to satisfy each one of these criteria with a single SQL Query that uses JOINS? Is this the most efficient way to do this?

    Read the article

  • PHP Increasing writing to page speed.

    - by Frederico
    I'm currently writing out xml and have done the following: header ("content-type: text/xml"); header ("content-length: ".strlen($xml)); $xml being the xml to be written out. I'm near about 1.8 megs of text (which I found via firebug), it seems as the writing is taking more time than the script to run.. is there a way to increase this write speed? Thank you in advance.

    Read the article

  • STL vectors with uninitialized storage?

    - by Jim Hunziker
    I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values. Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements? (I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.) Also, see my comments below for a clarification about when the initialization occurs. SOME CODE: void GetsCalledALot(int* data1, int* data2, int count) { int mvSize = memberVector.size() memberVector.resize(mvSize + count); // causes 0-initialization for (int i = 0; i < count; ++i) { memberVector[mvSize + i].d1 = data1[i]; memberVector[mvSize + i].d2 = data2[i]; } }

    Read the article

  • Will unused deconstructors be optimized out?

    - by Brendan Long
    Assuming MyClass uses the default deconstructor (or no deconstructor), and this code: MyClass buffer[] = new MyClass[i]; // Construct N objects using placement new for(size_t i = 0; i < N; i++){ ~buffer[i]; } delete[] buffer; Is there any optimizer that would be able to remove this loop? Also, is there any way for my code to detect if MyClass is using an empty/default constructor?

    Read the article

  • Is a program compiled with -g gcc flag slower than the same program compiled without -g?

    - by e271p314
    I'm compiling a program with -O3 for performance and -g for debug symbols (in case of crash I can use the core dump). One thing bothers me a lot, does the -g option results in a performance penalty? When I look on the output of the compilation with and without -g, I see that the output without -g is 80% smaller than the output of the compilation with -g. If the extra space goes for the debug symbols, I don't care about it (I guess) since this part is not used during runtime. But if for each instruction in the compilation output without -g I need to do 4 more instructions in the compilation output with -g than I certainly prefer to stop using -g option even at the cost of not being able to process core dumps. How to know the size of the debug symbols section inside the program and in general does compilation with -g creates a program which runs slower than the same code compiled without -g?

    Read the article

  • Ways to optimize Android App code based on function call stack?

    - by K-RAN
    I've been told that Android OS stores all function calls in a stack. This can lead to many problems and cause the 'hiccups' during runtime, even if a program is functionalized properly, correct? So the question is, how can we prevent this from happening? The obvious solution is to functionalize less, along with other sensible acts such as refraining from excessively/needlessly creating objects, performing static calls to functions that don't access fields, etc... Is there another way though? Or can this only be done through careful code writing on the programmers' part? Does the JVM/JIT automatically optimize the bytecode during compile time to account for this?? Thanks a lot for your responses!!

    Read the article

  • PHP error handling : my code is not optimized

    - by Tristan
    Hello, I must warn you, this code will heart your eyes, so please don't judge me, i'm trying to improve the way I handle errors all my tests are like this : if ($something < 27) { $error_IP= '<div class="error_message">something bad</div> '; }else{ $erreur_IP=''; } and here's the ugliest thing : if( !isset($_POST) || ($erreur_captcha !='') || ($erreur_email !='') || ($erreur_hebergeurVide != '') || ($erreur_paysVide != '') || ($erreur_slotVide != '') || ($erreur_rconVide != '') || ($erreur_tick != '') + a lot more :d ) What do you suggest to me to optimize my errors handling ? Thank you

    Read the article

  • How to simplify my code... 2D array in Objective C...?

    - by Tattat
    self.myArray = [NSArray arrayWithObjects: [NSArray arrayWithObjects: [self d], [self generateMySecretObject],nil], [NSArray arrayWithObjects: [self generateMySecretObject], [self generateMySecretObject],nil],nil]; for (int k=0; k<[self.myArray count]; k++) { for(int s = 0; s<[[self.myArray objectAtIndex:k] count]; s++){ [[[self.myArray objectAtIndex:k] objectAtIndex:s] setAttribute:[self generateSecertAttribute]]; } } As you can see this is a simple 2*2 array, but it takes me lots of code to assign the NSArray in very first place, because I found that the NSArray can't assign the size at very beginning. Also, I want to set attribute one by one. I can't think of if my array change to 10*10. How long it could be. So, I hope you guys can give me some suggestions on shorten the code, and more readable. thz

    Read the article

  • MySQL Prepared Statements vs Stored Procedures Performance

    - by amardilo
    Hi there, I have an old MySQL 4.1 database with a table that has a few millions rows and an old Java application that connects to this database and returns several thousand rows from this this table on a frequent basis via a simple SQL query (i.e. SELECT * FROM people WHERE first_name = 'Bob'. I think the Java application uses client side prepared statements but was looking at switching this to the server, and in the example mentioned the value for first_name will vary depending on what the user enters). I would like to speed up performance on the select query and was wondering if I should switch to Prepared Statements or Stored Procedures. Is there a general rule of thumb of what is quicker/less resource intensive (or if a combination of both is better)

    Read the article

  • optimize python code

    - by user283405
    i have code that uses BeautifulSoup library for parsing. But it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me about this? I am using beautifulsoup library for parsing and than save in DB. if i comment the save statement, than still it takes time so there is no problem with database. def parse(self,text): soup = BeautifulSoup(text) arr = soup.findAll('tbody') for i in range(0,len(arr)-1): data=Data() soup2 = BeautifulSoup(str(arr[i])) arr2 = soup2.findAll('td') c=0 for j in arr2: if str(j).find("<a href=") > 0: data.sourceURL = self.getAttributeValue(str(j),'<a href="') else: if c == 2: data.Hits=j.renderContents() #and few others... #... c = c+1 data.save() Any suggestions? Note: I already ask this question here but that was closed due to incomplete information.

    Read the article

  • Fastest way to do a weighted tag search in SQL Server

    - by Hasan Khan
    My table is as follows ObjectID bigint Tag nvarchar(50) Weight float Type tinyint I want to get search for all objects that has tags 'big' or 'large' I want the objectid in order of sum of weights (so objects having both the tags will be on top) select objectid, row_number() over (order by sum(weight) desc) as rowid from tags where tag in ('big', 'large') and type=0 group by objectid the reason for row_number() is that i want paging over results. The query in its current form is very slow, takes a minute to execute over 16 million tags. What should I do to make it faster? I have a non clustered index (objectid, tag, type) Any suggestions?

    Read the article

< Previous Page | 62 63 64 65 66 67 68 69 70 71 72 73  | Next Page >