Search Results

Search found 3481 results on 140 pages for 'convex optimization'.

  • Get count matches in query on large table very slow

    - by Roy Roes
    I have a MySQL table "items" with two integer fields, seid and tiid. The table has about 35,000,000 records, so it is very large.

        seid  tiid
        ----------
        1     1
        2     2
        2     3
        2     4
        3     4
        4     1
        4     2

    The table has a primary key on both fields, an index on seid, and an index on tiid. Someone types in one or more tiid values, and I would like to get the seid with the most matches. For example, when someone types 1,2,3, I would like to get seid 2 and 4 as the result, because they both have 2 matches on the tiid values. My query so far:

        SELECT COUNT(*) AS c, seid
        FROM items
        WHERE tiid IN (1,2,3)
        GROUP BY seid
        HAVING c = (SELECT COUNT(*) AS c
                    FROM items
                    WHERE tiid IN (1,2,3)
                    GROUP BY seid
                    ORDER BY c DESC
                    LIMIT 1)

    But this query is extremely slow because of the large table. Does anyone know how to construct a better query for this purpose?
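    One direction that is sometimes suggested for this kind of question is to run the grouped count only once and pick the maximum on the client side, rather than repeating the aggregation inside HAVING. The sketch below illustrates that idea with Python's built-in sqlite3 module standing in for MySQL; the function name top_seids is made up for the illustration, and whether this is actually faster on 35,000,000 rows would have to be measured.

```python
import sqlite3


def top_seids(conn, tiids):
    """Return (best_count, [seid, ...]) for the seids matching the most tiids."""
    placeholders = ",".join("?" * len(tiids))
    # Single aggregation pass: one row per seid with its number of matches.
    rows = conn.execute(
        f"SELECT seid, COUNT(*) AS c FROM items "
        f"WHERE tiid IN ({placeholders}) GROUP BY seid",
        list(tiids),
    ).fetchall()
    if not rows:
        return 0, []
    best = max(c for _, c in rows)
    return best, sorted(seid for seid, c in rows if c == best)


# Sample data taken from the question (SQLite used only as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (seid INTEGER, tiid INTEGER, PRIMARY KEY (seid, tiid))"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, 1), (2, 2), (2, 3), (2, 4), (3, 4), (4, 1), (4, 2)],
)

print(top_seids(conn, [1, 2, 3]))  # (2, [2, 4])
```

    With the sample rows from the question, seid 2 and seid 4 come back with 2 matches each, which is the result the asker expects.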

  • Read large amount of data from file in Java

    - by Crozin
    Hello. I've got a text file that contains 1 000 002 numbers in the following format:

        123 456 1 2 3 4 5 6 .... 999999 100000

    Now I need to read that data and assign the very first two numbers to int variables and all the rest (1 000 000 numbers) to an int[] array. It's not a hard task, but it's horribly slow. My first attempt was java.util.Scanner:

        Scanner stdin = new Scanner(new File("./path"));
        int n = stdin.nextInt();
        int t = stdin.nextInt();
        int array[] = new int[n];
        for (int i = 0; i < n; i++) {
            array[i] = stdin.nextInt();
        }

    It works as expected, but it takes about 7500 ms to execute. I need to fetch that data in at most several hundred milliseconds. Then I tried java.io.BufferedReader: using BufferedReader.readLine() and String.split() I got the same results in about 1700 ms, but that is still too long. How can I read that amount of data in less than 1 second? The final result should be equal to:

        int n = 123;
        int t = 456;
        int array[] = { 1, 2, 3, 4, ..., 999999, 100000 };

  • Efficient SQL to count an occurrence in the latest X rows

    - by pulegium
    For example, I have:

        create table a (i int);

    Assume there are 10k rows. I want to count the 0's in the last 20 rows. Something like:

        select count(*) from (select i from a limit 20) where i = 0;

    Is it possible to make this more efficient? Like a single SQL statement or something? PS. The DB is SQLite3, if that matters at all...
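    A point worth illustrating here is that a bare LIMIT 20 does not by itself select the "last" rows; an explicit ordering is needed. The sketch below uses Python's sqlite3 module and assumes that "latest" means "highest rowid", which is an assumption, since the table in the question has no timestamp or key column. The helper name zeros_in_last and the sample data are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (i INT)")
# 10k demo rows with values 0, 1, 2, 3, 0, 1, ... (not from the question).
conn.executemany("INSERT INTO a VALUES (?)", [(n % 4,) for n in range(10_000)])


def zeros_in_last(conn, n=20):
    """Count rows with i = 0 among the n rows with the highest rowid."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT i FROM a ORDER BY rowid DESC LIMIT ?) "
        "WHERE i = 0",
        (n,),
    ).fetchone()
    return count


print(zeros_in_last(conn))  # 5
```

    Ordering by rowid descending lets SQLite walk the table from the end and stop after 20 rows, so the inner query should not need to scan all 10k rows.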

  • What is the absolute fastest way to implement a concurrent queue with ONLY one consumer and one producer?

    - by JohnPristine
    java.util.concurrent.ConcurrentLinkedQueue comes to mind, but is it really optimal for this two-thread scenario? I am looking for the minimum latency possible on both sides (producer and consumer). If the queue is empty you can immediately return null, AND if the queue is full you can immediately discard the entry you are offering. Does ConcurrentLinkedQueue use super fast and light locks (AtomicBoolean)? Has anyone benchmarked ConcurrentLinkedQueue, or does anyone know the ultimate fastest way of doing that? Additional details: I imagine the queue should be a fair one, meaning the producer should not make the consumer wait any longer than it needs to (by front-running it), and vice versa.

  • Overhead of serving pages - JSPs vs. PHP vs. ASPXs vs. C

    - by John Shedletsky
    I am interested in writing my own internet ad server. I want to serve billions of impressions with as little hardware as possible. Which server-side technologies are best suited for this task? I am asking about the relative overhead of serving my ad pages as pages rendered by PHP, or Java, or .NET, or of coding HTTP responses directly in C and writing some multi-socket IO monster to serve requests (I assume this one wins, but if my assumption is wrong, that would actually be most interesting). Obviously the most effective optimizations are done at the algorithm level, but I figure there have got to be some speed differences at the end of the day that make one method of serving ads better than another. How much overhead does something like Apache or IIS introduce? There's got to be a ton of extra junk in there I don't need. At some point I guess this is more a question of which platform/language combo is best suited. Please excuse the inadroitly posed question; hopefully you understand what I am trying to get at.

  • Performance considerations of a large hard-coded array in the .cs file

    - by terence
    I'm writing some code where performance is important. In one part of it, I have to compare a large set of pre-computed data against dynamic values. Currently, I'm storing that pre-computed data in a giant array in the .cs file: Data[] data = { /* my data set */ }; The data set is about 90kb, or roughly 13k elements. I was wondering if there's any downside to doing this, as opposed to loading it in from an external file? I'm not entirely sure how C# works internally, so I just wanted to be aware of any performance issues I might encounter with this method.

  • Help on MySQL table indexing when GROUP BY is used in a query

    - by Silver Light
    Thank you for your attention. There are two InnoDB tables:

        Table authors
            id         INT
            nickname   VARCHAR(50)
            status     ENUM('active', 'blocked')
            about      TEXT

        Table books
            author_id  INT
            title      VARCHAR(150)

    I'm running a query against these tables to get each author and a count of the books he has:

        SELECT a.*, COUNT(b.id) AS book_count
        FROM authors AS a, books AS b
        WHERE a.status != 'blocked' AND b.author_id = a.id
        GROUP BY a.id
        ORDER BY a.nickname

    This query is very slow (it takes about 6 seconds to execute). I have an index on books.author_id and it works perfectly, but I do not know how to create an index on the authors table so that this query could use it. Here is how the current EXPLAIN looks:

        id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra
        1|SIMPLE|a|ALL|PRIMARY,id_status_nickname|NULL|NULL|NULL|3305|Using where; Using temporary; Using filesort
        1|SIMPLE|b|ref|key_author_id|key_author_id|5|a.id|2|Using where; Using index

    I've looked at the MySQL manual on optimizing queries with GROUP BY, but could not figure out how I can apply it to my query. I'll appreciate any help and hints on this: what must the index structure be so that MySQL could use it?

  • Preventing objects from being linked if they are not needed?

    - by Massif
    I have an ARM project that I'm building with make. I'm creating the list of object files to link based on the names of all of the .c and .cpp files in my source directory. However, I would like to exclude objects from being linked if they are never used. Will the linker exclude these objects from the .elf file automatically even if I include them in the list of objects to link? If not, is there a way to generate a list of only the objects that need to be linked?

  • Fastest way to compare Objects of type DateTime

    - by radbyx
    I made this. Is this the fastest way to find the latest DateTime in my collection of DateTimes? I'm wondering if there is a method for what I'm doing inside the foreach, but even if there is, I can't see how it could be faster than what I already have.

        List<StateLog> stateLogs = db.StateLog.Where(p => p.ProductID == product.ProductID).ToList();
        DateTime lastTimeStamp = DateTime.MinValue;
        foreach (var stateLog in stateLogs)
        {
            int result = DateTime.Compare(lastTimeStamp, stateLog.TimeStamp);
            if (result < 0)
                lastTimeStamp = stateLog.TimeStamp; // set because the timestamp is later
        }

  • MySql product\tag query optimisation - please help!

    - by Nige
    Hi there. I have an SQL query I am struggling to optimise. It is used to pull back products for a shopping cart. The products each have tags attached using a many-to-many table product_tag, and I also pull back a store name from a separate store table. I'm using GROUP_CONCAT to get a list of tags for the display (this is why I have the strange GROUP BY and ORDER BY clauses at the bottom), and I need to order by dateadded, showing the latest scheduled product first. Here is the query:

        SELECT products.*, stores.name,
               GROUP_CONCAT(tags.taglabel ORDER BY tags.id ASC SEPARATOR " ") taglist
        FROM (products)
        JOIN product_tag ON products.id=product_tag.productid
        JOIN tags ON tags.id=product_tag.tagid
        JOIN stores ON products.cid=stores.siteid
        WHERE dateadded < '2010-05-28 07:55:41'
        GROUP BY products.id ASC
        ORDER BY products.dateadded DESC
        LIMIT 2

    Unfortunately, even with a small set of data (3 tags and about 12 products) the query is taking 0.0034 seconds to run. Eventually I want to have about 2000 products and 50 tags in this system (I'm guessing this will be very slow). Here is the EXPLAIN output:

        id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra
        1|SIMPLE|tags|ALL|PRIMARY|NULL|NULL|NULL|4|Using temporary; Using filesort
        1|SIMPLE|product_tag|ref|tagid,productid|tagid|4|cs_final.tags.id|2|
        1|SIMPLE|products|eq_ref|PRIMARY,cid|PRIMARY|4|cs_final.product_tag.productid|1|Using where
        1|SIMPLE|stores|ALL|siteid|NULL|NULL|NULL|7|Using where; Using join buffer

    Can anyone help?

  • Need help optimizing this Django aggregate query

    - by Chris Lawlor
    I have the following model:

        class Plugin(models.Model):
            name = models.CharField(max_length=50)
            # more fields

    which represents a plugin that can be downloaded from my site. To track downloads, I have:

        class Download(models.Model):
            plugin = models.ForeignKey(Plugin)
            timestamp = models.DateTimeField(auto_now=True)

    So to build a view showing plugins sorted by downloads, I have the following query:

        # pbd is plugins by download - commented here to prevent scrolling
        pbd = Plugin.objects.annotate(dl_total=Count('download')).order_by('-dl_total')

    This works, but it is very slow. With only 1,000 plugins, the average response is 3.6 - 3.9 seconds (devserver with a local PostgreSQL db), whereas a similar view with a much simpler query (sorting by plugin release date) takes 160 ms or so. I'm looking for suggestions on how to optimize this query. I'd really prefer that the query return Plugin objects (as opposed to using values), since I'm sharing the same template with the other views (plugins by rating, plugins by release date, etc.), so the template is expecting Plugin objects. Plus, I'm not sure how I would get things like the absolute_url without a reference to the plugin object.

    Or is my whole approach doomed to failure? Is there a better way to track downloads? I ultimately want to provide users with some nice download statistics for the plugins they've uploaded, like downloads per day/week/month. Will I have to calculate and cache downloads at some point?

    EDIT: In my test dataset, there are somewhere between 10 and 20 Download instances per Plugin. In production I expect this number would be much higher for many of the plugins.

  • Word frequency tally script is too slow

    - by Dave Jarvis
    Background

    Created a script to count the frequency of words in a plain text file. The script performs the following steps:

        1. Count the frequency of words from a corpus.
        2. Retain each word in the corpus found in a dictionary.
        3. Create a comma-separated file of the frequencies.

    The script is at: http://pastebin.com/VAZdeKXs

    Problem

    The following lines continually cycle through the dictionary to match words:

        for i in $(awk '{if( $2 ) print $2}' frequency.txt); do
          grep -m 1 ^$i\$ dictionary.txt >> corpus-lexicon.txt;
        done

    It works, but it is slow because it is scanning the words it found to remove any that are not in the dictionary. The code performs this task by scanning the dictionary for every single word. (The -m 1 parameter stops the scan when the match is found.)

    Question

    How would you optimize the script so that the dictionary is not scanned from start to finish for every single word? The majority of the words will not be in the dictionary. Thank you!
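    The usual way to avoid rescanning the dictionary for every word is to load it once into a hash-based set, so that each membership test takes roughly constant time. The sketch below shows that idea in Python rather than in the shell script from the question; it does not reproduce the pastebin script, and the function names and sample words are made up for the illustration.

```python
from collections import Counter


def lexicon_frequencies(corpus_words, dictionary_words):
    """Count corpus words, keeping only those that appear in the dictionary."""
    dictionary = set(dictionary_words)  # built once; lookups are hash-based
    counts = Counter(corpus_words)
    return {word: n for word, n in counts.items() if word in dictionary}


def to_csv_lines(freqs):
    """Format as 'word,count' lines, most frequent first, ties alphabetical."""
    ordered = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{word},{n}" for word, n in ordered]


corpus = "the cat saw the xyzzy cat the".split()
dictionary = ["cat", "dog", "saw", "the"]
print(to_csv_lines(lexicon_frequencies(corpus, dictionary)))
# ['the,3', 'cat,2', 'saw,1']
```

    With this layout the cost is one pass over the dictionary plus one pass over the corpus, instead of one dictionary scan per distinct corpus word.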

  • Write file need to optimised for heavy traffic part 2

    - by Clayton Leung
    For anyone interested in where I am coming from, you can refer to part 1, but it is not necessary: "write file need to optimised for heavy traffic".

    Below is a snippet of code I have written to capture some financial tick data from the broker API. The code runs without error. I need to optimize it, because in peak hours the zf_TickEvent method will be called more than 10000 times a second. I use a MemoryStream to hold the data until it reaches a certain size, then I output it to a text file. The broker API is single-threaded.

        void zf_TickEvent(object sender, ZenFire.TickEventArgs e)
        {
            outputString = string.Format("{0},{1},{2},{3},{4}\r\n",
                e.TimeStamp.ToString(timeFmt),
                e.Product.ToString(),
                Enum.GetName(typeof(ZenFire.TickType), e.Type),
                e.Price,
                e.Volume);
            fillBuffer(outputString);
        }

        public class memoryStreamClass
        {
            public static MemoryStream ms = new MemoryStream();
        }

        void fillBuffer(string outputString)
        {
            byte[] outputByte = Encoding.ASCII.GetBytes(outputString);
            memoryStreamClass.ms.Write(outputByte, 0, outputByte.Length);
            if (memoryStreamClass.ms.Length > 8192)
            {
                emptyBuffer(memoryStreamClass.ms);
                memoryStreamClass.ms.SetLength(0);
                memoryStreamClass.ms.Position = 0;
            }
        }

        void emptyBuffer(MemoryStream ms)
        {
            FileStream outStream = new FileStream("c:\\test.txt", FileMode.Append);
            ms.WriteTo(outStream);
            outStream.Flush();
            outStream.Close();
        }

    Question: Any suggestions to make this even faster? I will try varying the buffer length, but in terms of code structure, is this (almost) the fastest? When the MemoryStream fills up and I am emptying it to the file, what happens to the new data coming in? Do I need to implement a second buffer to hold that data while I am emptying the first one? Or is C# smart enough to figure it out? Thanks for any advice.

  • C++ DWORD* to BYTE*

    - by NomeSkavinski
    My issue: I am trying to convert an array of dynamic memory of type DWORD to BYTE. Fair enough, I can loop through it and convert each DWORD entry into a BYTE. But is there a faster way to do this, taking a pointer to DWORD data and converting the whole piece of data into a pointer to BYTE data, such as with a memcpy operation? I feel this is not possible. I'm not requesting an answer, just an experienced opinion on my approach, as I have tested both approaches but keep failing to reach a solution with the second one. Thanks for any input; again, no answers, just a point in the right direction. Nor is this a homework question; I felt that had to be mentioned.
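    It may help to separate two operations that are easy to conflate here: reinterpreting the same memory as bytes (what a pointer cast or memcpy gives you, four bytes per DWORD) and narrowing each DWORD to a single byte (what the per-element loop gives you). The sketch below shows the difference in Python using the struct module, assuming 32-bit little-endian DWORDs; the sample values are made up.

```python
import struct

dwords = [0x11223344, 0x000000FF, 0x00000100]

# Reinterpretation: the same 12 bytes of memory, viewed byte by byte.
# "<3I" packs three unsigned 32-bit integers in little-endian order.
raw = struct.pack("<3I", *dwords)
reinterpreted = list(raw)

# Narrowing: one output byte per DWORD, keeping only the low 8 bits.
narrowed = [d & 0xFF for d in dwords]

print([hex(b) for b in reinterpreted])
print([hex(b) for b in narrowed])
```

    The first list has twelve entries and the second has three, so a bulk copy can only replace the loop if the byte-by-byte view of the original memory is what is actually wanted.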

  • Optimizing GDI+ drawing?

    - by user146780
    I'm using C++ and GDI+. I'm going to be making a vector drawing application and want to use GDI+ for the drawing. I've created a simple test to get familiar with it:

        case WM_PAINT:
            GetCursorPos(&mouse);
            GetClientRect(hWnd,&rct);
            hdc = BeginPaint(hWnd, &ps);
            MemDC = CreateCompatibleDC(hdc);
            bmp = CreateCompatibleBitmap(hdc, 600, 600);
            SelectObject(MemDC,bmp);
            g = new Graphics(MemDC);
            for(int i = 0; i < 1; ++i)
            {
                SolidBrush sb(Color(255,255,255));
                g->FillRectangle(&sb,rct.top,rct.left,rct.right,rct.bottom);
            }
            for(int i = 0; i < 250; ++i)
            {
                pts[0].X = 0;
                pts[0].Y = 0;
                pts[1].X = 10 + mouse.x * i;
                pts[1].Y = 0 + mouse.y * i;
                pts[2].X = 10 * i + mouse.x;
                pts[2].Y = 10 + mouse.y * i;
                pts[3].X = 0 + mouse.x;
                pts[3].Y = (rand() % 600) + mouse.y;
                Point p1, p2;
                p1.X = 0;
                p1.Y = 0;
                p2.X = 300;
                p2.Y = 300;
                g->FillPolygon(&b,pts,4);
            }
            BitBlt(hdc,0,0,900,900,MemDC,0,0,SRCCOPY);
            EndPaint(hWnd, &ps);
            DeleteObject(bmp);
            g->ReleaseHDC(MemDC);
            DeleteDC(MemDC);
            delete g;
            break;

    I'm wondering if I'm doing it right, or if I have areas that are killing the CPU, because right now it takes about 1 second to render this and I want it to be able to redraw itself very quickly. Thanks.

    In a real situation, would it be better just to figure out the portion of the screen to redraw and only redraw the elements within the bounds of that portion?

  • Optimizing near-duplicate value search

    - by GApple
    I'm trying to find near-duplicate values in a set of fields in order to allow an administrator to clean them up. There are two criteria that I am matching on:

        1. One string is wholly contained within the other, and is at least 1/4 of its length
        2. The strings have an edit distance less than 5% of the total length of the two strings

    The pseudo-PHP code:

        foreach($values as $value){
          foreach($values as $match){
            if(
              (
                $value['length'] < $match['length']
                && $value['length'] * 4 > $match['length']
                && stripos($match['value'], $value['value']) !== false
              )
              ||
              (
                $match['length'] < $value['length']
                && $match['length'] * 4 > $value['length']
                && stripos($value['value'], $match['value']) !== false
              )
              ||
              (
                abs($value['length'] - $match['length']) * 20 < ($value['length'] + $match['length'])
                && 0 < ($match['changes'] = levenshtein($value['value'], $match['value']))
                && $match['changes'] * 20 <= ($value['length'] + $match['length'])
              )
            ){
              $matches[] = &$match;
            }
          }
        }

    I've tried to reduce calls to the comparatively expensive stripos and levenshtein functions where possible, which has reduced the execution time quite a bit. However, as an O(n^2) operation this just doesn't scale to the larger sets of values, and it seems that a significant amount of the processing time is spent simply iterating through the arrays.

    Some properties of a few sets of values being operated on:

          Total | Strings      | # of matches per string |          |
        Strings | With Matches | Average | Median | Max  | Time (s) |
        --------+--------------+---------+--------+------+----------+
            844 |          413 |     1.8 |      1 |   58 |      140 |
            593 |          156 |     1.2 |      1 |    5 |       62 |
            272 |          168 |     3.2 |      2 |   26 |       10 |
            157 |           47 |     1.5 |      1 |    4 |      3.2 |
            106 |           48 |     1.8 |      1 |    8 |      1.3 |
             62 |           47 |     2.9 |      2 |   16 |      0.4 |

    Are there any other things I can do to reduce the time to check criteria, and more importantly, are there any ways for me to reduce the number of criteria checks required (for example, by pre-processing the input values), since there is such low selectivity?
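    One pre-processing idea that follows directly from the criteria is to sort the values by length. Both criteria require the longer string to be less than 4 times the length of the shorter one, so after sorting the inner loop can stop as soon as that bound is exceeded, and each unordered pair is examined once instead of twice. The sketch below shows this in Python rather than PHP; the function names are made up, the Levenshtein routine is a plain dynamic-programming version, and how much the pruning helps depends on how spread out the string lengths are.

```python
def levenshtein(a, b):
    """Edit distance by dynamic programming, keeping one previous row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def near_duplicates(values):
    """Return (shorter, longer) pairs that meet either criterion."""
    ordered = sorted(values, key=len)
    pairs = []
    for i, short in enumerate(ordered):
        for long_ in ordered[i + 1:]:
            if len(long_) >= 4 * len(short):
                break  # sorted by length, so no later string can qualify
            total = len(short) + len(long_)
            # Criterion 1: case-insensitive containment in a longer string.
            contained = (len(short) < len(long_)
                         and short.lower() in long_.lower())
            close = False
            # Criterion 2: edit distance at most 5% of the combined length.
            # The distance is at least the length difference, so test that first.
            if not contained and (len(long_) - len(short)) * 20 <= total:
                distance = levenshtein(short, long_)
                close = 0 < distance and distance * 20 <= total
            if contained or close:
                pairs.append((short, long_))
    return pairs


print(near_duplicates(["Business Machines", "Zebra",
                       "International Business Machines"]))
```

    The pair of "Business Machines" and "International Business Machines" is reported through the containment rule, while "Zebra" is compared against only one other value before the length bound ends its inner loop.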

  • Optimizing Code

    - by Claudiu
    You are given a heap of code in your favorite language which combines to form a rather complicated application. It runs rather slowly, and your boss has asked you to optimize it. What are the steps you follow to most efficiently optimize the code? What strategies have you found to be unsuccessful when optimizing code? Re-writes: At what point do you decide to stop optimizing and say "This is as fast as it'll get without a complete re-write." In what cases would you advocate a simple complete re-write anyway? How would you go about designing it?

  • Results from two queries at once in sqlite?

    - by SF.
    I'm currently trying to optimize the sluggish process of retrieving a page of log entries from the SQLite database. I noticed I almost always retrieve the next entries along with a count of available entries:

        SELECT time, level, type, text
        FROM Logs
        WHERE level IN (%s)
        ORDER BY time DESC, id DESC
        LIMIT LOG_REQ_LINES OFFSET %d * LOG_REQ_LINES;

    together with the total count of records that can match the current query (for a "page n of m" display):

        SELECT count(*) FROM Logs WHERE level IN (%s);

    I wonder if I could combine the two queries and ask them both in one sqlite3_exec() call by simply concatenating the query strings. How should my callback function look then? Can I distinguish between the different types of data by argc? What other optimizations would you suggest?
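    As an alternative to concatenating two statements, the count can be carried along as an extra column through a scalar subquery, so that one statement returns both the page and the total. The sketch below tries this with Python's sqlite3 module instead of the C API; the function name fetch_page, the table definition and the demo rows are assumptions made for the illustration, and LOG_REQ_LINES is given an arbitrary small value.

```python
import sqlite3

LOG_REQ_LINES = 3  # page size; name from the question, value chosen for the demo


def fetch_page(conn, levels, page):
    """Return (rows, total) for one page of log entries at the given levels."""
    marks = ",".join("?" * len(levels))
    sql = (
        "SELECT time, level, type, text, "
        f"(SELECT count(*) FROM Logs WHERE level IN ({marks})) AS total "
        f"FROM Logs WHERE level IN ({marks}) "
        "ORDER BY time DESC, id DESC LIMIT ? OFFSET ?"
    )
    params = [*levels, *levels, LOG_REQ_LINES, page * LOG_REQ_LINES]
    rows = conn.execute(sql, params).fetchall()
    # Caveat: a page past the end returns no rows, so the total is unknown.
    total = rows[0][4] if rows else 0
    return [row[:4] for row in rows], total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Logs (id INTEGER PRIMARY KEY, time INTEGER, "
    "level INTEGER, type TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO Logs (time, level, type, text) VALUES (?, ?, ?, ?)",
    [(n, n % 3, "demo", f"entry {n}") for n in range(10)],
)

rows, total = fetch_page(conn, [1, 2], 0)
print([row[0] for row in rows], total)  # [8, 7, 5] 6
```

    Passed to sqlite3_exec(), the same statement would invoke the callback once per row with five columns, the last one being the total, so there would be no need to tell two result sets apart by argc.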

  • Does a c/c++ compiler optimize constant divisions by power-of-two value into shifts?

    - by porgarmingduod
    Question says it all. Does anyone know if the following...

        size_t div(size_t value) {
            const size_t x = 64;
            return value / x;
        }

    ...is optimized into this?

        size_t div(size_t value) {
            return value >> 6;
        }

    Do compilers do this? (My interest lies in GCC.) Are there situations where it does and others where it doesn't? I would really like to know, because every time I write a division that could be optimized like this, I spend some mental energy wondering whether precious fractions of a second are wasted doing a division where a shift would suffice.
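    The arithmetic behind this question can be checked independently of any compiler. For unsigned values, dividing by 64 and shifting right by 6 always agree, which is why the substitution is safe for size_t. For negative signed values they differ, because C and C++ division truncates toward zero while an arithmetic shift rounds toward negative infinity. The Python sketch below demonstrates both facts; c_div is a made-up helper that imitates C's truncating division, and Python's >> on negative integers is used as a stand-in for an arithmetic shift.

```python
def c_div(a, b):
    """Integer division that truncates toward zero, as C and C++ do."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


# Unsigned case: division by 64 and a right shift by 6 always agree.
unsigned_ok = all(v // 64 == v >> 6 for v in range(1 << 16))
print(unsigned_ok)  # True

# Signed case: truncating division and an arithmetic shift can disagree.
print(c_div(-1, 64), -1 >> 6)      # 0 -1
print(c_div(-130, 64), -130 >> 6)  # -2 -3
```

    This is consistent with the common observation that compilers turn the unsigned division into a plain shift, while a signed division by a power of two needs a small correction in addition to the shift.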

  • Iterative Reduction to Null Matrix

    - by user1459032
    Here's the problem: I'm given a matrix like

        Input:
        1 1 1
        1 1 1
        1 1 1

    At each step, I need to find a "second" matrix of 1's and 0's with no two 1's on the same row or column. Then, I'll subtract the second matrix from the original matrix. I will repeat the process until I get a matrix with all 0's. Furthermore, I need to take the least possible number of steps. I need to print all the "second" matrices in O(n) time. In the above example I can get to the null matrix in 3 steps by subtracting these three matrices in order:

        Expected output:
        1 0 0    0 0 1    0 1 0
        0 1 0    1 0 0    0 0 1
        0 0 1    0 1 0    1 0 0

    I have coded an attempt, in which I am finding the first maximum value and creating the second matrices based on the index of that value. But for the above input I am getting 4 output matrices, which is wrong:

        My output:
        1 0 0    0 1 0    0 0 1    0 0 0
        0 1 0    1 0 0    0 0 0    0 0 1
        0 0 1    0 0 0    1 0 0    0 1 0

    My solution works for most of the test cases but fails for the one given above. Can someone give me some pointers on how to proceed, or find an algorithm that guarantees optimality?

    Test case that works:

        Input:
        0 2 1
        0 0 0
        3 0 0

        Output:
        0 1 0    0 1 0    0 0 1
        0 0 0    0 0 0    0 0 0
        1 0 0    1 0 0    1 0 0
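    One approach that does guarantee the minimum treats the matrix as a bipartite multigraph between rows and columns. Each step lowers any row or column sum by at most 1, so at least d steps are needed, where d is the largest row or column sum, and a classical result on bipartite graphs says d steps always suffice. The sketch below, in Python, pads the matrix so that every row and column sums to d, then repeatedly removes a perfect matching found by augmenting paths. It assumes a square matrix of non-negative integers, the function names are made up, and it makes no attempt to meet the O(n) bound mentioned in the question.

```python
def perfect_matching(m):
    """Augmenting-path matching; returns match_col[j] = row matched to column j."""
    n = len(m)
    match_col = [-1] * n

    def try_row(i, seen):
        for j in range(n):
            if m[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            raise ValueError("no perfect matching; matrix is not regular")
    return match_col


def decompose(matrix):
    """Split a square non-negative integer matrix into the fewest 0/1 steps."""
    n = len(matrix)
    rows = [sum(r) for r in matrix]
    cols = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    d = max(rows + cols, default=0)
    work = [r[:] for r in matrix]   # padded copy that gets consumed
    left = [r[:] for r in matrix]   # original units not yet subtracted
    # Pad greedily until every row and column of `work` sums to d.
    i = j = 0
    while i < n and j < n:
        add = min(d - rows[i], d - cols[j])
        work[i][j] += add
        rows[i] += add
        cols[j] += add
        if rows[i] == d:
            i += 1
        if cols[j] == d:
            j += 1
    steps = []
    for _ in range(d):
        step = [[0] * n for _ in range(n)]
        for col, row in enumerate(perfect_matching(work)):
            work[row][col] -= 1
            if left[row][col] > 0:  # only original units appear in the output
                left[row][col] -= 1
                step[row][col] = 1
        steps.append(step)
    return steps


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(len(decompose(ones)))  # 3
```

    For the all-ones input that defeats the greedy attempt, this yields 3 steps, and the steps always add up to the original matrix because the padding entries are never written to the output.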

  • Graph search problem with route restrictions

    - by Darcara
    I want to calculate the most profitable route, and I think this is a type of traveling salesman problem. I have a set of nodes that I can visit, a function to calculate the cost of traveling between nodes, and points awarded for reaching the nodes. The goal is to reach a fixed, known score while minimizing the cost. The costs and rewards are not fixed and depend on the nodes visited before. The starting node is fixed. There are some restrictions on how nodes can be visited. Some simplified examples include:

        - Node B can only be visited after A
        - After node C has been visited, D or E can be visited. Visiting at least one is required, visiting both is permissible.
        - Z can only be visited after at least 5 other nodes have been visited
        - Once 50 nodes have been visited, the nodes A-M will no longer reward points
        - Certain nodes can (and probably must) be visited multiple times

    Currently I can think of only two ways to solve this:

        a) Genetic algorithms, with the fitness function calculating the cost/benefit of the generated route
        b) Dijkstra search through the graph, since the starting node is fixed, although the large number of nodes will probably make that infeasible memory-wise

    Are there any other ways to determine the best route through the graph? It doesn't need to be perfect; an approximate path is perfectly fine, as long as its error is acceptable. Would TSP solvers be an option here?
