Search Results

Search found 5070 results on 203 pages for 'algorithm'.

Page 18/203 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25  | Next Page >

  • mysql/algorithm: Weighting an average to accentuate differences from the mean

    - by Sai Emrys
    This is for a new feature on http://cssfingerprint.com (see /about for general info). The feature looks up the sites you've visited in a database of site demographics, and tries to guess what your demographic stats are based on that. All my demgraphics are in 0..1 probability format, not ratios or absolute numbers or the like. Essentially, you have a large number of data points that each tend you towards their own demographics. However, just taking the average is poor, because it means that by adding in a lot of generic data, the number goes down. For example, suppose you've visited sites S0..S50. All except S0 are 48% female; S0 is 100% male. If I'm guessing your gender, I want to have a value close to 100%, not just the 49% that a straight average would give. Also, consider that most demographics (i.e. everything other than gender) does not have the average at 50%. For example, the average probability of having kids 0-17 is ~37%. The more a given site's demographics are different from this average (e.g. maybe it's a site for parents, or for child-free people), the more it should count in my guess of your status. What's the best way to calculate this? For extra credit: what's the best way to calculate this, that is also cheap & easy to do in mysql?

    Read the article

  • Creating combinations that have no more one intersecting element

    - by khuss
    I am looking to create a special type of combination in which no two sets have more than one intersecting element. Let me explain with an example: Let us say we have 9 letter set that contains A, B, C, D, E, F, G, H, and I If you create the standard non-repeating combinations of three letters you will have 9C3 sets. These will contain sets like ABC, ABD, BCD, etc. I am looking to create sets that have at the most only 1 common letter. So in this example, we will get following sets: ABC, ADG, AEI, AFH, BEH, BFG, BDI, CFI, CDH, CEG, DEF, and GHI - note that if you take any two sets there are no more than 1 repeating letter. What would be a good way to generate such sets? It should be scalable solution so that I can do it for a set of 1000 letters, with sub set size of 4. Any help is highly appreciated. Thanks

    Read the article

  • Binary Search Tree for specific intent

    - by Luís Guilherme
    We all know there are plenty of self-balancing binary search trees (BST), being the most famous the Red-Black and the AVL. It might be useful to take a look at AA-trees and scapegoat trees too. I want to do deletions insertions and searches, like any other BST. However, it will be common to delete all values in a given range, or deleting whole subtrees. So: I want to insert, search, remove values in O(log n) (balanced tree). I would like to delete a subtree, keeping the whole tree balanced, in O(log n) (worst-case or amortized) It might be useful to delete several values in a row, before balancing the tree I will most often insert 2 values at once, however this is not a rule (just a tip in case there is a tree data structure that takes this into account) Is there a variant of AVL or RB that helps me on this? Scapegoat-trees look more like this, but would also need some changes, anyone who has got experience on them can share some thougts? More precisely, which balancing procedure and/or removal procedure would help me keep this actions time-efficient?

    Read the article

  • algorithm for checking addresses for matches?

    - by user151841
    I'm working on a survey program where people will be given promotional considerations the first time they fill out a survey. In a lot of scenarios, the only way we can stop people from cheating the system and getting a promotion they don't deserve is to check street address strings against each other. I was looking at using levenshtein distance to give me a number to measure similarity, and consider those below a certain threshold a duplicate. However, if someone were looking to game the system, they could easily write "S 5th St" instead of "South Fifth Street", and levenshtein would consider those strings to be very different. So then I was thinking to convert all strings to a 'standard address form' i.e. 'South' becomes 's', 'Fifth' becomes '5th', etc. Then I was thinking this is hopeless, and too much effort to get it working robustly. Is it? I'm working with PHP/MySql, so I have the limitations inherent in that system.

    Read the article

  • open source gossip-based membership protocol?

    - by Aaron
    I am looking for a library which I can plug into a distributed application which implements any gossip-based membership protocol. Such a library would allow me to send/receive membership lists, merge received membership lists, etc... Even better would be if the library implemented a protocol with performance O(logn) performance guarantees. Does anyone know of any open source library like this? It doesn't need to meet all of the aforementioned requirements; even something partially implemented would be helpful.

    Read the article

  • Using static vs. member find method on a STL set?

    - by B Johnson
    I am using a set because, i want to use the quick look up property of a sorted container such as a set. I am wondering if I have to use the find member method to get the benefit of a sorted container, or can I also use the static find method in the STL algorithms? My hunch is that using the static version will use a linear search instead of a binary search like I want.

    Read the article

  • Better algorithm for estimating download time

    - by Scott Smith
    We've all seen the download time running estimate that initially says something like "7 days", but keeps dropping wildly (e.g. "23 hours", "45 minutes", "1 min. 50 sec", etc) with each successive estimation as the chunks are downloaded. To avoid these initial (alarming) estimates, there are techniques one could try like suppressing display of the first n estimates, or waiting for the delta between estimates to drop below some threshold before you start displaying them, but these don't seem like a general, robust solution. There are corner cases involving too few samples, or samples that actually are wildly varying... I think I recall a general solution for this kind of thing in mathematics (statistics?) that reduced or eliminated these wild values. Does anyone know?

    Read the article

  • algorithm advice for finding maximum items within a time period

    - by darren
    Hi everyone. I have a database schema that is similar to the following: | User | Event | Date |--------|---------------|------ | 111 | Walked dog | 2009-10-1 | 222 | Walked dog | 2009-10-2 | 333 | Fed Fish | 2009-10-5 | 222 | Did Laundry | 2009-10-6 | 111 | Fed Fish | 2009-10-7 | 111 | Walked dog | 2009-10-18 | 222 | Walked dog | 2009-10-19 | 111 | Fed Fish | 2009-10-21 I would like to produce a query that returns the maximum number of times a user performs some action within a time period. For example, given a time period of 5 days, what is the maximum number of times user 111 walked the dog? The most obvious solution would be to start at some zero point and move forward each day, summing up 5 day periods along the way, then taking the maximum total out of all the 5 day windows. the approach seems incredibly costly however. I would appreciate any suggestions you may have.

    Read the article

  • Sorting data by relevance, from multiple tables

    - by Oden
    Hey, How is it possible to sort data from multiple tables by relevance? My table structure is following: I have 3 tables in my database, one table contains the name of solar systems, the second for e.g. of planets. There is one more table, witch is a connection between solar systems and planets. If I want to get data of a planet, witch is in the Milky Way, i post this data to the server, and it gives me a multi-dimensional array witch contains: The Milky Way, with every planet in it Every planet, witch name contains the string Milky Way (maybe thats a bat example because i don't think that theres but one planet with this name, but the main concept is on file) But, i want to set the most relevant restaurants to the top of the array. (for the relevance i would check the description of the restaurants or something like that) So, how would you do that kind of data sorting?

    Read the article

  • Algorithm for bitwise fiddling

    - by EquinoX
    If I have a 32-bit binary number and I want to replace the lower 16-bit of the binary number with a 16-bit number that I have and keep the upper 16-bit of that number to produce a new binary number.. how can I do this using simple bitwise operator? For example the 32-bit binary number is: 1010 0000 1011 1111 0100 1000 1010 1001 and the lower 16-bit I have is: 0000 0000 0000 0001 so the result is: 1010 0000 1011 1111 0000 0000 0000 0001 how can I do this?

    Read the article

  • Algorithm to split an array into N groups based on item index (should be something simple)

    - by serg
    I feel that it should be something very simple and obvious but just stuck on this for the last half an hour and can't move on. All I need is to split an array of elements into N groups based on element index. For example we have an array of 30 elements [e1,e2,...e30], that has to be divided into N=3 groups like this: group1: [e1, ..., e10] group2: [e11, ..., e20] group3: [e21, ..., e30] I came up with nasty mess like this for N=3 (pseudo language, I left multiplication on 0 and 1 just for clarification): for(i=0;i<array_size;i++) { if(i>=0*(array_size/3) && i<1*(array_size/3) { print "group1"; } else if(i>=1*(array_size/3) && i<2*(array_size/3) { print "group2"; } else if(i>=2*(array_size/3) && i<3*(array_size/3) print "group3"; } } But what would be the proper general solution? Thanks.

    Read the article

  • Algorithm for finding similar users through a join table

    - by Gdeglin
    I have an application where users can select a variety of interests from around 300 possible interests. Each selected interest is stored in a join table containing the columns user_id and interest_id. Typical users select around 50 interests out of the 300. I would like to build a system where users can find the top 20 users that have the most interests in common with them. Right now I am able to accomplish this using the following query: SELECT i2.user_id, count(i2.interest_id) AS count FROM interests_users as i1, interests_users as i2 WHERE i1.interest_id = i2.interest_id AND i1.user_id = 35 GROUP BY i2.user_id ORDER BY count DESC LIMIT 20; However, this query takes approximately 500 milliseconds to execute with 10,000 users and 500,000 rows in the join table. All indexes and database configuration settings have been tuned to the best of my ability. I have also tried avoiding the use of joins altogether using the following query: select user_id,count(interest_id) count from interests_users where interest_id in (13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,508) group by user_id order by count desc limit 20; But this one is even slower (~800 milliseconds). How could I best lower the time that I can gather this kind of data to below 100 milliseconds? I have considered putting this data into a graph database like Neo4j, but I am not sure if that is the easiest solution or if it would even be faster than what I am currently doing.

    Read the article

  • looking for a set union find algorithm

    - by Mig
    I have thousands of lines of 1 to 100 numbers, every line define a group of numbers and a relationship among them. I need to get the sets of related numbers. Little Example: If I have this 7 lines of data T1 T2 T3 T4 T5 T6 T1 T5 T4 T3 T4 I need a not so slow algorith to know that the sets here are: T1 T2 T6 (because T1 is related with T2 in the first line and T1 related with T6 in the line 5) T3 T4 T5 (because T5 is with T4 in line 6 and T3 is with T4 in line 7) but when you have very big sets is painfully slow to do a search of a T(x) in every big set, and do unions of sets... etc. Do you have a hint to do this in a not so brute force manner? I'm trying to do this in python. Thanks

    Read the article

  • A graph-based tuple merge?

    - by user1644030
    I have paired values in tuples that are related matches (and technically still in CSV files). Neither of the paired values are necessarily unique. tupleAB = (A####, B###), (A###, B###), (A###, B###)... tupleBC = (B####, C###), (B###, C###), (B###, C###)... tupleAC = (A####, C###), (A###, C###), (A###, C###)... My ideal output would be a dictionary with a unique ID and a list of "reinforced" matches. The way I try to think about it is in a graph-based context. For example, if: tupleAB[x] = (A0001, B0012) tupleBC[y] = (B0012, C0230) tupleAC[z] = (A0001, C0230) This would produce: output = {uniquekey0001, [A0001, B0012, C0230]} Ideally, this would also be able to scale up to more than three tuples (for example, adding a "D" match that would result in an additional three tuples - AD, BD, and CD - and lists of four items long; and so forth). In regards to scaling up to more tuples, I am open to having "graphs" that aren't necessarily fully connected, i.e., every node connected to every other node. My hunch is that I could easily filter based on the list lengths. I am open to any suggestions. I think, with a few cups of coffee, I could work out a brute force solution, but I thought I'd ask the community if anyone was aware of a more elegant solution. Thanks for any feedback.

    Read the article

  • Algorithm Question Maximize Average of Functions

    - by paradoxperfect
    Hello, I have a set of N functions each denoted by Fi(h). Each function returns some value when given an h. I'm trying to figure out a way to maximize the average of all of the functions given some total H value. For example, say each function represents a grade on an assignment. If I spend h hours on assignment i, I will get g = Fi(h) as my grade. I'm given H hours to finish all of the assignments. I want to maximize my average grade for all assignments. Can anyone point me in the right direction to figure this out?

    Read the article

  • decoding algorithm wanted

    - by Horace Ho
    I receive encoded PDF files regularly. The encoding works like this: the PDFs can be displayed correctly in Acrobat Reader select all and copy the test via Acrobat Reader and paste in a text editor will show that the content are encoded so, examples are: 13579 -> 3579; hello -> jgnnq it's basically an offset (maybe swap) of ASCII characters. The question is how can I find the offset automatically when I have access to only a few samples. I cannot be sure whether the encoding offset is changed. All I know is some text will usually (if not always) show up, e.g. "Name:", "Summary:", "Total:", inside the PDF. Thank you!

    Read the article

  • Operant conditioning algorithm?

    - by Ken
    What's the best way to implement real time operant conditioning (supervised reward/punishment-based learning) for an agent? Should I use a neural network (and what type)? Or something else? I want the agent to be able to be trained to follow commands like a dog. The commands would be in the form of gestures on a touchscreen. I want the agent to be able to be trained to follow a path (in continuous 2D space), make behavioral changes on command (modeled by FSM state transitions), and perform sequences of actions. The agent would be in a simulated physical environment.

    Read the article

  • what is the easiest way to do this function in c# ?

    - by From.ME.to.YOU
    Hello let say that we have an array [5,5] 01,02,03,04,05 06,07,08,09,10 11,12,13,14,15 16,17,18,19,20 21,22,23,24,25 the user should send 2 values to the function (start,searchFOR) for example (13,25) the function should search for that value in this way 07,08,09 12, ,14 17,18,19 if the value is n't found in this level it will goes a level higher 01,02,03,04,05 06, , , ,10 11, , , ,15 16, , , ,20 21,22,23,24,25 if the array is bigger than this and the value didn't found it will go to a level higher Thanks for your help

    Read the article

  • algorithm to find Best 8 minute window in a 1 hour run

    - by Arun
    I have a requirement like, an activity runs for about more than an hour. I need to get the best 8 minute window where some parameters are maximum. say a value x, which is dynamic for every second. if my activity runs for one hr,i get 3600 values for x. I need to find the best continuous 8 minute time interval where x value was the highest among all the others. if i capture say from 0th minute to 8th minute, there may be another time frame like 0.4 to 8.4 where it was maximum. the granularity is one second. every second we need to consider. basically the peak 8 minute window where x was maximum. please help me with the design

    Read the article

  • Efficient algorithm for Next button on a MySQL result set

    - by David Grayson
    I have a website that lets people view rows in a table (each row is a picture). There are more than 100,000 rows. You can view different subsets of the rows, and you can view them with different sort orders. While you are viewing one of the rows, you can click the "Next" or "Previous" buttons to go the next/previous row in the list. How would you implement the "Next" and "Previous" features of the website? More specifically, if you have an arbitrary query that returns a list of up to 100,000+ rows, and you know some information about the current row someone is viewing, how do you determine the NEXT row efficiently? Here is the pseudo-code of the solution I came up with when the website was young, and it worked well when there were only 1000 rows, but now that there are 100,000 rows I think it is eating up too much memory. int nextRowId(string query, int currentRowId) { array allRowIds = mysql_query(query); // Takes up a lot of memory! int currentIndex = (index of currentRowId in allRowIds); // Takes time! return allRowIds[currentIndex+1]; } While you are thinking about this problem, remember that the website can store more information about the current row than just its ID (for example, the position of the current row in the result set), and this information can be used as a hint to help determine the ID of the next row. Edit: Sorry for not mentioning this earlier, but this isn't just a static website: rows can often be added to the list, and rows can be re-ordered in the list. (Much rarer, rows can be removed from the list.) I think that I should worry about that kind of thing, but maybe you can convince me otherwise.

    Read the article

  • Algorithm to split an article without breaking the reading flow or HTML code

    - by Victor Stanciu
    Hello, I have a very large database of articles, of varying lengths. The articles have HTML elements in them. I have to insert some ads (simple <script> elements) in the body of each article when it is displayed (I know, I hate ads that interrupt my reading too). Now, the problem is that each ad must be inserted at about the same position in each article. The simplest solution is to simply split the article on a fixed number of characters (without breaking words), and insert the ad code. This, however, runs the risk of inserting the ad in the middle of a HTML tag. I could go the regex way, but I was thinking about the following solution, using JS: Establish a character count threshold. For example, "the add should be inserted at about 200 words" Set accepted deviations in each direction, say -20, +20 characters. Loop through each text node inside the article, and while doing so, keep count of the total number of characters so far Once the count exceeds the threshold, make the following decision: 4.1. If count exceeds the threshold by a value lower that the positive accepted deviation (for example, 17 characters), insert the ad code just after the current text node. 4.2. If the count is greater than the sum of the threshold and the deviation, roll back to the previous text node, and make the same decision, only this time use the previous count and check if it's lower than the difference between the threshold and the deviation, and if not, insert the ad between the current node and the previous one. 4.3. If the 4.1 and 4.2 fail (which means that the previous node reached a too low character count and the current node a too high one), insert the ad after whatever character count is needed inside the current element. I know it's convoluted, but it's the first thing out of my mind and it has the advantage that, by trying to insert the ad between text nodes, perhaps it will not break the flow of the article as bad as it would if I would just stick it in (like the final 4.3 case) Here is some pseudo-code I put together, I don't trust my english-explaining skills: threshold = 200 deviation = 20 current_count = 0 for each node in article_nodes { previous_count = current_count current_count = current_count + node.length if current_count < threshold { continue // next interation } if current_count > threshold + deviation { if previous_count < threshdold - deviation { // insert ad in current node } else { // insert ad between the current and previous nodes } } else { // insert ad after the current node } break; } Am I over-complicating stuff, or am I missing a simpler, more elegant solution?

    Read the article

  • Radial Grid Search Algorithm

    - by grey
    I'm sure there's a clean way to do this, but I'm probably not using the right keywords for find it. So let's say I have a grid. Starting from a position on the grid, return all of the grid coordinates that fall within a given distance. So I call something like: getCoordinates( currentPosition, distance ) And for each coordinate, starting from the initial position, add all cardinal directions, and then add the spaces around those and so forth until the distance is reached. I imagine that on a grid this would look like a diamond.

    Read the article

  • Algorithm for Determining Variations of Differing Lengths

    - by joseph.ferris
    I have four objects - for the sake of arguments, let say that they are the following letters: A B C D I need to calculate the number of variations that can be made for these under the following two conditions: No repetition Objects are position agnostic Taking the above, this means that with a four object sequence, I can have only one sequence that matches the criteria (since order is not considered for being unique): ABCD There are four variations for a three object combination from the four object pool: ABC, ABD, ACD, and BCD There are six variations for a two object combination from the four object pool: AB, AC, AD, BC, BD, and CD And the most simple one, if taken on at a time: A, B, C, and D I swear that this was something covered in school, many, many years ago - and probably forgotten since I didn't think I would use it. :-) I am anticipating that factorials will come into play, but just trying to force an equation is not working. Any advice would be appreciated.

    Read the article

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25  | Next Page >