Search Results

Search found 26127 results on 1046 pages for 'google penguin algorithm'.

Page 332/1046 | < Previous Page | 328 329 330 331 332 333 334 335 336 337 338 339  | Next Page >

  • finding long repeated substrings in a massive string

    - by Will
    I naively imagined that I could build a suffix trie where I keep a visit-count for each node, and then the deepest nodes with counts greater than one are the result set I'm looking for. I have a really really long string (hundreds of megabytes). I have about 1 GB of RAM. This is why building a suffix trie with counting data is too inefficient space-wise to work for me. To quote Wikipedia's Suffix tree: storing a string's suffix tree typically requires significantly more space than storing the string itself. The large amount of information in each edge and node makes the suffix tree very expensive, consuming about ten to twenty times the memory size of the source text in good implementations. The suffix array reduces this requirement to a factor of four, and researchers have continued to find smaller indexing structures. And that was wikipedia's comments on the tree, not trie. How can I find long repeated sequences in such a large amount of data, and in a reasonable amount of time (e.g. less than an hour on a modern desktop machine)? (Some wikipedia links to avoid people posting them as the 'answer': Algorithms on strings and especially Longest repeated substring problem ;-) )

    Read the article

  • How can you Merge sort a .Net framework LinkedList (of T)

    - by Andronicus
    There's a few questions discussing Merge sorting a LinkedList, but how can I do it with the C# LinkedList? Since the LinkedListNode Next and Previous properties are read-only most of the in-place algorithms I've come across are not possible. I've resorted to removing all the nodes, sorting them with a .OrderBy(node = node.Value) and then re-inserting them into the Linked list, but it's fairly crude.

    Read the article

  • Efficient way to get highly correlated pairs from large data set in Python or R

    - by Akavall
    I have a large data set (Let's say 10,000 variables with about 1000 elements each), we can think of it as 2D list, something like: [[variable_1], [variable_2], ............ [variable_n] ] I want to extract highly correlated variable pairs from that data. I want "highly correlated" to be a parameter that I can choose. I don't need all pairs to be extracted, and I don't necessarily want the most correlated pairs. As long as there is an efficient method that gets me highly correlated pairs I am happy. Also, it would be nice if a variable does not show up in more than one pair. Although this might not be crucial. Of course, there is a brute force way to finding such pairs, but it is too slow for me. I've googled around for a bit and found some theoretical work on this issue, but I wasn't able for find a package that could do what I am looking for. I mostly work in python, so a package in python would be most helpful, but if there exists a package in R that does what I am looking for it will be great. Does anyone know of a package that does the above in Python or R? Or any other ideas? Thank You in Advance

    Read the article

  • Enumerating large (20-digit) [probable] prime numbers

    - by Paul Baker
    Given A, on the order of 10^20, I'd like to quickly obtain a list of the first few prime numbers greater than A. OK, my needs aren't quite that exact - it's alright if occasionally a composite number ends up on the list. What's the fastest way to enumerate the (probable) primes greater than A? Is there a quicker way than stepping through all of the integers greater than A (other than obvious multiples of say, 2 and 3) and performing a primality test for each of them? If not, and the only method is to test each integer, what primality test should I be using?

    Read the article

  • which is time consuming construct in following program?

    - by user388338
    while submitting a solution for practise problem 6(odd) i got TLE error but while using using print and scanf in place cin and cout my sol was submitted successfully with 0.77s time..i want to know how can i make it more efficient link to problem is codechef problem 6 #include<iostream> #include<cstdio> using namespace std; int main() {int n,N; scanf("%d",&n); for(int l=0;l<n;l++) { scanf("%d",&N); int i=0,x; if(N<=0) continue; for(;N>=(x=(2<<i));i++); printf("%d",x/2); cout<<"\n"; } }

    Read the article

  • Show me some cool python list comprehensions

    - by christangrant
    One of the major strengths of python and a few other (functional) programming languages are the list comprehension. They allow programmers to write complex expressions in 1 line. They may be confusing at first but if one gets used to the syntax, it is much better than nested complicated for loops. With that said, please share with me some of the coolest uses of list comprehensions. (By cool, I just mean useful) It could be for some programming contest, or a production system. For example: To do the transpose of a matrix mat >>> mat = [ ... [1, 2, 3], ... [4, 5, 6], ... [7, 8, 9], ... ] >>> [[row[i] for row in mat] for i in [0, 1, 2]] [[1, 4, 7], [2, 5, 8], [3, 6, 9]] Please include a description of the expression and where it was used (if possible).

    Read the article

  • Interview question : What is the fastest way to generate prime number recursively ?

    - by hilal
    Generation of prime number is simple but what is the fastest way to find it and generate( prime numbers) it recursively ? Here is my solution. However, it is not the best way. I think it is O(N*sqrt(N)). Please correct me, if I am wrong. public static boolean isPrime(int n) { if (n < 2) { return false; } else if (n % 2 == 0 & n != 2) { return false; } else { return isPrime(n, (int) Math.sqrt(n)); } } private static boolean isPrime(int n, int i) { if (i < 2) { return true; } else if (n % i == 0) { return false; } else { return isPrime(n, --i); } } public static void generatePrimes(int n){ if(n < 2) { return ; } else if(isPrime(n)) { System.out.println(n); } generatePrimes(--n); } public static void main(String[] args) { generatePrimes(200); }

    Read the article

  • What is the most efficient method to find x contiguous values of y in an array?

    - by Alec
    Running my app through callgrind revealed that this line dwarfed everything else by a factor of about 10,000. I'm probably going to redesign around it, but it got me wondering; Is there a better way to do it? Here's what I'm doing at the moment: int i = 1; while ( ( (*(buffer++) == 0xffffffff && ++i) || (i = 1) ) && i < desiredLength + 1 && buffer < bufferEnd ); It's looking for the offset of the first chunk of desiredLength 0xffffffff values in a 32 bit unsigned int array. It's significantly faster than any implementations I could come up with involving an inner loop. But it's still too damn slow.

    Read the article

  • What are the core mathematical concepts a good developer should know?

    - by Jose B.
    Since Graduating from a very small school in 2006 with a badly shaped & outdated program (I'm a foreigner & didn't know any better school at the time) I've come to realize that I missed a lot of basic concepts from a mathematical & software perspective that are mostly the foundations of other higher concepts. I.e. I tried to listen/watch the open courseware from MIT on Introduction to Algorithms but quickly realized I was missing several mathematical concepts to better understand the course. So what are the core mathematical concepts a good software engineer should know? And what are the possible books/sites you will recommend me?

    Read the article

  • Stuck on solving the Minimal Spanning Tree problem.

    - by kunjaan
    I have reduced my problem to finding the minimal spanning tree in the graph. But I want to have one more constraint which is that the total degree for each vertex shouldnt exceed a certain constant factor. How do I model my problem? Is MST the wrong path? Do you know any algorithms that will help me? One more problem: My graph has duplicate edge weights so is there a way to count the number of unique MSTs? Are there algorithms that do this? Thank You. Edit: By degree, I mean the total number of edges connecting the vertex. By duplicate edge weight I mean that two edges have the same weight.

    Read the article

  • 3-clique counting in a graph

    - by Legend
    I am operating on a (not so) large graph having about 380K edges. I wrote a program to count the number of 3-cliques in the graph. A quick example: List of edges: A - B B - C C - A C - D List of cliques: A - B - C A 3-clique is nothing but a triangle in a graph. Currently, I am doing this using PHP+MySQL. As expected, it is not fast enough. Is there a way to do this in pure MySQL? (perhaps a way to insert all 3-cliques into a table?)

    Read the article

  • What is the best book for learning about Algorithms?

    - by sheats
    I know what algorithms are, but I have never consciously used or created one for any of the programming that I have done. So I'd like to get a book about the subject - I'd prefer if it was in python but that's not a strict requirement. What book about algorithms helped you most to understand, use, and create algorithms? One book per answer so they can be voted on...

    Read the article

  • Are there any well-known algorithms or computer models that computer scientists use to predict FIFA

    - by Khnle
    Occasionally I read news articles that mention about some computer models that computer scientists use to predict winners of some sporting events or the odds for betting which I think there must be a mathematical model behind it. I never bothered to think twice even though I am a "pseudo computer scientist" myself. With the 2010 FIFA World Cup just underway, and since I am also a "pseudo football/soccer player" myself, I just started to wonder about these calculations algorithms. For example, I know one factor is determining the strength of opponents, so that a win against a strong opponent can count more than a win against a weak opponent. But it now kind of gets in a circular loop, or at least how does one determine the strength of a team in the first place, before that team can be considered strong or weak? If it's based on a historical data then there's no way that could be accurate, because those players of the past are no longer on the fields so their impact is none (except maybe if they become coaches like Maradona) Anyway, long question short, if you're happen to be working in this field or have some knowledge, please shed some lights.

    Read the article

  • how to fast compute distance between high dimension vectors

    - by chyojn
    assume there are three group of high dimension vectors: {a_1, a_2, ..., a_N}, {b_1, b_2, ... , b_N}, {c_1, c_2, ..., c_N}. each of my vector can be represented as: x = a_i + b_j + c_k, where 1 <=i, j, k <= N. then the vector is encoded as (i, j, k) wich is then can be decoded as x = a_i + b_j + c_k. my question is, if there are two vector: x = (i_1, j_1, k_1), y = (i_2, j_2, k_2), is there a method to compute the euclidian distance of these two vector without decode x and y.

    Read the article

  • GWT Calendrical Calculations

    - by Kyle Hayes
    We have a GWT application that needs to display various holidays. Is there a library available to do these calendrical calculations? If not, we'll have to do our own that we can ingest a set of rules to. Cheers

    Read the article

  • Collision free hash function for a specific data structure

    - by Max
    Is it possible to create collision free hash function for a data structure with specific properties. The datastructure is int[][][] It contains no duplicates The range of integers that are contained in it is defined. Let's say it's 0..1000, the maximal integer is definitely not greater than 10000. Big problem is that this hash function should also be very fast. Is there a way to create such a hash function? Maybe at run time depending on the integer range?

    Read the article

  • more efficient version of this?

    - by john connor
    i have this thingy here : function numOfPackets(bufferSize, packetSize) { if (bufferSize <= 0 || packetSize > bufferSize) return 0; if (packetSize < 0) throw Error(); var out = 0; for(;;){ out++; bufferSize = bufferSize - packetSize; if( packetSize > bufferSize ) break; } return out; } which i run at often , can u give me more efficent variant of it?

    Read the article

  • Automatic images translation to 3d model

    - by farrakhov-bulat
    I'm quite interested in automatic images translation to 3d models. Not really for commercial product, but from the point of possible academic research and implementation. What I'd like to achieve is almost transparent for user process of transformation series of images (fewer is better) to 3d model which might be shown in flash/silverlight/javafx or similar. Consider online furniture store with 3d models of all items in stock. Kinda cool to have ability to see the product in 3d before purchasing it. I managed to find a few pieces of software, like insight3d, but it couldn't be used in my case I guess. So, are there any similar projects or tips for me? If it would require to write that piece of software - I'd really love to dig into research on this field.

    Read the article

  • skipping certain number of frames on a timeline

    - by clamp
    hi, i have a mathematical problem which is a bit hard to describe, but i'll give it a try anyway. in a timeline, i have a number of frames, of which i want to skip a certain number of frames, which should be evenly distributed along the timeline. for example i have 10 frames and i want to skip 5, then the solution is easy: we skip every second frame. 10/5 = 2 if (frame%2 == 0) skip(); but what if the above division does result in a floating number? for example in 44 frames i want to skip 15 times. how can i determine the 15 frames which should be skipped? thanks!

    Read the article

  • finding the maximum in series

    - by peril brain
    I need to know a code that will automatically:- search a specific word in excel notes it row or column number (depends on data arrangement) searches numerical type values in the respective row or column with that numeric value(suppose a[7][0]or a[0][7]) it compares all other values of respective row or column(ie. a[i][0] or a[0][i]) sets that value to the highest value only if IT HAS GOT NO FORMULA FOR DERIVATION i know most of coding but at a few places i got myself stuck... i'm writing a part of my program upto which i know: using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.IO; using System.Threading; using Microsoft.Office.Interop; using Excel = Microsoft.Office.Interop.Excel; Excel.Application oExcelApp; namespace a{ class b{ static void main(){ try { oExcelApp = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application"); ; if(oExcelApp.ActiveWorkbook != null) {Excel.Workbook xlwkbook = (Excel.Workbook)oExcelApp.ActiveWorkbook; Excel.Worksheet ws = (Excel.Worksheet)xlwkbook.ActiveSheet; Excel.Range rn; rn = ws.Cells.Find("maximum", Type.Missing, Excel.XlFindLookIn.xlValues, Excel.XlLookAt.xlPart,Excel.XlSearchOrder.xlByRows, Excel.XlSearchDirection.xlNext, false, Type.Missing, Type.Missing); }}} now ahead of this i only know tat i have to use cell.value2 ,cell.hasformula methods..... & no more idea can any one help me with this..

    Read the article

  • Why is Dictionary.First() so slow?

    - by Rotsor
    Not a real question because I already found out the answer, but still interesting thing. I always thought that hash table is the fastest associative container if you hash properly. However, the following code is terribly slow. It executes only about 1 million iterations and takes more than 2 minutes of time on a Core 2 CPU. The code does the following: it maintains the collection todo of items it needs to process. At each iteration it takes an item from this collection (doesn't matter which item), deletes it, processes it if it wasn't processed (possibly adding more items to process), and repeats this until there are no items to process. The culprit seems to be the Dictionary.Keys.First() operation. The question is why is it slow? Stopwatch watch = new Stopwatch(); watch.Start(); HashSet<int> processed = new HashSet<int>(); Dictionary<int, int> todo = new Dictionary<int, int>(); todo.Add(1, 1); int iterations = 0; int limit = 500000; while (todo.Count > 0) { iterations++; var key = todo.Keys.First(); var value = todo[key]; todo.Remove(key); if (!processed.Contains(key)) { processed.Add(key); // process item here if (key < limit) { todo[key + 13] = value + 1; todo[key + 7] = value + 1; } // doesn't matter much how } } Console.WriteLine("Iterations: {0}; Time: {1}.", iterations, watch.Elapsed); This results in: Iterations: 923007; Time: 00:02:09.8414388. Simply changing Dictionary to SortedDictionary yields: Iterations: 499976; Time: 00:00:00.4451514. 300 times faster while having only 2 times less iterations. The same happens in java. Used HashMap instead of Dictionary and keySet().iterator().next() instead of Keys.First().

    Read the article

  • A question about matrix manipulation

    - by appi
    Given a 1*N matrix or an array, how do I find the first 4 elements which have the same value and then store the index for those elements? PS: I'm just curious. What if we want to find the first 4 elements whose value differences are within a certain range, say below 2? For example, M=[10,15,14.5,9,15.1,8.5,15.5,9.5], the elements I'm looking for will be 15,14.5,15.1,15.5 and the indices will be 2,3,5,7.

    Read the article

  • List circular group membership from active directory

    - by KAPes
    We have 40K+ groups in our active directory and we are increasingly facing problem of circular nested groups which are creating problems for some applications. Does anyone know how to list down the full route through which a circular group membership exists ? e.g. G1 --> G2 --> G3 --> G4 --> G1 How do I list it down.

    Read the article

  • Algorithms for finding the intersections of intervals

    - by tomwu
    I am wondering how I can find the number of intervals that intersect with the ones before it. for the intervals [2, 4], [1, 6], [5, 6], [0, 4], the output should be 2. from [2,4] [5,6] and [5,6] [0,4]. So now we have 1 set of intervals with size n all containing a point a, then we add another set of intervals size n as well, and all of the intervals are to the right of a. Can you do this in O(nlgn) and O(nlg^2n)?

    Read the article

< Previous Page | 328 329 330 331 332 333 334 335 336 337 338 339  | Next Page >