fuzzy lollipop - Page 2

Lucene.NET (strings fuzzy matching)

- by dark-elf2

Good day. The question is: could anyone give me an example of how to do fuzzy matching of two strings using Lucene.NET (or the Java version of Lucene, or any other language that has a port of Lucene)?

Read the article

something like gimp "fuzzy select" in python/PIL

- by Adam

I have an image with some objects on a non-solid background. I want to extract these objects as in GIMP, using "fuzzy select". This can be an example: http://img249.imageshack.us/gal.php?g=25750902.png The question is: what is the best way to do it using Python/PIL...
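A "fuzzy select" is essentially a flood fill with a tolerance. The sketch below assumes the image has already been reduced to a grid of grayscale values (a nested list here); the function name `fuzzy_select` and the 4-neighbour connectivity are illustrative choices, not anything taken from GIMP or PIL.

```python
from collections import deque

def fuzzy_select(pixels, seed, tolerance):
    """Return the set of (row, col) cells connected to `seed` whose value
    differs from the seed value by at most `tolerance`."""
    rows, cols = len(pixels), len(pixels[0])
    seed_value = pixels[seed[0]][seed[1]]
    selected = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current cell.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in selected
                    and abs(pixels[nr][nc] - seed_value) <= tolerance):
                selected.add((nr, nc))
                queue.append((nr, nc))
    return selected

image = [
    [10, 12, 200],
    [11, 13, 210],
    [90, 95, 205],
]
background = fuzzy_select(image, (0, 0), tolerance=5)
```

Everything outside the returned set can then be treated as "object". For colour images the absolute difference would have to be replaced by some colour distance, which is a separate design decision.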

Read the article

Query for an exact string in Zend Lucene PHP, not a fuzzy match

- by Podlsk

I'm using Zend Lucene, but I'm having a little trouble. I wish to query the index for the exact string, so page_name IS test123, not any fuzzy match. Currently I have: $hits = $index->find('page_name:"test123"'); Any advice appreciated, thanks!

Read the article

Using FontSquirrel @Font-Face Generator: font quite fuzzy in Firefox

- by Conando

Mac Firefox (3.6.3). Font looks sharp in Chrome, Safari, IE8 (not as good as the other 2, but less fuzzy than Firefox). Any workarounds? How can I determine which of the source files Firefox is using? Can I force it in the CSS to choose something else?

Read the article

Fuzzy Date algorithm in Objective-C

- by Brock Woolf

I would like to write a fuzzy date method for calculating dates in Objective-C for iPhone. There is a popular explanation here: http://stackoverflow.com/questions/11/how-do-i-calculate-relative-time However, it contains missing arguments. How could this be used in Objective-C? Thanks.

    const int SECOND = 1;
    const int MINUTE = 60 * SECOND;
    const int HOUR = 60 * MINUTE;
    const int DAY = 24 * HOUR;
    const int MONTH = 30 * DAY;

    if (delta < 1 * MINUTE) { return ts.Seconds == 1 ? "one second ago" : ts.Seconds + " seconds ago"; }
    if (delta < 2 * MINUTE) { return "a minute ago"; }
    if (delta < 45 * MINUTE) { return ts.Minutes + " minutes ago"; }
    if (delta < 90 * MINUTE) { return "an hour ago"; }
    if (delta < 24 * HOUR) { return ts.Hours + " hours ago"; }
    if (delta < 48 * HOUR) { return "yesterday"; }
    if (delta < 30 * DAY) { return ts.Days + " days ago"; }
    if (delta < 12 * MONTH) {
        int months = Convert.ToInt32(Math.Floor((double)ts.Days / 30));
        return months <= 1 ? "one month ago" : months + " months ago";
    } else {
        int years = Convert.ToInt32(Math.Floor((double)ts.Days / 365));
        return years <= 1 ? "one year ago" : years + " years ago";
    }
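The snippet above leaves `delta` and `ts` undefined, which is presumably what "missing arguments" refers to. A minimal self-contained sketch of the same thresholds, written in Python and assuming `delta` is the elapsed time in whole seconds (the function name `fuzzy_date` is made up for illustration):

```python
SECOND = 1
MINUTE = 60 * SECOND
HOUR = 60 * MINUTE
DAY = 24 * HOUR
MONTH = 30 * DAY

def fuzzy_date(delta):
    """Describe an elapsed time of `delta` seconds (a non-negative int)."""
    if delta < 1 * MINUTE:
        return "one second ago" if delta == 1 else f"{delta} seconds ago"
    if delta < 2 * MINUTE:
        return "a minute ago"
    if delta < 45 * MINUTE:
        return f"{delta // MINUTE} minutes ago"
    if delta < 90 * MINUTE:
        return "an hour ago"
    if delta < 24 * HOUR:
        return f"{delta // HOUR} hours ago"
    if delta < 48 * HOUR:
        return "yesterday"
    if delta < 30 * DAY:
        return f"{delta // DAY} days ago"
    if delta < 12 * MONTH:
        months = delta // MONTH
        return "one month ago" if months <= 1 else f"{months} months ago"
    years = delta // (365 * DAY)
    return "one year ago" if years <= 1 else f"{years} years ago"
```

The same structure carries over to Objective-C once the elapsed seconds are available as a single number; only the string formatting changes.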

Read the article

SDL_image/C++ OpenGL Program: IMG_Load() produces fuzzy images

- by Kami

I'm trying to load an image file and use it as a texture for a cube. I'm using SDL_image to do that. I used this image because I've found it in various file formats (tga, tif, jpg, png, bmp). The code:

    SDL_Surface * texture;
    //load an image to an SDL surface (i.e. a buffer)
    texture = IMG_Load("/Users/Foo/Code/xcode/test/lena.bmp");
    if(texture == NULL){
        printf("bad image\n");
        exit(1);
    }
    //create an OpenGL texture object
    glGenTextures(1, &textureObjOpenGLlogo);
    //select the texture object you need
    glBindTexture(GL_TEXTURE_2D, textureObjOpenGLlogo);
    //define the parameters of that texture object
    //how the texture should wrap in s direction
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
    //how the texture should wrap in t direction
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
    //how the texture lookup should be interpolated when the face is smaller than the texture
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    //how the texture lookup should be interpolated when the face is bigger than the texture
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    //send the texture image to the graphic card
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texture->w, texture->h, 0, GL_RGB, GL_UNSIGNED_BYTE, texture->pixels);
    //clean the SDL surface
    SDL_FreeSurface(texture);

The code compiles without errors or warnings! I've tried all the file formats but this always produces the same ugly result. I'm using SDL_image 1.2.9 & SDL 1.2.14 with XCode 3.2 under 10.6.2. Does anyone know how to fix this?

Read the article

Fuzzy Regex, Text Processing, Lexical Analysis?

- by justinzane

I'm not quite sure what terminology to search for, so my title is funky... Here is the workflow I've got: Semi-structured documents are scanned to file. The files are OCR'd to text. The text is parsed into Python objects. The objects are serialized (to SQL, JSON, whatever) for use. The documents are structured like this: HEADER blah blah, Page ### blah Garbage text... 1. Question Text... continued until now. A. Choice text... adsadsf. B. Another Choice... 2. Another Question... I need to extract the questions and choices. The problem is that, because the text is OCR output, there are occasional strange substitutions like '2' -> 'Z', which make ordinary regular expressions useless. I've tried the Levenshtein module and it helps, but it requires prior knowledge of what edit distance is to be expected. I don't know whether I'm looking to create a parser? a lexer? something else? This has led me down all kinds of interesting but irrelevant paths. Guidance would be greatly appreciated. Oh, also, the text is generally from specific technical domains, so general spelling tools are not so helpful. Regarding the structure of the documents, there is no clear visual pattern -- like line breaks or indentation -- with the exception of the fact that "questions" usually begin a line. Crap on the document can cause characters to appear before the actual beginning of the line, which means that something along the lines of r'^[0-9]+' does not reliably work. Though the "questions" always begin with an int, a period and a space, the OCR can substitute other characters or skip characters. This is not so much a problem with Tesseract or CuneiForm, rather with the poor quality of the paper documents. # Note: for the project in question, it was decided that having a human prep the OCR'd text was better than spending the time coding a solution. I'd still love good pointers, however.
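One low-tech option is to keep using regular expressions but widen them to tolerate known OCR confusions and a little leading junk. The sketch below is only an illustration: the lookalike table, the limit of three junk characters, and the name `parse_question` are all assumptions, and "B" is deliberately left out of the digit lookalikes so that choice lines such as "B. Another Choice" are not mistaken for question 8.

```python
import re

# Characters that OCR may confuse with digits (illustrative and incomplete).
LOOKALIKES = str.maketrans({"Z": "2", "O": "0", "o": "0", "I": "1", "l": "1", "S": "5"})

# Up to 3 junk characters, then 1-3 digit-like characters, an optional space,
# a period or comma, whitespace, and finally the question text.
QUESTION_RE = re.compile(r"^.{0,3}?([0-9ZOoIlS]{1,3})\s?[.,]\s+(\S.*)$")

def parse_question(line):
    """Return (question_number, text) or None if the line does not look like a question."""
    match = QUESTION_RE.match(line)
    if match is None:
        return None
    number = int(match.group(1).translate(LOOKALIKES))
    return number, match.group(2)
```

A pattern this loose will produce false positives (a sentence starting with "So." would parse as question 50), so in practice the parsed numbers would still need a sanity check, for example that they increase by one from question to question.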

Read the article

fuzzy DISTINCT Values

- by user982853

I have a database of real estate listings and need to return a list of neighborhoods. Right now I am using MySQL DISTINCT, which returns all of the distinct values. My problem is that there are a lot of neighborhoods that have similar names, for example:

Park View Sub 1
Park View
Park View Sub 2
Park View Sub 3
Great Lake Sub 1
Great Lake Sub 2
Great Lake
Great Lake Sub 3

I am looking for an easy PHP or MySQL solution that would recognize that "Park View" and "Great Lake" already exist and ONLY return "Park View" and "Great Lake". My initial thought is to somehow sort by length so that the short values are at the top, and then loop through using strstr. This sounds like a large task, so I am wondering if there is a function in either MySQL or PHP that would easily do this.
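The "sort by length, then check prefixes" idea described in the question can be sketched in a few lines. The version below is in Python rather than PHP and the name `collapse_names` is invented for illustration; it assumes a variant always extends its base name by a space and more text.

```python
def collapse_names(names):
    """Keep only names that do not extend a shorter name already kept."""
    kept = []
    # Shortest names first, so a base name is seen before its variants.
    for name in sorted(set(names), key=lambda n: (len(n), n)):
        if not any(name == base or name.startswith(base + " ") for base in kept):
            kept.append(name)
    return sorted(kept)

neighborhoods = [
    "Park View Sub 1", "Park View", "Park View Sub 2", "Park View Sub 3",
    "Great Lake Sub 1", "Great Lake Sub 2", "Great Lake", "Great Lake Sub 3",
]
result = collapse_names(neighborhoods)
```

Note the limitation: if a neighborhood appears only as "X Sub 1" and "X Sub 2" with no bare "X" row, both variants survive, because there is no shorter name to collapse them into.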

Read the article

algorithm q: Fuzzy matching of structured data

- by user86432

I have a fairly small corpus of structured records sitting in a database. Given a tiny fraction of the information contained in a single record, submitted via a web form (so structured in the same way as the table schema), (let us call it the test record) I need to quickly draw up a list of the records that are the most likely matches for the test record, as well as provide a confidence estimate of how closely the search terms match a record. The primary purpose of this search is to discover whether someone is attempting to input a record that is duplicate to one in the corpus. There is a reasonable chance that the test record will be a dupe, and a reasonable chance the test record will not be a dupe. The records are about 12000 bytes wide and the total count of records is about 150,000. There are 110 columns in the table schema and 95% of searches will be on the top 5% most commonly searched columns. The data is stuff like names, addresses, telephone numbers, and other industry specific numbers. In both the corpus and the test record it is entered by hand and is semistructured within an individual field. You might at first blush say "weight the columns by hand and match word tokens within them", but it's not so easy. I thought so too: if I get a telephone number I thought that would indicate a perfect match. The problem is that there isn't a single field in the form whose token frequency does not vary by orders of magnitude. A telephone number might appear 100 times in the corpus or 1 time in the corpus. The same goes for any other field. This makes weighting at the field level impractical. I need a more fine-grained approach to get decent matching. My initial plan was to create a hash of hashes, top level being the fieldname. 
Then I would select all of the information from the corpus for a given field, attempt to clean up the data contained in it, and tokenize the sanitized data, hashing the tokens at the second level, with the tokens as keys and frequency as value. I would use the frequency count as a weight: the higher the frequency of a token in the reference corpus, the less weight I attach to that token if it is found in the test record. My first question is for the statisticians in the room: how would I use the frequency as a weight? Is there a precise mathematical relationship between n, the number of records, f(t), the frequency with which a token t appeared in the corpus, the probability o that a record is an original and not a duplicate, and the probability p that the test record is really a record x given the test and x contain the same t in the same field? How about the relationship for multiple token matches across multiple fields? Since I sincerely doubt that there is, is there anything that gets me close but is better than a completely arbitrary hack full of magic factors? Barring that, has anyone got a way to do this? I'm especially keen on other suggestions that do not involve maintaining another table in the database, such as a token frequency lookup table :). This is my first post on StackOverflow, thanks in advance for any replies you may see fit to give.
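One common heuristic for turning token frequency into a weight is inverse document frequency from information retrieval, log(N / f(t)). It is not the exact probabilistic relationship the question asks about, only a widely used approximation. A minimal sketch for a single field, with made-up function names and whitespace tokenization as simplifying assumptions:

```python
import math
from collections import Counter

def token_weights(field_values):
    """Map each token to log(N / f(t)), where f(t) counts records containing t."""
    n = len(field_values)
    doc_freq = Counter()
    for value in field_values:
        doc_freq.update(set(value.lower().split()))
    return {tok: math.log(n / f) for tok, f in doc_freq.items()}

def match_score(test_value, record_value, weights):
    """Sum the weights of the tokens shared by the test value and a record value."""
    shared = set(test_value.lower().split()) & set(record_value.lower().split())
    return sum(weights.get(tok, 0.0) for tok in shared)

phones = ["555-0100", "555-0100", "555-0100", "555-0199"]
weights = token_weights(phones)
```

A phone number that appears once gets weight log(4), while one that appears three times gets log(4/3), so a match on the rare value counts for more. Scores from several fields could be summed, which still leaves open how to turn the total into a calibrated confidence.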

Read the article

Resolving Assemblies, the fuzzy way

- by David Rutten

Here's the setup: A pure DotNET class library is loaded by an unmanaged desktop application. The Class Library acts as a plugin. This plugin loads little baby plugins of its own (all DotNET Class Libraries), and it does so by reading the dll into memory as a byte-stream, then Assembly asm = Assembly.Load(COFF_Image); The problem arises when those little baby plugins have references to other dlls. Since they are loaded via the memory rather than directly from the disk, the framework often cannot find these referenced assemblies and is thus incapable of loading them. I can add an AssemblyResolve handler to my project and I can see these referenced assemblies drop past. I have a reasonably good idea about where to find these referenced assemblies on the disk, but how can I make sure that the Assembly I load is the correct one? In short, how do I reliably go from the System.ResolveEventArgs.Name field to a dll file path, presuming I have a list of all the folders where this dll could be hiding?

Read the article

One page of responsive site is blurry/fuzzy on iphone

- by Gwendydd

Here's a weird one. I'm developing a responsive site here: http://74.209.178.54:3000/index.html There are three pages built so far: the home page, the "Why" page, and the "Pricing" page. The Home and Why pages are just fine on my iPhone 4. The "Pricing" page is really blurry. And I don't just mean the images are blurry - absolutely everything is blurry: text, borders, backgrounds... Has anyone seen this before? Do you know what's happening?

Read the article

Search Lucene with precise edit distances

- by askullhead

I would like to search a Lucene index with edit distances. For example, say, there is a document with a field FIRST_NAME; I want all documents with first names that are 1 edit distance away from, say, 'john'. I know that Lucene supports fuzzy searches (FIRST_NAME:john~) and takes a number between 0 and 1 to control the fuzziness. The problem (for me) is this number does not directly translate to an edit distance. And when the values in the documents are short strings (less than 3 characters) the fuzzy search has difficulty finding them. For example if there is a document with FIRST_NAME 'J' and I search for FIRST_NAME:I~0.0 I don't get anything back.
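If the index can return a loose candidate set, an exact edit-distance check can be applied afterwards on the client side. The sketch below says nothing about Lucene's own API; it is a plain Python check, with an invented name, that two strings are at most one insertion, deletion, or substitution apart.

```python
def within_one_edit(a, b):
    """True if a can be turned into b with at most one insertion, deletion, or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make sure a is the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substitution at position i
    return a[i:] == b[i + 1:]  # b has exactly one extra character at position i

names = ["john", "jon", "johan", "jane", "joan"]
close = [n for n in names if within_one_edit("john", n)]
```

Because the check is linear in the string length, it is cheap enough to run over every candidate, and it behaves the same for one-character values such as 'J' versus 'I'.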

Read the article

How to quickly find file in the workspace/switch between buffers/etc. in Eclipse?

- by Alexey Romanov

I am looking for something like Textmate's fuzzy search on Command-T, FuzzyFinder in Vim, or Ido in Emacs. Does it exist? If not, how do you prefer to do it?

Read the article

Inverse Logistic Function / Reverse Sigmoid Function

- by Chanq

I am currently coding up a fuzzy logic library in Java. I have found the equations for all the standard functions - Grade, inverseGrade, Triangle, Trapezoid, Gaussian. However, I can't find the inverse of the sigmoid/logistic function. The way I have written the logistic function in Java is:

    // f(x) = 1/(1+e^(-x))
    public double logistic(double x) {
        return 1 / (1 + Math.exp(-x));
    }

But I can't work out or find the inverse anywhere. My algebraic/calculus abilities are fairly limited, which is why I haven't been able to work out the inverse of the function. Any hints or pointers would be a big help. Thanks
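The inverse follows from solving y = 1/(1 + e^(-x)) for x: 1/y - 1 = e^(-x), so x = ln(y / (1 - y)), which is usually called the logit. A short Python sketch of both directions (the function names are illustrative; the same two lines translate directly to Java with Math.exp and Math.log):

```python
import math

def logistic(x):
    """f(x) = 1 / (1 + e^(-x))"""
    return 1.0 / (1.0 + math.exp(-x))

def inverse_logistic(y):
    """Inverse of logistic for 0 < y < 1: x = ln(y / (1 - y))."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    return math.log(y / (1.0 - y))
```

The inverse is only defined for membership values strictly between 0 and 1, which is why the sketch rejects the endpoints instead of returning infinity.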

Read the article

What free expert system can you recommend (with higher functionality than CLIPS)?

- by Martin

Hi, I'm trying to find the best free expert system, with the highest functionality. I know about CLIPS, but is there another system that, for example, can accept a percentage of confidence for each rule (fuzzy logic)? I need to know whether I will be able to quickly do a short project using an expert system with the highest functionality. In any case, I'm also interested in whether there is an open source program that aims to gather different AI methods (of which there are plenty) and use them together. So I would be extremely thankful for any info about a more robust CLIPS, or similar programs. Thanks!

Read the article

Algorithm to detect repeating/similar strings in a corpus of data -- say email subjects, in Python

- by RizwanK

I'm downloading a long list of my email subject lines, with the intent of finding email lists that I was a member of years ago, and would want to purge them from my Gmail account (which is getting pretty slow.) I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the subject. I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines.... Now, many subject lines would fail a string match, but "Google Friends : Our latest news" and "Google Friends : What we're doing today" are more similar to each other than a random subject line, as are "Virgin Airlines has a great sale today" and "Take a flight with Virgin Airlines". So -- how can I start to automagically extract trends/examples of strings that may be more similar? Approaches I've considered and discarded ('because there must be some better way'):

- Extracting all the possible substrings and ordering them by how often they show up, and manually selecting relevant ones
- Stripping off the first word or two and then counting the occurrences of each substring
- Comparing Levenshtein distance between entries
- Some sort of string similarity index ...

Most of these were rejected for massive inefficiency or the likelihood of a vast amount of manual intervention being required. I guess I need some sort of fuzzy string matching..? In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic, so that I add to my set of tools rather than special-casing for this data set.

After this, I'd be matching the occurrence of particular subject strings with 'From' addresses - I'm not sure if there's a good way of building a data structure that represents how likely/not two messages are part of the 'same email list' or by filtering all my email subjects/from addresses into pools of likely 'related' emails and not -- but that's a problem to solve after this one. Any guidance would be appreciated.
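A cheaper variant of the "extract all substrings" idea is to count word n-grams instead of character substrings, counting each phrase at most once per subject. This is only a sketch under that assumption; the name `common_phrases` and the default of two-word phrases are arbitrary.

```python
from collections import Counter

def common_phrases(subjects, n=2, min_count=2):
    """Return word n-grams that occur in at least `min_count` different subjects."""
    counts = Counter()
    for subject in subjects:
        words = subject.lower().split()
        ngrams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(ngrams)  # a set, so each subject counts a phrase once
    return {phrase: c for phrase, c in counts.items() if c >= min_count}

subjects = [
    "Google Friends : Our latest news",
    "Google Friends : What we're doing today",
    "Virgin Airlines has a great sale today",
    "Take a flight with Virgin Airlines",
]
phrases = common_phrases(subjects)
```

On the four subjects from the question this surfaces "google friends" and "virgin airlines" (plus the punctuation artifact "friends :"), without any pairwise comparison. The work grows linearly with the number of subjects, which addresses the efficiency concern, though some manual review of the resulting phrases is still needed.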

Read the article

Simplifying a four-dimensional rule table in Matlab: addressing rows and columns of each dimension

- by Cate

Hi all. I'm currently trying to automatically generate a set of fuzzy rules for a set of observations which contain four values for each observation, where each observation will correspond to a state (a good example is with Fisher's Iris Data). In Matlab I am creating a four dimensional rule table where a single cell (a,b,c,d) will contain the corresponding state. To reduce the table I am following the Hong and Lee method of row and column similarity checking but I am having difficulty understanding how to address the third and fourth dimensions' rows and columns. From the method it is my understanding that each dimension is addressed individually and if the rule is true, the table is simplified. The rules for merging are as follows: If all cells in adjacent columns or rows are the same. If two cells are the same or if either of them is empty in adjacent columns or rows and at least one cell in both is not empty. If all cells in a column or row are empty and if cells in its two adjacent columns or rows are the same, merge the three. If all cells in a column or row are empty and if cells in its two adjacent columns or rows are the same or either of them is empty, merge the three. If all cells in a column or row are empty and if all the non-empty cells in the column or row to its left have the same region, and all the non-empty cells in the column or row to its right have the same region, but one different from the previously mentioned region, merge these three columns into two parts. Now for the confusing bit. Simply checking if the entire row/column is the same as the adjacent (rule 1) seems simple enough: if (a,:,:,:) == (a+1,:,:,:) (:,b,:,:) == (:,b+1,:,:) (:,:,c,:) == (:,:,c+1,:) (:,:,:,d) == (:,:,:,d+1) is this correct? but to check if the elements in the row/column match, or either is zero (rules 2 and 4), I am a bit lost. Would it be something along these lines: for a = 1:20 for i = 1:length(b) if (a+1,i,:,:) == (a,i,:,:) ... else if (a+1,i,:,:) == 0 ... 
else if (a,i,:,:) == 0 etc. and for the third and fourth dimensions: for c = 1:20 for i = 1:length(a) if (i,:,c,:) == (i,:,c+1,:) ... else if (i,:,c+1,:) == 0 ... else if (i,:,c,:) == 0 etc. for d = 1:20 for i = 1:length(a) if (i,:,:,d) == (i,:,:,d+1) ... else if (i,:,:,d+1) == 0 ... else if (i,:,:,d) == 0 etc. even any help with four dimensional arrays would be useful as I'm so confused by the thought of more than three! I would advise you look at the paper to understand my meaning - they themselves have used the Iris data but only given an example with a 2D table. Thanks in advance, hopefully!

Read the article

Java code optimization leads to numerical inaccuracies and errors

- by rano

I'm trying to implement a version of the Fuzzy C-Means algorithm in Java and I'm trying to do some optimization by computing just once everything that can be computed just once. This is an iterative algorithm and regarding the updating of a matrix, the clusters x pixels membership matrix U, this is the update rule I want to optimize: where the x are the elements of a matrix X (pixels x features) and v belongs to the matrix V (clusters x features). And m is a parameter that ranges from 1.1 to infinity. The distance used is the Euclidean norm. If I had to implement this formula in a banal way I'd do:

    for(int i = 0; i < X.length; i++) {
        int count = 0;
        for(int j = 0; j < V.length; j++) {
            double num = D[i][j];
            double sumTerms = 0;
            for(int k = 0; k < V.length; k++) {
                double thisDistance = D[i][k];
                sumTerms += Math.pow(num / thisDistance, (1.0 / (m - 1.0)));
            }
            U[i][j] = (float) (1f / sumTerms);
        }
    }

In this way some optimization is already done: I precomputed all the possible squared distances between X and V and stored them in a matrix D. But that is not enough, since I'm cycling through the elements of V two times, resulting in two nested loops. Looking at the formula, the numerator of the fraction is independent of the sum, so I can compute numerator and denominator independently and the denominator can be computed just once for each pixel.

So I came to a solution like this:

    int nClusters = V.length;
    double exp = (1.0 / (m - 1.0));
    for(int i = 0; i < X.length; i++) {
        int count = 0;
        for(int j = 0; j < nClusters; j++) {
            double distance = D[i][j];
            double denominator = D[i][nClusters];
            double numerator = Math.pow(distance, exp);
            U[i][j] = (float) (1f / (numerator * denominator));
        }
    }

Where I precomputed the denominator into an additional column of the matrix D while I was computing the distances:

    for (int i = 0; i < X.length; i++) {
        for (int j = 0; j < V.length; j++) {
            double sum = 0;
            for (int k = 0; k < nDims; k++) {
                final double d = X[i][k] - V[j][k];
                sum += d * d;
            }
            D[i][j] = sum;
            D[i][B.length] += Math.pow(1 / D[i][j], exp);
        }
    }

By doing so I encounter numerical differences between the 'banal' computation and the second one, which lead to different numerical values in U (not in the first iterations, but soon enough). I guess the problem is that raising very small numbers to high powers (the elements of U can range from 0.0 to 1.0 and exp, for m = 1.1, is 10) leads to very small values, whereas dividing the numerator by the denominator and THEN exponentiating the result seems to be better numerically. The problem is that it involves many more operations. Am I doing something wrong? Is there a possible solution to get the code both optimized and numerically stable? Any suggestion or criticism will be appreciated.
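One way to keep the single pass per pixel while avoiding extreme powers is to divide every distance by the smallest distance in the row before exponentiating, in the same spirit as the usual max-subtraction trick for softmax. The sketch below is in Python for brevity and is an assumption-laden illustration, not the poster's code: `distances` holds one pixel's squared distances to each cluster, and the function names are made up.

```python
def memberships(distances, m):
    """Fuzzy C-Means memberships for one pixel from its squared distances to each cluster."""
    exp = 1.0 / (m - 1.0)
    d_min = min(distances)
    if d_min == 0.0:
        # The pixel sits exactly on one or more centroids: share membership among them.
        zeros = [d == 0.0 for d in distances]
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    # d_min / d lies in (0, 1] and equals 1 for the nearest cluster, so the
    # total is at least 1 and cannot underflow to zero or overflow.
    terms = [(d_min / d) ** exp for d in distances]
    total = sum(terms)
    return [t / total for t in terms]

def memberships_naive(distances, m):
    """Direct double-loop form, used here only as a reference."""
    exp = 1.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exp for dk in distances) for dj in distances]
```

Algebraically both functions compute the same quantity, since multiplying numerator and denominator by d_min^exp changes nothing. The scaled form needs one power per cluster instead of one per cluster pair, and its intermediate values stay between 0 and 1.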

Read the article

Using MinHash to find similarities between 2 images

- by Sung Meister

I am using the MinHash algorithm to find similar images. I have run across this post, How can I recognize slightly modified images?, which pointed me to the MinHash algorithm. Being a bit mathematically challenged, I was using a C# implementation from this blog post, Set Similarity and Min Hash. But while trying to use the implementation, I have run into 2 problems. What value should I set the universe value to? When passing an image byte array to HashSet, it only contains distinct byte values, thus comparing values from 1 ~ 256. What is this universe in MinHash? And what can I do to improve the C# MinHash implementation? Since HashSet<byte> contains at most 256 values, the similarity value always comes out to 1. Here is the source that uses the C# MinHash implementation from Set Similarity and Min Hash:

    class Program
    {
        static void Main(string[] args)
        {
            var imageSet1 = GetImageByte(@".\Images\01.JPG");
            var imageSet2 = GetImageByte(@".\Images\02.TIF");
            //var app = new MinHash(256);
            var app = new MinHash(Math.Min(imageSet1.Count, imageSet2.Count));
            double imageSimilarity = app.Similarity(imageSet1, imageSet2);
            Console.WriteLine("similarity = {0}", imageSimilarity);
        }

        private static HashSet<byte> GetImageByte(string imagePath)
        {
            using (var fs = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
            using (var br = new BinaryReader(fs))
            {
                //List<int> bytes = br.ReadBytes((int)fs.Length).Cast<int>().ToList();
                var bytes = new List<byte>(br.ReadBytes((int) fs.Length).ToArray());
                return new HashSet<byte>(bytes);
            }
        }
    }
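In MinHash the "universe" is the set of all values an element could take. With single bytes that is only 256 values, so two large files almost always contain the same set. A common remedy is to hash overlapping k-byte windows (shingles) instead. The Python sketch below is not the blog's C# implementation; the shingle size of 4, the hash family (a*x + b) mod p, and all names are illustrative assumptions.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime, larger than any 4-byte shingle value

def shingles(data, k=4):
    """Set of integers, one per overlapping k-byte window of `data`."""
    return {int.from_bytes(data[i:i + k], "big") for i in range(len(data) - k + 1)}

def make_hashes(count, seed=0):
    """`count` random (a, b) pairs defining the hash functions (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]

def signature(items, hashes):
    """MinHash signature: the minimum hash value of the set under each hash function."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]

def similarity(sig1, sig2):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(s == t for s, t in zip(sig1, sig2)) / len(sig1)

hashes = make_hashes(64)
sig_a = signature(shingles(bytes(range(64))), hashes)
sig_b = signature(shingles(bytes(range(128, 192))), hashes)
```

Raw bytes of compressed formats such as JPG are unlikely to share shingles even for visually similar images, so in practice the sets would probably have to be built from decoded pixel data or extracted features rather than from the file contents.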

Read the article

Is this a variation of the traveling salesman problem?

- by Ville Koskinen

I'm interested in a function of two word lists, which would return an order agnostic edit distance between them. That is, the arguments would be two lists of (let's say space delimited) words and return value would be the minimum sum of the edit (or Levenshtein) distances of the words in the lists. Distance between "cat rat bat" and "rat bat cat" would be 0. Distance between "cat rat bat" and "fat had bad" would be the same as distance between "rat bat cat" and "had fat bad", 4. In the case the number of words in the lists are not the same, the shorter list would be padded with 0-length words. My intuition (which hasn't been nurtured with computer science classes) does not find any other solution than to use brute force:

       |had|fat|bad|      a solution
    ---+---+---+---+    +---+---+---+
    cat| 2 | 1 | 2 |    |   | 1 |   |
    ---+---+---+---+    +---+---+---+
    rat| 2 | 1 | 2 |    | 3 |   |   |
    ---+---+---+---+    +---+---+---+
    bat| 2 | 1 | 1 |    |   |   | 4 |
    ---+---+---+---+    +---+---+---+

Starting from the first row, pick a column and go to the next rows without ever revisiting a column you have already visited. Do this over and over again until you've tried all combinations. To me this sounds a bit like the traveling salesman problem. Is it, and how would you solve my particular problem?
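Picking one column per row at minimum total cost is the assignment problem (minimum-cost bipartite matching) rather than the traveling salesman problem, and it can be solved in polynomial time, for example with the Hungarian algorithm. For short word lists the brute force described above is already workable; the sketch below implements exactly that, with function names chosen for illustration.

```python
from itertools import permutations

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def word_list_distance(s1, s2):
    """Minimum total edit distance over all pairings of the words in s1 and s2."""
    w1, w2 = s1.split(), s2.split()
    size = max(len(w1), len(w2))
    w1 += [""] * (size - len(w1))  # pad the shorter list with empty words
    w2 += [""] * (size - len(w2))
    cost = [[levenshtein(a, b) for b in w2] for a in w1]
    # Try every assignment of rows to columns (factorial time, fine for short lists).
    return min(sum(cost[i][p[i]] for i in range(size)) for p in permutations(range(size)))
```

The permutation loop is the only part that would need replacing for longer lists; the cost matrix can be fed unchanged to any assignment-problem solver.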

Read the article

How to calculate this string-dissimilarity function efficiently?

- by ybungalobill

Hello, I was looking for a string metric that has the property that moving around large blocks in a string won't affect the distance so much. So "helloworld" is close to "worldhello". Obviously Levenshtein distance and Longest common subsequence don't fulfill this requirement. Using Jaccard distance on the set of n-grams gives good results but has other drawbacks (it's a pseudometric, and higher n results in a higher penalty for changing a single character). [original research] As I thought about it, what I'm looking for is a function f(A,B) such that f(A,B)+1 equals the minimum number of blocks that one has to divide A into (A1 ... An), apply a permutation on the blocks and get B:

    f("hello", "hello") = 0
    f("helloworld", "worldhello") = 1 // hello world -> world hello
    f("abba", "baba") = 2 // ab b a -> b ab a
    f("computer", "copmuter") = 3 // co m p uter -> co p m uter

This can be extended for A and B that aren't necessarily permutations of each other: any additional character that can't be matched is considered as one additional block.

    f("computer", "combuter") = 3 // com uter -> com uter, unmatched: p and b.

Observing that instead of counting blocks we can count the number of pairs of indices that are taken apart by a permutation, we can write f(A,B) formally as:

    f(A,B) = min { C(P) | P:|A|→|B|, P is bijective, ∀i∈dom(P) A[P(i)]=B[P(i)] }
    C(P) = |A| + |B| - |dom(P)| - |{ i | i,i+1∈dom(P) and P(i)+1=P(i+1) }| - 1

The problem is... guess what... that I'm not able to calculate this in polynomial time. Can someone suggest a way to do this efficiently? Or perhaps point me to an already known metric that exhibits similar properties?
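For comparison, the n-gram Jaccard distance mentioned in the question is easy to compute and shows both the desired behaviour and the drawback. This sketch covers only that baseline, not the block-permutation function f; the names and the default n = 2 are illustrative.

```python
def ngrams(s, n=2):
    """Set of all length-n substrings of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def jaccard_distance(a, b, n=2):
    """1 - |A ∩ B| / |A ∪ B| over the n-gram sets of a and b."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 0.0  # both strings are shorter than n
    return 1.0 - len(ga & gb) / len(ga | gb)
```

With bigrams, "helloworld" and "worldhello" share 8 of 10 distinct bigrams, giving a distance of 0.2, while "computer" and "copmuter" share only 4 of 10, giving 0.6. A single swapped pair of letters is therefore penalized more than moving half the string, which illustrates the drawback described above.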

Read the article

Anything wrong with this function for comparing floats?

- by Michael Borgwardt

When my Floating-Point Guide was yesterday published on slashdot, I got a lot of flak for my suggested comparison function, which was indeed inadequate. So I finally did the sensible thing and wrote a test suite to see whether I could get them all to pass. Here is my result so far. And I wonder if this is really as good as one can get with a generic (i.e. not application specific) float comparison function, or whether I still missed some edge cases. import static org.junit.Assert.assertFalse; import static org.junit.Assert.assertTrue; import org.junit.Test; public class NearlyEqualsTest { public static boolean nearlyEqual(float a, float b) { final float epsilon = 0.000001f; final float absA = Math.abs(a); final float absB = Math.abs(b); final float diff = Math.abs(a-b); if (a*b==0) { // a or b or both are zero // relative error is not meaningful here return diff < Float.MIN_VALUE / epsilon; } else { // use relative error return diff / (absA+absB) < epsilon; } } /** Regular large numbers - generally not problematic */ @Test public void big() { assertTrue(nearlyEqual(1000000f, 1000001f)); assertTrue(nearlyEqual(1000001f, 1000000f)); assertFalse(nearlyEqual(10000f, 10001f)); assertFalse(nearlyEqual(10001f, 10000f)); } /** Negative large numbers */ @Test public void bigNeg() { assertTrue(nearlyEqual(-1000000f, -1000001f)); assertTrue(nearlyEqual(-1000001f, -1000000f)); assertFalse(nearlyEqual(-10000f, -10001f)); assertFalse(nearlyEqual(-10001f, -10000f)); } /** Numbers around 1 */ @Test public void mid() { assertTrue(nearlyEqual(1.0000001f, 1.0000002f)); assertTrue(nearlyEqual(1.0000002f, 1.0000001f)); assertFalse(nearlyEqual(1.0002f, 1.0001f)); assertFalse(nearlyEqual(1.0001f, 1.0002f)); } /** Numbers around -1 */ @Test public void midNeg() { assertTrue(nearlyEqual(-1.000001f, -1.000002f)); assertTrue(nearlyEqual(-1.000002f, -1.000001f)); assertFalse(nearlyEqual(-1.0001f, -1.0002f)); assertFalse(nearlyEqual(-1.0002f, -1.0001f)); } /** Numbers between 1 and 0 */ @Test 
public void small() { assertTrue(nearlyEqual(0.000000001000001f, 0.000000001000002f)); assertTrue(nearlyEqual(0.000000001000002f, 0.000000001000001f)); assertFalse(nearlyEqual(0.000000000001002f, 0.000000000001001f)); assertFalse(nearlyEqual(0.000000000001001f, 0.000000000001002f)); } /** Numbers between -1 and 0 */ @Test public void smallNeg() { assertTrue(nearlyEqual(-0.000000001000001f, -0.000000001000002f)); assertTrue(nearlyEqual(-0.000000001000002f, -0.000000001000001f)); assertFalse(nearlyEqual(-0.000000000001002f, -0.000000000001001f)); assertFalse(nearlyEqual(-0.000000000001001f, -0.000000000001002f)); } /** Comparisons involving zero */ @Test public void zero() { assertTrue(nearlyEqual(0.0f, 0.0f)); assertFalse(nearlyEqual(0.00000001f, 0.0f)); assertFalse(nearlyEqual(0.0f, 0.00000001f)); } /** Comparisons of numbers on opposite sides of 0 */ @Test public void opposite() { assertFalse(nearlyEqual(1.000000001f, -1.0f)); assertFalse(nearlyEqual(-1.0f, 1.000000001f)); assertFalse(nearlyEqual(-1.000000001f, 1.0f)); assertFalse(nearlyEqual(1.0f, -1.000000001f)); assertTrue(nearlyEqual(10000f*Float.MIN_VALUE, -10000f*Float.MIN_VALUE)); } /** * The really tricky part - comparisons of numbers * very close to zero. */ @Test public void ulp() { assertTrue(nearlyEqual(Float.MIN_VALUE, -Float.MIN_VALUE)); assertTrue(nearlyEqual(-Float.MIN_VALUE, Float.MIN_VALUE)); assertTrue(nearlyEqual(Float.MIN_VALUE, 0)); assertTrue(nearlyEqual(0, Float.MIN_VALUE)); assertTrue(nearlyEqual(-Float.MIN_VALUE, 0)); assertTrue(nearlyEqual(0, -Float.MIN_VALUE)); assertFalse(nearlyEqual(0.000000001f, -Float.MIN_VALUE)); assertFalse(nearlyEqual(0.000000001f, Float.MIN_VALUE)); assertFalse(nearlyEqual(Float.MIN_VALUE, 0.000000001f)); assertFalse(nearlyEqual(-Float.MIN_VALUE, 0.000000001f)); assertFalse(nearlyEqual(1e20f*Float.MIN_VALUE, 0.0f)); assertFalse(nearlyEqual(0.0f, 1e20f*Float.MIN_VALUE)); assertFalse(nearlyEqual(1e20f*Float.MIN_VALUE, -1e20f*Float.MIN_VALUE)); } }
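Two edge cases stand out in the function above: `a*b==0` is also true when the product of two tiny non-zero values underflows, and infinities are not handled because `a-b` becomes NaN. The sketch below shows one way to address both, written in Python (whose floats are doubles, so `sys.float_info.min` plays the role of the smallest normal value). The default epsilon and the overall structure are assumptions for illustration, not a definitive answer.

```python
import sys

def nearly_equal(a, b, epsilon=1e-9):
    """Relative comparison with special handling for zero, subnormals, and infinities."""
    abs_a, abs_b, diff = abs(a), abs(b), abs(a - b)
    if a == b:
        # Shortcut: exact matches, including equal infinities.
        return True
    if a == 0 or b == 0 or abs_a + abs_b < sys.float_info.min:
        # Relative error is meaningless this close to zero; use a scaled absolute bound.
        return diff < epsilon * sys.float_info.min
    # Clamp the denominator so that abs_a + abs_b overflowing does not distort the ratio.
    return diff / min(abs_a + abs_b, sys.float_info.max) < epsilon
```

Python also ships `math.isclose`, which takes explicit relative and absolute tolerances; whether a single generic function can serve every application is exactly the open question here.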

Read the article

Splitting a set of object into several subsets of 'similar' objects

- by doublep

Suppose I have a set of objects, S. There is an algorithm f that, given a set S, builds a certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subsets: S = S1 + S2 + ... + Sn and build Di for each subset. Using n structures is less efficient than using one, but at least this way I can fit into memory constraints. Since the size of f(S) grows faster than S itself, the combined size of the Di is much less than the size of D. However, it is still desirable to reduce n, i.e. the number of subsets, or to reduce the combined size of the Di. For this, I need to split S in such a way that each Si contains "similar" objects, because then f will produce a smaller output structure if input objects are "similar enough" to each other. The problem is that while "similarity" of objects in S and size of f(S) do correlate, there is no way to compute the latter other than just evaluating f(S), and f is not quite fast. The algorithm I have currently is to iteratively add each next object from S into one of the Si, so that this results in the least possible (at this stage) increase in combined Di size:

    for x in S:
        i = such i that size(f(Si + {x})) - size(f(Si)) is min
        Si = Si + {x}

This gives practically useful results, but it is certainly pretty far from the optimum (i.e. the minimal possible combined size). Also, this is slow. To speed it up somewhat, I compute size(f(Si + {x})) - size(f(Si)) only for those i where x is "similar enough" to objects already in Si. Is there any standard approach to such kinds of problems? I know of the branch and bound algorithm family, but it cannot be applied here because it would be prohibitively slow. My guess is that it is simply not possible to compute the optimal distribution of S into Si in reasonable time. But is there some common iteratively improving algorithm?

Read the article

"Did you mean" feature on a dictionary database

- by Hazar

I have a ~300.000 row table of technical terms, queried using PHP and MySQL + FULLTEXT indexes. But when I search for a mistyped term, for example "hyperpext", I naturally get no results. I need to compensate for small typing errors and get the nearest record from the database. How can I accomplish such a feature? I know (actually, learned today) about the Levenshtein distance, Soundex and Metaphone algorithms, but I currently have no solid idea of how to apply them to querying the database. Best regards. (Sorry about my poor English, I'm trying to do my best)
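As a rough illustration of the "nearest term" step, Python's standard library has `difflib.get_close_matches`, which ranks candidates by a similarity ratio. The sketch assumes the candidate terms are already in memory (for instance a shortlist fetched from the database); the wrapper name and the sample vocabulary are made up.

```python
import difflib

def did_you_mean(term, vocabulary, cutoff=0.6):
    """Return the closest vocabulary entry to `term`, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

vocabulary = ["hypertext", "hyperlink", "hypervisor", "metaphone"]
suggestion = did_you_mean("hyperpext", vocabulary)
```

Comparing against all 300.000 rows on every request would likely be too slow, so a pre-filter (same first letter, similar length, or a phonetic key such as the Soundex and Metaphone codes mentioned above) is a reasonable way to shrink the candidate list first.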

Read the article

How to set up a user Quartz2D coordinate system with scaling that avoids fuzzy drawing?

- by jdmuys

This topic has been scratched once or twice, but I am still puzzled. And Google was not friendly either. Since Quartz allows for arbitrary coordinate systems using affine transforms, I want to be able to draw things such as floorplans using real-life coordinates, e.g. feet. So basically, for the sake of an example, I want to scale the view so that when I draw a 10x10 rectangle (think a 1-inch box for example), I get a 60x60 pixel rectangle. It works, except the rectangle I get is quite fuzzy. Another question here got an answer that explains why. However, I'm not sure I understood the reason, and moreover, I don't know how to fix it. Here is my code. I set my coordinate system in my custom view's awakeFromNib method:

    - (void) awakeFromNib {
        CGAffineTransform scale = CGAffineTransformMakeScale(6.0, 6.0);
        self.transform = scale;
    }

And here is my draw routine:

    - (void)drawRect:(CGRect)rect {
        CGContextRef context = UIGraphicsGetCurrentContext();
        CGRect r = CGRectMake(10., 10., 11., 11.);
        CGFloat lineWidth = 1.0;
        CGContextStrokeRectWithWidth(context, r, lineWidth);
    }

The square I get is scaled just fine, but totally fuzzy. Playing with lineWidth doesn't help: when lineWidth is set smaller, it gets lighter, but not crisper. So is there a way to set up a view to have a scaled coordinate system, so that I can use my domain coordinates? Or should I go back and implement scaling in my drawing routines? Note that this issue doesn't occur for translation or rotation. Thanks

Search Results

Search found 266 results on 11 pages for 'fuzzy lollipop'.
