Search Results

Search found 7490 results on 300 pages for 'algorithm analysis'.

Page 98/300 | < Previous Page | 94 95 96 97 98 99 100 101 102 103 104 105  | Next Page >

  • Ideas for designing an automated content tagging system needed

    - by Benjamin Smith
    I am currently designing a website that amongst other is required to display and organise small amounts of text content (mainly quotes, article stubs, etc.). I currently have a database with 250,000+ items and need to come up with a method of tagging each item with relevant tags which will eventually allow for easy searching/browsing of the content for users. A very simplistic idea I have (and one that I believe is employed by some sites that I have been looking to for inspiration (http://www.brainyquote.com/quotes/topics.html)), is to simply search the database for certain words or phrases and use these words as tags for the content. This can easily be extended so that if for example a user wanted to show all items with a theme of love then I would just return a list of items with words and phrases relating to this theme. This would not be hard to implement but does not provide very good results. For example if I were to search for the month 'May' in the database with the aim of then classifying the items returned as realting to the topic of Spring then I would get back all occurrences of the word May, regardless of the semantic meaning. Another shortcoming of this method is that I believe it would be quite hard to automate the process to any large scale. What I really require is a library that can take an item, break it down and analyse the semantic meaning and also return a list of tags that would correctly classify the item. I know this is a lot to ask and I have a feeling I will end up reverting to the aforementioned method but I just thought I should ask if anyone knew of any pre-existing solution. I think that as the items in the database are short then it is probably quite a hard task to analyse any meaning from them however I may be mistaken. Another path to possibly go down would be to use something like amazon turk to outsource the task which may produce good results but would be expensive. Eventually I would like users to be able to (and want to!) tag content and to vote for the most relevant tags, possibly using a gameification mechanic as motivation however this is some way down the line. A temporary fix may be the best thing if this were the route I decided to go down as I could use the rough results I got as the starting point for a more in depth solution. If you've read this far, thanks for sticking with me, I know I'm spitballing but any input would be really helpful. Thanks.

    Read the article

  • Programming Contest Question: Counting Polyominos

    - by Martijn Courteaux
    Hi, An example question for a programming contest was to write a program that finds out how much polyominos are possible with a given number of stones. So for two stones (n = 2) there is only one polyominos: XX You might think this is a second solution: X X But it isn't. The polyominos are not unique if you can rotate them. So, for 4 stones (n = 4), there are 7 solutions: X X XX X X X X X X XX X XX XX XX X X X XX X X XX The application has to be able to find the solution for 1 <= n <=10 PS: Using the list of polyominos on Wikipedia isn't allowed ;) EDIT: Of course the question is: How to do this in Java, C/C++, C#

    Read the article

  • How to output multicolumn html without "widows"?

    - by user314850
    I need to output to HTML a list of categorized links in exactly three columns of text. They must be displayed similar to columns in a newspaper or magazine. So, for example, if there are 20 lines total the first and second columns would contain 7 lines and the last column would contain 6. The list must be dynamic; it will be regularly changed. The tricky part is that the links are categorized with a title and this title cannot be a "widow". If you have a page layout background you'll know that this means the titles cannot be displayed at the bottom of the column -- they must have at least one link underneath them, otherwise they should bump to the next column (I know, technically it should be two lines if I were actually doing page layout, but in this case one is acceptable). I'm having a difficult time figuring out how to get this done. Here's an example of what I mean: Shopping Link 3 Link1 Link 1 Link 4 Link2 Link 2 Link 3 Link 3 Cars Link 1 Music Games Link 2 Link 1 Link 1 Link 2 News As you can see, the "News" title is at the bottom of the middle column, and so is a "widow". This is unacceptable. I could bump it to the next column, but that would create an unnecessarily large amount of white space at the bottom of the second column. What needs to happen instead is that the entire list needs to be re-balanced. I'm wondering if anyone has any tips for how to accomplish this, or perhaps source code or a plug in. Python is preferable, but any language is fine. I'm just trying to get the general concept down.

    Read the article

  • Most Frequent 3 page sequence in a weblog

    - by Sundararajan S
    Given a web log which consists of fields 'User ' 'Page url'. We have to find out the most frequent 3-page sequence that users takes. There is a time stamp. and it is not guaranteed that the single user access will be logged sequentially it could be like user1 Page1 user2 Pagex user1 Page2 User10 Pagex user1 Page 3 her User1s page sequence is page1- page2- page 3

    Read the article

  • How can I test if a point lies within a 3d shape with its surface defined by a point cloud?

    - by Ben
    Hi I have a collection of points which describe the surface of a shape that should be roughly spherical, and I need a method with which to determine if any other given point lies within this shape. I've previously been approximating the shape as an exact sphere, but this has proven too inaccurate and I need a more accurate method. Simplicity and speed is favourable over complete accuracy, a good approximation will suffice. I've come across techniques for converting a point cloud to a 3d mesh, but most things I have found have been very complicated, and I am looking for something as simple as possible. Any ideas? Many thanks, Ben.

    Read the article

  • Base X string encoding

    - by Paul Stone
    I'm looking for a routine that will encode a string (stream of bytes) into an arbitrary base/alphabet (like base64 encoding but I get to choose the alphabet). I've seen a few routines that do base X encoding for a number, but not for a string.

    Read the article

  • Convert arbitrary size of byte[] to BigInteger[] and then safely convert back to exactly the same by

    - by PatlaDJ
    I believe conversion exactly to BigInteger[] would be optimal in my case. Anyone had done or found this written in Java and willing to share? So imagine I have arbitrary size byte[] = {0xff,0x3e,0x12,0x45,0x1d,0x11,0x2a,0x80,0x81,0x45,0x1d,0x11,0x2a,0x80,0x81} How do I convert it to array of BigInteger's and then be able to recover it back the original byte array safely? ty in advance.

    Read the article

  • Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz c

    - by Microkernel
    Hi guys, I am implementing Naive Bayesian classifier for spam filtering. I have doubt on some calculation. Please clarify me what to do. Here is my question. In this method, you have to calculate P(S|W) - Probability that Message is spam given word W occurs in it. P(W|S) - Probability that word W occurs in a spam message. P(W|H) - Probability that word W occurs in a Ham message. So to calculate P(W|S), should I do (1) (Number of times W occuring in spam)/(total number of times W occurs in all the messages) OR (2) (Number of times word W occurs in Spam)/(Total number of words in the spam message) So, to calculate P(W|S), should I do (1) or (2)? (I thought it to be (2), but I am not sure, so plz clarify me) I am refering http://en.wikipedia.org/wiki/Bayesian_spam_filtering for the info by the way. I got to complete the implementation by this weekend :( Thanks and regards, MicroKernel :) @sth: Hmm... Shouldn't repeated occurrence of word 'W' increase a message's spam score? In the your approach it wouldn't, right?. Lets take a scenario and discuss... Lets say, we have 100 training messages, out of which 50 are spam and 50 are Ham. and say word_count of each message = 100. And lets say, in spam messages word W occurs 5 times in each message and word W occurs 1 time in Ham message. So total number of times W occuring in all the spam message = 5*50 = 250 times. And total number of times W occuring in all Ham messages = 1*50 = 50 times. Total occurance of W in all of the training messages = (250+50) = 300 times. So, in this scenario, how do u calculate P(W|S) and P(W|H) ? Naturally we should expect, P(W|S) P(W|H)??? right. Please share your thought...

    Read the article

  • Revisions: algorithm and data structure

    - by SODA
    Hi, I need ideas for structuring and processing data with revisions. For example, I have a database of objects (e.g. cars). Each object has a number of properties, which can be arbitrary, so there's no a set schema to describe these objects. These objects are probably saved as key-value pairs. Now I need to change property of an object. I don't want to completely rewrite it - I want to be able to go back and see history of changes to these properties, that's why I want to add new property and keep the old one (so I guess a timestamp would do the job of telling which property is the latest). At the same time I want to be able to get info about any object in a snap, with only latest versions of each of the properties. Any ideas what would be the best approach? At least please point me in the right direction. Thanks!

    Read the article

  • A graph problem

    - by copperhead
    I am struggling to solve the following problem http://uva.onlinejudge.org/external/1/193.html However Im not able to get a fast solution. And as seen by the times of others, there should be a solution of maximum n^2 complexity http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=3&page=show_problem&problemid=129&page=problem_stats Can I get some help?

    Read the article

  • Interview question: How do I detect a loop in this linked list?

    - by jjujuma
    Say you have a linked list structure in Java. It's made up of Nodes: class Node { Node next; // some user data } and each Node points to the next node, except for the last Node, which has null for next. Say there is a possibility that the list can contain a loop - i.e. the final Node, instead of having a null, has a reference to one of the nodes in the list which came before it. What's the best way of writing boolean hasLoop(Node first) which would return true if the given Node is the first of a list with a loop, and false otherwise? How could you write so that it takes a constant amount of space and a reasonable amount of time? Here's a picture of what a list with a loop looks like: Node->Node->Node->Node->Node->Node--\ \ | ----------------

    Read the article

  • how to develop a program to minimize errors in human transcription of hand written surveys

    - by Alex. S.
    I need to develop custom software to do surveys. Questions may be of multiple choice, or free text in a very few cases. I was asked to design a subsystem to check if there is any error in the manual data entry for the multiple choices part. We're trying to speed up the user data entry process and to minimize human input differences between digital forms and the original questionnaires. The surveys are filled with handwritten marks and text by human interviewers, so it's possible to find hard to read marks, or also the user could accidentally select a different value in some question, and we would like to avoid that. The software must include some automatic control to detect possible typing differences. Each answer of the multiple choice questions has the same probability of being selected. This question has two parts: The GUI. The most simple thing I have in mind is to implement the most usable design of the questions display: use of large and readable fonts and space generously the choices. Is there something else? For faster input, I would like to use drop down lists (favoring keyboard over mouse). Given the questions are grouped in sections, I would like to show the answers selected for the questions of that section, but this could slow down the process. Any other ideas? The error checking subsystem. What else can I do to minimize or to check human typos in the multiple choice questions? Is this a solvable problem? is there some statistical methodology to check values that were entered by the users are the same from the hand filled forms? For example, let's suppose the survey has 5 questions, and each has 4 options. Let's say I have n survey forms filled in paper by interviewers, and they're ready to be entered in the software, then how to minimize the accidental differences that can have the manual transcription of the n surveys, without having to double check everything in the 5 questions of the n surveys? My first suggestion is that at the end of the processing of all the hand filled forms, the software could choose some forms randomly to make a double check of the responses in a few instances, but on what criteria can I make this selection? This validation would be enough to cover everything in a significant way? The actual survey is nation level and it has 56 pages with over 200 questions in total, so it will be a lot of hand written pages by many people, and the intention is to reduce the likelihood of errors and to optimize speed in the data entry process. The surveys must filled in paper first, given the complications of taking laptops or handhelds with the interviewers.

    Read the article

  • question about qsort in c++

    - by davit-datuashvili
    i have following code in c++ #include <iostream> using namespace std; void qsort5(int a[],int n){ int i; int j; if (n<=1) return; for (i=1;i<n;i++) j=0; if (a[i]<a[0]) swap(++j,i,a); swap(0,j,a); qsort5(a,j); qsort(a+j+1,n-j-1); } int main() { return 0; } void swap(int i,int j,int a[]) { int t=a[i]; a[i]=a[j]; a[j]=t; } i have problem 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(13) : error C2780: 'void std::swap(std::basic_string<_Elem,_Traits,_Alloc> &,std::basic_string<_Elem,_Traits,_Alloc> &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\xstring(2203) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(13) : error C2780: 'void std::swap(std::pair<_Ty1,_Ty2> &,std::pair<_Ty1,_Ty2> &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\utility(76) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(13) : error C2780: 'void std::swap(_Ty &,_Ty &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\utility(16) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(14) : error C2780: 'void std::swap(std::basic_string<_Elem,_Traits,_Alloc> &,std::basic_string<_Elem,_Traits,_Alloc> &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\xstring(2203) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(14) : error C2780: 'void std::swap(std::pair<_Ty1,_Ty2> &,std::pair<_Ty1,_Ty2> &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\utility(76) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(14) : error C2780: 'void std::swap(_Ty &,_Ty &)' : expects 2 arguments - 3 provided 1> c:\program files\microsoft visual studio 9.0\vc\include\utility(16) : see declaration of 'std::swap' 1>c:\users\dato\documents\visual studio 2008\projects\qsort5\qsort5\qsort5.cpp(16) : error C2661: 'qsort' : no overloaded function takes 2 arguments 1>Build log was saved at "file://c:\Users\dato\Documents\Visual Studio 2008\Projects\qsort5\qsort5\Debug\BuildLog.htm" please help

    Read the article

  • Why does Java's hashCode() in String use 31 as a multiplier?

    - by jacobko
    In Java, the hash code for a String object is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. Why is 31 used as a multiplier? I understand that the multiplier should be a relatively large prime number. So why not 29, or 37, or even 97?

    Read the article

  • Position elements without overlap

    - by eWolf
    I have a number of rectangular elements that I want to position in a 2D space. I calculate an ideal position for each element. Now my problem is that many elements overlap as very often the ideal positions are concentrated in one region. I want to avoid overlap as much as possible (doesn't have to be perfect, though). How can I do this? I've heard physics simulations are suitable for this - is that correct? And can anyone provide an example/tutorial? By the way: I'm using XNA, if you know any .NET library that does exactly this job - tell me!

    Read the article

  • Genetic programming in c++, library suggestions?

    - by shuttle87
    I'm looking to add some genetic algorithms to an Operations research project I have been involved in. Currently we have a program that aids in optimizing some scheduling and we want to add in some heuristics in the form of genetic algorithms. Are there any good libraries for generic genetic programming/algorithms in c++? Or would you recommend I just code my own? I should add that while I am not new to c++ I am fairly new to doing this sort of mathematical optimization work in c++ as the group I worked with previously had tended to use a proprietary optimization package. We have a fitness function that is fairly computationally intensive to evaluate and we have a cluster to run this on so parallelized code is highly desirable. So is c++ a good language for this? If not please recommend some other ones as I am willing to learn another language if it makes life easier. thanks!

    Read the article

  • What is it about Fibonacci numbers?

    - by Ian Bishop
    Fibonacci numbers have become a popular introduction to recursion for Computer Science students and there's a strong argument that they persist within nature. For these reasons, many of us are familiar with them. They also exist within Computer Science elsewhere too; in surprisingly efficient data structures and algorithms based upon the sequence. There are two main examples that come to mind: Fibonacci heaps which have better amortized running time than binomial heaps. Fibonacci search which shares O(log N) running time with binary search on an ordered array. Is there some special property of these numbers that gives them an advantage over other numerical sequences? Is it a density quality? What other possible applications could they have? It seems strange to me as there are many natural number sequences that occur in other recursive problems, but I've never seen a Catalan heap.

    Read the article

  • Converting to a column oriented array in Java

    - by halfwarp
    Although I have Java in the title, this could be for any OO language. I'd like to know a few new ideas to improve the performance of something I'm trying to do. I have a method that is constantly receiving an Object[] array. I need to split the Objects in this array through multiple arrays (List or something), so that I have an independent list for each column of all arrays the method receives. Example: List<List<Object>> column-oriented = new ArrayList<ArrayList<Object>>(); public void newObject(Object[] obj) { for(int i = 0; i < obj.length; i++) { column-oriented.get(i).add(obj[i]); } } Note: For simplicity I've omitted the initialization of objects and stuff. The code I've shown above is slow of course. I've already tried a few other things, but would like to hear some new ideas. How would you do this knowing it's very performance sensitive?

    Read the article

  • question about api functions

    - by davit-datuashvili
    i have question we have API functions in java can user create it's own function and add to his java IDE? for example i am using netbeans can i create my own function add to netbean IDE?let say create binary function or something else thanks

    Read the article

  • Inverse relationship of two variables

    - by Jam
    this one is maybe pretty stupid.. Or I am just exhausted or something, but I just cant seem to solve it.. Problem : two variables X and Y, value of Y is dependent on value of X. X can have values ranging from some value to some value (lets say from 0 to 250) and y can have different values (lets say from 0.1 to 1.0 or something..) - but it is inverse relatonship (what I mean is: if value of X is e.g. 250, then value of Y would be 0.1 and when X decreases up to 0, value of Y raises up to 1.0.. So how should I do it? lets say I have function: -- double computeValue (double X) { /computation/ return Y; } Also, is there some easy way to somehow make the scaling of the function not so linear? - For example when X raises, Y decreases slower at first but then more rapidly in the end.. (rly dont know how to say it but I hope you guys got it) Thanks in advance for this stupid question :/

    Read the article

< Previous Page | 94 95 96 97 98 99 100 101 102 103 104 105  | Next Page >