Search Results

Search found 30771 results on 1231 pages for 'machine language'.

Page 134/1231 | < Previous Page | 130 131 132 133 134 135 136 137 138 139 140 141  | Next Page >

  • Ngram IDF smoothing

    - by adi92
    I am trying to use IDF scores to find interesting phrases in my pretty huge corpus of documents. I basically need something like Amazon's Statistically Improbable Phrases, i.e. phrases that distinguish a document from all the others The problem that I am running into is that some (3,4)-grams in my data which have super-high idf actually consist of component unigrams and bigrams which have really low idf.. For example, "you've never tried" has a very high idf, while each of the component unigrams have very low idf.. I need to come up with a function that can take in document frequencies of an n-gram and all its component (n-k)-grams and return a more meaningful measure of how much this phrase will distinguish the parent document from the rest. If I were dealing with probabilities, I would try interpolation or backoff models.. I am not sure what assumptions/intuitions those models leverage to perform well, and so how well they would do for IDF scores. Anybody has any better ideas?

    Read the article

  • Java text classification problem

    - by yox
    Hello, I have a set of Books objects, classs Book is defined as following : Class Book{ String title; ArrayList<tags> taglist; } Where title is the title of the book, example : Javascript for dummies. and taglist is a list of tags for our example : Javascript, jquery, "web dev", .. As I said a have a set of books talking about different things : IT, BIOLOGY, HISTORY, ... Each book has a title and a set of tags describing it.. I have to classify automaticaly those books into separated sets by topic, example : IT BOOKS : Java for dummies Javascript for dummies Learn flash in 30 days C++ programming HISTORY BOOKS : World wars America in 1960 Martin luther king's life BIOLOGY BOOKS : .... Do you guys know a classification algorithm/method to apply for that kind of problems ? A solution is to use an external API to define the category of the text, but the problem here is that books are in different languages : french, spanish, english ..

    Read the article

  • WITH_OBJECT_HEADERS enabled GC from Dalvik?

    - by Wonil
    Hello, As I know Dalvik VM does not support generational GC as default. But, I found "WITH_OBJECT_HEADERS" compilation flag which could be related with generational GC from HeapInternal.h file. typedef struct DvmHeapChunk { #if WITH_OBJECT_HEADERS u4 header; const Object *parent; const Object *parentOld; const Object *markFinger; const Object *markFingerOld; u2 birthGeneration; u2 markCount; u2 scanCount; u2 oldMarkGeneration; u2 markGeneration; u2 oldScanGeneration; u2 scanGeneration; #endif Does anyone try to build Dalvik with this option enabled? Do you know anything about generational GC support from Dalvik? Regards, Wonil.

    Read the article

  • Shortest distance between a point and a line segment

    - by Eli Courtwright
    I need a basic function to find the shortest distance between a point and a line segment. Feel free to write the solution in any language you want; I can translate it into what I'm using (Javascript). EDIT: My line segment is defined by two endpoints. So my line segment AB is defined by the two points A (x1,y1) and B (x2,y2). I'm trying to find the distance between this line segment and a point C (x3,y3). My geometry skills are rusty, so the examples I've seen are confusing, I'm sorry to admit.

    Read the article

  • Are Scala "continuations" just a funky syntax for defining and using Callback Functions?

    - by Alex R
    And I mean that in the same sense that a C/Java for is just a funky syntax for a while loop. I still remember when first learning about the for loop in C, the mental effort that had to go into understanding the execution sequence of the three control expressions relative to the loop statement. Seems to me the same sort of effort has to be applied to understand Continuations (in Scala and I guess probably other languages). And then there's the obvious follow-up question... if so, then what's the point? It seems like a lot of pain (language complexity, programmer errors, unreadable programs, etc) for no gain.

    Read the article

  • which Server I have to buy?

    - by sri
    Hello, Recently i have registered startup (home office). My intension is to have virtual(company) server for LAMP website development(virtual). Initially thinking, 2-5 people will be logging virtually to the server. a. but not sure where to shop or what to shop. b. Should I go for my own Server or i have to share some server. c. If i have to go My Server, should I go with rack or tower model. d. which brand and model i have to buy. d. If I have to go with shared(cloud), not sure which is the best place It will be great help, if some one provide in site. Thanks in advance. Sri

    Read the article

  • Neural Networks test cases

    - by Betamoo
    Does increasing the number of test cases in case of Precision Neural Networks may led to problems (like over-fitting for example)..? Does it always good to increase test cases number? Will that always lead to conversion ? If no, what are these cases.. an example would be better.. Thanks,

    Read the article

  • Is it possible to have regexp that matches all valid regular expressions?

    - by Juha Syrjälä
    Is it possible to detect if a given string is valid regular expression, using just regular expressions? Say I have some strings, that may or may not be a valid regular expressions. I'd like to have a regular expression matches those string that correspond to valid regular expression. Is that possible? Or do I have use some higher level grammar (i.e. context free language) to detect this? Does it affect if I am using some extended version of regexps like Perl regexps? If that is possible, what the regexp matching regexp is?

    Read the article

  • Is there any valid reason radians are used as the inputs to trig function in many modern languages?

    - by johnmortal
    Is there any pressing reason trig functions should use radian inputs in modern programming languages? As far as I know radians are typically ugly to deal with except in three cases: (1) You want to compute an arc length and you know the angle of the arc and (2) You need to do symbolic calculus with trig functions (3) certain infinite series expansion look prettier if the input is in radians. None of these scenarios seem like a worthy justification for every programming language I am familiar with using radian inputs for Sin, Cos, Tangent, etc... The third one sounds good because it might mean one gets faster computations using radians (very slightly faster- the cost of one additional floating point multiplication ) , but I am dubious even of that because most commonly the developer had to take an extra step to put the angle in radians in the first place. The other two are ridiculous justifications for all the added obscurity.

    Read the article

  • "Anagram solver" based on statistics rather than a dictionary/table?

    - by James M.
    My problem is conceptually similar to solving anagrams, except I can't just use a dictionary lookup. I am trying to find plausible words rather than real words. I have created an N-gram model (for now, N=2) based on the letters in a bunch of text. Now, given a random sequence of letters, I would like to permute them into the most likely sequence according to the transition probabilities. I thought I would need the Viterbi algorithm when I started this, but as I look deeper, the Viterbi algorithm optimizes a sequence of hidden random variables based on the observed output. I am trying to optimize the output sequence. Is there a well-known algorithm for this that I can read about? Or am I on the right track with Viterbi and I'm just not seeing how to apply it?

    Read the article

  • Database strucure for versioning and multiple languages

    - by phobia
    Hi, I'm wondering how to best solve the issue of content existing in multiple versions and multiple languages. An image of my current structure can be seen here: http://i46.tinypic.com/72fx3k.png Each content can only have one active version in each language, and that's how I'm curious on how to best solve. Right now I have a column of the contentversions table, which means for each change of active version I have to run a update and set active=false on all version and then a update to set active=true for the piece of content in question. Any thoughts? :)

    Read the article

  • Operant conditioning algorithm?

    - by Ken
    What's the best way to implement real time operant conditioning (supervised reward/punishment-based learning) for an agent? Should I use a neural network (and what type)? Or something else? I want the agent to be able to be trained to follow commands like a dog. The commands would be in the form of gestures on a touchscreen. I want the agent to be able to be trained to follow a path (in continuous 2D space), make behavioral changes on command (modeled by FSM state transitions), and perform sequences of actions. The agent would be in a simulated physical environment.

    Read the article

  • problem with hierarchical clustering in Python

    - by user248237
    I am doing a hierarchical clustering a 2 dimensional matrix by correlation distance metric (i.e. 1 - Pearson correlation). My code is the following (the data is in a variable called "data"): from hcluster import * Y = pdist(data, 'correlation') cluster_type = 'average' Z = linkage(Y, cluster_type) dendrogram(Z) The error I get is: ValueError: Linkage 'Z' contains negative distances. What causes this error? The matrix "data" that I use is simply: [[ 156.651968 2345.168618] [ 158.089968 2032.840106] [ 207.996413 2786.779081] [ 151.885804 2286.70533 ] [ 154.33665 1967.74431 ] [ 150.060182 1931.991169] [ 133.800787 1978.539644] [ 112.743217 1478.903191] [ 125.388905 1422.3247 ]] I don't see how pdist could ever produce negative numbers when taking 1 - pearson correlation. Any ideas on this? thank you.

    Read the article

  • Logical Matrix .... Solution

    - by Sanju
    Suppose you have an NxN matrix and you have to check whether it's a upper diagonal matrix or not. Is there any generic method to solve this problem? I am elaborating my question: Some thing is like this: Suppose you have NXN matrix having the value of N=4 then matrix will look like this: |5 6 9 2| |1 8 4 9| |5 8 1 6| |7 6 3 2| Its a 4X4 square matrix and again if it's upper triangle matrix it will look something like this: |5 6 9 2| |1 8 4 0| |5 8 0 0| |7 0 0 0| I need to generate a generic program in any language to check wether a square matrix is upper trailgle or not.

    Read the article

  • Problems with real-valued input deep belief networks (of RBMs)

    - by Junier
    I am trying to recreate the results reported in Reducing the dimensionality of data with neural networks of autoencoding the olivetti face dataset with an adapted version of the MNIST digits matlab code, but am having some difficulty. It seems that no matter how much tweaking I do on the number of epochs, rates, or momentum the stacked RBMs are entering the fine-tuning stage with a large amount of error and consequently fail to improve much at the fine-tuning stage. I am also experiencing a similar problem on another real-valued dataset. For the first layer I am using a RBM with a smaller learning rate (as described in the paper) and with negdata = poshidstates*vishid' + repmat(visbiases,numcases,1); I'm fairly confident I am following the instructions found in the supporting material but I cannot achieve the correct errors. Is there something I am missing? See the code I'm using for real-valued visible unit RBMs below, and for the whole deep training. The rest of the code can be found here. rbmvislinear.m: epsilonw = 0.001; % Learning rate for weights epsilonvb = 0.001; % Learning rate for biases of visible units epsilonhb = 0.001; % Learning rate for biases of hidden units weightcost = 0.0002; initialmomentum = 0.5; finalmomentum = 0.9; [numcases numdims numbatches]=size(batchdata); if restart ==1, restart=0; epoch=1; % Initializing symmetric weights and biases. vishid = 0.1*randn(numdims, numhid); hidbiases = zeros(1,numhid); visbiases = zeros(1,numdims); poshidprobs = zeros(numcases,numhid); neghidprobs = zeros(numcases,numhid); posprods = zeros(numdims,numhid); negprods = zeros(numdims,numhid); vishidinc = zeros(numdims,numhid); hidbiasinc = zeros(1,numhid); visbiasinc = zeros(1,numdims); sigmainc = zeros(1,numhid); batchposhidprobs=zeros(numcases,numhid,numbatches); end for epoch = epoch:maxepoch, fprintf(1,'epoch %d\r',epoch); errsum=0; for batch = 1:numbatches, if (mod(batch,100)==0) fprintf(1,' %d ',batch); end %%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% data = batchdata(:,:,batch); poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1))); batchposhidprobs(:,:,batch)=poshidprobs; posprods = data' * poshidprobs; poshidact = sum(poshidprobs); posvisact = sum(data); %%%%%%%%% END OF POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% poshidstates = poshidprobs > rand(numcases,numhid); %%%%%%%%% START NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% negdata = poshidstates*vishid' + repmat(visbiases,numcases,1);% + randn(numcases,numdims) if not using mean neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1))); negprods = negdata'*neghidprobs; neghidact = sum(neghidprobs); negvisact = sum(negdata); %%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% err= sum(sum( (data-negdata).^2 )); errsum = err + errsum; if epoch>5, momentum=finalmomentum; else momentum=initialmomentum; end; %%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% vishidinc = momentum*vishidinc + ... epsilonw*( (posprods-negprods)/numcases - weightcost*vishid); visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact); hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact); vishid = vishid + vishidinc; visbiases = visbiases + visbiasinc; hidbiases = hidbiases + hidbiasinc; %%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% end fprintf(1, '\nepoch %4i error %f \n', epoch, errsum); end dofacedeepauto.m: clear all close all maxepoch=200; %In the Science paper we use maxepoch=50, but it works just fine. numhid=2000; numpen=1000; numpen2=500; numopen=30; fprintf(1,'Pretraining a deep autoencoder. \n'); fprintf(1,'The Science paper used 50 epochs. This uses %3i \n', maxepoch); load fdata %makeFaceData; [numcases numdims numbatches]=size(batchdata); fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numhid); restart=1; rbmvislinear; hidrecbiases=hidbiases; save mnistvh vishid hidrecbiases visbiases; maxepoch=50; fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); batchdata=batchposhidprobs; numhid=numpen; restart=1; rbm; hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases; save mnisthp hidpen penrecbiases hidgenbiases; fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); batchdata=batchposhidprobs; numhid=numpen2; restart=1; rbm; hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases; save mnisthp2 hidpen2 penrecbiases2 hidgenbiases2; fprintf(1,'\nPretraining Layer 4 with RBM: %d-%d \n',numpen2,numopen); batchdata=batchposhidprobs; numhid=numopen; restart=1; rbmhidlinear; hidtop=vishid; toprecbiases=hidbiases; topgenbiases=visbiases; save mnistpo hidtop toprecbiases topgenbiases; backpropface; Thanks for your time

    Read the article

  • How is the ">" operator implemented (on 32 bit integers)?

    - by Ron Klein
    Let's say that the environment is x86. How do compilers compile the "" operator on 32 bit integers. Logically, I mean. Without any knowledge of Assembly. Let's say that the high level language code is: int32 x, y; x = 123; y = 456; bool z; z = x > y; What does the compiler do for evaluating the expression x > y? Does it perform something like (assuming that x and y are positive integers): w = sign_of(x - y); if (w == 0) // expression is 'false' else if (w == 1) // expression is 'true' else // expression is 'false' Is there any reference for such information?

    Read the article

  • How do polymorphic inline caches work with mutable types?

    - by kingkilr
    A polymorphic inline cache works by caching the actual method by the type of the object, in order to avoid the expensive lookup procedures (usually a hashtable lookup). How does one handle the type comparison if the type objects are mutable (i.e. the method might be monkey patched into something different at run time). The one idea I've come up with would be a "class counter" that gets incremented each time a method is adjusted, however this seems like it would be exceptionally expensive in a heavily monkey patched environ since it would kill all the PICs for that class, even if the methods for them weren't altered. I'm sure there must be a good solution to this, as this issue is directly applicable to Javascript and AFAIK all 3 of the big JS VMs have PICs (wow acronym ahoy).

    Read the article

  • Detecting regular expression in content during parse

    - by sonofdelphi
    I am writing a parser for C. I was just running it with some other language files (for fun, to see the extent C-likeness). It breaks down if the code being parsed contains regular expressions... Case 1: For example, while parsing the JavaScript code snippet, var phone="(304)434-5454" phone=phone.replace(/[\(\)-]/g, "") //Returns "3044345454" (removes "(", ")", and "-") The '(', '[' etc get matched as starters of new scopes, which may never be closed. Case 2: And, for the Perl code snippet, # Replace backslashes with two forward slashes # Any character can be used to delimit the regex $FILE_PATH =~ s@\\@//@g; The // gets matched as a comment... How can I detect a regular expression within the content text of a "C-like" program-file?

    Read the article

  • git can I speed up committing?

    - by AndreasT
    I have a big repository in a shared folder. I use git from within a VM on that folder. Everything works nice, but the repository is big and git's searching through all directories and files when committing is slow. I cannot move this repository out of the shared folder. I tried to git add specific files and directories, but when I do git commit -m "something" it still goes off onto it's oddyssey through the directory tree. Can I do commits that ignore the rest of the tree?

    Read the article

  • Should I use C++0x Features Now?

    - by svu2g
    With the official release of VS 2010, is it safe for me to start using the partially-implemented C++0x feature set in my new code? The features that are of interest to me right now are both implemented by VC++ 2010 and recent versions of GCC. These are the only two that I have to support. In terms of the "safety" mentioned in the first sentence: can I start using these features (e.g., lambda functions) and still be guaranteed that my code will compile in 10 years on a compiler that properly conforms to C++0x when it is officially released? I guess I'm asking if there is any chance that VC++ 2010 or GCC will end up like VC++ 6; it was released before the language was officially standardized and consequently allowed grossly ill-formed code to compile. After all, Microsoft does say that "10 is the new 6". ;)

    Read the article

  • Shortest distance between a point and a line segment

    - by Eli Courtwright
    I need a basic function to find the shortest distance between a point and a line segment. Feel free to write the solution in any language you want; I can translate it into what I'm using (Javascript). EDIT: My line segment is defined by two endpoints. So my line segment AB is defined by the two points A (x1,y1) and B (x2,y2). I'm trying to find the distance between this line segment and a point C (x3,y3). My geometry skills are rusty, so the examples I've seen are confusing, I'm sorry to admit.

    Read the article

< Previous Page | 130 131 132 133 134 135 136 137 138 139 140 141  | Next Page >