Search Results

Search found 61 results on 3 pages for 'bayesian'.

Page 1/3 | 1 2 3 | Next Page >

Designing bayesian networks

- by devoured elysium

I have a basic question about Bayesian networks. Let's assume we have an engine, that with 1/3 probability can stop working. I'll call this variable ENGINE. If it stops working, then your car doesn't work. If the engine is working, then your car will work 99% of the time. I'll call this one CAR. Now, if your car is old(OLD), instead of not working 1/3 of the time, your engine will stop working 1/2 of the time. I'm being asked to first design the network and then assign all the conditional probabilities associated with the table. I'd say the diagram of this network would be something like OLD -> ENGINE -> CAR Now, for the conditional probabilities tables I did the following: OLD |ENGINE ------------ True | 0.50 False | 0.33 and ENGINE|CAR ------------ True | 0.99 False | 0.00 Now, I am having trouble about how to define the probabilities of OLD. In my point of view, old is not something that has a CAUSE relationship with ENGINE, I'd say it is more a characteristic of it. Maybe there is a different way to express this in the diagram? If the diagram is indeed correct, how would I go to make the tables? Thanks

Read the article
naive bayesian spam filter question

- by Microkernel

Hi guys, I am planning to implement spam filter using Naive Bayesian classification model. Online I see a lot of info on Naive Bayesian classification, but the problem is its a lot of mathematical stuff, than clearly stating how its done. And the problem is I am more of a programmer than a mathematician (yes I had learnt Probability and Bayesian theorem back in school, but out of touch for a long long time, and I don't have luxury of learning it now (Have nearly 3 weeks to come-up with a working prototype)). So if someone can explain or point me to location where its explained for programmers than a mathematician, it would be a great help. PS: By the way I have to implement it in C, if you want to know. :( Regards, Microkernel

Read the article
Bayesian filtering for forum posts

- by Andrew Davey

Has anyone used a Bayesian filter to let forum members classify posts and so over time only display interesting posts? A Bayesian filter seems to work well for detecting email spam. Is this a viable approach to filter forum posts for users?

Read the article
Naive Bayesian spam filtering effectiveness

- by Waleed Eissa

How effective is naive Bayesian filtering for filtering spam? I heard that spammers easily bypass them by stuffing extra non-spam-related words. What programming techniques can you use with Bayesian filters to prevent that?

Read the article
Confusion Matrix of Bayesian Network

- by iva123

Hi, I'm trying to understand bayesian network. I have a data file which has 10 attributes, I want to acquire the confusion table of this data table ,I thought I need to calculate tp,fp, fn, tn of all fields. Is it true ? if it's then what i need to do for bayesian network. Really need some guidance, I'm lost.

Read the article
Any Naive Bayesian Classifier in python?

- by asldkncvas

Dear Everyone I have tried the Orange Framework for Naive Bayesian classification. The methods are extremely unintuitive, and the documentation is extremely unorganized. Does anyone here have another framework to recommend? I use mostly NaiveBayesian for now. I was thinking of using nltk's NaiveClassification but then they don't think they can handle continuous variables. What are my options?

Read the article
ClassNotFoundException error in implementing Bayesian algorithm in Apache Mahout on Hadoop

- by Shweta

Hi, I have a problem in executing the Bayesian algorithm in Mahout. I built it with Maven and the job file is in target directory. When run from terminal using hadoop, I'm getting the ClassNotFoundException error. What should be done? $HADOOP_HOME/bin/hadoop jar mahout-core-0.3-SNAPSHOT.job org.apache.mahout.classifier.bayes.mapreduce.bayes.bayesdriver -i test -o output Exception in thread "main" java.lang.ClassNotFoundException: org.apache.mahout.classifier.bayes.mapreduce.bayes.bayesdriver at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

Read the article
Naive Bayesian for Topic detection using "Bag of Words" approach

- by AlgoMan

I am trying to implement a naive bayseian approach to find the topic of a given document or stream of words. Is there are Naive Bayesian approach that i might be able to look up for this ? Also, i am trying to improve my dictionary as i go along. Initially, i have a bunch of words that map to a topics (hard-coded). Depending on the occurrence of the words other than the ones that are already mapped. And depending on the occurrences of these words i want to add them to the mappings, hence improving and learning about new words that map to topic. And also changing the probabilities of words. How should i go about doing this ? Is my approach the right one ? Which programming language would be best suited for the implementation ?

Read the article
Ticket Bayesian(or something else) Categorization

- by vinnitu

Hi. I search solution for ticket managment system. Do you know any commercial offers? For now I have only own dev prjects with using dspam library. Maybe I am wrong use it but it show bad results. My idea was divide all prerated ticket in 2 group: spam (it is my category) and rest to (ham - all not the same with this category). After that i trained my dspam. After I redivide all tickets in new groups (for next category) and teach dspam again (with new user - by category name)... And it works bad... My thoughs about is - bad data base tickes (i mean not correct tagging before) - bad my algorythm (it is more posible) Please give me a direction to go forward. Thanks. I am integesting any idea and suggestion. Thanks again.

Read the article
Bayesian content filter for vbulletin [on hold]

- by mc0e

I've been tasked with coming up with a tool to automatically flag some posts for moderator attention on a large vbulletin forum. It's not spam per se, but the task has a lot in common with the sort of handling that might be done by a spam protection plugin (a mod in vbulletin speak). There's only so much I can say, but the task does not involve bad users, so much as particular kinds of posts which the moderators need to be aware of. Filtering out user registrations and links is therefore not useful, and we are talking about posts by real human users. What I'm looking for is an existing bayesian classification plugin, or something that I can study to get an understanding of how to do the vbulletin side of the interface in order to build such a thing. Ie I'd need ways for moderators to list flagged posts, and to correct the classification of posts which have been mis-classified. Ideally I want a 3 way split with an "unsure" category in order to reduce what has to be reviewed to find any mis-classifications. Any pointers? I've searched around a bit, and so far what I've found has been more or less entirely targetted at intervening in sign-ups (mostly using stopforumspam), captchas, and use of external services like akismet which are spam specific. I'm also considering an external solution, which might be ableto be interfaced i

Read the article
Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz c

- by Microkernel

Hi guys, I am implementing Naive Bayesian classifier for spam filtering. I have doubt on some calculation. Please clarify me what to do. Here is my question. In this method, you have to calculate P(S|W) - Probability that Message is spam given word W occurs in it. P(W|S) - Probability that word W occurs in a spam message. P(W|H) - Probability that word W occurs in a Ham message. So to calculate P(W|S), should I do (1) (Number of times W occuring in spam)/(total number of times W occurs in all the messages) OR (2) (Number of times word W occurs in Spam)/(Total number of words in the spam message) So, to calculate P(W|S), should I do (1) or (2)? (I thought it to be (2), but I am not sure, so plz clarify me) I am refering http://en.wikipedia.org/wiki/Bayesian_spam_filtering for the info by the way. I got to complete the implementation by this weekend :( Thanks and regards, MicroKernel :) @sth: Hmm... Shouldn't repeated occurrence of word 'W' increase a message's spam score? In the your approach it wouldn't, right?. Lets take a scenario and discuss... Lets say, we have 100 training messages, out of which 50 are spam and 50 are Ham. and say word_count of each message = 100. And lets say, in spam messages word W occurs 5 times in each message and word W occurs 1 time in Ham message. So total number of times W occuring in all the spam message = 5*50 = 250 times. And total number of times W occuring in all Ham messages = 1*50 = 50 times. Total occurance of W in all of the training messages = (250+50) = 300 times. So, in this scenario, how do u calculate P(W|S) and P(W|H) ? Naturally we should expect, P(W|S) P(W|H)??? right. Please share your thought...

Read the article
R Package for Bayesian Belief Networks

- by user187395

Is anyone aware of an R package for Bayesian Belief Networks (BBN) that can handle latent (hidden) nodes? I am referring to process learning the structure and the parameters of the model. Based on the documentation, the "bnlearn" package does not provide for such capability. In addition, is anyone aware of R packages that allow for making inferences and sensitivity analysis for BBN? Thank you in advance.

Read the article
SpamAssassin bayesian score discrepancies

- by CaptSaltyJack

This makes my brain hurt. For some reason, SpamAssassin is giving high scores to certain emails, but when I test them on the command line, they get a low score. This one particular email has this in the header: X-Spam-Flag: YES X-Spam-Score: 8.521 X-Spam-Level: ******** X-Spam-Status: Yes, score=8.521 tagged_above=-9999 required=5 tests=[BAYES_99=3.5, BAYES_999=0.2, HTML_MESSAGE=0.001, NO_RECEIVED=-0.001, NO_RELAYS=-0.001, RAZOR2_CF_RANGE_51_100=0.5, RAZOR2_CF_RANGE_E8_51_100=1.886, RAZOR2_CHECK=0.922, URIBL_RHS_DOB=1.514] autolearn=no Yet when I dump the raw email into a file msg and run sudo su amavis -c 'spamassassin -t msg', I get this output: Content analysis details: (3.8 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 1.5 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread) [URIs: cliobeads.com] -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP 0.0 HTML_MESSAGE BODY: HTML included in message -0.0 BAYES_20 BODY: Bayes spam probability is 5 to 20% [score: 0.1855] 1.9 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50% [cf: 100] 0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50% [cf: 100] 0.9 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/) I'm really confused as to why when the email comes in, it gets a completely different score attached to it than when I run spamassassin -t. Is there some other way I should be testing emails? Also, my users have the ability to drag false positives into a folder called "False Positives," and every day a cron job fires off that runs this on every message in every user's folder: sa-learn --dbpath=/var/lib/amavis/.spamassassin --ham /tmp/*-*.eml >/dev/null I ran sudo locate bayes_toks and there's definitely only one bayes DB on the system, in /var/lib/amavis/.spamassassin. I'm clueless, any help would be great and may help restore my sanity!

Read the article
AI Game Programming : Bayesian Networks, how to make efficient?

- by Mahbubur R Aaman

We know that AI is one of the most important part of Game Programming. Bayesian networks is one of the core part of AI at Game Programming. Bayesian networks are graphs that compactly represent the relationship between random variables for a given problem. These graphs aid in performing reasoning or decision making in the face of uncertainty. Here me, utilizing the monte carlo method and genetic algorithms. But tooks much time and sometimes crashes due to memory. Is there any way to implement efficiently?

Read the article
Order database results by bayesian rating

- by One Trick Pony

I'm not sure this is even possible, but I need a confirmation before doing it the "ugly" way :) So, the "results" are posts inside a database which are stored like this: the posts table, which contains all the important stuff, like the ID, the title, the content the post meta table, which contains additional post data, like the rating (this_rating) and the number of votes (this_num_votes). This data is stored in pairs, the table has 3 columns: post ID / key / value. It's basically the WordPress table structure. What I want is to pull out the highest rated posts, sorted based on this formula: br = ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes) which I stole form here. avg_num_votes and avg_rating are known variables (they get updated on each vote), so they don't need to be calculated. Can this be done with a mysql query? Or do I need to get all the posts and do the sorting with PHP?

Read the article
Weighted Average and Ratings

- by Danten

Maths isn't my strong point and I'm at a loss here. Basically, all I need is a simple formula that will give a weighted rating on a scale of 1 to 5. If there are very few votes, they carry less influence and the rating pressess more towards the average (in this case I want it to be 3, not the average of all other ratings). I've tried a few different bayesian implementations but these haven't worked out. I believe the graphical representation I am looking for could be shown as: ___ / ___/ Cheers

Read the article
Using c#,c/c++ or java to improve BBN with GA

- by madicemickael

I have a little problem in my little project , I wish that someone here could help me! I am planning to use a bayesian network as a decision factor in my game AI and I want to improve the decision making every step of the way , anyone knows how to do that ? Any tutorials / existing implementations will be very good,I hope some of you could help me. I heard that a programmer in this community did a good implementation of this put together for poker game AI.I am planning to use it like him ,but in another poker(Texas) or maybe Rentz. Looking for C/c++ or c# or java code. Thanks , Mike

Read the article
Calculate posterior distribution of unknown mis-classification with PRTools in MATLAB

- by Samuel Lampa

I'm using the PRTools MATLAB library to train some classifiers, generating test data and testing the classifiers. I have the following details: N: Total # of test examples k: # of mis-classification for each classifier and class I want to do: Calculate and plot Bayesian posterior distributions of the unknown probabilities of mis-classification (denoted q), that is, as probability density functions over q itself (so, P(q) will be plotted over q, from 0 to 1). I have that (math formulae, not matlab code!): P(q|k,N) = Posterior * Prior / Normalization constant = P(k|q,N) * P(q|N) / P(k|N) The prior is set to 1, so I only need to calculate the posterior and normalization constant. I know that the posterior can be expressed as (where B(N,k) is the binomial coefficient): P(k|q,N) = B(N,k) * q^k * (1-q)^(N-k) ... so the Normalization constant is simply an integral of the posterior above, from 0 to 1: P(k|N) = B(N,k) * integralFromZeroToOne( q^k * (1-q)^(N-k) ) (The Binomial coefficient ( B(N,k) ) can be omitted thoughappears in both the posterior and normalization constant, so it can be omitted.) Now, I've heard that the integral for the normalization constant should be able to be calculated as a series ... something like: k!(N-k)! / (N+1)! Is that correct? (I have some lecture notes from with this series, but can't figure out if it is for the normalization constant integral, or for the posterior distribution of mis-classification (q)) Also, hints are welcome as how to practically calculate this? (factorials are easily creating truncation errors right?) ... AND, how to practically calculate the final plot (the posterior distribution over q, from 0 to 1).

Read the article
What would be a good language to implement a naive bayes classifier from scratch?

- by kita

I would like to implement a naive bayes classifier for spam filtering from scratch as a learning exercise. What would be the best langauge of the following to try this out in? Java Ruby C++ C something else Please give reasons (it would help greatly!)

Read the article
Mahout Naive Bayes Classifier for Items

- by Nimesh Parikh

Team, I am working on a project where i need to classify Items into certain category. I have a single file as input; which contains target variable and space separated features. My training data will look like Category Name [Tab] DataString Plumbing [Tab] Pipe Tap Plastic Pipe PVC Pipe Cold Water Line Hot Water Line Tee outlet up Elbow turned up Elbow turned down Gate valve Globe valve Paint [Tab] Ivory Black Burnt Umber Caput Mortuum Violet Earth Red Yellow Ochre Titanium White Cadmium Yellow Light Cadmium Yellow Deep Cloths [Tab] Shirt T-Shirt Pent Jeans Tee Cargo Well, I have really big set of Category. I have couple of question here am i using correct data for Training? If no then what should i use? Once I train and Test my model, what is next step? How can i use output? Please help me with this Thanks, Nimesh

Read the article
A simple explanation of Naive Bayes Classification

- by Jaggerjack

I am finding it hard to understand the process of Naive Bayes, and I was wondering if someone could explained it with a simple step by step process in English. I understand it takes comparisons by times occurred as a probability, but I have no idea how the training data is related to the actual dataset. Please give me an explanation of what role the training set plays. I am giving a very simple example for fruits here, like banana for example training set--- round-red round-orange oblong-yellow round-red dataset---- round-red round-orange round-red round-orange oblong-yellow round-red round-orange oblong-yellow oblong-yellow round-red

Read the article
How do I solve this conditional probabilities problem with MATLAB?

- by user198729

If P( cj | xi ) are already known, where i=1,2,...n; j=1,2,...k; How do I calculate/estimate: P( cj | xl , xm , xn ), where j=1,2,...k; l,m,n {1,2,...n} ?

Read the article
Naive Bayes matlab, row classification

- by Jungle Boogie

How do you classify a row of seperate cells in matlab? Atm I can classify single coloums like so: training = [1;0;-1;-2;4;0;1]; % this is the sample data. target_class = ['posi';'zero';'negi';'negi';'posi';'zero';'posi']; % target_class are the different target classes for the training data; here 'positive' and 'negetive' are the two classes for the given training data % Training and Testing the classifier (between positive and negative) test = 10*randn(25, 1); % this is for testing. I am generating random numbers. class = classify(test,training, target_class, 'diaglinear') % This command classifies the test data depening on the given training data using a Naive Bayes classifier Unlike the above im looking at wanting to classify: A B C Row A | 1 | 1 | 1 = a house Row B | 1 | 2 | 1 = a garden Can anyone help? Here is a code example from matlabs site: nb = NaiveBayes.fit(training, class) nb = NaiveBayes.fit(..., 'param1',val1, 'param2',val2, ...) I dont understand what param1 is or what val1 etc should be?

Read the article
Calculating spam probability

- by Hobhouse

I am building a website in python/django and want to predict wether a user submission is valid or wether it is spam. Users have an accept rate on their submissions, like this website has. Users can moderate other users' submissions; and these moderations are later metamoderated by an admin. Given this: user A with an submission accept rate of 60% submits something. user B moderates A's post as a valid submission. However, his moderations are often wrong, and his moderations' accept rate is a mere 30%. user C moderates A's post as spam. User C is usually right. His moderations' accept rate is 80%. How can I predict the chance of A's post being spam?

Read the article
What is the difference between causal models and directed graphical models?

- by Neil G

What is the difference between causal models and directed graphical models? or: What is the difference between causal relationships and directed probabilistic relationships? or, even better: What would you put in the interface of a DirectedProbabilisticModel class, and what in a CausalModel class? Would one inherit from the other? Collaborative solution: interface DirectedModel { map<Node, double> InferredProbabilities(map<Node, double> observed_probabilities, set<Node> nodes_of_interest) } interface CausalModel: DirectedModel { bool NodesDependent(set<Node> nodes, map<Node, double> context) map<Node, double> InferredProbabilities(map<Node, double> observed_probabilities, map<Node, double> externally_forced_probabilities, set<Node> nodes_of_interest) }

Read the article

1 2 3 | Next Page >