Latent Dirichlet Allocation, pitfalls, tips and programs

Posted by Gregg Lind on Stack Overflow
Published on 2008-10-10T13:23:07Z Indexed on 2010/04/15 22:03 UTC

Filed under: lda | natural-language | statistics | nlp | algorithm

I'm experimenting with Latent Dirichlet Allocation for topic disambiguation and assignment, and I'm looking for advice.

1. Which program is the "best", where "best" is some combination of easiest to use, best prior estimation, and fastest?
2. How do I incorporate my intuitions about topicality? Say I believe that some items in the corpus really belong to the same category, such as all articles by the same author. Can I add that to the analysis?
3. Are there any unexpected pitfalls or tips I should know about before embarking?
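
To make the second question concrete, here is a minimal sketch of one crude way such prior knowledge could be injected: merge all documents believed to share a category (here, the same author) into a single pseudo-document before passing the corpus to an LDA program, so that they are forced to share one topic mixture. This is an assumption about a possible preprocessing step, not a feature of any particular LDA package; the function name `pool_by_key` and the toy data are hypothetical.

```python
from collections import defaultdict


def pool_by_key(docs, keys):
    """Concatenate the token lists of all documents that share a key.

    docs: list of token lists, one per document.
    keys: list of grouping labels (e.g. author names), one per document.
    Returns a dict mapping each key to one pooled token list.
    """
    if len(docs) != len(keys):
        raise ValueError("docs and keys must have the same length")
    pooled = defaultdict(list)
    for tokens, key in zip(docs, keys):
        pooled[key].extend(tokens)
    return dict(pooled)


# Hypothetical toy corpus: three tokenized articles and their authors.
docs = [
    ["topic", "model", "prior"],
    ["gibbs", "sampling", "prior"],
    ["postfix", "mail", "server"],
]
authors = ["alice", "alice", "bob"]

# Each author's articles become a single pseudo-document.
pooled = pool_by_key(docs, authors)
for author, tokens in pooled.items():
    print(author, tokens)
```

The trade-off is that pooling discards per-article topic mixtures: the model then sees one mixture per author, so topic assignments for individual articles would have to be inferred separately afterwards.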

I'd prefer it if there were R or Python front ends for whatever program I use, but I expect (and accept) that I'll be dealing with C.

© Stack Overflow or respective owner

