About the inference result of Blei's lda-c-dist

I have a question about the inference results of the lda-c-dist package. How many words should be displayed when viewing the results? For example, if I set the number of words to a very large number N (assume the vocabulary contains N terms in total), the output seems to fall into groups of words. In each group, the word indices range from 1 to N.

This is what I got. Assume the number of terms is 10, and I set the number of words displayed to 10.

Topic 0xx:
001
008
009
002
003
007
000
004
005
006

It seems that maybe I should set the number of words displayed to 3, not 10.

So, for a single topic, how many words should be specified when viewing topics with topics.py?

Besides, I'm going to use this output to calculate the similarity of two topics. So ...

Answers


Actually, a topic can list as many items as there are terms in the vocabulary. What is displayed here is just the terms in descending order of probability, cut off at the number you specify, so the number of words is only a display choice.
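To make the ordering concrete, here is a minimal sketch in Python. It assumes each topic is one row of per-term log probabilities, which I believe is what lda-c writes to its beta file, but check your own output. The names `rank_terms` and `cosine_similarity` and the ten probabilities are made up for illustration and are not part of lda-c-dist; the numbers were chosen only so that the ranking matches the listing in the question.

```python
import math

def rank_terms(log_probs, n_words):
    """Return the indices of the n_words most probable terms, best first."""
    order = sorted(range(len(log_probs)),
                   key=lambda i: log_probs[i],
                   reverse=True)
    return order[:n_words]

def cosine_similarity(log_p, log_q):
    """Cosine similarity of two topics, computed over the whole vocabulary.

    This is one possible similarity measure, not something lda-c provides.
    """
    p = [math.exp(x) for x in log_p]
    q = [math.exp(x) for x in log_q]
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

# Hypothetical topic over a 10-term vocabulary (probabilities sum to 1).
probs = [0.03, 0.30, 0.07, 0.06, 0.02, 0.015, 0.005, 0.05, 0.25, 0.20]
log_probs = [math.log(p) for p in probs]

print(rank_terms(log_probs, 10))  # every term, most probable first
print(rank_terms(log_probs, 3))   # the same ranking, cut off after 3
```

Asking for 3 words returns the first 3 entries of the 10-word ranking, so a larger number never changes the order, it only shows more of the tail. For comparing two topics it is probably safer to work from the full rows, as `cosine_similarity` does, than from the truncated word lists.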

