Vectorization of a text file
Posted by Fox on Stack Overflow
        Published on 2012-03-21T17:25:28Z
java | vectorization
I am trying to implement vectorization of text files. I have created a dictionary (the unique words across all the documents). What is the best way to implement this in Java?
For example, my dictionary has the following words: {w1, w2, w3, w4}, and I have 2 documents, each containing a subset of the words in the dictionary. I need to write the matrix to a text file in this form:
1,3,4,0
0,0,2,1
Here each row represents a document, and each value is the number of occurrences of the corresponding dictionary word in that document.
Can you suggest the most efficient way to implement this in Java?
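To make the desired output concrete, here is a minimal sketch of one possible approach, not a definitive answer: map each dictionary word to a column index with a HashMap, then make a single pass over each document's tokens and increment the matching column. Everything named here is an assumption for illustration: the class TermCountMatrix, its methods, the output file name matrix.txt, the whitespace tokenization, and the two sample documents (chosen so that they reproduce the rows shown above). It also assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative, not from the question.
class TermCountMatrix {

    // Map each dictionary word to its column index so lookups are O(1) on average.
    // Duplicate words, if any, keep the index of their first occurrence.
    static Map<String, Integer> indexDictionary(List<String> dictionary) {
        Map<String, Integer> index = new HashMap<>();
        for (String word : dictionary) {
            index.putIfAbsent(word, index.size());
        }
        return index;
    }

    // Count how often each dictionary word occurs in one document.
    // Assumption: tokens are separated by whitespace; unknown words are ignored.
    static int[] countRow(String document, Map<String, Integer> index) {
        int[] row = new int[index.size()];
        for (String token : document.trim().split("\\s+")) {
            Integer col = index.get(token);
            if (col != null) {
                row[col]++;
            }
        }
        return row;
    }

    // Format one row as comma-separated counts, e.g. "1,3,4,0".
    static String toCsvLine(int[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(row[i]);
        }
        return sb.toString();
    }

    // Write one line per document to the given writer.
    static void writeMatrix(List<String> documents, List<String> dictionary, Writer out)
            throws IOException {
        Map<String, Integer> index = indexDictionary(dictionary);
        for (String document : documents) {
            out.write(toCsvLine(countRow(document, index)));
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> dictionary = Arrays.asList("w1", "w2", "w3", "w4");
        // Sample documents made up to match the rows in the question.
        List<String> documents = Arrays.asList(
                "w1 w2 w2 w2 w3 w3 w3 w3",
                "w3 w4 w3");
        // "matrix.txt" is an assumed output file name.
        try (Writer out = Files.newBufferedWriter(Paths.get("matrix.txt"), StandardCharsets.UTF_8)) {
            writeMatrix(documents, dictionary, out);
        }
    }
}
```

Each document is processed in time proportional to its number of tokens, and only one row is held in memory at a time. If the dictionary is large and most counts are zero, a sparse representation (for example, writing only index:count pairs) may be a better fit than a dense row per document.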
© Stack Overflow or respective owner