Latent semantic analysis — LSA via Sklearn

A quick write-up on using CountVectorizer and TruncatedSVD from the scikit-learn (sklearn) library to compute Document-Term, Document-Topic, and Term-Topic matrices. After fitting the model, we try it out on simple, never-before-seen documents in order to label them.

Helper Methods

  • These helpers make a document-topic matrix easier to view.
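The original helper code is not shown here, so the following is only a sketch of what such helpers might look like. The names `topic_labels` and `format_matrix` are hypothetical; they simply attach row and column labels to a small dense matrix and print it as an aligned table.

```python
import numpy as np


def topic_labels(n_topics):
    """Return column names such as topic_0, topic_1, ..."""
    return [f"topic_{i}" for i in range(n_topics)]


def format_matrix(matrix, row_labels, col_labels, decimals=3):
    """Render a small dense matrix as an aligned text table."""
    matrix = np.asarray(matrix, dtype=float)
    if matrix.shape != (len(row_labels), len(col_labels)):
        raise ValueError("labels do not match the matrix shape")
    # Format every cell first so the column width can fit the widest entry.
    cells = [[f"{value:.{decimals}f}" for value in row] for row in matrix]
    headers = [str(c) for c in col_labels]
    label_w = max(len(str(r)) for r in row_labels)
    col_w = max(len(s) for s in headers + [c for row in cells for c in row])
    lines = [" " * label_w + "  " + "  ".join(h.rjust(col_w) for h in headers)]
    for label, row in zip(row_labels, cells):
        lines.append(
            str(label).ljust(label_w) + "  " + "  ".join(c.rjust(col_w) for c in row)
        )
    return "\n".join(lines)


# Made-up numbers, only to show the layout.
demo = format_matrix([[0.5, 0.25], [1.0, 0.0]], ["doc_0", "doc_1"], topic_labels(2))
print(demo)
```

The same function works for a term-topic matrix by passing the vocabulary as `row_labels`.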

Document-Topic

Document-Topic Matrix
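A minimal sketch of how a Document-Topic matrix can be produced: fit TruncatedSVD on the count matrix and keep the transformed rows. The toy corpus, the two topics, and the fixed `random_state` are assumptions for illustration rather than the settings from the original write-up.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus: two documents about animals, two about finance.
docs = [
    "the cat chased the mouse",
    "the cat and the dog are pets",
    "the stock market fell today",
    "investors sold stock as the market fell",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(docs)  # document-term counts

# Reduce the term dimension to 2 latent topics.
svd = TruncatedSVD(n_components=2, random_state=42)
doc_topic = svd.fit_transform(dtm)  # shape (n_docs, n_topics)

# Treat the topic with the largest absolute weight as the document's topic.
dominant = np.abs(doc_topic).argmax(axis=1)

for i, (row, topic) in enumerate(zip(doc_topic, dominant)):
    print(f"doc_{i}", row.round(3), "-> topic", topic)
```

Each row holds one document's weight on each latent topic. The sign of a weight is not meaningful on its own, which is why the sketch compares absolute values.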

Term-Topic

Term-Topic Matrix

Hold-Out Documents

Document-Topic Matrix for Hold Outs
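To label documents the model has never seen, reuse the fitted vectorizer and SVD and call `transform` (not `fit_transform`) so the hold-outs are projected into the existing topic space. This is a sketch under assumptions: the corpus and hold-out sentences are made up, and picking the topic with the largest absolute weight as the label is one simple rule, not necessarily the one used in the original write-up.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical training corpus: two documents about animals, two about finance.
docs = [
    "the cat chased the mouse",
    "the cat and the dog are pets",
    "the stock market fell today",
    "investors sold stock as the market fell",
]

vectorizer = CountVectorizer(stop_words="english")
svd = TruncatedSVD(n_components=2, random_state=42)
train_topic = svd.fit_transform(vectorizer.fit_transform(docs))
train_labels = np.abs(train_topic).argmax(axis=1)

# Hypothetical hold-out documents, not seen during fitting.
hold_out = [
    "my dog chased the cat",
    "the market fell and investors sold",
]

# transform() reuses the learned vocabulary and topics; unknown words are ignored.
hold_topic = svd.transform(vectorizer.transform(hold_out))
hold_labels = np.abs(hold_topic).argmax(axis=1)

for text, row, label in zip(hold_out, hold_topic, hold_labels):
    print(f"{text!r}", row.round(3), "-> topic", label)
```

Words in a hold-out document that never appeared in the training corpus contribute nothing, so a document made up entirely of unseen vocabulary would land at the origin and could not be labeled this way.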
Photo by iMattSmart on Unsplash
