PYCON UK

Topic Modelling with Gensim

Parul Sethi | Friday 10:30 | Room B

Topic modelling is a great way to analyze completely unstructured textual data, and the Python NLP framework Gensim makes it very easy to do. This tutorial will guide you through the whole process of analyzing your textual data with topic modelling: pre-processing your data, applying topic modelling algorithms using Gensim, evaluating the resulting models manually and automatically, and analyzing them using topic modelling visualizations. We will also see its applications in a few NLP tasks: discovering topic correlation (with dendrograms), document clustering (demo with TensorBoard), and document analysis (using word coloring).

The Python packages used during the tutorial will be spaCy (for pre-processing), gensim (for topic modelling), and Visdom, pyLDAvis and Plotly (for visualization). The interface for the tutorial will be a Jupyter notebook.