Topic Modelling Fri Dec 18th 2pm, U35 3rd floor meeting room.

The methods circle will meet on Friday Dec 18th to discuss Topic Modelling. The venue is the 3rd floor meeting room at Unioninkatu 35.

Topic modelling can be loosely characterized as an advanced form of Content Analysis. It is suitable for large sets of texts (these days readily available in digital format). Often it is combined with some form of linguistic preprocessing. The analysis consists of extracting sets of commonly co-occurring words (“topics”) and identifying the occurrences of these topics in segments of the original texts.
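To make the two steps above concrete, here is a minimal sketch in Python. It is a toy illustration, not a real topic model: it only counts which words appear together in the same segment and then measures how much of each segment is covered by a hand-picked word set, whereas actual topic modelling software estimates the topics from the data. The stop-word list, the example segments, the "finance" word set, and the function names are all assumptions made up for this illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list; real preprocessing is richer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, drop stop words and non-alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]


def cooccurrence_counts(segments):
    """Count, for each unordered word pair, the number of segments containing both."""
    counts = Counter()
    for segment in segments:
        vocabulary = sorted(set(tokenize(segment)))
        # Sorting first means each pair is always stored in alphabetical order.
        counts.update(combinations(vocabulary, 2))
    return counts


def topic_share(segment, topic_words):
    """Fraction of a segment's tokens that belong to the given word set."""
    tokens = tokenize(segment)
    if not tokens:
        return 0.0
    return sum(token in topic_words for token in tokens) / len(tokens)


# Made-up example segments, two per theme.
segments = [
    "the bank raised interest rates",
    "interest rates and the bank loan",
    "the team won the football match",
    "football match and the team coach",
]

counts = cooccurrence_counts(segments)

# A hand-picked "topic"; a real topic model would infer such word sets itself.
finance = {"bank", "interest", "rates"}
shares = [topic_share(segment, finance) for segment in segments]

print(counts.most_common(6))
print(shares)
```

Word pairs that recur across segments hint at a shared theme, and the per-segment shares show where that theme occurs in the texts. This is the intuition only; it says nothing about how any particular TM program works internally.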

Topic Models (TM) are just one of several flavours of automatic (or semi-automatic) text mining. Due to the popularity of TM and the wealth of computer programs available, we have chosen TM as a starting point. However, we expect the discussion to be more about the landscape of text mining than about the details of the method itself.

We hope the meeting will give rise to further study circle sessions in which we may find answers to some of these questions:

– What kinds of research problems does TM address?
– What kind of data can it handle?
– What are the different flavours of TM (hierarchical, labeled, semi-supervised, …)?
– What is its relationship to alternatives like traditional content analysis, traditional qualitative research, neural networks, co-word mapping, etc.?
– How can it be combined with other methods, especially traditional qualitative analysis? For example, TM could serve as a preliminary rough classifier to ease the analysis, or it could be used to test or expand the results of a qualitative study.
– Where does linguistics come in?
– What is the process of creating a Topic Model like? What skills do you need, and what software is available?

We hope to have a few short presentations of research cases using TM, a small demonstration of doing it in R, and some very preliminary thoughts about combining it with qualitative analysis software (RQDA), plus whatever discussion you wish to start.

Please comment on this entry if you plan to participate in the meeting.