Manuscript Topic Coding

One method of content analysis that can be used with the files available in the Digital Archive is topic coding. Topic coding is a popular method in qualitative sociological research, and the process can be streamlined using the digital PDF files available in the Digital Archive. There are many forms of automatic topic coding, as well as more advanced methodologies such as topic modelling. One user-friendly method, however, uses the NVivo software to automatically code thousands of documents from the archive for various topics.

This method has already been employed using the Digital Archive PDF files. Manuscripts and even Review PDF files can be “auto-coded” using NVivo. The process starts with researchers using NVivo to manually code for topics in a sample of manuscript PDF files from the Digital Archive. Sentences in these files are coded as belonging to a set of topics, which can either be established before the process starts or be created through saturation coding. In NVivo, these coded sentences are stored digitally in groupings called “nodes”.

After the topic nodes have been created, with at least ten sentence references coded to each node, NVivo's “auto-code” function can be used. This function, which has improved over the years, uses the sentences stored in the topic nodes to code other files for the same topics. Researchers can thus train the software through manual coding and then prompt it to apply that coding to many other files. This saves time and effort, since thousands of PDF files can be coded automatically. It also standardizes the coding across thousands of files, sparing researchers the reliability problems that often arise when more than one person codes a set of documents. The approach is also useful for sensitive documents that should not be read by human eyes, since the software analyzes them automatically and outputs the results.
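The general idea of training on manually coded sentences and then coding new text can be illustrated with a toy sketch. This is not NVivo's algorithm, which is not described here; it swaps in a simple bag-of-words cosine similarity as a stand-in. The node names, example sentences, the `auto_code` function, and the `threshold` value are all hypothetical, and each node holds only two sentences for brevity rather than the ten or more recommended above.

```python
import math
import re
from collections import Counter

# Hypothetical topic nodes with manually coded example sentences.
nodes = {
    "methods": [
        "We collected survey data from 500 respondents.",
        "The sample was drawn using random sampling.",
    ],
    "theory": [
        "This framework builds on theories of social capital.",
        "We extend the theory of institutional change.",
    ],
}

def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def vectorize(sentences):
    """Build a word-count vector from a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return counts

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def auto_code(sentence, nodes, threshold=0.2):
    """Return the most similar topic node, or None if no node reaches the threshold.

    Stand-in technique only: an assumption for illustration, not NVivo's method.
    """
    vector = vectorize([sentence])
    scores = {topic: cosine(vector, vectorize(examples))
              for topic, examples in nodes.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

print(auto_code("Survey data were collected from a random sample.", nodes))
print(auto_code("Bananas ripen quickly.", nodes))
```

A sentence that shares vocabulary with a node's examples is assigned to that node, while a sentence with no overlap is left uncoded. Real auto-coding tools are considerably more sophisticated, so this sketch only conveys the train-then-apply workflow.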

The result of NVivo's auto-coding is an Excel table that reports, for each document analyzed, the number of references found to each topic. This output can also be used as a continuous variable in quantitative analysis.
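As a rough illustration of how such a table could feed into quantitative analysis, the sketch below pivots reference counts into a document-by-topic matrix, with one numeric variable per topic. The row layout (document, topic, count), the file names, and the topic names are assumptions made for the example; the actual layout of the exported table may differ.

```python
from collections import defaultdict

# Hypothetical rows as they might appear in the exported table:
# (document name, topic node, number of coded references).
rows = [
    ("manuscript_001.pdf", "methods", 4),
    ("manuscript_001.pdf", "theory", 1),
    ("manuscript_002.pdf", "methods", 0),
    ("manuscript_002.pdf", "theory", 7),
]

def to_document_topic_matrix(rows):
    """Pivot long-format rows into {document: {topic: count}}, filling gaps with 0."""
    topics = sorted({topic for _, topic, _ in rows})
    matrix = defaultdict(lambda: dict.fromkeys(topics, 0))
    for document, topic, count in rows:
        matrix[document][topic] += count
    return dict(matrix), topics

matrix, topics = to_document_topic_matrix(rows)

# Print one line per document, with counts ordered by topic name.
for document, counts in sorted(matrix.items()):
    print(document, [counts[topic] for topic in topics])
```

Each row of the resulting matrix describes one document, so the per-topic counts can be merged with other document-level variables before analysis.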