Text Classification for Organizational Researchers: A Tutorial

[We’re pleased to welcome author Dr. Stefan Mol of the University of Amsterdam. Dr. Mol recently published an article in Organizational Research Methods entitled “Text Classification for Organizational Researchers: A Tutorial,” which is currently free to read for a limited time. Below, Dr. Mol reflects on the inspiration for conducting this research:]

What motivated you to pursue this research?
Machine-learning-assisted text analysis is still uncommon in organizational research, although its use holds promise. Most manual text analysis procedures used by researchers in this field, such as thematic and template analysis, involve assigning text to categories. However, manual classification becomes laborious and time-consuming (and is sometimes subject to reliability issues) when it has to be done for a large number of texts (hundreds of thousands or millions). An alternative is for researchers to construct automatic text classification systems, which speed up the labeling or coding of large sets of textual data. Designing and building text classifiers could be useful in various areas of organizational research. Our aim was to illustrate how this can be done and to provide a tutorial. As an example, we built a text classifier that automatically sorts the job type information contained in job vacancies. We demonstrated the importance of validating the results of text classification through data triangulation, using expert input. We believe that the use of this procedure among organizational researchers can improve the reliability and efficiency of analyses that involve classification.
What has been the most challenging aspect of conducting your research? Were there any surprising findings?
Building classifiers involves several rounds of training, testing, and validation before they can be deployed in practice. The most challenging aspect is training the classifier and choosing its parameters in such a way that the results are valid for the intended application. The classifier we built for the job analysis task was able to recover job task sentences with high precision, as assessed by an expert in the field, even though it was initially trained with minimal expert input. Our results thus suggest that job vacancies are a reliable alternative source of job information that can augment existing approaches to job analysis. More generally, we believe this suggests that wider use of text classification holds promise for organizational research as a whole.
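To make the idea of expert-assessed precision concrete, here is a minimal sketch in Python. It is not the procedure from the article: the function name and the toy labels are invented for illustration, and it assumes that 1 means "job task sentence" and 0 means "other". Precision is the share of sentences flagged by the classifier that the expert also judged to be job task sentences.

```python
def precision(predicted, expert):
    """Share of classifier-flagged sentences that the expert also marked as positive."""
    # Keep the expert's judgment only for sentences the classifier flagged (label 1).
    flagged = [e for p, e in zip(predicted, expert) if p == 1]
    if not flagged:
        return 0.0  # nothing was flagged, so report 0.0 by convention
    return sum(flagged) / len(flagged)


# Hypothetical labels for six sentences (illustrative only, not data from the study).
predicted_labels = [1, 1, 0, 1, 0, 1]
expert_labels = [1, 0, 0, 1, 1, 1]

score = precision(predicted_labels, expert_labels)
print(f"precision = {score:.2f}")
```

In this toy case the classifier flags four sentences and the expert agrees with three of them. Note that precision ignores the job task sentence the classifier missed; capturing that would require recall as well.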
What did not make it into your published manuscript that you would like to share with us?
One class of techniques now increasingly applied in text classification is word embeddings. Word embeddings map each word to a vector of real numbers. The similarities among word vectors can be used to quantify and categorize the meaning of words in specific contexts. We initially planned to include a short discussion of this, but we decided against it because these techniques warrant a more in-depth discussion that goes beyond the scope of our current article. However, organizational researchers interested in recovering the context-specific meaning of words may benefit from the approach taken with word embeddings, and we recommend that they become familiar with these techniques as well.
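As a rough illustration of how similarity between word vectors can be computed, the sketch below uses cosine similarity, one common choice. The three-dimensional vectors and the words are made up for the example; real embeddings are learned from large text collections and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1 mean similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy, hand-made vectors (assumed for illustration, not learned embeddings).
embeddings = {
    "manager": [0.9, 0.1, 0.3],
    "supervisor": [0.8, 0.2, 0.4],
    "invoice": [0.1, 0.9, 0.2],
}

sim_close = cosine_similarity(embeddings["manager"], embeddings["supervisor"])
sim_far = cosine_similarity(embeddings["manager"], embeddings["invoice"])
print(f"manager vs supervisor: {sim_close:.2f}")
print(f"manager vs invoice:    {sim_far:.2f}")
```

With these toy vectors, "manager" comes out much closer to "supervisor" than to "invoice", which is the kind of pattern one would hope to see in embeddings trained on job-related text.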

Stay up-to-date with the latest research from Organizational Research Methods and sign up for email alerts today through the homepage!

This entry was posted in Organizational Research, Organizational Studies, Technology by Cynthia Nalevanko, Senior Editor, SAGE Publishing.

About Cynthia Nalevanko, Senior Editor, SAGE Publishing

Founded in 1965, SAGE is the world’s leading independent academic and professional publisher. Known for our commitment to quality and innovation, SAGE has helped inform and educate a global community of scholars, practitioners, researchers, and students across a broad range of subject areas. With over 1500 employees globally from principal offices in Los Angeles, London, New Delhi, Singapore, Washington DC, and Melbourne, our publishing program includes more than 1000 journals and over 900 books, reference works and databases a year in business, humanities, social sciences, science, technology and medicine. Believing passionately that engaged scholarship lies at the heart of any healthy society and that education is intrinsically valuable, SAGE aims to be the world’s leading independent academic and professional publisher. This means playing a creative role in society by disseminating teaching and research on a global scale, the cornerstones of which are good, long-term relationships, a focus on our markets, and an ability to combine quality and innovation. Leading authors, editors and societies should feel that SAGE is their natural home: we believe in meeting the range of their needs, and in publishing the best of their work. We are a growing company, and our financial success comes from thinking creatively about our markets and actively responding to the needs of our customers.
