About the Co-founders
Nick Adams, Ph.D.
Nick Adams is TagWorks' inventor and co-creator. He came up with the initial design of the software when he faced the challenge of closely annotating over 8,000 news articles describing events of the Occupy movement.
Nick is a proponent of hybrid text analysis approaches, and he has taught, consulted, and mentored hundreds of researchers in a range of automated and human text analysis techniques. He founded the Computational Text Analysis Working Group at UC Berkeley's D-Lab, and the interdisciplinary Text Across Domains (Text XD) initiative at the Berkeley Institute for Data Science.
Nick earned his Ph.D. in sociology from Berkeley in 2015 and is now co-Founder of Thusly, Inc. and the Founder and Director of Goodly Labs, a California nonprofit dedicated to empowering the public with the data, tools, and insights of social science.
Nick is honored to sit on the SSRC's Digital Culture committee; to have helped form the Credibility Coalition; and to serve as a BITSS Catalyst (Berkeley Initiative for Transparency in the Social Sciences), promoting sustainable open science.
Norman Gilmore
Norman Gilmore is the Chief Technology Officer for Thusly, Inc. and co-creator of TagWorks. He has a long-term interest in software that supports citizen science, visualization for exploratory data analysis, and complex problem solving, and is pleased that TagWorks brings all of those themes together.
Norman likes to lead happy teams, and believes Scrum is a good way to do that. He uses Node and React to build advanced interfaces and Django for web application development, and he deploys to AWS with Docker containers.
Norman has worked at big companies like Disney Interactive and Boeing, and at small companies that you probably haven't heard of.
The TagWorks story
Sociologist Nick Adams had a problem.
It was not a new problem. In fact, it was a problem that many scientists and researchers had encountered in the past, one that many still struggle with right now.
The problem was: how could he pull rich, intricate data from a very large corpus of documents without taking decades to do it?
Adams, a UC Berkeley political sociologist and data scientist, wanted to do a multilevel textual analysis of over 8,000 documents reporting on the 2011 Occupy movement. And his detailed conceptual scheme (itemizing all the potentially interesting information in those documents) included nearly 300 variables.
His review of similar research seemed to suggest that his goal was impossible. Past studies of contentious politics tended to be of two types. One type collected a few variables about a large number of movement events. The second type looked deeply into one or two events or movements, producing thick descriptions of everything that happened and every detail of how the movement interacted with the city and the police. In other words, there were shallow looks at a large number of documents (a large-n study) or deep examinations of a small number of documents (a small-n study).
Adams was not interested in doing a deep dive into one or two episodes or a shallow pass over all of them. “What I wanted to do with my project was to have the richness of description of a small N study and do it for the thousands of events led by 185 different Occupy encampments.
“I wanted to do something that's never been done before.”
It soon became clear why such a deep dive into a huge pile of documents hadn’t been done. The task was incredibly daunting. This kind of textual analysis is not something that computers can do because computers can’t understand the nuance, ambiguities, and contradictions of natural language the way humans do. On the other hand, the cognitive load of identifying (any of) 300 different variables in a long passage of text like a news article is significant. It’s both extremely challenging and extremely boring. “My very smart undergrads felt it was an impossible task,” Adams recalls. “Each article would take hours to complete and once you proved that you could do it, you’re kind of like ‘that was terrible and I never want to do it again.’”
Other researchers had run into this issue before and had either simplified their approach or reduced their document set. A few had retained the size and depth of their studies and had persevered, but those studies took many years. Adams was determined to wring everything he could out of the historical opportunity the Occupy movement presented, but he couldn’t afford to spend a decade of his life training and managing multiple waves of undergraduates to hand-code these documents.
Adams began to look into ways to break the job down into smaller assignments and then automate the work as much as possible. Existing tagging software required too much of annotators and still gathered too little data, so he decided to design a new solution. After much trial and error, Adams found that an assembly-line approach using the proper analytical units and tagging interfaces could efficiently break out large and complex content analysis jobs into brief, simple tasks that relatively untrained people could complete in series or in parallel.
Working with a team of software engineers in the San Francisco Bay Area, Adams created a prototype that made multi-level information extraction from a large corpus of documents possible.
Adams quickly realized that this solution would work for many others who want to ask and answer big research questions that no one has had the capacity to address before.
Adams then teamed up with veteran software engineer and Thusly Co-Founder Norman Gilmore and a team of programmers to create TagWorks, web-based software that not only decomposes a monumental effort into a series of simple tasks but also trains and manages analysts and crowd workers directly through the software. TagWorks also automatically aggregates the work of crowd workers to validate results, formalizing and exploiting intersubjective epistemology so that principal investigators can feel confident about their results without having to check each tag. This allows researchers (like you) to scale up their workforces by a factor of 100 (or more) while significantly scaling back their management role.
If you’ve dreamt of a rich, multi-level examination of a very large collection of documents but found yourself lacking appropriate tools and resources, don’t give up. Don’t reduce the scope of your study or the richness and clarity of your variables.
With TagWorks, you can have it all.