The only data labeling suite built specifically for analyzing language and training NLP AI.

TagWorks was created by a data scientist and sociologist to efficiently analyze large sets of language files in rich detail. This web-based system will help you finish your giant data labeling project up to ten times faster without hiring extra project managers or data scientists. With TagWorks, you’ll overcome the tradeoffs of first-generation content labelers, so you can extract all the information from every single file.


What can TagWorks do?


Complexity at scale

With TagWorks you can tackle complex, large-scale projects with ease. Efficiently annotate, tag and classify tens of thousands or millions of documents with hundreds of labels.


Up to 10x faster

Complete what would normally be a decade-long project in a year. TagWorks removes the need to train wave after wave of research assistants, saving you time and money.


The power of the crowd

Easily enlist thousands of crowd workers to extract the information you need. Every annotator will be tested and pre-qualified before they work on your project.


Eliminates task management

TagWorks automates worker, document, and task management, saving you hundreds of hours otherwise spent training team members and directing traffic.


Validated results

Unique interfaces make it simple for you to review and validate your results. Easily share reliability statistics and data provenance with your peers and reviewers.


Web-based

TagWorks is completely web-based, so no software installation or maintenance is required. Crowd workers and collaborators can join your project with the click of a mouse.


How does TagWorks work?

  1. Gather your thousands of documents (e.g., reports, articles, transcripts, customer success narratives, marketing trend data, internal knowledge repositories, web content, and social media posts) and upload them to your TagWorks server.

  2. Our team helps you break down your experts’ complex analysis methods into a longer assembly line of brief annotation tasks.

  3. You and your team perform a few dozen tasks to test and refine your assembly line. Your best work will establish a “gold standard” set of high-quality tags that will be used to qualify online “crowd” workers.

  4. Open your data labeling factory to thousands of online crowd workers or volunteers.

  5. TagWorks automatically finds agreement among annotators and produces the validation statistics you need to publish your results.

  6. TagWorks output labels can be used to train AI that can apply your team’s intricate expertise to incoming documents in mere seconds.
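
Steps 3 through 5 can be illustrated with a small sketch. This is not TagWorks code and does not reflect its actual algorithms: the function names, the sample data, and the 0.5 and 0.6 thresholds are all hypothetical, and a simple majority vote stands in for whatever agreement method TagWorks applies. It only shows the general idea of qualifying annotators against a gold standard and then keeping the labels they agree on.

```python
from collections import Counter

def gold_accuracy(worker_tags, gold_tags):
    """Fraction of gold-standard items the worker labeled the same way."""
    scored = [item for item in gold_tags if item in worker_tags]
    if not scored:
        return 0.0
    hits = sum(worker_tags[item] == gold_tags[item] for item in scored)
    return hits / len(scored)

def consensus(labels_by_worker, min_share=0.6):
    """Majority label per item, kept only if its vote share reaches min_share."""
    votes = {}
    for tags in labels_by_worker.values():
        for item, label in tags.items():
            votes.setdefault(item, Counter())[label] += 1
    result = {}
    for item, counter in votes.items():
        label, count = counter.most_common(1)[0]
        share = count / sum(counter.values())
        if share >= min_share:
            result[item] = (label, share)
    return result

# Hypothetical data: a tiny gold standard and three annotators.
gold = {"doc1": "protest", "doc2": "arrest"}
workers = {
    "ann_a": {"doc1": "protest", "doc2": "arrest", "doc3": "march"},
    "ann_b": {"doc1": "protest", "doc2": "march", "doc3": "march"},
    "ann_c": {"doc1": "arrest", "doc2": "march", "doc3": "arrest"},
}

# Qualify annotators who match at least half of the gold-standard tags.
qualified = {
    name: tags for name, tags in workers.items()
    if gold_accuracy(tags, gold) >= 0.5
}

# Keep only the labels that the qualified annotators agree on.
agreed = consensus(qualified)
```

In this toy run, the annotator who misses every gold-standard tag is excluded, and an item on which the remaining annotators split evenly is left out of the agreed labels rather than decided arbitrarily.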


Who uses TagWorks?


Columbia University’s History Lab is using TagWorks to annotate an archive of over 1 million diplomatic reports.


The University of Texas School of Information is using TagWorks to build an AI capable of cataloguing all the software scientists use to conduct their research.


The Public Editor project is using TagWorks to transparently assess the credibility of news articles and news organizations.


Where did TagWorks come from?


Sociologist Nick Adams wanted to better understand the complex social interactions during the Occupy movement by scrutinizing thousands of news articles. His goal was to annotate more than 8,000 articles with labels identifying more than 300 separate variables. With all these data, he would be able to refine multi-level time series models of interactions among police, protesters, and city governments. But it was not easy.


The labeling process was complex, tedious, and required the close management of many research assistants. But automated approaches to language could not come close to identifying all the information relevant to theories of protest policing. Many ambitious researchers have faced similar challenges. They always had to simplify their approach or reduce their document set.


But Adams persisted, and reached a breakthrough. He figured out how to divide the whole annotation process into simpler tasks that online workers could do without face-to-face training. The new assembly line approach could be managed by software and validated by algorithms, so researchers could stick to what they do best.

Adams built a prototype and recruited veteran software engineer Norman Gilmore. Together they’re leading the TagWorks team to make sure you can answer questions even bigger than your data.


When is TagWorks the right solution?


When it is

If you have a large (or even gigantic) set of documents and the expertise you’d like to apply to them is rather intricate or nuanced, including dozens or more labels, TagWorks is the solution you have been looking for.

When it isn’t

If you have fewer than 500 documents to analyze, or you only need to apply a handful of tags that are simple to find in your documents, a number of other tools can meet your needs.


If you’d like to learn more about the suitability of TagWorks for your project, get in touch and we can set up a free consultation.


Schedule a Chat

Interested in using TagWorks? Complete the form below and we’ll provide you with a free consultation.

We can usually meet with you via videoconference from 9am-5pm Pacific time. (Feel free to suggest other times if needed.)
What are your documents like? How many are there? Is there a fixed number or more every day/week? How intricate is the expertise you'd like applied to them? How many tags would you expect to be applied to the content in an average document? How long is the average document?