Media Organization Leverages Automated Content Tagging

Media & Entertainment

Full Story


In today’s hyper-connected world, the focus on digital media presents publishers, information providers, and media organizations with unique challenges - from media production to developing new channels and markets, all while meeting constantly changing consumer expectations and information needs.

In the US alone, people spend approximately 44% of every day interacting with media and seven out of 10 homes have some form of content streaming. While competition is fierce, the industry is exhibiting growth; global media and entertainment revenues are forecast to reach $2.4 trillion in the next few years.

Building a successful brand in this dynamic ecosystem comes with challenges. Competing in an overly fragmented marketplace forces media and publishing organizations to create more content and partner with multiple distribution platforms, which leads to even greater fragmentation. Once-loyal audiences grow frustrated as they try to find a coherent media experience across multiple platforms and devices.

One organization integrated Semaphore's Semantic AI platform into its digital ecosystem to create an automated, transparent content classification process that produces precise, consistent tags and improves search and retrieval for customers and users.


The organization's existing method of content tagging used a rules-based approach. As each article was created, tagging rules were written by hand and focused on finding literal strings of text rather than concepts. The system was manual and no longer supported, and the organization's desire to replace it with something that would improve classification and be easier to maintain led it to Semaphore.

Content tagging consisted of hundreds of thousands of manually crafted rules developed over a 15-year period. Each rule identified the relevant words, titles, and subjects found in an article. In total, the rules covered 80,524 organizations, 19,873 publicly traded companies, 60,990 titles, 4,781 descriptors, and 7,788 geography descriptors. Because of the sheer volume, rules were often duplicated or redundant.

The organization began with a proof of concept to demonstrate that Semaphore could produce precise, complete, and accurate rules, in the same way as the existing system. The challenge for Semaphore was to move the organization away from the manual rule-writing mindset to a concept- and model-driven approach that supports collaboration with subject matter experts, is easy to maintain, and is sustainable over the long haul.

The team first used Knowledge Model Management (KMM) to develop a robust model that reflects the subjects, topics, and other relevant concepts. The Lexical Enrichment Side Panel (LEX) assists in model development by suggesting synonyms and signpost terms. The Knowledge Review Tool (KRT) brings subject matter experts into the model development process: they review the model and suggest relevant concepts using the vocabulary of the business. Model developers can then review the suggested terms and incorporate them as required.

KMM, Document Analyzer, and the Classification Analysis Tool are used to test and validate rule development. Semaphore's rules language and NLP engine examine content and look for evidence of concepts, not strings. The resulting rules are derived from the model, written once, and reused across content repositories, which makes classification consistent and transparent.
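The idea of rules derived from a model can be illustrated with a minimal sketch. This is not Semaphore's rules language, NLP engine, or API; the `Concept` class, the `build_rules` and `classify` functions, and the two sample concepts are all hypothetical, and the matching here is simple synonym lookup rather than true concept detection. It only shows how one model can generate the rules, and how each tag can carry the evidence that produced it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical model concept: a preferred label plus synonyms."""
    label: str
    synonyms: list[str] = field(default_factory=list)


def build_rules(model):
    """Derive one compiled pattern per concept, straight from the model."""
    rules = {}
    for concept in model:
        terms = [concept.label, *concept.synonyms]
        # Longest terms first so multi-word terms are preferred over shorter ones.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        rules[concept.label] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return rules


def classify(text, rules):
    """Return {concept label: sorted evidence terms found in the text}."""
    tags = {}
    for label, pattern in rules.items():
        evidence = sorted({match.group(0).lower() for match in pattern.finditer(text)})
        if evidence:
            tags[label] = evidence
    return tags


# Illustrative model only; a real model would be built and reviewed by experts.
MODEL = [
    Concept("Mergers and Acquisitions", ["merger", "acquisition", "takeover", "buyout"]),
    Concept("Central Banking", ["central bank", "interest rate decision", "monetary policy"]),
]
RULES = build_rules(MODEL)

article = "The takeover bid follows last week's monetary policy announcement."
print(classify(article, RULES))
```

A handwritten rule that searched for the literal string "Mergers and Acquisitions" would not tag this article. In the sketch, adding a synonym to the model updates the generated rule without anyone editing rules by hand, and the returned evidence terms show why each tag was applied.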

Semantic Integration Services and user experience widgets are used to assist authors and editors in selecting tags not automatically applied in classification.


Today, the organization uses model management and classification technology that fully integrates with its existing systems, practices, and processes - no rip and replace.

Rules are generated automatically from the model, which is built and validated by the business and its subject matter experts. The result is precise, complete, and consistent metadata tags, with classification outcomes that are explainable and transparent. The model-driven approach reduces duplication and redundancy, which saves time, reduces costs, and frees resources for higher-value work.

Semaphore's connection with downstream systems drives analytics, supports SEO, and improves query results for a better customer and user experience.
