Data Categorization or Data Classification?

In the last few years, data classification has shifted dramatically from a “nice to have” to a “need to have”. Riding this momentum, private companies and organizations are implementing data classification using “traditional” taxonomies and schemas that worked for governments and militaries but don’t necessarily translate well into the workflow or culture of commercial enterprises.

When TITUS started over a decade ago, many of our first customers were large government and military organizations that were familiar with the concept of classification. We all remember the “secret” and “top secret” rubber stamps with red ink used to classify paper documents and files before the dawn of digital productivity tools. So when government and military customers began to deploy classification, their users were already well educated on the meanings and appropriate use of their classification taxonomies. As classification has moved into commercial enterprises, however, the template for classification has remained unchanged. As a result, many enterprises have struggled to find a way to align classification labels and policies with their own unique needs.



As private industry adopts classification, TITUS has been helping our customers adapt their taxonomies and policies for faster user adoption and more flexible security policy options.

In March 2016, Forrester Research released a report entitled “Rethinking Data Discovery and Classification Strategies”. This report pushed organizations to start thinking beyond a traditional classification taxonomy focused exclusively on sensitivity (Public, Confidential, Highly Confidential, Secret) and toward using data categories to help determine sensitivity. While some organizations might be able to adopt a standard classification taxonomy, most – particularly those that are highly regulated – struggle to trust that their users will select the right classification. Will they be able to discern when something is sensitive enough to be upgraded from “Internal” to “Restricted”? While we can present users with classification label definitions and even use automated algorithms to provide classification suggestions, there remains a feeling that assigning sensitivity is so new to users that they might not get it right.

This is where the concept of data categorization enters the discussion – rather than asking employees about the sensitivity of the data, ask employees to identify the category of the data. For example, most employees don’t know the difference between “highly confidential” and “confidential”, but they can tell you if a document contains “employee information” or “intellectual property”, or is “approved for public use”. Once the category is assigned by the user, the automated algorithms have new information that can be used (along with the information content, the user profile, and other contextual factors) to automatically assign the appropriate classification.
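To make the idea concrete, here is a minimal Python sketch of how a user-selected category could drive the classification. It is an illustration only, not a description of any TITUS product: the mapping in CATEGORY_BASELINE, the label ordering, and the escalate flag (standing in for content or contextual signals) are all assumptions made for this example.

```python
# Hypothetical mapping from a user-selected category to a baseline label.
# The categories come from the examples above; the label assigned to each
# one is an assumption for illustration.
CATEGORY_BASELINE = {
    "approved for public use": "Public",
    "employee information": "Confidential",
    "intellectual property": "Highly Confidential",
}

# Labels ordered from least to most sensitive (assumed ordering).
LABEL_ORDER = ["Public", "Confidential", "Highly Confidential"]


def classify(category: str, escalate: bool = False) -> str:
    """Return a sensitivity label for the category the user picked.

    `escalate` is a stand-in for other signals (content inspection,
    user profile, context) that might raise the label by one level.
    """
    key = category.strip().lower()
    if key not in CATEGORY_BASELINE:
        raise ValueError(f"Unknown category: {category!r}")
    label = CATEGORY_BASELINE[key]
    if escalate:
        # Move up one level, capped at the most sensitive label.
        index = min(LABEL_ORDER.index(label) + 1, len(LABEL_ORDER) - 1)
        label = LABEL_ORDER[index]
    return label


print(classify("Employee Information"))
```

The user only ever answers the question they are good at ("what kind of data is this?"), while the sensitivity decision lives in one table that the security team can review and change.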

Categorization can be simple yet powerful. Several TITUS customers have adopted categorization to help them comply with onerous regulations such as ITAR with the simplest of questions: “Does this information contain technical data, Yes or No?” If “No”, then move on. If “Yes”, then a couple more questions are presented to guide users to the right selections.
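The question flow described above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: only the first question comes from the text, while the follow-up questions and the ask callback are hypothetical placeholders, not actual ITAR guidance or a real product workflow.

```python
from typing import Callable, Dict

FIRST_QUESTION = "Does this information contain technical data, Yes or No?"

# Hypothetical follow-up questions, shown only when the first answer is Yes.
# Their wording is a placeholder for illustration.
FOLLOW_UP_QUESTIONS = [
    "Placeholder follow-up: is the data tied to a controlled program?",
    "Placeholder follow-up: will the data leave the organization?",
]


def run_questions(ask: Callable[[str], bool]) -> Dict[str, bool]:
    """Ask the gating question, then the follow-ups only if it is answered Yes.

    `ask` receives the question text and returns True for Yes, False for No.
    """
    answers = {FIRST_QUESTION: ask(FIRST_QUESTION)}
    if not answers[FIRST_QUESTION]:
        # "No": the user moves on without seeing further questions.
        return answers
    for question in FOLLOW_UP_QUESTIONS:
        answers[question] = ask(question)
    return answers


# A user who answers No sees exactly one question.
print(len(run_questions(lambda question: False)))
```

Most users answer “No” and are done after a single click; only the minority handling technical data see the extra prompts.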

Categorization is another way in which TITUS helps to make sure your classification and data identification initiatives are as simple and successful as possible.
