Getting Started with Data Products Series: Part Two
In the first part of this series, we covered the basics of data products. Now, let’s get into the four factors to keep in mind when building data products...
Four factors to keep in mind when evaluating data products:
For those interested in using data product-centric tools, the process can seem daunting. There are plenty of tools out there that promote data products, so how do you determine which option is right for you? The right tool for creating actionable data products produces analysis-ready data from raw data. It does so on a foundation of trust in the data, glossaries and data definitions, metadata (both active and passive), findability, accessibility, and actionability. One way to keep these factors in mind when considering which data platform creates the best data products is to remember the acronym SMART.
1. Security
Security is a critical aspect of any data product. Data products typically handle large amounts of sensitive data, including personal information, financial data, and business intelligence. Without cybersecurity controls (such as those described in NIST frameworks) and privacy safeguards (such as those required by GDPR and CCPA), data products can be vulnerable to cybercrime that targets them as they are managed, moved, and stored. Therefore, it's important to ensure that data products are designed with security in mind from the very beginning.
Data security is especially critical, and it is often a key concern for companies that are deciding whether to switch from existing tools that have already been vetted. As you’re evaluating a data product, here are some key questions to ask to ensure the offering is secure enough for your data.
· What data protections are included with this product?
Any data product you’re considering should include authentication and access control mechanisms to ensure that only authorized users can access sensitive data. These can include two-factor authentication, role-based access control, and password policies. In addition, sensitive data should be encrypted both in transit and at rest, using technologies such as TLS (the successor to SSL) and AES, to prevent unauthorized access.
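To make role-based access control concrete, here is a minimal sketch in Python. The role names, the action names, and the `is_allowed` helper are all hypothetical and chosen only for illustration; a real data product would define its own roles and enforce them in its own way.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical roles, for illustration only.
    ANALYST = "analyst"
    STEWARD = "steward"
    ADMIN = "admin"


# Hypothetical mapping of each role to the actions it may perform.
PERMISSIONS = {
    Role.ANALYST: {"read"},
    Role.STEWARD: {"read", "write"},
    Role.ADMIN: {"read", "write", "grant"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role's permission set includes the action."""
    return action in PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_allowed(Role.ANALYST, "read"))
    print(is_allowed(Role.ANALYST, "write"))
```

The point of the design is that access decisions depend on a user's role rather than on the individual user, which keeps permissions easier to review as teams change.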
· Does this product have any additional features in place to protect especially sensitive data?
Some data products can employ data masking techniques such as redaction, tokenization, and data substitution to obscure sensitive data and further protect it from unauthorized access.
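As a rough illustration of the three masking techniques named above, the Python sketch below shows one simple way each could work. The function names, the `SECRET_KEY` value, and the `FAKE_NAMES` list are assumptions made for this example. Note that the tokenization shown here is a keyed hash standing in for a real token service, which would typically store the mapping in a secured vault.

```python
import hashlib
import hmac

# Hypothetical key for the demo; a real system would load it from a secrets manager.
SECRET_KEY = b"demo-key-change-me"

# Hypothetical pool of stand-in values used for substitution.
FAKE_NAMES = ["Alex Doe", "Sam Roe", "Pat Poe"]


def redact(value: str, visible: int = 4) -> str:
    """Redaction: hide all but the last `visible` characters."""
    if visible <= 0:
        return "*" * len(value)
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[-visible:]


def tokenize(value: str) -> str:
    """Tokenization (simplified): replace the value with a keyed-hash token.

    The same input always yields the same token, so joins still work,
    but the original value cannot be read from the token.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def substitute(value: str) -> str:
    """Substitution: swap the value for a realistic-looking stand-in."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return FAKE_NAMES[digest[0] % len(FAKE_NAMES)]


if __name__ == "__main__":
    print(redact("4111111111111111"))
    print(tokenize("jane@example.com"))
    print(substitute("Jane Smith"))
```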
· How long is my data kept if I use this product?
To limit the damage should a data breach occur, data should be kept in a data product only for as long as necessary. Ask about the data product’s specific data retention policies to ensure that your sensitive data is not kept longer than it needs to be.
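A retention policy ultimately comes down to comparing each record's age against a cutoff. The Python sketch below shows that check under simple assumptions: records are dictionaries with a hypothetical `created_at` timestamp, and the `expired_records` helper and the 90-day window are illustrative, not taken from any particular product.

```python
from datetime import datetime, timedelta, timezone


def expired_records(records, retention_days, now=None):
    """Return the records older than the retention window.

    `records` is a list of dicts with a timezone-aware `created_at` value.
    `now` can be passed in to make the check repeatable in tests.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["created_at"] < cutoff]


if __name__ == "__main__":
    today = datetime(2024, 6, 1, tzinfo=timezone.utc)
    records = [
        {"id": 1, "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
        {"id": 2, "created_at": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    ]
    # With a 90-day window, only the January record is past retention.
    for record in expired_records(records, retention_days=90, now=today):
        print("Past retention:", record["id"])
```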
· What kind of regular security testing is done?
Security protocols should be regularly tested and updated as needed to ensure the product’s ongoing integrity as new threats arise. Confirm that the data product you’re interested in undergoes regular penetration testing, vulnerability scanning, and log monitoring.
· What data protection regulations does this product comply with?
The regulations a data product needs to comply with vary by company, but most products should already comply with major regulations like GDPR, CCPA, and HIPAA. If your company must comply with other data protection regulations, be sure to verify that the data product supports them.
Because security is a top priority for nearly every company, a data product should feature strong security measures to protect sensitive data, minimize the risk of security incidents, and build trust with users.
2. Metadata
Data products include both active and passive metadata detailing who created the data, when it was created, field types and names, and other relevant information. This metadata aids in understanding the data's context, purpose, and applicability, making it easier for businesses to use it effectively.
As you’re evaluating a data product, here are some key questions to ask to give you a more complete understanding of its metadata capabilities.
· What kind of metadata is available through this product?
Confirm that the data product you’re considering features both active and passive metadata, as both are useful. Passive metadata is the most commonly used and includes field names, business descriptions, glossary details, and domain data. Active metadata provides greater context and a more well-rounded view of the data by providing information on the freshness of the dataset, how frequently it’s updated, and other factors that help business users select the best dataset for their needs.
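One way to picture the difference is as two groups of fields on a dataset record. The Python sketch below is a hypothetical model, not any product's actual schema: the class names, field names, and the `is_fresh` helper are assumptions made for illustration. The passive fields describe the dataset, while the active fields let a user check whether it is still current.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class PassiveMetadata:
    # Descriptive information that changes rarely.
    field_names: List[str]
    business_description: str
    glossary_terms: List[str]
    domain: str


@dataclass
class ActiveMetadata:
    # Operational information that changes as the dataset is refreshed.
    last_updated: datetime
    update_interval: timedelta


@dataclass
class DatasetMetadata:
    name: str
    passive: PassiveMetadata
    active: ActiveMetadata


def is_fresh(meta: DatasetMetadata, now: Optional[datetime] = None) -> bool:
    """Return True if the dataset was updated within its expected interval."""
    now = now or datetime.now(timezone.utc)
    return now - meta.active.last_updated <= meta.active.update_interval


if __name__ == "__main__":
    sales = DatasetMetadata(
        name="monthly_sales",
        passive=PassiveMetadata(
            field_names=["region", "revenue"],
            business_description="Revenue by region, aggregated monthly.",
            glossary_terms=["revenue"],
            domain="finance",
        ),
        active=ActiveMetadata(
            last_updated=datetime(2024, 6, 1, tzinfo=timezone.utc),
            update_interval=timedelta(days=31),
        ),
    )
    print(is_fresh(sales, now=datetime(2024, 6, 15, tzinfo=timezone.utc)))
```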
· How will users need to be educated on proper search processes?
This will vary by data product. Some data products require users to have some level of data expertise to search for data successfully, which may in turn require training for general business users. Other data products feature much more intuitive search capabilities that enable users to find data by simply typing in business terms that they already use.
· Are both data cataloguing and metadata management integrated?
Historically, data cataloguing and metadata management have been disconnected, which has forced users to go to one location to find the data and another to access the descriptive information needed to understand and verify what that data contains. When considering a data product, ensure that both of these capabilities are integrated into a single product, which makes it easier to use and encourages higher adoption rates.
· What additional insights into the data are provided by the metadata?
In our personal lives, many of us have grown accustomed to “Amazon-like” experiences for finding products. Some data products can replicate this experience by integrating a crowdsourced rating system that displays user ratings and customer counts to reveal how popular a given dataset is, as well as how accurate and useful it is.
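A crowdsourced rating of this kind can be summarized with very little logic. The Python sketch below assumes, purely for illustration, that ratings arrive as `(user_id, stars)` pairs; the `summarize_ratings` helper and its output keys are hypothetical.

```python
from statistics import mean


def summarize_ratings(ratings):
    """Summarize (user_id, stars) pairs into an average and a distinct user count."""
    if not ratings:
        return {"average": None, "users": 0}
    users = {user_id for user_id, _ in ratings}
    average = round(mean(stars for _, stars in ratings), 2)
    return {"average": average, "users": len(users)}


if __name__ == "__main__":
    ratings = [("ana", 5), ("ben", 4), ("ana", 3)]
    print(summarize_ratings(ratings))
```

Counting distinct users rather than total ratings keeps one enthusiastic reviewer from making a dataset look more widely used than it is.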
By providing additional context and information about the data being analyzed, metadata can help users better understand the data and gain deeper insights into it.
Next: Factors three & four to consider when building data products...