In the past, organizations stored their data in disparate areas and analyzed it in silos. As technology advanced, organizations consolidated data into larger data sets for evaluation. Now technology is knocking down boundaries that have limited the amount of data an organization can capture and analyze, giving birth to the term big data.
With the advent of big data, questions arise about how to protect, manage, and evaluate it. Internal auditors can help their organization evaluate its big data practices by ensuring data security protections are in place, reviewing the contractual requirements for data sold or acquired, and assessing the appropriateness of analysis tools.
Raw vs. Smart Data
Big data can be divided into raw data and smart data. Raw data is the entire warehouse of data in its detailed format. Smart data is the output of the data analyses and the conclusions that are used by organizations to make decisions. As with standard data classifications, each of these two sets of data has a different value for the organization.
Over time, organizations have implemented procedures to protect their data, including access controls, encryption, tracking mechanisms, and monitoring. Some organizations have presumed that the same controls would suffice regardless of the amount of data. However, given the strategic nature of smart data, this specific data subset should be protected by tighter security controls. Organizations should treat smart data in a confidential fashion similar to their most guarded critical assets.
For example, a retailer gathers transactional customer data and consolidates it with its VIP program and other social media coverage to create a set of smart data that allows the company to predict future customer spending habits. These results can facilitate planning purchases for store goods over the next year, potentially providing a competitive advantage over other retailers. However, if this organization's smart data is stolen or leaked, then that competitive advantage is lost.
External vs. Internal Data
With the large amount of data available, many organizations purchase data from other sources or sell their data to others. For example, a credit card company's raw transactional data might have less value to that company than it has to a retailer, who can take this data and consolidate it with its own data to derive various customer perceptions or trends.
Buying and selling data poses unique risks. For the selling organization, the data needs to be scrubbed sufficiently to ensure it is generic and not identifiable to a specific person, and the organization must have sufficient legal releases or notifications in place to alert people to the potential uses of their data. For example, a health-care provider might sell its medical data to a university, which could use it along with other health-care data for medical research. Before doing so, the provider must ensure personally identifiable information has been scrubbed from the data set.
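As a rough illustration of what scrubbing might involve, the sketch below drops direct identifiers from a record and coarsens two quasi-identifiers before the data leaves the organization. The field names (name, ssn, birth_date, zip_code, and so on) are hypothetical, and removing these fields alone does not guarantee that a person cannot be re-identified; it only shows the kind of check an auditor could ask to see.

```python
# Hypothetical field names; a real data set needs its own inventory of identifiers.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone", "address"}


def scrub_record(record):
    """Return a copy of the record without direct identifiers.

    Quasi-identifiers are coarsened: birth date becomes birth year,
    and the postal code is truncated to its first three characters.
    """
    scrubbed = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "birth_date" in scrubbed:
        # Assumes an ISO-style date string such as "1980-05-17".
        scrubbed["birth_year"] = scrubbed.pop("birth_date")[:4]
    if "zip_code" in scrubbed:
        scrubbed["zip3"] = scrubbed.pop("zip_code")[:3]
    return scrubbed


def remaining_identifiers(records):
    """Return the direct-identifier field names still present in any record."""
    found = set()
    for record in records:
        found |= DIRECT_IDENTIFIERS.intersection(record)
    return found
```

An auditor could run a check like remaining_identifiers over a sample of the outbound data set and expect an empty result before the sale proceeds.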
For buying organizations, the risks revolve around their confidence in the legitimacy of the data and of its source. Companies functioning as exchanges purchase data in bulk and sell it to others, or they consolidate one organization's data with data from other organizations and return it to the original organization for use. As data progresses away from its source, it could be consolidated with other data and then sold to another organization, thereby increasing or decreasing its value. The value of the data depends on its legitimacy, which may be affected by passing through multiple sources and exchanges. Organizations must either verify the data's legitimacy or accept the risk that it might not be legitimate.
Additionally, organizations need to consider whether their competitors also are purchasing this data and using it similarly. The need is real, and those that do not keep abreast of current trends may be at a disadvantage.
Another potential risk is data quality. External data an organization has purchased may become less useful when it is combined with other data. Similarly, an organization may be concerned about whether its analysis of transactional data can reliably predict future trends. As the old saying goes: garbage in, garbage out.
Overall, organizations need to ensure that contracts and usage policies cover the quality of the data, including the time period in which it can be used, the specific source, and the type of data. Internal auditors can assist their organization by ensuring data received complies with contract terms and that the organization retains this data for only the period stated within the contract.
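The retention check described above can be sketched in a few lines. This is a minimal example under stated assumptions: the PurchasedDataset record and its fields (source, received, retain_until) are hypothetical stand-ins for whatever the contract register actually tracks.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PurchasedDataset:
    """Hypothetical register entry for a data set acquired under contract."""
    name: str
    source: str
    received: date
    retain_until: date  # last day the contract permits the data to be held


def overdue_datasets(datasets, today):
    """Return the names of data sets held past their contractual period."""
    return [d.name for d in datasets if today > d.retain_until]


register = [
    PurchasedDataset("card-transactions", "Exchange A", date(2023, 1, 10), date(2024, 1, 10)),
    PurchasedDataset("loyalty-survey", "Vendor B", date(2024, 3, 1), date(2026, 3, 1)),
]

# Data sets the organization should already have purged as of the audit date.
print(overdue_datasets(register, today=date(2024, 6, 30)))
```

Passing the audit date in explicitly, rather than reading the system clock, keeps the check repeatable when the same test is rerun later.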
What to Use and How to Use It
Big data poses a wide range of concerns, including data classification, quality, regulatory compliance, security, and resourcing. Some questions internal auditors could ask include:
- Is the current data classification system sufficient to handle the various classes of big data (i.e., raw vs. smart data)?
- What is the confidence level in the data obtained from external organizations?
- If the organization sells data to others, how confident is it that the data has been sufficiently scrubbed to avoid identifying an individual? What protections exist to limit liability in the event that shared data is used incorrectly?
- What is the confidence level in the existing measures to protect organizational data, especially if it is stored in a cloud environment? Is appropriate due diligence occurring for the cloud?
- Is the organization making the appropriate investment in technology and systems to store and maintain data, and in its data analytics capability?
- Is the organization using the results of the big data analyses to further its competitive advantage?
The thought processes surrounding analysis of large data sets may be vastly different from the organization's previous approach. For example, with big data, marketing personnel may be able to fully evaluate various marketing data, while operations managers can analyze various operations data. But big data also can enable the organization to go further by analyzing both types of data from a seamless enterprise perspective.
Organizations need to ensure their data analytics team has personnel with the skill sets and educational backgrounds to mine and evaluate such large repositories. As the use of big data rises, these data scientists will be in high demand and may garner higher salaries.
Along with hiring the right people, the organization needs to invest in the technology tools necessary to perform the analyses in a timely manner and provide the most accurate results. For example, a drug company that needs to analyze large data sets for potential drugs to treat a disease must have the appropriate technology; otherwise, the analyses may take longer or produce incorrect results. Both outcomes would be detrimental to the organization's overall competitiveness. Internal auditors can assist by confirming that the organization has performed sufficient research to select the best possible data analysis tools. This should include a review of the features, strengths, and weaknesses of possible tools.
Given today's data breaches, many organizations have increased their overall security posture. Regardless of the amount of data, organizations should be taking the appropriate precautions to protect their data assets. Accumulating a larger amount of data in one place could be more tempting for hackers. Rather than hacking into credit card and customer data stored on separate systems, criminals could find it all with less effort by breaking into one system. Therefore, organizations should ensure the centralized data stores are protected.
Protection decisions could be based on the location of the data, especially if it is stored externally. Organizations may want to use cloud services to store their data rather than invest in maintaining the storage technology themselves. Doing so may enable them to increase or decrease storage capacity and access speeds on demand as their data needs change. However, storing organizational data at a third party raises considerations about the cloud service's security controls.
Some organizations take a hybrid approach. For example, they use cloud storage for their raw data needs. Then after evaluating the data, they store the resulting smart data internally because it is of more value to the organization than the overall transactional detail. This requires the organization to protect data based on data classification or by taking a risk-based approach tied to the data's value.
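One way to picture the hybrid approach is as a lookup from data classification to required controls. The sketch below is illustrative only: the two classes mirror the raw and smart categories discussed earlier, while the policy fields and their values (location, encrypt_at_rest, access) are assumptions rather than a recommended baseline.

```python
from enum import Enum


class DataClass(Enum):
    RAW = "raw"      # detailed transactional data
    SMART = "smart"  # analysis output used for decisions


# Hypothetical policy table; actual controls depend on the organization's risk assessment.
STORAGE_POLICY = {
    DataClass.RAW: {
        "location": "cloud",
        "encrypt_at_rest": True,
        "access": "analytics-team",
    },
    DataClass.SMART: {
        "location": "internal",
        "encrypt_at_rest": True,
        "access": "restricted",
    },
}


def controls_for(data_class):
    """Look up the storage controls required for a data classification."""
    return STORAGE_POLICY[data_class]


print(controls_for(DataClass.SMART)["location"])
```

Keeping the policy in one table gives auditors a single place to compare documented controls with what is actually deployed.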
Helping Organizations Adjust
There is nothing magical about big data; it is really just a much larger data store than what organizations kept previously. However, organizations should adjust to these larger data sets by using the enhanced analysis tools on the market. New ideas regarding use and evaluation of larger data sets may change an organization's data practices and controls. As big data is still new to many organizations, auditors can advise management and IT leaders on best practices and controls.