
What is Data Quality and Why Is Data Quality Important?

07 Apr, 2021 | By Marketing Team | Published in Articles


Today, data is arguably a business's most valuable asset. Business leaders want to use the data they collect to make better decisions and, ultimately, to outperform their competition. However, data is only beneficial if it is high quality. Firms must be able to answer two questions: what is data quality, and why is it important? Data quality means data is fit-for-purpose and meets various data quality scorecard dimensions such as correctness, completeness, consistency, and timeliness. Without high-quality data, firms will struggle to improve operational efficiency, meet regulatory compliance, and make informed business decisions.

According to Gartner research, the average financial impact of poor data quality on organizations is $9.7 million per year. IBM also estimated that in the United States alone, businesses lose $3.1 trillion annually due to poor data quality (Source). Low-quality data can have severe consequences for operational efficiency, regulatory compliance, and business decision making, ultimately harming a firm's business value and eroding customer trust. Firms are realizing the critical need for an effective data quality model that prevents low-quality data from entering their pipeline at all. Additionally, through active data governance and strategic data management, firms can continuously catch and resolve inaccuracies. A firm's top priority should be to ensure end-to-end (E2E) data quality throughout its landscape.

What is Data Quality?


So, what is data quality? As mentioned, data quality can be defined as data that is correct, consistent, complete, and timely. Depending on the industry, these can mean different things. For financial services, “data quality” refers to critical data elements that affect an institution's ability to operate efficiently, meet regulatory compliance, and conduct business effectively. Managing Directors may want to know how their business flow is doing, only to find that there are challenges with their data quality. To them, it doesn't necessarily matter that ten out of two million transactions have wrong dates; what matters is the percentage that have correct dates. People at the executive, mid-level, business, and IT levels each want to know different aspects of their data quality.

Data quality consists of various ingredients and should meet the following data quality scorecard dimensions.

Correct Data
Data values agree with an identified source of correct data. Data is correct if it conforms to business rules that govern it at all points in the lifecycle, from the time it is created, enriched, and transformed, to the time it is archived or deleted. Inaccuracies quickly lead to poor decisions, and can be the downfall of an organization. If a certain trade transaction has the wrong customer field, it could significantly impact analytics and business decisions related to the client’s account.

Consistent Data
Consistent data is in a standardized format that is easily recognizable throughout the firm. There is consistency in the representation of similar attribute values. Additionally, data is synchronized throughout its various systems, applications, and workflows. If different departments in a financial institution are not naming trade fields the same way, serious confusion and misguided decisions can occur.

Complete Data
Data that is complete does not contain any gaps or missing information. Attributes are expected to have assigned values, and a record that lacks some of them can be rendered useless. If certain fields are missing (such as transaction date), any query for the records of a specific day will inevitably return inaccurate results.

Timely Data
Timely data refers to the availability and accessibility of information when it is needed. Using outdated data can lead to inaccurate and irrelevant results, and business decisions based on it may not reflect the current business climate.

Additional data quality dimensions include current, clean, compliant, unique, and valid data. Quality data is fit-for-purpose and satisfies the requirements set by service level agreements (SLAs), contracts, policies, procedures, software systems, regulators, and industry and technology standards.
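The four core dimensions above can be made concrete with a small sketch. The Python below is illustrative only: the trade records, field names, reference customer list, and one-day freshness threshold are all assumptions invented for this example, and each check is a deliberately simple stand-in for a real business rule. It reports, for each dimension, the percentage of records that pass.

```python
from datetime import datetime, timedelta

# Hypothetical trade records; every field name and value is an assumption.
TRADES = [
    {"id": "T1", "customer": "ACME", "currency": "USD",
     "trade_date": "2021-04-06", "updated": datetime(2021, 4, 7, 9, 0)},
    {"id": "T2", "customer": "ACME", "currency": "usd",
     "trade_date": "2021-04-06", "updated": datetime(2021, 4, 7, 9, 5)},
    {"id": "T3", "customer": "UNKNOWN", "currency": "EUR",
     "trade_date": None, "updated": datetime(2021, 4, 2, 10, 0)},
    {"id": "T4", "customer": "GLOBEX", "currency": "EUR",
     "trade_date": "2021-04-07", "updated": datetime(2021, 4, 1, 8, 0)},
]

KNOWN_CUSTOMERS = {"ACME", "GLOBEX"}   # stand-in for a trusted reference source
AS_OF = datetime(2021, 4, 7, 12, 0)    # moment at which quality is measured
MAX_AGE = timedelta(days=1)            # assumed freshness requirement


def is_correct(trade):
    """Correct: the value agrees with the identified source of correct data."""
    return trade["customer"] in KNOWN_CUSTOMERS


def is_complete(trade):
    """Complete: every attribute has an assigned value."""
    return all(value is not None for value in trade.values())


def is_consistent(trade):
    """Consistent: currency codes use one standardized (upper-case) format."""
    return trade["currency"] == trade["currency"].upper()


def is_timely(trade):
    """Timely: the record was refreshed within the allowed window."""
    return AS_OF - trade["updated"] <= MAX_AGE


def dimension_scores(records):
    """Return the percentage of records passing each dimension."""
    checks = {
        "correctness": is_correct,
        "completeness": is_complete,
        "consistency": is_consistent,
        "timeliness": is_timely,
    }
    return {
        name: 100 * sum(check(r) for r in records) / len(records)
        for name, check in checks.items()
    }


scores = dimension_scores(TRADES)
print(scores)
```

Reporting a pass percentage per dimension, rather than a list of individual bad records, matches the way a managing director would want to read the result: what share of transactions is usable, not which ten are wrong.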

What is a Data Quality Scorecard?

Data quality scorecards provide a mechanism to objectively measure data quality across the four dimensions of correctness, completeness, consistency and timeliness. For data quality scorecards to truly add value to data consumers, they need to be contextual.

For example, each of the contexts below should have its own out-of-the-box scorecard:

Business event context: A single data item in a single dataset. Examples: A debit entry recorded by a ledger, an insurance claim update, an amended bond trade, etc.

Dataset context: A logical collection of business events typically stored in a single table or view. Examples: A set of open insurance claims, a set of bond trades, a set of credit & debit entries in a ledger, etc.

Business transaction context: A set of business events that represent a business transaction (can span multiple datasets). Examples: An equity order that is split into multiple orders, each of which is independently executed; a wire transfer instruction that includes a debit entry from the payer account and a credit entry to the recipient account; a payment instruction that goes through a series of AML checks and is then fulfilled by a payment network.

Business flow context: A combination of datasets and data quality rules that represent a business workflow. Examples: A settlement workflow for Fixed Income securities that comprises multiple datasets, and data quality rules specific to the settlement workflow.
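To illustrate how the same rules can yield different scorecards depending on context, here is a minimal Python sketch. The rules, the `txn_id` field, and the sample debit and credit entries are hypothetical and chosen only for illustration. The sketch scores a single business event, a whole dataset, and a business transaction that spans two datasets.

```python
from collections import defaultdict

# Hypothetical data quality rules; each takes one event and returns True/False.
RULES = {
    "has_amount": lambda e: e.get("amount") is not None,
    "positive_amount": lambda e: (e.get("amount") or 0) > 0,
    "has_account": lambda e: bool(e.get("account")),
}


def event_scorecard(event):
    """Business event context: which rules pass for a single data item."""
    return {name: rule(event) for name, rule in RULES.items()}


def pass_rate(events):
    """Percentage of rule checks that pass across a collection of events."""
    results = [ok for e in events for ok in event_scorecard(e).values()]
    return round(100 * sum(results) / len(results), 1)


def dataset_scorecards(datasets):
    """Dataset context: one score per logical collection of events."""
    return {name: pass_rate(events) for name, events in datasets.items()}


def transaction_scorecards(datasets):
    """Business transaction context: group events by txn_id across datasets."""
    grouped = defaultdict(list)
    for events in datasets.values():
        for event in events:
            grouped[event["txn_id"]].append(event)
    return {txn: pass_rate(events) for txn, events in grouped.items()}


# Two wire transfers, each with a debit entry and a credit entry.
DATASETS = {
    "debits": [
        {"txn_id": "W1", "account": "A-1", "amount": 100.0},
        {"txn_id": "W2", "account": "", "amount": 50.0},
    ],
    "credits": [
        {"txn_id": "W1", "account": "B-9", "amount": 100.0},
        {"txn_id": "W2", "account": "B-7", "amount": None},
    ],
}

by_dataset = dataset_scorecards(DATASETS)
by_transaction = transaction_scorecards(DATASETS)
print(by_dataset)
print(by_transaction)
```

In this sample, both datasets look moderately healthy on their own, while the transaction view shows that all of the failures belong to a single wire transfer. That is the kind of insight a purely dataset-level scorecard would hide.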

What Causes Low-Quality Data?

Now that we have answered the question of what data quality is, it is important to understand what causes low-quality data. Data can become low-quality, and thus detrimental to operational efficiency, regulatory compliance, and business decision-making, for a variety of reasons.

A few of the most common reasons for low-quality data are highlighted below.

Data Decay
Data decays as people leave their current roles in the organization and take their company knowledge with them. As mergers occur and companies go out of business, data becomes irrelevant, lost, or inaccurate. It has been stated that databases degrade by 30% per year (Source).

Manual Entry Errors
For firms that still use manual processes, human error is inevitable, especially when users manually enter and update data.

Poor Data Integration, Migration, and Interoperability
If business rules are not implemented properly, integrating or migrating data can easily result in misplaced or low-quality data. If systems are not interoperable, manual data entry may be required, resulting in additional errors.

Complex Business Processes
Financial institutions struggle with data quality due to complex business processes and the heterogeneous nature of IT applications.

Data and Enterprise Silos
Data is often siloed within various systems, applications, and workflows. Each of these data sources can consist of scattered, outdated, and duplicate information. Firms become unable to continuously synchronize data across their landscape, resulting in serious data fragmentation and poor-quality data.

Why is Data Quality Important?

You may be asking: why is data quality important? Poor data quality can have a domino effect, causing serious issues when it comes to productivity, risk, and planning for the future. It is crucial that firms ensure high-quality data so they can reap significant benefits in the following areas.

Business Decision Making
With ongoing data quality checks, enterprises have cleaner, safer, and higher-quality data, resulting in more accurate analytics, clearer insights, and predictive advantages. Ultimately, leaders can be more confident in their business decision making.

Scalability
Firms can scale much more quickly once they have a strategic and effective data quality model in place, guaranteeing that data maintains its quality.

Operational Efficiency and Productivity
With high-quality data, firms do not have to manually comb through and verify large volumes of data, which is resource-intensive, laborious, and can often lead to false or duplicate investigations. Fewer mistakes occur, and less time is needed to fix inconsistencies.

Regulatory Compliance
Firms can submit regulatory reports confidently with data that is correct, consistent, complete, and timely. Implementing a consistent set of data formats, references, and standards across the enterprise, combined with transparent business rules, reduces risk and helps prevent fines.

Customer Trust
Customers and partners can relax knowing their data is being handled with the proper controls in place (e.g., for privacy and accuracy).

Competitiveness
Companies that fail to keep up with their evolving data and with current technologies run the risk of falling behind the competition.

New Technologies
Artificial intelligence, machine learning, and other forms of automation have incredible potential. However, these technologies cannot be beneficial if they are being fed poor-quality data.

What are the Challenges of a Current Data Quality Model?

Batch-Oriented and Bilateral Tools
Many tools measure data quality in batches at specific points (like end-of-day) for individual data sources, instead of providing metrics across the landscape. Firms currently perform multiple bilateral data quality checks, resulting in high numbers of false positives and duplicate issues.

Passive Data Governance
Firms often implement passive data governance and data management tools that may fix the data error, but not the rule or process that was causing it. Additionally, these tools may only perform data quality checks as a last step in the pipeline, instead of perpetually throughout the data’s lifecycle. By failing to cohesively monitor quality throughout the data’s lifecycle, more discrepancies and errors are easily missed.

Incomplete Data Quality Scorecards
Many current models do not measure all data quality dimensions, producing an incomplete data quality scorecard.

How Do Firms Achieve Continuous High Quality Data?

For firms to achieve continuous high-quality data, they must implement tools that perpetually identify, parse, and correct any flaws before they enter the business pipeline. Using a three-pronged approach, firms should focus on data governance, data quality, and data management, applying business rules that correct faulty data quickly and efficiently.
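As a rough illustration of catching flaws before they enter the pipeline, the Python sketch below places a simple quality gate in front of ingestion. It is not how any particular product works: the record fields, the list of allowed currencies, and the two rules are assumptions made up for this example. Flaws that can be fixed are corrected automatically, and records that still violate a rule are quarantined along with the reasons.

```python
from datetime import date

ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}  # assumed reference list


def normalize(record):
    """Correct fixable flaws: trim whitespace and upper-case currency codes."""
    fixed = dict(record)
    if isinstance(fixed.get("currency"), str):
        fixed["currency"] = fixed["currency"].strip().upper()
    return fixed


def violations(record):
    """Return the list of business rules the record still breaks."""
    problems = []
    if not record.get("trade_date"):
        problems.append("missing trade_date")
    if record.get("currency") not in ALLOWED_CURRENCIES:
        problems.append("unknown currency")
    return problems


def gate(records):
    """Split incoming records into accepted and quarantined lists."""
    accepted, quarantined = [], []
    for raw in records:
        record = normalize(raw)
        problems = violations(record)
        if problems:
            # Keep the reasons so a data steward can trace the root cause.
            quarantined.append((record, problems))
        else:
            accepted.append(record)
    return accepted, quarantined


INCOMING = [
    {"id": 1, "currency": " usd ", "trade_date": date(2021, 4, 6)},
    {"id": 2, "currency": "EUR", "trade_date": None},
    {"id": 3, "currency": "XXX", "trade_date": date(2021, 4, 6)},
]

accepted, quarantined = gate(INCOMING)
print([r["id"] for r in accepted])
print([(r["id"], why) for r, why in quarantined])
```

Recording why each record was quarantined, instead of silently dropping or patching it, is what makes it possible to fix the rule or process that produced the error and not only the error itself.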

What is an Effective Data Quality Tool?

The tools your firm chooses to help you achieve continuous data quality matter. The PeerNova® Cuneiform® Platform is an effective data quality tool that provides zero-code, end-to-end (E2E) data quality automation.

Data Quality Scorecards
The Cuneiform Platform offers individual data quality scorecards at the initial event, dataset, workflow, and business transaction levels. Firms can zoom in and out of various events or data points, producing data quality scorecards even at a micro scale. Data owners or data stewards can easily discover the reason behind the score, and what specific rules passed or failed. Individuals can tailor views that are appropriate for their needs.

Zero-Code Capabilities

With a zero-code solution, business users can easily use the platform’s self-serve interface to dynamically create, configure, and execute applications and rules without coding knowledge. The self-serve platform allows leaders to gain various insights and business knowledge for use in liquidity management, risk management, and more.


Additional Features

The Cuneiform Platform also provides the following features:

Real-Time Root-Cause Analysis
PeerNova’s solution performs root-cause analysis to manage, prioritize, and resolve exceptions as they occur, saving time and money. The solution also reduces resolution times for data quality, SLAs, and timeliness metrics, ultimately leading to better operational efficiency, regulatory compliance, and faster time to market.

E2E Visibility Across Workflows
The Cuneiform Platform unifies the fragmented enterprise landscape, providing E2E visibility and traceability. The solution breaks down silos and manages lifecycle events across workflows by enabling cross-functional collaboration, tracking the lineage of data and metadata, and building a “single source of truth”. Firms can create E2E, integrated, and active lineages across disparate systems, resulting in faster root-cause analysis and resolution of exceptions.

Confidence in Decision-Making
The platform enables business users to rapidly define and deploy data integrity policies, data quality rules, and process correctness controls, providing an array of high-quality insights they can trust. Business rules provide frameworks for continuous data quality checks and audits. Cuneiform infers and provides an array of insights so business users can appropriately identify, evaluate, and take automated or guided actions based on those findings.

Data Standardization and Harmonization
The Cuneiform Platform automates the process of acquiring data, connecting datasets, running data quality checks, generating relevant reports, and fixing errors in real-time. Firms can ensure the consistency of data formats and business logic. Additionally, the platform enforces the canonicalization of data, and also ensures that compressed or netted data/raw data is consistent across the organization.

Positive Operational Impact
The solution consistently runs data quality checks to identify bottlenecks, preventing the need for manual investigations and duplicate efforts.

Active Data Governance
The PeerNova Cuneiform Platform uses active data governance to ensure continuous E2E data quality by perpetually applying Data Quality and Timeliness rules across live data. It also automatically builds dictionaries, glossaries, and catalogs, thereby managing risk and monitoring change management. Through automation and ongoing data quality checks, the solution not only proactively fixes existing data issues, but also continuously manages and monitors data throughout its entire lifecycle.

For a deeper understanding of how to achieve continuous high-quality data, check out PeerNova’s data quality white paper. Please reach out to one of our team members for more information or for a demo of the Cuneiform Platform.

Sources:

Why Bad Data Could Cost Entrepreneurs Millions. April 2019.

Why Data Decays So Fast. Innovation Enterprise Channels.



