The company is currently carrying out a data governance initiative and plans to procure an external data asset management platform to host its data assets. To find a suitable product, we first need to understand the company's own pain points and then evaluate the functions of each competing product against them. This article follows that approach.
1. Pain points and needs
In actual data management, the following problems are often encountered:
Inconsistent data language: different business systems define the same indicators or fields differently, and there are no unified data naming conventions or standards
Data cannot be found or understood: data comes from multiple sources, and analysts and engineers do not know where the data they need is located or how it was processed, so the company's information assets cannot be clearly inventoried
Untrustworthy data: without data quality controls and evaluation methods, there is no way to ensure data accuracy, consistency, validity, etc.
Unconnected data: "chimney-style" (siloed) development means data is neither shared nor circulated, so cross-domain data analysis and data innovation cannot be achieved
From the problems above, the following requirements are derived:
Data standards: establish unified data specifications and naming conventions so that indicator definitions stay consistent
Metadata: Build data asset maps, including metadata management, lineage and impact analysis, asset catalogs, etc.
Data quality: Establish data quality rules and quality monitoring mechanisms to help users discover data quality problems in a timely manner
Data security: establish a data security system covering access control, data masking (desensitization), encryption, etc.
Master data management: establish a master data model and a master data management process to enable cross-departmental and cross-system data integration
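To make the data quality requirement above more concrete, here is a minimal sketch of what "quality rules plus monitoring" could look like in code. It is an illustration only and is not taken from any of the evaluated products: the names `QualityRule`, `run_rules`, `duplicate_indexes`, the sample columns `customer_id` and `email`, and the sample rows are all hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical record format: one row is a dict of column name -> value.
Row = dict


@dataclass(frozen=True)
class QualityRule:
    """A named check applied to a single column (illustrative structure)."""
    name: str
    column: str
    check: Callable[[object], bool]


def not_null(value: object) -> bool:
    """Completeness: the value must be present and non-empty."""
    return value is not None and value != ""


# Deliberately simple pattern; real email validation is more involved.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value: object) -> bool:
    """Validity: the value must be a string shaped like an email address."""
    return isinstance(value, str) and EMAIL_RE.match(value) is not None


def run_rules(rows: list[Row], rules: list[QualityRule]) -> dict[str, list[int]]:
    """Return, for each rule name, the indexes of the rows that violate it."""
    violations: dict[str, list[int]] = {rule.name: [] for rule in rules}
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row.get(rule.column)):
                violations[rule.name].append(index)
    return violations


def duplicate_indexes(rows: list[Row], key: str) -> list[int]:
    """Uniqueness: return indexes of rows whose key value was already seen."""
    seen: set[object] = set()
    duplicates: list[int] = []
    for index, row in enumerate(rows):
        value = row.get(key)
        if value in seen:
            duplicates.append(index)
        else:
            seen.add(value)
    return duplicates


# Hypothetical sample data and rule set.
SAMPLE_ROWS = [
    {"customer_id": "C001", "email": "a@example.com"},
    {"customer_id": "C002", "email": ""},
    {"customer_id": "C001", "email": "not-an-email"},
]

RULES = [
    QualityRule("customer_id_not_null", "customer_id", not_null),
    QualityRule("email_valid", "email", looks_like_email),
]

if __name__ == "__main__":
    print(run_rules(SAMPLE_ROWS, RULES))
    print(duplicate_indexes(SAMPLE_ROWS, "customer_id"))
```

A real platform would typically run such rules on a schedule against registered tables and raise alerts when the violation count crosses a threshold. When comparing products, it helps to ask how rules are defined, scheduled, and reported rather than relying on this sketch.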
2. Competitive analysis
This section analyzes four data asset platform suppliers: A, B, C, and D. It first gives an overview of each product in terms of its data governance system, then analyzes and compares the core functional modules, and finally draws a conclusion.
2.1.1 Product Information Architecture of Data Asset Management Platform
Supports access to and management of relational databases such as Oracle, MySQL, and SQL Server, as well as MongoDB and big data stores such as Hive, HBase, and HDFS; supports supplementing data via Excel; and provides unified collection of structured and unstructured data.
Metadata is customizable. The system collects metadata automatically (with incremental updates) and supports retrieving and maintaining metadata down to the field level. When the data model changes, the change is detected dynamically and a change-detection log is generated.
Lineage analysis: supports automatic an