Master Data Management (MDM)

  • Serves as the authoritative source for applications and analytics
    • Have to have a way to track all the different process
    • Ideal – Application integration
  • Single Version of all Entities
  • Must be owned by each Business Unit
  • Must have versioning and auditing Capabilities –Data Lineage.
  • Must be able to handle hierarchies
  • Must have a governed process (cleaning process)

BI2.1

Master Data Maintenance

  • All applications are rewritten or subscribed to the Master Data Hub. Applications are changed to encompass new schema.
  • Master lists are supported by all applications and changes are pushed to each master list, but may not be directly modified by the client
  • Continuous, where master data is changed at the source then modified at the MDM Hub
  • Data Structures similar to Dimensional Model. Some use it.
  • Distributing information from a transaction out to other related applications.

Match.com buys eHarmony –How do they merge their user data together?

Match.com  (BUILT ON SQL SERVER) Eharmony  (BUILT ON ORACLE 11g)
Separate Last name, first name, middle name Single name field
Eye color
Religious background
Address at Apartment from 1/12/11 Address from house at 11/2/10

 

Master Data Management vs Data Warehousing
Based on the discussions so far, it seems like Master Data Management and Data Warehousing have a lot in common. For example, the effort of data transformation and cleansing is very similar to an ETL process in data warehousing, and in fact they can use the same ETL tools. In the real world, it is not uncommon to see MDM and data warehousing fall into the same project. On the other hand, it is important to call out the main differences between the two:

1) Different Goals

The main purpose of a data warehouse is to analyze data in a multidimensional fashion, while the main purpose of MDM is to create and maintain a single source of truth for a particular dimension within the organization. In addition, MDM requires solving the root cause of the inconsistent metadata, because master data needs to be propagated back to the source system in some way. In data warehousing, solving the root cause is not always needed, as it may be enough just to have a consistent view at the data warehousing level rather than having to ensure consistency at the data source level.

2) Different Types of Data

Master Data Management is only applied to entities and not transactional data, while a data warehouse includes data that are both transactional and non-transactional in nature. The easiest way to think about this is that MDM only affects data that exists in dimensional tables and not in fact tables, while in a data warehousing environment includes both dimensional tables and fact tables.

3) Different Reporting Needs

In data warehousing, it is important to deliver to end users the proper types of reports using the proper type of reporting tool to facilitate analysis. In MDM, the reporting needs are very different — it is far more important to be able to provide reports on data governance, data quality, and compliance, rather than reports based on analytical needs.

4) Where Data Is Used

In a data warehouse, usually the only usage of this “single source of truth” is for applications that access the data warehouse directly, or applications that access systems that source their data straight from the data warehouse. Most of the time, the original data sources are not affected. In master data management, on the other hand, we often need to have a strategy to get a copy of the master data back to the source system. This poses challenges that do not exist in a data warehousing environment. For example, how do we sync the data back with the original source? Once a day? Once an hour? How do we handle cases where the data was modified as it went through the cleansing process? And how much modification do we need make do to the source system so it can use the master data? These questions represent some of the challenges MDM faces. Unfortunately, there is no easy answer to those questions, as the solution depends on a variety of factors specific to the organization, such as how many source systems there are, how easy / costly it is to modify the source system, and even how internal politics play out.

 

Service Oriented Architecture (SOA)

  • Also referred to as an Enterprise Service Bus (ESB) or Master Data Hub (MDH)
  • Serves as middleware that connects different applications together
  • SOA is event-driven
  • Fires an event to do something  (Completion event triggers an update in a related table)
  • Translates values between different tables  (i.e. units of measurement)
  • Visio swimlanes are one way to set this up

BI2.2

Looking at this with Nonattendees in mind  (Similar to Finish-Start relationships):

Nonattendee Not in class
Faculty Marked online
Deans
Student AR Mark DNP Add Comments Send DNP Report
Fin Aid Fin Aid Hours Adjusted
ARO Drops Class

In Self Service, IT are getting out of standard report writing.  Business users are closer to the data than anyone else and feel more comfortable working with the data as directly as possible.