Data Engineer : What is DataMart

Prem Vishnoi(cloudvala)
2 min readFeb 20, 2024

A Data Mart is a subset of a data warehouse, oriented to a specific business line or team

It contains a segment of the organization’s data specifically designed to focus on a particular area (such as sales, finance, or marketing) or a specific data analysis requirement. Data marts are created to provide business users with access to the data they need in a format that is relevant to them, making it easier to generate insights and reports specific to their domain.

The key characteristics of a data mart include:

  • Subject-Oriented: Focuses on a single subject area or line of business, so the content is specifically designed to meet the needs of a particular group of users.
  • Simplified Access: Because it is more focused, a data mart offers a simpler view of an organization’s data, which is easier for end-users to access and understand.
  • Performance: Data marts can improve query performance by reducing the volume of data to be processed. Since they contain a smaller, more relevant subset of data, queries can run faster than against a larger, more comprehensive data warehouse.
  • Ease of Use: Tailored to specific user needs, data marts often use a simpler schema that is easier for business users to navigate. This can empower non-technical users to perform their data analysis.
  • Lower Cost and Faster Implementation: Compared to a full-scale data warehouse project, data marts require less time and resources to develop and deploy. This can make them an attractive option for organizations looking to quickly provide business units with analytical capabilities.

DataMart creation, there will be new account data of customer shared from Open Banking. Bank may access customer account data in other banks once we got consent from Customer. Once consent removed, data would be removed from source table. There is a need to have a DataMart which may accelerate analysis of customer asset in other banks.

task description

ETL Scripts

  • For analysis better understanding data without accessing sensitive data, create an ETL logic to create and maintain segregated key for Real customer id and Real account number.
  • Create a table at customer level has information below,
  • Customer first date of consent for each bank, and exit date of consent for each bank.
  • Customer’s monthly average balance holding for each bank in HKD. Average shall be calculated start from the date customer give consent to corresponding month-end, excluding Sunday.
  • Identify the key competitor bank for each customer which has highest average balance. Create a table to record key competitor bank changes events.
  • Create any table you think is valuable for analytical purpose / supporting above requirements.
  • Readiness table to tell data user when the data is ready to be used

Data flow chart and ER diagram

  1. Create a simple data flow chart on how data was processed from source tables.
  2. Create a ER diagram of tables to be exposed to data users.

Deliverables

1 — ETL Scripts

2 — Data flow chart

3 — ER diagram

--

--

Prem Vishnoi(cloudvala)

Lead Data Engineer experienced in designing, implementing, and managing large-scale data infrastructure. Skilled in ETL, data modeling, and cloud computing.