Cognos to Power BI Migration with Snowflake Integration

Cognos to Power BI Migration with Snowflake Integration-feature
Cognos to Power BI Migration with Snowflake Integration
Zeeshan Mukhtar Global Head
Zeeshan Mukhtar
Global Head

April 4, 2025

Abstract
This Article outlines the technical migration process of Cognos reports and data infrastructure to modern cloud-based services, specifically transitioning reporting functionalities to Power BI and integrating Snowflake for data warehousing. The document details the architecture, challenges, resolution strategies, and implementation phases involved in modernizing legacy reporting systems. Key goals include improving scalability, performance, and business insights through centralized analytics and cloud-native tools.
Introduction

The data processing architecture plays a critical role in handling external data sources and ensuring efficient reporting for business intelligence. Over time, the data flow and ETL processes have evolved to support better performance, scalability, and advanced analytics capabilities. This document provides an in-depth technical analysis of the improvements made in the data file processing pipeline and the transition from Cognos-based reporting to Power BI and Snowflake.

Objective

Prior to the dashboard project, the data file processing pipeline was structured around an FTP-based ingestion system with ETL processing feeding into SQL-based reporting systems. Cognos was utilized for visualization and business processes. However, as the complexity and volume of data increased, performance issues, scalability and maintainability challenges happened.

Overview

The migration was aimed at replacing Cognos reports and dashboards with scalable and interactive Power BI dashboards. Alongside this transition, Snowflake was chosen as the centralized cloud-based data platform to handle reporting data.

Architectural Evolution

Before the Dashboard Project

  • High licensing costs with legacy Cognos setup.
  • Limited interactivity in dashboards and reporting.
  • Manual data refresh processes.
  • Difficulty integrating with modern data platforms.
  • Data was ingested from external sources (VTrak, iTrade, Oracle AP, CASS) via FTP.
  • ETL processed the data into SQL databases.
  • Processed data was utilized for Cognos reporting and further pushed to the eCPS portal.
  • Limited scalability and manual intervention were required for reporting updates.
Before Migration Arch
  • External Data Sources: Data originates from various external systems, representing different file types or applications (e.g., Excel, CSV, potentially application-specific formats). These sources feed data into a “File Server.”
  • File Server: This acts as a central repository for the incoming data files.
  • ETL Server: A dedicated server hosts the “ETL Process” (Extract, Transform, Load). This process is responsible for:
  1. Extraction: Reading data from the File Server.
  2. Transformation: Data cleaning, validating, and transforming the data into a uniform format suitable for the target database.
  3. Loading: Writing the transformed data into the “Web Portal Databases.”
  • Web Portal Databases: This section represents databases that store the processed data, representing a specific view derived from the database tables. Data flows from the ETL Process to these databases.
  • SQL Server: Microsoft SQL Server instance is utilized for further data processing and querying. Data from the Web Portal Databases is accessed and potentially further transformed here.
  • Web Portal: This represents an application or portal that consumes data processed by Cognos Metric Studio, likely displaying key performance indicators and reports to end-users.

New Architecture Overview

  • Data Sources: Multiple internal databases, third-party APIs, and file-based systems.
  • Data Staging: Data is pushed via SFTP to a staging server.
  • ETL Processing: Data is ingested and transformed in Snowflake.
  • Reporting: Power BI is used for creating dashboards and sharing insights.

Power BI Transition

Environment Setup

  • Power BI Service Setup: Provisioned workspaces for Dev, UAT, and Production.
  • Data Gateways: Installed and configured on-premises data gateway for hybrid connectivity.
  • Access Management: Integrated with Azure AD for SSO and role-based access.

Configuration Steps

  • Provision Power BI Workspaces in Power BI Service.
  • Install On-Premises Data Gateway:
  1. Install on a dedicated VM.
  2. Register with Power BI service.
  3. Connect to local databases or shared storage.
  • Dataset Creation in Power BI Desktop:
  1. Connect to Snowflake using native connector.
  2. Import required tables or use DirectQuery.
  3. Apply measures using DAX.
  • Publishing and Scheduling:
  1. Publish report to Power BI workspace.
  2. Set refresh schedule via Gateway.
  3. Set access permissions using AD groups.
After Migration PowerBI
  • External Data Sources and File Server: These components remain largely the same as in the previous architecture, with external data feeding into a central File Server.
  • ETL Server and ETL Process: The ETL process still exists on a dedicated server and performs the same core functions of extraction, transformation, and loading into the “Web Portal Databases.”
  • Web Portal Databases: Similar to the previous architecture, these databases store the initial processed data.
  • SQL Server (Intermediate Staging): A significant change is the introduction of a dedicated SQL Server instance for staging and further transformation. Data from the Portal Databases is loaded into this SQL Server. The diagram shows several tables within this SQL Server: Supplier_Master, Product_Dim, Customer_Dim, Calendar_Dim, and a “Fact Table.” This indicates the implementation of a more structured data warehouse .
  • ETL Jobs (SSIS): Indicates the use of SQL Server Integration Services (SSIS) for the data tranmission and transformation within this SQL Server environment.
  • Data Warehouse Server: A separate server was dedicated to host the “Data Warehouse.” Data from the intermediate SQL Server is loaded into this Data Warehouse.
  • Cognos Reports: The reporting layer now utilizes “Cognos Reports,” suggesting a broader use of IBM Cognos for generating various types of reports beyond just metrics. Data is sourced from the Data Warehouse Server.
  • Engine Reports: Another reporting mechanism, “Engine Reports,” is also present, potentially using a different technology or serving a specific purpose. It also draws data from the Data Warehouse Server.
  • aCTS Portal: This portal continues to consume processed data, likely now sourced from the Cognos Reports or Engine Reports.
  • Power BI Development: This indicates the initial introduction of Microsoft Power BI for development purposes, potentially for creating new dashboards and visualizations. Data is likely being sourced from the Data Warehouse Server for this development effort.
Snowflake Integration Details

Integration Strategy

  • Consolidate all structured/unstructured data to Snowflake.
  • Create stage tables for raw ingestion.
  • Transform using Snowflake SQL or Snowpark for advanced logic.

Snowflake Setup Steps

  • Create Snowflake Account under the enterprise agreement.
  • Define Warehouses for separate workloads: INGEST_WH, TRANSFORM_WH, BI_WH.
  • Create Databases and Schemas:

CREATE DATABASE reporting_data;

CREATE SCHEMA reporting_data.staging;

CREATE SCHEMA reporting_data.analytics;

  • Ingest CSV or JSON from SFTP location:

CREATE STAGE my_stage

 URL=’sftp://ftp.example.com/incoming/’

STORAGE_INTEGRATION = my_sftp_integration;

COPY INTO reporting_data.staging.sales_data

FROM @my_stage/sales.csv

FILE_FORMAT = (TYPE = ‘CSV’, SKIP_HEADER = 1);

  • Transform Data for Reporting:

CREATE OR REPLACE TABLE reporting_data.analytics.final_sales AS

SELECT region, product, SUM(amount) AS total_sales

FROM reporting_data.staging.sales_data

GROUP BY region, product;

After Migration SnowFlake
  • External Data Sources and File Server: These remain consistent.
  • ETL Server and ETL Process: The core ETL process for loading into the Portal Databases is still present.
  • Web Portal Databases: These continue to store the initial processed data.
  • SQL Server (Staging and Transformation): The intermediate SQL Server for staging and transformation remains, utilizing SSIS jobs.
  • Data Warehouse Server: The dedicated Data Warehouse Server persists.
  • ETL Jobs (SSIS): SSIS is still used for data movement into the Data Warehouse.
  • Data Warehouse (Star Schema): The Data Warehouse with “(Star Schema),” confirming the adoption of this dimensional modeling technique. It also shows an “ETL Job” moving data into this schema.
  • Cognos Reports: Cognos continues to be used for reporting, likely drawing data from the Data Warehouse.
  • aCTS Portal: The portal continues to consume reports.
  • Power BI Development: The Power BI development environment will remain available for connection with Data Warehouse.
  • Power BI Service: indicates the deployment of Power BI dashboards and reports to a centralized platform for high volume data. Data for the Power BI Service comes from the “Data Warehouse.”
  • SmartFolder Database: This new component suggests a separate database used by the Power BI Service, potentially for storing metadata, configurations, or pre-aggregated data for optimized dashboard performance.

Key Architectural Changes and Observations:

Shift towards a Data Warehouse

The architecture evolves from a more transactional/operational database structure (Web Portal Databases) to a dedicated Data Warehouse with a star schema, optimized for analytical reporting.

Use of ETL Tools

The specific mention of SSIS highlights the adoption of a dedicated ETL tool for managing data integration processes within the staging and data warehouse layers.

Layered Architecture

The architecture becomes more layered, separating data ingestion, initial processing, staging, data warehousing, and reporting into distinct components.

Adoption of Power BI

The integration of Microsoft Power BI shows a move towards latest data visualization and improved BI capabilities.

Centralized Reporting

The Power BI provides a centralized platform for retrieving and interacting with dashboards and reports.

Security and Compliance
  • Data Encryption: Enabled at rest and in transit.
  • Role-Based Access: Implemented in both Snowflake and Power BI.
  • Audit Logs: Maintained via Power BI and Snowflake activity history.

Results of Migration

  • Improved performance and interactivity in dashboards.
  • Easier user adoption due to Power BI’s user-friendly interface.
  • Reduced manual effort due to automated refresh and pipeline.
  • Centralized data management using Snowflake.

Key Takeaways

The transition from Cognos to a modern BI and cloud data stack brought numerous technical and business benefits:

  • Modernized Architecture: The move to cloud-native technologies simplified infrastructure management and removed dependencies on legacy systems.
  • Faster Report Generation and Insights: Power BI’s in-memory analytics engine enabled near real-time dashboarding with dynamic drill-down capabilities.
  • Enhanced Security and Governance: By using these platforms, the addition of Role-based access control (RBAC), encryption at rest/in-transit, and audit logging provided enterprise-grade security.
  • Reduced Operational Overhead: The automated data refresh, incremental loads, and serverless data processing reduced the need for manual intervention and constant monitoring.
  • Integration and Extensibility: Data operations were streamlined through Seamless connectivity between Snowflake, Power BI, and other services.
  • Cost Optimization: Subscription models for Snowflake and Power BI allowed cost-efficient resource scaling and better ROI.
Final Words

The migration successfully transitioned Cognos reporting to a latest BI stack using Power BI and Snowflake. The latest solution is scalable, secure, and aligned with current data architecture practices.

 

The improvement of data file processing has improved data management, reporting, and analytics capabilities for high volume data. After migration, this transformation enhances scalability, performance, and maintainability, positioning the system for future growth and innovation.

Contributor

Sarfaraz Shaikh

Talk With Our Expert

    [recaptcha]

    Recent Blogs
    • MQ and Kafka Integration: Three Coexistence Patterns That Work
      Websites used to be something you built once and basically forgot about. That doesn’t work …
      Read More »
    • Upgrading to Optimizely CMS 13: What Your Team Actually Needs to Decide Before Writing a Line of Code
      Learn how to plan an Optimizely CMS 13 upgrade with .NET 10, Optimizely Graph, Visual …
      Read More »
    • AI Meeting Notes: Automating Summaries and Action Items from Video Content
      Learn how AI meeting notes automate summaries, action items, and insights from video meetings using …
      Read More »