Executive Summary
Penn Foster has been on a mission to help people launch, accelerate, and thrive in their careers for more than 130 years. It is an accredited, leading provider of online education, offering flexible and affordable programs for students seeking to advance their careers or pursue new opportunities. Together with hundreds of leading organizations, Penn Foster is working to bridge the gap between education and economic opportunity.
After modernizing its data infrastructure by migrating from Snowflake to the Databricks Data Intelligence Platform and enabling the integration of machine learning and AI technologies, the group now has the data foundation it needs to continue AI innovation for its students. As it moves into competency- and skills-based education, the goal is to leverage generative AI to implement a content model that provides a personalized learning experience tailored to each learner’s needs and evolving with the learner’s behavioral data. This is expected to drastically reduce the time needed to create content, which typically takes about 4 months.
While Penn Foster Group is excited to develop AI innovations in the near future, it also wanted a governance layer that ensures this work is done in a compliant manner, hence the need to modernize data governance by upgrading to Unity Catalog. Unity Catalog will be a core foundation of all its future AI projects and will help dramatically speed up machine learning model deployment while staying compliant.

Our relationship with Koantek has always been a partnership, not a dictatorship. We value our partnership with Koantek as they provided exceptional engineers for our projects and they truly strived to understand our business and all our needs. Their in-house experts on ML/AI, DevOps & QA helped move our team forward by adopting some best practices. Databricks Unity Catalog has become the core foundation for our cutting-edge AI projects. It has helped the governance layer that helps us ensure we’re doing everything in a compliant manner and it will dramatically speed up our deployment of machine learning models in future. Partnering with Koantek has been a fantastic journey for us.
Stephen Tiepel,
Senior Director of Data at Penn Foster
Challenges
● Lack of granular data governance: The team had to create disparate workspaces in each environment, which made it hard to track data lineage and to know which data lived where.
● Managing data access: Without visibility at the user, row, and column level, it was difficult to see who had what access and to maintain unified access control across multiple workspaces.
● Dependency on manual intervention: Each data request required a manual step to share the data across workspaces.
● Auto scaling: Clusters lagged every time they tried to auto scale in the legacy environment.
How Koantek Responded - Solution
Koantek collaborated closely with Penn Foster to assess its data governance needs and, once the project scope was understood, deployed resources. Koantek then implemented a comprehensive Unity Catalog migration in phases, with each phase including thorough testing and validation.
● Environment Consolidation
○ Created three new environment-specific workspaces (Dev, QA, Prod) in the DATA subscription
○ Implemented proper SDLC practices with source control integration
○ Established CI/CD pipelines for automated notebook deployments
● Modernized Security Model
○ Implemented centralized account-level identity management
○ Created role-based access control with granular permissions
● Data Governance Framework
○ Designed three-level namespace structure (catalog.schema.table)
○ Organized data assets by lifecycle stage (Bronze, Silver, Gold)
○ Implemented comprehensive audit logging and lineage tracking
● Phased Migration Approach
○ Phase 1: Core data ingestion and Bronze layer
○ Phase 2: Silver layer and intermediate processes
○ Phase 3: Gold layer and advanced analytics
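To illustrate how the three-level namespace and role-based access control described above fit together, here is a minimal Python sketch. It only builds Unity Catalog SQL strings; it does not connect to Databricks. The catalog, table, and group names are hypothetical, and using the lifecycle stage (bronze, silver, gold) as the schema level is an assumption for illustration, not a description of Penn Foster’s actual layout.

```python
from dataclasses import dataclass
from typing import List

# Lifecycle stages named in the case study; treating them as schema
# names is an assumption made for this sketch.
LAYERS = ("bronze", "silver", "gold")


@dataclass(frozen=True)
class TableRef:
    """A table addressed by the three-level namespace catalog.schema.table."""
    catalog: str  # hypothetical, e.g. one catalog per environment
    schema: str   # here: the lifecycle stage (assumption)
    table: str    # hypothetical table name

    def fqn(self) -> str:
        # Fully qualified name in catalog.schema.table form.
        return f"{self.catalog}.{self.schema}.{self.table}"


def read_grants(ref: TableRef, group: str) -> List[str]:
    """Build the SQL statements that let `group` read `ref`.

    In Unity Catalog, reading a table requires USE CATALOG and
    USE SCHEMA on the parent objects in addition to SELECT on the table.
    """
    if ref.schema not in LAYERS:
        raise ValueError(f"unknown lifecycle stage: {ref.schema!r}")
    return [
        f"GRANT USE CATALOG ON CATALOG {ref.catalog} TO `{group}`",
        f"GRANT USE SCHEMA ON SCHEMA {ref.catalog}.{ref.schema} TO `{group}`",
        f"GRANT SELECT ON TABLE {ref.fqn()} TO `{group}`",
    ]


if __name__ == "__main__":
    # Hypothetical table and group, for illustration only.
    ref = TableRef(catalog="dev", schema="gold", table="enrollments")
    for statement in read_grants(ref, "analysts"):
        print(statement)
```

Granting to groups rather than individual users keeps permissions tied to roles, so access can be reviewed in one place instead of workspace by workspace.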
Outcome
● Single workspace: Everyone accesses data through a single workspace. Unity Catalog governs data access through centralized identity management, which reduces manual intervention and improves data governance, addressing the earlier lack of granular data governance.
● Cost savings: The move to serverless compute is expected to increase cost savings.
● Improved end-user experience and auto scaling: Going serverless also improved cluster start-up time and lets the team leverage auto scaling.
● Improvements for MLflow and Feature Store: Penn Foster Group looks forward to leveraging these to roll out a stack of ML models in 2025.