My responsibilities: Design & Research Lead
Project timeline: GA: Q3 2024, Post-GA Improvements: Q1 2025
A new product offering that connects data science and machine learning operational needs to GitLab's existing DevSecOps workflows and infrastructure.
The core need: Design a platform that lets users complete key data science workflows in-product and that interacts seamlessly with existing DevOps infrastructure.
GitLab's core product was purpose-built for engineering teams, giving them a single, secure place to manage their entire development workflow. However, teams operating on similar infrastructure - specifically those using machine learning models - could use GitLab's repos and CI/CD pipelines, but had no native way to store, version, or manage the models they were building and training. This pulled customers off-product and forced a reliance on external tools that broke the unified workflow GitLab was built to provide.
Our team set out to close that gap. The challenge was twofold: first, build the foundational infrastructure for a model registry and experiment tracking directly within GitLab; and second, integrate those new systems so they communicate seamlessly with the existing DevOps infrastructure.
Model Registry: A repository where users can manage all aspects of their production models, including metadata, analytics, governance, versioning, and artifacts.
Model Experiments: A 'sandbox' repository for models that are being trained, tuned, or tested. Users can train new and existing models on test and production data, and fine-tune different configurations before promoting runs as production versions.
While familiarizing myself with the data science space, I conducted user interviews with 30 data science professionals to better understand how they might integrate MLOps into their existing workflows. These moderated interviews shaped the MLOps product strategy, including a key integration with MLflow.
Early on in the project, I identified our own internal Data Science team as a key stakeholder in this effort. Through interviews with the team, I gained valuable insight into their needs for the system and how they were using our core product at the time.
In my initial low-fidelity explorations for this project, I mapped models and versions onto the existing core workflows on GitLab, which used Epics, Issues, and Tasks as work items. The relationships seemed similar: Models contained Versions of that model, and the versions were where the experimentation and 'work' happened. As such, I adapted the existing epic, issue, and task pages to accommodate this new need for models and versions.
Through my conversations with our internal Data Science team and outside data science professionals, my assumed parallel between Epics = Models > Issues = Versions was validated, but with one major change.
Models and Versions were often treated as 'official' records of a working model. This meant that they stored the key information needed to reproduce results, unique to that specific version. However, it also meant they were typically treated as immutable.
My previous assumption that Versions were where tweaks, tests, and training happened was wrong.
Essentially, the Model Registry needed to act as an archive and record of production models. We still needed a staging environment where users could tweak configurations and test candidates before promoting one to the registry as a production-level version.
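The new platform would include both pieces: the Model Registry as the record of production models, and Model Experiments as the staging environment.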
Getting to GA required multiple rounds of iteration across both the Model Registry and Model Experiments, driven by feedback from external customers and our own internal data science team. These conversations surfaced several meaningful shifts in product strategy that shaped the final release.
Once a model candidate was trained and validated in Model Experiments, users needed a clear path to promote it into the Registry as an official production version. I redesigned the promotion and creation flows to enable this end-to-end handoff, mirroring familiar patterns from elsewhere in GitLab to keep the experience intuitive.
Artifacts were also relocated from the model card to individual version cards, a change driven by the reality that each version could share artifact filenames but contain meaningfully different files. Keeping them at the version level eliminated ambiguity and gave users a cleaner source of truth.
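The relationships described above can be sketched as a small data model. This is an illustrative sketch only, not GitLab's implementation: the class names (`Candidate`, `ModelVersion`, `Model`), the `promote` method, and the example model name are all hypothetical. It assumes three things taken from the case study: versions are immutable records, candidates are promoted from an experiment into the registry, and artifacts belong to a version rather than to the model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class Candidate:
    """A run recorded in Model Experiments (hypothetical shape)."""
    name: str
    params: Mapping[str, float]
    artifacts: Mapping[str, bytes]


@dataclass(frozen=True)
class ModelVersion:
    """An immutable production record. Artifacts live here, not on the model."""
    version: str
    params: Mapping[str, float]
    artifacts: Mapping[str, bytes]


@dataclass
class Model:
    """A registry entry that only ever gains new versions."""
    name: str
    versions: dict[str, ModelVersion] = field(default_factory=dict)

    def promote(self, candidate: Candidate, version: str) -> ModelVersion:
        # Existing versions are never overwritten; a change means a new version.
        if version in self.versions:
            raise ValueError(f"version {version} already exists")
        record = ModelVersion(
            version=version,
            # Copy into read-only mappings so the record cannot drift later.
            params=MappingProxyType(dict(candidate.params)),
            artifacts=MappingProxyType(dict(candidate.artifacts)),
        )
        self.versions[version] = record
        return record


# Two versions share an artifact filename but hold different files.
model = Model("churn-classifier")
v1 = model.promote(Candidate("run-a", {"lr": 0.1}, {"model.pkl": b"a"}), "1.0.0")
v2 = model.promote(Candidate("run-b", {"lr": 0.01}, {"model.pkl": b"b"}), "1.1.0")
```

Keeping artifacts on the version means a lookup such as `v1.artifacts["model.pkl"]` is unambiguous even when every version uses the same filename.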
The metadata sidebar was refined to surface context-specific information depending on whether a user was viewing a model or a version. For versions, this included a one-click export that let users replicate a working configuration directly into their testing environment, a small change that had a significant impact on workflow speed.
Based on interview feedback and the realities of project scope, Model Experiments was repositioned as a structured record of experimental runs conducted in external tools like MLflow and SageMaker. Rather than trying to replicate those environments, we integrated MLflow as an import/export feature, allowing users to tie their runs directly into GitLab for tracking and visibility. This became the foundation of the new Model Experiments experience.
We also added a performance tab built on GitLab's Product Analytics, a separate product I designed, that surfaced configurable graphs of each candidate's performance within an experiment. These visualizations were powered by a GraphQL integration, giving users a flexible, queryable view of their model metrics without leaving the platform.
Following a successful GA launch, I developed a prioritization plan for product improvements drawn from de-scoped GA items and ongoing user feedback, continuing to close the gap between what users needed and what the platform provided.
Additional columns were added to both the model and version listing tables, giving users more at-a-glance context without requiring them to drill into individual records. We also explored layouts mirroring GitLab's Epic and Issue listing pages, but ultimately rejected these concepts because they felt less organized and harder to parse in the context of model management.
This targeted fix added clearer context to action buttons and page titles across the model and version pages. The change was driven by confusion surfaced during solution validation testing, where users struggled to orient themselves within the current flow. More descriptive labeling resolved the issue and helped anchor users throughout the experience.
The repository link was added to the metadata sidebar, giving users a direct path to the repo housing the model or version without having to navigate away and search for it manually.
This module created a direct bridge between MLOps and GitLab's core product by allowing users to attach work items as linked items to models and versions. Connecting these two surfaces removed a meaningful barrier between the platforms and added significant utility, tying ML workflows into the same collaborative infrastructure that engineering teams already relied on.
The final improvement in this effort was the addition of a Group-level model listing page. Prior to this, the Model Registry existed exclusively at the Project level. This change introduced an aggregate view of all models across projects within a group, a capability specifically requested by internal leadership that meaningfully expanded the Registry's organizational reach.