Iterative Development Guide
This guide shows data scientists how to sync their iterative development work to Vectice.
In iterative development, a model is developed and tested in repeated cycles. In each iteration, the model is built, its hyperparameters are tuned, and the result is tested, until a fully functional model is ready for deployment.
To get started, we will walk you through an iterative development process using Vectice.
Step 1: Installation & Configuration
Install and import any packages you need for your model development, including the vectice library. If you have not installed the vectice library, view the Install Vectice Library guide for more details.
Once you have installed vectice and imported it into your script, connect to the Vectice API.
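A minimal sketch of the connection step follows. It assumes the library exposes a `vectice.connect()` function that accepts `api_token`, `host`, and `project` keyword arguments; those names, the environment variable names, and both helper functions are assumptions made for illustration and are not taken from this guide, so check them against the Python API Docs. The `vectice` import is deferred so the settings helper can be read and tested without the library installed.

```python
import os

# Hypothetical environment variable names, chosen for this sketch only.
ENV_TO_ARGUMENT = {
    "VECTICE_API_TOKEN": "api_token",
    "VECTICE_HOST": "host",
    "VECTICE_PROJECT": "project",
}


def connection_settings(env=None):
    """Collect connection arguments from environment-style settings."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_ARGUMENT if not env.get(name)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {argument: env[name] for name, argument in ENV_TO_ARGUMENT.items()}


def connect_to_project(env=None):
    """Connect to the Vectice API and return the linked project."""
    import vectice  # deferred import: requires the vectice library

    # Assumed signature: vectice.connect(api_token=..., host=..., project=...)
    return vectice.connect(**connection_settings(env))
```

Keeping the token in the environment rather than in the script avoids committing credentials alongside your model code.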
Now that you have connected to the Vectice API and linked your script to a project, we can start an iteration.
Step 2: Retrieve a phase
To start an iteration, you must retrieve a phase. To retrieve a phase, connect using your phase ID.
Step 3: Initialize Iteration
Each iteration contains the sequence of steps defined at the phase level and acts as a guardrail for data scientists as they provide updates.
To initialize an iteration, your phase must have at least one step defined. Otherwise, you will receive an error.
After retrieving a phase, we want to begin an iteration. We will initialize an iteration with the create_or_get_current_iteration() method.
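As a sketch, retrieving the phase and initializing the iteration could look like the helper below. `create_or_get_current_iteration()` is the method named in this guide; the `phase()` lookup on the project object and the `start_iteration` helper itself are assumptions made for illustration.

```python
def start_iteration(project, phase_id):
    """Retrieve a phase by ID and return it with its current iteration."""
    # Assumption: the connected project exposes phase(<phase ID>).
    phase = project.phase(phase_id)
    # Method named in this guide: reuses the current iteration or creates one.
    iteration = phase.create_or_get_current_iteration()
    return phase, iteration
```

A call might read `phase, iteration = start_iteration(my_project, "PHA-XXXX")`, where `my_project` is the object returned when you connected and "PHA-XXXX" is a placeholder for your own phase ID.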
The current iteration is your most recently updated iteration. If the current iteration is not writable, or if no iterations exist, Vectice either creates an iteration or lists the writable iterations. If you have several "In progress" iterations from the past, Vectice makes no assumption about which one you mean and displays a list of writable iterations; select one using the {phase}.iteration("iteration name or ID") method.
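The selection rule above can be modeled in memory to make the three outcomes explicit. This is one possible reading of the paragraph, not Vectice's implementation: `IterationRecord`, `resolve_current`, the "Completed" status label used in testing, and the choice to treat "In progress" as the only writable status are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: only "In progress" iterations count as writable in this sketch.
WRITABLE_STATUSES = {"In progress"}


@dataclass
class IterationRecord:
    name: str
    status: str
    last_updated: int  # any sortable timestamp


def resolve_current(iterations):
    """Return ("use", record), ("create", None) or ("choose", [records])."""
    if not iterations:
        return ("create", None)  # nothing exists yet
    latest = max(iterations, key=lambda it: it.last_updated)
    if latest.status in WRITABLE_STATUSES:
        return ("use", latest)  # the last updated iteration is writable
    writable = [it for it in iterations if it.status in WRITABLE_STATUSES]
    if not writable:
        return ("create", None)  # every existing iteration is closed
    return ("choose", writable)  # older writable iterations: let the user pick
```

In the "choose" case, the user would then pick one of the listed iterations by name or ID with the {phase}.iteration(...) method described above.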
You may have multiple writable iterations across various phases. Multiple writable iterations can exist at a phase level, but only one can be active per user.
Step 4: Retrieve a step
Once an iteration is declared, you can retrieve a step from your current phase iteration.
To find the step names available to your iteration via the API, print the list of step names.
To retrieve a step in your current iteration, select it by copying and pasting the step's shortcut name.
If you are taking a step name from the UI, add the prefix step_ before the step name.
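The naming rule can be sketched as a small helper. The `step_` prefix comes from this guide; the normalization (lowercase, with runs of other characters replaced by underscores), the attribute-style lookup on the iteration object, and both function names are assumptions, so confirm the exact shortcut against the step names the API reports.

```python
import re


def step_shortcut(ui_step_name):
    """Turn a step name shown in the UI into its step_ shortcut name."""
    # Assumption: shortcuts are lowercase, with runs of non-alphanumeric
    # characters replaced by a single underscore.
    slug = re.sub(r"[^0-9a-z]+", "_", ui_step_name.strip().lower()).strip("_")
    if not slug:
        raise ValueError("step name must contain letters or digits")
    return "step_" + slug


def get_step(iteration, ui_step_name):
    """Look up a step on the iteration by its shortcut attribute."""
    # Assumption: steps are exposed as attributes named after the shortcut.
    return getattr(iteration, step_shortcut(ui_step_name))
```

Under these assumptions, a UI step called "Model Validation" would be reached as `step_model_validation`.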
The assets created during your machine learning work can be linked to the steps, thus serving as a vehicle to provide automatic progress updates. Those updates will render on the UI within the phase iteration you are working on.
Step 5: Development
The development process is where the data science magic takes place. You can develop, test, and validate your models, all while using simple lines of code to share updates and metadata about your work.
The following are a few capabilities that you can utilize to add visibility into your model development.
Register a Dataset
You can register all datasets used during development to the Vectice UI. This includes raw data, cleaned data, and modeling data. For more information on how to register datasets during development, view How to register datasets.
Register a model
Your models are continuously improving in each iteration. You can register each model version to the Vectice UI. For more information on how to register models during development, view How to register models.
Code Capture
Vectice captures your local code if there is a .git folder or repository available. Only locally changed files not already in Vectice are captured with each new git commit.
Automatic code capture is enabled by default. To disable the automatic capture of your local code, set code_capture to False.
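The sketch below shows one way to check whether your working directory sits inside a Git repository and to toggle the capture flag. `code_capture` is the setting named in this guide; that it lives as a module-level attribute on `vectice` is an assumption, and `find_git_root` is a hypothetical helper that walks up parent directories the way Git does, which may differ from how Vectice detects a repository.

```python
from pathlib import Path


def find_git_root(start="."):
    """Return the nearest directory at or above `start` containing .git, or None."""
    here = Path(start).resolve()
    for candidate in (here, *here.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def set_code_capture(vectice_module, enabled):
    """Switch automatic code capture on or off."""
    # Assumption: code_capture is a module-level flag, i.e. vectice.code_capture.
    vectice_module.code_capture = enabled
```

With the library imported, `set_code_capture(vectice, False)` would be equivalent to writing `vectice.code_capture = False` under that assumption.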
View How to capture code for more information on code capture during development.
Version Control
Vectice enables the version tracking of datasets, models, and code used during development. Stored metadata for datasets and models includes version IDs, descriptions, lineage, properties, resources, attachments, model algorithms, metrics, status, and dataset types.
Asset versions and metadata are accessible in the UI within the Datasets and Models sections of the Project.
All assets are automatically versioned, making it easy to follow a project's progression and compare results from multiple iterations.
Step 6: Complete an iteration
Once you have met the goals specified in a step, you can close the iteration with a single line of code. The update is sent to the web UI in real time.
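A hedged sketch of closing an iteration only when the goals are met is shown below. It assumes the iteration object exposes a `complete()` method, which is not named in this guide; `complete_when_ready` and its `goals_met` flag are hypothetical.

```python
def complete_when_ready(iteration, goals_met):
    """Complete the iteration only once the step goals have been met."""
    if not goals_met:
        return False  # leave the iteration open for further work
    # Assumption: the iteration object exposes complete().
    iteration.complete()
    return True
```

Returning a boolean lets a notebook cell report whether the iteration was actually closed.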
Once an iteration is complete, you can start the next iteration to continue iterative development until all steps are complete.
Once the final step is complete, the iteration is automatically marked as complete in the Vectice UI.
You may revisit the details of the iteration for a retrospective. If satisfied, you can summarize your outcomes or start another iteration.
Full Workflow Example
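The steps in this guide can be strung together as in the sketch below, which takes an already connected project object. Only `create_or_get_current_iteration()` and the `step_` attribute prefix come from this guide; `phase()`, `complete()`, the `run_iteration` helper, and the `develop` callback are assumptions made for illustration.

```python
def run_iteration(project, phase_id, step_attribute, develop):
    """Run one pass of the workflow in this guide against a connected project.

    `develop` is a hypothetical callback holding your modeling code. It
    receives the selected step and returns True when the step goals are met.
    """
    # Step 2: retrieve the phase (assumed lookup by phase ID).
    phase = project.phase(phase_id)
    # Step 3: initialize or reuse the current iteration.
    iteration = phase.create_or_get_current_iteration()
    # Step 4: retrieve the step by its step_ shortcut name.
    step = getattr(iteration, step_attribute)
    # Step 5: development work, e.g. registering datasets and models.
    goals_met = develop(step)
    # Step 6: close the iteration (assumed method) once the goals are met.
    if goals_met:
        iteration.complete()
    return iteration
```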
The above is just an example of how to perform various actions. The best practice is to have one file (or notebook) per phase of the Project.
All registered iterations are rendered in the UI in real time, showing metadata such as the iteration number, status, owner, latest step, last updated date, and created date.
View the guides in the Python API Docs section for more information on how to use the Vectice API for development.