Metaflow
Overview
Metaflow is a framework created by Netflix for building and running ML workflows.
This integration lets users apply decorators to Metaflow steps and flows to automatically log parameters and artifacts to W&B.
- Decorating a step will turn logging off or on for certain types within that step.
- Decorating the flow will turn logging off or on for every step in the flow.
Quickstart
Sign up and create an API key
An API key authenticates your machine to W&B. You can generate an API key from your user profile.
- Click your user profile icon in the upper right corner.
- Select User Settings, then scroll to the API Keys section.
- Click Reveal. Copy the displayed API key. To hide the API key, reload the page.
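Install the wandb library and log in
To install the wandb library locally and log in, run pip install wandb, then authenticate with the wandb login command (or call wandb.login() from Python) and paste your API key when prompted.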
Decorate your flows and steps
Decorating a step turns logging off or on for certain types within that step.
For example, decorating start with @wandb_log(datasets=True, models=True) logs all datasets and models in start.
Decorating a flow is equivalent to decorating every constituent step with a default.
In this case, all steps in WandbExampleFlow log datasets and models by default, just like decorating each step with @wandb_log(datasets=True, models=True).
Because the flow-level decoration only sets a default, decorating an individual step with its own @wandb_log overrides the flow-level decoration for that step.
In this example:
- start and mid log both datasets and models.
- end logs neither datasets nor models.
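The precedence rule described above can be modeled in a few lines of plain Python. This is only a toy sketch, not W&B's implementation: effective_logging is a hypothetical helper, and it assumes that a step-level decoration replaces the flow-level settings wholesale rather than merging with them, which the text does not spell out.

```python
from typing import Optional

# Hypothetical helper (not part of wandb): decides which settings apply to a step.
def effective_logging(flow_level: dict, step_level: Optional[dict]) -> dict:
    """Return the step's own settings if it has any, else the flow-level default."""
    return dict(step_level) if step_level is not None else dict(flow_level)

# Flow-level default, as if the flow were decorated with
# @wandb_log(datasets=True, models=True).
flow_default = {"datasets": True, "models": True}

# None means the step has no decorator of its own (assumption for this sketch).
step_overrides = {
    "start": None,
    "mid": None,
    "end": {"datasets": False, "models": False},
}

resolved = {
    name: effective_logging(flow_default, override)
    for name, override in step_overrides.items()
}

for name, settings in resolved.items():
    print(name, settings)
```

Under these assumptions, start and mid inherit the flow default, while end uses only its own settings.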
Access your data programmatically
You can access the information we’ve captured in three ways: inside the original Python process being logged using the wandb client library, with the web app UI, or programmatically using our Public API.
- Parameters are saved to W&B’s config and can be found in the Overview tab.
- datasets, models, and others are saved to W&B Artifacts and can be found in the Artifacts tab.
- Base Python types are saved to W&B’s summary dict and can be found in the Overview tab.
See our guide to the Public API for details on using the API to get this information programmatically.
Quick reference
Data | Client library | UI |
---|---|---|
Parameter(...) | wandb.config | Overview tab, Config |
datasets, models, others | wandb.use_artifact("{var_name}:latest") | Artifacts tab |
Base Python types (dict, list, str, etc.) | wandb.summary | Overview tab, Summary |
wandb_log kwargs
kwarg | Options |
---|---|
datasets | True or False |
models | True or False |
others | True or False |
settings | By default, if: |
Frequently Asked Questions
What exactly do you log? Do you log all instance and local variables?
wandb_log only logs instance variables. Local variables are NEVER logged. This is useful to avoid logging unnecessary data.
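One way to see why local variables cannot be picked up is a toy decorator that inspects the object after the method returns. The sketch below is an illustration only: log_instance_vars, captured, and ToyStep are hypothetical names, and nothing here is claimed to match how wandb_log is implemented.

```python
import functools

# Stands in for whatever would be sent to W&B (assumption for this sketch).
captured = {}

def log_instance_vars(method):
    """Toy stand-in for a logging decorator: records instance attributes after the call."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        # Only attributes stored on self are visible here; the method's
        # locals no longer exist once it has returned.
        captured.update(vars(self))
        return result
    return wrapper

class ToyStep:
    @log_instance_vars
    def start(self):
        accuracy = 0.5       # local variable: never seen by the decorator
        self.accuracy = 0.9  # instance variable: visible through vars(self)

ToyStep().start()
print(captured)
```

In this sketch only the value assigned to self.accuracy ends up in captured; the local accuracy is discarded when start returns.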
Which data types get logged?
We currently support these types:
Logging setting | Type |
---|---|
default (always on) | Base Python types (dict, list, str, float, etc.) |
datasets | e.g. pd.DataFrame |
models | |
others | |
How can I configure logging behavior?
Kind of variable | Behavior | Example | Data type |
---|---|---|---|
Instance | Auto-logged | self.accuracy | float |
Instance | Logged if datasets=True | self.df | pd.DataFrame |
Instance | Not logged if datasets=False | self.df | pd.DataFrame |
Local | Never logged | accuracy | float |
Local | Never logged | df | pd.DataFrame |
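The instance-variable rows of the table can be read as a simple decision rule. The sketch below encodes that rule in plain Python under stated assumptions: should_log is a hypothetical helper, FakeDataFrame is a stand-in for pd.DataFrame so the example runs without pandas, and BASE_TYPES is an illustrative guess at the base Python types that are always logged.

```python
# Illustrative set of base types (assumption; the text lists dict, list, str, etc.).
BASE_TYPES = (dict, list, str, int, float, bool)

class FakeDataFrame:
    """Stand-in for pd.DataFrame so the sketch has no third-party dependency."""

# Hypothetical helper (not part of wandb): mirrors the table's instance-variable rows.
def should_log(value, *, datasets: bool) -> bool:
    if isinstance(value, BASE_TYPES):
        return True        # base types are auto-logged
    if isinstance(value, FakeDataFrame):
        return datasets    # dataset-like values follow the datasets flag
    return False           # everything else is ignored in this sketch

print(should_log(0.93, datasets=False))
print(should_log(FakeDataFrame(), datasets=True))
print(should_log(FakeDataFrame(), datasets=False))
```

The models and others flags would follow the same pattern; they are left out to keep the sketch aligned with the rows shown in the table.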
Is artifact lineage tracked?
Yes. If you have an artifact that is an output of step A and an input to step B, we automatically construct the lineage DAG for you.
For an example of this behavior, see this notebook and its corresponding W&B Artifacts page.