Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
222 add deployment of example get started experiments model #233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
222 add deployment of example get started experiments model #233
Changes from all commits
b5e29ea
43efd66
d235947
3e6f2ab
90cbba5
f04d5a2
9933b75
986c43d
33ae6e5
3421626
966087d
efe4d51
e81c6dd
a72fdbf
60b08d5
85b2a1e
7fb7439
9f3609a
7615d0d
7797fcd
48a9661
e557d0a
4067cea
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daavoo Doesn't this get the model data for the latest commit instead of the specific version we are trying to deploy? From the DVC docs, when
--rev
is not specified,The latest commit (in the default branch) is used by default when this option is not specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that the phrase in the docs is meant for the scenario where you pass a remote URL repo as the first argument.
I think the behavior here is correct, but I am going to double-check.
I think it works because we are using the local repo
.
and so the current status of the workspace will be used. The workflow usesactions/checkout
and runs on the git tag creation so the workspace will be the revision where the tag was createdThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have to generalize it in this way? (an extra step for us to take care of)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was suggested by @jesper7 and @tapadipti to use the role as a secret. I assume it has some security implications although I am not sure what is the danger of leaking a role name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daavoo How did you set it up for https://github.com/iterative/example-get-started-experiments? Should we add it to the readme here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that roles can be public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know what possible characters can appear in
f"{name}-{version}-{stage}"
that would need to be replaced here? I guess it's related to iterative/dvc#9821?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This regex is only relevant for Sagemaker and it is enforced by their API .
From top of my head, the composed name includes
/
,:
fromname
and.
fromversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rationale for the naming restrictions in gto were to make it more likely they would be compatible with systems like sagemaker. Not sure it really helps though. We still hit these restrictions and users don't know why they can't use these characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daavoo If I created a SageMaker model for a given model version before deploying it in one environment (eg,
dev
), does this code recreate the same model before deploying it in another environment (eg,prod
). Or does it skip recreating it and just return a reference to the old model?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It currently creates a new model. If we want to do what you said last, we should only add
stage
to the composed name after the model has been created.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Referring to this msg - you said that the existing model would be re-used, right? Would removing
stage
from the model name make this happen?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll create a PR to remove stage from the model name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daavoo If I am trying to deploy a new model version to an existing stage, does this code create a new endpoint or does it update the existing endpoint? (I need to confirm this for the blog post)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the first part but a new endpoint is created from combining the name, version, and stage, so any change to those 3 things result in a new endpoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I deploy v1 to prod today, and v2 to prod tomorrow, will I have 2 different endpoints with 2 different endpoint names? Shouldn't the prod endpoint always be the same? (just like studio prod version is always studio.iterative.ai) So that any clients running inference against the prod endpoint don't have to update their endpoint addresses each time there's a new deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version shouldn't be part of the endpoint name, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a matter of opinion and depends on the pattern we expect/want to showcase:
A) The endpoint is directly queried by external consumers
B) There is some app&url that uses the right endpoint internally without exposing it to external consumers.
That is the user-facing app&url and it only "points" to the correct software because there are gitops bumping the internal studio version (i.e. https://github.com/iterative/itops/pull/2238)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So then we should also create a service to delete stale endpoints, right? Else, we will end up with a bunch of endpoints over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect that to be configured / handled on the AWS side.
Anyhow, as per your suggestions, I will send a new P.R. creating/updating a single endpoint per stage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed/useful? Seems odd to have
export
andsave
here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have to change the evaluate stage to use the
.pth
format.I was not able to make sagemaker work with the fast.ai exported in
.pkl
format.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried it? I would say let's do it if it's not that hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it is possible but I didn't try to not include other changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daavoo If we logged the tar file of the model, then we wouldn't need to specify the file name
model.tar.gz
in the deployment script, right? The deployment script would be able to get the file name from the Git tag itself, which means that a single deployment script could be used for deploying several models with different names. Right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, in the blog post, I show that there's an artifact entry for model.pkl (so the model registry shows model.pkl). But the file that is getting deployed is model.tar.gz. We know that they are the same model, but it still looks like a disconnect between the model registry and the deployment script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to the comment above, do we still need
model/model.pkl
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor thoughts to try to clean this up a bit:
I haven't tested this, but wondering if we can simplify this at all with something like this?
Also, I wonder if it would be better to append directly to
dvc.yaml
and use list syntax forcmd
instead of&&
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't manage to use list cmd with
stage add
(which we use in thegenerate.sh
script , but might be missing something