I have a few issues I wanted to highlight when getting started with Cog:
1. the Predictor interface
I found it initially confusing how to input an audio file into my Predictor. In retrospect, it makes sense that I just need to take a file as input, but I didn't see anything in the docs to instruct me about this. I ended up figuring it out by looking at the tutorial that does audio tagging.
A similar confusion: what does Predictor.predict return? Can it be, like...anything? (Sorry I know I could probably read this in the docs but I guess I wish it was clearer from the API or default example.)
2. slow workflow
Every time I ran cog predict I would have to wait for cog to start up, install everything, then download the model I needed from a website, and only then would it fail with an error. This was a frustratingly slow workflow! I fixed it eventually when @andreasjansson told me that every file in my local folder would be bundled into the Docker image. So I downloaded the model and changed the code to load from the local folder, and that sped up cog. (It's still slower to start up than I'd like, but maybe an interface for running predictors locally would fix that.)
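For anyone hitting the same thing, here is a minimal sketch of the "load from the local folder" change. The function name `resolve_weights` and the file name `model.pth` are made up for illustration; this is not Cog's API, just plain Python that fails fast when the file has not been downloaded yet.

```python
import tempfile
from pathlib import Path


def resolve_weights(filename, project_dir="."):
    """Return the path to a weights file that already sits in the project folder.

    Hypothetical helper: raises immediately instead of downloading on every run.
    """
    candidate = Path(project_dir) / filename
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found; download it once into the project folder"
        )
    return candidate


# Usage example with a throwaway directory standing in for the project folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.pth").write_bytes(b"fake weights")
    found = resolve_weights("model.pth", project_dir=tmp)
    found_name = found.name
    print(found_name)
```

The point is only that the expensive download happens once, outside the predict loop.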
3. malformatted cog.yaml
I was using the default cog.yaml (I think it was generated by cog init?) and it had a line that was kinda like this:
system_packages:
# - put system packages here
and I think I tried running cog predict without changing that line, and got an error that said something nondescript:
ⅹ There is a problem in your cog.yaml file.
ⅹ must be a list.
It was annoying, but I eventually figured out it was referring to that system_packages key.
So two things could be fixed there:
- error message could look like this:
value for key 'system_packages' in cog.yaml must be a list
- the default cog.yaml should probably not be malformed
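To make the first suggestion concrete, here is a rough sketch of a check that names the offending key. It is not how Cog validates its config (I haven't looked at that code); `validate_config` and `LIST_KEYS` are hypothetical names, and the input is assumed to be the already-parsed YAML as a dict. A key whose only item is commented out parses as null, which is why the default file trips the check.

```python
# Keys assumed (for this sketch) to require a list value.
LIST_KEYS = ("system_packages",)


def validate_config(config):
    """Return one readable error message per key that should be a list but is not."""
    errors = []
    for key in LIST_KEYS:
        if key in config and not isinstance(config[key], list):
            errors.append(f"value for key '{key}' in cog.yaml must be a list")
    return errors


# "system_packages:" followed only by a comment parses to None (YAML null).
broken = {"system_packages": None}
fixed = {"system_packages": ["ffmpeg"]}

broken_errors = validate_config(broken)
fixed_errors = validate_config(fixed)
print(broken_errors)
print(fixed_errors)
```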
update from later
I actually ran into a similar issue with a yaml file like this:
system_packages:
- "libsndfile1-dev"
- "ffmpeg"
- "timidity"
and got the following error:
ⅹ Failed to parse config yaml: yaml: line 11: found character that cannot start any token
This one was easier to fix because it had a line number, but the message still doesn't seem meaningful (or correct?): I was just missing the indentation.
4. returning video from the predictor
It wasn't clear to me how to return a video. I actually tried rendering the video to a file and returning the string 'out.mp4' (the name of the file) but that just propagated to the frontend as a string...
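My guess at what went wrong, as a sketch: a plain string carries no hint that it names a file, whereas a path object does. The `describe_output` function below is a hypothetical stand-in for whatever the serving layer does, not Cog's actual code, and it only uses the standard library's `pathlib.Path`.

```python
from pathlib import Path


def describe_output(value):
    """Classify a predictor's return value the way a serving layer might.

    Hypothetical: a Path is treated as a file to upload, anything else as text.
    """
    if isinstance(value, Path):
        return {"kind": "file", "name": value.name}
    return {"kind": "text", "value": str(value)}


as_string = describe_output("out.mp4")      # what I returned
as_path = describe_output(Path("out.mp4"))  # what I probably should have returned
print(as_string)
print(as_path)
```

If that guess is right, the docs or the default example could say so explicitly.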
5. uploading clarity
It was unclear when uploading with cog push actually finished, or what stage it was in. It was printing a lot of weird stuff:
Using default tag: latest
31b608d88350: Pushed
074e37886942: Layer already exists
71a0887c56fc: Layer already exists
d54ed2edbac4: Layer already exists
3058d5dc6d22: Layer already exists
ca2cbf283f0e: Layer already exists
606fc042d492: Layer already exists
eff64713ff03: Layer already exists
5bdaf1f57aee: Layer already exists
19347dc82c13: Layer already exists
aa091f54d0c4: Layer already exists
feacfd9ce92d: Layer already exists
501309e49952: Layer already exists
0c5746584249: Layer already exists
1a1a19626b20: Layer already exists
5b7dc8292d9b: Layer already exists
bbc674332e2e: Layer already exists
da2785b7bb16: Layer already exists
I was worried at one point that it wasn't going to work (because I hadn't updated the version number, I was thinking) and at another point I thought the thing was totally stalled out. But then the command finally finished and I determined by going to the frontend and seeing 'model updated 2 minutes ago' that it had actually worked.
6. uploading confusion
(a) Later on, I went back and wanted to push a new version. I eventually found the instructions for cog push in the Replicate docs, but I don't think they're in the documentation in this GitHub repo. Is that intentional? Maybe cog push should be mentioned here, too?
(b) What's the purpose of r8.im in cog push r8.im/jxmorris12/piano-transcription? I think I should just be able to do cog push jxmorris12/piano-transcription, but I don't know what the r8.im is for (and I'm wary of Chesterton's fence here..)