Adding the sylph coverage model to yacht #141
rtraborn wants to merge 27 commits into KoslickiLab:superyacht from …
…ate effective coverage, etc according to sylph (Shaw and Yu, 2024).
… that aren't necessary.
… print statements with logger. Moved all constants to utils.py.
After some more testing, I just pushed some additional updates to this branch.
A small update: with my most recent commit from last night, I made the promised change to the …
Hi All! I've made some more changes over the past week. Here's an overview of my most recent updates (Part 1 of 2):
…takes-all k-mer reassignment.
…ion due to low lambda.
…pdate to median_ani_threshold.
…S, replacing it with the system gzip
…--calculate-coverage and --no_two_pass.
Hi all! As we discussed, I made more updates to this branch (along with a decent amount of testing), and the code is ready for your review.





Hi @dkoslicki and team!
I implemented the sylph coverage model from Shaw and Yu (2024) and added it to yacht, in a branch I named superyacht just for fun.
This is a draft that I'm still testing, so the usual caveats apply. A few notes:
- … `cov_calc`, which calculates lambda and ANI as specified by the sylph paper.
- … `cov_calc` inside `get_exclusive_hashes`, since that function provides the signature objects needed to make the calculations.
- … `hypothesis_recovery`. There are probably good ways to integrate this, and I'll give this some more thought.
- … `cov_calc` more deeply into `hypothesis_recovery` for now. I have some ideas on what might be the best approach that we could discuss if you'd like. I thought it would be best to share this new branch while I look into this more deeply.
- `internal_superyacht_test.py` is just a script that I have been using to test the new branch, and it can be ignored; I'll remove it once we move towards publication.
- … `AdjustStatusLambda` enum in a more idiomatic Python way this week. It should be a relatively quick fix.
- … `winner_map` routine from sylph, but it's something I would like to add.

I'm going to do more testing this week on additional datasets. Happy to discuss here or via email/video!
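
For readers who have not seen the sylph model, the kind of lambda and ANI calculation mentioned in the notes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this branch: the function names `estimate_lambda` and `coverage_adjusted_ani` are hypothetical, the input is assumed to be a plain list of k-mer multiplicities rather than signature objects, and only the simple ratio-based Poisson estimator is shown.

```python
import math
from collections import Counter


def estimate_lambda(multiplicities):
    """Estimate effective coverage (lambda) from k-mer multiplicities.

    Assumes multiplicities of the genome's k-mers in the sample follow a
    Poisson(lambda) distribution, so P(2) / P(1) = lambda / 2 and therefore
    lambda is approximately 2 * N2 / N1.
    Returns None when no k-mer has multiplicity 1 (estimator undefined).
    """
    counts = Counter(multiplicities)
    n1 = counts.get(1, 0)
    n2 = counts.get(2, 0)
    if n1 == 0:
        return None
    return 2.0 * n2 / n1


def coverage_adjusted_ani(n_genome_kmers, n_found, lam, k=31):
    """Convert k-mer containment to ANI, correcting for low coverage.

    Under the Poisson assumption a k-mer that is truly present is observed
    at least once with probability 1 - exp(-lambda), so the naive
    containment is divided by that probability before taking the k-th root.
    Falls back to the naive estimate when lambda is unavailable.
    """
    if n_genome_kmers <= 0:
        raise ValueError("n_genome_kmers must be positive")
    containment = n_found / n_genome_kmers
    if lam is None or lam <= 0:
        return containment ** (1.0 / k)
    p_detect = 1.0 - math.exp(-lam)
    adjusted = min(1.0, containment / p_detect)  # cap at full containment
    return adjusted ** (1.0 / k)


if __name__ == "__main__":
    # Toy data: multiplicities of the genome k-mers that were seen in the sample.
    seen = [1] * 100 + [2] * 25 + [3] * 4
    lam = estimate_lambda(seen)
    naive = coverage_adjusted_ani(1000, len(seen), None)
    adjusted = coverage_adjusted_ani(1000, len(seen), lam)
    print(f"lambda={lam:.3f} naive_ani={naive:.4f} adjusted_ani={adjusted:.4f}")
```

The cap at 1.0 is a design choice in this sketch so that a noisy lambda estimate cannot push the adjusted containment above full containment; how the branch handles very low lambda may differ.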