Support node_labels in write_nexus via TRANSLATE#3442
Support node_labels in write_nexus via TRANSLATE#3442kaathewisegit wants to merge 1 commit intotskit-dev:mainfrom
node_labels in write_nexus via TRANSLATE#3442Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3442 +/- ##
=======================================
Coverage 91.92% 91.92%
=======================================
Files 37 37
Lines 32153 32158 +5
Branches 5143 5145 +2
=======================================
+ Hits 29556 29561 +5
Misses 2264 2264
Partials 333 333
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
jeromekelleher
left a comment
There was a problem hiding this comment.
Looks great, thanks @kaathewisegit! A few minor bits of input above.
It would be nice to verify that a third-party tool like dendropy or BioPython recogises this - would it be possible to check this?
0286a8a to
2d69b34
Compare
|
I've changed I also tried adding tests which use BioPython (generate a random ts and see if its NEXUS output is parseable), but it turns out that Bio.Nexus only accepts integer ids for TRANSLATE, so |
jeromekelleher
left a comment
There was a problem hiding this comment.
Some comments on the implementation.
It seems like dendropy should support TRANSLATE, so should be easy enough to use that to verify? Just one example is fine, we just want to be sure that some other software can read this.
2d69b34 to
ab5991b
Compare
Co-authored-by: Jerome Kelleher <jk@well.ox.ac.uk>
ab5991b to
09b8efb
Compare
|
I've switched to a lenient translation table construction. And added dendropy parsing tests for all tree sequences from get_example_tree_sequences (except multiple-root and empty ones, as newick serialization doesn't work for those). |
jeromekelleher
left a comment
There was a problem hiding this comment.
Great! Looks like it's almost ready to go in.
| labels[node] = f"new_node_which_was_{node}" | ||
|
|
||
| nexus = ts.as_nexus(include_alignments=False, node_labels=labels) | ||
| print(nexus) |
|
|
||
| labels = {} | ||
| samples = ts.samples() | ||
| k = random.randint(1, len(samples)) |
There was a problem hiding this comment.
I don't think the random does much here, I'd just do something like
u = ts.samples()[0]
labels[u] = f"new_node_which_was_{u}"
And then after just check that labels[u] is recognised as a taxon ID by dendropy?
| :param str missing_data_character: As for the :meth:`.alignments` method, | ||
| but defaults to "?". | ||
| :param bool isolated_as_missing: As for the :meth:`.alignments` method. | ||
| :param node_labels: A map of type `{index: name}`. Samples present in |
There was a problem hiding this comment.
| :param node_labels: A map of type `{index: name}`. Samples present in | |
| :param node_labels: A map of type `{node_id: name}`. Samples present in |
| but defaults to "?". | ||
| :param bool isolated_as_missing: As for the :meth:`.alignments` method. | ||
| :param node_labels: A map of type `{index: name}`. Samples present in | ||
| the map will have the given name instead of `n{index}`. Note that |
There was a problem hiding this comment.
| the map will have the given name instead of `n{index}`. Note that | |
| the map will have the given name instead of `n{node_id}`. Note that |
Index is ambigous here, better to be clear
Description
Adds a
node_labelsparameter towrite_nexuswhich allows renaming samples. The trees still use the fast Newick serializer, the remapping is done with a TRANSLATE directive.Resolves #3435
PR Checklist: