Create empty genotype arrays when there are no samples #464
jeromekelleher merged 4 commits into sgkit-dev:main
|
Ah, hmm, I think I misunderstood. I think we should continue to have no It's not clear to me what happens when the ploidy value that you specify disagrees with what's in the GTs. Is it simplest to just raise an error for now? In any case I think we need to spell out the semantics somewhere in the documentation and it's probably best to do that now while we're making the change. |
That's inconsistent with all the other genotype fields though. If there is a GQ field defined in the header but no samples then vcf2zarr will create empty
Yes, that's probably best. |
|
There are two orthogonal things here:
I think we have to deal with both? If there's no GT in the header and no samples present, we should not output call_genotype by default I think. But, we can force call_genotype to be included, by specifying the |
|
I agree they are orthogonal.
Currently for case 3 we don't create I think specifying ploidy is also orthogonal. In case 1 it could force creating What do you think? |
|
SGTM. What do we do in case 2 at the moment? |
|
(If we're already handling case 2 sensibly I think we should leave it - experience has shown that you just have to accept malformed VCFs) |
940d62c to 9f51adc
|
Hopefully this is a bit better. For case 1 setting ploidy doesn't have any effect. I haven't changed case 2 (or checked what it does). I've implemented case 3. And for case 4 it will error if the set ploidy is less than the maximum ploidy in the data. |
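For reference, the case 4 check described above could look roughly like the sketch below. This is not the code from this PR: `check_ploidy`, `requested_ploidy` and `genotype_calls` are hypothetical names, and genotypes are assumed to be plain per-sample lists of allele indices.

```python
def check_ploidy(requested_ploidy, genotype_calls):
    """Return the requested ploidy, or raise if the data needs more.

    genotype_calls is an iterable of per-sample allele lists,
    e.g. [[0, 1], [1, 1, 0]] (hypothetical representation).
    """
    # With no samples there are no calls, so the observed maximum is 0
    # and any requested ploidy is accepted.
    max_observed = max((len(call) for call in genotype_calls), default=0)
    if requested_ploidy < max_observed:
        raise ValueError(
            f"Requested ploidy {requested_ploidy} is less than the maximum "
            f"ploidy {max_observed} found in the data"
        )
    return requested_ploidy
```

In this sketch, a requested ploidy larger than anything in the data is allowed, and only a value that is too small raises an error.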
Fixes #463