While updating the markup of the resources, I ran into the question of how to properly encode the @ids of SequenceAnnotation and SequenceRange. What I want to encode is this:
I have a protein ("@id": "https://disprot.org/DP03543") which has three hasSequenceAnnotation objects, each with its own SequenceRange:
- SequenceAnnotation ("@id": "https://disprot.org/DP03543#disorder-content")
  with SequenceRange ("@id": "https://disprot.org/DP03543#sequence-location.1_96")
  saying that this whole protein (1..96) has a disorder content of 0.99
- SequenceAnnotation ("@id": "https://disprot.org/DP03543r001")
  with SequenceRange ("@id": "https://disprot.org/DP03543#sequence-location.1_96")
  saying that this protein region (1..96) is disordered (ontology)
- SequenceAnnotation ("@id": "https://disprot.org/DP03543r003")
  with SequenceRange ("@id": "https://disprot.org/DP03543#sequence-location.10_50")
  saying that this protein region (10..50) is modulated...
Note that the first two SequenceAnnotations share the same SequenceRange.
An alternative version would be with modified SequenceRange @ids like this:
- SequenceAnnotation ("@id": "https://disprot.org/DP03543#disorder-content")
  with SequenceRange ("@id": "https://disprot.org/DP03543#sequence-location.1_96")
- SequenceAnnotation ("@id": "https://disprot.org/DP03543r001")
  with SequenceRange ("@id": "https://disprot.org/DP03543r001#sequence-location.1_96")
- SequenceAnnotation ("@id": "https://disprot.org/DP03543r003")
  with SequenceRange ("@id": "https://disprot.org/DP03543r003#sequence-location.10_50")
where each SequenceAnnotation has its own SequenceRange, so the first two SequenceRanges now become separate nodes in the graph. The bottom line is: should we treat SequenceRange as a child node of SequenceAnnotation, or somehow link it to the parent node of SequenceAnnotation (in this case the Protein node, with all the implied changes to the profile)?
Which solution is more correct conceptually? And, of course, which is easier to process in the IDP-KG?
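
To make the difference between the two variants concrete, here is a minimal Python sketch. It is only an illustration, not the actual Bioschemas or IDP-KG tooling: the dictionaries and the `annotations_by_range` helper are hypothetical, and they simply map each SequenceAnnotation @id to its SequenceRange @id, assuming that nodes with the same @id are merged into one node when the markup is loaded into a graph.

```python
# Hypothetical, simplified stand-ins for the two encodings discussed above.
# Each dict maps a SequenceAnnotation @id to the @id of its SequenceRange.
BASE = "https://disprot.org/DP03543"

# Variant 1: range @ids hang off the protein, so equal ranges share one @id.
shared = {
    f"{BASE}#disorder-content": f"{BASE}#sequence-location.1_96",
    f"{BASE}r001": f"{BASE}#sequence-location.1_96",
    f"{BASE}r003": f"{BASE}#sequence-location.10_50",
}

# Variant 2: range @ids hang off each region annotation, so they are distinct.
per_annotation = {
    f"{BASE}#disorder-content": f"{BASE}#sequence-location.1_96",
    f"{BASE}r001": f"{BASE}r001#sequence-location.1_96",
    f"{BASE}r003": f"{BASE}r003#sequence-location.10_50",
}


def annotations_by_range(annotation_to_range):
    """Group annotation @ids by range @id (same @id means same graph node)."""
    grouped = {}
    for annotation_id, range_id in annotation_to_range.items():
        grouped.setdefault(range_id, []).append(annotation_id)
    return grouped


shared_nodes = annotations_by_range(shared)
separate_nodes = annotations_by_range(per_annotation)

print(len(shared_nodes))    # 2 distinct SequenceRange nodes
print(len(separate_nodes))  # 3 distinct SequenceRange nodes
```

Under that assumption, the first variant lets a consumer find every annotation on a given range by @id alone, while in the second variant it would have to compare the start and end values of the ranges instead.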