Skip to content

Comments

feat: amend new fields to interval schema#317

Open
project-defiant wants to merge 5 commits intomasterfrom
interval-qc-update
Open

feat: amend new fields to interval schema#317
project-defiant wants to merge 5 commits intomasterfrom
interval-qc-update

Conversation

@project-defiant
Copy link

@project-defiant project-defiant commented Feb 2, 2026

Context

This PR adds the new fields

  • biosampleFromSourceId
  • qualityControls

to the Intervals dataset.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds two new fields to the Interval schema to enhance genomic interval data representation: biosampleFromSourceId to capture the original biosample identifier from data sources, and qualityControls to store quality control flags.

Changes:

  • Added biosampleFromSourceId field to store the original biosample identifier as defined by datasource providers
  • Added qualityControls field to track quality control flags for interval data
  • Updated both the entity model and GraphQL schema definitions to maintain consistency

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
app/models/entities/Intervals.scala Added two new required fields to the Interval case class: biosampleFromSourceId (String) and qualityControls (Vector[String])
app/models/gql/Objects.scala Added corresponding GraphQL DocumentField definitions for the new fields with appropriate descriptions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@project-defiant project-defiant marked this pull request as ready for review February 2, 2026 10:30
@project-defiant project-defiant linked an issue Feb 2, 2026 that may be closed by this pull request
2 tasks
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

app/models/entities/EnhancerToGene.scala:27

  • The PR description states this change only adds two new fields (biosampleFromSourceId and qualityControls) to the Intervals dataset. However, the actual changes include a significant refactoring that renames "Interval" to "EnhancerToGene" throughout the codebase. This is a breaking API change that should be clearly documented in the PR description. If this refactoring is intentional, please update the PR description to reflect the full scope of changes. If it's unintentional, consider splitting the new field additions from the renaming refactoring into separate PRs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

case class ClickhouseSettings(
defaultDatabaseName: String,
intervals: DbTableSettings,
enhancerToGene: DbTableSettings,
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration field rename from "intervals" to "enhancerToGene" will require updating any existing application.conf files in production environments. This should be documented in the PR description or migration guide, along with any necessary database table renaming if the table name changed from "intervals" to "enhancer_to_gene".

Copilot uses AI. Check for mistakes.
Comment on lines +8 to 14
case class EnhancerToGeneQuery(chromosome: String,
start: Int,
end: Int,
tableName: String,
offset: Int,
size: Int
) extends Queryable
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file name "IntervalsQuery.scala" does not match the renamed case class "EnhancerToGeneQuery". Following the established codebase convention where query files are named after their contained case classes (e.g., TargetsQuery.scala contains TargetsQuery, QW2V.scala contains QW2V), this file should be renamed to "EnhancerToGeneQuery.scala".

Copilot uses AI. Check for mistakes.
val tableName = getTableWithPrefixOrDefault(defaultOTSettings.clickhouse.enhancerToGene.name)
val page = pagination.getOrElse(Pagination.mkDefault).offsetLimit
val intervalsQuery = IntervalsQuery(
val intervalsQuery = EnhancerToGeneQuery(
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name "intervalsQuery" is inconsistent with the renamed case class "EnhancerToGeneQuery". For better code clarity and consistency with the refactoring, this variable should be renamed to "enhancerToGeneQuery".

Copilot uses AI. Check for mistakes.
@@ -998,11 +998,11 @@ class Backend @Inject() (implicit
logger.info(s"Total intervals found: $total")
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message "Total intervals found" uses outdated terminology. It should be updated to reflect the new naming convention, such as "Total enhancer to gene mappings found" or similar, to be consistent with the refactoring throughout the codebase.

Suggested change
logger.info(s"Total intervals found: $total")
logger.info(s"Total enhancer to gene mappings found: $total")

Copilot uses AI. Check for mistakes.
.executeQuery[Interval, Query](intervalsQuery.query)
.map(intervals => Intervals(total, intervals))
.executeQuery[EnhancerToGene, Query](intervalsQuery.query)
.map(intervals => EnhancerToGenes(total, intervals))
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda parameter "intervals" should be renamed to "enhancerToGenes" to be consistent with the refactoring and to better reflect what the data represents. This improves code readability and maintains naming consistency throughout the method.

Suggested change
.map(intervals => EnhancerToGenes(total, intervals))
.map(enhancerToGenes => EnhancerToGenes(total, enhancerToGenes))

Copilot uses AI. Check for mistakes.
"enhancerToGenes",
enhancerToGenesImp,
description = Some(
"Regulatory enhancer/promoter regions to gene (target) predictions overlapping with this variant's location. These intervals link regulatory regions to target genes based on experimental data for specific tissues or cell types."
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description still uses the old terminology "These intervals link" instead of reflecting the new naming convention. Consider updating to be more consistent with the refactoring, such as "These enhancer-to-gene mappings link" or keeping "intervals" as a more general descriptive term while using "enhancerToGenes" for the field name. The current mix may cause confusion.

Suggested change
"Regulatory enhancer/promoter regions to gene (target) predictions overlapping with this variant's location. These intervals link regulatory regions to target genes based on experimental data for specific tissues or cell types."
"Enhancer-to-gene mappings that link regulatory enhancer/promoter regions to target genes when they overlap this variant's location. These predictions are derived from experimental data for specific tissues or cell types."

Copilot uses AI. Check for mistakes.
@@ -1491,6 +1491,10 @@ object Objects extends OTLogging {
),
DocumentField("studyId", "Identifier of the study providing the experimental data"),
DocumentField("biosampleName", "Name of the biosample where the interval was identified"),
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field description for "biosampleName" still refers to "interval" in the phrase "where the interval was identified". For consistency with the refactoring, this should be updated to refer to "regulatory region" or "enhancer/promoter region" instead of "interval".

Suggested change
DocumentField("biosampleName", "Name of the biosample where the interval was identified"),
DocumentField("biosampleName",
"Name of the biosample where the regulatory region was identified"
),

Copilot uses AI. Check for mistakes.
ctx.value.position,
ctx.value.position,
ctx.arg(pageArg)
)
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces a breaking API change by renaming the GraphQL field from "intervals" to "enhancerToGenes". This will break existing GraphQL queries that use the old field name. Consider: 1) Providing a deprecation period where both field names are supported, or 2) Documenting this as a breaking change in the PR description with migration instructions for API consumers, or 3) Bumping the API version if applicable.

Suggested change
)
)
),
Field(
"intervals",
enhancerToGenesImp,
description = Some(
"Deprecated: use `enhancerToGenes` instead. Regulatory enhancer/promoter regions to gene (target) predictions overlapping with this variant's location."
),
arguments = pageArg :: Nil,
complexity = Some(complexityCalculator(pageArg)),
deprecationReason = Some("Use `enhancerToGenes` instead."),
resolve = ctx =>
ctx.ctx.getEnhancerToGenes(ctx.value.chromosome,
ctx.value.position,
ctx.value.position,
ctx.arg(pageArg)
)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

enhancer_to_gene update: schemas, widget and L2G

3 participants