
Taxonomy subclasses: embrace or deprecate? #634

Description

@bradenmacdonald

Very relevant to the CBE Taxonomy Type use case.

Background

Our taxonomy models currently support the idea of taxonomy types, where each taxonomy can be a "normal" taxonomy but could also be a "typed" taxonomy implemented using a subclass of Taxonomy that adds extra functionality.

How it works:

Taxonomy instances have a _taxonomy_class column which can be accessed on the model as taxonomy.taxonomy_class, returning the actual Taxonomy subclass used for that instance. When loading taxonomies, the Taxonomy.cast() method will "reload" the taxonomy using the correct subclass.
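As a rough illustration of the mechanism (not the actual openedx_tagging code), the sketch below uses plain Python classes in place of Django models. It assumes the column stores a dotted import path, as described later in this issue, and that a failed lookup falls back to the plain taxonomy. The class bodies and the in-memory "re-casting" are stand-ins for what the real models do.

```python
import importlib


class Taxonomy:
    """Stand-in for the real model; only the fields needed for this sketch."""

    system_defined = False

    def __init__(self, name, taxonomy_class_path=None):
        self.name = name
        # Mirrors the `_taxonomy_class` column: a dotted class path, or None
        # for a "normal" taxonomy.
        self._taxonomy_class = taxonomy_class_path

    @property
    def taxonomy_class(self):
        """Resolve the stored dotted path to a class (None if unset)."""
        if not self._taxonomy_class:
            return None
        module_path, _, class_name = self._taxonomy_class.rpartition(".")
        module = importlib.import_module(module_path)
        return getattr(module, class_name)

    def cast(self):
        """Return this taxonomy as an instance of its stored subclass."""
        try:
            cls = self.taxonomy_class
        except (ImportError, AttributeError, ValueError):
            # Assumed fallback: behave like a regular taxonomy on any error.
            return self
        if cls is None or type(self) is cls:
            return self
        if not (isinstance(cls, type) and issubclass(cls, Taxonomy)):
            return self
        # Build an instance of the subclass that shares the same field values.
        new = cls.__new__(cls)
        new.__dict__.update(self.__dict__)
        return new


class SystemDefinedTaxonomy(Taxonomy):
    """Subclass whose only change is marking the taxonomy read-only."""

    system_defined = True
```

Calling `cast()` on a taxonomy whose stored path points at `SystemDefinedTaxonomy` returns an object of that subclass with the same field values, while an unset or unresolvable path leaves the instance as a plain `Taxonomy`.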

Why we have it

We originally intended to have more built-in taxonomy types, only fragments of which still actually exist:

  • System defined taxonomy - the most basic subclass; all it does is override system_defined=True to mark these taxonomies as read-only (users shouldn't manually add/remove tags).
    • The only usage of this is the related LanguageTaxonomy subclass of SystemDefinedTaxonomy, which is used for the "Languages" taxonomy that gets automatically created using a migration
      • The idea was to tag content with languages using a consistent taxonomy, where the "external ID" of each tag is a language code like en and the tag value is a display name like English. The tags get auto-created as needed, which is why this taxonomy usually appears empty until it has been used to tag some piece of content with a language tag.
      • This remains a half-baked idea, because: (1) the use case is unclear, (2) the tag values really should be localized ("English" / "Anglais" / "inglés" are all the same en tag) but we don't yet support tag localization, and (3) the "create tags on demand" aspect is confusing. (The thought is: pre-filling all known languages/locales would make it too noisy and less useful, so only show the actually enabled languages, but that set of languages can change at any time as it's just a configuration setting.)
  • Model System Defined Taxonomy - allows creating a non-editable taxonomy so that a specific model can be used as tags. This means you could configure it to allow using courses, users, or almost anything else as tag values.
    • User System Defined Taxonomy - allows you to use users (specifically usernames) as tag values.
      • Example use case: you can create taxonomies called "Author" and "Reviewer", then tag content with usernames, like "Author: bradenmacdonald", "Reviewer: alice". This is pretty much working on the backend, and you could configure such taxonomies using the Django admin, but the actual content tagging UI does not support this and won't show it correctly / won't let you select users when applying such tags.

Note: we also have "free text" taxonomies, where you can create new tags as you apply tags within the course/library tagging tools, rather than defining them in advance. Oddly, this is implemented as a boolean attribute on the taxonomy, not as a distinct Taxonomy subclass/type.

Status

Most of these features are fully working on the backend, and will affect the taxonomy editor, but do not affect the actual content tagging experience seen within courses and libraries, as the features were never implemented end-to-end.

For example, you can switch any taxonomy to "Single Tag Only", and you'll then get an error if you try to apply more than one tag to the same Unit/Component, but the UI doesn't indicate why, nor does it properly change to only allow you to select one tag in the first place.

The "Languages" taxonomy is listed on everyone's taxonomy list in the Taxonomy editor, but is deliberately hidden from being used in courses if it's empty, and it's empty until it is used in a course. A classic catch-22.

The "taxonomy type" and other advanced attributes (description, single vs. multi-select, free text, system-defined, etc.) are not included in the import/export files. What you import/export is just the "contents", i.e. the list of tags, not the metadata.

Do we need this?

Given the lack of use cases or demand, I'm tempted to just remove the "taxonomy subclass" feature from openedx_tagging, as well as remove the "Languages" taxonomy.

The "system-defined" taxonomy subclass could just become a "read-only" attribute directly on the core Taxonomy model.

"free text" can continue to be a boolean attribute rather than a type, with the meaning that tags can be created at the time of tagging content, with any values, and the set of tags in the taxonomy is derived from the set of unique ObjectTag values, not Tag instances.
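To make the "derived from ObjectTag values" idea concrete, here is a minimal sketch in plain Python. The `ObjectTag` dataclass and its field names are simplified assumptions for illustration, not the real model. The point is only that a free-text taxonomy has no stored tag list: its tags are whatever distinct values have been applied to content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectTag:
    """Hypothetical, simplified record of one tag applied to one object."""

    object_id: str
    taxonomy_id: int
    value: str


def free_text_tags(object_tags, taxonomy_id):
    """Return the sorted set of distinct values applied for one taxonomy."""
    return sorted(
        {ot.value for ot in object_tags if ot.taxonomy_id == taxonomy_id}
    )
```

With this approach, removing the last use of a value removes it from the taxonomy's tag list as well, since nothing else records that the tag existed.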

The idea of being able to use users as tags seems pretty cool to me, but I'm not sure it justifies the complexity.

CBE

However, for the CBE use cases, we may need a "subclass" of taxonomy, specifically Competency taxonomies.

The question is: should this be implemented as a Taxonomy subclass, preserving and using the existing taxonomy_class/cast() mechanism, or (as proposed) implemented as a separate CompetencyTaxonomy table and REST API that supplements some taxonomies with extra data but which the oel_tagging app is unaware of?

  • The existing mechanism could be a good way to implement "types" and labels for those types within the taxonomy UI, since it allows pluggable implementations like CompetencyTaxonomy. However, we haven't really used it for this up until now.
  • The existing mechanism requires the full path to the Python taxonomy subclass (e.g. openedx_cbe.models.CompetencyTaxonomy) to be stored in a column value, and it reverts to the behavior of a "regular" taxonomy if any error occurs while loading that class.

Alternative: keep the competency app totally separate, built on top of the tagging app. Mark Competency taxonomies as "read-only" so that they can only be edited in the competency editor and not the taxonomy editor? Use frontend plugins to inject competency-specific UI like the Competency badge where needed?

What I'm thinking: as far as user-visible features are concerned, suppose the only purpose of flagging some taxonomies as "Competency taxonomies" rather than regular taxonomies is to label them with the blue Competency badge and display the "Apply Competencies..." context menu, while everything else works exactly the same. In that case, I don't think we need to define them as a distinct type, and we can get rid of the concept of "taxonomy type" altogether. The Studio frontend can load some data from the tagging app and other data from the competency app and combine the two into a cohesive UI, while the tagging backend stays completely unaware of the concept of "taxonomy type".

Example:

  • List of all taxonomies: GET /api/taxonomies/v1/ (no type information)
  • If CBE is enabled, list of competency-enabled taxonomies: GET /api/cbe/v1/taxonomies/ (every taxonomy in this list is also returned in the first list, above)
  • In the frontend, display the first list and cross-reference the second list to decide when to show the Competency badge and the "Apply Competencies" context menu.
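The cross-referencing step amounts to a set-membership check. The sketch below shows that logic in Python for brevity, although the real Studio frontend would do this in JavaScript/TypeScript. The response shapes (a list of dicts with an `id` key) and the `is_competency` flag are assumptions for illustration, not the actual API contract.

```python
def annotate_taxonomies(all_taxonomies, competency_taxonomies):
    """Mark each taxonomy from the first list that also appears in the second.

    Both arguments are assumed to be lists of dicts carrying an "id" key,
    e.g. the parsed JSON results of the two GET requests.
    """
    competency_ids = {t["id"] for t in competency_taxonomies}
    return [
        # Copy each taxonomy and add a flag the UI can use to decide whether
        # to show the Competency badge and the "Apply Competencies" menu.
        {**t, "is_competency": t["id"] in competency_ids}
        for t in all_taxonomies
    ]
```

Because the flag is computed on the client, the tagging API response needs no type field, and disabling CBE simply means the second list is empty or never fetched.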
