Skip to content

Move make_identifier function to collect.py#1215

Open
Lotram wants to merge 3 commits into
GothenburgBitFactory:developfrom
Lotram:move-identifier-function
Open

Move make_identifier function to collect.py#1215
Lotram wants to merge 3 commits into
GothenburgBitFactory:developfrom
Lotram:move-identifier-function

Conversation

@Lotram
Copy link
Copy Markdown
Collaborator

@Lotram Lotram commented May 19, 2026

in aggregate_issues, we have both the Issue object and the deserialized data, so we can compute the identifier without relying on a fragile iteration on list of key_lists Also add some more precise typing

Comment thread bugwarrior/types.py Outdated
@@ -0,0 +1,14 @@
from typing import Any, NamedTuple

TaskwarriorData = dict[str, Any]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this name confusing because it is unclear that it is data for a single task. (Everything in the taskwarrior database is "taskwarrior data".)

I had the thought that it might be ideal to align naming with taskchampion but I'm not strongly opinionated. Here are some ideas and rambling thoughts:

  • Task: Don't all data types hold data? What are we trying to communicate with "data" that Task wouldn't cover? Or perhaps we should differentiate from taskw.Task?
  • TaskData: Similar to what you have, but taskchampion's data structure is slightly higher level.
  • TaskMap: The most directly comparable data structure, but named after the rust-specific HashMap.

On the other hand, does such a simple data structure really deserve a custom type? What's the point?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this name confusing because it is unclear that it is data for a single task. (Everything in the taskwarrior database is "taskwarrior data".)

Good point, that name is bad.

On the other hand, does such a simple data structure really deserve a custom type? What's the point?
Here is my chain of thoughts:

  1. We should stop passing dicts everywhere, as it's really hard to understand what these dicts are exactly
  2. We should use our own Task object, and that's what to_taskwarrior() should return, but that's a separate issue.
  3. In the meantime, let's have a type alias, so we know where this dict is used (so we can modify all the types all at once, when switching to a real model (pydantic, probably).

But I can remove this type for now.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your line of reasoning but I also think it should be removed for now until we have a specified data type that actually constrains the value it holds.

Comment thread bugwarrior/types.py Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of centralizing types here rather than in the modules in which they're instantiated? To me this pattern seems to obfuscate the dependency graph between modules and possibly encourage such dependence. (Using a type which is only instantiated in one module is effectively dependence on that module.)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, the main reason to put it there is laziness, as I was sure it wouldn't create any dependency issue, but I mostly agree with you.

Would you move this code along Issue and Service in services/__init__.py, or in a new file services/types.py ?

Copy link
Copy Markdown
Collaborator

@ryneeverett ryneeverett May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's such a small amount of code that I would just put it in _init__.py.

Part of the reason I was asking is that I've seen this pattern in other code bases and wondered if there was something special about types.py or if that was considered a best practice for some reason. I think your answer confirmed that there is nothing special about it and no reason to break it out unless we feel a module has become too big.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On bigger codebases, I still think it improves readability, like most projects choose to have an exceptions.py. But I have no problem with dropping this for now

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you move this code along Issue and Service in services/__init__.py, or in a new file services/types.py ?

Is there a reason not to put them in collect.py, the place where they originate and db.py was already importing from?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I moved it back to collect.py

Comment thread bugwarrior/collect.py Outdated
Comment on lines 119 to 120
record = TaskConstructor(issue).get_data_to_sync()
yield record
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
record = TaskConstructor(issue).get_data_to_sync()
yield record
yield TaskConstructor(issue).get_data_to_sync()

@Lotram Lotram force-pushed the move-identifier-function branch from e8e23de to 3315a89 Compare May 28, 2026 11:46
@Lotram
Copy link
Copy Markdown
Collaborator Author

Lotram commented May 28, 2026

Turns out I had a types.py to avoid cyclic dependency issues. I changed two things:

  • get_service is now defined in services/__init__.py, as it makes more sense than to define it in collect.py
  • config/__init__.py now only import (and re-export) objects used in the public API. It used to import all files, creating cyclic dependencies much more "easily".

@Lotram Lotram force-pushed the move-identifier-function branch from 3315a89 to b2bb879 Compare May 28, 2026 11:55
Lotram added 3 commits May 29, 2026 09:49
in aggregate_issues, we have both the Issue object
and the deserialized data, so we can compute the identifier
without relying on a fragile iteration on list of key_lists
Also add some more precise typing
config/__init__.py now only imports objects for the public API
get_service function is now defined in services/__init__.py
@Lotram Lotram force-pushed the move-identifier-function branch from b2bb879 to 56754e9 Compare May 29, 2026 07:53
@ryneeverett
Copy link
Copy Markdown
Collaborator

Turns out I had a types.py to avoid cyclic dependency issues. I changed two things:

* `get_service` is now defined in `services/__init__.py`, as it makes more sense than to define it in `collect.py`

* `config/__init__.py` now only import (and re-export) objects used in the public API. It used to import all files, creating cyclic dependencies much more "easily".

It isn't clear to me that these changes are desirable. Could they be dropped now that the types are in collect.py?

@Lotram
Copy link
Copy Markdown
Collaborator Author

Lotram commented May 29, 2026

I'd keep the change on get_service. IMO, there is really no good reason to have it in collect.py. For now, config, as a modules, depends on collect.py, but it should not. If not in services/__init__.py, I think it should be somewhere in config/ (maybe validation.py ?), so that config does not depend on anything.

If we make sure of that (config not depending on anything), then I can revert the change on config/__init__.py, since we're sure that importing all config files can't be an issue for other modules.

@ryneeverett
Copy link
Copy Markdown
Collaborator

Keeping get_service in services/__init__.py and reverting the config/__init__.py changes makes sense to me. (Now that I look at get_service from a dependency perspective I see what you mean.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants