Handle a question whose type changes but name doesn't.#187
Handle a question whose type changes but name doesn't.#187noliveleger wants to merge 18 commits intomainfrom
Conversation
…51-question-with-different-types
… with different types
… but different type
| if field.has_stats and field.name not in fields_to_skip: | ||
| counter = metrics[field.contextual_name] | ||
| raw_value = entry.get(field.path) | ||
| field_contextual_name = "{}_{}_{}".format( |
There was a problem hiding this comment.
(self: revise this) i guess this is the expected contextual name, used for comparison not assignment?
possible to use the create_unique_name() method to avoid hard-coding this here?
https://github.com/kobotoolbox/formpack/pull/187/files#diff-7845025cfa4cd9a06d7a4399e1ba9c0bR58
or, maybe if field.contextual_name.startswith("{}_{}_v".format( below could be changed to something like if field.use_unique_name?
There was a problem hiding this comment.
@jnm: I now remember why I did this comparison this way.
We have
if field.contextual_name.startswith("{}_{}_v".format(
field.name,
field.data_type
)):
# We have a match, we won't evaluate other fields with same name
# for this submission.
if field.contextual_name == field_contextual_name:
fields_to_skip.append(field.name)
else:
continue
Let's explain:
All fields that are added by get_fields_for_versions (because another type has been detected in the latest version) match the pattern name_data_type_version_id.
- Trivial case: Pattern doesn't match. Data belongs to field. Skip first if statement.
- Other case: We have a match with this pattern
version_idis the same. Data belongs to this field. All other fields with same name will be skipped for this submission.version_idis not the same, we want to iterate over loop immediately because the data
should be appended to the corresponding field (or column in the export) only - which is a further occurrence of the loop.
The {}_{}_v pattern is used to avoid matching another field name.
For example:
- Question 1: restaurant
- Question 2: restaurant_text
Adding the "_v" in the pattern decreases the risk of matching the wrong field.
Obviously if all these rules are true:
- Form has several versions
- Form has a question named
restauranthas multiple types among versions - One of its types is
text - Form has a question name
restaurant_text_v
Export will have data shifted again. The comparison would be a regex to decrease the risk a little bit more.
Something like _v[\d\w]{21,}. What do you think?
Having this said: Your suggestion to use create_unique_name is really good idea.
About the idea of maybe if field.contextual_name.startswith("{}_{}_v".format( below could be changed to something like if field.use_unique_name?
Actually it can't.
In case the form has more than 3 versions and each version contains a question with the same name but 3 different types.
In that case fields should be:
[<FormField type="text" contextual_name="question_text_v123456">,
<FormField type="integer" contextual_name="question_integer_v234567">,
<FormField type="date_time" contextual_name="question">]
Let's say, we have 1 submission for each version where value of the question question would be
- Submission 1:
answer - Submission 2:
1 - Submission 3:
2018-12-18 00:00:00
Using field.use_unique_name, submission 1 and 2 would be matched with field question_text_v123456 because both fields question_text_v123456 and question_integer_v234567 have use_unique_name property equals True. Using the string comparison submission 2 can't be matched with question_text_v123456 (because its version_id is not the same)
|
|
||
| def __repr__(self): | ||
| return "<%s name='%s'>" % (self.__class__.__name__, self.name) | ||
| return "<%s name='%s'>" % (self.__class__.__name__, self.contextual_name) |
There was a problem hiding this comment.
note: everywhere else, this is changed to contextual_name='%s'
There was a problem hiding this comment.
I have hesitated between using self.name for FormDataDef or changing the representation string because contextual_name is more related to FormField than its parent class.
Still wondering...
|
There's still a problem when disaggregating (using "group by") in the autoreport after I changed an Sentry has the details, but make sure to switch to "Full" view instead of "App Only": https://sentry.kbtdev.org/kobo/kpi-backend/issues/18374/ I'm guessing Bonus: what the heck causes |
|
@jnm: Nice catch. |
…rder to match data with correct field
|
@jnm: Sentry is savior but he is perhaps the bad guy as well. |
whitespace and `contextual_name` changes
|
@noliveleger what's the way forward on this one? |
@joshuaberetta last discussion about that with @jnm was, we wanted to think about it a little bit more. |


Fixes #151.
Fixes kobotoolbox/kpi#2109
Covers #183 and #185.