difftrack is a tool for keeping track of changes in data structures.
It makes it possible for multiple "listeners" to see
changes in a dict, a list or any other data structure you want to
observe and support (these structures are called "dispatchers").
difftrack has two main classes:
Dispatcher - acts like a data structure you write to but also sends all changes (diffs) to all its listeners.
Listener - a listener is connected to one dispatcher and applies incoming diffs to its internal structure so each listener looks like the original data structure after applying all those diffs.
This division allows difftrack to have multiple listeners in
different stages of applying diffs, and it enables listeners
with special abilities (e.g. difftrack.utils.BoundedListDiffHandler
implementing a "top N" list: the list never exceeds a certain fixed size
but when some items are deleted, previously invisible elements appear).
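The split between the two roles can be pictured with a toy version written from scratch. This is only a sketch, not difftrack's implementation: SketchDispatcher, SketchListener, the receive method and the stand-in ListDiff enum are made up for illustration, and only the public method names and the (operation, index, value) diff shape are borrowed from this README.

```python
from enum import Enum


class ListDiff(Enum):
    """Stand-in for difftrack.ListDiff; values follow the reprs shown in this README."""
    INSERT = 0
    REPLACE = 1
    DELETE = 2


class SketchListener:
    """Queues incoming diffs and applies them only when asked."""

    def __init__(self):
        self._pending = []   # diffs received but not yet applied
        self._snapshot = []  # local copy of the observed list

    def receive(self, diff):
        # Hypothetical hook called by the dispatcher for every diff.
        self._pending.append(diff)

    def get_new_diffs(self):
        # Hand out the queued diffs and apply them to the local snapshot.
        diffs, self._pending = self._pending, []
        for op, index, value in diffs:
            if op is ListDiff.INSERT:
                self._snapshot.insert(index, value)
            elif op is ListDiff.REPLACE:
                self._snapshot[index] = value
            elif op is ListDiff.DELETE:
                del self._snapshot[index]
        return diffs

    def get_snapshot(self):
        return list(self._snapshot)


class SketchDispatcher:
    """Forwards every write to all registered listeners as a diff."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def _dispatch(self, diff):
        for listener in self._listeners:
            listener.receive(diff)

    def insert(self, index, value):
        self._dispatch((ListDiff.INSERT, index, value))

    def __setitem__(self, index, value):
        self._dispatch((ListDiff.REPLACE, index, value))

    def __delitem__(self, index):
        self._dispatch((ListDiff.DELETE, index, None))


dispatcher = SketchDispatcher()
fast, slow = SketchListener(), SketchListener()
dispatcher.add_listener(fast)
dispatcher.add_listener(slow)

dispatcher.insert(0, 'AAA')
fast.get_new_diffs()           # fast applies the insert right away
dispatcher.insert(0, 'BBB')    # slow now has two diffs pending, fast has one
```

Because each listener keeps its own queue, fast has already applied the first insert while slow has applied nothing yet.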
In the following example we create a list dispatcher (you can
write to it as you would to a list, using __setitem__, __delitem__
and insert) and two listeners that listen for diffs and keep
their own internal state.
>>> import difftrack
>>> dispatcher = difftrack.ListDispatcher()
>>> listener1 = difftrack.ListListener()
>>> listener2 = difftrack.ListListener()
>>> dispatcher.add_listener(listener1)
>>> dispatcher.add_listener(listener2) # create listeners and add them to dispatcher
>>> dispatcher.insert(0, 'AAA') # insert string 'AAA' to the first position in list
>>> listener1.get_snapshot() # Diffs are not applied until get_new_diffs() is called
[]
>>> listener1.get_new_diffs() # now we get all diffs that have not been processed yet
[(difftrack.ListDiff.INSERT, 0, 'AAA')]
>>> listener1.get_snapshot() # and we see that listener1's snapshot now contains what we expect
['AAA']
>>> listener2.get_snapshot() # second listener still hasn't got anything because we haven't read its diffs
[]
>>> dispatcher.insert(0, 'BBB') # insert a new string 'BBB' at the first position
>>> listener1.get_new_diffs() # we need to read new diffs to get current state
[(difftrack.ListDiff.INSERT, 0, 'BBB')]
>>> listener1.get_snapshot() # we inserted 'BBB' to first position so 'AAA' was moved to second position
['BBB', 'AAA']
>>> del dispatcher[0] # remove the first element from the list (now 'BBB')
>>> listener1.get_new_diffs()
[(difftrack.ListDiff.DELETE, 0, None)]
>>> listener1.get_snapshot() # we deleted 'BBB' so only 'AAA' remains
['AAA']
>>> dispatcher[0] = 'CCC' # overwrite the first element
>>> listener1.get_new_diffs()
[(difftrack.ListDiff.REPLACE, 0, 'CCC')]
>>> listener1.get_snapshot() # the first and only element in the list was overwritten
['CCC']
>>> listener2.get_new_diffs() # finally get all diffs for listener2
[(<ListDiff.INSERT: 0>, 0, 'AAA'),
(<ListDiff.INSERT: 0>, 0, 'BBB'),
(<ListDiff.DELETE: 2>, 0, None),
(<ListDiff.REPLACE: 1>, 0, 'CCC')]
>>> listener2.get_snapshot() # listener2 is now also up to date
['CCC']
Similarly you can use difftrack with DictDispatcher and
DictListener: you write your changes to an instance of
DictDispatcher and after applying diffs to listeners you can get a
snapshot of the current dictionary state.
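As a rough picture of what replaying dict diffs onto a snapshot involves, here is a self-contained sketch. The DictDiff enum below is a stand-in with arbitrary values and apply_dict_diffs is a hypothetical helper, not a difftrack function; only the (operation, key, value) tuple shape is taken from this README.

```python
from enum import Enum


class DictDiff(Enum):
    """Stand-in for difftrack.DictDiff; the member values here are arbitrary."""
    SET = 'set'
    DELETE = 'delete'


def apply_dict_diffs(snapshot, diffs):
    """Replay (operation, key, value) diffs onto a copy of `snapshot`."""
    result = dict(snapshot)
    for op, key, value in diffs:
        if op is DictDiff.SET:
            result[key] = value
        elif op is DictDiff.DELETE:
            result.pop(key, None)  # assumption: deleting a missing key is tolerated
    return result


diffs = [
    (DictDiff.SET, 'x', 123),
    (DictDiff.SET, 'y', 456),
    (DictDiff.DELETE, 'x', None),
]
state = apply_dict_diffs({}, diffs)
```

After the three diffs only the key 'y' is left in state, since 'x' was set and then deleted.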
We can also add a callback to a listener so that we are notified when a diff comes:
>>> import difftrack
>>> dispatcher = difftrack.ListDispatcher()
>>> def double_inserted_items(dtype, index, value):
...     ''' This generates a new diff *while the current one is processed!* '''
...     if dtype is difftrack.ListDiff.INSERT:
...         dispatcher[index] = value * 2
>>> listener = difftrack.ListListener(on_change = double_inserted_items) # set function as a callback
>>> dispatcher.add_listener(listener)
>>> dispatcher.insert(0, 7) # insert 7 at index 0 and expect that the result will be doubled
>>> listener.get_new_diffs()
[
(difftrack.ListDiff.INSERT, 0, 7),
(difftrack.ListDiff.REPLACE, 0, 14)
]
>>> listener.get_snapshot()
[14]
In this example we show the on_change callback and its ability to
work with a dispatcher. Note that we first perform a
ListDiff.INSERT operation, but the callback triggers a
ListDiff.REPLACE operation. If the callback triggered another ListDiff.INSERT,
it would recurse; after 10 iterations difftrack gives up and
raises an exception.
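A depth counter is one common way to build such a guard. The sketch below is a toy written from scratch: the SketchDispatcher class, the RuntimeError and the exact way nesting is counted are assumptions for illustration, and only the limit of 10 comes from the text above.

```python
class SketchDispatcher:
    """Toy dispatcher that refuses to nest callback-triggered writes too deeply."""

    MAX_DEPTH = 10  # limit quoted in the text; how difftrack counts it is an assumption

    def __init__(self, on_change):
        self._on_change = on_change
        self._depth = 0   # how many insert() calls are currently on the stack
        self.data = []

    def insert(self, index, value):
        self._depth += 1
        try:
            if self._depth > self.MAX_DEPTH:
                # Assumed exception type; the README does not name the real one.
                raise RuntimeError('too many nested dispatches')
            self.data.insert(index, value)
            self._on_change(self, index, value)
        finally:
            self._depth -= 1  # always unwind, even when an exception propagates


def insert_again(dispatcher, index, value):
    # Every insert triggers another insert, so this never settles on its own.
    dispatcher.insert(index, value + 1)


dispatcher = SketchDispatcher(insert_again)
try:
    dispatcher.insert(0, 0)
except RuntimeError as exc:
    error = exc
else:
    error = None
```

In this sketch the writes made before the limit is hit stay in place; whether difftrack keeps or discards them is not stated in this README.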
The dispatcher may communicate to its listeners that a certain sequence of diffs belongs together, i.e. forms a batch. We do this by using the dispatcher as a context manager, wrapping the diff operations that belong together.
A listener may provide another callback called on_finalize_batch that
gets called every time the dispatcher finishes dispatching a batch
(the context is exited).
>>> import difftrack
>>> dispatcher = difftrack.DictDispatcher()
>>> def finalize():
...     print('FINALIZED')
>>> def on_change(*args):
...     print('CHANGE')
>>> listener = difftrack.DictListener(on_change = on_change, on_finalize_batch = finalize)
>>> dispatcher.add_listener(listener)
>>> with dispatcher: # use the dispatcher as a context manager
...     dispatcher[0] = 0
...     dispatcher[1] = 1
...     dispatcher[2] = 2
CHANGE
CHANGE
CHANGE
FINALIZED
We can see that the on_change callback is called on every change, but
on_finalize_batch is called only when we exit the context.
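The batch signal maps naturally onto Python's context manager protocol. The following toy class is a sketch under assumptions, not difftrack's code: SketchBatchDispatcher is a made-up name, it takes its callbacks directly instead of going through a listener, and it fires the finalize callback even when the block raises, which may differ from the library.

```python
class SketchBatchDispatcher:
    """Toy dict dispatcher that signals the end of a batch when its context exits."""

    def __init__(self, on_change, on_finalize_batch):
        self._on_change = on_change
        self._on_finalize_batch = on_finalize_batch

    def __setitem__(self, key, value):
        self._on_change(key, value)   # fires for every single write

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._on_finalize_batch()     # fires once per `with` block
        return False                  # do not swallow exceptions


events = []
dispatcher = SketchBatchDispatcher(
    on_change=lambda key, value: events.append('CHANGE'),
    on_finalize_batch=lambda: events.append('FINALIZED'),
)
with dispatcher:
    dispatcher[0] = 0
    dispatcher[1] = 1
```

The events list ends up with one 'CHANGE' entry per write followed by a single 'FINALIZED', mirroring the printed output of the example above.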
There are several utilities that you might find useful.
Data mapper applies a function to every data field:
>>> import difftrack
>>> def mapper(data: str) -> str:
...     return data.lower()
>>> dispatcher = difftrack.ListDispatcher()
>>> listener = difftrack.ListListener()
>>> dispatcher.add_listener(difftrack.data_mapper(mapper)(listener))
>>> dispatcher.insert(0, 'AAA')
>>> dispatcher.insert(0, 'BBB')
>>> listener.get_new_diffs()
[
(difftrack.ListDiff.INSERT, 0, 'aaa'),
(difftrack.ListDiff.INSERT, 0, 'bbb')
]
>>> listener.get_snapshot()
['bbb', 'aaa']
When you update a dict item several times, or even delete it, you sometimes don't want to keep all the changes. You can use compaction to drop changes that cancel each other out or are overridden by later ones:
>>> diffs = [
(difftrack.DictDiff.SET, 'x', 123),
(difftrack.DictDiff.SET, 'y', 456),
(difftrack.DictDiff.SET, 'y', 9999),
(difftrack.DictDiff.DELETE, 'x', None),
]
>>> difftrack.compact_dict_diffs(diffs)
[
(difftrack.DictDiff.SET, 'y', 9999),
(difftrack.DictDiff.DELETE, 'x', None),
]
The same kind of compaction is available for lists as well:
>>> diffs = [
(difftrack.ListDiff.INSERT, 0, 'aaa'),
(difftrack.ListDiff.INSERT, 1, 'bbb'),
(difftrack.ListDiff.DELETE, 0, None),
(difftrack.ListDiff.REPLACE, 1, 'ccc'),
]
>>> difftrack.compact_list_diffs(diffs)
[
(difftrack.ListDiff.INSERT, 1, 'ccc'),
]
If we want to keep our list bounded (capped to a certain size) we can use
difftrack.BoundedListDiffHandler.
>>> import difftrack
>>> listener = difftrack.ListListener()
>>> dispatcher = difftrack.ListDispatcher()
>>> dispatcher.add_listener(difftrack.BoundedListDiffHandler(listener, 2)) # bound listener to 2 elements
>>> dispatcher.insert(0, 'a')
>>> dispatcher.insert(1, 'b')
>>> dispatcher.insert(2, 'c')
>>> dispatcher.insert(3, 'd')
>>> listener.get_new_diffs()
[
(difftrack.ListDiff.INSERT, 0, 'a'),
(difftrack.ListDiff.INSERT, 1, 'b'),
]
>>> listener.get_snapshot()
['a', 'b']
>>> del dispatcher[0]
>>> listener.get_new_diffs() # 'a' is deleted and 'c' moves to the empty index 1
[
(<ListDiff.DELETE: 2>, 0, None),
(<ListDiff.INSERT: 0>, 1, 'c')
]
>>> listener.get_snapshot()
['b', 'c']
difftrack.squash_list_diffs groups list diffs affecting consecutive indices.
>>> import difftrack
>>> diffs = [
(difftrack.ListDiff.INSERT, 1, 'A'),
(difftrack.ListDiff.INSERT, 2, 'B'),
(difftrack.ListDiff.INSERT, 3, 'C'),
(difftrack.ListDiff.REPLACE, 1, 'D'),
(difftrack.ListDiff.DELETE, 1, [])
]
>>> list(difftrack.squash_list_diffs(diffs))
[
SquashResults(operation=<difftrack.ListDiff.INSERT: 0>, start=1, stop=1, payload=['A', 'B', 'C']),
SquashResults(operation=<difftrack.ListDiff.REPLACE: 1>, start=1, stop=2, payload=['D']),
SquashResults(operation=<difftrack.ListDiff.DELETE: 2>, start=1, stop=2, payload=[])
]
You can see that the three consecutive inserts are squashed into a single message. Note that the result is no longer a difftrack diff.
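One way to read the start and stop fields is as slice bounds, so that a squashed insert becomes a single slice assignment. That reading is an inference from the output above, not something this README states. The sketch below handles inserts only and uses a hypothetical helper, squash_inserts, working on plain (index, value) pairs rather than difftrack diffs.

```python
def squash_inserts(inserts):
    """Group (index, value) inserts at consecutive indices into (start, payload) runs."""
    runs = []
    for index, value in inserts:
        if runs and index == runs[-1][0] + len(runs[-1][1]):
            runs[-1][1].append(value)      # index directly follows the current run
        else:
            runs.append((index, [value]))  # gap or first item: start a new run
    return runs


target = ['x', 'y']
for start, payload in squash_inserts([(1, 'A'), (2, 'B'), (3, 'C')]):
    target[start:start] = payload          # one slice assignment per run
```

Applying the single run with one slice assignment leaves target in the same state as three separate insert calls would, which is the saving that squashing is after.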