Data Model
This section documents some of DistKV’s server-internal classes.
This module contains DistKV’s basic data model.
TODO: message chains should be refactored to arrays: much lower overhead.
-
class
distkv.model.
Node
(name, tick=None, cache=None, create=True)¶ Represents one DistKV participant.
-
for ... in
enumerate
(n: int = 0, current: bool = False)¶ Return a list of valid keys for that node.
Used to find data from no-longer-used nodes so they can be deleted.
-
seen
(tick, entry=None, local=False)¶ An event with this tick was in the entry’s chain.
- Parameters
tick – The event affecting the given entry.
entry – The entry affected by this event.
local – The message was not broadcast, thus do not assume that other nodes saw this.
-
is_deleted
(tick)¶ Check whether this tick has been marked as deleted.
-
mark_deleted
(tick)¶ The data for this tick will be deleted.
- Parameters
tick – The event that caused the deletion.
Returns: the entry, if still present
-
clear_deleted
(tick)¶ The data for this tick are definitely gone (deleted).
-
purge_deleted
(r: range_set.RangeSet)¶ All entries in this rangeset are deleted.
This is a shortcut for calling
clear_deleted()
on each item.
-
supersede
(tick)¶ The event with this tick is no longer in the referred entry’s chain. This happens when an entry is updated.
- Parameters
tick – The event that once affected the given entry.
-
report_superseded
(r: range_set.RangeSet, local=False)¶ Some node said that these entries may have been superseded.
- Parameters
range – The RangeSet thus marked.
local – The message was not broadcast, thus do not assume that other nodes saw this.
-
report_missing
(r: range_set.RangeSet)¶ Some node doesn’t know about these ticks.
We may need to broadcast either their content, or the fact that these ticks have been superseded.
-
report_deleted
(r: range_set.RangeSet, server)¶ This range has been reported as deleted.
- Parameters
range (RangeSet) – the range that’s gone.
add (dict) – store additional vanished items. Nodename -> RangeSet
-
local_present
¶ Values I know about
-
local_superseded
¶ Values I knew about
-
local_deleted
¶ Values I know to have vanished
-
local_missing
¶ Values I have not seen, i.e. the inverse of local_present() plus local_superseded().
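As a rough illustration of how these properties relate, the sketch below uses plain Python sets in place of RangeSet objects. It assumes that a node's ticks are numbered from 1 up to its current tick; that numbering, the function name and the sample values are assumptions made for this example and are not taken from DistKV.

```python
def local_missing(tick, present, superseded):
    """Ticks in 1..tick that are neither present nor superseded.

    Hypothetical helper: plain sets stand in for RangeSet objects.
    """
    return set(range(1, tick + 1)) - set(present) - set(superseded)


# The node has issued ticks 1..6; we hold 1 and 4, and know 2 was superseded.
missing = local_missing(6, present={1, 4}, superseded={2})
print(sorted(missing))  # [3, 5, 6]
```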
-
remote_missing
¶ Values from this node which somebody else has not seen
-
kill_this_node
(cache=None)¶ Remove this node from the system. No chain’s first link may point to this node.
-
class
distkv.model.
NodeSet
(encoded=None, cache=None)¶ Represents a dict (nodename -> RangeSet).
-
class
distkv.model.
NodeEvent
(node: distkv.model.Node, tick: Optional[int] = None, prev: Optional[distkv.model.NodeEvent] = None)¶ Represents any event originating at a node.
- Parameters
node – The node thus affected
tick – Counter, timestamp, whatever
prev – The previous event, if any
-
equals
(other)¶ Check whether these chains are equal. Used for ping comparisons.
The last two items may be missing from either chain.
-
find
(node)¶ Return the position of a node in this chain. Zero if the first entry matches.
Returns
None
if not present.
-
filter
(node, server=None)¶ Return an event chain without the given node.
If the node is not in the chain, the result is not a copy.
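The documented behaviour of find() and filter() can be sketched with a small linked list. The Link class below is a hypothetical stand-in written for this example; it is not distkv.model.NodeEvent and ignores the server argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for one link of an event chain."""
    node: str
    tick: int
    prev: Optional["Link"] = None

    def find(self, node):
        """Position of `node` in the chain (0 = first link), or None."""
        pos, cur = 0, self
        while cur is not None:
            if cur.node == node:
                return pos
            cur, pos = cur.prev, pos + 1
        return None

    def filter(self, node):
        """Return a chain without `node`; unchanged chains are not copied."""
        if self.find(node) is None:
            return self  # nothing to remove: hand back the same object
        rest = self.prev.filter(node) if self.prev is not None else None
        if self.node == node:
            return rest
        return Link(self.node, self.tick, rest)


chain = Link("a", 3, Link("b", 2, Link("c", 1)))
without_b = chain.filter("b")
```

Here chain.find("c") is 2, chain.filter("x") returns chain itself, and without_b links "a" directly to "c".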
-
attach
(prev: Optional[distkv.model.NodeEvent] = None, server=None)¶ Copy this node, if necessary, and attach a filtered prev chain to it
-
class
distkv.model.
UpdateEvent
(event: distkv.model.NodeEvent, entry: distkv.model.Entry, new_value, old_value=<class 'distkv.util._impl.NotGiven'>, tock=None)¶ Represents an event which updates something.
-
class
distkv.model.
Entry
(name: str, parent: distkv.model.Entry, tock=None)¶ This class represents one key/value pair
-
SUBTYPE
¶ alias of
distkv.model.Entry
-
follow_acl
(path, *, create=True, nulls_ok=False, acl=None, acl_key=None)¶ Follow this path.
If create is True (default), unknown nodes are silently created. Otherwise they cause a KeyError. If None, assume create=True but only check the ACLs.
If nulls_ok is False (default), None is not allowed as a path element. If 2, it is allowed anywhere; if True, only as the first element.
If acl is not None, then acl_key is the ACL letter to check for. acl must be an ACLFinder created from the root of the ACL in question.
The ACL key ‘W’ is special: it checks ‘c’ if the node is new, else ‘w’.
Returns a (node, acl) tuple.
-
follow
(path, *, create=True, nulls_ok=False)¶ As
follow_acl()
, but isn’t interested in ACLs and only returns the node.
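A minimal sketch of the path-following idea, assuming a simple tree of named nodes. TreeNode is a hypothetical class written for this example; it does not model ACLs, nulls_ok or the create=None case.

```python
class TreeNode:
    """Hypothetical stand-in for a tree entry with named children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}

    def follow(self, path, *, create=True):
        """Walk `path` from this node; create or reject unknown elements."""
        node = self
        for elem in path:
            child = node.children.get(elem)
            if child is None:
                if not create:
                    raise KeyError(elem)
                child = TreeNode(elem, parent=node)
                node.children[elem] = child
            node = child
        return node


root = TreeNode(None)
leaf = root.follow(("a", "b"))                 # silently creates "a" and "b"
same = root.follow(("a", "b"), create=False)   # both exist now, no error
```

With create=False, following a path that contains an unknown element raises KeyError instead of extending the tree.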
-
mark_deleted
(server)¶ This entry has been deleted.
- Returns
the entry’s chain.
-
purge_deleted
()¶ Call
Node.clear_deleted()
on each link in this entry’s chain.
-
await
set_data
(event: distkv.model.NodeEvent, data: Any, server=None, tock=None)¶ This entry is updated by that event.
- Parameters
event – The NodeEvent to base the update on.
data (Any) – whatever the node should contain. Use distkv.util.NotGiven to delete.
- Returns
The
UpdateEvent
that has been generated and applied.
-
await
apply
(evt: distkv.model.UpdateEvent, server=None, root=None, loading=False)¶ Apply this UpdateEvent to me.
Also, forward to watchers.
-
await
walk
(proc, acl=None, max_depth=-1, min_depth=0, _depth=0, full=False)¶ Call coroutine proc on this node and all its children.
If acl (must be an ACLStepper) is given, proc is called with the acl as second argument.
If proc raises StopAsyncIteration, chop this subtree.
-
serialize
(chop_path=0, nchain=2, conv=None)¶ Serialize this entry for msgpack.
- Parameters
chop_path – If <0, do not return the entry’s path. Otherwise, do, but remove the first N entries.
nchain – how many change events to include.
-
await
updated
(event: distkv.model.UpdateEvent)¶ Send an event to the watchers of this node and of all its parents.
-
-
class
distkv.model.
Watcher
(root: distkv.model.Entry, full: bool = False, q_len: Optional[int] = None)¶ This helper class is used as an async context manager plus async iterator. It reports all updates to an entry (or its children).
If a watcher terminates, sending to its channel has blocked. The receiver needs to take appropriate re-syncing action.
ACLs
ACL checks are performed by ACLFinder. This class collects all relevant ACL entries for any given (sub)path, sorted by depth-first specificity: you collect all ACLs that could possibly match a path and sort them, with the + and # wildcards sorted last. The system then picks the first entry that actually has a value.
For example, given a path a b c d e f g and the ACLs a b # g and a # d e f g, the first ACL will match because b is more specific than #, even though the second ACL is longer and thus could be regarded as being more specific. However, the current rule is more stable when used with complex ACLs and thus more secure.
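The ordering rule can be illustrated with a small, self-contained sketch. It is not the ACLFinder implementation: the function names are hypothetical, ACLs are modelled as a dict from pattern tuples to values, + is assumed to match exactly one path element, and # is assumed to match one or more elements.

```python
def matches(pattern, path):
    """True if `pattern` (with + and # wildcards) matches `path`."""
    if not pattern:
        return not path
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # Assumption: "#" consumes one or more path elements.
        return any(matches(rest, path[i:]) for i in range(1, len(path) + 1))
    if not path:
        return False
    if head == "+" or head == path[0]:
        return matches(rest, path[1:])
    return False


def specificity(pattern):
    """Sort key: literals first, then +, then #, compared element by element."""
    rank = {"+": 1, "#": 2}
    return tuple(rank.get(elem, 0) for elem in pattern)


def pick_acl(acls, path):
    """Return the first matching (pattern, value) pair that has a value."""
    candidates = sorted((p for p in acls if matches(p, path)), key=specificity)
    for pattern in candidates:
        if acls[pattern] is not None:
            return pattern, acls[pattern]
    return None


path = ("a", "b", "c", "d", "e", "f", "g")
acls = {
    ("a", "b", "#", "g"): "r",
    ("a", "#", "d", "e", "f", "g"): "rw",
}
chosen = pick_acl(acls, path)
```

With these inputs both patterns match, and chosen is the a b # g entry because its second element is a literal while the other pattern already uses # at that position.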
-
class
distkv.types.
ACLFinder
(acl, blocked=None)¶ A NodeFinder which expects ACL strings as elements
Helper methods and classes
-
class
distkv.util.
MsgWriter
(*a, buflen=65536, **kw)¶ Write a stream of messages to a file (encoded with MsgPack).
Usage:
    async with MsgWriter("/tmp/msgs.pack") as f:
        for msg in some_source_of_messages():  # or "async for"
            await f(msg)
- Parameters
Exactly one of path and stream must be used.
The stream is buffered. Call flush() to flush the buffer.
-
await
flush
()¶ Flush the buffer.
-
distkv.util.
NotGiven
¶ This object marks the absence of information where simply not using the data element or keyword at all would be inconvenient.
For instance, without the NotGiven default in def fn(value=NotGiven, **kw) you’d need to test 'value' in kw, or use an exception. The problem is that this would not show up in the function’s signature.
With NotGiven you can simply test value is (or is not) NotGiven.
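A minimal sketch of the sentinel pattern described here. The NotGiven class below is a local stand-in defined for this example (the real object lives in distkv.util), and describe() is a hypothetical function.

```python
class NotGiven:
    """Local stand-in sentinel; used as a class object, never instantiated."""


def describe(value=NotGiven):
    """Distinguish "no argument" from an explicit None."""
    if value is NotGiven:
        return "no value passed"
    return f"value={value!r}"


print(describe())      # no value passed
print(describe(None))  # value=None
```

Because the default is visible in the signature, callers and documentation tools can see that value is optional, and None remains usable as a regular value.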
This module’s job is to run code and to keep it running.
-
exception
distkv.runner.
NotSelected
¶ This node has not been selected for a very long time. Something is amiss.
-
class
distkv.runner.
RunnerMsg
(msg=None)¶ Superclass for runner-generated messages.
Not directly instantiated.
This message and its descendants take one opaque parameter:
msg
.
-
class
distkv.runner.
ChangeMsg
(msg=None)¶ A message telling your code that some entry has been updated.
Subclass this and use it as CallAdmin.watch’s cls parameter for easier disambiguation.
The runner sets path and value attributes.
-
class
distkv.runner.
MQTTmsg
(msg=None)¶ A message transporting some MQTT data.
value is the MsgPack-decoded content. If that doesn’t exist, the message is not decodable.
The runner also sets the
path
attribute.
-
class
distkv.runner.
ReadyMsg
(msg=None)¶ This message is queued when the last watcher has read all data.
-
class
distkv.runner.
TimerMsg
(msg=None)¶ A message telling your code that a timer triggers.
Subclass this and use it as CallAdmin.timer’s
cls
parameter for easier disambiguation.
-
class
distkv.runner.
CallAdmin
(runner, state, data)¶ This class collects some standard tasks which async DistKV-embedded code might want to do.
-
await
cancel
()¶ Cancel the running task
-
await
spawn
(proc, *a, **kw)¶ Start a background subtask.
The task is auto-cancelled when your code ends.
- Returns: an anyio.abc.CancelScope which you can use to cancel the
subtask.
-
await
setup_done
(**kw)¶ Call this when your code has successfully started up.
-
await
error
(path=None, **kw)¶ Record that an error has occurred. This function records specific error data, then raises ErrorRecorded which the code is not supposed to catch.
See distkv.errors.ErrorRoot.record_error for keyword details. The
path
argument is auto-filled to point to the current task.
-
await
watch
(path, cls=<class 'distkv.runner.ChangeMsg'>, **kw)¶ Create a watcher. This path is monitored as per distkv.client.Client.watch; messages are encapsulated in ChangeMsg objects. A ReadyMsg will be sent when all watchers have transmitted their initial state.
By default a watcher will only monitor a single entry. Set max_depth if you also want child entries.
By default a watcher will not report existing entries. Set fetch=False if you want them.
-
await
send
(path, value=<class 'distkv.util._impl.NotGiven'>, raw=None)¶ Publish an MQTT message.
Set either value or raw.
-
await
set
(path, value, chain=<class 'distkv.util._impl.NotGiven'>)¶ Set a DistKV value.
-
await
get
(path, value)¶ Get a DistKV value.
-
await
monitor
(path, cls=<class 'distkv.runner.MQTTmsg'>, **kw)¶ Create an MQTT monitor. Messages are encapsulated in MQTTmsg objects.
By default a monitor will only monitor a single entry. You may use MQTT wildcards.
The message is decoded and stored in the value attribute unless it’s either undecodable or raw is set, in which case it’s stored in .msg. The topic the message was sent to is in topic.
-
class
distkv.runner.
RunnerEntry
(*a, **k)¶ An entry representing some hopefully-running code.
The code will run some time after target has passed. On success, it will run again repeat seconds later (if >0). On error, it will run delay seconds later (if >0), multiplied by 2**backoff.
- Parameters
code (list) – pointer to the code that’s to be started.
data (dict) – additional data for the code.
delay (float) – time before restarting the job on error. Default 100.
repeat (float) – time before restarting on success. Default: zero: no restart.
target (float) – time the job should be started at. Default: zero: don’t start.
ok_after (float) – the job is marked OK if it has run this long. Default: zero: the code will do that itself.
backoff (float) – Exponential back-off factor on errors. Default: 1.1.
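The restart rule described above can be sketched as a small function. This takes the documented formula literally; restart_delay is a hypothetical helper, and how the backoff exponent changes between failures is not covered here.

```python
from typing import Optional


def restart_delay(ok: bool, *, repeat: float = 0.0, delay: float = 100.0,
                  backoff: float = 0.0) -> Optional[float]:
    """Seconds to wait before the next run, or None for "do not restart".

    Hypothetical helper following the documented rule:
    success -> `repeat` seconds, error -> `delay * 2**backoff` seconds.
    """
    if ok:
        return repeat if repeat > 0 else None
    if delay <= 0:
        return None
    return delay * 2 ** backoff


print(restart_delay(True, repeat=60.0))    # 60.0
print(restart_delay(False, backoff=2.0))   # 400.0
```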
The code runs with these additional keywords:
_self: the CallEnv object, which the task can use to actually do things.
_client: the DistKV client connection.
_info: a queue which the task can use to receive events. A message of None signals that the queue was overflowing and no further messages will be delivered. Your task should use that as its main loop.
_P: build a path from a string.
_Path: build a path from its arguments.
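A main loop built around the _info queue might look roughly like the sketch below. It is an illustration only: asyncio.Queue stands in for whatever queue type the runner actually passes, and the two message classes are hypothetical stand-ins for the ones in distkv.runner.

```python
import asyncio


class ChangeMsg:
    """Hypothetical stand-in for distkv.runner.ChangeMsg."""
    def __init__(self, msg=None):
        self.msg = msg


class TimerMsg:
    """Hypothetical stand-in for distkv.runner.TimerMsg."""
    def __init__(self, msg=None):
        self.msg = msg


async def main_loop(info):
    """Consume messages until the overflow marker (None) arrives."""
    handled = []
    while True:
        msg = await info.get()
        if msg is None:
            # The queue overflowed; no further messages will be delivered.
            handled.append("overflow")
            break
        if isinstance(msg, ChangeMsg):
            handled.append("change")
        elif isinstance(msg, TimerMsg):
            handled.append("timer")
        else:
            handled.append("other")
    return handled


async def demo():
    info = asyncio.Queue()  # assumption: stands in for the runner's queue
    for item in (ChangeMsg("x"), TimerMsg(), None):
        info.put_nowait(item)
    return await main_loop(info)


result = asyncio.run(demo())
```

Dispatching on the message class is what makes subclassing ChangeMsg or TimerMsg useful for disambiguation.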
Some possible messages are defined in distkv.actor.
-
await
send_event
(evt)¶ Send an event to the running process.
-
await
set_value
(value)¶ Process incoming value changes
-
should_start
()¶ Tell whether this job might want to be started.
- Returns
False: No, it’s running (or has run and doesn’t restart).
0: No, it should not start.
>0: timestamp at which it should start, or should have started.
-
class
distkv.runner.
RunnerNode
(root, name)¶ Represents all nodes in this runner group.
This is used for load balancing and such. TODO.
-
class
distkv.runner.
StateEntry
(parent, name=None)¶ This is the actual state associated with a RunnerEntry. It must only be managed by the node that actually runs the code.
- Parameters
started (float) – timestamp when the job was last started
stopped (float) – timestamp when the job last terminated
pinged (float) – timestamp when the state was last verified by the runner
result (Any) – the code’s return value
node (str) – the node running this code
backoff (float) – on error, the multiplier to apply to the restart timeout
computed (float) – computed start time
reason (str) – reason why (not) starting
-
result
¶ alias of
distkv.util._impl.NotGiven
-
class
distkv.runner.
StateRoot
(client, path, *, need_wait=False, cfg=None, require_client=True)¶ Base class for handling the state of entries.
This is separate from the RunnerRoot hierarchy because the latter may be changed by anybody while this subtree may only be affected by the actual runner. Otherwise we get interesting race conditions.
-
await
kill_stale_nodes
(names)¶ States with node names in the “names” set are stale. Kill them.
-
class
distkv.runner.
AnyRunnerRoot
(*a, **kw)¶ This class represents the root of a code runner. Its job is to start (and periodically restart, if required) the entry points stored under it.
AnyRunnerRoot tries to ensure that the code in question runs on one single cluster member. In case of a network split, the code will run once in each split area until the split is healed.
-
max_age
¶ Timeout after which we really should have gotten another go
-
await
find_stale_nodes
(cur)¶ Find stale nodes (i.e. last seen < cur) and clean them.
-
-
class
distkv.runner.
SingleRunnerRoot
(*a, **kw)¶ This class represents the root of a code runner. Its job is to start (and periodically restart, if required) the entry points stored under it.
While AnyRunnerRoot tries to ensure that the code in question runs on any cluster member, this class runs tasks on a single node. The code is able to check whether any and/or all of the cluster’s main nodes are reachable; this way, the code can default to local operation if connectivity is lost.
Local data (dict):
- Parameters
cores (tuple) – list of nodes whose reachability may determine whether the code uses local/emergency/??? mode.
Config file:
- Parameters
-
max_age
¶ Timeout after which we really should have gotten another ping
-
class
distkv.runner.
AllRunnerRoot
(*a, **kw)¶ This class represents the root of a code runner. Its job is to start (and periodically restart, if required) the entry points stored under it.
This class behaves like SingleRunnerRoot, except that it runs tasks on all nodes.
This module implements an asyncactor.Actor which works on top of a DistKV client.
-
class
distkv.actor.
ActorState
(msg=None)¶ Base class for states.
-
class
distkv.actor.
BrokenState
(msg=None)¶ I have no idea what’s happening, probably nothing good
-
class
distkv.actor.
DetachedState
(msg=None)¶ I am detached, my actor group is not visible
-
class
distkv.actor.
PartialState
(msg=None)¶ Some but not all members of my actor group are visible
-
class
distkv.actor.
CompleteState
(msg=None)¶ All members of my actor group are visible