The lease package exists to implement distributed lease management on top of mgo/txn, and to expose assert operations that allow us to gate other mgo/txn transactions on lease state. This necessity has affected the package; but, apart from leaking assertion operations, it functions as a distributed lease-management system with various useful properties.

These properties of course rest upon assumptions; ensuring the validity of the following statements is the job of the client.

* The lease package has exclusive access to any collection it's configured
  to use. (Please don't store anything else in there.)

* Given any (collection, namespace) pair, any client Id will be unique at any
  given point in time. (Run no more than one per namespace per server, and
  identify them according to where they're running.)

* Time passes at approximately the same rate for all clients. (Note that
  the clients do *not* have to agree what time it is, or what time zone
  anyone is in: just that 1s == 1s. This is likely to be true already if
  you use lease.SystemClock{}.)

So long as the above holds true, the following statements will too:

* A successful ClaimLease guarantees lease ownership until *at least* the
  requested duration after the start of the call. (It does *not* guarantee
  any sort of timely expiry.)

* A successful ExtendLease makes the same guarantees. (In particular, note
  that this cannot cause a lease to be shortened; but that success may
  indicate ownership is guaranteed for longer than requested.)

* ExpireLease will only succeed when the most recent writer of the lease is
  known to believe the time is after the expiry time it wrote.
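
As a rough illustration of the first two guarantees, a caller can note the time
before making the call and treat "start plus requested duration" as the earliest
moment at which ownership might lapse. This is a minimal sketch: claimFunc and
heldUntil are hypothetical names invented here, not part of the lease package,
and the claim is faked rather than sent to a database.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// claimFunc is a hypothetical stand-in for a call such as ClaimLease or
// ExtendLease; the package's real signatures are not assumed here.
type claimFunc func(d time.Duration) error

// heldUntil reads the clock *before* making the call, because the guarantee
// runs from the start of the call rather than from its return. On success it
// returns the earliest time at which ownership might lapse; the lease may
// well last longer than that.
func heldUntil(now func() time.Time, claim claimFunc, d time.Duration) (time.Time, error) {
	start := now()
	if err := claim(d); err != nil {
		return time.Time{}, err
	}
	return start.Add(d), nil
}

// demo runs heldUntil against a fixed clock with one succeeding and one
// failing fake claim.
func demo() string {
	now := func() time.Time { return time.Date(2000, 1, 1, 1, 0, 0, 0, time.UTC) }
	succeed := func(time.Duration) error { return nil }
	fail := func(time.Duration) error { return errors.New("lease already held") }

	until, _ := heldUntil(now, succeed, 30*time.Second)
	_, err := heldUntil(now, fail, 30*time.Second)
	return fmt.Sprintf("%s / %v", until.Format("15:04:05"), err)
}

func main() {
	fmt.Println(demo()) // prints "01:00:30 / lease already held"
}
```

Taking the clock reading after the call returned would overstate the guarantee
by however long the call took.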

Remarks on clock skew
---------------------

When expiring a lease (or determining whether it needs to be extended) we only need to care about the writer, because everybody else is determining their own skew relative only to the writer. That is, assuming 3 clients:

A) knows "the real time"; wrote a lease at 01:00:00, expiring at 01:00:30A
B) is 20 seconds ahead; read the lease between 01:00:23B and 01:00:24B
C) is 5 seconds behind; read the lease between 00:59:57C and 00:59:58C

...then B cannot infer an expiry time earlier than 01:00:54B (=01:00:34A) and C cannot infer an expiry time earlier than 01:00:28C (=01:00:33A). If A fails to expire its lease, then C will trigger first and try to expire it, and most likely succeed; and when C succeeds, B's subsequent attempt to expire the lease will certainly fail, because C has updated both the clock document and the lease document and invalidated B's assertions.

So B can and does then Refresh; and sees the lease document written by C, and now needs only to consider its offset relative to C in order to Do The Right Thing.
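
The arithmetic in the example above can be reproduced with a few lines of Go.
This is only a sketch of the reasoning, not the package's implementation:
latestLocalExpiry and at are hypothetical helpers, and the calculation assumes
the reader knows both the expiry and the time the writer acknowledged when it
wrote (each on the writer's clock).

```go
package main

import (
	"fmt"
	"time"
)

// at builds a wall-clock reading on an arbitrary fixed day.
func at(h, m, s int) time.Time {
	return time.Date(2000, 1, 1, h, m, s, 0, time.UTC)
}

// latestLocalExpiry is a hypothetical helper. written and expiry are readings
// of the writer's clock; readEnd is the reader's own clock at the moment its
// read completed. The writer's clock had already reached written when the
// read completed, so at most expiry-written remained at that point. The
// reader never needs to know the actual offset between the two clocks.
func latestLocalExpiry(written, expiry, readEnd time.Time) time.Time {
	remaining := expiry.Sub(written)
	return readEnd.Add(remaining)
}

func main() {
	written, expiry := at(1, 0, 0), at(1, 0, 30) // on A's clock

	b := latestLocalExpiry(written, expiry, at(1, 0, 24))  // B finished reading at 01:00:24B
	c := latestLocalExpiry(written, expiry, at(0, 59, 58)) // C finished reading at 00:59:58C

	fmt.Println("B:", b.Format("15:04:05")) // prints "B: 01:00:54"
	fmt.Println("C:", c.Format("15:04:05")) // prints "C: 01:00:28"
}
```

Using the end of the read window rather than the start is the conservative
choice: the write may have landed at any point before the read completed.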

Schema design
-------------

For each namespace, we store a single clock document; and one additional document per lease. The lease document holds the name, holder, expiry, and writer of the lease; the clock document contains the most recent time acknowledged by each client that has written to the namespace.

Every transaction that the lease package makes is gated on a write to the clock document, which *must* precede any lease operations. That write acks a recent time, and fails if time appears to be going backward. (This could happen if we crashed at the wrong moment and left a transaction queued but unprepared for some time: we definitely don't want to accept those operations.)
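
The gating can be modelled in plain Go, leaving mgo/txn out entirely. The sketch
below is an in-memory simplification under assumptions made for illustration:
clockDoc, ack, and run are hypothetical names, times are bare integers, and a
write carrying the same time as the last ack is accepted. The real package
expresses this as transaction operations rather than method calls.

```go
package main

import (
	"errors"
	"fmt"
)

var errBackward = errors.New("clock write would move time backward")

// clockDoc is a simplified in-memory model of the per-namespace clock
// document: the most recent time acknowledged by each writer.
type clockDoc map[string]int64

// ack records now for writer, refusing any write that would go backward.
func (c clockDoc) ack(writer string, now int64) error {
	if last, ok := c[writer]; ok && now < last {
		return errBackward
	}
	c[writer] = now
	return nil
}

// run models a gated transaction: the clock write comes first, and the lease
// operation is applied only if that write succeeds.
func (c clockDoc) run(writer string, now int64, leaseOp func()) error {
	if err := c.ack(writer, now); err != nil {
		return err
	}
	leaseOp()
	return nil
}

// demo applies a current claim, then a stale operation prepared at an
// earlier time, and reports what happened to each.
func demo() string {
	clock := clockDoc{}
	holders := map[string]string{}

	fresh := clock.run("machine-0", 100, func() { holders["singular"] = "machine-0" })
	stale := clock.run("machine-0", 90, func() { delete(holders, "singular") })

	return fmt.Sprintf("%v, %v, holder=%s", fresh, stale, holders["singular"])
}

func main() {
	// prints "<nil>, clock write would move time backward, holder=machine-0"
	fmt.Println(demo())
}
```

The stale operation never touches the lease state, which is the property the
clock write exists to provide.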

The fact that the clock document is involved in every transaction renders it a per-namespace bottleneck, but the ability to discard outdated transactions is valuable; and the centralised record of acknowledged times mitigates the impact of client failure.

That is to say: assuming client C wrote lease L at time T, and wrote lease M at time U (later than T); and then failed; then a fresh client D will be able to expire lease L earlier (by U-T) than it could infer with the information in lease L alone.

(We could of course still calculate that by storing a written time in each lease document, but it'd be more hassle to collate the data, harder to inspect the database, and would only be able to make much weaker anti-time-travel promises than we can manage with the clock doc.)
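
To put numbers on the U-T saving described above, here is a small worked
example. The times and the 30s lease are invented for illustration, and
maxRemaining is a hypothetical helper rather than anything the package exports.

```go
package main

import (
	"fmt"
	"time"
)

// at builds a wall-clock reading on an arbitrary fixed day.
func at(h, m, s int) time.Time {
	return time.Date(2000, 1, 1, h, m, s, 0, time.UTC)
}

// maxRemaining bounds how long a fresh reader must wait before it can expire
// a lease, given the latest time the writer is known to have acknowledged.
// Both arguments are readings of the writer's clock.
func maxRemaining(expiry, writerAcked time.Time) time.Duration {
	return expiry.Sub(writerAcked)
}

func main() {
	t := at(1, 0, 0)        // T: C wrote lease L
	u := at(1, 0, 20)       // U: C wrote lease M, and acked U in the clock doc
	expiryL := at(1, 0, 30) // expiry recorded in lease L

	fromLease := maxRemaining(expiryL, t) // using only what lease L could tell us
	fromClock := maxRemaining(expiryL, u) // using the clock document

	fmt.Println(fromLease, fromClock, fromLease-fromClock) // prints "30s 10s 20s"
}
```

The difference between the two waits is exactly U-T.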

Client usage considerations
---------------------------

* Client operates at a relatively low level of abstraction. Claiming a held
  lease will fail, even on behalf of the holder; expiring an expired lease
  will fail; but at least we can allow lease extensions to race benignly,
  because they don't involve ownership change and thus can't break promises
  (so long as our skew logic is correct).

* ErrInvalid is normal and expected; you should never pass that on to your
  own clients, because it indicates that you tried to manipulate the client
  in an impossible way. You can and should inspect Leases() and figure out
  what to do instead; that may well be "return an error", but please be sure
  to return your own error, suitable for your own level of abstraction.

* You *probably* shouldn't ever need to actually call Refresh. It's perfectly
  safe to let state drift arbitrarily far out of sync; when you try to run
  operations, you will either succeed by luck despite your aged cache... or,
  if you fail, you'll get ErrInvalid and a fresh cache to inspect to find out
  recent state.
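
The advice above suggests a calling pattern along these lines: inspect the
cached leases, attempt the operation that looks right, and on ErrInvalid look
again at the now-fresh cache. The sketch below is not the package's API. The
client interface, info type, errInvalid, and the fake client are simplified
stand-ins invented for illustration, and the fake ignores expiry entirely.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errInvalid stands in for the package's ErrInvalid (an assumption).
var errInvalid = errors.New("invalid lease operation")

// errHeld is the caller's own error, suited to its own level of abstraction.
var errHeld = errors.New("lease held by someone else")

// info and client are hypothetical, simplified stand-ins for the real types.
type info struct{ Holder string }

type client interface {
	ClaimLease(name, holder string, d time.Duration) error
	ExtendLease(name, holder string, d time.Duration) error
	Leases() map[string]info
}

// ensureLease claims or extends according to cached state. It never returns
// errInvalid: on that error it re-inspects the refreshed cache instead.
func ensureLease(c client, name, holder string, d time.Duration) error {
	for attempt := 0; attempt < 3; attempt++ {
		current, held := c.Leases()[name]
		var err error
		switch {
		case !held:
			err = c.ClaimLease(name, holder, d)
		case current.Holder == holder:
			err = c.ExtendLease(name, holder, d)
		default:
			return errHeld
		}
		if err != errInvalid {
			return err
		}
	}
	return errors.New("lease state kept changing; giving up")
}

// fakeClient keeps "database" state in store and a possibly stale copy in cache.
type fakeClient struct{ store, cache map[string]info }

func (f *fakeClient) refresh() {
	f.cache = map[string]info{}
	for name, lease := range f.store {
		f.cache[name] = lease
	}
}

func (f *fakeClient) ClaimLease(name, holder string, d time.Duration) error {
	defer f.refresh()
	if _, held := f.store[name]; held {
		return errInvalid
	}
	f.store[name] = info{Holder: holder}
	return nil
}

func (f *fakeClient) ExtendLease(name, holder string, d time.Duration) error {
	defer f.refresh()
	if current, held := f.store[name]; !held || current.Holder != holder {
		return errInvalid
	}
	return nil
}

func (f *fakeClient) Leases() map[string]info { return f.cache }

// demo starts with an empty (stale) cache while machine-1 holds the lease.
func demo() string {
	c := &fakeClient{
		store: map[string]info{"singular": {Holder: "machine-1"}},
		cache: map[string]info{},
	}
	first := ensureLease(c, "singular", "machine-0", time.Minute)
	second := ensureLease(c, "singular", "machine-1", time.Minute)
	return fmt.Sprintf("%v; %v", first, second)
}

func main() {
	fmt.Println(demo()) // prints "lease held by someone else; <nil>"
}
```

In the first call the stale cache leads to a doomed claim; the errInvalid it
produces is swallowed, and the caller's own errHeld is returned once the fresh
cache shows the real holder.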

Imports 8 package(s)

  1. gopkg.in/mgo.v2/bson
  2. gopkg.in/mgo.v2/txn
  3. github.com/juju/errors
  4. github.com/juju/juju/mongo
  5. github.com/juju/utils/clock
  6. github.com/juju/loggo
  7. github.com/juju/txn
  8. gopkg.in/mgo.v2

Test imports 6 package(s)

  1. github.com/juju/juju/state/lease
  2. github.com/juju/txn/testing
  3. github.com/juju/juju/testing
  4. github.com/juju/testing/checkers
  5. github.com/juju/testing
  6. gopkg.in/check.v1