#3802 closed task (fixed)

Schema enforcement for HTTP protocol contents

Reported by: itamarst Owned by: GitHub <noreply@…>
Priority: normal Milestone: HTTP Storage Protocol
Component: unknown Version: n/a
Keywords: Cc:
Launchpad Bug:

Description

Many data inputs to the HTTP protocol need stringent checks, e.g. write lengths need to be positive, offsets need to be non-negative, hashes and secrets need to be the correct length, and so on.

So we need some way to enforce this.
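For illustration, here is a minimal hand-written sketch of the kind of checks described above. The function name, the exception class, and the 32-byte secret length are all assumptions made for the example, not taken from the protocol.

```python
SECRET_LENGTH = 32  # hypothetical value; the real protocol defines its own lengths


class InvalidRequest(ValueError):
    """Raised when a request field fails validation (hypothetical class)."""


def check_write_request(offset, data, secret):
    """Validate the fields of a hypothetical write request."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise InvalidRequest("offset must be a non-negative integer")
    # A write must carry at least one byte.
    if not isinstance(data, bytes) or len(data) == 0:
        raise InvalidRequest("write length must be positive")
    # Secrets must be bytes of exactly the expected length.
    if not isinstance(secret, bytes) or len(secret) != SECRET_LENGTH:
        raise InvalidRequest(f"secret must be exactly {SECRET_LENGTH} bytes")
```

Writing checks like this by hand for every endpoint is repetitive and easy to get wrong, which is the motivation for a declarative schema.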

Change History (9)

comment:1 Changed at 2021-09-23T18:15:56Z by itamarst

CDDL seems to support the above constraints.

comment:2 Changed at 2021-10-05T15:42:22Z by itamarst

This Rust library seems like the best candidate for an implementation of CDDL: https://github.com/anweiss/cddl/issues/98. (The issue is me asking "are the caveats in README actually valid"; I suspect they are not valid at this point.)

comment:3 Changed at 2021-10-05T15:42:42Z by itamarst

The Python JSONSchema library is highly unlikely to ever support bytes, given its goals.

comment:4 Changed at 2021-10-06T17:41:28Z by itamarst

Sounds like the caveats for the Rust CDDL library are probably not a problem, and the author is working towards a 1.0, so seems worth wrapping with Python for Tahoe-LAFS.

comment:5 Changed at 2021-10-06T18:05:20Z by itamarst

Schema enforcement features we'd need for current HTTP spec:

  1. Dict value types ("this key's value must be bytes").
  2. Dict keys ("this dict must only have these keys").
  3. List lengths? Probably optional, could replace with dicts anywhere that comes up.
  4. Integers must be positive, or non-negative.
  5. List types ("this must be a list of integers", or "this must be a list of dicts with this structure").
  6. Bytes vs. Unicode strings.
  7. String length (for either kind of string).
  8. Support both CBOR and JSON? Tricky for bytes.
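To make the list concrete, here is a rough sketch of what items 1, 2, 4, 5, 6 and 7 look like when checked by hand in Python. The message shape, the key names and the 32-byte length are hypothetical, chosen only for this example; this is not a CDDL validator.

```python
def validate_message(message):
    """Check a hypothetical message shaped like
    {"share-numbers": [int, ...], "secret": bytes}."""
    if not isinstance(message, dict):
        raise ValueError("message must be a dict")
    # Item 2: the dict must have exactly these keys.
    if set(message) != {"share-numbers", "secret"}:
        raise ValueError("unexpected or missing keys")
    # Item 5: the value must be a list of integers.
    shares = message["share-numbers"]
    if not isinstance(shares, list):
        raise ValueError("share-numbers must be a list")
    for number in shares:
        # Item 4: each integer must be non-negative (bool is not accepted).
        if isinstance(number, bool) or not isinstance(number, int) or number < 0:
            raise ValueError("share numbers must be non-negative integers")
    # Items 1 and 6: this key's value must be bytes, not a Unicode string.
    secret = message["secret"]
    if not isinstance(secret, bytes):
        raise ValueError("secret must be bytes")
    # Item 7: the byte string must have the expected length.
    if len(secret) != 32:
        raise ValueError("secret must be 32 bytes long")
```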

Looking at CDDL:

  1. Yes.
  2. Yes.
  3. Yes.
  4. Yes.
  5. Yes.
  6. Yes.
  7. There's a size control (how many bytes it is) which is fine for bytes, but a bit strange for Unicode strings in the general non-ASCII case.
  8. The way CDDL works for JSON is by just saying "no bytes supported." Which is... a problem I guess. https://datatracker.ietf.org/doc/html/rfc8610#appendix-E
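A small Python illustration of why a byte-based size limit (item 7) is awkward for Unicode strings: the number of characters and the number of encoded bytes diverge as soon as the text is not pure ASCII. The helper name is hypothetical, and the assumption here is that the limit applies to the UTF-8 encoding.

```python
def fits_byte_limit(text, max_bytes):
    """Return True if the UTF-8 encoding of text is at most max_bytes long."""
    return len(text.encode("utf-8")) <= max_bytes


ascii_text = "naive"
accented_text = "na\u00efve"  # same word spelled with U+00EF (i with diaeresis)

# Both strings have five characters, but the accented one needs six bytes.
same_length_in_characters = len(ascii_text) == len(accented_text)
ascii_fits = fits_byte_limit(ascii_text, 5)
accented_fits = fits_byte_limit(accented_text, 5)
```

So a limit of 5 accepts one five-character string and rejects another, which may or may not be what the schema author intended.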

So CDDL seems plausible, but might require maintaining two versions of the schema to deal with bytes-on-JSON.

Last edited at 2021-10-06T18:06:14Z by itamarst

comment:6 Changed at 2021-10-06T18:07:19Z by itamarst

The _CBOR_ advice for bytes-on-JSON is "base64", so conceivably a CDDL validator could deal with bytes that way. But this is non-normative. I guess I'll see what Rust CDDL does.
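A sketch of the base64 approach using only the Python standard library. The function names, the `bytes_keys` parameter and the choice of the standard base64 alphabet are assumptions for this example. Note that the decoder has to be told which keys hold bytes, since nothing in the JSON itself distinguishes a base64 string from ordinary text.

```python
import base64
import json


def encode_for_json(message):
    """Serialize a flat dict to JSON, replacing bytes values with base64 text."""
    converted = {}
    for key, value in message.items():
        if isinstance(value, bytes):
            # json.dumps() raises TypeError on bytes, so convert to ASCII text.
            converted[key] = base64.b64encode(value).decode("ascii")
        else:
            converted[key] = value
    return json.dumps(converted)


def decode_from_json(text, bytes_keys):
    """Parse JSON and base64-decode the values stored under bytes_keys."""
    message = json.loads(text)
    for key in bytes_keys:
        # validate=True rejects characters outside the base64 alphabet.
        message[key] = base64.b64decode(message[key], validate=True)
    return message
```

For example, `decode_from_json(encode_for_json({"secret": b"\x00\xff"}), ["secret"])` round-trips the bytes, whereas a plain `json.loads` would leave `"secret"` as a string.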

comment:7 Changed at 2021-10-06T18:28:11Z by itamarst

The Rust CDDL library does what the RFC suggests: there's no automatic "maybe these strings are bytes." But perhaps the author would be amenable... https://github.com/anweiss/cddl/issues/101

Last edited at 2021-10-06T18:37:39Z by itamarst

comment:8 Changed at 2021-11-10T18:45:11Z by itamarst

Work-in-progress Python wrapper of anweiss' CDDL library: https://gitlab.com/tahoe-lafs/pycddl

comment:9 Changed at 2022-04-13T18:07:35Z by GitHub <noreply@…>

  • Owner set to GitHub <noreply@…>
  • Resolution set to fixed
  • Status changed from new to closed

In 0c18dca/trunk:

Merge pull request #1190 from tahoe-lafs/3802-cddl

CDDL-based schema validation for the HTTP storage server

Fixes ticket:3802
