The Indented Document Values Format
Overview
The Indented Document Values (IDV) format is a text-based, whitespace-sensitive serialization format.
IDV is designed to prioritize human readability and writability by minimizing visual noise: there are no sigils, quotes, or brackets, only colons, indentation, and (when necessary) backslash escapes.
As a tradeoff, IDV is not a self-describing data format: you have to know what type of data an IDV document represents at the time you parse it.
Example
TODO: need something both concise and nontrivial. LDAP user data is certainly an option
Syntax
IDV is a line-oriented format. Before any other parsing is done, the input is split into lines, and any trailing whitespace on a line (including line separators) is ignored.
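The line-splitting step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the name `split_lines` is hypothetical, and `str.splitlines` recognizes more line boundaries than `\n` and `\r\n`, which may or may not match what IDV intends.

```python
def split_lines(text: str) -> list[str]:
    """Split input into lines, dropping line separators and trailing whitespace.

    ASSUMPTION: any boundary recognized by str.splitlines counts as a line
    separator; the format description does not enumerate them.
    """
    return [line.rstrip() for line in text.splitlines()]
```

Leading whitespace is deliberately preserved, since indentation is what distinguishes Document lines from Entry headers.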
TODO: possible redraft: sequence of comments, entry headers, and documents, defined by line types (blank, comment, entry header, indented)
The lines of an IDV document represent a single flat list of Comments and Entries.
A Comment is any line whose first character is a # character. Comment lines are for human use and are ignored by the parser.
# This line is ignored
An Entry's first line is unindented and contains the name of a Category, up to the first : character, followed by a Distinguisher. All following lines with indentation, if any, are the Entry's Document:
Collection: distinguisher
    Indented
    document

    with a blank line
- The Category and Distinguisher are both trimmed of surrounding whitespace before being interpreted, but internal whitespace is left intact.
- Backslash unescaping is performed on the Category and Distinguisher.
- The Distinguisher may contain literal colons; these are treated as regular characters and carry no special meaning.
- The first line of a Document defines the document's indentation: subsequent lines can be indented deeper, but no line may be indented less than the first line.
- It is ambiguous whether blank lines are part of a document or just aesthetic spacing for Entries; to resolve this, blank lines before and after a Document are ignored, but internal blank lines are considered part of the Document.
- Backslash unescaping is not performed on the Document. However, backslashes may be processed later, when the document is interpreted.
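The rules above can be combined into a small parser sketch. Everything named here (`Entry`, `unescape`, `split_header`, `dedent`, `parse`) is hypothetical, and several behaviors are assumptions where the text is silent: a backslash makes the following character literal, an escaped colon does not end the Category, a header without a colon is an error, and the first line's indentation is removed from every Document line.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    category: str
    distinguisher: str
    document: list[str] = field(default_factory=list)


def unescape(text: str) -> str:
    # ASSUMPTION: a backslash makes the next character literal.
    out = []
    chars = iter(text)
    for ch in chars:
        # A lone trailing backslash is kept as-is.
        out.append(next(chars, "\\") if ch == "\\" else ch)
    return "".join(out)


def split_header(line: str) -> tuple[str, str]:
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # ASSUMPTION: an escaped colon does not end the Category
        elif line[i] == ":":
            # Trim first, then unescape; later colons stay in the Distinguisher.
            return unescape(line[:i].strip()), unescape(line[i + 1:].strip())
        else:
            i += 1
    # ASSUMPTION: a header with no colon is rejected.
    raise ValueError(f"entry header has no colon: {line!r}")


def dedent(lines: list[str]) -> list[str]:
    while lines and not lines[-1]:
        lines.pop()  # blank lines after the Document are ignored
    if not lines:
        return lines
    prefix = lines[0][: len(lines[0]) - len(lines[0].lstrip())]
    result = []
    for line in lines:
        if line and not line.startswith(prefix):
            raise ValueError(f"line indented less than first line: {line!r}")
        # ASSUMPTION: the first line's indentation is stripped from the value.
        result.append(line[len(prefix):])
    return result


def parse(text: str) -> list[Entry]:
    entries: list[Entry] = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("#"):
            continue  # Comment
        if not line:
            # Blank lines before a Document are skipped; internal ones are kept.
            if entries and entries[-1].document:
                entries[-1].document.append("")
        elif line[0] in " \t":
            if not entries:
                raise ValueError("indented line before any entry header")
            entries[-1].document.append(line)  # no unescaping on the Document
        else:
            entries.append(Entry(*split_header(line)))
    for entry in entries:
        entry.document = dedent(entry.document)
    return entries
```

Applied to the Entry shown earlier, `parse` yields one `Entry` with category `Collection`, distinguisher `distinguisher`, and a four-line document whose third line is empty. Interpreting that document (for example, as nested IDV) is left to the caller, in keeping with the format not being self-describing.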
Data Model
TODO: tuples, can be interpreted according to patterns
Patterns
Primitive Property
TODO: one of distinguisher | document non-empty, parsing based on expected type
Object Property
TODO: distinguisher ignored, document is IDV
Union Property
TODO: distinguisher determines how the document is parsed
List
TODO: property specified multiple times
Map
TODO: distinguisher defines key, document parsed for value
Property Map
TODO: Category defines key, parsed as property for value
Merged Map
See Also
TODO:
- yaml
- dpkg control files