Record first set of IDV spec thoughts
parent af07866ac0
commit 763bfbc8cf
1 changed file with 83 additions and 0 deletions

idv.md
@@ -0,0 +1,83 @@

# The Indented Document Values Format

## Overview

The Indented Document Values (IDV) format is a text-based, whitespace-sensitive serialization format.

IDV is designed to prioritize human readability and writability by minimizing visual noise: there are no sigils, quotes, or brackets, only colons, indentation, and (when necessary) backslash escapes.

As a tradeoff, IDV is not a self-describing data format: you have to know what type of data an IDV document represents at the time you parse it.

### Example

> TODO: need something both concise and nontrivial. LDAP user data is certainly an option.

## Syntax

IDV is a line-oriented format. Before any other parsing is done, the input is split into lines, and trailing whitespace on each line (including the line separator) is ignored.

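As a minimal sketch of this first step, the Python function below splits the input and drops trailing whitespace. The name `split_lines` is hypothetical, and which line separators IDV accepts is an assumption: `str.splitlines()` recognizes `\n`, `\r\n`, `\r`, and several less common separators.

```python
def split_lines(text: str) -> list[str]:
    """Split raw input into lines, ignoring trailing whitespace on each.

    Assumption: any separator recognized by str.splitlines() ends a line.
    """
    # splitlines() removes the separators; rstrip() removes what is left
    # of the trailing whitespace (spaces, tabs, and so on).
    return [line.rstrip() for line in text.splitlines()]
```

Leading whitespace is deliberately left alone, since indentation is significant to the rest of the format.
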
> TODO: possible redraft: sequence of comments, entry headers, and documents, defined by line types (blank, comment, entry header, indented)

The lines of an IDV document represent a single flat list of Comments and Entries.

A **Comment** is any line whose first character is `#`. Comment lines are for human use and are ignored by the parser.

```
# This line is ignored
```

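One way to picture the line types mentioned in the redraft TODO (blank, comment, entry header, indented) is a small classifier. This is only a sketch: `LineKind` and `classify` are hypothetical names, and it assumes that indentation means a leading space or tab. Because a Comment is defined by its first character, an indented line that starts with `#` after its indentation is treated here as an indented line, not a Comment.

```python
from enum import Enum


class LineKind(Enum):
    BLANK = "blank"
    COMMENT = "comment"
    ENTRY_HEADER = "entry header"
    INDENTED = "indented"


def classify(line: str) -> LineKind:
    """Classify one line by its first character (hypothetical helper)."""
    line = line.rstrip()  # trailing whitespace is ignored
    if not line:
        return LineKind.BLANK
    if line[0] == "#":
        return LineKind.COMMENT
    if line[0] in " \t":  # assumption: spaces and tabs both count as indentation
        return LineKind.INDENTED
    return LineKind.ENTRY_HEADER
```
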
An **Entry** begins with an unindented line that contains the name of a **Category**, which runs up to the first `:` character, followed by a **Distinguisher**, which is the rest of the line. Any indented lines that follow form the Entry's **Document**:

```
Collection: distinguisher
  Indented
  document

  with a blank line
```

1. The Category and Distinguisher are both trimmed of surrounding whitespace before being interpreted, but internal whitespace is left intact.
1. Backslash unescaping is performed on the Category and Distinguisher.
1. The Distinguisher may contain literal colons; these are treated as regular characters and carry no special meaning.
1. The first line of a Document defines the Document's indentation: subsequent lines can be indented deeper, but no line may be indented _less_ than the first line.
1. Blank lines could be either part of a Document or aesthetic spacing between Entries. To resolve this ambiguity, blank lines before and after a Document are ignored, but internal blank lines are considered part of the Document.
1. Backslash unescaping is **not** performed on the Document. However, backslashes may be processed later, when the Document is interpreted.

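The rules above can be sketched as a small Python parser. This is an illustration under stated assumptions, not a reference implementation: the names `Entry`, `unescape`, `split_header`, and `parse` are hypothetical; the escape rule (a backslash makes the next character literal, so `\:` does not end the Category) is assumed, since the escape sequences are not yet specified; a header without a colon and an under-indented line are treated as errors; Comments are skipped wherever they appear; and indentation is measured as a count of leading spaces and tabs.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    category: str
    distinguisher: str
    document: list[str]  # Document lines with the first line's indentation removed


def unescape(text: str) -> str:
    """Assumed escape rule: a backslash makes the next character literal."""
    out, chars = [], iter(text)
    for ch in chars:
        # A dangling backslash at the end is kept as-is.
        out.append(next(chars, "\\") if ch == "\\" else ch)
    return "".join(out)


def split_header(line: str) -> tuple[str, str]:
    """Split a header at the first unescaped colon, then trim and unescape."""
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # skip the escaped character
        elif line[i] == ":":
            return unescape(line[:i].strip()), unescape(line[i + 1:].strip())
        else:
            i += 1
    raise ValueError(f"entry header has no colon: {line!r}")  # assumed to be an error


def _trim(entries: list[Entry]) -> None:
    """Drop blank lines that trail the most recent Document."""
    if entries:
        doc = entries[-1].document
        while doc and doc[-1] == "":
            doc.pop()


def parse(text: str) -> list[Entry]:
    entries: list[Entry] = []
    indent = None  # indentation of the current Document, set by its first line
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("#"):
            continue  # Comment
        if not line:
            # Blank lines before a Document are ignored; later ones are kept
            # for now and trimmed if they turn out to trail the Document.
            if entries and entries[-1].document:
                entries[-1].document.append("")
            continue
        if line[0] not in " \t":
            _trim(entries)
            entries.append(Entry(*split_header(line), []))
            indent = None
            continue
        if not entries:
            raise ValueError("indented line before any entry header")
        width = len(line) - len(line.lstrip(" \t"))
        if indent is None:
            indent = width
        elif width < indent:
            raise ValueError(f"line indented less than its document: {line!r}")
        entries[-1].document.append(line[indent:])  # no unescaping in Documents
    _trim(entries)
    return entries
```

Storing the Document as a list of de-indented lines keeps internal blank lines and deeper indentation intact, so a later pass can interpret the Document (possibly as nested IDV) once the expected type is known.
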
## Data Model

> TODO: tuples, can be interpreted according to patterns

## Patterns

### Primitive Property

> TODO: one of distinguisher | document non-empty, parsing based on expected type

### Object Property

> TODO: distinguisher ignored, document is IDV

### Union Property

> TODO: distinguisher determines how the document is parsed

### List

> TODO: property specified multiple times

### Map

> TODO: distinguisher defines key, document parsed for value

### Property Map

> TODO: Category defines key, parsed as property for value

### Merged Map

## See Also

> TODO:
>
> - yaml
> - dpkg control files