IDV is designed to prioritize human readability and writability by minimizing visual noise: there are no sigils, quotes, or brackets, only colons, indentation, and (when necessary) backslash escapes.
As a tradeoff, IDV is not a self-describing data format. It can be used to define a serialization or configuration format, but systems using it need to layer their own semantics on top of it.
IDV is a line-oriented format. Before any other parsing is done, the input is split into lines, and any trailing whitespace on a line (including line separators) is ignored.
A **Blank Line** is any line that only contains whitespace. Because trailing whitespace is always trimmed, all Blank Lines are indistinguishable from each other.
Blank Lines are ignored unless they are part of a Document (see below).
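The line-splitting step above can be sketched in Python as follows. This is only an illustration under assumptions: the function names `split_lines` and `is_blank` are hypothetical, and the sketch treats whatever `str.splitlines()` and `str.rstrip()` consider line separators and whitespace as such, which the text above does not pin down.

```python
def split_lines(text: str) -> list[str]:
    """Split raw input into lines and drop trailing whitespace from each.

    Assumption: Python's notion of line separators and whitespace is used;
    the format description does not enumerate them.
    """
    return [line.rstrip() for line in text.splitlines()]


def is_blank(line: str) -> bool:
    """A Blank Line contains only whitespace (or nothing at all)."""
    return line.strip() == ""


lines = split_lines("Tag: value   \n\t \n  body\r\n")
# After trimming, every Blank Line collapses to the empty string.
blank_flags = [is_blank(line) for line in lines]
```

Because trimming happens first, later stages only ever see one kind of Blank Line: the empty string.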
### Entries
An **Entry** is composed of one or more lines:
#### Tags
Each entry begins with a **Tag**, terminated by a colon (`:`). A Tag can contain any characters except leading or trailing whitespace, newlines, and colons:
```
Tag:
```
#### Distinguishers
Optionally, a **Distinguisher** can follow the Tag on the same line. A Distinguisher can contain any characters except leading or trailing whitespace and newlines:
```
Tag: distinguisher
```
#### Escapes
Within Tags and Distinguishers, backslash escapes may be used to represent non-permitted or inconvenient characters:
```
Tag With \: And Spaces:
Tag: \ distinguisher with leading whitespace and\nA newline
```
| Escape sequence | Replacement |
| --------------- | ----------------- |
| \\_\<space>_ | A literal space |
| \\n | A newline |
| \\: | A colon (`:`) |
| \\\\ | A backslash (`\`) |
> TODO: additional escapes? ie, hex or unicode?
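As a rough sketch of how the first line of an Entry might be split and unescaped, the Python below finds the first unescaped colon, trims unescaped surrounding whitespace, and then applies the four escapes from the table. The names `ESCAPES`, `unescape`, `trim`, and `split_header` are hypothetical. Raising `ValueError` for an unknown or dangling escape is an assumption of this sketch, since that behaviour is not specified above.

```python
ESCAPES = {" ": " ", "n": "\n", ":": ":", "\\": "\\"}


def unescape(text: str) -> str:
    """Replace the backslash escapes listed in the table."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(text[i])
            i += 1
            continue
        # Assumption: unknown or dangling escapes are rejected.
        if i + 1 >= len(text) or text[i + 1] not in ESCAPES:
            raise ValueError(f"bad escape at offset {i} in {text!r}")
        out.append(ESCAPES[text[i + 1]])
        i += 2
    return "".join(out)


def trim(raw: str) -> str:
    """Strip surrounding whitespace, but keep an escaped trailing space."""
    raw = raw.lstrip()
    stripped = raw.rstrip()
    backslashes = len(stripped) - len(stripped.rstrip("\\"))
    if backslashes % 2 == 1 and len(stripped) < len(raw):
        stripped += raw[len(stripped)]  # this character was escaped
    return stripped


def split_header(line: str) -> tuple[str, str]:
    """Return (tag, distinguisher); the distinguisher is '' when absent."""
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # skip the escaped character, which may be a colon
        elif line[i] == ":":
            return unescape(trim(line[:i])), unescape(trim(line[i + 1:]))
        else:
            i += 1
    raise ValueError(f"no unescaped colon in {line!r}")
```

Trimming before unescaping matters: it is what lets `\ ` carry leading whitespace into a Distinguisher, as in the second example line above.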
#### Documents
After the first line of an entry, any indented lines make up the **Document** portion of the entry:
```
Tag: distinguisher
  First Line
  Second Line
  Third Line
```
The first line of a Document defines the Document's indentation. Subsequent lines can be indented deeper, but no line may be indented _less_ than the first line. This indentation is removed from the beginning of each line when determining the Document's value.
Blank Lines cannot carry indentation information. To resolve this ambiguity, Documents may not begin or end with Blank Lines; such lines are ignored. Blank Lines that occur _between_ indented lines _are_ considered part of the Document.
Backslash escapes are _not_ processed within a Document. However, backslashes may be processed later, by higher-layered semantics.
In many cases the Document will contain recursive IDV data, and the rules above are designed to play nicely with this case, but it is up to the concrete format to decide how to parse the Document. It could just as easily contain free text, XML, or a base64 blob.
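The Document rules can be illustrated with a small Python sketch. It takes the lines that follow an Entry's first line (already stripped of trailing whitespace), drops leading and trailing Blank Lines, and removes the first line's indentation from the rest. The name `document_lines` is hypothetical. Two choices here are assumptions of the sketch: indentation is compared as an exact character prefix (so tabs and spaces are not interchangeable), and a line indented less than the first raises `ValueError`.

```python
def document_lines(indented: list[str]) -> list[str]:
    """Turn the raw lines after an Entry's first line into Document lines."""
    lines = list(indented)
    # Documents may not begin or end with Blank Lines; such lines are ignored.
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()
    if not lines:
        return []

    first = lines[0]
    indent = first[: len(first) - len(first.lstrip())]

    result = []
    for line in lines:
        if not line.strip():
            result.append("")  # interior Blank Lines are kept
        elif line.startswith(indent):
            result.append(line[len(indent):])  # deeper indentation survives
        else:
            # Assumption: under-indented lines are treated as an error.
            raise ValueError(f"line indented less than the first: {line!r}")
    return result


doc = document_lines(["", "  First Line", "    Second Line", "", "  Third Line", ""])
```

Backslashes are left untouched here, matching the rule that escapes are not processed within a Document.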
#### Disambiguations
1. The Tag and Distinguisher are both trimmed of surrounding whitespace before being interpreted, but internal whitespace is left intact.
Applying minimal interpretation, IDV data can be represented as a list of Entries.
An Entry can be represented as a 3-tuple of:
1. a string (the Tag)
2. a string (the optional Distinguisher)
3. a list of strings (the lines of the Document)
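One possible rendering of this representation in Python is a named 3-tuple plus a small parser that groups lines into Entries. This is a compact sketch, not a reference implementation. `Entry`, `parse`, and the underscore-prefixed helpers are hypothetical names. The sketch also makes several assumptions: an absent Distinguisher becomes the empty string, a dangling backslash at the end of a line is left as-is, an unknown escape raises `KeyError`, and an indented line before any Entry or an under-indented Document line raises `ValueError`.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    tag: str
    distinguisher: str   # "" when absent (an assumption of this sketch)
    document: list[str]


# Tag = everything up to the first unescaped colon; "\\." consumes escapes.
_HEADER = re.compile(r"((?:\\.|[^\\:])*?)\s*:\s*(.*)")
_ESCAPES = {" ": " ", "n": "\n", ":": ":", "\\": "\\"}


def _unescape(text: str) -> str:
    return re.sub(r"\\(.)", lambda m: _ESCAPES[m.group(1)], text)


def _dedent(lines: list[str]) -> list[str]:
    while lines and lines[0] == "":
        lines.pop(0)
    while lines and lines[-1] == "":
        lines.pop()
    if not lines:
        return []
    indent = lines[0][: len(lines[0]) - len(lines[0].lstrip())]
    for line in lines:
        if line and not line.startswith(indent):
            raise ValueError(f"line indented less than the first: {line!r}")
    return [line[len(indent):] for line in lines]


def _entry(header: str, body: list[str]) -> Entry:
    match = _HEADER.fullmatch(header)
    if match is None:
        raise ValueError(f"missing colon: {header!r}")
    return Entry(_unescape(match[1]), _unescape(match[2]), _dedent(body))


def parse(text: str) -> list[Entry]:
    entries: list[Entry] = []
    header, body = None, []
    for line in (raw.rstrip() for raw in text.splitlines()):
        if line == "" or line[0].isspace():
            if header is not None:
                body.append(line)          # may belong to the Document
            elif line:
                raise ValueError("indented line before any Entry")
        else:
            if header is not None:
                entries.append(_entry(header, body))
            header, body = line, []        # an unindented line starts an Entry
    if header is not None:
        entries.append(_entry(header, body))
    return entries


sample = "Tag: distinguisher\n  First Line\n    Second Line\n\n  Third Line\n\nOther\\: Tag:\n"
entries = parse(sample)
```

Since a Document is returned as plain lines, an application that nests IDV data could join those lines and pass them back to `parse`.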
How Entries are interpreted by the application is not specified, but see below for some suggested patterns that should line up with things people usually want to do.