This book analyzes, in their entirety, the methods, technologies, standards, and languages used to structure and describe data. It reveals common features, hidden assumptions, and ubiquitous patterns among these methods and shows how data are actually structured and described, independently of particular trends and technologies.
Examples of data structuring methods analyzed critically include:
- Encodings (e.g. Unicode)
- Identifiers and identifier systems (e.g. ISBN)
- File systems
- Database systems (record databases, relational databases, NoSQL...)
- Data structuring languages (JSON, XML, CSV, RDF...)
- Markup languages (SGML, HTML, TEI, Markdown...)
- Schema languages (BNF, XSD, RDFS, OWL, SQL...)
- Conceptual modeling languages (ERM, ORM, UML, DSL...)
- Conceptual diagrams
It is shown how particular methods of data structuring and description can best be categorized by their primary purpose. The study further exposes five basic paradigms that deeply shape how data are structured and described in practice. The third result is a pattern language of data structuring. Patterns show problems and solutions that occur over and over again in data. Each pattern is described with its benefits, consequences, pitfalls, and relations to other patterns.
The results can help readers better understand data and their actual forms, whether they consume or create data. Possible applications include data analysis, data modeling, data archaeology, and data literacy.