DocC 📚 Archived and Analyzed

At WWDC 2021 Apple presented DocC, a way to create Swift documentation and tutorials right within Xcode. We are going to look at the documentation archive it produces, the good and the bad, and how to generate a static website from it.

TLDR: Just looking for the static exporter and don’t want to read all this junk? It is over here, but remember it is still an early and quick hack: docc2html. It works like this:

$ swift run docc2html --force \
    SlothCreator.doccarchive \
    /tmp/SlothCreatorSite/

Then open file:///tmp/SlothCreatorSite/documentation/slothcreator/index.html.

In this blog entry we are not going to look at how DocC documentation and tutorials are written (the Apple WWDC sessions do a very good job of explaining that), but rather at what the output of DocC is, and what we can do with it.

Let’s still revisit what the user-facing part does.

DocC Quick Usage Intro

So what is the new feature about? In Xcode 13 you just have to press Ctrl-Shift-Cmd-D and it is going to produce documentation for all targets in your project and open it in the Xcode documentation viewer (you can also trigger the action using the “Product / Build Documentation” menu). This is what it looks like:

DocC is working together w/ the Swift compiler infrastructure to extract the API of a Swift target and its documentation comments. A target can also (optionally!) include a “Documentation Catalog” which can enhance the source documentation of a target w/ additional overview documents, tutorials and more. It supports quite a lot of things and you can learn more about those in the WWDC sessions or Documenting a Swift Framework or Package.

We are going to look at something else, this feature:

It can also be automated using xcodebuild, which is covered in Host and automate your DocC documentation. That “export” is creating a “DocC Archive” and that happens to be the really interesting part.

The DocC Archive

A DocC Archive appears as an opaque file in Finder, but it is just a directory structure containing an export of your documentation (you can look at it by choosing “Show Package Contents” in Finder’s context menu).

$ ls ~/Downloads/SlothCreator.doccarchive/
css			favicon.ico		img			js
data			favicon.svg		index			theme-settings.json
downloads		images			index.html		videos

The big fancy, official feature being: It contains HTML and supporting files to show your site in a web browser, with similar features to the beautiful SwiftUI documentation you might have seen. This is what it looks like:

Just open the index.html file in the DocC Archive and it’ll open the documentation in the browser, NOT 🤦‍♀️ It’s funny how it could have possibly happened that such a basic feature doesn’t work.

I suspect the reason this happened is that Apple itself is using DocC to document just Xcode projects, not Swift Packages. And that all the docs are hosted internally on a central website automagically.

Long story short: To open a DocC Archive in the browser, you need an actual HTTP server. Yours truly has hacked up a small Macro.swift script to do the job: servedocc. No .htaccess configuration necessary.
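Why the server is needed can be sketched in a few lines. This is our reading of the situation, not code taken from servedocc: the page URLs of the web app are routes inside the app and not files, so a server has to answer them with the app’s index.html, while everything else is a real file in the archive. The function name and the exact prefix rule are assumptions made for this sketch.

```swift
import Foundation

/// Hypothetical helper: maps a request path to the file inside the archive
/// that a server would have to send back.
func archiveFile(forRequestPath path: String) -> String {
  let trimmed = path.hasPrefix("/") ? String(path.dropFirst()) : path
  // Assumption: page URLs of the web app live below these two prefixes.
  // They are routes inside the app, so the app's index.html is served.
  for prefix in [ "documentation", "tutorials" ] {
    if trimmed == prefix || trimmed.hasPrefix(prefix + "/") {
      return "index.html"
    }
  }
  // Everything else (css, js, data/..., images/...) is a real file.
  return trimmed.isEmpty ? "index.html" : trimmed
}

print(archiveFile(forRequestPath: "/documentation/slothcreator"))
print(archiveFile(forRequestPath: "/data/documentation/slothcreator.json"))
```

A plain file:// URL has nobody to do that mapping, which would explain why double-clicking index.html gets you nowhere.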

This issue resulted in some serious backlash like: Apple’s DocC is excellent, but unusable for open source projects. I disagree, DocC is not useless for open source projects.

To understand why, we have to look at the exported DocC Archive in more detail.

It’s also worth mentioning that Apple said it will release DocC itself as open source later this year. They also said that they are going to provide a hosting service for documentation, we’ll see how that works out.

The Three Things (in a DocC Archive)

The archive is not just one thing, it contains three things:

  1. A version of the documentation suitable for the Xcode Documentation Viewer
  2. All the generated documentation as raw, parseable data, in a hierarchy of JSON files, plus images etc.
  3. A Vue.js JavaScript web application to display the raw data. All webpacked.

We only ever use the Xcode Documentation Viewer by accident, but presumably it is useful to some? 🤷‍♀️ This part is contained in the index subdirectory (some binary plists and other stuff).

The JSON export is the real gem and the thing where the actual value of DocC lives. Instead of producing opaque HTML we get access to the raw, structured data!

And the embedded Vue.js app is really nice too. E.g. it produces those fancy tutorials, but has the aforementioned (serious) issue of not being able to run w/o a server to please the URL demands of the JS app.

The Semantic Web

What DocC is doing here is conceptually a great idea and doesn’t actually conflict w/ hosting an archive on, say, GitHub Pages or straight S3. The documentation is exported as static JSON and a small JavaScript app reads those files and just does the fancy display (data as XML and rendering as XSLT would have been even cooler 👴).

Why is this cool? Search engines, IDEs, apps and other tooling can directly access the hosted raw data, index, understand and process it.

Why is this not just cool? Three reasons:

  1. The DocC Vue.js app itself doesn’t currently work as a static export. That is likely a temporary flaw which can and is going to be fixed.
  2. <noscript>
  3. Google and companions do not process and display those JSON files as readable documents. They only show HTML pages. No semantic web for us 😢

Apart from the b0rked frontend app, the last point is a major reason to still export the docs as straight, static HTML.

Reading the DocC Archive Data in Swift

We’ve been interested in how the JSON is laid out and how it works. And in the process created DocCArchive, a Swift package everyone can use. E.g. in Swift scripts that analyse API differences, documentation viewer applications (SwiftPM Catalog could show documentation inline, is that interesting?) or: static HTML exporters.

DocCArchive can parse all the JSON (including tutorials) created by the SlothCreator example.

Important: When using it on your own projects, you will likely run into setups we didn’t test and still need to add. Please let us know, we’ll fix missing cases ASAP (PRs are welcome too). DocCArchive is just a rather quick hack. Cleanup PRs and refactorings are welcome as well.

We’ll get to docc2html in a moment, a look at the structure of the raw DocC Archive data first.

Directory Structure

$ tree .
.
├── css
...
├── data
│   ├── documentation
│   │   ├── slothcreator
│   │   │   ├── activity
│   │   │   │   └── perform(with:).json
...
│   │   └── slothcreator.json
│   └── tutorials
...

All the raw data is located in the data subfolder of the archive, specifically in the documentation and tutorials subfolders.

Using DocCArchive the archive can be opened like this, it manages the general layout of the archive:
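Going by the tree above, the path of a page and the location of its JSON line up one-to-one. A minimal sketch of that mapping, using plain Foundation; the function name is made up for illustration and DocCArchive handles this for you:

```swift
import Foundation

/// Hypothetical helper: maps a (lowercased) document path such as
/// "/documentation/slothcreator" to the JSON file below `data`.
func dataFilePath(forDocumentPath path: String) -> String {
  let trimmed = path.trimmingCharacters(in: CharacterSet(charactersIn: "/"))
  return "data/" + trimmed + ".json"
}

print(dataFilePath(forDocumentPath: "/documentation/slothcreator"))
// prints "data/documentation/slothcreator.json"
```

The input is the lowercased path as it shows up in the directory listing, not the mixed-case doc:// identifier.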

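// URL(fileURLWithPath:) does not expand "~", so expand it explicitly.
let path    = NSString(string: "~/Downloads/SlothCreator.doccarchive").expandingTildeInPath
let url     = URL(fileURLWithPath: path)
let archive = try DocCArchive(contentsOf: url)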

To browse the documentation, DocumentFolder objects are used:

if let docs = archive.documentationFolder() {
  // ... work with the documentation folder ...
}

Documents

The JSON itself is versioned (currently 0.1.0), so it may still change. DocCArchive represents it as a Document object.
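Since the schema may still change, a script might want to check the version before trusting the rest of a document. A small sketch using plain Codable; the types here are hypothetical stand-ins for this example and not the DocCArchive API:

```swift
import Foundation

// Hypothetical types for this sketch, not the DocCArchive API.
struct SchemaVersionSketch: Decodable, Equatable {
  let major: Int
  let minor: Int
  let patch: Int
}
struct VersionProbe: Decodable {
  let schemaVersion: SchemaVersionSketch
}

/// Decodes just the schemaVersion key and ignores everything else.
func schemaVersion(of json: Data) throws -> SchemaVersionSketch {
  try JSONDecoder().decode(VersionProbe.self, from: json).schemaVersion
}

let versionSample = Data(
  #"{ "schemaVersion": { "major": 0, "minor": 1, "patch": 0 } }"#.utf8)
let version = try schemaVersion(of: versionSample)

// 0.1.0 is the only version this post has looked at.
if version != SchemaVersionSketch(major: 0, minor: 1, patch: 0) {
  print("untested schema: \(version.major).\(version.minor).\(version.patch)")
}
```

Decoding only the one key keeps the check cheap and independent of whatever else changes in a later schema.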

This is what the actual JSON looks like:

{ "topicSections": [
    { "title": "Essentials", "identifiers": [
        "doc://SlothCreator/tutorials/SlothCreator", ... ]
    }, ...
  ],
  "schemaVersion": { "major": 0, "minor": 1, "patch": 0 },
  "sections": [],
  "primaryContentSections": [
    { "kind": "content",
      "content": [
        { "type": "heading",  "level": 2,  
          "text": "Overview", "anchor": "Overview"
        },
        { "type": "paragraph", "inlineContent": [
          { "type": "text",
            "text": "SlothCreator provides models and ..."
        ...
        { "type": "paragraph", "inlineContent": [
          { "type": "image", "identifier": "sloth.png" } ] }
      ]
    }
  ],
  "variants": [
    { "paths" : [ "/documentation/slothcreator" ],
      "traits": [ { "interfaceLanguage": "swift" } ] }
  ],
  "identifier": {
    "url": "doc://SlothCreator/documentation/SlothCreator",
    "interfaceLanguage": "swift"
  },
  "abstract": [
    { "type": "text",
      "text": "Catalog sloths you find in nature and create new adorable virtual sloths."
    }
  ],
  "kind": "symbol",
  "metadata": {
    "roleHeading" : "Framework",
    "externalID"  : "SlothCreator",
    "title"       : "SlothCreator",
    "symbolKind"  : "module",
    "role"        : "collection",
    "modules"     : [ { "name": "SlothCreator" } ]
  },
  "hierarchy": { "paths": [[]] },
  "documentVersion": 0,
  "references": {
    "doc://SlothCreator/documentation/SlothCreator/Activity": {
      "role"    : "symbol",
      "title"   : "Activity",
      "fragments": [
        { "kind": "keyword",    "text": "protocol" },
        { "kind": "text",       "text": " "        },
        { "kind": "identifier", "text": "Activity" }
      ],
      "abstract": [ ... ],
      ...
      "url"     : "/documentation/slothcreator/activity"
    },
    ...
    "sloth.png": {
      "alt"        : "A sloth hanging off a tree.",
      "type"       : "image",
      "identifier" : "sloth.png",
      "variants": [
        { "url"    : "/images/sloth@2x.png",
          "size"   : { "width": 952, "height": 756 },
          "traits" : [ "2x", "light" ]
        },...
      ]
    },
    ...
  }
}

An interesting thing is that the JSON, except for images, is self-contained. For example, to show the abstract of the Activity type, there is no need to load the associated Activity.json document. All the necessary metadata is embedded in the document already.
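To illustrate the self-contained part, here is a sketch that pulls the declaration of Activity straight out of the references of the module document, without touching any other file. It uses plain Codable with heavily reduced, hypothetical types (not the DocCArchive ones) and a cut-down copy of the JSON shown above:

```swift
import Foundation

// Hypothetical, heavily reduced types; not the DocCArchive API.
struct FragmentSketch: Decodable {
  let kind: String
  let text: String
}
struct ReferenceSketch: Decodable {
  let title     : String?
  let url       : String?
  let fragments : [ FragmentSketch ]?
}
struct DocumentSketch: Decodable {
  let references: [ String : ReferenceSketch ]
}

let documentJSON = Data(#"""
{ "references": {
    "doc://SlothCreator/documentation/SlothCreator/Activity": {
      "role": "symbol",
      "title": "Activity",
      "fragments": [
        { "kind": "keyword",    "text": "protocol" },
        { "kind": "text",       "text": " "        },
        { "kind": "identifier", "text": "Activity" }
      ],
      "url": "/documentation/slothcreator/activity"
    },
    "sloth.png": { "type": "image", "identifier": "sloth.png",
                   "alt": "A sloth hanging off a tree." }
} }
"""#.utf8)

let doc = try JSONDecoder().decode(DocumentSketch.self, from: documentJSON)
let key = "doc://SlothCreator/documentation/SlothCreator/Activity"

// The declaration is just the fragment texts glued together.
let declaration = (doc.references[key]?.fragments ?? [])
                    .map { $0.text }.joined()
print(declaration) // prints "protocol Activity"
```

The fragment kinds (keyword, identifier, ...) are what an exporter would map to CSS classes for syntax highlighting.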

All the JSON info can be found in neat Swift types as part of the Document object:

var schemaVersion          : SchemaVersion
var identifier             : Identifier
var documentVersion        : Int
var kind                   : Kind
var metadata               : MetaData
var hierarchy              : Hierarchy
var variants               : [ Variant ]? // not in tutorial
var abstract               : [ InlineContent ]? // not in tutorial
var sections               : [ Section ]
var topicSections          : [ Section ]?
var seeAlsoSections        : [ Section ]?
var primaryContentSections : [ Section ]?
var references             : [ String : Reference ]

Summary: Use DocCArchive to write your own scripts and apps processing DocC archives. Let us know if you find loose ends.

The Static HTML Exporter

Finally, let’s talk about docc2html, a tool to export DocC Archives to an actual, fully static HTML site. With relative linking, so it doesn’t matter where the site lives. Goodbye redirects.

First off: This is a **very quick hack**/PoC full of quirks, and is pretty incomplete. It does have working parts and we invite everyone to improve it and provide PRs. Or ignore it and come up with your own exporter based on this (and potentially DocCArchive).
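The relative linking idea can be sketched as a small path calculation. This is an illustration of the technique, not the code docc2html uses; the function name is hypothetical and the activity page location in the example is assumed to follow the same index.html layout as the root page:

```swift
import Foundation

/// Hypothetical helper: relative link from one generated page to another
/// file. Both arguments are site-absolute paths, e.g.
/// "/documentation/slothcreator/index.html".
func relativeLink(from source: String, to target: String) -> String {
  // Directory components of the source page (file name dropped)
  let src = Array(source.split(separator: "/").dropLast())
  let dst = target.split(separator: "/")

  // Skip the leading directories both paths share
  var common = 0
  while common < src.count && common < dst.count - 1
     && src[common] == dst[common] { common += 1 }

  // Walk up out of the remaining source directories, then down to target
  let ups   = Array(repeating: "..", count: src.count - common)
  let downs = dst[common...].map(String.init)
  return (ups + downs).joined(separator: "/")
}

print(relativeLink(from: "/documentation/slothcreator/activity/index.html",
                   to:   "/images/sloth@2x.png"))
// prints "../../../images/sloth@2x.png"
```

Because no link starts with a slash, the whole site can be dropped into any subfolder of a web server, or opened from the filesystem.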

Again, the tool is, as of today, just tested against the SlothCreator example. Expect issues with other DocC Archives 💥

It does not export tutorials yet, only the documents in the documentation folder.

Update 2021-07-10: docc2html is now in a pretty reasonable state and the implementation isn’t that hacky anymore. Custom templates can now be provided (and overridden) in the filesystem. It can even generate tutorials to a certain degree. At least for API documentation it should be viable to use. The CSS is the thing which will need some massage.

To play with it, clone the GitHub repo:

$ git clone https://github.com/DoccZz/docc2html.git
$ cd docc2html

And run like that:

$ swift run docc2html \
    --force \
    ~/Downloads/SlothCreator.doccarchive \
    /tmp/SlothCreatorSite/

This will create the static site in /tmp/SlothCreatorSite. The root documentation can be directly opened in the browser, e.g.

open file:///tmp/SlothCreatorSite/documentation/slothcreator/index.html

It is not much yet, but a pretty good starting point.

Using it on GitHub

We didn’t try that yet, but hope to be able to move the SwiftBlocksUI documentation to it. It is going to take some time until that is possible.

The GitHub action would need to:

  • patch the Package.swift version to 5.5
  • run xcodebuild docbuild to produce the DocC Archives
  • use docc2html on each of the archives
  • publish the result to GH Pages

2021-07-10: I don’t think GitHub supports Xcode 13b yet (i.e. no way to get DocC into a GH action).

Parsing Apple Online Docs

Apple itself is using DocC online and one can access the documentation JSON the same way as described.

For example the JSON for the Getting Started with Scrumdinger can be found over here: https://developer.apple.com/tutorials/data/tutorials/app-dev-training/getting-started-with-scrumdinger.json .

We didn’t try yet, but it should be possible to access the online documentation and tutorials using DocCArchive the same way as we access the archive exports.
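A minimal sketch of what “the same way” could mean in practice: a loader that only cares about a URL, so the same function works for a file inside an archive and, presumably, for the hosted JSON. It uses plain Foundation rather than DocCArchive, the function name is hypothetical, and only the local-file path is exercised here:

```swift
import Foundation

/// Hypothetical helper: loads a DocC JSON document from any URL.
func loadDocumentJSON(from url: URL) throws -> [ String : Any ] {
  let data = try Data(contentsOf: url)
  let json = try JSONSerialization.jsonObject(with: data)
  guard let object = json as? [ String : Any ] else {
    throw CocoaError(.fileReadCorruptFile)
  }
  return object
}

// Exercise it with a local file standing in for a document.
let sampleURL = FileManager.default.temporaryDirectory
  .appendingPathComponent("docc-sample-\(UUID().uuidString).json")
try Data(#"{ "kind": "symbol", "documentVersion": 0 }"#.utf8)
  .write(to: sampleURL)

let loaded = try loadDocumentJSON(from: sampleURL)
print(loaded["kind"] as? String ?? "-") // prints "symbol"
```

Passing the Scrumdinger URL from above instead of the file URL should work the same way, though like the rest of this section that is untested (and on Linux remote URLs may additionally need FoundationNetworking).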

Closing Notes

Funny, but neither DocCArchive nor docc2html are documented with DocC yet. PRs welcome!

To summarize some DocC complaints we at the ARI have:

  • Only seems to document Swift targets, not Swift packages.
  • Requires the 5.5 tools version in the Package.swift to make Xcode 13 build documentation, which seems to make zarro sense.
  • The Vue.js app doesn’t work on a location independent, static, dataset. Even with a static site exporter, that might still make sense.
  • While they look nice, the UX of the tutorials is actually pretty awful.
  • OpenSource projects often document the sources using README.md files in subfolders. Those should be taken into account somehow.

When you open the Vue.js app in a browser that has JavaScript disabled (or doesn’t support JavaScript, like say Lynx), you’ll get the popular “This page requires JavaScript”. By mixing in the docc2html output this could actually be fixed.

Summary: It has some flaws, but we particularly like that DocC outputs structured data one can process in one’s own tooling.

Contact

Feedback is warmly welcome: @helje5, wrong@alwaysrightinstitute.com.

Written on July 2, 2021