Restructuring internal documentation in Notion

In our latest developer productivity survey, documentation was the area with the second most comments. This post describes the concrete steps I took to see how much progress one person could make in improving the organization’s documentation, while holding myself to a high standard of implementing changes that actually worked rather than ones that merely sounded impressive.

Diagnosis

There were a number of problems we encountered:

  • We migrated from Confluence to Notion in January 2025, leaving behind some old pages that were “obviously wrong.”

    These pages tainted our other documents, as people felt that nothing was being properly maintained.

  • We had an inconsistent approach to what we documented in Git-managed files versus what we managed in Notion. This led to duplication.

  • Duplication meant it felt safer to create an N+1th version than to debug why N versions already existed.

  • A number of new people have joined in the past year, and they were not sure whether they were authorized to update the documentation or whether someone else managed a particular file.

  • We started using Notion AI as the primary mechanism for discovering content, which meant that hierarchical organization was less important and inaccurate snippets were harmful, even if they were tucked away in a quiet corner.

This was combined with a handful of interesting limitations in Notion itself:

  • You can’t tell via the API whether a non-wiki page has been verified. You can tell whether a wiki page is verified via the API, but no one uses wiki pages.
  • You cannot retrieve all pages in a Notion Teamspace via the API. Instead, you have to manually list the top-level pages in that Teamspace and find the child pages of those pages.
  • There is no “archive” functionality in Notion that allows you to exclude a document from search results.
  • There is no programmatic insight into views or usage of a page via the API, except for how recently it was edited.
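
The teamspace limitation means any script has to start from a hand-maintained list of top-level pages and walk downward. Here is a minimal sketch of that walk, not the script I actually ran. It assumes a hypothetical `list_children(block_id, cursor)` callable that wraps whatever client you use and returns one page of a paginated block-children listing, with `results`, `has_more`, and `next_cursor` fields, where child pages appear as blocks of type `child_page`.

```python
from typing import Callable, Iterable, Iterator, List, Optional

# Hypothetical wrapper around your API client: returns one page of a
# paginated block-children listing as a plain dict.
ListChildren = Callable[[str, Optional[str]], dict]


def iter_child_pages(list_children: ListChildren, block_id: str) -> Iterator[dict]:
    """Yield every direct child page of a block, following pagination cursors."""
    cursor: Optional[str] = None
    while True:
        response = list_children(block_id, cursor)
        for block in response["results"]:
            if block["type"] == "child_page":
                yield block
        if not response.get("has_more"):
            return
        cursor = response["next_cursor"]


def walk_pages(list_children: ListChildren, root_ids: Iterable[str]) -> List[str]:
    """Return the ids of all pages reachable from the manually listed roots."""
    seen = set()
    stack = list(root_ids)
    visited: List[str] = []
    while stack:
        page_id = stack.pop()
        if page_id in seen:
            continue  # guard against visiting the same page twice
        seen.add(page_id)
        visited.append(page_id)
        for child in iter_child_pages(list_children, page_id):
            stack.append(child["id"])
    return visited
```

One caveat of this sketch: it only follows child pages that are direct children of a page, so pages nested inside other blocks (toggles or columns, for example) would need an extra recursion step.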

Policy

The policies we adopted to address the above diagnosis were:

  1. Optimize for Notion AI results, not manual discovery: A significant portion of our Notion usage now happens via direct links to a specific page, or through Notion AI, rather than through manual discovery. That means things like “FAQ” pages that duplicate content and become outdated are actively harmful, whereas previously they were very valuable.
  2. Duplicated and outdated content is worse than nothing: don’t write your own manual for a process. Instead, link to the source document or update it.
  3. Prefer natural documentation in version control: we prefer to link to a README in GitHub rather than duplicate its instructions in Notion, as the README is more likely to stay current.
  4. Everyone cleans up our documentation: we’d rather be people who try to clean up a document, even if we make a small mistake, than people who leave documentation in poor condition.
  5. Automation beats manual work every time: we are a busy team doing a lot of things, so it will always be difficult to consistently find the time to manually curate content in depth. Targeted curation is great, but global curation is unreasonable.

Execution

The details of implementing that policy were:

  1. Create Scheduled to Archive and Archive teamspaces.
    The Archive teamspace is a private teamspace, so documents added there don’t pollute the search index. Conversely, Scheduled to Archive is public, so anyone can add documents to its main page.

    We have a weekly script that migrates everything from Scheduled to Archive into Archive.

    This was the most effective mechanism we could find to implement archiving within Notion’s limitations.

  2. Prune expired pages. I wrote a script that recursively builds a hierarchy from a root page, enriches each page with the last_edited_date of each of its children, and then prunes every page where both the page itself and all of its children were last edited more than N days ago.

    Using this script on the three to four most relevant top-level pages, we archived approximately 1,500 pages of expired documentation.

  3. Compact legacy hierarchies. I created a second script that identifies current pages buried deep in outdated hierarchies, for example the only updated page among 15 inaccurate documents. Once it finds a “buried current page”, the page is promoted to the grandparent page, and the parent page (along with its legacy child pages) is moved to Scheduled to Archive.

    This ended up as a script that found all the candidates, and then I worked through approving/rejecting each suggestion. The biggest problem is the lack of a ‘verification’ state within the API, so there is no way to bless certain pages and their descendants.

  4. Outdated link finder. I created a third script that recursively works through a hierarchy and finds links that return 404s. It is essential that this script not have access to the Archive teamspace, so that links to archived pages show up as 404s; otherwise you would have to scan the Archive itself to identify links pointing there. Either approach would work, it’s just a matter of preference.

    We did this after the mass migrations to ensure that we didn’t leave behind a “ghost forest” of links to archived documents that people can’t see, which would still make the documentation feel bad even though much of the bad content had been removed.

  5. Manual review of important pages. After completing all the steps above, I reviewed all new hire documentation to make sure it was linked from the top-level onboarding guide, established clear requirements, identified the Slack channel where people could get help if they ran into trouble, and didn’t duplicate our Git-managed READMEs but linked to them as appropriate.

    I applied this approach more lightly for our top-level engineering and technology pages, although they were generally in a good place.
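
The core logic of steps 2 and 3 above can be sketched against an in-memory tree. This is an illustration under stated assumptions rather than the scripts themselves: the `Page` class, the function names, and the exact “buried current page” heuristic (a stale parent with exactly one fresh subtree among several children) are all hypothetical, and the tree is assumed to have been built already from the API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Page:
    """Hypothetical in-memory page node; not a Notion API object."""
    title: str
    last_edited: datetime
    children: list[Page] = field(default_factory=list)


def newest_edit(page: Page) -> datetime:
    """Most recent edit time across a page and all of its descendants."""
    return max([page.last_edited] + [newest_edit(c) for c in page.children])


def stale_roots(page: Page, cutoff: datetime) -> list[Page]:
    """Highest pages whose whole subtree was last edited before the cutoff."""
    if newest_edit(page) < cutoff:
        return [page]  # archive this page and everything below it
    found: list[Page] = []
    for child in page.children:
        found.extend(stale_roots(child, cutoff))
    return found


def buried_current_pages(page: Page, cutoff: datetime) -> list[tuple[Page, Page]]:
    """(parent, child) pairs where child is the only fresh subtree under a
    stale parent that has several children. Candidates for human review."""
    found: list[tuple[Page, Page]] = []
    fresh = [c for c in page.children if newest_edit(c) >= cutoff]
    if page.last_edited < cutoff and len(fresh) == 1 and len(page.children) > 1:
        found.append((page, fresh[0]))
    for child in page.children:
        found.extend(buried_current_pages(child, cutoff))
    return found
```

Both functions only report candidates. Moving pages is left to a separate step, which matches the approve/reject workflow described in step 3. Recomputing `newest_edit` at every level is quadratic in the worst case, which is fine for a sketch but worth caching on a large workspace.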

All told, I think this was about eight hours of my time, but took zero hours of anyone else’s, and hopefully significantly improved the quality of our documentation. There is still much more to be done in specific areas, but I am optimistic that having far fewer duplicates and more evidence that we are actively maintaining the documentation will make that easier as well.
