A bespoke PHP SSG, post entity creation from Markdown files

I have built a custom static site generator (SSG) for this site. While doing so, I explored different code designs for it.

The plan with this series is to cover specific parts of the system, show some alternative options, and explain the rationale behind certain decisions.

While the needs are based on my requirements, there could be more general takeaways, even if building a general-purpose SSG and a bespoke one are different endeavors.

Some background

At one point, I distinguished between two types of posts on this site: articles and pulses. The distinction between the two has become blurred over time, but that's a different story.

For the articles, I use the Markdown file's name as the title, for example, Alpine.js directives and WordPress sanitization.md.

In addition to the content, I also include some meta information: date, tags, etc.

---
date: 2020-09-01
tags: wordpress, alpinejs
---
 
WordPress has functions with sensible defaults for when you want to filter untrusted HTML ...
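As an illustration of what reading such a file involves, here is a minimal sketch that splits the meta block from the content. The function name parseFrontMatter and the returned array shape are assumptions for this example, and it only understands flat "key: value" lines; it is not a YAML parser and not necessarily how the site does it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: splits a Markdown string into its meta block and body.
 * Only flat "key: value" lines are supported.
 *
 * @return array{meta: array<string, string>, body: string}
 */
function parseFrontMatter(string $markdown): array
{
    // The meta block must open the file and sit between two "---" lines.
    if (!preg_match('/\A---\R(.*?)\R---\R(.*)\z/s', $markdown, $matches)) {
        return ['meta' => [], 'body' => $markdown];
    }

    $meta = [];
    foreach (preg_split('/\R/', $matches[1]) as $line) {
        // Split on the first colon only, so values may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $meta[trim($parts[0])] = trim($parts[1]);
        }
    }

    return ['meta' => $meta, 'body' => ltrim($matches[2])];
}

$parsed = parseFrontMatter(
    "---\ndate: 2020-09-01\ntags: wordpress, alpinejs\n---\n\nWordPress has functions ..."
);

// Tags arrive as one comma-separated string and are split afterwards.
$tags = array_map('trim', explode(',', $parsed['meta']['tags']));
```

A file without a leading meta block comes back with an empty meta array and the untouched content as the body.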

The pulses do not contain any meta information. Instead, the file's name includes the date, for example, 202203281713 Introducing the Pulse.md.
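Pulling the date and title out of such a file name could look roughly like this. The sketch assumes the 12 leading digits are year, month, day, hour, and minute, followed by a single space; the function name parsePulseFileName is made up for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: splits a pulse file name into date and title.
 * Assumes a 12-digit YmdHi prefix, a space, the title, and a ".md" suffix.
 *
 * @return array{date: DateTimeImmutable, title: string}
 */
function parsePulseFileName(string $fileName): array
{
    if (!preg_match('/\A(\d{12}) (.+)\.md\z/', $fileName, $matches)) {
        throw new InvalidArgumentException("Not a pulse file name: {$fileName}");
    }

    // The leading "!" resets all fields not in the format (seconds) to zero.
    $date = DateTimeImmutable::createFromFormat('!YmdHi', $matches[1]);
    if ($date === false) {
        throw new InvalidArgumentException("Invalid date prefix: {$matches[1]}");
    }

    return ['date' => $date, 'title' => $matches[2]];
}

$pulse = parsePulseFileName('202203281713 Introducing the Pulse.md');
echo $pulse['date']->format('Y-m-d H:i'), ' / ', $pulse['title'], PHP_EOL;
// 2022-03-28 17:13 / Introducing the Pulse
```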

While I'm sure the reasons behind these decisions are intriguing for everyone, with great effort, I'll refrain from discussing them since they are not necessary to understand the rest of the article.

One quick note: not all "pages" are generated from Markdown files; some data is read from a JSON file.

A possible solution

The two post types keep their meta information in different places, so they must be handled differently, and that adds some complexity.

Factory or factories for the post types

To keep things separate, we can create distinct factories for the post types: PulseFactory, ArticleFactory.

Having one factory with multiple creation methods is an alternative, but the creation paths will likely have different dependencies. For example, the PulseFactory does not need any kind of front-matter parsing for the meta.
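The difference in dependencies shows up in the constructors. In this sketch, FrontMatterParser is a hypothetical interface invented for illustration; only the two factory names come from the article.

```php
<?php
declare(strict_types=1);

// Hypothetical dependency: turns a raw meta block into key/value pairs.
interface FrontMatterParser
{
    /** @return array<string, string> */
    public function parse(string $rawMeta): array;
}

// Articles carry a meta block, so their factory needs a parser.
final class ArticleFactory
{
    public function __construct(private FrontMatterParser $frontMatterParser)
    {
    }
}

// Pulses have no meta block to parse, so no collaborator is needed.
final class PulseFactory
{
}
```

A single factory with two creation methods would have to accept the parser even when it only ever creates pulses.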

This separation is "nice", but it's somewhat inconvenient to call the appropriate factories explicitly. It would be more convenient to pass a Markdown file to a factory and receive something in return.

That something could be a Post, so we can create a PostFactory. Having separate Article and Pulse entities might also make sense.

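class PulseFactory
{
    public function create(MdFile $inputFile): Post
    {
        // No meta block to parse; the date comes from the file name.
    }
}

class ArticleFactory
{
    public function create(MdFile $inputFile): Post
    {
        // Title from the file name; date and tags from the meta block.
    }
}

class PostFactory
{
    public function create(MdFile $inputFile): Post
    {
        // Accepts any Markdown file and hands it to the matching factory.
    }
}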
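The skeletons above lean on two types that are not shown, MdFile and Post. A minimal sketch of what they might look like follows; every property and method here is an assumption, as is the posts/ directory in the usage line, and the real entities likely carry more.

```php
<?php
declare(strict_types=1);

// Hypothetical value object for a Markdown file on disk.
final readonly class MdFile
{
    public function __construct(
        public string $path,
        public string $content,
    ) {
    }

    // File name without the directory and the ".md" extension.
    public function name(): string
    {
        return basename($this->path, '.md');
    }
}

// Hypothetical post entity shared by articles and pulses.
final readonly class Post
{
    /** @param list<string> $tags */
    public function __construct(
        public string $title,
        public DateTimeImmutable $date,
        public string $content,
        public array $tags = [],
    ) {
    }
}

$file = new MdFile('posts/Alpine.js directives and WordPress sanitization.md', '...');
echo $file->name(), PHP_EOL; // Alpine.js directives and WordPress sanitization
```

Making tags optional keeps a single Post usable for pulses, which have none; separate Article and Pulse entities would make that difference explicit instead.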

To "connect" all these factories, we can introduce a PostTypeFactory:

interface PostTypeFactory
{
    public function create(MdFile $inputFile): Post;
}

The picker class

How to determine which factory to call is important, but the crucial question is which class does the determining.

It's perfectly acceptable to have that logic inside the PostFactory:

class PostFactory implements PostTypeFactory
{
    public function create(MdFile $inputFile): Post
    {
        $concreteFactory = match ($inputFile) {
            // or if statements, private methods ...
        };

        return $concreteFactory->create($inputFile);
    }
}

We can also introduce a "picker" class, PostTypeFactoryPicker, responsible for determining the correct factory based on the Markdown file. It could just as well be called a Resolver or a Determiner.
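To make the idea concrete, here is a small runnable sketch of such a picker. The decision rule (a 12-digit date prefix marks a pulse) is an assumption based on the file name examples, and MdFile and Post are trimmed stand-ins (including the made-up type property) so the block runs on its own.

```php
<?php
declare(strict_types=1);

// Trimmed stand-ins so the sketch runs on its own.
final readonly class MdFile
{
    public function __construct(public string $name)
    {
    }
}

final readonly class Post
{
    public function __construct(public string $type, public string $title)
    {
    }
}

interface PostTypeFactory
{
    public function create(MdFile $inputFile): Post;
}

final class ArticleFactory implements PostTypeFactory
{
    public function create(MdFile $inputFile): Post
    {
        // The file name without its extension is the title.
        return new Post('article', basename($inputFile->name, '.md'));
    }
}

final class PulseFactory implements PostTypeFactory
{
    public function create(MdFile $inputFile): Post
    {
        // Drop the 12-digit date prefix and the space after it.
        return new Post('pulse', basename(substr($inputFile->name, 13), '.md'));
    }
}

final class PostTypeFactoryPicker
{
    public function pick(MdFile $inputFile): PostTypeFactory
    {
        // Assumed rule: a 12-digit date prefix marks a pulse.
        return preg_match('/\A\d{12} /', $inputFile->name) === 1
            ? new PulseFactory()
            : new ArticleFactory();
    }
}

$picker = new PostTypeFactoryPicker();
$file = new MdFile('202203281713 Introducing the Pulse.md');
$post = $picker->pick($file)->create($file);
echo $post->type, ': ', $post->title, PHP_EOL; // pulse: Introducing the Pulse
```

The picker instantiates the factories itself to keep the sketch short; with real dependencies involved, they would more likely be injected into it.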

Bringing it all together

In the end, we have something like this:

class PostTypeFactoryPicker
{
    public function pick(MdFile $inputFile): PostTypeFactory
    {
        // Decide between PulseFactory and ArticleFactory based on the file.
    }
}

readonly class PostFactory implements PostTypeFactory
{
    public function __construct(
        private PostTypeFactoryPicker $postTypeFactoryPicker
    ) {
    }

    public function create(MdFile $inputFile): Post
    {
        // Promoted constructor properties must be accessed through $this.
        $concreteFactory = $this->postTypeFactoryPicker->pick($inputFile);

        return $concreteFactory->create($inputFile);
    }
}

Somewhere in the system, this will be executed. In practice, a dependency injection container would do the wiring; written out by hand, it looks like this:

$post = (new PostFactory(
    new PostTypeFactoryPicker()
))->create($file);

Conclusion

There's nothing groundbreaking here, and this resembles some of the well-known patterns.

The fact that the creational requirements are not intermingled is a positive trait. Even that facade-like factory (PostFactory) that doesn't do much has its merits.

The picker class is questionable, but having the determination logic in one place is undoubtedly good.

There are other ways to do it, but more about that in another article.