WebFinger and why it makes things simpler

If you are playing with the idea of creating a distributed social network or service, there are standards that you have to be aware of.

One of them is the WebFinger protocol.

Let's look at an example of "a decentralised, minimalist microblogging service for hackers" that does not use the WebFinger protocol.

If you want to follow someone with twtxt, you do:

twtxt follow bob http://bobsplace.xyz/twtxt

But because twtxt is configurable, Alice can choose to expose their posts somewhere other than the default location.

So when you want to follow Alice, you'll type:

twtxt follow alice http://alice.space/updates.txt

The question: how do you know each individual's twtxt URL if there's no standard, only a sensible default?

You have to find out. But how? The answer: it depends. You can try the default, make an educated guess, ask Alice directly, or check whether Alice shares the link on their website.

This is where WebFinger comes into play.

WebFinger (...) can be used to discover information about people or other entities on the Internet using standard HTTP methods.

For a person, the type of information that might be discoverable via WebFinger includes a personal profile address, identity service, telephone number, or preferred avatar.

Information returned via WebFinger might be for direct human consumption (e.g., looking up someone's phone number), or it might be used by systems to help carry out some operation (...)

To translate this into practice: if twtxt had implemented WebFinger, the commands above would look like this:

twtxt follow bob@bobsplace.xyz
twtxt follow alice@alice.space

There would be no need to know the exact location of the twtxt files.
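To sketch how a client could do the discovery: WebFinger defines a fixed well-known endpoint, so only the domain part of the address is needed to build the lookup URL. (The account below is the hypothetical one from the example.)

```shell
# Derive the WebFinger lookup URL from an account handle (a sketch).
account="bob@bobsplace.xyz"
host="${account#*@}" # everything after the "@"
webfinger_url="https://${host}/.well-known/webfinger?resource=acct:${account}"
echo "$webfinger_url"

# A client would then fetch the URL and read the twtxt file's location
# from the "links" array of the returned JRD document, e.g.:
# curl -s "$webfinger_url"
```

The `/.well-known/webfinger` path and the `acct:` URI scheme are fixed by the WebFinger spec (RFC 7033), which is exactly why no per-user configuration is needed.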

By the way, even though they look like email addresses, they are not. In the same way, the account usernames on Mastodon that look like email addresses are not either. Yes, indeed: Mastodon uses WebFinger.

With WebFinger, you get a standardized protocol that discovers the URLs for you. And since the URLs are retrieved instead of hardcoded, they can be changed without worry: they can be "rediscovered" anytime.

But as twtxt proves, it's not a requirement. While some may miss it, the project has had its fair share of success without it.

Nevertheless, IMHO, if you are playing with the idea of creating a distributed social network or service, use WebFinger.

Recover an (accidental) git force push with the GitHub API

Imagine somebody force-pushed to a branch. That person is now unreachable, and some critical code is gone. You never had the repo cloned or the branch checked out. Can you do something?

It turns out, yes. At least if you are using GitHub (GH).

With the GH API, there are things you can do that are not otherwise available in the web UI.

For example, you can get a list of all things that happened to a repo using the list repository events endpoint.

curl \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <YOUR-TOKEN>" \
  https://api.github.com/repos/OWNER/REPO/events

This includes all kinds of activities, like somebody leaving a comment on a PR, somebody starring the repo, or somebody pushing a commit.

[
  {
    "id": "22237752260",
    "type": "WatchEvent",
    "actor": {},
    "repo": {},
    "payload": {
      "action": "started"
    },
    "public": true,
    "created_at": "2022-06-08T23:29:25Z"
  },
  {
    "id": "22249084964",
    "type": "PushEvent",
    "actor": {},
    "repo": {},
    "payload": {
      "push_id": 10115855396,
      "size": 1,
      "distinct_size": 1,
      "ref": "refs/heads/master",
      "head": "7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300",
      "before": "883efe034920928c47fe18598c01249d1a9fdabd",
      "commits": [
        {
          "sha": "7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300",
          "author": {},
          "message": "commit",
          "distinct": true,
          "url": "https://api.github.com/repos/octocat/Hello-World/commits/7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300"
        }
      ]
    },
    "public": true,
    "created_at": "2022-06-09T12:47:28Z"
  }
]

The PushEvent includes before, which is "the SHA of the most recent commit on ref before the push". To put it otherwise, it is the state that we want to recover.
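As a sketch, that SHA can be pulled out of the response with nothing beyond the shell. Here the JSON is an inlined fragment of the sample payload above; in a real run you would feed in the curl output instead, and jq would be a more robust choice than grep.

```shell
# Extract the "before" SHA of a PushEvent from the events JSON
# (inlined sample fragment; pipe in the real curl output instead).
events='{"type":"PushEvent","payload":{"before":"883efe034920928c47fe18598c01249d1a9fdabd"}}'
before_sha=$(printf '%s' "$events" | grep -o '"before": *"[0-9a-f]*"' | head -n 1 | cut -d'"' -f4)
echo "$before_sha"
```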

Now we can take advantage of the create a reference endpoint, which allows us to create a new branch with the state right before the force push. The SHA identifies that state; we pass it to the endpoint, together with the name of the branch we want.

curl \
  -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <YOUR-TOKEN>" \
  https://api.github.com/repos/OWNER/REPO/git/refs \
  -d '{"ref":"refs/heads/featureA","sha":"aa218f56b14c9653891f9e74264a383fa43fefbd"}'

This is it. "We saved the world."


I found the solution on Stack Overflow. This is just a more narrative explanation with updated links to the API.

Lesser known WordPress function to validate data structures

We validate data all the time. In programming, as in all areas of life, what we get is not always what we expect.

Let's take one specific case where we use validation: when constructing objects.

Painter::fromArray([
    // ...
]);

PHP already offers functions to assert things, but many developers use dedicated libraries like Webmozart Assert or Respect\Validation. I do too: they are more convenient to use, and they are more powerful.

But if you are working with WordPress, there is a "WordPress-specific" solution that sits somewhere between the "native PHP" and libraries regarding convenience and power.

It's the rest_validate_object_value_from_schema function. It's something worth considering before requiring an extra dependency.

The function's name gives the impression that it is "REST API specific", but that's not the case.

Imagine that, for some reason, the Painter, besides the name, needs the favorite color in HEX format and, optionally, the names of favorite artists.

Painter::fromArray([
    'firstName' => 'John',
    'lastName' => 'Doe',
    'favoriteColor' => '#000',
    'favoriteArtists' => ['Picasso', 'Salvador Dalí'],
]);

How would you write the validation in "plain PHP"?

Here's how it might look using the rest_validate_object_value_from_schema:

class Painter
{
    public static function fromArray(array $data): self
    {
        $validationResult = rest_validate_object_value_from_schema(
            $data,
            [
                'properties' => [
                    'firstName' => [
                        'type' => 'string',
                        'required' => true,
                    ],
                    'lastName' => [
                        'type' => 'string',
                        'required' => true,
                    ],
                    'favoriteColor' => [
                        'type' => 'string',
                        'format' => 'hex-color',
                        'required' => true,
                    ],
                    'favoriteArtists' => [
                        'type' => 'array',
                        'items' => [
                            'type' => 'string',
                        ],
                    ],
                ],
            ],
            ''
        );

        if (is_wp_error($validationResult)) {
            throw new InvalidArgumentException(
                // ...
            );
        }

        return new self(
            // ...
        );
    }
}

It's pretty readable and understandable at first sight.

If you want to keep the validation code inside the factory method or move it to a dedicated method or a class, that's up to you. Lately, I prefer having dedicated validation/assertion classes.

You can refer to the REST API Schema page to check the validation rules you can use.

Static site (micro)blogging with Telegram and DigitalOcean Functions

I flirted with the idea of using Telegram for (micro)blogging in the past. I almost took it seriously and released msgWP to the public; it allowed creating WordPress posts by sending messages to a Telegram bot.

Today, I prefer Jamstack websites over WordPress for personal projects. But I still think (micro)blogging with Telegram is a good idea.

Here's how it could work with a static site.

Taking a step back

Creating a post in the static site generator (SSG) world typically means creating a Markdown file that contains the post's content with maybe meta information, like date, tags, and title.

Publishing means running the build process and deploying the output to the hosting. If you are using a platform like Netlify or similar, publishing and deploying are things you don't even have to think about. All you have to do is commit a file and push it to the repo.

From all this, it follows that we have to answer this question: how will we create a (Markdown) file, commit it, and push it when we send a Telegram message (to a bot)?

Create a file with the API

If you are using GitHub, they offer an API endpoint to create or update file contents. If you use GitLab, they have something very similar. Probably, all Git repository hosting services offer this.

If by any chance you don't want to use the API, or it's not offered, there are alternative ways to achieve this workflow. Maybe we can go over them in a subsequent article.

From Telegram to Git repo API

I chose DigitalOcean (DO) Functions because it's something I had wanted to try for a while.

Besides other serverless options, a different route would be a low-code integration platform like Pipedream.

There are two ways to create a DO Function. Whichever route you take, the important part is to have a main function.

The main function is the entry point, and it receives an $args parameter.

$args is special: besides other things, it contains the data sent to the function. There's no need (or way) to use a superglobal like $_REQUEST or php://input to access the data.

function main(array $args): array
{
    // 1. access incoming data
    // 2. prepare file
    // 3. create the file with the API
}

To keep out sensitive information from the code and make it more reusable, at least the following environment variables should be set: GH_PERSONAL_ACCESS_TOKEN, COMMIT_AUTHOR_NAME, COMMIT_AUTHOR_EMAIL, REPO_OWNER, REPO_NAME.

A Telegram bot receives many types of updates. The focus will be on a simple text message to keep the article short.

Because the $args already contains the data relayed by Telegram, we can use array destructuring and get the relevant information:

[
    'update_id' => $updateId,
    'message' => [
        'text' => $text,
        'date' => $timestamp,
    ],
] = $args;

Depending on the SSG you prefer, the content of the (post) file will vary. But most likely, you will have some metadata (e.g., the date) as front matter, followed by the content.

$formattedPostDate = date('c', (int)$timestamp); // ISO 8601
$markdownFileContent = <<<CONTENT
---
date: $formattedPostDate
---
$text
CONTENT;

Without any library, we can just use curl to make the request to the API:

$fileToCreate = "{$updateId}.md";
$ghApiUrl = sprintf(
    'https://api.github.com/repos/%s/%s/contents/%s',
    getenv('REPO_OWNER'),
    getenv('REPO_NAME'),
    $fileToCreate
);

$curlHandle = curl_init();
curl_setopt_array(
    $curlHandle,
    [
        CURLOPT_URL => $ghApiUrl,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CUSTOMREQUEST => 'PUT',
        CURLOPT_USERAGENT => 'DigitalOcean Functions',
        CURLOPT_HTTPHEADER => [
            'Authorization: Bearer ' . getenv('GH_PERSONAL_ACCESS_TOKEN'),
            'Content-Type: application/json',
        ],
        CURLOPT_POSTFIELDS => json_encode([
            'message' => "Create {$fileToCreate}",
            'committer' => [
                'name' => getenv('COMMIT_AUTHOR_NAME'),
                'email' => getenv('COMMIT_AUTHOR_EMAIL'),
            ],
            'content' => base64_encode($markdownFileContent),
        ]),
    ]
);
curl_exec($curlHandle);
curl_close($curlHandle);

curl is verbose, but we are just supplying the information required for the GitHub API request.

One thing that might not be obvious is that we have to set the user agent, and we must base64 encode the file content.
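The base64 requirement can be illustrated in the shell as well; this is a sketch with made-up front matter content, not the function's actual payload:

```shell
# GitHub's contents API expects the file body base64-encoded.
markdown='---
date: 2022-06-09T12:47:28+00:00
---
Hello from Telegram'

# tr strips the line wrapping some base64 implementations add.
encoded=$(printf '%s' "$markdown" | base64 | tr -d '\n')
echo "$encoded"
```

Decoding the result gives back the original file content, which is what GitHub does on its side before committing the file.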

There's one more thing: we have to return a response to successfully terminate the DO Function:

return [
    'body' => curl_getinfo($curlHandle, CURLINFO_RESPONSE_CODE),
];

At this point, if you create a Telegram bot and set its webhook to the DO Function URL, things should work.
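Registering the webhook is a single call to the Bot API's setWebhook method. A sketch, where both the token and the function URL are placeholders:

```shell
# Point the Telegram bot at the DO Function (placeholders below).
bot_token="<YOUR-BOT-TOKEN>"
function_url="<YOUR-DO-FUNCTION-URL>"
webhook_api="https://api.telegram.org/bot${bot_token}/setWebhook"
echo "$webhook_api"

# The actual registration call:
# curl -s "$webhook_api" -d "url=${function_url}"
```

From then on, Telegram POSTs every update for the bot to the function URL, which is how $args ends up containing the message data.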



This could serve as a basis, but it's not even close to a flawless (micro)blogging experience. For example, in msgWP, I had message (post) editing and image publishing implemented.

Testing nested HTML Components with the test implementation

In the previous article, I settled on a solution to test data passed to the template renderer for an HTML component.

There's one caveat with the test implementation. The data of nested components gets json_encoded multiple times.

Let me demonstrate.

(As a reminder) we had these two interfaces:

interface Component
{
    public function render(): string;
}

interface TemplateRenderer
{
    public function render(string $templateName, array $data = []): string;
}

And the test implementation for the TemplateRenderer is this:

class JsonEncodeTemplateRenderer implements TemplateRenderer
{
    public function render(string $templateName, array $data = []): string
    {
        return json_encode($data);
    }
}

(The TemplateRenderer is passed as a dependency in the constructor of the Component.)

If we have a component that has a child component, and that one has a child component, like so:

var_dump($component([
    'foo' => 'foo',
    'bar' => $component([
        'baz' => 'baz',
        'qux' => $component([
            'quux' => 'quux',
        ]),
    ]),
]));

we end up with a JSON encoded string like this:

string(79) "{"foo":"foo","bar":"{\"baz\":\"baz\",\"qux\":\"{\\\"quux\\\":\\\"quux\\\"}\"}"}"

This is not ideal because we use assertEquals to compare arrays. If we apply json_decode once, as we did, we get an array that still contains JSON-encoded values:

array(2) {
  ["foo"]=>
  string(3) "foo"
  ["bar"]=>
  string(41) "{"baz":"baz","qux":"{\"quux\":\"quux\"}"}"
}

The solution

I was never thrilled with decoding the data in the tests anyway, so I created a "custom" assertion that does all the "heavy lifting" and recursively decodes the string.

The "custom" assertion is as simple as it can be:

protected function assertTemplateRendererDataEquals(array $expected, string $actual, string $message = ''): void
{
    $this->assertEquals(
        $expected,
        $this->decodeEncodedTemplateRendererData($actual),
        $message
    );
}

It just wraps the "native" assertEquals and defers the work to a private method:

private function decodeEncodedTemplateRendererData(mixed $data): mixed
{
    if (!is_string($data) && !is_array($data)) {
        return $data;
    }

    if (is_string($data)) {
        try {
            $data = json_decode($data, true, flags: JSON_THROW_ON_ERROR);
        } catch (\Exception) {
            return $data;
        }
    }

    if (is_array($data)) {
        foreach ($data as $key => $value) {
            $data[$key] = $this->decodeEncodedTemplateRendererData($value);
        }
    }

    return $data;
}

And with all this in place, in the tests, I can simply do this and forget about the complexity:

$this->assertTemplateRendererDataEquals(
    [
        'foo' => 'foo',
        'bar' => [
            'baz' => ...
        ],
    ],
    $component->render()
);

The code to create those components "on-the-fly":

$component = fn(array $data = []): string => (new class(
    new JsonEncodeTemplateRenderer(),
    $data
) implements Component {
    public function __construct(
        readonly private TemplateRenderer $templateRenderer,
        readonly private array $data,
    ) {
    }

    public function render(): string
    {
        return $this->templateRenderer->render(
            'index.php',
            $this->data,
        );
    }
})->render();