But first, why did I create a new blog path structure? If I'm just rebuilding it with a PHP framework, what necessitated the change? Two things:
- The old version didn't have a well thought out URL structure
- The new version of the site uses FastRoute for its routing engine
Let's go through both of those points.
The old version didn't have a well thought out URL structure
In the previous version of this website, which was built with Hugo, a static site generator, there wasn't a well thought out URL path structure for the blog.
I feel that this is due to two key reasons: the flexibility of Hugo's URL structure, and the lack of thought on my part for how I'd structure the blog in the first place.
Given these two problems, URI paths to blog items could be anything, e.g.:
- servicemanager/accessing-servicemanager-services-controller-plugins
- tutorial/zftool-for-basic-project-management
- what-I-learned-using-expressive
- zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away
Perhaps I'm being pedantic, but this isn't an ideal structure, as there is precious little, if anything, in the URL to show that it's a blog article and not just another page in the website.
So to correct this, in the new version of the site all blog articles are prefixed with "/blog/item/". To me, that distinctly sets blog items apart from normal pages.
The new version uses FastRoute
As the new version of the site uses FastRoute for its routing engine, (to the best of my knowledge) URLs can no longer support a nested path structure.
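This is because routes are defined by patterns that are compiled to regular expressions (regexes), such as '/test', '/articles/{id:\d+}[/{title}]', and '/user/{id:\d+}'. By default, a placeholder such as {title} matches [^/]+, so it stops at the first slash.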
After quite some experimentation, I wasn't able to find a regex that was both broad enough to match the existing routes and allowed by FastRoute.
What's more, I suspect that even if I could, it would cause problems for a handful of URLs, if not more. Because of this, any URL with a nested path would need to be flattened, e.g., "zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away" to "zend-framework-json-xml-rss-csv-its-all-a-contextswitch-away".
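To illustrate the flattening itself, here's a minimal Python sketch. It isn't part of the site's code: the function name flatten_path is made up for this example, and it assumes that flattening means nothing more than joining the path segments with hyphens.

```python
def flatten_path(path: str) -> str:
    """Collapse a nested path into one segment (hypothetical helper).

    Assumption: each "/" simply becomes "-"; empty segments caused by
    leading, trailing, or doubled slashes are dropped.
    """
    segments = [segment for segment in path.split("/") if segment]
    return "-".join(segments)


print(flatten_path("zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away"))
```

A path that has no slashes to begin with comes back unchanged.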
Why did I use Apache 2 and mod_rewrite?
Honestly, it came down to practicality. My typical approach to development is to use containers, where the container images are based off of official Docker Hub images. So, for PHP, I'll use the official PHP image. And, to save some time and effort, I'll use the tag that uses Apache 2 with the prefork module to hand off dynamic requests to PHP.
Given that, mod_rewrite was the logical choice for rewriting requests. If I were using a different web server, such as Caddy, nginx, or lighttpd, then I would have created the redirect setup with that web server in mind.
If you're not familiar with mod_rewrite, it's:
A rule-based rewriting engine to rewrite requested URLs on the fly.
If you're not familiar with URL rewriting, especially in the context of mod_rewrite, here's an example of how you'd rewrite one request to another:
RewriteRule "^index\.html$" "welcome.html"
In the example above, any request to index.html in your application would be internally rewritten to welcome.html.
So, if someone requested https://your.domain.invalid/index.html, they'd be served the content of https://your.domain.invalid/welcome.html. For the browser to be redirected to the new URL instead, the rule needs the R flag, e.g., [R=301].
You can also make your rules quite complex, if and when required, by:
- Adding conditions that must pass before a rewrite will be allowed with RewriteCond
- Setting the base path from which URLs are matched with RewriteBase
- Making the rewrite case sensitive or insensitive
Now, while it can be pretty trivial to add mod_rewrite directives for rewriting a handful of routes, I have 355 routes that need to be redirected!
So, I needed to avoid writing — and maintaining — that many rules over the lifetime of the website. I needed something far more compact, more efficient, and more accommodating. Something that could handle any number of routes with a minimum of code.
That's the other reason that I chose mod_rewrite: its RewriteMap feature. If you're not familiar with it:
The RewriteMap directive defines an external function which can be called in the context of RewriteRule or RewriteCond directives to perform rewriting that is too complicated, or too specialized to be performed just by regular expressions. The source of this lookup can be any of the types listed in the sections below, and enumerated in the RewriteMap reference documentation.
How was the rewrite map created?
Now, let's look at how the rewrite configuration was created using Apache 2's mod_rewrite module.
First up, I made a map of all the existing blog articles and their flattened versions, storing it in a simple space-separated text file. Here's a quick example:
zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away zend-framework-json-xml-rss-csv-its-all-a-contextswitch-away
The original version is on the left, and the new version is on the right. In many cases, the two paths are the same. Other times, there are just minor differences.
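As a rough sketch of how a map file in this format could be generated rather than typed by hand, the Python below turns a list of legacy paths into map lines. It's an illustration, not the script used for this site: build_map_lines and its overrides parameter are hypothetical names, and it assumes that the new path is the old one with slashes replaced by hyphens unless an override says otherwise.

```python
def build_map_lines(legacy_paths, overrides=None):
    """Return one "old-path new-path" line per legacy path (hypothetical helper)."""
    overrides = overrides or {}
    seen = set()
    lines = []
    for legacy in legacy_paths:
        key = legacy.strip("/")
        # The file is whitespace-separated, so a key containing whitespace
        # would be read as more than one field.
        if not key or any(ch.isspace() for ch in key):
            raise ValueError(f"invalid legacy path: {legacy!r}")
        if key in seen:
            raise ValueError(f"duplicate legacy path: {key!r}")
        seen.add(key)
        # Assumption: flatten by default, but let specific paths be remapped.
        target = overrides.get(key, key.replace("/", "-"))
        lines.append(f"{key} {target}")
    return lines


legacy_paths = [
    "tutorial/zftool-for-basic-project-management",
    "what-I-learned-using-expressive",
]
print("\n".join(build_map_lines(legacy_paths)))
```

Writing the returned lines to a file, one per line, gives a text file in the same shape as the example above. Rejecting duplicates and stray whitespace up front is a cheap way to catch mistakes before Apache ever reads the file.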
Then, I created the configuration below and added it to the Apache configuration in the running container.
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/$
RewriteRule ^(.*)/$ $1 [R=301,L]
RewriteMap rewritemap "txt:/etc/apache2/legacy-blog-map.txt"
RewriteCond %{REQUEST_URI} ^/(.*)$
RewriteCond ${rewritemap:%1|NOT_PRESENT} !NOT_PRESENT
RewriteRule .? "/blog/item/${rewritemap:%1}" [R=301,L]
Here’s what’s going on in the configuration above:
- The first line enables mod_rewrite
- The second adds a rewrite condition which fails if the request is for the default (home) route. Without this, requests to the default route end up in an infinite redirect loop.
- The third line removes a trailing slash, if present, by issuing a 301 redirect to the same route without it
- The fourth enables the rewrite map, naming it rewritemap and setting its source to /etc/apache2/legacy-blog-map.txt, which is where the rewrite map file has been copied into the Apache/PHP container.
- The fifth line is a rewrite condition that checks if the current request URI matches the regex ^/(.*)$, which matches any path on the site and captures everything after the leading slash as %1. To be fair, that’s a pretty broad regular expression, but a necessary one.
- The sixth line adds a condition that looks up the captured path in the rewrite map. If it isn’t there, the lookup returns the default value, NOT_PRESENT, so the condition fails and the rewrite process ends. If a match is found, then it continues on to the final line.
- The final line redirects the old blog item URI to the new one
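To make the flow of those rules easier to follow, here's a simplified Python model of what they do to a request path. It's only a sketch: parse_map and resolve are hypothetical names, it ignores query strings and everything else Apache handles, and it isn't a substitute for testing against a real server.

```python
def parse_map(text):
    """Parse "key value" lines, skipping blank lines and # comments."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        mapping[fields[0]] = fields[1]
    return mapping


def resolve(request_path, mapping):
    """Return the redirect target for a request path, or None for no redirect."""
    # Rules 2 and 3: strip one trailing slash, except on the home route.
    if request_path != "/" and request_path.endswith("/"):
        return request_path[:-1]
    # Rule 5: capture everything after the leading slash (the %1 backreference).
    key = request_path[1:]
    # Rule 6: a key that is missing from the map stops the process here.
    if key not in mapping:
        return None
    # Final rule: redirect to the new location under /blog/item/.
    return "/blog/item/" + mapping[key]


legacy_map = parse_map(
    "zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away "
    "zend-framework-json-xml-rss-csv-its-all-a-contextswitch-away\n"
)
print(resolve("/zend-framework/json-xml-rss-csv-its-all-a-contextswitch-away", legacy_map))
```

In this model, a legacy URL with a trailing slash takes two hops: the first strips the slash, and the second looks the path up in the map.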
That's how I rewrite URLs with Apache 2’s RewriteMap directive
While rewriting URLs is common practice, and has been for many years, it can be confusing to know how to do it, given the wealth of choices and approaches. As I found this approach really helpful for redirecting a significant number of URLs with a minimum of code and effort, I felt it was worth sharing.
I hope that it helps you out, if you're looking for a compact and efficient way to update the structure of your website or blog.