Entries tagged 'golang'
Time to modernize PHP’s syntax highlighting?
This blog post about “A syntax highlighter that doesn't suck” was timely because recently I had been kicking at the code for the syntax highlighter that I use on this blog. It’s a very old JavaScript package called SHJS based on GNU Source-highlight.
I created a Git repository where I imported all of the released versions of SHJS and then tried to update the included language files to the ones from the latest GNU Source-highlight release (which was four years ago), but ran into some trouble. There are some new features to the syntax files that the old Perl code in the SHJS package can’t handle. And as you might imagine, the pile of code involved is really, really old.
That new PHP package seems like a great idea and all, but I really like the idea of leveraging work that other people have done to create syntax highlighting for other languages rather than inventing another one.
On Mastodon, Ben Ramsey brought up a start he had made at trying to port Pygments, a Python syntax highlighter, to PHP.
I ran across Chroma, which is a Go package that is built on top of the Pygments language definitions. They’ve converted the Pygments language definitions into an XML format. The converted definitions don’t handle every language, but they cover most of them.
At the end of the day, GNU Source-highlight, Pygments, and their variants are all built on what are likely to remain imprecise parsers, because they are mostly regex-based rather than using the same lexing and parsing code that actually handles these languages.
PHP has long had its own built-in syntax highlighting functions (highlight_string() and highlight_file()), but it looks like the generation code hasn’t been updated in a meaningful way in about 25 years. It just has five configurable colors that it uses for <span style="color: #...;"> tags. There are many tokens that it simply outputs in the same color where it could make more distinctions. If it were to instead (or also) use CSS classes to mark every token with its exact type, you could do much finer-grained syntax highlighting.
Looks like an area ready for some experimentation.
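To make the idea concrete, here is a rough sketch of what class-per-token output could look like. It is written in Go rather than PHP and uses Go’s own go/scanner package to highlight Go source, so it is only an analogy for what PHP’s highlighter might do with PHP’s tokenizer. The class names (keyword, ident, and so on) and the highlight() function are made up for this example, not taken from any existing tool.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"html"
	"strings"
)

// cssClass maps a token type to a CSS class name.
// The class names are hypothetical; a real tool could be far more granular.
func cssClass(tok token.Token) string {
	switch {
	case tok == token.COMMENT:
		return "comment"
	case tok == token.STRING || tok == token.CHAR:
		return "string"
	case tok == token.INT || tok == token.FLOAT || tok == token.IMAG:
		return "number"
	case tok == token.IDENT:
		return "ident"
	case tok.IsKeyword():
		return "keyword"
	case tok.IsOperator():
		return "operator" // also covers punctuation such as ( ) { } ;
	}
	return "other"
}

// highlight wraps every token of Go source in a <span> carrying its class.
// It uses the language's real lexer instead of regex-based rules.
func highlight(src string) string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, scanner.ScanComments)

	var out strings.Builder
	last := 0 // offset just past the previous token
	for {
		pos, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		// The scanner inserts semicolons at line ends; they are not in the
		// source text, so skip them and let the gap copy emit the newline.
		if tok == token.SEMICOLON && lit == "\n" {
			continue
		}
		text := lit
		if text == "" {
			text = tok.String() // operators carry no literal
		}
		offset := file.Offset(pos)
		out.WriteString(html.EscapeString(src[last:offset])) // whitespace gap
		fmt.Fprintf(&out, `<span class="%s">%s</span>`,
			cssClass(tok), html.EscapeString(text))
		last = offset + len(text)
	}
	out.WriteString(html.EscapeString(src[last:]))
	return out.String()
}

func main() {
	fmt.Println(highlight("x := 1 // hi"))
}
```

With output like this, the colors live entirely in a stylesheet, and adding a finer distinction is a matter of emitting one more class rather than adding one more configurable color.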
dipping my toes in go
one of the very first things i noticed when i migrated our website to a new server is that someone was running a vulnerability scanner against us, which was annoying. i cranked up the bot-fighting tools on cloudflare, but i also got fail2ban running pretty quickly so it would add the IP addresses for obviously bad requests to an IP list on cloudflare that would lock those addresses out of the site for a while. not a foolproof measure, of course, but maybe it just makes us a slightly harder target so they move on to someone else.
but fail2ban is a very old system with a pretty gross configuration system. i was poking around for a more modern take on the problem, and i found a simple application written in go called silencer that i decided to try and work with. i forked it so i could integrate it with cloudflare, and it was very straightforward. i also had to update one of the dependencies so it actually handled log file rotation. when i get time to hack on it some more, i’ll add handling for ipv6 as well as ipv4 addresses.
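here is a rough sketch of what handling both address families could look like. this is not silencer’s actual code: the function names, the looksBad rule, the sample paths, and the choice to group ipv6 clients by /64 are all assumptions made up for illustration. it also assumes a log format where the client address is the first field on the line.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// looksBad is a hypothetical rule: flag requests for paths scanners love.
func looksBad(line string) bool {
	return strings.Contains(line, "/.env") || strings.Contains(line, "/wp-login.php")
}

// clientAddr parses the first field of a log line as an ipv4 or ipv6 address.
// Assumes the client address comes first, as in the common log format.
func clientAddr(line string) (netip.Addr, bool) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return netip.Addr{}, false
	}
	addr, err := netip.ParseAddr(fields[0])
	if err != nil {
		return netip.Addr{}, false
	}
	// Unmap turns ipv4-mapped ipv6 (::ffff:192.0.2.7) into plain ipv4,
	// so one client is not counted under two different keys.
	return addr.Unmap(), true
}

// banKey is the unit that gets counted and banned: a single ipv4 address,
// or a whole /64 for ipv6 (an assumption, since one client can easily
// hop between addresses inside its /64).
func banKey(addr netip.Addr) netip.Prefix {
	bits := 32
	if addr.Is6() {
		bits = 64
	}
	p, _ := addr.Prefix(bits) // only fails for out-of-range bits
	return p
}

// offenders returns the keys that reach the threshold, in the order they hit it.
func offenders(lines []string, threshold int) []netip.Prefix {
	counts := make(map[netip.Prefix]int)
	var banned []netip.Prefix
	for _, line := range lines {
		if !looksBad(line) {
			continue
		}
		addr, ok := clientAddr(line)
		if !ok {
			continue
		}
		key := banKey(addr)
		counts[key]++
		if counts[key] == threshold {
			banned = append(banned, key)
		}
	}
	return banned
}

func main() {
	// made-up sample lines using documentation address ranges
	lines := []string{
		`192.0.2.7 - - "GET /.env HTTP/1.1" 404`,
		`2001:db8:1:2::5 - - "GET /wp-login.php HTTP/1.1" 404`,
		`192.0.2.7 - - "GET /wp-login.php HTTP/1.1" 404`,
		`2001:db8:1:2::9 - - "GET /.env HTTP/1.1" 404`,
	}
	for _, p := range offenders(lines, 2) {
		fmt.Println("would ban", p)
	}
}
```

the nice part of net/netip is that one parser covers both families and the resulting values can be used directly as map keys, so the counting code does not need separate ipv4 and ipv6 paths.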
go is an interesting language. obviously i don’t have my head wrapped around the customs and community, so it seems a little rough to me, but it’s also not so different that i couldn’t feel my way around pretty quickly to solve my problem at hand.