A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. With the help of many talented developers, I eventually got it to run in under 30 seconds. The optimization process was so much fun, and so many people pitched in with their ideas, that I eventually decided I wanted to do something more.
That's why I built a performance challenge for the PHP community.
The goal of this challenge is to parse 100 million rows of data with PHP, as efficiently as possible. The challenge will run for about two weeks, and at the end there are prizes for the best entries (among them the very sought-after PhpStorm Elephpant, of which we only have a handful left).
This is why I jumped from PHP to Go, and then from Go to Rust.
Go is the most batteries-included language I've ever used. Instant compile times mean I can bind my tests to ctrl/cmd+s and run them every time I save the file. It's more performant (way less memory, similar CPU time) than C# or Java (and certainly all the scripting languages) and has a massive stdlib for anything you could want to do. It's what scripting languages should have been. Anyone can read it, just like Python.
Rust takes the last 20% I couldn't get in a GC language and removes it. Sure, its syntax doesn't make sense to an outsider and you end up with third-party packages for a lot of things, but you can't beat its performance and safety. It removes the need for a whole lot of tests, because those situations just aren't possible.
If Rust scares you use Go. If Go scares you use Rust.
> duckdb -s "COPY (SELECT url[20:] as url, date, count(*) as c FROM read_csv('data.csv', columns = { 'url': 'VARCHAR', 'date': 'DATE' }) GROUP BY url, date) TO 'output.json' (ARRAY)"
Takes about 8 seconds on my M1 MacBook. The JSON isn't in the right format, but fixing that wouldn't dominate the execution time.
I took a quick look; the dependency on PHP 8.5 is mildly irritating. Even Ubuntu 26.04 isn't lined up to ship with that version; it's on 8.4.11.
You mention in the README that the goal is to run things in a standard environment, but then you're using a near-bleeding-edge PHP version that people are unlikely to be using?
I thought I'd just quickly spin up a container and take a look out of interest, but now it looks like I'll have to go dig into building my own PHP packages, or compiling my own version from scratch to even begin to look at things?
I was curious what it would take if I approached it the way I do with most CSV transformation tasks that I'm only intending to do once: use Unix command line tools such as cut, sed, sort, and uniq to do the bulk of the work, and then do something in whatever scripting language seems appropriate to put the final output in whatever format is needed.
The first part, using this command [1], produces output lines that look like this:
219,/blog/php-81-before-and-after,2021-06-21
and is sorted by URL path and then date.
With 1 million lines that took 9 or 10 seconds (M2 Max Mac Studio). But with 100 million it took 1220 seconds, virtually all of which was sorting.
Turning that into JSON via a shell script [2] was about 15 seconds. (That script is 44% longer than it would have been had JSON allowed a comma after the last element of an array).
So basically 22 minutes. The sorting is the killer with this type of approach, because the input is 7 GB. The output is only 13 MB, there are under 300 pages, and the largest page count is under 1000, so building the output up in memory as the unsorted input is scanned, and only then sorting it, would clearly be way, way faster.
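For the record, a minimal sketch of that unsorted-scan-plus-late-sort idea in plain PHP (assuming a two-column url,date CSV named data.csv; the actual challenge input and output formats may differ):

```php
<?php
// Aggregate counts in memory while streaming the unsorted input,
// then sort only the small result instead of the 7 GB file.
$counts = [];

$handle = fopen('data.csv', 'rb');
while (($line = fgets($handle)) !== false) {
    [$url, $date] = explode(',', rtrim($line, "\n"), 2);
    $counts[$url][$date] = ($counts[$url][$date] ?? 0) + 1;
}
fclose($handle);

// With under 300 pages, sorting the aggregate is essentially free.
ksort($counts);
foreach ($counts as &$dates) {
    ksort($dates);
}
unset($dates);

file_put_contents('output.json', json_encode($counts, JSON_PRETTY_PRINT));
```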
> A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. Together with the help of many talented developers, I eventually got it to run in under 30 seconds
That's a huge improvement! Out of curiosity, how much was low-hanging fruit unrelated to the PHP interpreter itself (e.g. parallelism, faster SQL queries, etc.)?
I am not smart enough to use Rust, so take this with a grain of salt, but its syntax just makes me go crazy. Go, on the other hand, is a breath of fresh air. Unless you really need that additional 20% improvement that Rust provides, I think Go should be the default of the two for most projects.
> The output should be encoded as a pretty JSON string.
So apparently that is what they consider "pretty JSON". I really don't want to see what they would consider "ugly JSON".
(I think the term they may have been looking for is "pretty-printed JSON", which implies something about the formatting rather than being a completely subjective term.)
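For reference, "pretty-printed" in PHP terms is just json_encode's JSON_PRETTY_PRINT flag; a quick illustration with made-up data:

```php
<?php
$data = ['date' => '2021-06-21', 'count' => 219];

echo json_encode($data), "\n";
// {"date":"2021-06-21","count":219}

echo json_encode($data, JSON_PRETTY_PRINT), "\n";
// {
//     "date": "2021-06-21",
//     "count": 219
// }
```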
PHP has always escaped forward slashes, I believe, to help prevent malicious JSON from injecting tags into JavaScript. It was common for PHP users to json_encode some data and then write it out into the HTML in a script tag. A malicious actor could include a closing script tag and then inject their own HTML tags and scripts.
The weirdness is partly in JSON itself. In the JSON spec, the slash (named "solidus" there) is the only character that can be written either plainly or prefixed with a backslash (AKA "reverse solidus").
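Both spellings are valid per the spec, and PHP lets you pick with a flag; a quick demonstration:

```php
<?php
$url = '/blog/php-81-before-and-after';

echo json_encode($url), "\n";
// "\/blog\/php-81-before-and-after"   <- escaped solidus, PHP's default

echo json_encode($url, JSON_UNESCAPED_SLASHES), "\n";
// "/blog/php-81-before-and-after"     <- plain solidus, equally valid JSON
```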
Can't speak for go... but for the handful of languages I've thrown at Claude Code, I'd say it's doing the best job with Rust. Maybe the Rust examples in the wild are just better compared to say C#, but I've had a much smoother time of it with Rust than anything else. TS has been decent though.
Hehe. Optimization ... it's a good way to learn. Earlier in my career I did a lot of PHP. Usually close to bare.
Other than the obvious point that writing an enormous JSON file is a dubious goal in the first place (really), while PHP can be very fast this is probably faster to implement in shell with sed/grep, or ... almost certainly better ... by loading to sqlite then dumping out from there. Your optimization path then likely becomes index specification and processing, and after the initial load potentially query or instance parallelization.
The page confirms sqlite is available.
If the judges whinge and shell_exec() is unavailable as a path, a more whinge-tolerant alternative is to use PHP's sqlite feature, then dump to JSON.
If I wanted to achieve this for some reason in reality, I'd have the file on a memory-backed blockstore before processing, which would yield further gains.
Frankly, this is not much of a programming problem; it's more a systems problem, but it's not being specced as such. This shows, in my view, an immature conception of the real problem domain (which is likely IO-bound). Right tool for the job.
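To make the sqlite route from a few paragraphs up concrete, here is a rough sketch using PHP's bundled SQLite3 extension (file name, schema, and output shape are assumptions, not the challenge's actual spec):

```php
<?php
// Load the CSV into an in-memory SQLite table, let SQLite do the
// grouping, then dump the aggregate to JSON.
$db = new SQLite3(':memory:');
$db->exec('CREATE TABLE hits (url TEXT, date TEXT)');

$insert = $db->prepare('INSERT INTO hits VALUES (:url, :date)');
$db->exec('BEGIN');
$handle = fopen('data.csv', 'rb');
while (($row = fgetcsv($handle)) !== false) {
    $insert->bindValue(':url', $row[0], SQLITE3_TEXT);
    $insert->bindValue(':date', $row[1], SQLITE3_TEXT);
    $insert->execute();
    $insert->reset();
}
fclose($handle);
$db->exec('COMMIT');

$result = $db->query(
    'SELECT url, date, COUNT(*) AS c FROM hits GROUP BY url, date ORDER BY url, date'
);
$rows = [];
while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
    $rows[] = $row;
}
file_put_contents('output.json', json_encode($rows, JSON_PRETTY_PRINT));
```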