That would be a laudable goal, but I feel like it's contradicted by the text:
> Even on a low-quality image, GPT‑5.2 identifies the main regions and places boxes that roughly match the true locations of each component
I would not consider it to have "identified the main regions" or to have "roughly matched the true locations" when ~1/3 of the boxes have incorrect labels. The remark "even on a low-quality image" is not helping either.
Edit: credit where credit is due, the recently-added disclaimer is nice:
> Both models make clear mistakes, but GPT‑5.2 shows better comprehension of the image.
A key limiting factor for dietary use of single cell protein is the high mass fraction of nucleic acid, which limits daily consumption due to uric acid production during metabolism. High rates of RNA synthesis are unfortunately necessary for high protein productivity.
The paper notes:
>It is important to note that MP products often contain elevated levels of nucleic acids, constituting ~8% of the dry weight [17], which necessitates consideration when assessing their suitability for human consumption. To address this, a heat treatment process is employed at the end of fermentation that reduces the nucleic acid content in the fermented biomass to below 0.75/100 g, while simultaneously deactivating protease activity and F. venenatum biomass. However, this procedure has been observed to induce cell membrane leakage and a substantial loss of biomass, as evidenced in the Quorn production process [17], which also utilizes F. venenatum as the MP producer. Our experimental trials have encountered similar challenges, achieving a biomass yield of merely ~35%, and observed that heating process increased the relative protein and chitin content (Figure 2D,E), which may be related to the effect of membrane leakage, while the intracellular protein of the FCPD engineered strain was less likely to be lost to the extracellular. Thus, concentrating the fermentation broth to enhance protein and amino acids content in successive steps to produce a highly nutritious water-soluble fertilizer appears to be an effective strategy for adding value to the process (Figure 1).
The challenges of developing economically viable single cell protein products that are suitable for human consumption are described in chapter 3 here:
There you can download it in high quality, and it’s pay-what-you-want: you can get it for free if you want, or pay what you feel like and support me. Either way, I’m happy that you enjoy it!
The music should also be on Spotify, Apple Music, and most music streaming services within the next 24h.
A bit about the process of scoring Size of Life:
I’ve worked with Neal before on a couple of his other games, including Absurd Trolley Problems, so we were used to working together (and with his producer—you’re awesome, Liz!). When Neal told me about Size of Life, we had an inspiring conversation about how the music could make the players feel.
The core idea was that it should enhance that feeling of wondrous discovery, but subtly, without taking the attention away from the beautiful illustrations.
I also thought it should reflect the organisms' increasing size—as some of you pointed out, the music grows with them. I think of it as a single instrument that builds upon itself, like the cells in an increasingly complex organism. So I composed 12 layers that loop indefinitely—as you progress, each layer is added, and as you go back, they’re subtracted. The effect is most clear if you get to the end and then return to the smaller organisms!
Since the game has an encyclopedia vibe to it, I proposed going with a string instrument to give it a subtle “Enlightenment-era” and “cultural” feel. I suspected the cello could be a good fit because of its range and expressivity.
Coincidentally, the next week I met the cellist Iratxe Ibaibarriaga at a game conference in Barcelona, where I’m based, and she immediately became the ideal person for it. She’s done a wonderful job bringing a ton of expressivity to the playing, and it’s been a delight to work with her.
I got very excited when Neal told me he was making an educational game—I come from a family of school teachers. I’ve been scoring games for over 10 years, but this is the first educational game I’ve scored.
In a way, now the circle feels complete!
(if anyone wants to reach out, feel free to do so! You can find me and all my stuff here: https://www.aleixramon.com/ )
There’s a fundamental trade-off between performance and privacy for onion routing. Much of the slowness you’re experiencing is likely network latency, and no software optimization will improve that.
The N900 was my peak “mobile computing is awesome” device.
I went to see District 9 in the cinema in Helsinki. Uh oh, the alien parts are only subtitled in Finnish and Swedish and my Finnish is not up to that.
I installed a BitTorrent client, found the release on Pirate Bay, successfully torrented just the subtitle file, and used an editor to read the subtitles for scenes with a lot of alien.
The N9 had a much better UI, but there was something of the cyberpunk “deck” idea in that thing; it was great.
When Karl was preparing to cross the ice from Alaska to Russia, I worked with him a bit on a kite-flown camera system to help him get a bird’s-eye view of the floes to chart his course. I engineered a ruggedized wireless camera in an aluminum housing; I don’t remember much about it other than I was doubtful that the resolution would be able to give him the data he needed on a small low-resolution screen. (This was before consumer drones were common or affordable.) We built some devices; I’m not sure if he ever used them or if they helped. I urged him to do a lot of testing to make sure they would be worth the weight.
We spent a lot of time at a college coffee house in Fairbanks, Alaska, working over the ideas and overall design.
Nice fellow, strange aspirations, indomitable spirit. I’m glad to see his trek is nearing completion, and I wish him well on his further adventures. Good luck and Godspeed, Karl.
Autonomy subscriptions are how things are going to go; I called this a long time ago. It makes too much sense in terms of continuous development and operations/support not to have a subscription -- and subscriptions will likely double as insurance at some point in the future (once the car is driving itself 100% of the time, liability is always with the self-driving stack anyway).
Of course, people won't like this, and I'm not exactly enthused either, but the alternative would be a corporation constantly providing -- for free -- updates and even support if your car gets into an accident or gets stuck. That doesn't really make sense from a business perspective.
It would be impossible to do without taking breaks, as explained in the article:
> Due to visa limits, Bushby has had to break up his walk. In Europe, he can stay for only 90 days before leaving for 90, so he flies to Mexico to rest and then returns to resume the route.
Given that he literally swam across the Caspian Sea in order to avoid Russia and Iran because of legal issues, never mind being imprisoned in Russia due to what sounded like bureaucratic BS, it's more impressive than I first thought.
AI has driven the corporate suites of these companies insane.
> As part of the agreement, Disney will make a $1 billion equity investment in OpenAI, and receive warrants to purchase additional equity.
I don't know what kind of hypnosis tricks Sam Altman pulls on these people but the fact that Disney is giving money to OpenAI as part of a deal to give over the rights to its characters is absolutely baffling.
OpenAI and ChatGPT have been pioneering but they're absolutely going to be commoditized. IMO there is at least a 50:50 chance OpenAI equity is going to be next to worthless in the future. That Disney would give over so much value and so much cash for it... insane.
It's also worth stating that the worst part of that proposed amendment [1] isn't even necessarily the VPN ban; it's the next clause, on page 20:
"The “CSAM requirement” is that any relevant device supplied for use in the UK
must have installed tamper-proof system software which is highly effective at
preventing the recording, transmitting (by any means, including livestreaming)
and viewing of CSAM using that device."
"Regulations under subsection (1) must enable the Secretary of State, by further
regulations, to expand the definition of ‘relevant devices’ to include other
categories of device which may be used to record, transmit or view CSAM"
> I couldn't for the life of me tell you what dd stands for.
Data(set) Definition. But that name does not make any sense whatsoever by itself in this context, neither for the tool (it hardly "defines" anything), nor for UNIX in general (there are no "datasets" in UNIX).
Instead, it's specifically a reference to the DD statement in JCL, the job control language of many of IBM's mainframe operating systems of yore (let's not get into the specifics of which ones, because that's a whole other can of complexity).
And even then the relation between the DD statement and the dd command in UNIX is rather tenuous. To simplify a lot, DD in JCL does something akin to "opening a file", or rather "describing to the system a file that will later be opened". The UNIX tool dd, on the other hand, was designed to be useful for exchanging files/datasets with mainframes. Of course, that's not at all what it is used for today, and possibly it wasn't even back then.
This also explains dd's weird syntax, which consists of specifying "key=value" or "key=flag1,flag2,..." parameters. That is entirely alien to UNIX, but is how the DD and other JCL (again, of the right kind) statements work.
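For illustration, here's roughly what that looks like in practice: a typical dd invocation (the operands are standard; the device and file names are made up) next to a sketch of a JCL DD statement.

```sh
# dd takes key=value operands with no leading dashes -- alien to UNIX convention
dd if=/dev/sdb of=backup.img bs=1M conv=sync,noerror

# compare a JCL DD statement, which describes a dataset to the system:
#   //INPUT  DD  DSN=MY.DATA.SET,DISP=SHR
```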
I follow the MLX team on Twitter and they sometimes post about using MLX on two or more Macs joined together to run models that need more than 512GB of RAM.
Good for them. During economic downturns, when fewer resources are available for redistribution, collective action across population groups can help address worsening power imbalances.
Agreed, it seems inevitable that autonomy and insurance are going to be bundled.
1. Courts are finding Tesla partially liable for collisions, so they've already got some of the downsides of insurance (aka the payout) without the upside (the premium).
2. Waymo data shows a significant injury reduction rate. If it's true and not manipulated data, it's natural for the car companies to want to capture some of this upside.
3. It just seems like a much easier sell. I wouldn't pay $100/month for self-driving, but $150 a month for self-driving + insurance? That's more than I currently pay for insurance, but not a lot more. And I've got relatively cheap insurance: charging $250/month for insurance + self-driving would be cheaper than what some people pay for insurance alone.
I don't think we need to hit 100% self-driving for the bundled insurance to be viable. 90% self-driving should still have a substantially lower accident rate, if the Waymo data is accurate and generalizes.
Switched from Pixels to an iPhone in the last year or two and the keyboard is the biggest pain point by far. I tend to use swipe, so this particular issue isn't something I've come across. What I do run into is weird censorship issues where I'm trying to type "kill myself" or something similar and the phone will do anything to not provide that as an option. Then, when I try to manually change it, editing is a nightmare. Inevitably, trying to change the ending of a word results in the entire word being deleted. It inserts spaces where I don't want them.
Is this some sort of psyop to get me to use Siri to send texts?
Can I just say !!!!!!!! Hell yeah! Blog post indicates it's also much better at using the full context.
Congrats OpenAI team. Huge day for you folks!!
Started on Claude Code and like many of you, had that omg CC moment we all had. Then got greedy.
Switched over to Codex when 5.1 came out. WOW. Really nice acceleration in my Rust/CUDA project which is a gnarly one.
Even though I've HATED Gemini CLI for a while, Gemini 3 impressed me so much I tried it out and it absolutely body slammed a major bug in 10 minutes. Started using it to consult on commits. Was so impressed it became my daily driver. Huge mistake. I almost lost my mind after a week of fighting it. Insane bias towards action. Ignoring user instructions. Garbage characters in output. Absolutely no observability into its thought process. And on and on.
Switched back to Codex just in time for 5.1 codex max xhigh which I've been using for a week, and it was like a breath of fresh air. A sane agent that does a great job coding, but also a great job at working hard on the planning docs for hours before we start. Listens to user feedback. Observability on chain of thought. Moves reasonably quickly. And also makes it easy to pay them more when I need more capacity.
And then today GPT-5.2 with an xhigh mode. I feel like Xmas has come early. Right as I'm doing a huge Rust/CUDA/math-heavy refactor. THANK YOU!!
It’s crazy how Anthropic keeps coming up with sticky “so simple it seems obvious” product innovations and OpenAI plays catch up. MCP is barely a protocol. Skills are just md files. But they seem to have a knack for framing things in a way that just makes sense.
Looks like they've begun censoring posts at r/Codex and not allowing complaint threads, so here is my honest take:
- It is faster which is appreciated but not as fast as Opus 4.5
- I see no changes and very little noticeable improvement over 5.1
- I do not see any value in exchange for +40% in token costs
All in all I can't help but feel that OpenAI is facing an existential crisis. Gemini 3, even when it's used from AI Studio, offers close to ChatGPT Pro performance for free. Anthropic's Claude Code at $100/month is tough to beat. I am using Codex with the $40 credits, but there's been a silent increase in token costs and usage limitations.
For a bit more context, those posts are using pipeline parallelism. For N machines, you put the first L/N layers on machine 1, the next L/N layers on machine 2, and so on. With pipeline parallelism you don't get a speedup over one machine - it just buys you the ability to run larger models than you can fit on a single machine.
The release in Tahoe 26.2 will enable us to do fast tensor parallelism in MLX. Each layer of the model is sharded across all machines. With this type of parallelism you can get close to an N-times speedup on N machines. The main challenge is latency, since you have to communicate much more frequently.
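To make the difference concrete, here's a toy numpy sketch (not the actual MLX API; the layer count, machine count, and sizes are invented) of how the two schemes partition the same stack of layers:

```python
import numpy as np

L, N, d = 8, 4, 16                        # layers, "machines", hidden size (toy values)
rng = np.random.default_rng(0)
layers = [rng.standard_normal((d, d)) for _ in range(L)]
x = rng.standard_normal(d)

def pipeline_forward(x):
    """Pipeline parallelism: machine k owns layers [k*L/N, (k+1)*L/N).
    Activations visit the machines one after another, so a single request is
    no faster than on one machine; you only gain memory capacity."""
    per = L // N
    for k in range(N):                                  # "machine k"
        for W in layers[k * per:(k + 1) * per]:
            x = np.tanh(W @ x)
    return x

def tensor_forward(x):
    """Tensor parallelism: every layer's weight is sharded (here row-wise)
    across all N machines. Each machine computes its slice in parallel and the
    slices are gathered after every layer, so per-layer compute drops roughly
    N-fold but you pay a communication round per layer -- hence the latency concern."""
    for W in layers:
        shards = np.array_split(W, N, axis=0)           # one shard per machine
        partials = [Wk @ x for Wk in shards]            # computed in parallel
        x = np.tanh(np.concatenate(partials))           # all-gather, then nonlinearity
    return x

# Both schemes compute the same function; they only differ in where the work happens.
assert np.allclose(pipeline_forward(x), tensor_forward(x))
```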
I remember when the point of an SPA was to not have all these elaborate conversations with the server. Just "here's the whole app, now only ask me for raw data."
Is it me or does this seem like naked corruption at its worst? These tech CEOs hang out at the White House and donate to superfluous causes and suddenly the executive is protecting their interests. This does nothing to protect working US citizens from AI alien (agents) coming to take their jobs and displace their incomes.
Gitea has a built-in defense against this, `REQUIRE_SIGNIN_VIEW=expensive`, that completely stopped AI traffic issues for me and cut my VPS's bandwidth usage by 95%.
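For anyone hunting for it, that setting goes in Gitea's app.ini; if I remember right it lives under the [service] section (worth double-checking the docs for your version):

```ini
; app.ini -- require sign-in only for the expensive endpoints (raw files, archives, and similar)
[service]
REQUIRE_SIGNIN_VIEW = expensive
```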
The Wyden–Daines Amendment in 2020 was a huge privacy amendment that would’ve limited surveillance, and it failed in the Senate by literally one vote. It would’ve stopped the government from getting Americans’ web browsing and search history without a warrant. And honestly, I still have zero respect for anyone who voted against it. If you need a warrant to walk into my house, you should need a warrant to walk into my digital life too.
What Snowden exposed more than 10 years ago, none of that was addressed; if anything, the surveillance machine just got worse.
I found out when Actions started failing again for the Nth time this month.
The internal conversation about moving away from Actions or possibly GitHub has been triggered. I didn't like Zig's post about leaving GitHub because it felt immature, but they weren't wrong. It's decaying.
That's why you treat it like a junior dev. You do the fun stuff of supervising the product, overseeing design and implementation, breaking up the work, and reviewing the outputs. It does the boring stuff of actually writing the code.
I am phenomenally productive this way, I am happier at my job, and its quality of work is extremely high as long as I occasionally have it stop and self-review its progress against the style principles articulated in its AGENTS.md file (as it tends to forget a lot of rules, like DRY).
The high-reasoning version of GPT-5.2 improves on GPT-5.1: 69.9 → 77.9.
The medium-reasoning version also improves: 62.7 → 72.1.
The no-reasoning version also improves: 22.1 → 27.5.
Gemini 3 Pro and Grok 4.1 Fast Reasoning still score higher.