Tags: min

657

sparkline

Tuesday, April 8th, 2025

‘An Overwhelmingly Negative And Demoralizing Force’: What It’s Like Working For A Company That’s Forcing AI On Its Developers - Aftermath

Grim reading from the games industry, especially if you work at Shopify where the CEbrO has just mandated that you have to use this shite.

Monday, April 7th, 2025

Denial

The Wikimedia Foundation, stewards of the finest projects on the web, have written about the hammering their servers are taking from the scraping bots that feed large language models.

Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs.

Drew DeVault puts it more bluntly, saying Please stop externalizing your costs directly into my face:

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

And no, a robots.txt file doesn’t help.

If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned.

Free and open source projects are particularly vulnerable. FOSS infrastructure is under attack by AI companies:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

You try to do the right thing by making knowledge and tools freely available. This is how you get repaid. AI bots are destroying Open Access:

There’s a war going on on the Internet. AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

My own experience with The Session bears this out.

Ars Technica has a piece on this: Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries .

So does MIT Technology Review: AI crawler wars threaten to make the web more closed for everyone.

When we talk about the unfair practices and harm done by training large language models, we usually talk about it in the past tense: how they were trained on other people’s creative work without permission. But this is an ongoing problem that’s just getting worse.

The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.

If you’re using the products powered by these attacks, you’re part of the problem. Don’t pretend it’s cute to ask ChatGPT for something. Don’t pretend it’s somehow being technologically open-minded to continuously search for nails to hit with the latest “AI” hammers.

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.

Friday, March 28th, 2025

Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries - Ars Technica

As it currently stands, both the rapid growth of AI-generated content overwhelming online spaces and aggressive web-crawling practices by AI firms threaten the sustainability of essential online resources. The current approach taken by some large AI companies—extracting vast amounts of data from open-source projects without clear consent or compensation—risks severely damaging the very digital ecosystem on which these AI models depend.

Wednesday, March 26th, 2025

Go To Hellman: AI bots are destroying Open Access

AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

Friday, March 21st, 2025

FOSS infrastructure is under attack by AI companies

More on how large language bots are DDOSing the web:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

Thursday, March 20th, 2025

Please stop externalizing your costs directly into my face

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

This matches my experience with The Session. In fact, while I had this article open in a tab, I had to go deal with a tsunami of large language model bots. It’s really fucking depressing.

Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones, just stop. If blasting CO2 into the air and ruining all of our freshwater and traumatizing cheap laborers and making every sysadmin you know miserable and ripping off code and books and art at scale and ruining our fucking democracy isn’t enough for you to leave this shit alone, what is?

Sunday, March 16th, 2025

Build It Yourself | Armin Ronacher’s Thoughts and Writings

We’re at a point in the most ecosystems where pulling in libraries is not just the default action, it’s seen positively: “Look how modular and composable my code is!” Actually, it might just be a symptom of never wanting to type out more than a few lines.

It always amazes me when people don’t view dependencies as liabilities. To me it feels like the coding equivalent of going to a loan shark. You are asking for technical debt.

There are entire companies who are making a living of supplying you with the tools needed to deal with your dependency mess. In the name of security, we’re pushed to having dependencies and keeping them up to date, despite most of those dependencies being the primary source of security problems.

But there is a simpler path. You write code yourself. Sure, it’s more work up front, but once it’s written, it’s done.

Friday, March 7th, 2025

Plane GPS systems are under sustained attack - is the solution a new atomic clock? - BBC News

A fascinating look at the modern equivalent of the Longitude problem.

Tuesday, March 4th, 2025

Cold Album Drumming - full-album drum covers by Brad Frost

This is a great new musical project from Brad:

Brad Frost plays drums to the albums he knows intimately, but has never drummed to before. Cover to cover. No warm-up. No prep. Totally cold. What could possibly go wrong?

I really enjoyed watching all of The Crane Wife and In Rainbows.

Sunday, March 2nd, 2025

Hallucinations in code are the least dangerous form of LLM mistakes

The moment you run LLM generated code, any hallucinated methods will be instantly obvious: you’ll get an error. You can fix that yourself or you can feed the error back into the LLM and watch it correct itself.

Compare this to hallucinations in regular prose, where you need a critical eye, strong intuitions and well developed fact checking skills to avoid sharing information that’s incorrect and directly harmful to your reputation.

With code you get a powerful form of fact checking for free. Run the code, see if it works.

Monday, February 17th, 2025

trot

Working on this project is great but ten minutes into it and I already miss the resilience of the web. I miss how you have to really fuck things up to make a browser yell at you or implode.

Friday, February 14th, 2025

AI is Stifling Tech Adoption | Vale.Rocks

Want to use all those great features that have been in landing in browsers over the past year or two? View transitions! Scroll-driven animations! So much more!

Well, your coding co-pilot is not going to going to be of any help.

Large language models, especially those on the scale of many of the most accessible, popular hosted options, take humongous datasets and long periods to train. By the time everything has been scraped and a dataset has been built, the set is on some level already obsolete. Then, before a model can reach the hands of consumers, time must be taken to train and evaluate it, and then even more to finally deploy it.

Once it has finally released, it usually remains stagnant in terms of having its knowledge updated. This creates an AI knowledge gap. A period between the present and AI’s training cutoff. This gap creates a time between when a new technology emerges and when AI systems can effectively support user needs regarding its adoption, meaning that models will not be able to service users requesting assistance with new technologies, thus disincentivising their use.

So we get this instead:

I’ve anecdotally noticed that many AI tools have a ‘preference’ for React and Tailwind when asked to tackle a web-based task, or even to create any app involving an interface at all.

Tuesday, January 21st, 2025

Software Folklore ― Andreas Zwinkau

Detective stories and tales of bughunting in software and hardware.

Sometimes bugs have symptoms beyond belief. This is a collection of such stories from around the web.

What I’ve learned about writing AI apps so far | Seldo.com

LLMs are good at transforming text into less text

Laurie is really onto something with this:

This is the biggest and most fundamental thing about LLMs, and a great rule of thumb for what’s going to be an effective LLM application. Is what you’re doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it’s probably going to be great at it. If you’re asking it to convert into a roughly equal amount of text it will be so-so. If you’re asking it to create more text than you gave it, forget about it.

Depending how much of the hype around AI you’ve taken on board, the idea that they “take text and turn it into less text” might seem gigantic back-pedal away from previous claims of what AI can do. But taking text and turning it into less text is still an enormous field of endeavour, and a huge market. It’s still very exciting, all the more exciting because it’s got clear boundaries and isn’t hype-driven over-reaching, or dependent on LLMs overnight becoming way better than they currently are.

Tuesday, January 7th, 2025

HTML Is Actually a Programming Language. Fight Me | WIRED

When haters deny HTML’s status as a programming language, they’re showing they don’t understand what a language really is. Language is not instructing an interlocutor what to do in a way that leaves no room for other interpretations; it is better and richer than that. Like human language, HTML is conversational. It is remarkably adept at adapting to context. It can take a different shape on any machine, from a desktop browser or an e-reader screen to a mobile app or a screen reader for the blind (so long as that device is built to present hypertext).

Hell, yeah!

Ultimately, even as HTML has become the province of professionals, it cannot be gatekept. This is what makes so many programmers so anxious about the web, and sometimes pathetically desperate to maintain the all-too-real walls they’ve erected between software engineers and web developers.

Hell, yeeeeaaaaahhh!!!

What other programmers might say dismissively is something HTML lovers embrace: Anyone can do it. Whether we’re using complex frameworks or very simple tools, HTML’s promise is that we can build, make, code, and do anything we want.

Monday, December 30th, 2024

CSS wants to be a system - daverupert.com

CSS wants you to build a system with it. It wants styles to build up, not flatten down.

Truth!

Wednesday, December 4th, 2024

Tabloid: the clickbait headline programming language

Tabloid is a turing-complete programming language for writing programs in the style of clickbait news headlines.

Thursday, November 21st, 2024

I don’t have time to learn React - Keith Cirkel

React is a non-transferable skill.

React proponents might claim that React will teach you modern UI, but from what I’ve seen it barely copes with modern UI. autofocus is broken, custom elements don’t work in all but the experimental version, using any “modern” features like dialog or popovers requires useEffect, and the synthetic event system teaches you so little about how DOM actually works. This isn’t modern UI, it’s UI from 2013 at its inception. I don’t have the time left in my career to pick up UI paradigms that haven’t evolved much beyond from when Barack Obama was in office.

When I mentor early career developers and they ask me what they should learn, I can’t say React, they don’t have time. I mean sure, pick up enough React to land you the inevitable job doing it, but it’s not going to level up your career.

Tuesday, November 12th, 2024

The meaning of “AI”

There are different kinds of buzzwords.

Some buzzwords are useful. They take a concept that would otherwise require a sentence of explanation and package it up into a single word or phrase. Back in the day, “ajax” was a pretty good buzzword.

Some buzzwords are worse than useless. This is when a word or phrase lacks definition. You could say this buzzword in a meeting with five people, and they’d all understand five different meanings. Back in the day, “web 2.0” was a classic example of a bad buzzword—for some people it meant a business model; for others it meant rounded corners and gradients.

The worst kind of buzzwords are the ones that actively set out to obfuscate any actual meaning. “The cloud” is a classic example. It sounds cooler than saying “a server in Virginia”, but it also sounds like the exact opposite of what it actually is. Great for marketing. Terrible for understanding.

“AI” is definitely not a good buzzword. But I can’t quite decide if it’s merely a bad buzzword like “web 2.0” or a truly terrible buzzword like “the cloud”.

The biggest problem with the phrase “AI” is that there’s a name collision.

For years, the term “AI” has been used in science-fiction. HAL 9000. Skynet. Examples of artificial general intelligence.

Now the term “AI” is also used to describe large language models. But there is no connection between this use of the term “AI” and the science fictional usage.

This leads to the ludicrous situation of otherwise-rational people wanted to discuss the dangers of “AI”, but instead of talking about the rampant exploitation and energy usage endemic to current large language models, they want to spend the time talking about the sci-fi scenarios of runaway “AI”.

To understand how ridiculous this is, I’d like you to imagine if we had started using a different buzzword in another setting…

Suppose that when ride-sharing companies like Uber and Lyft were starting out, they had decided to label their services as Time Travel. From a marketing point of view, it even makes sense—they get you from point A to point B lickety-split.

Now imagine if otherwise-sensible people began to sound the alarm about the potential harms of Time Travel. Given the explosive growth we’ve seen in this sector, sooner or later they’ll be able to get you to point B before you’ve even left point A. There could be terrible consequences from that—we’ve all seen the sci-fi scenarios where this happens.

Meanwhile the actual present-day harms of ride-sharing services around worker exploitation would be relegated to the sidelines. Clearly that isn’t as important as the existential threat posed by Time Travel.

It sounds ludicrous, right? It defies common sense. Just because a vehicle can get you somewhere fast today doesn’t mean it’s inevitably going to be able to break the laws of physics any day now, simply because it’s called Time Travel.

And yet that is exactly the nonsense we’re being fed about large language models. We call them “AI”, we look at how much they can do today, and we draw a straight line to what we know of “AI” in our science fiction.

This ridiculous situation could’ve been avoided if we had settled on a more accurate buzzword like “applied statistics” instead of “AI”.

It’s almost as if the labelling of the current technologies was more about marketing than accuracy.

Sunday, November 3rd, 2024

New CSS that can actually be used in 2024 | Thomasorus

Logical properties, container queries, :has, :is, :where, min(), max(), clamp(), nesting, cascade layers, subgrid, and more.