So much of my experience of browsing the web is fighting sites over how I want them displayed that I'm beginning to think I don't want a web browser that renders web pages. Instead, I want a web browser that scrapes content from web pages and displays it according to my preferences.

Like, no one can make a web browser because the web is so complex, and we mostly use that complexity for:

  • Running complex client-side applications in a cross-platform virtual machine. Which is really cool, but I kinda hate that the UI and the service are so often tied together.
  • Bombarding me with distractions when I just want to read an article or watch a video or whatever.

And... maybe a simple client would be better than a complex VM and layout engine a lot of the time?

I've started working on this. The web has gotten sufficiently annoying, even with ad-blocking, that it feels worth it.

I reached a point where I can parse most of an HTML header and now I'm exhausted.

Still working on this. Got as far as the body tag and into the header of the page I'm working on. But, surprise, the HTML is malformed. They failed to close an element. So I had to add a rule: if the closing tag for an ancestor element shows up, the parser treats that as implicitly closing the current element too.

Browsers don't care if your HTML is well-formed or not, so this isn't surprising.
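
For anyone curious what that implicit-close rule can look like, here is a minimal sketch in Python. It is not the code from this project: it assumes the standard library's `html.parser` as the tokenizer, and the names `LenientTreeBuilder` and `parse` are made up for illustration. The idea is to keep a stack of open elements and, when a closing tag arrives, pop everything down to the nearest matching ancestor.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LenientTreeBuilder(HTMLParser):
    """Builds a nested dict tree and tolerates unclosed elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "children": []}
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Search from the innermost open element outward (never the root).
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                # Closing an ancestor implicitly closes everything nested in it.
                del self.stack[i:]
                return
        # No matching open element: a stray closing tag, so ignore it.

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1]["children"].append(text)


def parse(html):
    builder = LenientTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


# The <p> and <span> are never closed; </div> closes them implicitly.
tree = parse("<div><p>one<span>two</div><p>three</p>")
```

Real browsers follow the much more detailed error-recovery rules in the HTML standard, so treat this as an approximation of the one rule described above, not a full parser.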

@madewokherd something more deserving of the term "user agent"

@madewokherd Firefox has a feature that simplifies a page for better reading. But not sure how well that fits your use case. Maybe sites are also completely broken with that...

@madewokherd i wonder if a better architecture wouldn't be building this around headless chrome

@madewokherd the train of thought is, first it's just html, but someday you're probably going to find some website that needs to fetch the content with javascript, so maybe it does make sense to reuse all the browser crap but wrap it in a highly controlled layer. but that does make it a very different kind of project

@tbodt When that happens, I will write my own logic to fetch the content.
