Just because stuff is "old" doesn't mean it's not interesting... I found myself having a discussion with a colleague recently about the state management patterns that Sitecore uses for things like
SecurityDisabler
and how they work in the ASP.Net pipeline. It's not new tech, but it is an interesting pattern which you might find uses for outside your XP implementations...
There are certain "rules of programming" that I keep hearing about in my career. One that came up in an interesting work debate recently was "you should never use regular expressions to parse HTML". Don't get me wrong - there can be a lot of useful knowledge wrapped up in these rules, but should we always follow them to the letter? I think it's an interesting question...
The second idea on my "little things I'd meant to add to this blog for a while" list was reading time estimates. Like the reading progress indicator from before, this shouldn't be tricky, and in this case I wanted to write it down in case anyone else working with Statiq was interested in achieving something similar on their site.
Time for the final part of my series on controlling a web browser. With code to load a browser, and the overarching State Machine to control it, this part finishes off with the code for some states to load a page and extract its markup. Plus a few conclusions...
Continuing from my previous post about firing up a browser in order to automate it, this post moves on to the overall pattern for how the browser can be controlled.
I bumped into an issue recently where I needed to write some code to scrape a bit of HTML. The usual .Net approach of using an
HttpClient
didn't work here - the web site in question made use of some client-side JavaScript to generate mark-up at runtime. So I needed a different approach to fetch the resulting HTML. A while back I'd written some code to
grab images of rendered HTML using the Chromium DevTools APIs, and I figured I could play a similar game here...
I noticed the other week that the sitemap file my blog was generating included the urls, but none of the other metadata that they can report. To be honest, I'm not sure if search engines pay much attention to this these days, but since the schema for the files includes other options I decided to see if I could add them.
I was looking at writing a tool in .Net 7 the other day which would benefit from having an option to load and unload plugin extensions. Reloadable plugins could be a bit tricky in .Net 4, but doable. But that's changed dramatically in more recent framework versions, in some ways that are better and interesting.
My recent post about messing up with inheritance came out of some work to migrate some (fairly old) T4 Template code generation to .Net's newer Source Generators feature. Excluding my own mistakes, this process wasn't as easy as I'd hoped. So it seemed like a good topic to jot some notes down about, in case others are facing similar challenges...
The other day I had a scenario where I wanted to be able to display the date an app was built in its UI. While you can always fall back to a "just make that a string in your code" approach, after a bit of digging I discovered a better way. It turns out recent .Net code has some clever patterns to help with this...