I've written a few times before about trying to smooth out the rougher edges of the process of blogging with some custom tooling: both the site generator I'm using these days and the simple editor tool I hacked together to suit my writing process. I realised recently that one rough edge remaining in the process was the need to manually commit my writing to source control, so I wondered what it might take to wire that into my editing tool...
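For flavour, the simplest version of that wiring might just shell out to git after a save. This is a rough sketch with placeholder paths and messages, not the tool's actual code:

```csharp
using System.Diagnostics;

// Stage everything and commit it - a crude but workable "auto-save to git".
static void CommitDraft(string repoPath, string message)
{
    foreach (var args in new[] { "add --all", $"commit -m \"{message}\"" })
    {
        using var git = Process.Start(new ProcessStartInfo("git", args)
        {
            WorkingDirectory = repoPath,
            UseShellExecute = false
        })!;
        git.WaitForExit();
    }
}
```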
A while back I wrote about the transition from T4 templates to Roslyn Source Generators for generating code in .Net Core solutions. While that worked for me, and I was able to get it to do what I needed, I was never really happy with having all the output source embedded as literal strings in the generator code. Recently I had another potential use for generated code, so I decided to try and fix this issue...
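For context, the pattern I was unhappy with looks roughly like this: a minimal source generator sketch where the entire output file lives in one literal string inside the generator.

```csharp
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public class HelloGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context) { }

    public void Execute(GeneratorExecutionContext context)
    {
        // All of the generated output sits here as one big literal string -
        // no syntax highlighting, no compiler checks, awkward to maintain.
        const string source = @"namespace Generated
{
    public static class Hello
    {
        public static string Greeting => ""Hello from the generator"";
    }
}";
        context.AddSource("Hello.g.cs", SourceText.From(source, Encoding.UTF8));
    }
}
```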
Just because stuff is "old" doesn't mean it's not interesting... I found myself having a discussion with a colleague recently about the state management patterns that Sitecore uses for things like SecurityDisabler and how they work in the ASP.Net pipeline. It's not new tech, but it is an interesting pattern which you might find uses for outside your XP implementations...
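As a hedged illustration of the general shape (not Sitecore's actual implementation), the pattern is a disposable "switcher": construction records the current state and applies an override, and Dispose() restores the previous state, so a using block neatly scopes the change.

```csharp
// Illustration only - Sitecore's real SecurityDisabler is more sophisticated,
// but the disposable state-switcher shape is roughly this:
public class SecuritySwitcher : IDisposable
{
    [ThreadStatic] private static bool _disabled;
    private readonly bool _previous;

    public static bool SecurityDisabled => _disabled;

    public SecuritySwitcher()
    {
        _previous = _disabled; // remember the outer state
        _disabled = true;      // apply the override
    }

    public void Dispose() => _disabled = _previous; // restore on scope exit
}

// Usage - security checks are bypassed only inside this block:
// using (new SecuritySwitcher())
// {
//     // privileged work here
// }
```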
There are certain "rules of programming" that keep cropping up throughout my career. One that came up in an interesting work debate recently was "you should never use regular expressions to parse HTML". Don't get me wrong - there can be a lot of useful knowledge wrapped up in these rules, but should we always follow them to the letter? I think it's an interesting question...
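To make the debate concrete: a regex can be perfectly serviceable against rigid, machine-generated markup you control, even though it would be a terrible general-purpose HTML parser. A hypothetical example:

```csharp
using System.Text.RegularExpressions;

// Fine for a one-off against predictable markup; fragile as a general parser.
var html = "<meta name=\"generator\" content=\"Statiq Web\" />";
var match = Regex.Match(html, "<meta name=\"generator\" content=\"([^\"]*)\"");
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Value); // Statiq Web
}
```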
The second idea on my "little things I'd meant to add to this blog for a while" list was reading time estimates. Like the reading progress indicator from before, this shouldn't be tricky, and in this case I wanted to write it down in case anyone else working with Statiq was interested in achieving something similar on their site.
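The core calculation is just a word count divided by an assumed reading speed. A minimal sketch (the 230 words-per-minute figure is my assumption, not a Statiq default):

```csharp
// Words divided by an assumed reading speed, rounded up, minimum one minute.
static int EstimateReadingMinutes(string content, int wordsPerMinute = 230)
{
    var words = content.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries).Length;
    return Math.Max(1, (int)Math.Ceiling(words / (double)wordsPerMinute));
}
```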
Time for the final part of my series on controlling a web browser. With code to load a browser, and the overarching State Machine to control it, this part finishes off with the code for some states to load a page and extract its markup. Plus a few conclusions...
Continuing from my previous post about firing up a browser in order to automate it, this post moves on to the overall pattern for how the browser can be controlled.
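A minimal sketch of the overall shape (my illustration, not the series' exact code): each state does one piece of work and hands back the next state, and a simple loop drives the machine until a state returns nothing.

```csharp
// Each state performs one step and returns the next state, or null to stop.
public interface IBrowserState
{
    Task<IBrowserState?> RunAsync();
}

public static class StateMachine
{
    public static async Task RunAsync(IBrowserState start)
    {
        var current = (IBrowserState?)start;
        while (current is not null)
        {
            current = await current.RunAsync();
        }
    }
}
```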
I bumped into an issue recently where I needed to write some code to scrape a bit of HTML. The usual .Net approach of using an HttpClient didn't work here - the web site in question made use of some client-side JavaScript to generate mark-up at runtime. So I needed a different approach to fetch the resulting HTML. A while back I'd written some code to grab images of rendered HTML using the Chromium DevTools APIs, and I figured I could play a similar game here...
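The posts drive the browser through the DevTools protocol, but as a quick illustration of the underlying idea, headless Chrome can print the DOM after JavaScript has run. The executable path here is an assumption for a typical Windows install:

```csharp
using System.Diagnostics;

// --dump-dom writes the post-JavaScript DOM to stdout.
var psi = new ProcessStartInfo
{
    FileName = @"C:\Program Files\Google\Chrome\Application\chrome.exe",
    Arguments = "--headless --disable-gpu --dump-dom https://example.com/",
    RedirectStandardOutput = true,
    UseShellExecute = false
};
using var chrome = Process.Start(psi)!;
var renderedHtml = await chrome.StandardOutput.ReadToEndAsync();
chrome.WaitForExit();
```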
I noticed the other week that the sitemap file my blog was generating included the URLs, but none of the other metadata that the format can record. To be honest, I'm not sure if search engines pay much attention to this these days, but since the schema for the files includes other options I decided to see if I could add them.
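For reference, the sitemaps.org schema allows each entry to carry lastmod, changefreq and priority alongside the URL. Building one entry with LINQ to XML might look like this (the values are placeholders):

```csharp
using System.Xml.Linq;

XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
var entry = new XElement(ns + "url",
    new XElement(ns + "loc", "https://example.com/posts/my-post/"),
    new XElement(ns + "lastmod", "2024-01-15"),   // when the page last changed
    new XElement(ns + "changefreq", "monthly"),   // a hint at update frequency
    new XElement(ns + "priority", "0.5"));        // relative importance, 0.0-1.0
```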
I was looking at writing a tool in .Net 7 the other day which would benefit from having an option to load and unload plugin extensions. Reloadable plugins were a bit tricky in .Net 4, but doable. That's changed dramatically in more recent framework versions, in ways that are both better and more interesting.
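The mechanism behind that change is the collectible AssemblyLoadContext. A minimal sketch (the plugin path is hypothetical):

```csharp
using System.Reflection;
using System.Runtime.Loader;

// A collectible context can be unloaded once nothing references its assemblies.
var context = new AssemblyLoadContext("PluginContext", isCollectible: true);
Assembly plugin = context.LoadFromAssemblyPath(@"C:\plugins\MyPlugin.dll");

// ... resolve a shared interface from the assembly and use the plugin ...

context.Unload(); // assemblies become collectible once all references are gone
```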