During a period of uncharacteristically active Windows development, I created a small set of utilities. Most were experiments in particular techniques, and where I could imagine some use for a utility outside my own world, I have made it available here.
At that time, web applications were few and far between, so these utilities have to be installed on your Windows machine. The links here lead to an installation procedure rather than to a direct download of the software. I have imagined web versions of nearly all these utilities, usable on any type of computer, but those notions remain Projects for the Future.
This curious utility analyzes a text file (intended to be the text of a book) and makes a list of all the unique words within it. It alphabetizes that list and notes how many times each word was used. It was developed against Project Gutenberg texts but can work on any text that has been converted to (or simply saved as) ASCII text.
It has proven useful to check for misspellings, oddly enough.
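For the curious, the idea can be sketched in a few lines of Python. This is only an illustration, not the utility's actual code; the function name `word_counts` and the rule for what counts as a word (a run of ASCII letters, with inner apostrophes allowed) are assumptions made for the example.

```python
import re
from collections import Counter


def word_counts(text):
    """Return (word, count) pairs for every unique word, in alphabetical order."""
    # Assumption: a "word" is a run of ASCII letters, optionally joined by
    # inner apostrophes (so "didn't" stays one word). Case is ignored.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return sorted(Counter(words).items())


sample = "The cat saw the dog. The dog didn't see the cat."
for word, count in word_counts(sample):
    print(f"{word}\t{count}")
```

In an alphabetized list like this, a misspelling tends to show up as a one-off entry sitting right beside the correct spelling, which is probably why the list works as a spelling check.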
Potentially the most useful of the utilities, SiteMapper looks at the web page to which you've pointed it, then develops a list of all internal links accessible from that page. In practice, this usually shows the contents of a web site, presuming that you've begun at the lead-in page of the site.
A nice feature is that it begins where you tell it to begin and ignores anything above that point.
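A minimal Python sketch of the same idea follows. It is not SiteMapper's code: it handles a single page whose HTML has already been fetched, and it assumes that "ignores anything above that point" means keeping only links at or below the starting page's directory on the same host. The names `LinkCollector` and `internal_links` are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return sorted, de-duplicated links on the same host, at or below page_url's directory."""
    collector = LinkCollector()
    collector.feed(html)
    start = urlparse(page_url)
    # Assumption: the "starting point" is the directory holding the start page.
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    found = set()
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(page_url, href).split("#", 1)[0]
        parts = urlparse(absolute)
        same_site = parts.scheme in ("http", "https") and parts.netloc == start.netloc
        if same_site and parts.path.startswith(base_dir):
            found.add(absolute)
    return sorted(found)


page = """
<a href="guide.html">Guide</a>
<a href="/docs/api/ref.html#top">Reference</a>
<a href="../about.html">About</a>
<a href="http://other.example.org/x.html">Elsewhere</a>
"""
for link in internal_links("http://example.com/docs/index.html", page):
    print(link)
```

Mapping a whole site would mean repeating this for each link found, while keeping track of the pages already visited.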
It will be hard for most people to understand the need for this utility, but if you've played with analyzing web server logs, you'll know why I wrote it. Web IPs are allocated in chunks at varying levels (there can be chunks within chunks within chunks). Many users make use of an IP address assigned more or less randomly from within their provider's chunk. Many chunks are dedicated to machines that index web servers. If a machine accesses your server only to index it, that machine cannot be counted as a user, which is very useful to know when you're trying to analyze how users visit your site.
My introduction to this problem occurred when I analyzed logs of the public web site of our organization, and reported to management that a few people had actually navigated the entire site. Nope. Turned out, all were indexing machines, and no one had navigated the entire site.
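To make the "chunks" idea concrete, here is a small Python sketch using the standard `ipaddress` module. It is not how the utility works internally. The indexer blocks listed are hypothetical, taken from the address ranges reserved for documentation, and the helper names `is_indexer` and `count_users` are invented for the example.

```python
import ipaddress

# Hypothetical indexer chunks (reserved documentation ranges, not real crawlers).
INDEXER_BLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_indexer(address):
    """True if the address falls inside any of the listed indexer chunks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in INDEXER_BLOCKS)


def count_users(addresses):
    """Count distinct addresses that are not inside an indexer chunk."""
    return len({a for a in addresses if not is_indexer(a)})


# A chunk within a chunk: a /26 that sits inside the first /24 above.
inner = ipaddress.ip_network("192.0.2.64/26")
print(inner.subnet_of(INDEXER_BLOCKS[0]))

visits = ["192.0.2.17", "203.0.113.5", "203.0.113.5", "198.51.100.200"]
print(count_users(visits))
```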
IP addresses (like 10.1.10.1) are entered into the logs as text. Sorting a list of such addresses by normal means compares them character by character, so 10.1.10.1 would land ahead of 9.1.10.1, which was driving me mad. Now, no more problem.
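The difference between the two orderings is easy to see in a short Python sketch. This illustrates the sorting problem only and is not the utility's own code; the function name `sort_ips` and the sample addresses are made up, and IPv4 dotted-quad input is assumed.

```python
import ipaddress


def sort_ips(addresses):
    """Sort dotted-quad IPv4 strings by numeric value rather than as text."""
    return sorted(addresses, key=ipaddress.IPv4Address)


logged = ["10.1.10.1", "9.200.3.4", "10.1.9.250", "192.168.0.1"]

# Plain text sort: compares character by character, so "10..." precedes "9...".
print(sorted(logged))

# Numeric sort: each address is compared as a number.
print(sort_ips(logged))
```

The first line printed starts with 10.1.10.1 and ends with 9.200.3.4; the second runs in true numeric order from 9.200.3.4 up to 192.168.0.1.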