LiveJournal recently launched the My Guests page which tells you who has been reading your journal.

I don't like it.  It states with excessive clarity that my journal is a pile of crap that nobody reads.  My Guests is supposed to show "the 100 most recent visitors this month" but I see only 18.  The service was just announced, yet somehow 12 of my 18 visitors have already figured out the secret incantation for hiding their names???  I realize that furries tend to be [ profile] furrtive but this is ridiculous!

[ profile] aethwolf, [ profile] atomicat, [ profile] carol_kitty: Welcome, furiends!  Thank you for visiting!

[ profile] jordan179: Thank you for commenting!

[ profile] sleepy16: I don't speak Russian well enough to talk to you and I don't really want to read a journal that consists mostly of kitteh photos.

[ profile] rantasarus: Hi there!  Is my journal interesting?  I have no idea why.

p.s.: Yes, I realize that pre-launch recording of visitors was "spotty" and things will probably improve next month.  But by then most furs will have opted out so it still won't do me any good.  I would like to see how many hits the RSS feed for my journal gets, but that's a paid-only feature and I don't want it enough to pay for it, so I can't have it.

Jakob Nielsen does some serious analysis on his logfiles!  I'm mainly familiar with Zipf curves as they apply to vocabulary usage: typically half the words used in a book are used only once, one quarter are used twice, etc.  But the dozen or so most-used words in a book (a, an, the) occur far more often than Zipf predicts, just as Google is five times more popular as a source of hits to Nielsen's pages than Zipf would predict.  So I think Zipf isn't exactly the right curve after all.
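
One way to check a claim like this is to line up observed word counts against a plain 1/rank Zipf curve. The following is a minimal sketch, not Nielsen's method: the function name `zipf_comparison` and the sample sentence are made up for illustration, and the curve is scaled so that its predicted counts add up to the total number of words.

```python
from collections import Counter

def zipf_comparison(words):
    """Return (rank, word, observed, predicted) rows for a plain 1/rank Zipf curve."""
    counts = Counter(words).most_common()
    total = sum(count for _, count in counts)
    # Scale the curve so the predicted counts sum to the observed total.
    harmonic = sum(1.0 / rank for rank in range(1, len(counts) + 1))
    scale = total / harmonic  # predicted count for the rank-1 word
    return [
        (rank, word, observed, scale / rank)
        for rank, (word, observed) in enumerate(counts, start=1)
    ]

# Hypothetical sample text, far too short to show a real Zipf curve.
sample = "the cat and the dog and the bird saw the fox".split()
for rank, word, observed, predicted in zipf_comparison(sample):
    print(f"{rank:2d} {word:5s} observed={observed} predicted={predicted:.2f}")
```

On a real book or logfile, rows where the observed count sits well above the predicted one are the places where the simple curve underestimates the top ranks.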

Wil Wheaton does it again!  You think his essay can't get any crazier, then he finds a new seam of ore in his Crazy Mine.

The Master Contract for my online work (for a company that I'll be calling "𝔾" on this blog).  The blue text is the other guy's changes.  I think this contract makes it clear why lawyers hate me!  Unfortunately, this work doesn't pay well enough to support my family and I still need to find a day job.

A severance deal with ℱ was eventually worked out at the end of May(!), so I currently have what amounts to a no-show job that continues to pay my salary until the end of July.  Meanwhile, there's more work I might do for ℱ in the fall.  There's an OEM customer of theirs whose products I've worked on for years.  Their contract requires ℱ to do a new product for them by the end of this year.  They recently found out that this year's product will have to be done using an obsolete hardware platform.  I'm one of the few engineers in the world who knows about that old stuff and is still interested in working for ℱ, and there's a mountain of old software (in Arabic, much of it stuff I wrote) that needs amending in only a few short months.  Hiring me to do this work is the only sensible move, but for ℱ… I'm thinking it's less than 50% likelihood that I'll get the job.  They'll probably end up telling the customer that the project just can't be done (ℱ hates being cornered).

That meme-filling form (that [ profile] giza suggested I write) is continuing its popularity.  20 hits came in during the last minute!  But only 577 in the last hour.  Meanwhile my new write your own meme has gotten *zero* hits—absolutely no one but me has ever clicked on the Submit button.  I guess I need to advertise it on one of those meme-spreading LJ communities.

My latest post at dK (Stock up on food+water: Armageddon in Senate is coming) did quite well: it was read over 1200 times by at least 900 different people.  17 of them elected to give me comments.  Amazingly, 18 of them used their dK membership privilege to "recommend" my post!  I think a post needs about 50 recommends to get prominent placement on the homepage.

So who felt like reading this post?  All I know is who bothered to comment, and most of them say very little about themselves.  It seems several of them are Christian Survivalists, which I should have expected given the post's apocalyptic title.  These are the kind of people who stocked up on food+water to prepare for Y2K, because they think the Book of Revelation is a political guide for our times.  When perchance I daydream about preaching on some religious topic or other, I usually imagine that my audience would consist of Orthodox Jews, but if it's the Survivalists who want to listen to me then perhaps I could just make a few adjustments...

One commenter clearly got bent out of shape by my use of the phrase "take your toys out of the sandbox and go home", which I stole from a comment by [ profile] galen_keman.

I'm a little concerned about the large number of commenters who have PoliticalCompass scores like (−8,−8), and flaunt their scores in their signature-lines.  I just cannot imagine being so polarized on those issues.  My score is a measly (+0.75,−4.00), which I think is supposed to put me in the Clinton/Blair camp.  For comparison, the World's Smallest Political Quiz used to call me a "Left Liberal" but now it says I'm a "Libertarian".  That's odd; I almost never agree with anything Eric Raymond says!  But I usually agree with Markos Moulitsas Zúniga ("Corruption is *not* a partisan issue!") and the only Canadian politibloggers I follow are NDP-ers.

No news yet from Immigration Canada, but it's only January and I just have to be patient.  Unlike what I imply (and some commenters state) in the dK post, the United States is not *really* a Fascist dictatorship yet.  In a real dictatorship, anyone who said the sort of things I say in that post would get dragged out of bed in the middle of the night and never be heard from again.  Oh wait, excuse me for a moment while I go answer the doorbell...

Logfile analysis — summary

Which of these charts is better?

Hits per hour (initial post = 1:44pm EST, last hit = 12:37pm next day)
[Two charts appeared here: a text chart drawn with Unicode block characters and an image version.  Both plot hits per hour on a vertical axis marked 100, 10, 1 against hours +00 through +23 after the initial post.]

I sort of like the first one because of its low bandwidth and use of arcane Unicode characters.  But I suppose the second one is prettier.
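
A text chart of this kind can be produced with very little code. This is a minimal sketch and not the program that drew the original: the function name `log_bar_chart`, the choice of `log10(hits + 1)` as the bar height, and the sample counts are all assumptions made for illustration.

```python
import math

BLOCKS = " ▁▂▃▄▅▆▇█"  # index 0 is an empty cell, index 8 a full cell

def log_bar_chart(hits_per_hour, decades=3):
    """Render counts as a text bar chart with one row per power of ten."""
    # Bar height in rows: 0 hits -> 0, 9 hits -> 1 row, 99 -> 2 rows, 999 -> 3 rows.
    heights = [math.log10(hits + 1) for hits in hits_per_hour]
    lines = []
    for decade in range(decades - 1, -1, -1):
        cells = []
        for height in heights:
            # Portion of this row covered by the bar, clamped to 0..1.
            fill = min(max(height - decade, 0.0), 1.0)
            cells.append(BLOCKS[round(fill * 8)] * 2)
        lines.append(f"{10 ** decade:>4} " + " ".join(cells))
    labels = " ".join(f"{hour:02d}" for hour in range(len(hits_per_hour)))
    lines.append("     " + labels)
    return "\n".join(lines)

# Hypothetical counts, not the real logfile data.
print(log_bar_chart([120, 95, 40, 12, 7, 3, 1, 0]))
```

Each column is two characters wide so that it lines up with a two-digit hour label underneath.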

Logfile analysis — details

To save time and money, I've reprogrammed my visitor-sniffer not to look up the city-names for new IP addresses associated with dK readers; instead they are just labelled "Kossack".  This causes the repeat visitors (with real cities instead of "Kossack") to stand out.  Unfortunately, many of these repeats are false hits:
  • Several hits came from addresses identified as the troll [ profile] zi_mugudarina, but probably none of them is really him: I had to assign him a large block of 65536 addresses to cover the variety of proxy-servers that he uses to hide his true identity.
  • One hit was labelled as coming from [ profile] aethwolf, but actually it came from a block of 4096 addresses for an ISP that Aeth is no longer using.  I've been putting off the addition of expiration-dates to my IP records, which would eliminate this kind of false hit.
  • I have one friend who works at NASA, so any hit from NASA is usually assumed to be him.  I got a hit from NASA on my dK post, but it's quite likely to be from one of the multitude of other weekend NASA workers who read dK.  My visitor-sniffing algorithm assumes that I have a small number of friends, widely spaced around the world.  It doesn't work so well for a million-member website like dK.
  • One hit got assigned to the image leech category, which is clearly erroneous.  I don't get leeches anymore; the category is obsolete now that I've figured out how to keep my posts out of LJ's "latest images feed".  This dK reader happened to be using one of a block of 4096 addresses assigned to Ameritech DSL in the RBACK1 neighborhood of Chicago, with which I've had some previous dealings:
    • A leech visited from there on 6 October 2004.
    • Somebody from there who was Googling for "PEDO SEX" stopped by on 3 April 2005.
    • A dK reader from an adjacent neighborhood (RBACK7) showed up on July 1, causing me to refine my database to exclude him from the block labelled "leech".
    • Now I get another dK reader from there, but he's in the same RBACK1 block with the original leech.  For now I'll just relabel the whole block "mixed", but really I should mark the individual addresses.
  • Somebody at The 3M Company (St. Paul MN) read my cookie contest post last May.  (Dunno what brought him there, the sniffer isn't *that* good.)  He was using Windows.  My dK post got a hit from the same individual IP address, now using a Mac.  Maybe it's the same guy and he got some new hardware, but there's no way to prove it.
  • The minor-league diarists at dK produce a continuous tidal wave of new posts that nobody could possibly keep up with, so it's rather unlikely that anyone who had read one of my previous posts would also see this new one.  But somebody using Roadrunner in Glenmont NY saw my first dK post last July and also this latest one.  Roadrunner is a cable-modem ISP, so "same IP address six months later" quite probably means "same individual", and besides, the new hit's browser ID is exactly the same as the old one's.  (That sameness is convenient for me and my hobby, but Internet Explorer is virus-bait and its users *really* need to keep downloading all the patches for their browser!)
  • Some person or robot at NetSweeper, Inc. in Guelph ON visited my home page on 10 November 2004.  Then in May, June, August, and October of 2005 there were hits from him/her/it to my Image Leeches essay.  I'm seeing these random hits on the Leeches essay coming from all over the world and have formed several mutually-inconsistent theories about what they might mean, but anyway that's a topic for another post.  So now I get a hit from that same IP on my dK post and I have to wonder, "Did this person click on my post because 'Pyesetz' is a familiar name, or does he/she have *NO IDEA* that his/her computer is periodically downloading my essay?"
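
Several of the false hits above come from labels attached to whole address blocks with no expiration date. The following is a minimal sketch of one way to handle that, not the actual visitor-sniffer: the record layout, the function name `label_for`, the labels, and the private-range addresses are all hypothetical. It picks the most specific unexpired block that contains the address and falls back to "Kossack" otherwise.

```python
import ipaddress
from datetime import date

# Hypothetical records: (network, label, expiry date or None for "never").
# The addresses are private-range placeholders, not real visitors.
RECORDS = [
    (ipaddress.ip_network("10.20.0.0/16"), "troll (proxy block)", None),             # 65536 addresses
    (ipaddress.ip_network("10.30.16.0/20"), "friend's old ISP", date(2005, 6, 30)),  # 4096 addresses
    (ipaddress.ip_network("10.30.17.5/32"), "friend (single address)", None),
]

def label_for(ip, on_date, records=RECORDS, default="Kossack"):
    """Return the label of the most specific unexpired block containing ip."""
    address = ipaddress.ip_address(ip)
    matches = [
        (network.prefixlen, label)
        for network, label, expires in records
        if address in network and (expires is None or on_date <= expires)
    ]
    if not matches:
        return default
    # A longer prefix means a smaller block, so an individual address wins.
    return max(matches, key=lambda match: match[0])[1]

print(label_for("10.30.18.1", date(2005, 1, 1)))   # inside the /20 before it expires
print(label_for("10.30.18.1", date(2006, 1, 15)))  # same address after the expiry date
```

Marking individual addresses inside a "mixed" block then amounts to adding /32 records, which take precedence over the block they sit in.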
Recently I restarted my visitor-sniffing program.  In 20 days it recorded 3091 hits, of which 80% (2458) were from search-engines. This is way too much data!  There were only 183 hits from actual search-engine users, plus 71 from CYD readers (via this thread).

I've added some more code to my journal.php program that will hopefully cut down on all those hits from search-engines:
  • Miva is now banned.  Their database is for private corporate use, of no value to my potential friends.  I'm thinking about banning Yahoo! Slurp, since they keep scanning my images but never show them in their search-results.
  • The Expires: header is now 1 month instead of 1 week, which should cut back on the rereading of unchanged pages (but new entries might not be noticed for a month).
  • The tags and the Link and Parent links for comments are now marked with rel="nofollow", which is supposed to cause search-engines to ignore them.  We'll have to see whether that really works.
  • The Leave a comment links now use rel="nofollow" instead of being trashed by my program.  Under the old system, anyone who found my journal through a search-engine could never comment on anything!
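
The two changes above that touch the output, the longer expiration and the rel="nofollow" links, can be sketched as follows. journal.php itself is PHP and is not shown in this post, so this Python version is only an illustration: the function names are made up, and "1 month" is assumed to mean 30 days.

```python
import html
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def expires_header(now, days=30):
    """Build an HTTP Expires header line for a page cacheable for `days` days.

    `now` must be a timezone-aware datetime.
    """
    expiry = now.astimezone(timezone.utc) + timedelta(days=days)
    # usegmt=True produces the "GMT" suffix that HTTP dates use.
    return "Expires: " + format_datetime(expiry, usegmt=True)

def nofollow_link(href, text):
    """Build a link that asks search-engines not to follow it."""
    return f'<a href="{html.escape(href, quote=True)}" rel="nofollow">{html.escape(text)}</a>'

print(expires_header(datetime.now(timezone.utc)))
print(nofollow_link("/journal.php?reply=1", "Leave a comment"))  # hypothetical URL
```

The trade-off is the one already noted in the list: a crawler that honors the header has no reason to come back before the date it names.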

I tried an ego-surf at Yahoo.  It came up with this and this, both of which seem spurious (my story mentions both Cherokee totems and hot tubs, but only as props), and the current versions of these pages don't mention me.

Even more boring news: I've now updated all my PHP files at Furtopia to have month-long expirations.  Previously only the journal had any expiration at all.  Reduced traffic, here I come!


Jul. 2nd, 2005 10:58 am
A Google search for I18N_UnicodeString gives as the top search result, followed by  Since the link from PEAR to the source code for this package is now dead, my page is the top-ranked link that is still working.  I believe this is the explanation for why my page was mentioned on this forum (warning: fractured English from German speakers).

My first-ever post to dK got 445 click-throughs!  Less than 30 were from people I've already met through LiveJournal.  About 350 were clicks via the "latest 20 diaries" list on dK's front page (i.e., people choosing my article solely based on its provocative title, since my username has no reputation at that site).  As far as I can tell, the rest would have to be clicks on the "read more" link from dK's "recent diaries" list: these people had read my first paragraph and then decided they wanted to see more about the topic I had chosen.  Thank you thank you thank you!  And extra thanks to the four people who gave me comments.  No mutual back-scratching involved—I've never commented on anyone else's diary at dK—they just wanted to talk amongst themselves on my topic.

Originally I had created my dK account in order to comment on this article, in which Kos slights McCain by leaving him out of the "top ten most popular Senators" list, even though his popularity is exactly equal to Bayh's, who was included—McCain's popularity is significant considering how many elephants think he's a traitor for saving our country from the "nuclear option".  But there's a 24-hour delay between account-creation and commenting privileges.  The story went cold before I could say anything.

I don't think my moniker seemed out of place there.  My post was rational with an angry undertone, and used mixed wild metaphors, two features commonly seen among political bloggers whose online names include a canine reference.
I've gotten past [ profile] brad's lameness filter!  And this was only my second attempt!

Alexa says my website is one of the top 33 destinations on Furtopia!  And [ profile] loganberrybunny is only three times as popular as I am.  But Technorati claims that nobody who matters ever links to my journal, so why do they even mention me?

I've been working on taxes this week and haven't been doing daily upkeep on my logfile.  Ugh!  Now I have 275 new visitor IPs to catalogue.  There are actually some interesting hits besides the usual pedo-search trash.  CYD is talking about me again.  Looks like they're happy with the comments my friends made about them in this post.  Hey, Donotsue, here's another mention of your name for the next time you go ego-surfing!

My monograph on search-engine users seems to be having quite an effect.  First off, it's a link-fest.  When people search for "pedophilia" they now tend to land on the "search-engine users" page instead of the more specific page for their search—and often as not they end up clicking on something off-topic like Who needs backups?.  But (perhaps coincidentally) Google seems to be ranking all of my pages higher now.  I'm getting quite a few hits from people searching for stuff like "ip city lookup free php code".  I can't explain why there's a sudden burst of interest in that field, other than that there's always been interest but suddenly my page is on the first screenful of search-hits.  My copy of John Downey's I18N_UnicodeString.php actually ranks only slightly lower (29 vs 24) than the author's official copy, for queries like "PHP Unicode storing".  I don't know why people would skip past his to click on my copy of the same thing.

I am just starting to get the barest inkling of the mind-boggling scale of Internet usage, which reminds me of AT&T's "Spaceship Earth" ride at Disney.  Hopefully I'll never get the kind of email that Wil Wheaton has to deal with.

This quiz says I act like I'm only 33 years old.  Ah, would that I were so young again!  And this quiz predicts that I will die at age 79.  I doubt it—neither my father nor my grandfathers lived that long, although some great-grandfathers did.

You Are Creepy
Serial killers would run away from you in a flash.

Grruwlf!  Snoop-Dog here, again setting the Wayback Machine for: time = "Fall 2004" and place = "my website's logfile".  Today's topic is The Search-Engine Users.  This time I shall organize the data according to how I felt when I saw these search-queries in my logfile.  The porn-links are in section "Disgusted".

My logfile shows 318 entries whose referers are search-engine users, with 237 distinct search-queries, some of which I've combined into near-synonymous groups.  For example, it seems to me that someone searching for  "M/F furry" and someone who wants "M/F yiff" are probably looking for the same pages on the web.  I use the notation [5×] to indicate that there are four other queries similar to the one shown.
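
Pulling the search-query out of a referer and tallying repeats can be sketched as follows. This is an assumption-laden illustration rather than the code behind this post: the parameter names in `QUERY_PARAMS`, the function names, and the example.com referers are hypothetical, and the sketch only merges queries that are identical after lowercasing, whereas the near-synonymous groups described above were combined by hand.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumption: the query lives in one of these parameters, depending on the engine.
QUERY_PARAMS = ("q", "p", "query")

def search_query(referer):
    """Return the lowercased search query found in a referer URL, or None."""
    params = parse_qs(urlsplit(referer).query)
    for name in QUERY_PARAMS:
        if name in params:
            # Collapse runs of whitespace so trivially different queries match.
            return " ".join(params[name][0].lower().split())
    return None

def tally_queries(referers):
    """Count how often each query occurs, skipping referers with no query."""
    queries = (search_query(referer) for referer in referers)
    return Counter(query for query in queries if query is not None)

# Hypothetical referers, not entries from the real logfile.
referers = [
    "http://search.example.com/search?q=PHP+Unicode+storing&hl=en",
    "http://search.example.com/search?q=php%20unicode%20storing",
    "http://other.example.com/find?p=ip+city+lookup+free+php+code",
    "http://blog.example.com/some-post.html",
]
for query, count in tally_queries(referers).most_common():
    print(f"[{count}×] {query}")
```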

There are 729 logfile entries from the engines themselves, stopping by to keep their databases current, which is more than twice the number of hits from actual users.  In part this is my fault for not putting "Expires" headers on the output from my .PHP programs.  That causes some engines to reread the files multiple times per week even though they don't change for months at a time.  Problem is, I can't decide what expiry time I should use.  This person recommends 48 hours for HTML files, which seems too short.

( Happy, Sad, Quizzical, Cynical, Angry, Disgusted )
Grruwlf!  I'm Snoop-dog,¹ your friendly neighborhood troll.  This post is the first in what may become an occasional series on the topic of website logfile analysis.  No, wait!  It isn't *that* boring!  In fact, it's far more interesting than watching grass grow, waiting for a pot of water to boil, etc.
¹Yes, it seems I've acquired a new nickname.  Begone, PieSplatz!  Thou art *so* 2004!

My logfile contains 6,393 entries for the period 14-Sep-04 through 31-Dec-04.  Eliminating duplicative entries reduces the total to about 4,206 accesses.  I got hits from 1,237 different IP addresses, which I've grouped into perhaps 738 distinct people visiting my website.  I've managed to find names for only 49 of these people, but those named visitors were responsible for 44.6% of all the accesses.  Interesting search-queries that have brought people to my site include "pheromone cockroach starve male", "eating deer organ meats", "dog humping cookie monster", "children having sex pedo crimes frequency......  *snore zzzzzzzzzzzzzzz*

Huh?  Oh, sorry.  Must have dozed off!  Today's topic is the "LJ image leeches".  These are induhviduals who like to look at just the pictures embedded in public LiveJournal posts, while ignoring the surrounding text.  During Q4 of 2004 I made four public-posts-with-pictures and captured website hits from 112 of these, um, "people" I suppose you could call them.  Or they could be called "bandwidth-sucking sex-starved pimply-faced adult-wannabe waste products", but that would be unfair to the ones without acne.  So here is what I have learned about them:

( Boring analysis of least-desirable visitors to a website that ISN'T EVEN YOURS )


