On the subject of feature bloat.

[Screenshot: WoW Log Parser]

It actually parses each entry in full, not just these fields.

A while back, my guild was working on killing heroic Omnotron Defense System, and we had a problem. Some people in our raid were happily whacking Electron with various spells, melee attacks, and whatever else while his Unstable Shield was up. This triggered a backlash spell called Static Shock that kept killing our mans. Problem is, nobody would fess up to it, whether out of ignorance or shame.

The solution, as in so many things, was Perl. I wrote a Perl script called WBUE (Who Blew Up Electron?) that reads the WoW combat log and looks for events that coincide with people, well… blowing up Electron. The heart of the script was a small parser that only half-asses the job of understanding each line, but it was good enough for what I needed.

sub parse_entry {
    # This is far from comprehensive. The actual parsing of this probably would require
    # a genuine parser, given the documented complexity of the data.
    my $filter = shift or die "Must pass a parse filter to parse_entry";
    my $input = shift or return;
    my $starttime = shift; # or die "Need a start time in parse_entry";
    my $endtime = shift; # or die "Need an end time in parse_entry";

    my %entry;
    my $chunknum = 0;
    my $chunktype = 0;
    my @chunks = split(/,/, $input);
    foreach (@chunks) {
        if ($chunknum == 0)  {
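            # Capture into lexicals and bail out on a failed match; otherwise
            # $1..$3 could hold stale or undefined values.
            my ($date, $texttime, $event) = /^(\S+)\s+(\S+)\s+(\S+)/ or return;
            $entry{"date"} = $date;
            $entry{"texttime"} = $texttime;
            $entry{"time"} = convert_to_time($texttime);
            $entry{"event"} = $event;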

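            # The time window is optional: an undefined bound is treated as open-ended.
            return undef if defined $starttime && $entry{"time"} < $starttime;
            return undef if defined $endtime   && $entry{"time"} > $endtime;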

            if ($entry{"event"} =~ /SWING_DAMAGE/) { $chunktype = 1; }
            elsif ($entry{"event"} =~ /SWING_MISSED/) { $chunktype = 2; }
            elsif ($entry{"event"} =~ /(SPELL\S*)|(RANGE)/) { $chunktype = 3; }
            elsif ($entry{"event"} =~ /ENVIRONMENTAL/) { $chunktype = 4; }
            elsif ($entry{"event"} eq "DAMAGE_SHIELD") { $chunktype = 3; }
        } elsif ($chunknum == 1) {
            $entry{"sourceGUID"} = $_;
        } elsif ($chunknum == 2) {
            # Test the match itself: $1 keeps its old value when a match fails.
            if (/^"*(.+?)"*$/) { $entry{"sourceName"} = $1; }
            else { warn "Problem parsing $_ for source name"; }
        } elsif ($chunknum == 3) {
            $entry{"sourceFlags"} = $_;
        } elsif ($chunknum == 4) {
            $entry{"destGUID"} = $_;
        } elsif ($chunknum == 5) {
            if (/^"*(.+?)"*$/) { $entry{"destName"} = $1; }
            else { warn "Problem parsing $_ for dest name"; }
        } elsif ($chunknum == 6) {
            $entry{"destFlags"} = $_;
        } elsif ($chunknum == 7) {
            if ($chunktype == 1) { $entry{"damage"} = $_; }
            elsif ($chunktype == 2) { $entry{"missType"} = $_; }
            elsif ($chunktype == 3) { $entry{"spellid"} = $_; }
            elsif ($chunktype == 4) { $entry{"environmentalType"} = $_; }
        } elsif ($chunknum == 8) {
            if ($chunktype == 1) { $entry{"overkill"} = $_; }
            elsif ($chunktype == 2) { $entry{"amountMissed"} = $_; }
            elsif ($chunktype == 3) {
                if (/^"*(.+?)"*$/) { $entry{"spellName"} = $1; }
                else { warn "Problem parsing $_ for spell name"; }
            }
            elsif ($chunktype == 4) { $entry{"damage"} = $_; }
        } elsif ($chunknum == 9) {
            if ($chunktype == 1) {$entry{"school"} = $_; }
            elsif ($chunktype == 3) {$entry{"spellSchool"} = $_; }
        }
        $chunknum++;
    }

    if ( $filter->(\%entry) ) { return \%entry; }
    else { return undef; }
}
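
For illustration, here is a minimal sketch of the kind of filter that could be handed to parse_entry for the Electron problem. It is not the original WBUE code. The helper name make_shield_filter, the player names, and the sample entries are all made up, and it assumes the shield shows up in the log as SPELL_AURA_APPLIED and SPELL_AURA_REMOVED events with Electron as the destination. The hash keys are the ones parse_entry fills in, and the entries are built by hand so the block runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a filter (code ref taking an entry hash ref)
# that keeps damage dealt to $boss while the aura named $shield is active.
# The closure remembers shield state, so entries must arrive in log order.
sub make_shield_filter {
    my ($boss, $shield) = @_;
    my $shield_up = 0;
    return sub {
        my ($entry) = @_;
        my $event = $entry->{event}    // '';
        my $dest  = $entry->{destName} // '';
        return 0 unless $dest eq $boss;

        # Aura events for the shield only flip the state; they are not hits.
        if (($entry->{spellName} // '') eq $shield) {
            $shield_up = 1 if $event eq 'SPELL_AURA_APPLIED';
            $shield_up = 0 if $event eq 'SPELL_AURA_REMOVED';
            return 0;
        }
        return $shield_up && $event =~ /_DAMAGE$/ ? 1 : 0;
    };
}

# Made-up sample entries, in the shape parse_entry would produce.
my @entries = (
    { event => 'SWING_DAMAGE', sourceName => 'Alice', destName => 'Electron' },
    { event => 'SPELL_AURA_APPLIED', sourceName => 'Electron',
      destName => 'Electron', spellName => 'Unstable Shield' },
    { event => 'SPELL_DAMAGE', sourceName => 'Bob', destName => 'Electron',
      spellName => 'Fireball' },
    { event => 'SWING_DAMAGE', sourceName => 'Bob', destName => 'Electron' },
    { event => 'SPELL_AURA_REMOVED', sourceName => 'Electron',
      destName => 'Electron', spellName => 'Unstable Shield' },
    { event => 'SWING_DAMAGE', sourceName => 'Alice', destName => 'Electron' },
);

# Tally offenders by name.
my $filter = make_shield_filter('Electron', 'Unstable Shield');
my %hits;
for my $entry (@entries) {
    $hits{ $entry->{sourceName} }++ if $filter->($entry);
}
print "$_: $hits{$_}\n" for sort keys %hits;
```

Because the filter is a closure, the same code ref can be passed to parse_entry for every line of a log, and it carries the shield state from one call to the next.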

As I discovered after writing this, a tool that makes finger-pointing in raids automatic, easy, and impersonal enough not to offend people is popular. The script started adapting, and a second version was produced, called WHA (Who Hit Arcanotron?)! Since then, I’ve already fielded requests for a tool to track things like who got hit by Squall Line. And after that, well, what engineer doesn’t want to make a killer tool?

And so “Eranthe’s WoW Log Parser” was born. It’s a general tool for building queries on combat logs that have been imported from disk. So far it’s nothing super fancy, except that it does a bang-up job of reading in 200+ megabyte log files. It’s even multithreaded! The chore of multithreading it will, I’m sure, turn into a future post about how arcane threading libraries can be, even when you’re using one as easy as Grand Central Dispatch on Mac OS X.

My parser is the first big data-analysis project I’ve tried to tackle since graduate school. It’s taught me some lessons, such as that a gigabyte of RAM isn’t really a ceiling for memory consumption anymore. Huge working sets are fast and viable on a modern machine in a way they weren’t in the early 2000s. Back then, I’d have had to worry about reading and writing the log to disk while keeping a small working set in memory. That doesn’t seem to be a problem anymore.

On the other hand, I have yet to actually write the analytical code, short of hooking up an NSPredicateEditor to my NSArrayController. We’ll see what happens when I stop working on this blog’s template and get back to grinding out some code for it.

Next time: WoWCombatLog.txt, why do I hate you so much? Would it have been too hard to at least be properly comma-delimited?
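
As a rough illustration of that complaint, the parse_entry code above already shows one wrinkle: the first comma-separated chunk packs the date, the time, and the event name together, separated only by whitespace. A second possible wrinkle, which is an assumption here and not something the post confirms, is that a plain split(/,/) would cut a quoted name in two if it ever contained a comma. The sketch below handles both. The function name split_log_line and the sample line are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split one log line into (date, time, event, params...).
# Commas inside double quotes stay part of the field; the quotes are dropped.
sub split_log_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;

    # Header: date and time are whitespace-separated, the rest is comma-separated.
    my ($date, $time, $rest) = $line =~ /^(\S+)\s+(\S+)\s+(.*)$/
        or return;

    my @fields;
    # Each field is either a double-quoted string or a run of non-comma characters.
    while ($rest =~ /\G(?:"([^"]*)"|([^,]*))(?:,|$)/g) {
        push @fields, defined $1 ? $1 : $2;
        last if pos($rest) >= length $rest;
    }
    return ($date, $time, @fields);
}

# Made-up sample line; real log lines may carry different fields.
my $sample = '4/21 20:15:03.123  SPELL_DAMAGE,0x01,"Name, With Comma",0x511,'
           . '0x02,"Electron",0x10a48,12345,"Fireball",0x4';
my @fields = split_log_line($sample);
print join(' | ', @fields), "\n";
```

This keeps the single-pass, regex-only flavor of the original script. A trailing empty field after a final comma is dropped, which is one of several corners a genuine parser would have to handle.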

[[lifebloom alloc] init];

Greetings, salutations, and hello! (Insert animated GIF of construction worker here.) This is mah new blog, and I’m just getting going setting up the site. Once that’s done, I hope to share with the blogosphere my thoughts on writing a log parser for WoW combat logs in Objective-C. Should be fun!

Stay tuned!