I've gotten most of the second part of Violator done...just need to do some graphical output and clean things up a bit. This is the real reason I changed the name of DUmpDiver to Violator. It "violates" the DUmmies a lot more.
See, Violator is a "feeder application" for Margaret. As it counts the DUmmies, it reads, stores and indexes EVERY NAUSEATING WORD they type.
The purpose of doing this is to enable psycholinguistic analysis of the stinking commies that post at the DUmp.
Based on the work of Dr. J.W. Pennebaker, the DUmmy posts are analyzed against the Linguistic Inquiry and Word Count (LIWC)...
http://www.liwc.net/liwcdescription.php
LIWC is a dictionary of words that are categorized by psychological meaning. Pennebaker et al. have proven conclusively that word use frequency in a particular category is indicative of overall psychological state.
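To give a rough idea of what counting words by category looks like, here is a minimal Python sketch. This is not the actual Violator or Margaret code, and the tiny word lists are made up for illustration (the real LIWC dictionary is much larger). The names `CATEGORIES` and `count_categories` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only; NOT the real LIWC word lists.
CATEGORIES = {
    "anger": {"hate", "rage", "furious"},
    "i": {"i", "me", "my"},
    "we": {"we", "us", "our"},
}

def count_categories(text, categories=CATEGORIES):
    """Return (total word count, Counter of hits per category) for one post."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for name, terms in categories.items():
            if word in terms:
                counts[name] += 1
    return len(words), counts

total, counts = count_categories("I hate it when we lose our way")
print(total, dict(counts))
```

A word can belong to more than one category, so the inner loop checks every category instead of stopping at the first match.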
Here is an example of output from Margaret (sorry it's in XML)...
<forum name="GD">
<category name="anger" terms="121" termsUsed="79372" score="1.69882335314923" />
<category name="comm" terms="127" termsUsed="134292" score="2.87429302198655" />
<category name="death" terms="29" termsUsed="10884" score="0.23295360297934" />
<category name="i" terms="9" termsUsed="38160" score="0.816750228747853" />
<category name="negemo" terms="345" termsUsed="179800" score="3.8483147570457" />
<category name="optim" terms="70" termsUsed="61079" score="1.30729264207783" />
<category name="posemo" terms="265" termsUsed="209241" score="4.47844954437708" />
<category name="posfeel" terms="43" termsUsed="31705" score="0.678591876374494" />
<category name="sad" terms="72" termsUsed="32320" score="0.691754910721452" />
<category name="sexual" terms="49" termsUsed="20159" score="0.431469283577777" />
<category name="swear" terms="32" termsUsed="27159" score="0.581292438746408" />
<category name="we" terms="11" termsUsed="24948" score="0.533969725021002" />
</forum>
This is based on a one-day Violator scan covering yesterday (the Glenn Beck gathering in DC).
I am analyzing their posts in 12 categories:
anger
comm (communication words)
death
i (talking about themselves in first person)
negemo (negative emotion words)
optim (optimistic words)
posemo (positive emotion words)
posfeel (positive feeling words)
sad
sexual
swear (dirty words)
we (collective)
This line
<category name="anger" terms="121" termsUsed="79372" score="1.69882335314923" />
says "In the anger category there are 121 terms, and those 121 terms were used 79372 times. The LIWC score is 1.698..."
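If you want to pull those numbers out of the XML yourself, something like the following should work. It is only a sketch using Python's standard `xml.etree.ElementTree`; the helper name `read_category` and the dictionary keys are my own choices, not anything Margaret defines.

```python
import xml.etree.ElementTree as ET

# The sample line from the output above.
line = '<category name="anger" terms="121" termsUsed="79372" score="1.69882335314923" />'

def read_category(xml_line):
    """Parse one <category .../> element into a plain dict with typed values."""
    el = ET.fromstring(xml_line)
    return {
        "name": el.get("name"),
        "terms": int(el.get("terms")),           # dictionary terms in the category
        "terms_used": int(el.get("termsUsed")),  # how many times those terms appeared
        "score": float(el.get("score")),         # LIWC score (a percentage)
    }

anger = read_category(line)
print(f"{anger['name']}: {anger['terms']} terms, "
      f"{anger['terms_used']} usages, score {anger['score']:.3f}")
```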
The LIWC score is the number of times the category terms (like anger) were used divided by the total number of words scanned, expressed as a percentage. The higher the number, the more heavily that psychological category shows up in the posts.
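As a sanity check on that arithmetic, here is a short Python sketch. The function names are hypothetical, and the total word count is not in the XML, so it is back-calculated from the published scores on the assumption that score = 100 * usages / total words.

```python
def liwc_score(category_uses, total_words):
    """Category usages as a percentage of all words scanned."""
    return 100.0 * category_uses / total_words

def implied_total(category_uses, score):
    """Invert the formula: total words implied by a usage count and its score."""
    return 100.0 * category_uses / score

# Two categories from the GD output above should imply the same total.
anger_total = implied_total(79372, 1.69882335314923)
comm_total = implied_total(134292, 2.87429302198655)
print(round(anger_total), round(comm_total))

# Recompute the anger score from the implied total.
print(liwc_score(79372, round(anger_total)))
```

Both categories imply the same total, about 4.67 million words for the day, which is what you'd expect if every score is computed against the same word count.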