Carrying on in the EventStore series...
Okay, back to more practical things now that we've covered how easy temporal queries are with the event store.
Ever wondered how happy developers using different languages are? Well, let's find out.
First off, I downloaded a list of words for both positive and negative sentiment from the internet. Here are the references to the studies that provided these word lists:
Minqing Hu and Bing Liu. "Mining and Summarizing Customer Reviews."
Proceedings of the ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining (KDD-2004), Aug 22-25, 2004, Seattle,
Washington, USA.
Bing Liu, Minqing Hu and Junsheng Cheng. "Opinion Observer: Analyzing
and Comparing Opinions on the Web." Proceedings of the 14th
International World Wide Web conference (WWW-2005), May 10-14,
2005, Chiba, Japan.
So, how to use this? Well, I just pasted the lists of words into a file in vim and ran a macro over them to convert them into two arrays, like so:
var happyWords = [ "yay", "funsome", "winsome" ]
var sadWords = [ "boo", "crap", "lame" ]
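If vim macros aren't your thing, the same conversion can be sketched in a few lines of JavaScript. This is an illustration, not what was actually done for this post: the `toWordArray` and `toDeclaration` helpers are made up for the example, and it assumes one word per line, with blank lines and lines starting with `;` treated as comments and skipped.

```javascript
// Hypothetical helper: turns the raw text of a word list into an array.
// Assumes one word per line; blank lines and ';' comment lines are skipped.
function toWordArray(text) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim() })
    .filter(function (line) { return line.length > 0 && line[0] !== ';' })
}

// Hypothetical helper: emits a declaration that can be pasted into the projection.
function toDeclaration(name, words) {
  return 'var ' + name + ' = ' + JSON.stringify(words)
}

var sample = '; sample list\nyay\nfunsome\n\nwinsome\n'
console.log(toDeclaration('happyWords', toWordArray(sample)))
// prints: var happyWords = ["yay","funsome","winsome"]
```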
There are actually about 5000 words in total, but essentially what I'm going to do is partition by language and keep a count of the happy and sad words that turn up in each language's commit messages.
Now, real sentiment analysis is a little more complicated than simply looking for words, but we'll be happy with this for now. Let's have a look at the projection:
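function collectHappinessIndexOfCommit(commit, state) {
  // Count each happy word that appears anywhere in the commit message
  for(var i = 0; i < happyWords.length; i++) {
    if(commit.message.indexOf(happyWords[i]) >= 0)
      state.happycount++
  }
  // Same again for the sad words
  for(var j = 0; j < sadWords.length; j++) {
    if(commit.message.indexOf(sadWords[j]) >= 0)
      state.sadcount++
  }
  // Track the total number of commits seen for this language
  state.commits++
}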
fromStreams(['github-commits'])
  .partitionBy(function(ev) {
    if(ev.body.repo)
      return ev.body.repo.language
  })
  .when({
    "$init": function() {
      return {
        commits: 0, sadcount: 0, happycount: 0
      }
    },
    "Commit": function(state, ev) {
      collectHappinessIndexOfCommit(ev.body.commit, state)
    },
  })
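To get a feel for what the projection accumulates, the same logic can be run over a handful of fake events in plain JavaScript. This is only a sketch: `runLocally` and `countMatches` are hypothetical stand-ins written for this example, not part of the event store, every event is treated as a "Commit", and skipping events without a repo is an assumption about how a missing partition key behaves.

```javascript
// Tiny word lists for illustration; the real ones have thousands of entries.
var happyWords = ['yay', 'funsome', 'winsome']
var sadWords = ['boo', 'crap', 'lame']

// Hypothetical helper: how many of the words occur somewhere in the message.
function countMatches(words, message) {
  var count = 0
  for (var i = 0; i < words.length; i++) {
    if (message.indexOf(words[i]) >= 0) count++
  }
  return count
}

function collectHappinessIndexOfCommit(commit, state) {
  state.happycount += countMatches(happyWords, commit.message)
  state.sadcount += countMatches(sadWords, commit.message)
  state.commits++
}

// Hypothetical stand-in for the projection: one state object per language.
function runLocally(events) {
  var states = {}
  events.forEach(function (ev) {
    if (!ev.body.repo) return // no partition key, so skip (assumed behaviour)
    var language = ev.body.repo.language
    if (!states[language]) {
      states[language] = { commits: 0, sadcount: 0, happycount: 0 }
    }
    collectHappinessIndexOfCommit(ev.body.commit, states[language])
  })
  return states
}

// Made-up events, shaped like the ones the projection reads.
var result = runLocally([
  { body: { repo: { language: 'Delphi' }, commit: { message: 'yay, winsome fix' } } },
  { body: { repo: { language: 'Puppet' }, commit: { message: 'boo, this is lame' } } },
  { body: { repo: { language: 'Puppet' }, commit: { message: 'scrap old config' } } },
  { body: { commit: { message: 'no repo on this one' } } }
])
console.log(JSON.stringify(result))
```

Note that "scrap old config" counts as a sad commit, because `indexOf` matches substrings and "crap" is hiding inside "scrap". The matching is also case-sensitive, so "Yay" would be missed. Good enough for a bit of fun, but worth knowing.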
I guess I'll say that my "happiness index" can be expressed by
var index = happycount / sadcount
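As a rough sketch of how that ratio might be turned into a ranking, assuming the per-language states have been read out into a plain object: the `happinessIndex` helper and the counts below are invented for illustration, and dropping languages with no sad words is just one way of dodging the division by zero.

```javascript
// Hypothetical helper: happy-to-sad ratio, or null when there are no sad
// words to divide by (plain division would give Infinity or NaN).
function happinessIndex(state) {
  if (state.sadcount === 0) return null
  return state.happycount / state.sadcount
}

// Made-up counts, purely to show the shape of the calculation.
var states = {
  LanguageA: { commits: 10, sadcount: 4, happycount: 6 },
  LanguageB: { commits: 8, sadcount: 5, happycount: 5 },
  LanguageC: { commits: 2, sadcount: 0, happycount: 1 }
}

// Happiest first; languages without a usable index are left out.
var ranked = Object.keys(states)
  .map(function (language) {
    return { language: language, index: happinessIndex(states[language]) }
  })
  .filter(function (row) { return row.index !== null })
  .sort(function (a, b) { return b.index - a.index })

console.log(ranked)
```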
The exact formula isn't the point of this post (if you want to change it, modify the JS on this page). Let's have a look at the chart of happiness across languages:
Wow, look at those guys writing Delphi! Presumably they've got the best work/life balance ever known, or they know something the rest of us don't. The folk doing Puppet? I guess when your job is automating the crap that nobody else wants to touch you're going to be pretty miserable most of the time ;-)
Actually, most of the "old school" languages hang around to the right and the "new school" to the left - is this an indication that unhappy people jump ship sooner than others?
Note: The differences are actually hilariously small, and although there is a huge amount of data, it is likely not that statistically relevant. This is just a bit of fun.
2020 © Rob Ashton. ALL Rights Reserved.