Collect
This post is continued from the previous post.
Today, I will show an example of private decision making with the three key steps: Collect, Viz and Imagine. As the example, I will take the case of joining a meetup. Suppose that you are going to join an event where you do not know most of the participants. For this event, participants need to register on the event site and show their profiles, and some make their Twitter accounts public. You want to make acquaintances at this event. This is the starting point.
Collect
Most event sites have web APIs, for example Meetup, Eventbrite and ATND. By writing a few lines of code, you can get various data related to events. Okay, let's take ATND as an example. ATND is provided by Recruit, a famous Japanese company, and I usually use this site. For this purpose, I have prepared an R package, firstdate. You can install the package with devtools::install_github. After the installation, you can acquire the data from the API with just one line. Here's the code.
# devtools::install_github("dichika/firstdate")
library(firstdate)
users <- getATNDEventUsers(eventid=48048)
colnames(users)
## [1] "nickname" "status" "twitter_img" "twitter_id" "user_id"
The data is composed of five columns: nickname, status, twitter_img, twitter_id and user_id. Next, let's see how many people have Twitter accounts.
have_twitter <- sum(!is.na(users$twitter_id))
cat(have_twitter,"/", nrow(users), "(",round(100*have_twitter/nrow(users),1), "% ) people have twitter accounts.")
## 65 / 91 ( 71.4 % ) people have twitter accounts.
71.4% of the attendees have Twitter accounts.
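The same count can be broken down by the status column. Here is a minimal sketch in base R on a toy data frame. The values, including the status codes, are made up for illustration, and what each status code means is not something I rely on here.

```r
# Toy stand-in for `users`; all values are made up for illustration.
users_demo <- data.frame(
  nickname   = c("a", "b", "c", "d", "e"),
  status     = c(1, 1, 0, 1, 0),              # placeholder status codes
  twitter_id = c("tw_a", NA, "tw_c", "tw_d", NA),
  stringsAsFactors = FALSE
)

# Flag participants that list a Twitter account.
users_demo$has_twitter <- !is.na(users_demo$twitter_id)

# Share of participants with a Twitter account, per status value.
share_by_status <- tapply(users_demo$has_twitter, users_demo$status, mean)
print(round(100 * share_by_status, 1))
```

With the real data, replacing users_demo with users should give the same kind of table, provided missing accounts are stored as NA as in the code above.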
What topics are they interested in? Choosing topics to talk about is a very important step. Let's collect their recent tweets. To get tweets, you need authorization from Twitter. You have to register your account as a Twitter developer and get an API key and API secret. With your key and secret, you can acquire the 180 most recent tweets of participants. Today, we will get the tweets of the first 10 participants as an example.
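library(plyr)  # provides ldply
participants <- head(users[!is.na(users$twitter_id), ], 10)  # assumed definition: first 10 participants with a Twitter account
tweets <- ldply(participants[participants$twitter_id != "holidayworking", ]$twitter_id,
                function(x) getTwitter(x, key = "Your key", secret = "Your secret"))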
Did you get the tweets successfully? Okay, let's move on to the next step, Viz.
Viz
For visualization, you can choose among various methods, for example networks, timelines, simple bar charts and so on. In this case, we visualize the tweets as simple timelines. As the timeline shows, the frequency and the contents vary among participants.
visTL(tweetdata=tweets, group="name", path="twTL.html")
browseURL("twTL.html")
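If you prefer numbers to a picture, the variation in frequency can also be checked with a small table. This is only a sketch on toy data in base R, not what visTL does internally. The name column follows the group argument above, while the created column name and all the values are my assumptions.

```r
# Toy stand-in for `tweets`; the `created` column name is an assumption
# and the timestamps are made up.
tweets_demo <- data.frame(
  name    = c("alice", "alice", "alice", "bob", "bob"),
  created = as.POSIXct(c("2014-03-01 09:00:00", "2014-03-01 21:30:00",
                         "2014-03-02 08:15:00", "2014-03-01 12:00:00",
                         "2014-03-03 18:45:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Reduce each timestamp to its calendar day.
tweets_demo$day <- as.Date(tweets_demo$created, tz = "UTC")

# Count tweets per participant per day.
daily_counts <- table(tweets_demo$name, tweets_demo$day)
print(daily_counts)
```

Rows are participants and columns are days, so a participant who tweets rarely shows up as a row of mostly zeros.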
Imagine
The final step is Imagine. This is a simple step. You just read through the participants' tweets and imagine what topics they are interested in. In some cases, mathematical models and machine learning methods will help your imagination. For example, LDA might summarise messy tweet data and extract topics. At present, firstdate does not implement any such methods, but I will add them soon.
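Until then, something much simpler than LDA can already give hints: counting the most frequent words per participant. The sketch below is plain word counting in base R, not LDA, and it runs on toy tweets. The text column name, the tiny stopword list and the helper top_words are all hypothetical and not part of firstdate.

```r
# Toy tweets; the `text` column name is an assumption.
tweets_demo <- data.frame(
  name = c("alice", "alice", "bob"),
  text = c("Trying ggplot2 for my R plots",
           "More R plots with ggplot2 today",
           "Reading about Stan and Bayesian models"),
  stringsAsFactors = FALSE
)

# Words too common to hint at a topic (a tiny hand-made list).
stopwords <- c("for", "my", "with", "and", "about", "more", "today")

# Hypothetical helper: the n most frequent non-stopword tokens in `texts`.
top_words <- function(texts, n = 3) {
  words <- unlist(strsplit(tolower(texts), "[^a-z0-9]+"))
  words <- words[nzchar(words) & !(words %in% stopwords)]
  head(sort(table(words), decreasing = TRUE), n)
}

# Apply the helper to each participant's tweets separately.
top_by_person <- lapply(split(tweets_demo$text, tweets_demo$name), top_words)
print(top_by_person)
```

For Japanese tweets, splitting on non-alphanumeric characters will not work, so a morphological analyzer would be needed in place of strsplit.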
So far, I have introduced the three steps for private decision making: Collect, Viz and Imagine. On this blog, I will keep introducing other useful cases.