March 14, 2014

Exploring the GitHub archive

Recently, many people have been publishing their own R packages on GitHub. But there is no archive site specialized for R packages on GitHub, so we have to read blogs, Twitter, Google+, and so on to find information about useful packages. The searching is enjoyable, but it takes a lot of time. So I decided to build an archive site for personal use.

There are two steps to reach this goal.

First, I get the GitHub Archive data with Google BigQuery.

Second, I build the site from that data with rCharts and RPubs.

Here's the result. I only show the first 50 rows. As you can see, it isn't perfect: there are many duplicate repositories and repositories with no content. Next, I will clean this data and build the archive site from it.
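The cleaning step could look roughly like the sketch below. It is written in Python only as an illustration, not as the code I will actually use. The field names (`repository_name`, `repository_url`, `repository_description`) and the sample rows are hypothetical, and treating an empty description as "no content" is just an assumed stand-in for whatever check turns out to work on the real data.

```python
def clean_repositories(rows):
    """Keep the first row per repository URL and drop rows that look empty.

    Assumption: each row is a dict with the hypothetical keys
    'repository_url' and 'repository_description'.
    """
    seen_urls = set()
    cleaned = []
    for row in rows:
        # Normalize the URL so trivial variants count as the same repository.
        url = (row.get("repository_url") or "").strip().lower().rstrip("/")
        description = (row.get("repository_description") or "").strip()
        # Assumed proxy for "no content": a missing URL or an empty description.
        if not url or not description:
            continue
        # Skip repositories that were already kept.
        if url in seen_urls:
            continue
        seen_urls.add(url)
        cleaned.append(row)
    return cleaned


# Made-up rows for illustration only.
sample_rows = [
    {"repository_name": "pkg-a",
     "repository_url": "https://github.com/example-user/pkg-a",
     "repository_description": "Example plotting helpers"},
    {"repository_name": "pkg-a",
     "repository_url": "https://github.com/example-user/pkg-a/",
     "repository_description": "Example plotting helpers"},
    {"repository_name": "empty-repo",
     "repository_url": "https://github.com/example-user/empty-repo",
     "repository_description": ""},
    {"repository_name": "pkg-b",
     "repository_url": "https://github.com/example-user/pkg-b",
     "repository_description": "Example data tools"},
]

cleaned_rows = clean_repositories(sample_rows)
```

Keeping the first occurrence of each URL preserves the original row order, which matters if the rows are already sorted by something like popularity.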
