Analyzing Millions of GitHub Commits

https://www.youtube.com/watch?feature=player_embedded&v=U_LNo_cSc70

It’s a very interesting talk, but even more interesting is the tool the presenter developed.

You can find the slides here: http://www.igvita.com/slides/2012/bigquery-github-strata.pdf

The tool: http://www.githubarchive.org/

GitHub page: https://github.com/igrigorik/githubarchive.org/tree/master/bigquery

I tried a query to find the top 100 Android repositories by number of pushes.

Here’s my query on https://bigquery.cloud.google.com/:

SELECT repository_name, count(repository_name) as pushes, repository_description, repository_url
FROM [githubarchive:github.timeline]
WHERE type="PushEvent"
AND repository_language="Java"
AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC('2012-04-01 00:00:00')
AND repository_description CONTAINS "Android"
GROUP BY repository_name, repository_description, repository_url
ORDER BY pushes DESC
LIMIT 100
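
To make the logic of the query easier to follow, here is a rough Python sketch that applies the same filter, group, and sort steps to a small in-memory list of events. It is only an illustration, not how BigQuery executes the query. The sample rows, the example/... repository names, and the function name top_pushed_repos are made up for this sketch; the field names mirror the columns used in the query, and the created_at format is assumed to be 'YYYY-MM-DD HH:MM:SS'.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the keys mirror the columns used in the query.
SAMPLE_EVENTS = [
    {"type": "PushEvent", "repository_name": "alpha", "repository_language": "Java",
     "repository_description": "An Android client", "created_at": "2012-04-02 10:00:00",
     "repository_url": "https://github.com/example/alpha"},
    {"type": "PushEvent", "repository_name": "alpha", "repository_language": "Java",
     "repository_description": "An Android client", "created_at": "2012-04-03 11:30:00",
     "repository_url": "https://github.com/example/alpha"},
    {"type": "PushEvent", "repository_name": "beta", "repository_language": "Java",
     "repository_description": "Android widgets", "created_at": "2012-04-05 09:15:00",
     "repository_url": "https://github.com/example/beta"},
    # The rows below should all be filtered out.
    {"type": "WatchEvent", "repository_name": "alpha", "repository_language": "Java",
     "repository_description": "An Android client", "created_at": "2012-04-04 08:00:00",
     "repository_url": "https://github.com/example/alpha"},
    {"type": "PushEvent", "repository_name": "gamma", "repository_language": "Ruby",
     "repository_description": "Android build scripts", "created_at": "2012-04-06 12:00:00",
     "repository_url": "https://github.com/example/gamma"},
    {"type": "PushEvent", "repository_name": "alpha", "repository_language": "Java",
     "repository_description": "An Android client", "created_at": "2012-03-30 23:59:59",
     "repository_url": "https://github.com/example/alpha"},
    {"type": "PushEvent", "repository_name": "delta", "repository_language": "Java",
     "repository_description": None, "created_at": "2012-04-07 14:00:00",
     "repository_url": "https://github.com/example/delta"},
]


def top_pushed_repos(events, since, language="Java", keyword="Android", limit=100):
    """Count push events per repository, most pushed first."""
    pushes = Counter()
    for event in events:
        # WHERE type = "PushEvent" AND repository_language = "Java"
        if event["type"] != "PushEvent" or event["repository_language"] != language:
            continue
        # AND created_at >= the start date (timestamp format is an assumption)
        created = datetime.strptime(event["created_at"], "%Y-%m-%d %H:%M:%S")
        if created < since:
            continue
        # AND repository_description CONTAINS "Android" (missing descriptions never match)
        if keyword not in (event["repository_description"] or ""):
            continue
        # GROUP BY repository_name, repository_description, repository_url
        key = (event["repository_name"], event["repository_description"],
               event["repository_url"])
        pushes[key] += 1
    # ORDER BY pushes DESC LIMIT 100
    return pushes.most_common(limit)


if __name__ == "__main__":
    for (name, description, url), count in top_pushed_repos(
            SAMPLE_EVENTS, since=datetime(2012, 4, 1)):
        print(name, count, description, url)
```

Each step in the loop corresponds to one clause of the query, so changing the language or keyword arguments is the equivalent of editing the WHERE conditions.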

Here is the result

Looking at the result, I immediately spotted that several projects from the CyanogenMod repository are on the list. That is how I learned that CyanogenMod (http://www.cyanogenmod.org/) is an aftermarket firmware for Android.

query1