Mining the Social Web

O’Reilly recently held a half-price sale on electronic versions of some of their newer books. I bit on Mining the Social Web, 2nd Edition by Matthew Russell. Having just started the book, I’m not yet in a position to really comment, but I was intrigued by several aspects. Basically, the book is about using the Python programming language to access social media sites through their APIs (application programming interfaces).

Instead of having readers set up Python on their own machines, he created a virtual machine that can be run on any platform using a program called Vagrant. The VM uses an interactive Python environment called IPython Notebook, which lets you embed runnable code examples in text.
The combination of the virtual machine, Vagrant and IPython Notebook obviates the need to document and maintain all the possible permutations of each platform (Mac/Windows/Linux) and version of Python, all the related ancillary libraries, and whatever is needed to install and maintain updates. 
The repository for all of the Python code used in the book and the setup for the Vagrant virtual machine is on GitHub. GitHub is a cloud-based hosting service for Git, a version control system, with a lightweight social media overlay that allows people to collaborate on programming projects.
Vagrant can use different virtual machine software, but the setup provided for the IPython Notebook happens to use the command-line version of VirtualBox. Vagrant installs an instance of Ubuntu 12. with all of the provisions for hosting Python and the IPython Notebook, as well as a web server which makes this available on port 8888. So, the installation sequence for Windows is:
1. Download and install VirtualBox if you haven’t already.
2. Download and install Vagrant.
3. Establish an account on GitHub.
4. Download the GitHub client. This includes two applications, the GitHub terminal and the GitHub GUI Manager.
5. Using Git, “clone” the GitHub repository for Mining the Social Web Second Edition. This makes a copy of the repository on your local machine.
6. Start a command line session and CD to the MTSWSE directory (whatever you’ve named it). Run vagrant up. This starts the creation of the virtual machine and the full provisioning. It is about a twenty-minute process. And, to be sure, given all of the output that it generates, it looks as if a multi-hour job has been automated.
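
For anyone who would rather script steps 5 and 6 than type them, here is a minimal Python sketch. It assumes git and vagrant are already installed and on your PATH (steps 1 through 4). The function names, the repository URL argument, and the target directory name are my own placeholders, not anything from the book.

```python
import subprocess


def build_commands(repo_url, target_dir):
    """Return the commands for steps 5 and 6 as argument lists.

    repo_url and target_dir are supplied by the caller; nothing here
    knows the book's real repository address.
    """
    return [
        ["git", "clone", repo_url, target_dir],  # step 5: clone the repository
        ["vagrant", "up"],                       # step 6: run from inside target_dir
    ]


def run_setup(repo_url, target_dir):
    """Clone the repository, then start provisioning inside the clone."""
    clone_cmd, up_cmd = build_commands(repo_url, target_dir)
    subprocess.run(clone_cmd, check=True)
    # cwd makes "vagrant up" run in the cloned directory, the same as
    # CD-ing there first in a command line session.
    subprocess.run(up_cmd, cwd=target_dir, check=True)
```

Calling run_setup with the repository's clone URL and a directory name such as "MTSWSE" would do the same thing as the manual steps. With check=True, a failure in either command raises an exception instead of passing silently.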

A couple of troubleshooting tips. If Vagrant looks like it is installing “the default VM”, or if you see a reference to PROCESS32, you’ve somehow missed the GitHub part. Maybe the directory is wrong? What you should expect to see instead is PROCESS64.
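
One quick way to rule out the wrong-directory case is to check for a Vagrantfile before running vagrant up, since that is the file Vagrant reads its configuration from. This is only a sketch: it assumes the cloned repository keeps its Vagrantfile at the top level, and the function name is my own.

```python
from pathlib import Path


def has_vagrantfile(directory="."):
    """Return True if the given directory contains a file named Vagrantfile."""
    return (Path(directory) / "Vagrantfile").is_file()
```

If has_vagrantfile() returns False in the directory where you are about to run vagrant up, you are probably not in the cloned repository.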

If everything went correctly, the provisioning process will end with the line:
DEBUG: Exiting. 

The real sign of success is to open a web browser and navigate to http://localhost:8888
That will bring up the main page of the IPython Notebook.
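
The same check can be done without a browser. The sketch below only tests whether something is accepting connections on port 8888. It does not confirm that the listener is actually the IPython Notebook, and the function name and defaults are my own.

```python
import socket


def notebook_port_open(host="localhost", port=8888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

A False result right after vagrant up finishes suggests either that provisioning did not complete or that the port was not forwarded from the VM to your machine.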

My favorite error message to date:
DL is deprecated, use FIDDLE.  

The Vagrant documentation refers to another program called Chef. Chef manages the provisioning of servers, and is used especially in cloud, VM, and multi-server installations to manage all of the server instances.

