DH Box is really taking shape! We have a bare-bones version of our server image up and running thanks to all of Steve’s hard work over the last week. We have revised our project plan with new milestone dates and a clear-cut set of tasks we need to accomplish. We are working hard on everything we need to do now and also looking forward to the next phase.
User experience testing and documentation will be very important over the next few weeks. We need to be sure that people who are not already familiar with the command line, cloud computing, and DH tool installation will find DH Box easy and convenient to use. Documentation (aka the “user manual”) will be the key to helping users make the most of DH Box. We have decided to use Read the Docs to host our documentation. Read the Docs allows us to host documentation files on our website and to update that documentation whenever we push to the GitHub repository that hosts our website. This means updating our online documentation is as simple as updating text on our website! The great benefits of using a utility like Read the Docs are that our documentation will be easy to maintain, forkable by contributors, available online, and searchable.
The DH Box team has made exciting strides over the past week!
As some may know, DH Box will be available as a pre-installed, pre-configured Debian cloud server. To achieve this, we are using Amazon Web Services (AWS). For the uninitiated, AWS is a vast cloud computing infrastructure, with servers throughout the world, that offers remotely much of what a physical computer would provide. On top of that, AWS brings scale, flexibility, and economy through pay-as-you-go pricing.
DH Box’s Intro to Cloud Computing
Dennis Tenen led the DH Box team through its first group workshop on setting up a virtual web server, a.k.a. an “EC2 instance.” Each instance is launched from an Amazon Machine Image (AMI), which you can think of as an identical copy of a configured operating system. The DH Box image will be freely available, so users can launch their own instance from ours. This solution saves users the trouble of downloading and installing tools on their own computers.
What do users need to access DH Box in the Cloud?
It will be pretty simple: users must sign up for a free AWS account. We’re making use of AWS CloudFormation, a utility whose templates deploy services rapidly, to automate many of the steps required to launch a new instance from our AMI. We also have custom scripts to automate the launch of DH Box files and software once users have launched our server image. We’re really excited about being introduced to this powerful service, and even more encouraged that our configuration templates will allow DH Box users to dive swiftly into DH inquiry.
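To give a flavor of what such a template involves, here is a minimal sketch in Python that builds a CloudFormation template as JSON. It is an illustration only, not our actual template: the resource name `DHBoxInstance`, the placeholder AMI ID `ami-EXAMPLE`, the default instance type, and the `build_template` helper are all assumptions made up for this example.

```python
import json


def build_template(ami_id, instance_type="t2.micro"):
    """Return a minimal CloudFormation template (as a dict) for one EC2 instance.

    ami_id and instance_type are illustrative; a real DH Box template
    would likely need more (security groups, startup scripts, and so on).
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical sketch: launch one instance from a server image",
        "Parameters": {
            # The user picks one of their existing key pairs at launch time.
            "KeyName": {
                "Type": "AWS::EC2::KeyPair::KeyName",
                "Description": "Existing EC2 key pair for SSH access",
            }
        },
        "Resources": {
            # "DHBoxInstance" is a made-up logical name for this example.
            "DHBoxInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    "KeyName": {"Ref": "KeyName"},
                },
            }
        },
        "Outputs": {
            # Expose the address of the new server once the stack is created.
            "PublicDNS": {
                "Description": "Public DNS name of the new instance",
                "Value": {"Fn::GetAtt": ["DHBoxInstance", "PublicDnsName"]},
            }
        },
    }


# "ami-EXAMPLE" is a placeholder, not a real image ID.
template_json = json.dumps(build_template("ami-EXAMPLE"), indent=2)
print(template_json)
```

The printed JSON is the kind of file a user could hand to CloudFormation, which would then create the instance described under `Resources` without the user stepping through the launch process by hand.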
This is just the beginning: we’re focusing heavily on providing thorough documentation so that DH Box users will have everything they need to get up and running. Stay tuned!
Special thanks to Prof. Dennis Tenen for his amazing Intro to Cloud Computing Workshop.
In the interest of spreading the mission of DH Box far and wide, I’ve been working on a brief presentation that might also serve as an online introduction to the project. It’s available here. Take a look!
I’ll be using these slides to give a short talk about DH Box to faculty this Tuesday at Hunter College. It looks like we’ll be making quite a few presentations like this one, because as it turns out, building a community is one of the key factors determining success for DH Box. We will need the help of an invested community to:
- Determine which tools should be included
- Identify new platforms to target
- Contribute to documentation
- Spread awareness about DH Box
It also seems clear that in-person meetings and discussions are the best way for us to create interest in our work. That’s not to discount social media approaches at all; they allow for broad outreach we couldn’t manage otherwise. But in-person conversation allows us to demonstrate and discuss DH Box in greater depth, solidifying each potential user’s understanding of the project and their relationship with us.
This week the DH Box team reconsidered its choice of platform, with the help of Dennis Tenen, a professor at Columbia University in the Digital Humanities and New Media Studies program (and a former developer at Microsoft).
A couple of weeks ago we were surprised and delighted to find that another team had come up with the idea for a portable tool that could help users quickly get going with DH applications. This week we found that Professor Tenen and colleagues had also discussed how to tackle such a project and had come up with yet another solution! In discussing that solution, we found that it matched our aim of making it easy for new users to set up an environment quickly, and it led us to change our focus for both implementation and outreach.
We’ll save a description of Professor Tenen’s proposed approach for a later post, but we can say that his method circumvents big issues we would have encountered with our original proposal. What if users don’t have one of the operating systems that a DH Box install script was written for? Moreover, what if an addition works for one operating system but not others (a painful lesson Steve learned this week!)? What if unaccustomed users have issues with the install scripts? Or with the command line? Avoiding these problems will save us a lot of user issues in the long run. We were also happy to hear we wouldn’t have to give up our Raspberry Pi pursuits: Professor Tenen was excited about the potential of the hyper-portable, affordable Raspberry Pi platform and suggested producing a DH Box ‘lite’ version later.
So, we will be abandoning the install script approach and with it the need for a robust way to deal with different operating system issues. Our main issues will now be:
- Creating meticulous documentation to get unaccustomed users up-and-running
- Keeping up with new releases of the DH tools in DH Box
- Building a community invested in suggesting improvements for DH Box and helping with maintenance
Professor Tenen suggested starting with GitHub for organizing pending tasks (as GitHub “Issues”) into Milestones, recording documentation on a GitHub Wiki, and inviting users to enter requests through new Issues.
Professor Tenen’s suggestions proved invaluable, and so did forming a relationship with him: he offered to continue meeting with the DH Box team weekly, to present a workshop at CUNY on the technologies he suggested for us, and to help start our documentation based on his own.
The DH Box team is very excited to dive into our new implementation strategy and to work through how maintenance and community building will be executed.
A huge shoutout and special thank you to Professor Dennis Tenen!
The DH Box team has been working hard on defining the scope for DH Box and setting up our project plan. We’ve started using Asana as our project management tool. As the project manager, I’m really enjoying Asana. It’s flexible, easy, and it allows our team to collaborate on building the plan as we go. It’s also very nice that it tracks everything and sends out plenty of reminders!
Our scope has been narrowing as we refine our concept of DH Box. We are thinking more about who will use DH Box and about the best way to make it a valuable toolkit for introductory students in digital humanities classes.
Pedagogy is a key part of the digital humanities at the CUNY Graduate Center and the Praxis Network. Our focus for the first phase of development will be text analysis and topic modeling, including key tools such as MALLET, the Natural Language Toolkit (NLTK), and the Stanford Named Entity Recognizer. We are going to build an interactive textbook using IPython Notebook. The textbook will be bundled with the DH Box install scripts, and it will help orient students to the tools through interactive code execution. We have also thought more about our platform and what would be most useful for our users. We are going to make DH Box available for download not only for Raspberry Pi but also for Linux, Mac, and hopefully Windows.
As we have narrowed down our scope, we are also discovering a much wider range of connections to the DH community. Our professor, Matt Gold, has put us in touch with his colleague Dennis Tenen. GC Digital Fellow Micki Kaufman suggested we check out Ian Milligan’s work and we’ve found amazing stuff in Big Digital History: Exploring Big Data through a Historian’s Macroscope, a co-written manuscript by Shawn Graham, Ian Milligan, and Scott Weingart. My library colleague Roxanne Shirazi, who edits the dh+lib blog, suggested we check out an idea for a project called DH creator stick which George Williams proposed at THATCamp Piedmont 2012 (see also a blog post by Mark Sample).
We’re amazed by the range of rich ideas we are beginning to discover. We hope to reach out to the DH community and ask for advice and feedback as DH Box takes shape.