Here at PressForward, the process of developing software for scholarly communication began with a question: How can a person make staying current with the work and conversations in their field a manageable task and still have the energy to participate in those conversations? An academic’s work requires tracking the contents of journals and book lists from relevant publishers, being aware of conference presentations and published editorials and thought pieces, and keeping up to date on discipline-specific methodology or data. But as the internet has made publishing more accessible through blogs, repositories, digitized archives and online journals and publications, the job of keeping up to date has become more than one person can handle.
This is the challenge the PressForward Initiative faced in 2011. At that time, an increasing number of scholarly conversations and materials were being shared on the open web. In emerging fields, such as digital humanities, a growing number of people were sharing and conducting their work in public. In topical or interdisciplinary fields like microbiology of the built environment, scholars were trying to encourage awareness of, and connections between, research in one field and another. Given this environment of abundance, in which scholarship had numerous, diverse outlets, PressForward began to search for a way to help scholars keep up with work within this changing landscape.
We knew that we needed an easy way to aggregate content from the web and share select pieces, but suspected that none existed yet. Our goal was to allow a scholar’s effort to be concentrated on the actual reading, considering, and selecting of content for distribution, rather than the process of searching it out. By facilitating the aggregation, curation, and distribution workflow, we hoped to enable research communities to surface and highlight valuable materials from the open web.
With generous funding from the Alfred P. Sloan Foundation, a four-person editorial group consisting of two faculty members and two graduate students at the Roy Rosenzweig Center for History and New Media began with some critical elements:
- a list of sources (RSS and Atom feeds, along with Twitter users)
- an existing publication using the WordPress content management system (Digital Humanities Now)
- available labor (PressForward staff and GRAs)
The tool we produced reflects years of development, scholarship, and conversation among faculty, graduate students, and readers, as well as feedback from editors-at-large. While the end result is specific and carefully considered, the process itself is emblematic of the amount of time and care needed to prototype functional, flexible software. This post details that process.
Phase 1: Experiment with Existing Tools and Define Scope
As with all good projects, we started by looking at what others were doing. We updated and expanded the formal “environmental scan” of the grant proposal, later released as a white paper on the locations of gray literature on the open web. We also took a close look at the curation tools and processes that already existed, how others created publications using them, and what was useful about them. We looked closely at the available technology and asked ourselves, “Can we make modifications? Or do we need to create something entirely new?”
We then embarked on a three-year, four-part process that included simultaneously exploring and developing:
- technical functionalities
- editorial workflows
- publishing processes
- community involvement
Ultimately, we wanted to create a technology and a methodology that reflected our scholarly values. As a result, we pursued an editorial process that included consideration, collaborative discussion, contextualization, and attribution in order to experiment with the technical components of curating and the social element of community nominations. Rather than jumping to an up/down voting or crowdsourcing system like those behind Slashdot or Reddit, we modified a familiar process and retained a role for a content editor. At the time, this approach more closely resembled accepted scholarly communication practices and was more likely to be accepted by scholars and practitioners.
PressForward was able to start working toward these goals by experimenting further with an already experimental publication: Digital Humanities Now (DHNow). DHNow was a great case study because it already existed as an aggregated and curated publication and had an audience that generally was open to experimentation and discovery.
In October 2011, we began to build an editorial workflow dependent upon existing tools in order to gain enough experience with the current options to define the scope of our development work and create a wish list for functionality and features. This involved using Google Reader and Google Plus to develop a procedure for locating and choosing content for DHNow.
The workflow for our four-person editorial team involved the following steps:
- aggregate RSS subscriptions in Google Reader
- read, comment, and nominate materials using Google Reader commenting and starring functionality
- create posts of selected content in WordPress manually using the “Press This” bookmarklet
- modify and prepare content for distribution on the front end within the WordPress Posts dashboard
We practiced this workflow and discussed selection criteria for several weeks prior to our public re-launch. However, the day before we re-launched DHNow, Google changed its product (of course). So given the change in the technology, we adjusted our methodology so that the editorial group would:
- read RSS feeds independently in Google Reader
- nominate content by sending posts to Google Plus
- discuss potential content in Google Plus using the “comment” feature
- create posts of selected content in WordPress manually using the “Press This” bookmarklet
- modify and prepare content for distribution on the front end within the WordPress Posts dashboard
This process, we knew, could never be a long-term solution, because it relied on a third-party service, which could (and did!) change or disappear. It wasn’t the ideal solution for other reasons, too. The separation of content review and discussion from the platform of publication required a lot of separate steps. In addition, a lot of manual work was required to recreate the content in our WordPress installation for distribution on our site. You can see the extensive process detailed below:
For a point of comparison, at the same time we were using Google products in DHNow, we installed the open source feed reader Tiny Tiny RSS (TTRSS) for a separate prototype publication, Global Perspectives on Digital History, on our server. TTRSS behaved much like Google Reader, except the user interface was less streamlined, and it was not as fast because it was hosted on our server rather than Google’s massive servers.
The workflow used by the four editors of GPDH was similar to the Google Reader workflow in DHNow, with the starring and sharing of content occurring within the feed reader interface. There was no easy way to have a conversation about potential content, however, so that had to take place on a platform other than TTRSS and WordPress.
After several months of consistent publication (and the breakneck daily publication speed of DHNow), we were ready to have members of the community contribute to the review and nomination process. We did this for two reasons: we wanted to build a publication that reflected the interests of an ever-expanding community of practice, and we wanted to develop a replicable model for collaboratively edited publications. We invited volunteers (called editors-at-large) to help influence the materials considered and selected for distribution, and also to assist with the workload of reviewing over 1,000 posts per week.
We designed a protocol for volunteers to subscribe to our aggregated content and share their nominations through Google Reader. When Google removed the ability to share posts from Google Reader, we incorporated a hacked script to enable the sharing to continue. This second adjustment to our process forced by changes in Google services was even more proof that reliance on an external service provider could not be a long-term solution.
Phase 2: Prepare Requirements and Development Plan
By June 2012, we had created a wish list for the ideal solution to the challenge of aggregating, curating, and disseminating web content. Our wish list included:
- one login for the aggregation, discussion, selection, and distribution process
- a way to collect text, images, and video content from the web (through feeds and a bookmarklet)
- a way to retain and display the original source and attribution information of the collected content
- an easy-to-use interface and comfortable reading environment
- a reliable, stable, modular, and controllable platform
Ultimately, we determined that having one platform to host the aggregation, discussion, selection, and distribution would be the best approach. We could not adjust Google services, and although TTRSS was open source, we decided it was not the right software, nor was it a useful starting point to modify. We had confirmed that we wanted to commit to WordPress as our distribution platform, because its widespread use and committed user community suggested long-term sustainability, and its plugin architecture made it easy to modify.
Thus, in summer 2012 we turned to an investigation of how we could integrate a feed reader into WordPress. WordPress already had a basic feed reader built into the system; however, it had neither the capacity nor the interface for sustained editorial work. We then had to figure out: Could we modify the core code? Or could we take that functionality and create our own plugin?
Based on our experience publishing DHNow and GPDH, our requirements of any tool we created included the following features:
- located in one single platform
- aggregates content from RSS/Atom feeds
- adds web content through bookmarklet
- includes images, text, video, etc.
- displays a comfortable reading environment
- offers a way to comment
- supports a way to hold for review
- retains post content through each step
- allows editing of content prior to appearance on front end
- retains full attribution in metadata and text of post
- logs and displays the number of items aggregated and nominations made
- modular structure allows for future improvements
We also wanted this system to be self-hosted and contained (rather than a black box), and able to run without overloading or slowing down a server or WordPress install.
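To make the aggregation and attribution requirements concrete, here is a minimal sketch in Python. It is an illustration only, not the plugin's code (the plugin itself runs inside WordPress). The names `AggregatedItem`, `parse_rss`, and `attribution_line`, along with the sample feed and its URLs, are hypothetical, and the sketch assumes a simple RSS 2.0 feed that uses the plain `author` element.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample feed, invented for this illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Scholar Blog</title>
    <link>https://example.org/blog</link>
    <item>
      <title>Notes on Text Mining</title>
      <link>https://example.org/blog/text-mining</link>
      <author>A. Researcher</author>
      <description>A short post about text mining.</description>
    </item>
    <item>
      <title>Conference Recap</title>
      <link>https://example.org/blog/recap</link>
      <description>What we learned this year.</description>
    </item>
  </channel>
</rss>"""


@dataclass
class AggregatedItem:
    """One collected feed item, carrying its attribution data with it."""
    title: str
    link: str          # URL of the original post
    author: str        # empty string when the feed names no author
    source_title: str  # title of the feed the item came from
    content: str


def parse_rss(xml_text: str) -> list[AggregatedItem]:
    """Collect every <item> in an RSS 2.0 document, keeping source details."""
    channel = ET.fromstring(xml_text).find("channel")
    if channel is None:
        return []
    source_title = channel.findtext("title", default="")
    return [
        AggregatedItem(
            title=node.findtext("title", default=""),
            link=node.findtext("link", default=""),
            author=node.findtext("author", default=""),
            source_title=source_title,
            content=node.findtext("description", default=""),
        )
        for node in channel.findall("item")
    ]


def attribution_line(item: AggregatedItem) -> str:
    """Build the credit text that would travel with the item's content."""
    author = item.author or "unknown author"
    return f"Originally published by {author} at {item.source_title}: {item.link}"


if __name__ == "__main__":
    for item in parse_rss(SAMPLE_FEED):
        print(item.title)
        print("  " + attribution_line(item))
```

Keeping the source link, author, and feed title on the item itself, rather than looking them up later, is one simple way to meet the "retains full attribution" requirement at every later step.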
After preparing our wish list and requirements, we drafted a plan for development. We asked WordPress development experts Boone Gorges (the developer of BuddyPress) and Aram Zucker-Scharff (a journalism and WordPress expert) for feedback. They responded positively, and began development in August 2012, twelve months after our initial research began. The code was organized and delivered through GitHub. The editorial group and developers tracked the progress of the plugin and options for future development in regular meetings.
As the developers worked on information architecture and basic functionality, the editorial group began to work on a potential user interface. We wrote a UX narrative and created basic wireframes, using whiteboards and pieces of paper prior to the creation of formal documents. We asked web designer Jeremy Boggs to help style the PressForward user interface, which relied on an easily modified version of Twitter Bootstrap. With the functionality under construction, and a UI in development, we continued to publish DHNow with the hacked-together system throughout the winter.
Phase 3: Beta Testing, Using, Refining, and Documenting
By February 2013 we were able to test the plugin in sandboxes and provide bug reports. Aram Zucker-Scharff fine-tuned the plugin’s functionality, including adding a modification that improved the system for feed retrieval. At this time we also enhanced the testing to focus on reliability and usability, knowing that we needed the first and wanted the second. In addition, we considered the visual aspects of the plugin, including display options and the placement of icons.
In April and May 2013 we prepared documentation and developed a few more features for a public beta release. Fortunately, we released in mid-June 2013, two weeks before Google Reader closed. We had successfully transitioned from a third-party service just in the nick of time!
The plugin beta included the major functionalities that would be included in the public release the following year:
- aggregation of RSS feeds
- bookmarklet to collect web content
- readability integrated for a comfortable reading environment
- commenting enabled
- starring, sorting, and archiving
- nomination process to mark potential content
- under review space to separate content under consideration
- send to draft functionality
- formatting of drafted content prior to publication
- auto-redirect back to original content
- RSS out to expose all the content
- metadata retention
- exposure of custom fields possible through theme modification
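The editorial path implied by this list (nominate, hold under review, send to draft, with metadata retained throughout) can be pictured as a small state model. The Python sketch below is a conceptual illustration, not the plugin's implementation. The stage names, the `Item` class, its methods, and the shape of the draft record are all assumptions made for the example, including the rule that only nominated items can be held for review.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Hypothetical stages an aggregated item passes through."""
    ALL_CONTENT = "all content"
    NOMINATED = "nominated"
    UNDER_REVIEW = "under review"
    DRAFT = "draft"


@dataclass
class Item:
    title: str
    source_url: str  # original location, retained through every stage
    stage: Stage = Stage.ALL_CONTENT
    nominators: set = field(default_factory=set)
    comments: list = field(default_factory=list)

    def nominate(self, editor: str) -> None:
        """Record a nomination; the set counts each editor only once."""
        self.nominators.add(editor)
        if self.stage is Stage.ALL_CONTENT:
            self.stage = Stage.NOMINATED

    def comment(self, editor: str, text: str) -> None:
        """Attach an editorial comment to the item."""
        self.comments.append((editor, text))

    def hold_for_review(self) -> None:
        """Move a nominated item into the separate under-review space."""
        if self.stage is not Stage.NOMINATED:
            raise ValueError("only nominated items can be held for review")
        self.stage = Stage.UNDER_REVIEW

    def send_to_draft(self) -> dict:
        """Produce a draft record that keeps the attribution metadata."""
        if self.stage not in (Stage.NOMINATED, Stage.UNDER_REVIEW):
            raise ValueError("item must be nominated before drafting")
        self.stage = Stage.DRAFT
        return {
            "post_title": self.title,
            "post_status": "draft",
            "meta": {
                "source_url": self.source_url,
                "nomination_count": len(self.nominators),
            },
        }


if __name__ == "__main__":
    item = Item("Notes on Text Mining", "https://example.org/blog/text-mining")
    item.nominate("editor_a")
    item.nominate("editor_b")
    item.nominate("editor_a")  # repeat nomination, not counted again
    item.hold_for_review()
    item.comment("editor_b", "Good fit for this week.")
    print(item.send_to_draft())
```

Modeling the stages explicitly keeps the content, its source URL, and the nomination count together from collection through drafting, which mirrors the "metadata retention" item in the list.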
Incorporating the plugin in the DHNow publication meant that we had to adjust our editorial workflow, too. Prior to the release, we had developed and confirmed the new processes in our sandbox installations. We also were able to assist a few collaborators who were beta testers in their own PressForward publications. Using the plugin meant that we were responsible for creating WordPress user accounts for our guest editors, who continued to increase in numbers. On the plus side, this meant that we also had a lot more beta testers who could report problems and add feature requests.
After the beta release, development continued with an eye toward stability and features to enhance usability. In early 2014 we began a rigorous testing process across OS and browser platforms, single and multi-site installations, and varying user roles. Documentation of the plugin code continued, and we created an extensive user manual.
Just before release, we prepared screenshots of the final UI, finalized documentation, and worked on publicity materials such as logos, pamphlets, and, of course, stickers!
Phase 4: Post-release Bug Fixes and Re-evaluation
As always seems to be the case, our initial development plan included many features that went beyond what was possible in a first release. The creation of the feed aggregator was more complicated and more difficult than expected. Moreover, because we were creating publications at the same time, we found ourselves considering many useful improvements to WordPress that would streamline our process, such as a plugin to manage and adjust the roles for the large numbers of users we had on the site.
Looking forward, we have a number of items on our development wishlist, including the ability to aggregate content beyond RSS/Atom (e.g. through APIs, OAI-PMH, and other repository services) and easier exposure of content and action metrics. We’ll work with our new pilot partners to further develop the plugin to meet the needs of diverse users, from individuals to multi-national organizations.
As we wrap up the first PressForward grant, we are very proud of the creation and release of a documented and functioning plugin that is modifiable and usable by anyone. If you’d like to contribute, check out our GitHub page. If you’re new to plugin development, we have a set of Starter Issues that are good entry points.
You will see that we initially thought we would create input connectors from services, and have a directory to add subscriptions. In the end, the basic creation was all that was feasible with the resources available. However, users can still find their own services and create RSS subscriptions out of them.