Open@VT

Open Access, Open Data, and Open Educational Resources

Author Archives: Philip Young

A New Issue of Virginia Libraries on “Exploring Openness”

Virginia Libraries (cover design by Brian Craig)

Virginia Libraries, the journal of the Virginia Library Association, has recently undergone some significant changes. Formerly a non-peer reviewed quarterly, it’s now an annual peer-reviewed volume, with a first issue on the theme “Exploring Openness” (full disclosure: I was a peer reviewer for two articles submitted for this issue, and fellow blogger Anita Walz authored an article on OER). A broad range of open-related topics is addressed, but for the sake of brevity I’d like to highlight two standout articles (please do check out the full table of contents).

The hype over MOOCs may be past, but I think dismissing them completely is premature. In Just How Open? Evaluating the “Openness” of Course Materials in Massive Open Online Courses (PDF), Gene R. Springs (The Ohio State University) examines the status of texts assigned in 95 courses offered by Coursera or edX. Of 49 courses listing a textbook, in 20 the textbook was freely available; of 44 courses listing or linking to non-textbook readings, 31 linked to or embedded only freely available resources. It’s great to have this quantitative data on MOOC openness. There’s much more data in the article, which is a welcome contribution to the MOOC literature.
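To put those counts in proportion, here is a minimal Python sketch that converts them to percentages. The counts are the ones quoted above; the helper name `share` is only an illustration and does not come from the article.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Courses listing a textbook in which the textbook was freely available.
textbook_share = share(20, 49)

# Courses with non-textbook readings that used only freely available resources.
readings_share = share(31, 44)

print(f"Freely available textbooks: {textbook_share}%")
print(f"Only freely available readings: {readings_share}%")
```

On these figures, roughly two in five textbook-listing courses used a freely available textbook, while about seven in ten courses with other readings relied only on freely available resources.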

The second standout article in this issue is Contextualizing Copyright: Fostering Students’ Understanding of Their Rights and Responsibilities as Content Creators (PDF) by Molly Keener (Wake Forest University). It’s important that students know about the bundle of rights known as copyright both as consumers and creators in the knowledge ecosystem. Keener’s information literacy instruction employs scenarios relevant to students (included as an appendix) and incorporates copyright-related aspects of popular culture. Clearly such instruction is needed:

Most students are unaware that they own copyrights, or that simply because a photograph is free to access online does not mean that it is free to be reused.

Every university should have this kind of instruction to help students understand the environment in which information is created and used. Keener’s article is highly recommended.

While there’s almost everything to like about the new direction Virginia Libraries is taking, one oversight by the editorial board should be pointed out. At the bottom of the table of contents (PDF) the journal states the following:

The Virginia Library Association firmly espouses open access principles and believes that authors should retain full copyrights of their work. The agreement between Virginia Libraries and the author is license to publish. The author retains copyright and thus is free to post the article on an institutional or personal web page subsequent to publication in Virginia Libraries. All material in the journal may be photocopied for the noncommercial purpose of educational advancement.

It’s great that authors can retain copyright, but a journal cannot “firmly espouse open access principles” without openly licensing the content. Peter Suber succinctly defined OA as “digital, online, free of charge, and free of most copyright and licensing restrictions.” This means content should not just be available but also openly licensed (many get the first part but not the second). Leading OA journals have published thousands of articles under a Creative Commons Attribution (CC BY) license, which gives re-use permissions in advance. It’s also the license for this blog. Librarians should be more aware than most about copyright restrictions for sharing research, and Anita’s article in this issue gives a full list of Creative Commons licenses. Hopefully the editorial board will make Virginia Libraries fully OA by licensing future issues CC BY.

The co-editors of this special issue, Candice Benjes Small and Rebecca K. Miller, deserve praise for its quality and for helping the journal begin a new direction. Virginia Libraries is now seeking a volunteer to be the new editor (see the position description). Interested applicants should send a cover letter and résumé to Suzy Szasz Palmer at palmerss@longwood.edu by July 24, 2015.

A Response to Jeffrey Beall’s Critique of Open Access

I recently became a member of the American Association of University Professors (AAUP) and today was dismayed to see Jeffrey Beall’s article What the Open-Access Movement Doesn’t Want You to Know in the latest issue of its journal, Academe. (I joined because AAUP has been helpful in advising Virginia Tech’s Faculty Senate, of which I am a member, on increasing its role in university governance.)

For those who may not know, Jeffrey Beall is a librarian at the University of Colorado-Denver, and through his blog Scholarly Open Access exposes academic “predatory publishers” (pay-to-publish scams that perform little to no peer review) and other sketchy doings in academic publishing. While this is a tremendous service to the scholarly community, he has unfairly blamed these problems on open access as a whole. It became apparent just how off the rails Beall had gone when he published The Open-Access Movement is Not Really about Open Access in the journal TripleC (in the non-peer reviewed section; also see Michael Eisen’s response, Beall’s Litter). If you enjoy right-wing nuttiness (yes, George Soros is involved) you really should read it.

Beall’s critiques of open access are not always as factual as they could be, so as an open access advocate I am concerned when his polemics are presented to an academic audience that may not know all the facts. So below is my response to selections from his article:

The open-access movement has been around for more than a dozen years

Actually it has been around longer than that: Stevan Harnad made his “subversive proposal” in 1994 on a Virginia Tech email list.

The open-access movement is a coalition that aims to bring down the traditional scholarly publishing industry and replace it with voluntarism and server space subsidized by academic libraries and other nonprofits. It is concerned more with the destruction of existing institutions than with the construction of new and better ones.

This is quite an evidence-free paragraph. Where is the coalition, and where is the goal stated of bringing down the traditional scholarly publishing industry? Who has said all we need is voluntarism and server space? No one I know of.

The movement uses argumentum ad populum, stating only the advantages of providing free access to research and failing to point out the drawbacks (predatory publishers, fees charged to authors, and low-quality articles).

There is frequent discussion of these problems. Credit Beall for bringing attention to predatory publishers, but it’s less of a problem than he makes it out to be (and one seemingly devoid of data: Beall would strengthen his claims if he could document the number of authors victimized and/or the amount of money lost). A majority of open access journals do not charge authors, and those that do usually have waivers. There are also plenty of high-quality open access journals like PLOS Biology, generally considered tops in its field. And we know that “low-quality articles” could never appear in a subscription journal.

It’s hard to argue against “free”—and free access is the chief selling point of open-access publishing…

Actually open access is not just about “free.” OA means free as in cost (to the reader) but also free as in freedom (open licensing). As a librarian, Beall should know the barriers that copyright presents in the use of scholarship by libraries and researchers. OA advocates know that scholarly publishing does cost something, and are actively working on alternatives to the broken subscription model.

In the so-called gold open-access model, authors are charged a fee, called the “article processing charge,” upon acceptance of a manuscript.

This is simply wrong. Gold open access describes OA journals that publish peer-reviewed articles. A majority of them do not have an article processing charge (APC). APCs are just one model of providing open access. It’s true that predatory publishing is based on this model as a money-making scam. This is why authors need to know something about the journals where they submit articles.

Some publishers and journals do not charge fees to researchers and still make their content freely accessible and free to read. These publishers practice platinum open access, which is free to the authors and free to the readers.

“Platinum” open access must be Beall’s invention, because no one else uses this term. Open access journals (“gold” open access) include journals with fees and those without fees.

A third variety of open-access publishing, often labeled as green open access, is based in academic libraries…

Lots of libraries do have repositories, but it’s not accurate to say that all (or even most) archiving is based there. There are plenty of disciplinary repositories, and for-profit ones like Academia.edu.

…the green open-access movement is seeking to convert these repositories into scholarly publishing operations. The long-term goal of green open access is to accustom authors to uploading postprints to repositories in the hope that one day authors will skip scholarly publishers altogether.

Maybe some think this, but I wouldn’t call it widespread. Most scholarly publishing in libraries (that is, journal or monograph publishing) is a separate operation from article archiving. And no one thinks peer review can be skipped, which seems to be an implication here.

Despite sometimes onerous mandates, however, many authors are reluctant to submit their postprints to repositories.

This is unfortunately true, but Beall doesn’t mention that many of the “onerous mandates” were passed unanimously by the same faculty members who must observe them, because they became convinced of the benefits of open access to research.

Moreover, the green open-access model mostly eliminates all the value added that scholarly publishers provide, such as copyediting and long-term digital preservation.

Most OA advocates agree that scholarly publishers provide value; after all, some of them publish OA journals. But the choice of examples is odd. I’m one of many authors who have had the experience of copy editing actually introducing errors into a carefully composed article. And in some cases repositories are a better bet for long-term digital preservation than journals, which can stop publishing without a preservation plan. In short, the value added that is claimed by many publishers is coming under question, and rightfully so in my view.

The low quality of the work often published under the gold and green open-access models provides startling evidence of the value of high-quality scholarly publishing.

This makes little sense. An archived (“green”) article can be of the highest quality and may have been published in one of the prestigious journals Beall venerates. And again, there are many well regarded open access journals.

When authors become the customers in scholarly communication, those with the least funds are effectively prevented from participating; there is a bias against the underfunded.

Many OA advocates have identified the same problem with APCs, especially for authors from the developing world. But many of these journals have waivers, most OA journals don’t have charges, and new models are being developed that subsidize journals without charge to either author or reader. It’s not accurate to portray fee-based publishing as the only open access model.

Subscription journals have never discriminated on the basis of an author’s ability to pay an article-processing charge.

No, they just discriminate against libraries.

Gold open access devalues the role of the consumer in scholarly research… Open access is making readers secondary players in the scholarly communication process.

This is just laughable. Yes, we should feel sorry for all those readers who can freely access all the peer-reviewed research that their tax dollars likely paid for.

In the next section of his article, “Questioning Peer Review and Impact Factors,” Beall mostly critiques the doings of predatory publishers, which no one really disputes. But in criticizing predatory publishers (again unfairly extending his critique to all open access publishing) he gives subscription publishing a free pass. If you don’t think bad information has appeared in prestigious peer-reviewed subscription journals, try searching “autism and immunization” or “arsenic life.” Beall’s reverence for the journal impact factor isn’t supported by any facts (see my post Removing the Journal Impact Factor from Faculty Evaluation). So predatory publishers using fake journal impact factors shouldn’t be a concern: it’s a bogus metric to start with. Moreover, Beall fails to acknowledge that open peer review, in whatever form, would largely solve the problem of predatory publishing. If a journal claims to do peer review, then let’s see it!

If you’re an author from a Western country, the novelty and significance of your research findings are secondary to your ability to pay an article-processing charge and get your article in print.

Again, waivers are available and the majority of OA journals don’t have fees. It’s interesting that Beall uses words like “novelty” and “significance” here, as if unaware of real problems in peer review caused by these assessments (which are not attributable to predatory publishing).

Open-access advocates like to invoke the supposed lack of access to research in underdeveloped countries. But these same advocates fail to mention that numerous programs exist that provide free access to research, such as Research4Life and the World Health Organization’s Health Internetwork Access to Research Initiative. Open access actually silences researchers in developing and middle-income countries, who often cannot afford the author fees required to publish in gold open-access journals.

Once again, OA is not all about fees. It’s also odd that so many people from the developing world are huge open access advocates. Beall fails to mention that the large publishing companies have a lot of control over which countries get access and which do not. If they decide that India, for example, can afford to pay, then they don’t provide access. Wider open access would make these programs unnecessary. The main thing silencing researchers in developing countries is basic access to research, which inhibits their own research efforts.

…the top open-access journals will be the ones that are able to command the highest article-processing charges from authors. The more prestigious the journal, the more you’ll have to pay.

There may be some truth to this, and it’s a concern I share. However, APCs may be subject to price competition (an odd omission from someone who is so market-oriented). Beall has identified the biggest problem to my mind, which is journal prestige. Prestige means that mostly we are paying for lots of articles to be rejected, which are then published elsewhere. Academia needs to determine whether continuing to do this is very smart, and whether other sources of research quality or impact might be available.

The era of merit in scholarly publishing is ending; the era of money has begun.

Another laugher. Beall must be unaware of his own library’s collections budget, or the 30-40% annual profit made by Elsevier, Wiley, Informa, etc. If he is concerned about merit (and especially predatory publishing), he ought to be advocating for some form of open peer review.

Most open-access journals compel authors to sign away intellectual property rights upon publication, requiring that their content be released under the terms of a very loose Creative Commons license.

As opposed to subscription journals, most of which compel authors to transfer their copyright? Many open access journals allow authors to retain copyright.

Under this license, others can republish your work—even for profit—without asking for permission. They can create translations and adaptations, and they can reprint your work wherever they want, including in places that might offend you.

Wouldn’t it be awful to have your work translated or reprinted? I mean, no one actually wants to disseminate their work, do they? This is mostly scare-mongering about things that might happen 0.001% of the time. And because of the ever-so-slight chance someone might make money from your work, or it might be posted to a site you don’t agree with, we shouldn’t share research? This blog is licensed CC BY, and I don’t care if either of those things happens. What’s not logical is for these largely unfounded fears to lead us back to paywalls and all-rights-reserved copyright.

Scholarly open-access publishing has made many tens of thousands of scholarly articles freely available, but more information is not necessarily better information.

I don’t think anyone has ever claimed this. Even if there were only subscription journals, there would be new journals and more articles published.

Predatory journals threaten to bring down the whole cumulative system of scholarly communication…

I think there may be some exaggeration here.

In the long term, the open-access movement will be seen as an ephemeral social cause that tried and failed to topple an industry.

Open access is not looking very ephemeral at the moment. The “industry” seems to be trying to find ways to accommodate it so they don’t go out of business. Open access advocates are not necessarily against the “industry,” just the broken subscription/paywall model they use. Indeed, traditional publishers like Elsevier and Wiley are profiting handsomely from hybrid open access, and starting OA journals or converting existing ones to open access.

Be wary of predatory publishers…

Finally, something we can agree on!

Book Review: Issues in Open Research Data

Moore, Samuel A. (ed.), Issues in Open Research Data (London: Ubiquity Press, 2014).

Bringing together contributed chapters on a wide variety of topics, Issues in Open Research Data is a highly informative volume of great current interest. It’s also an open access book, available to read or download online and released under a CC BY license. Three of the nine chapters have been previously published, but benefit from inclusion here. In the interest of full disclosure, I’m listed as a book supporter (through unglue.it) in the initial pages.

In his Editor’s Introduction, Samuel A. Moore introduces the Panton Principles for data sharing, inspired by the idea that “sharing data is simply better for science.” Moore believes each principle builds on the previous one:

  1. When publishing data, make an explicit and robust statement of your wishes.
  2. Use a recognized waiver or license that is appropriate for data.
  3. If you want your data to be effectively used and added to by others, it should be open as defined by the Open Knowledge/Data Definition— in particular, non-commercial and other restrictive clauses should not be used.
  4. Explicit dedication of data underlying published science into the public domain via PDDL or CC0 is strongly recommended and ensures compliance with both the Science Commons Protocol for Implementing Open Access Data and the Open Knowledge/Data Definition.
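The principles above can be read as a rough check on the license a dataset declares. What follows is a minimal sketch, not an official tool. It assumes SPDX-style license identifiers. The function name `panton_check`, the short identifier lists, and the choice to treat no-derivatives (ND) clauses as "other restrictive clauses" are illustrative assumptions rather than part of the Panton Principles.

```python
from typing import Optional

# Public domain dedications named in principle 4 (SPDX-style identifiers).
PUBLIC_DOMAIN_DEDICATIONS = {"CC0-1.0", "PDDL-1.0"}

# Assumption: NC (non-commercial) and ND (no-derivatives) markers signal
# the kind of restrictive clause that principle 3 advises against.
RESTRICTIVE_MARKERS = ("-NC", "-ND")


def panton_check(license_id: Optional[str]) -> str:
    """Give a rough reading of a declared license against the principles."""
    normalized = (license_id or "").strip().upper()
    if not normalized:
        return "no explicit statement (falls short of principle 1)"
    if normalized in PUBLIC_DOMAIN_DEDICATIONS:
        return "public domain dedication (recommended by principle 4)"
    if any(marker in normalized for marker in RESTRICTIVE_MARKERS):
        return "restrictive clause (discouraged by principle 3)"
    return "explicit license; confirm it is appropriate for data (principle 2)"


for declared in ("CC0-1.0", "CC-BY-NC-4.0", "CC-BY-4.0", None):
    print(declared, "->", panton_check(declared))
```

A string match like this only flags obvious cases. Whether a given license actually suits a particular dataset still needs a human reading of the license text.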

In “Open Content Mining” Peter Murray-Rust, Jennifer C. Molloy and Diane Cabell make a number of important points regarding text and data mining (TDM). Both publisher restrictions and law (recently liberalized in the UK) can block TDM. And publisher contracts with libraries, often made under non-disclosure agreements, can override copyright and database rights. This chapter also includes a useful table of the TDM restrictions of major journal publishers. (Those interested in exploring further may want to check out ContentMine.)

“Data sharing in a humanitarian organization: the experience of Médecins Sans Frontières” by Unni Karunakara covers the development of MSF’s data sharing policy, adopted in 2012 (its research repository was established in 2008). MSF’s overriding imperative was to ensure that patients were not harmed due to political or ethnic strife.

Sarah Callaghan makes a number of interesting points in her chapter “Open Data in the Earth and Climate Sciences.” Because much of earth science data is observational, it is not reproducible. “Climategate,” the exposure of researcher emails in 2009, has helped drive the field toward openness. However, there remain several barriers. The highly competitive research environment causes researchers to hoard data, though funder policies on open data are changing this. Where data has commercial value, non-disclosure agreements can come into play. Callaghan notes the paradox that putting restrictions on collaborative spaces makes sharing more likely (the Open Science Framework is a good example). She also shares a case in which an article based on open data was published three years before the researchers who produced the data published. It is becoming likely that funders will increasingly monitor data use and require acknowledgement of data sources if used in a publication. Data papers (short articles describing a dataset and the details of collection, processing, and software) may encourage open data. Researchers are more likely to deposit data if given credit through a data journal. However, data journals need to certify data hosts and provide guidance on how to peer review a dataset.

In “Open Minded Psychology” Wouter van den Bos, Mirjam A. Jenny, and Dirk U. Wulff share a discouraging statistic: 73% of corresponding authors failed to share data from published papers on request. A significant barrier is that providing data means substantial work. Usability can be enhanced by avoiding proprietary software and following standards for structuring data sets (an example of the latter is OpenfMRI). The authors discuss privacy issues as well, which in the case of fMRI includes a 3D image of the participant’s face. The value of open data is that data sets can be combined, used to address new questions, analyzed with novel statistical methods, or used as an independent replication data set. The authors conclude:

Open science is simply more efficient science; it will speed up discovery and our understanding of the world.

Ross Mounce’s chapter “Open Data and Palaeontology” is interesting for its examination of specific data portals such as the Paleobiology Database, focusing in particular on the licensing of each. He advocates open licenses such as the CC0 license, and argues against author choice in licensing, pointing out that it creates complexity and results in data sharing compatibility problems. And even though articles with data are cited more often, Mounce points out that traditionally indexing occurs only for the main paper, not supplementary files where data usually resides.

Probably the most thought-provoking yet least data-focused chapter is “The Need to Humanize Open Science” by Eric Kansa of Open Context, an open data publishing venue for archaeology and related fields. Starting with open data but mostly about the interaction of neoliberal policies and openness, the chapter deserves a more extensive analysis than I can give here, but those interested in the context against which openness struggles may want to read his blog post on the subject, in addition to this chapter.

Other chapters cover the role of open data in health care, drug discovery, and economics. Common themes include:

  • encouraging the adoption of open data practices and the need for incentives
  • the importance of licensing data as openly as possible
  • the challenges of anonymization of personal data
  • an emphasis on the usability of open data

As someone without a strong background in data (open or not), I learned a great deal from this book, and highly recommend it as an introduction to a range of open data issues.

Open Data Day/CodeAcross Event Recap

Blacksburg’s first celebration of Open Data Day and CodeAcross was organized by Code for NRV, our local Code for America brigade, and the University Libraries, which hosted the event in Newman Library’s Multipurpose Room. The event was originally scheduled for Saturday, February 21 (the official Open Data Day observed in hundreds of cities around the world), but rapidly accumulating snow forced us to postpone until Sunday. As it turned out, a water leak closed the library around mid-day Saturday, so things worked out for the best. (Our apologies to registrants for the sudden change in plans.)

The first event of the morning was a mapping roundtable led by Peter Sforza, director of the Center for Geospatial Information Technology at Virginia Tech. In addition to looking at a lot of cool maps, we identified three potential areas for collaboration:

  • 3D Blacksburg – an effort to develop a common, shared 3D spatial reference model for Blacksburg and the New River Valley.
  • Contributing more authoritative data to OpenStreetMap for Blacksburg and Virginia by working with GeoGig.
  • Opening data that CGIT compiles for projects and research, for example crash data from the Virginia Department of Transportation.

Peter Sforza Leads the Mapping Roundtable

For the journalism roundtable, we were joined by Scott Chandler, Design/Production Adviser for the Educational Media Company at Virginia Tech, and Cameron Austin, former editor of the Collegiate Times. One problem the CT has is finding/keeping programmers to help with data, such as their academic salaries database. Code for NRV will try to help with recruitment. A database of textbook costs was identified as a possibility to work on that would be of particular interest to students.

Blacksburg town council member Michael Sutphin joined us for the public policy roundtable, which included interesting discussions of town planning notifications and ways to encourage citizen engagement (such as the underutilized site Speak Up Blacksburg). Some of the project ideas included:

  • Visualizations of the town’s historical budget data that could benefit the public and town officials.
  • Opening the raw data used to create tables and maps in the town’s comprehensive plans.
  • Analysis of emails to and from local government officials to create visualizations of the most commented on topics in the town, e.g. word clouds and tag lists.

Our hackathon emerged from the morning’s mapping roundtable, so perhaps it’s not surprising that the projects were geographic in nature:

  • One volunteer used the Virginia Restaurant Health Inspection API created by Code for Hampton Roads to create a map of Blacksburg restaurants and their health scores.
  • An architecture student started a project that will use open 3D geospatial data from Virginia Tech to design pathways that are sculpted for the landscape.
  • Researchers from the Virginia Bioinformatics Institute adapted a model used in Ebola research to optimize placement of EMS staging areas during flood emergencies in Hampton Roads, Virginia. The model uses open data sets like the location and elevation of every roadway in Virginia to determine which streets would still be navigable during a flood.
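To illustrate the last idea, here is a minimal Python sketch of how roadway elevation data might be used to flag streets that stay passable at a given water level. It is not the researchers' model. The street names, elevations, water level, and depth threshold are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoadSegment:
    name: str            # hypothetical street name
    elevation_m: float   # lowest point of the segment, in metres above a common datum


def navigable_segments(segments, water_level_m, max_depth_m=0.0):
    """Return segments where standing water would be no deeper than max_depth_m."""
    return [s for s in segments if water_level_m - s.elevation_m <= max_depth_m]


# Hypothetical sample data, for illustration only.
roads = [
    RoadSegment("Harbor Rd", 0.8),
    RoadSegment("Ridge Ave", 3.5),
    RoadSegment("Mill St", 1.9),
]

# Assume a flood level of 2.0 m and that vehicles tolerate up to 0.15 m of water.
passable = navigable_segments(roads, water_level_m=2.0, max_depth_m=0.15)
print([s.name for s in passable])
```

A real analysis would also need road connectivity, since a dry street that can only be reached through flooded ones is of little use for staging EMS vehicles.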

Waldo Jaquith

To kick off our events Friday evening, we were very happy to have Waldo Jaquith speaking on “Open Government Data in Virginia,” prefaced by a brief introduction to Open Data Day/CodeAcross by Ben Schoenfeld, co-leader of the Code for NRV brigade. Waldo Jaquith is the director of the U.S. Open Data Institute, an organization building the capacity of open data and supporting government in that mission. See the video of his talk below.

Thanks to everyone who turned out Friday and/or Sunday!

Thanks to the University Libraries’ Event Capture Service for the video below.

Learn About Open Data at Open Data Day/CodeAcross!

Join us for Blacksburg’s first observance of Open Data Day/CodeAcross, organized by Virginia Tech’s University Libraries and Code for NRV, our local Code for America brigade, this Friday and Saturday, February 20-21, 2015. We will be one of more than 100 Open Data Day and CodeAcross events taking place around the world on February 21. We welcome area residents and local government officials as well as faculty, staff, and students at Virginia Tech to find out how open data can improve our community (coding not required!). Registration is requested to help us with logistics, and for VT faculty, NLI credit is available (look for the sign-in sheet as well).

Waldo Jaquith

Friday, February 20, 2015
5:30pm to 7:00pm
Newman Library Multipurpose Room (first floor)

To kick off our events, we are very pleased to have Waldo Jaquith speaking on “Open Government Data in Virginia,” which will be followed by a brief introduction to Open Data Day/CodeAcross. Waldo Jaquith is the director of the U.S. Open Data Institute, an organization building the capacity of open data and supporting government in that mission. In 2011, in acknowledgement of his open data work, Jaquith was named a “Champion of Change” by the White House and, in 2012, an “OpenGov Champion” by the Sunlight Foundation. He went on to work in open data with the White House Office of Science and Technology Policy. Jaquith, a 2005 Virginia Tech graduate, lives near Charlottesville, Virginia with his wife and son.

Saturday, February 21, 2015
9:30am to 5:00pm (lunch provided)
Newman Library Multipurpose Room (first floor)
Registration requested!

Open Data Day/CodeAcross will offer three tracks for coders and non-coders alike. First, there will be a sequence of one-hour discussion roundtables led by experts on the relationship of open data with mapping (10am), journalism (11am), public policy (1pm), health (2pm), and research (3pm). Second, there will be a mapping project emerging from the mapping roundtable and lasting the rest of the day. Third, for the coders there will be a hackathon using open government data in Virginia. Around 4pm, we will gather together, talk about our projects and what we learned, and plan for the continuation of projects. Attendees may move between these three tracks as they like, or just come for one roundtable. Lunch is provided! While all events are free and open to the public, please register online to help us plan for the roundtables, lunch, and wireless access for those without a Virginia Tech affiliation. If you have questions, please contact me, Philip Young, at pyoung1@vt.edu or 540-231-8845. Hope to see you there! #OpenDataDay #CodeAcross
