Research Policies Committee

Friday, December 8, 2006

10:00-12:00 Noon

4006 Fleming

 

RPC members present: Carl Akerlof, Jerald Bachman, Michael Combi, Mary Haan (chair), Josephine Kasa-Vubu, Angela Kuznia, Cynthia Marcelo, Michelle Sargent (SACUA student support), Kristin Seefeldt, Elizabeth Young 

RPC members absent: Toni Antonucci, Chris Baldwin, David Blair, Usha Pasupuleti, Carol Persad, Kazuhiro Saitou, Qiang Zhu 

OVPR staff present: Lois Brako, Judy Nowack, Jim Shayman, Jacqueline Hoats-Shields (committee staff) 

Guests: Margaret Hedstrom (agenda item 2), James Randolph (agenda items 2 and 3) 

The meeting was called to order at approximately 10:07 a.m. 

1.      Continued discussion of the draft revised Intellectual Property Policy

Judy Nowack explained that OVPR would like RPC to have a chance to react to the policy before it is presented to the Regents for approval later this month. Vice President Forrest will be talking to SACUA about it on Monday, and Mary Haan is also invited to that session. Prof. Haan opened the discussion and various questions were addressed. The differences between the revision and the earlier version of the policy were described. The major change directly affecting faculty members is the lifting of the prior requirement that inventors waive the ordinary royalties through the University when they are also involved with start-up companies that are licensing the invention from the University. It was noted that this change brings University policy in step with our peers. Ms. Nowack said that at the last RPC meeting OVPR heard about the need for additional guidance for graduate students; this will be put in place but does not entail more changes to the policy itself. Prof. Haan asked if the committee was ready to endorse the policy and said a vote was needed.

Jerald Bachman: Motion to endorse the revised Intellectual Property Policy.

Cynthia Marcelo:  Seconded motion.

Vote: All in favor of endorsement. 

2.      Data archiving/data sharing issues

Guests: Margaret Hedstrom, Associate Professor, School of Information; and

James Randolph, Senior Associate Director, Division of Research Development and Administration 

Before the guests arrived, Mary Haan noted that NIH currently requires a data sharing plan (or an acceptable explanation of why data sharing is not feasible or appropriate) to be included in the proposal for projects that receive funding greater than $500,000. It is possible to request funding from NIH for this aspect of a proposal. It was noted that NASA has a data archiving service for each of its science areas with specific data formats and storing rules. In addition, the National Archive of Medicine (NIH) provides data archiving for gene sequencing.

James Randolph arrived.  Mary Haan said the overall question is where do we put our data when we archive it.  There are different levels of commitment and different standards among disciplines. This is not necessarily covered by project funding, though perhaps it could be built in.  A member of the group asked whether study sections look in a detailed way at the data sharing plans in proposals, down to protection of subjects and de-identification of data, speculating that study sections are overburdened and not likely to be able to examine the plans closely.  Overall, the responsibility for doing the archiving appears to be on the investigator. 

Margaret Hedstrom arrived and introduced herself. She has been a faculty member in the School of Information since 1995 and has been involved in data preservation since 1980, including work on how to preserve state government records in electronic form. She has done work on data preservation with the Library of Congress and has been on a National Research Council committee looking at the National Archives. She recently attended an NSF workshop on long-term data archiving. She typically works with data in very heterogeneous sets. She is currently writing a paper on the NIH data sharing policy.

Prof. Hedstrom shared some observations on the present state of data archiving. There is an ongoing question about who is responsible for long-term data curation. There are institutional repositories for the humanities and social sciences, but in the life sciences there are not strong institutional roles for data curation. In the past such raw data was not always kept because it was not necessarily reusable. She said a recent NSF report on long-term data curation may be of interest. The NSF workshop looked at the role of university libraries in this endeavor, but there is no agreement that libraries are the right place for storing all types of data. Another resource may be scholarly associations. The mechanisms available now are quite varied and some disciplines are more advanced. For example, high energy physics is very advanced, but its data sets are relatively simple.

Many funding agencies say investigators must archive data but provide very little guidance and no mechanisms to do it. Furthermore, investigators are not very interested in data archiving and most don’t have the skills to do it. As a last resort some put the data up on their website, but this doesn’t accomplish the goal of making it a resource to be shared or combined. Thus there is a need for a broad data archiving infrastructure, but the question remains of who will fund it and create it. For this reason, putting extra money in one’s research grant for archiving is not effective, since it doesn’t address the larger issue.

A committee member remarked that sometimes data sets are wrong, and the long-accepted standard of publishing one’s data, and having it peer reviewed, is how the scientific community has sifted out work that contains errors. So now there is the concern that raw data sets will be preserved without undergoing peer review and that those containing errors will have an implicit stamp of approval through appearing on a website somewhere. While peer review of a whole data set may be hard to come by, Prof. Hedstrom said established archives do provide quality controls which are specific to disciplines and methodologies. Another important concern is confidentiality of subject data; while any single data set may adequately protect individual identities, combining several data sets might make it possible to figure out identities. This issue has come up already in social science research.

Another committee member noted that we are constrained by the technology available at any given time.  For example, retrieving data in the future may be difficult and costly depending on how formats change.  In addition, the metadata we see as valuable today will seem primitive in the future.  Perhaps making an open-ended commitment to data being available forever is not feasible.  Prof. Hedstrom agreed that it is a naïve assumption that we should save everything forever.  There are issues of decay and redundancy and it is possible very little of what is saved will even be used in the future. 

Prof. Haan asked how many of the grants currently held by investigators at UM have their data archived and, if so, how. Prof. Hedstrom said that as for social science data, only about 10-15% has been archived. She said it is noteworthy that so little has been archived in a field that has long had the tools available.

There are many opportunities for the university to play a leadership role here. We have a lot of experience and expertise in this area. Myron Gutmann of ICPSR and Brian Athey were mentioned. Paul Courant is working with the Library of Congress on the economics of data archiving. The School of Information will likely submit a proposal to NSF for work on long-term data curation. The issue is so large there are many angles from which to approach it. Areas for attention include regulatory compliance, how to develop an infrastructure, and the role of the institution and local experts in providing cutting-edge assistance to faculty. It was noted there are real tensions between IRB requirements and the archiving and reuse of data. Judy Nowack noted that the federal interpretations regarding IRB oversight in sharing human subject data are surely relevant, but are currently in a state of flux.

Committee members agreed faculty will not want a lot of rules and prescription about how to archive their data. Being guided to a range of options would be best, allowing each discipline to determine its own way ultimately.  One of the options that should be made available to investigators is a way to state why it is not possible or practical to archive data in some cases.   

It was remarked that if there is any kind of requirement for data archiving, it is also necessary to have a provision for data "euthanasia." Prof. Hedstrom said that in social science archives, the amount of use is inextricably linked to ease of access. The use is very lumpy. There are also sleepers that no one is ever going to use. It would be helpful to know which attributes of a data set determine the level of its use. Storing data in perpetuity may not even be suggested by the NIH policy.

In terms of the regulations, is this construed as an obligation of the PI or also of the institution?  Does it outlast the life of a PI?   There is a fundamental need for clarity about who owns the data, and who has the stewardship responsibility.  

Ms. Nowack said OVPR is planning a conference on issues about human subject data and tissue repositories for the second week of February, and many of these issues will be discussed. 

3.      Discussion of NIH Multiple Principal Investigator Policy

Guest: James Randolph, Senior Associate Director, Division of Research Development and Administration 

Mary Haan introduced Jim Randolph and invited discussion of the NIH multiple principal investigator policy. Mr. Randolph said that NIH has long used the single investigator model, which essentially gives most of the credit to one person (the PI) and little credit to collaborators. To address this problem, in early 2005 the Office of Science and Technology Policy (OSTP) instructed federal research agencies to accommodate multiple PIs in grant applications. The NIH is the only agency to incorporate the directive. They will be allowing multiple PIs to be proposed on all applications, although such a structure in any particular proposal might not survive agency review. The policy will be implemented in 2007. February 1st is the first major NIH round with multiple PIs allowed.

There are many issues that must still be worked out in such arrangements. Problematic areas may include how funding is distributed and how publications will identify the multiple PIs. It was noted that the financial part is not difficult if the multiple PIs are at the same institution. The NIH has welcomed applications with multiple PIs at different institutions, but once there are separate awards, the control mechanism to ensure researchers collaborate is lost. NIH has directed reviewers to make an assessment of whether the application is truly a multiple PI proposal.

On multiple PI proposals, in place of a single project director, a "contact PI" must be designated. Prof. Haan noted that the NIH requires the applicant to have a leadership plan.  

Committee members agreed that this appears to be a policy that will benefit developing investigators and seems to give a fairer representation of the actual division of work on many projects.  The comment was made that it may be a challenge for peer review groups and that study sections will need to be instructed on how to review these.  

In terms of how this will be handled on a practical level at DRDA, Mr. Randolph said a PAF supplement for sub-accounts will probably be a required part of every PAF. Input from participants will be sought.  It is uncertain whether other agencies will begin to follow suit. 

One question without a clear answer yet is how faculty should list such a grant on their other support. 

4.      Consideration of minutes from November 10, 2006

Mary Haan: Motion to approve the minutes as written.

Elizabeth Young: Seconded motion.

Vote: All in favor. 

The meeting was adjourned at approximately 11:58 a.m.