Thursday, April 14, 2011

Inter-Reviewer Agreement

A subject recently came up that I believe reflects faulty thinking about proposal reviewing: inter-reviewer agreement. To explain, when annotators or translators work, it is possible, and even desirable, to test the agreement among them to be sure that the results are sound. High inter-annotator agreement means that, if you brought in someone new and had them do the annotation or translation, the result would still be more or less the same. Recently, I've learned that some people expect that the same consideration should be given to scientific proposal review.

It is instructive to relate an experience from a very large, interdisciplinary program, which included proposals in social sciences and information technology. To ensure "fair" review, it was decided to have two separate panels review such proposals: one in the social sciences and one in information technology. The surprising result was that the information technology reviewers showed little variance among themselves in their evaluations of the proposals, but the social science reviewers varied widely. Scores from the latter on some proposals ran the entire gamut of possible scores on the same proposal. While there are many possible hypotheses as to why this difference occurred, it is at least a demonstration that inter-reviewer agreement varies across disciplines.
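For readers who want to see what "variance among reviewers" means in practice, here is a minimal Python sketch of one way it could be measured. The scores, the 1-5 scale, the panel sizes, and the function name panel_spread are all invented for illustration; they are not data or methods from the program described above.

```python
from statistics import mean, pstdev


def panel_spread(scores_by_proposal):
    """Return (mean per-proposal standard deviation, mean per-proposal range).

    scores_by_proposal maps a proposal id to the list of scores that the
    panel's reviewers gave that proposal.
    """
    stdevs = [pstdev(scores) for scores in scores_by_proposal.values()]
    ranges = [max(scores) - min(scores) for scores in scores_by_proposal.values()]
    return mean(stdevs), mean(ranges)


# Hypothetical scores on a 1-5 scale (assumption: made-up numbers).
it_panel = {"P1": [4, 4, 5, 4], "P2": [2, 3, 2, 2]}
social_science_panel = {"P1": [1, 5, 3, 4], "P2": [5, 1, 2, 4]}

it_sd, it_range = panel_spread(it_panel)
ss_sd, ss_range = panel_spread(social_science_panel)

print(f"IT panel:             mean sd = {it_sd:.2f}, mean range = {it_range}")
print(f"Social science panel: mean sd = {ss_sd:.2f}, mean range = {ss_range}")
```

A small spread within each proposal corresponds to high agreement. A range equal to the full width of the scale is the "entire gamut" situation described above.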

I have thought about why this result should be the case (and it was demonstrated more than once), and it may be due to a couple of different reasons. One possible reason could be that information technology research is well-supported by the government, and the social sciences are not. When a discipline is well-supported, researchers have the opportunity to meet fairly often in review panels and learn how to appreciate and accept the scientific views of their colleagues. The opposite might be said of disciplines that are not well-supported. When the views of colleagues are understood and accepted, others in the same field may defer to the experts and accept their evaluations of research in their areas of expertise. Those who are not so expert may also get at least a glimmer of appreciation of others' research and of what good research in that field looks like. Proposal evaluation panels are often opportunities for instruction for the panelists in this way.

Another possible reason why information technology reviewers agree more than social science reviewers could be that the social sciences are much broader than information technology, resulting in a greater diversity of opinion. While I tend to accept this view, I believe a confounding factor is that the social sciences have also not progressed toward common understandings even in sub-disciplines of social science precisely because the field has been underfunded for a long time. Underfunding means a lack of strong selection processes enabled by reviews of proposals for funding and subsequent support for the best ideas. In other words, many different ideas bloomed, with little to no selection of the best from among them. So, they all continue to exist in their own little niches.

Finally, it should be said that, in interdisciplinary proposal review panels, consensus reviews should be captured as well as the individual reviews of each participant. The struggle toward consensus creates the opportunity for those who are outliers in the review to explain themselves. If they are convincing, the outcome is not only a better-informed result for the program manager, but also an education of those who were outliers as to points of view that also exist in the scientific community and why. This point is important, because those other points of view may very well be based on established results that were unknown to others on the panel.

One last point: a question was once raised about what the ideal size of a review panel might be. Of course, this has never really been tested, and it depends upon the diversity of fields assembled. Nevertheless, it would be very hard to make a case for fewer than 15-20 reviewers on a single panel. Panel assembly is a sampling process, and experience has demonstrated that the danger of omitting important expertise tends to be less of a concern with panels of this size. Larger panels are also unwieldy, so more than 20 may be a bad idea for other reasons having to do with time management.

Monday, December 6, 2010

Political Appointees

In this era of Federal Employees being denied their raises as a token gesture toward easing the budget deficit, I am surprised that no one has addressed the role of political appointees in the Federal Government. In my experience, political appointees are usually placed in relatively high positions reporting to the head of the agency, who perhaps should be the only political appointee in any agency. Nevertheless, these other placements do occur and are rampant in some agencies. In all cases I've observed, which were many, these appointees were not qualified for the positions they held, had no understanding of either the mission or the culture of the agency, and usually only obstructed activities rather than trying to create new methods or processes to improve government. While I am unaware of the salaries of these appointees, I would be very surprised if they were not sufficiently high to satisfy the recipient's expectation of payment for their support during a past campaign.

Painting Federal Employees with a broad brush and treating them all the same, as the salary freeze did, is wrong. Most civil servants are hardworking and deserve to be paid a just wage for the level of experience and education that they have. Critics who compare Federal Employees' wages with wages in the civilian workforce usually do not take into account that the more menial tasks of government work are not carried out by government employees but by contractors.

There is fat in the Federal Employee workforce, and it is the political appointee. Ridding the government of all such appointees, except for agency heads who are cabinet members, would not only reduce the Federal budget for salaries, but would in all likelihood increase the efficiency of the Federal government and increase morale among employees who might then have an upward position to aspire to that is not "taken" by a political hack. I strongly believe that the total Federal employee budget and the number of Federal "employees" can be reduced in one move and result in a stronger, more efficient government. Simply get rid of the political spoils process that allows the sometimes rampant hiring of political employees into non-agency-head positions.

Tuesday, November 16, 2010

How to Down-Select

Program managers need to be aware of a potential disconnect between the desire to use mid-term reviews as an opportunity to “down-select” some of the projects and the desire to keep hands off funded research projects. It is important that the funding organization not micro-manage research. Micro-management is never a good approach to keeping projects “on-track”.

When a review panel understands that a review will result in the termination (“down-select”) of some of the projects in a program, they will take a far different view of the project report than that taken by an NSF program, for example. In fact, the nature of the review will be similar to that of an original proposal being reviewed for initial funding. A concern typically observed is the level of direction that projects receive regarding what should be in a mid-term report and what reviewers need in order to offer advice on whether the program should continue the project or not.

In many instances, the mid-term report is written in a perfunctory manner, failing to address important questions that the panel reviewers need answered in order to judge whether or not the project should be continued. In many cases, information that was lacking included:

  • Honest discussion of what actually was already accomplished versus what was proposed;
  • If objectives have changed, a discussion of what has changed along with reasons for the changes;
  • Information on papers in progress as well as listing of those already accepted for publication, if any; and
  • Listing of all project participants, including students, and how they interact to accomplish the overall goals of the project – i.e., information that would show that the project is being managed appropriately and effectively.

The lack of effective management, and of a reasonable management plan behind it, is most often the primary cause of failed research projects involving more than one primary researcher. Collaboration is very difficult to achieve without such a plan. Most researchers have their own agendas and prefer to use funds to continue those agendas rather than support the needs of a larger collaboration.

One of the most frequently discussed issues during a review is the number and quality of publications made available for review. In projects that have been underway less than two years, it is probably inappropriate to require a listing of published journal articles, since it normally takes at least a year for an article to be reviewed and published in a major scientific journal. Such papers, when they do appear, tend to be based on work done prior to the funded project. It is often more useful in mid-term reviews to see a list of conferences attended and papers presented at international conferences as a measure of progress as well as a measure of international stature.

With appropriate guidance from a program, mid-term reviews of projects can be conducted fairly and effectively because such guidance ensures that the reviewers will have the information they need to give useful advice to the program.

Saturday, October 9, 2010

The Tenure Issue

Tenure is an issue mostly associated with academic freedom, and the achievement of tenure in an American university is often associated with the ability to teach and do research on topics that, prior to tenure, may have negatively affected one's academic career. The question of whether or not tenure remains an appropriate system for the United States is not one we're addressing here. Instead, we will relate a few ways in which the tenure system affects the progress of science through government funding. Even though the US Government has no official position on tenure in relation to reviewing proposals or getting research funding, there are still strong relationships to be recognized between tenure and the overall progress of science.

Prior to being awarded tenure, junior faculty seek to publish as much as possible in order to build a strong case for being awarded tenure. This effect of the tenure process is good for the progress of science for all the obvious reasons mentioned earlier in this blog. What someone publishes is as critically important as how much they publish. If a junior faculty member publishes too much outside the scope of the discipline of their potential tenure committee, this will likely negatively affect their tenure case. Tenure committees normally seek to strengthen the field they represent rather than have it change or even evolve by accepting for tenure someone who may be considered to be on the "fringe" of the field, or worse, outside the field. In this respect, the tenure process reflects the political reality of how it proceeds. Tenure committees are actually not unlike small political parties in this sense, seeking to increase the influence and direction that they already represent. For funding programs seeking to diversify a field or to generate interdisciplinary fields, this is a strong negative influence. Rarely will you find a junior faculty member willing to step outside the bounds of their potential tenure committee to seek funding for a revolutionary interdisciplinary idea. When this does happen, it is usually a natural combination, such as a combination of the field with teaching, with computation, or with the collection and sharing of large data sets. Rarely is it a combination involving two core sciences. Bioinformatics is one major exception, being a combination of biology and information science. Even in this case, however, it wouldn't have happened had it not already become obvious after the genomic era that biology is an information science anyway.

There are also influences on government funding of science after the award of tenure. While one would expect a newly-tenured faculty member to begin to diversify and take more risk in their approaches, that is not normally observed. Possibly the reason is that, after spending 7 years keeping tightly within the bounds of a field, they have become fully enculturated in the field and no longer aspire to change. Innovation becomes more of a challenge when a faculty member already has PhD students to supervise, classes to teach, and now service on the tenure committee that keeps them focused on the field.

One outcome of tenure that has to be mentioned because it is so noticeable from the point of view of government funders is that of ego. Achievement of tenure is a difficult and highly political process. When successful, the faculty member is treated differently both by those within the University and by those in the field. Such treatment is almost like that of royalty. That may seem too strong a label, but the actual fealty shown by junior faculty, not yet tenured, can be a strong and potentially negative influence on one's personality. A tenured faculty member is not only going to serve on a tenure committee in their own university, but will also be asked for letters of reference by tenure committees in other universities representing the field. They will serve as either editors or reviewers of major journals in the field. And, unfortunately, they also tend to serve more often on review committees for funding. While program managers should try to include junior faculty as much as possible to teach them how to write fundable proposals, program managers like to "score points" with the field by selecting reviewers who have recognition in the field, and those people are likely to be tenured. For tenured faculty with weak egos to begin with, all this recognition can create an egotist. I have worked equally in industry, academia, and government, and I have never encountered stronger egos, in the negative sense, than I have in academia, and this should be of concern for government funders.

Government funding of science is aimed at progress in the field, but achievement of progress can be hindered not only by the inertia effect of the tenure process, but even more by the influence of very strong egos. Big names in a field are likely to influence outcomes just because of their name rather than because of reasoned argumentation. Ego-tainted, big names tend to make matters worse by influencing outcomes to increase their own standing and entourage in the field. Only the competitive nature of government funding can control this since, having tenure, faculty are protected in their university position. Program managers must immunize themselves against these processes when they direct funding.

Saturday, October 2, 2010

US Government Corruption in Funding of Science?

A question I have been asked is whether or not there is corruption in US Government funding of science. This is an expected question in an era of public scrutiny of government and its spending. I can only comment on what I have actually observed since this sort of thing is not usually published, if it exists at all. My experiences along these lines are with the National Science Foundation, DARPA, DHS, and the Intelligence Community, and the answer is that I have seen wrongful activity in the funding of science in the US, but my answer requires elaboration.

In the National Science Foundation, when corruption occurs, which I believe is rare there, it is intensely pursued by an independent Inspector General's office. For the one serious case I observed, there was an investigation in which I was interviewed as a witness. I don't know what the outcome was, but I believe that, if wrongful acts were found, they were dealt with appropriately. Program Managers in the NSF are required to attend annual workshops where they are given case studies to consider. Most such case studies are, on the surface, open to interpretation, but at the core they involve either a criminal act or at least an ethical violation. I trust the NSF system because there are a large number of ways in which the NSF Inspector General gets information about potential problems, and their investigations are thorough, detailed, and unbiased.

Matters in the Department of Defense and the Intelligence Community are very different. It's not that there are more violations, or that there is no Inspector General whose job it is to carry out investigations. There is an IG in every US Government Agency. The problem, I believe, stems from the nature of the business. Almost all staff in these organizations, including program managers, are required to hold active security clearances due to the nature of the research they fund and its potential impact on National Security. This means that only those who "need to know" actually learn about research projects or their outcomes until, or unless, they are published in the open literature. The circle of those who "need to know" is usually tightly controlled, creating a structural problem for detecting and dealing with corruption - i.e., the number of those involved is far smaller, resulting in a much smaller sampling of people from various disciplines or points of view. This means that those who are involved have to be much more vigilant and willing to report potential problems than, say, those in the National Science Foundation.

Does it work? Are these people more vigilant such that wrongful acts are detected, investigated, and dealt with properly? In cases I've observed, I'd have to say no, unfortunately. The same system that protects National Security also provides a shield that prevents disclosure, and humans, being what they are, always have some tendency, even if small, to step over the line in cases in which they are personally involved. Sure, we all go over the speed limit at times, but these cases are more than being a few miles over a posted limit. The cases I've observed were serious, in my opinion. Such cases were justified by those involved through self-rationalizations having to do with the importance of the work, going with a research performer they "trust" with such important work rather than following required procedure, or simply the need to take such risks in order to get important work done at all since it may be of the type not many others wish to engage in.

The US system of security works well in cases where it has to, but it must be recognized that it has unintended side effects such as these. Some might say there are whistle-blower protections, and observed cases must be reported. At what cost? Is someone to risk not just their career, but a potential criminal prosecution just to provide this information? I don't think so. The risks are far too great for anyone I know of to make statements regarding potential wrong-doing they've observed. I suppose if the case were to involve loss of life or flagrant criminality, the result might be different, but funding of science usually does not involve that level of seriousness. It is, however, misuse of taxpayer dollars, and that in itself may be reason enough for a serious reconsideration of how the US funds research in agencies having to do with National Security.

Thursday, September 30, 2010

Darwinian Science?

Since progress in science tends to be competitive and occurs in a population of scientists (see blog below), it is inviting to think of it as a Darwinian process. That is, science proceeds by way of a large variety of approaches tried by a population of scientists, and selection occurs on those approaches in terms of their successes in experimentation. It is an attractive way to think about scientific progress because it depends upon the generation of variety by broad investment by the government and upon repeatable experimentation to prove approaches that work. It would be wrong, however, to label this as a Darwinian view of science.

Since science is a phenomenon of ideas and not of genes, it is a Lamarckian process rather than a Darwinian one. In other words, progress is achieved by the passing on of adaptations through learning within one's lifetime rather than through any increased fitness of successful scientists, although successful scientists do tend to attract and train more new scientists than unsuccessful ones. The ideas and techniques that lead to successes in scientific experiments are published and passed on in a much shorter interval than that required to produce new scientists. Scientific progress is like any cultural evolution, in that way.

That said, how should this recognition of scientific progress as a Lamarckian evolution affect science funding programs? Clearly, one should encourage a diversity of approaches in order to improve the chances of "covering" the search space of options to a solution to any scientific problem. One should also encourage rapid and widely-distributed publication of all results. These lessons are not new.

Frequently lost on program managers, however, is the fact that, in order for the evolutionary process to proceed, one must also develop a competitive process among the scientists working to solve the same problem. Selection of a scientific "solution" to a problem is relatively meaningless without a corresponding set of attempts that failed to bring about a successful result. This means that program managers must not only expect failures, but they must be willing to fund sufficient variety of approaches, that is, take sufficient risk, that failures are generated and published! In examining failures, scientists learn valuable lessons about the causes of success and where the causality can be attributed in a success. Without them, it is not possible to know what aspects of the successful approach actually led to the desired result.

Seeking and publicizing "winners", as program managers are prone to do, is only part of the job of effective program management, because the wider the net is cast, the more accurately one can recognize not only success, but also why an approach has succeeded.

Friday, September 17, 2010

Pasteur's Quadrant

Stokes' book, Pasteur's Quadrant (1997), describes how Vannevar Bush got it wrong when he postulated a continuum from basic research to applied research. Stokes claims that these are actually two separate dimensions along which research can be characterized. Pasteur was a good example of being high on both basic and applied research. He was not only looking for a cure for a specific disease, but he was also doing research on a basic mechanism of disease.

While NSF is best characterized as funding research high on the basic dimension, it is not likely to fund significant efforts that are high on the applied dimension. At least, that is true in comparison to other funding agencies. DARPA, for example, tends to fund research that is high on the applied research dimension, but not significantly so on the basic dimension. In fact, the "color of money" of DARPA tends to prohibit this. Defense research dollars are categorized according to whether they are basic or applied, but not both.

A particularly important challenge to face for a country seeking to maximize return on investment of research dollars is whether or not to spend funds on basic research because the return is so risky and so far in the future. This may be a false issue if one takes Stokes' view. Research topics can be identified that are both basic and applied, and there may be a way to do this intentionally.

Reviewing proposals from other countries has led me to believe that there is an emphasis on targeting research areas that will create industrial partnerships and quick wins in new applications. Unfortunately, the topics often involve the creation of an engineering artifact with new capabilities rather than deep investigation into the fundamentals of the science behind the topic. At the end of the project, a payoff might develop, signaling success for the funder in demonstrating increased world market share in some area. At the rate at which competition drives engineering applications these days, however, that success is likely to be short-lived unless the principles involved are deeply understood as well. With such a deeper scientific understanding, one can continue to create new artifacts and even understand the drivers for what makes them successful in the first place.