Exploring Crowd Evaluations for Innovations
Abstract
Along with the widespread use of open innovation methods in the new product development process, some organizations today do not struggle to generate enough ideas. Rather, these organizations are confronted with vast amounts of heterogeneous ideas (Berg-Jensen, Hienerth & Lettl, 2014; Piezunka & Dahlander, 2015; van Knippenberg, Dahlander, Haas & George, 2015). As a result, these organizations face problems of efficiency and effectiveness in the evaluation of these ideas. With regard to efficiency, a problem arises precisely because crowdsourcing has been shown to be a useful tool for attracting a high number of potential ideas (Bayus, 2013): organizations are easily overwhelmed by the sheer number of ideas that have to be evaluated. They have to commit substantial resources to the evaluation of mainly low-quality submissions in order to identify the few high-quality submissions. Furthermore, prior research suggests that high workloads of evaluation panels are associated with discrimination against novel ideas (Criscuolo, Dahlander, Grohsjean & Salter, 2016). With regard to effectiveness, problems arise when organizations aim to access knowledge that is distant from the focal organization. Because this knowledge is by definition unknown to the organization, it remains questionable whether the focal organization is capable of evaluating such ideas in a valid way (Afuah & Tucci, 2012; Piezunka & Dahlander, 2015).
In this vein, there is rising interest, as evidenced by recent calls for research by leading journals, in involving crowds in the idea selection phase, an approach that can be called crowd evaluation. In crowd evaluations, the task of evaluation is outsourced to an a priori unknown group of self-selected evaluators (based on Howe, 2006). While some studies suggest that there is no significant difference between crowd evaluations and the judgement of experts (e.g. Magnusson, Wästlund & Netz, 2016; Mollick & Nanda, 2015), other studies suggest that there are systematic differences (e.g. Boudreau, Guinan, Lakhani & Riedl, 2016; Ozer, 2009). Thus, a better understanding is needed of a) how evaluator characteristics, such as expertise, experience, or motivation, influence the validity of evaluations, b) how evaluations should be aggregated if individuals differ significantly in the validity of their evaluations, and c) how crowds should be composed to arrive at a valid crowd evaluation. Accordingly, this dissertation investigates the following research questions:
• How do the characteristics of evaluators influence the validity of their evaluations?
• How do different aggregation mechanisms and crowd compositions influence the validity of crowd evaluations?
Determining the validity of evaluations is a notoriously difficult task because it requires a “true value” as a benchmark. For many measures, e.g. originality, it is nearly impossible to determine an objective “true value”. For other measures, e.g. market potential, objective measures might be observable ex post, but observations can only be made for those products that are introduced to the market. Thus, such a measure is systematically distorted because it does not account for the counterfactual. For this reason, concurrent validity, i.e. the agreement of an evaluation with a benchmark evaluation obtained at the same time, is commonly used in the innovation and crowdsourcing literature (Blohm, Riedl, Füller & Leimeister, 2016).
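As a simple illustration, an evaluator's concurrent validity can be operationalized as the correlation between his or her scores and a benchmark evaluation of the same submissions. The following Python sketch shows one possible way to compute this; the data and the choice of a rank correlation are illustrative assumptions, not part of the study design.

```python
# Minimal sketch (illustrative, not from the paper): concurrent validity as the
# rank correlation between one evaluator's scores and a benchmark evaluation of
# the same submissions. All numbers below are made-up placeholders.
import numpy as np
from scipy.stats import spearmanr

benchmark = np.array([4.5, 2.0, 3.5, 1.0, 5.0])  # hypothetical benchmark score per submission
evaluator = np.array([4.0, 2.5, 3.0, 1.5, 4.5])  # hypothetical crowd evaluator's scores

rho, p_value = spearmanr(evaluator, benchmark)
print(f"concurrent validity (Spearman rho) = {rho:.2f}, p = {p_value:.3f}")
```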
To investigate the research questions stated above, this dissertation comprises two studies, with the unit of analysis shifting from the individual (study 1) to the crowd (study 2). The first study investigates the effects of evaluator characteristics on the validity of their evaluations. To this end, an idea competition was launched in collaboration with the Federal Chancellery of Austria on the topic of new applications for open data. All submissions were then evaluated in an open crowd evaluation, in which participants were invited to assess the ideas, prototypes and early-stage products with regard to their originality, usefulness, feasibility and market potential, and to indicate their confidence in their evaluations. Additionally, evaluator-specific characteristics were collected. In parallel, the Federal Chancellery follows its regular evaluation process, partly using the same online tool as the one used for the crowd evaluation.
To determine an objective benchmark for these studies, the following process, which combines multiple persons and multiple methods, will be applied: The submissions of the crowdsourcing competition will be evaluated by two independent panels. First, evaluation-relevant perspectives and corresponding representatives will be identified based on qualitative expert interviews. After assigning these representatives randomly to one of the two panels, one panel will evaluate all submissions using the consensual assessment technique (CAT) based on Amabile (1996), and the other panel will use the focus group technique, both of which are widely used methods in innovation research (e.g. Blohm et al., 2016; Piller & Walcher, 2006). Then, the evaluations of both panels will be compared, and interrater reliability measures (e.g. Cohen’s κ or Krippendorff’s α) will be used to exclude all submissions for which sufficient agreement cannot be reached.
The empirical data from the first study will be used for parameter estimation in the second study, which will test the effectiveness and robustness of different aggregation mechanisms and crowd compositions using agent-based modelling. Thus, the second study complements prior studies on information aggregation by a) building on empirical evidence for parameter estimation and b) taking into account individual differences in validity and their antecedents.
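To give a flavour of how such an agent-based comparison of aggregation mechanisms could look, the following Python sketch simulates evaluators whose individual validity determines the noise in their scores and compares an unweighted mean with a validity-weighted mean aggregation. This is not the dissertation's actual model; all parameter values, the noise structure and the two aggregation rules are assumptions chosen purely for illustration.

```python
# Illustrative agent-based sketch (not the dissertation's model): heterogeneous
# evaluators score ideas with noise that depends on their individual validity;
# two aggregation mechanisms are compared by how often the aggregate ranks the
# truly best idea first. All parameter values are assumptions.
import numpy as np

rng = np.random.default_rng(42)
N_IDEAS, N_EVALUATORS, N_RUNS = 50, 20, 1000

def simulate(weighted: bool) -> float:
    hits = 0
    for _ in range(N_RUNS):
        quality = rng.normal(0, 1, N_IDEAS)              # latent "true" idea quality
        validity = rng.uniform(0.1, 0.9, N_EVALUATORS)   # heterogeneous evaluator validity
        noise_sd = np.sqrt(1 / validity**2 - 1)          # lower validity -> noisier scores
        scores = quality[None, :] + rng.normal(0, 1, (N_EVALUATORS, N_IDEAS)) * noise_sd[:, None]
        weights = validity if weighted else np.ones(N_EVALUATORS)
        aggregate = np.average(scores, axis=0, weights=weights)
        hits += int(np.argmax(aggregate) == np.argmax(quality))
    return hits / N_RUNS

print("top-idea hit rate, unweighted mean  :", simulate(weighted=False))
print("top-idea hit rate, validity-weighted:", simulate(weighted=True))
```

In such a setup, the empirical estimates from study 1 would replace the assumed validity distribution, and further aggregation mechanisms and crowd compositions could be added as alternative scenarios.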
With this dissertation I intend to contribute to both theory and practice. With regard to theory, I intend to investigate the effect of individual and crowd characteristics on the validity of evaluations and what this means for organizations and their knowledge sourcing. With regard to practice, I hope to help practitioners a) understand the contingency factors for the use of crowd evaluations, b) better understand crowd evaluations, and c) make use of evaluation results more wisely. This research thus complements prior theoretical contributions on evaluation validity with empirical evidence, and prior empirical contributions by taking the crowd as the unit of analysis in the simulation study.
By August 1st, the crowd evaluation and the internal evaluation should be fully completed. Furthermore, the qualitative interviews for the identification of panel members will have been conducted. The actual evaluations of those panels, however, might not yet be available, depending on the availability of the identified panel members.
Authors
- Tom Grad (Vienna University of Economics and Business)
- Christopher Lettl (Vienna University of Economics and Business)
- Christian Garaus (Vienna University of Economics and Business)
Topic Area
Contests, Crowdsourcing and Open Innovation
Session
MATr2B » Contests, Crowdsourcing & Open Innovation (Papers & Posters) (15:45 - Monday, 1st August, Room 112, Aldrich Hall)
Paper
Tom_Grad_Exploring_Crowd_Evaluations.pdf