A Survey of the Use of Crowdsourcing in Software Engineering

A Survey of the Use of Crowdsourcing in Software Engineering - Key Points

*Crowdsourcing is an emerging distributed problem-solving model that involves both humans and machines. Software engineers are recruited online, globally, to perform various software engineering tasks, e.g. requirements extraction, design, coding and testing. Applying this model has resulted in:

Increased parallelism thus reducing the time-to-market.
Lower costs.
Lower defect rates with flexible development capability.

The survey observed a noticeable increase in work on Crowdsourced Software Engineering, although many of the authors surveyed state that not much work has been done on this model. The purposes of the survey were:

To provide a comprehensive survey of current research progress in the field of Crowdsourced Software Engineering.
To summarize the challenges for Crowdsourced Software Engineering.
To reveal to what extent these challenges were addressed by existing work.

The survey was conducted as a Systematic Literature Review (SLR), in which three specific steps were followed:

Identifying the related work: Research papers describing work on Crowdsourced Software Engineering were included in the set of papers to be surveyed.
Assessing the quality of the papers: Each paper included in the survey had to be a conference paper, a journal article, a technical report or a book.
Summarizing the findings: The collected findings were organized into sections:

Section 1: Background information on Crowdsourced Software Engineering.

The term 'crowdsourcing' originated in 2006, when it was first defined explicitly. An undefined, large, networked workforce and an open call format were considered the prerequisites for crowdsourcing.
This did raise an argument stating that the work done via crowdsourcing can also be done by a sole individual or a corporation.
Crowdsourcing requires a large, suitably skilled yet undefined workforce, recruited through an open call for a specific task at hand.
The practice itself began earlier, in 2001, when the TopCoder development method and systems were introduced to support online distributed software development activities.
In 2008, Brabham characterized crowdsourcing as an online, distributed production and problem-solving model.
This model has been used in fields other than software engineering.
In Crowdsourced Software Engineering the end product is not always a software product. The end product depends on the type of activities distributed and obtained upon completion.
Any research involving human subjects could potentially be supported by crowdsourcing, in the sense that the identification and recruitment of suitable human subjects for an experiment could be implemented using crowdsourcing techniques.
In this application of crowdsourcing (to research studies), it is the identification of human subjects for experimentation that is important, rather than the particular research topic investigated.
If applied to software engineers, the problems that may arise will be similar. This process is known as 'crowdsourcing for software engineering research'. Software engineering techniques can also be used to support crowdsourcing.
Most of the papers studied applied crowdsourcing without defining it.
According to one definition, the workforce should have the requisite specialized knowledge, and anyone among them can be selected to do the task.
Crowdsourcing involves 3 kinds of actors:

Actors & Roles in Crowdsourcing

Work Flow in Crowdsourcing

Crowdsourcing may help software development organizations:

Integrate elastic, external human resources to reduce the cost of internal employment.
Exploit the distributed production model to speed up the development process.
Deliver customer-requested software assets with a lower defect rate.
Deliver customer-requested software assets at lower cost and in less time.

Crowdsourcing is becoming widely accepted by companies and organizations, ranging from the military domain and academic institutions to large IT companies.
DARPA created the Crowdsourced Formal Verification (CSFV) program for software formal verification.
The number of workers in crowdsourcing has increased.
There is a growing number of crowdsourcing platforms built for the software development domain.

Section 2: Description of practical platforms for Crowdsourced Software Engineering, together with their typical processes and relevant case studies.

Commercial Platforms:
The most widely used platform models are online competition; on-demand matching, in which workers are selected from the registrants; and online bidding, in which developers bid for tasks before starting their work.
The platforms also focus on a broad range of task domains within software engineering.
There are also general crowdsourcing marketplaces that are not designed specifically for software engineering but can nevertheless be used to support various software development tasks, e.g. Freelancer and Amazon Mechanical Turk.
Different platforms may use different models, e.g.:
TopCoder:

It has its own unique process and development model, known as the TopCoder Competition Methodology, which supports independent graphic design, development and data science challenges, as well as the development of complex software (by deriving sub-tasks from the decomposition of the original task).
Viewed from the top level, the systematic process may resemble the Waterfall model; however, each development phase is realized through a series of online competitions.
Only the qualified winning solutions are accepted. Qualified outputs are used as the inputs for the subsequent development phases. (In this context, 'qualified' means passing with a minimum acceptance score, as rated through a review process.)
The diagram below represents the systematic development process followed by TopCoder:
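
The acceptance rule described above (only qualified winners pass, and their output becomes the input to the next phase) can be sketched in a few lines of Python. This is an illustrative sketch only, not TopCoder's actual implementation: the threshold value, the `Submission` type and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical threshold: the source only says "a minimum acceptance score".
MIN_ACCEPTANCE_SCORE = 75.0


@dataclass
class Submission:
    author: str
    artifact: str        # e.g. a design document or a code package
    review_score: float  # score assigned by the review process


# A contest takes the previous phase's artifact and returns the crowd's submissions.
Contest = Callable[[str], list[Submission]]


def pick_winner(submissions: list[Submission]) -> Optional[Submission]:
    """Return the highest-scoring qualified submission, or None if none qualifies."""
    qualified = [s for s in submissions if s.review_score >= MIN_ACCEPTANCE_SCORE]
    return max(qualified, key=lambda s: s.review_score, default=None)


def run_phases(phases: list[tuple[str, Contest]], initial_input: str) -> list[tuple[str, str]]:
    """Run the phases in order; each winning artifact is the next phase's input."""
    accepted = []
    current_input = initial_input
    for name, contest in phases:
        winner = pick_winner(contest(current_input))
        if winner is None:
            break  # no qualified output, so later phases have nothing to build on
        accepted.append((name, winner.artifact))
        current_input = winner.artifact
    return accepted
```

Each phase is modelled as a function so that the hand-over of a qualified artifact from one competition to the next is explicit, which is the part of the process that makes it resemble a waterfall.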

AppStori:

It is a more recent platform for mobile application development.
A crowdfunding model is used to fund mobile app development.
It attracts app developers and consumers to work closely together.
The crowd developers can post their projects to raise funds from the crowd or to recruit other developers for app implementation.
Consumers can propose ideas for new app development, contribute money, act as beta testers and offer feedback on existing projects.
The whole development process, from conception to release, is achieved through collaboration among crowd developers and consumers.

uTest:

It is a platform used for crowdsourced software testing.
Its crowd testers provide a wide range of virtual, on-demand testing services, such as functional testing, usability testing, localization testing and load testing.
The process begins with a phase in which clients specify their testing needs.
Choices concerning testing device, operating systems, geographic location and budget are made flexible by the platform.
Each project has an open call for testers and the crowd can apply to work on their desired testing tasks; however, only suitable testers will likely be favored for selection.
Testing results are reported in real time and submitted in the form of test reports for approval.
It is usually the client's responsibility to review the submitted reports and select the tester for their work.
The workers are paid via the platform on a per-project basis.
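
The open-call and selection steps described above can be sketched as a simple matching routine. This is an illustrative sketch under stated assumptions, not uTest's actual selection logic: the `ClientRequest` and `Tester` types, the `rating` field and the "best-rated first, within budget" rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClientRequest:
    devices: set[str]
    operating_systems: set[str]
    locations: set[str]
    budget: float          # total budget for the project
    testers_needed: int


@dataclass
class Tester:
    name: str
    device: str
    operating_system: str
    location: str
    rate: float            # per-project fee asked by the tester
    rating: float          # hypothetical reputation score


def is_suitable(tester: Tester, request: ClientRequest) -> bool:
    """A tester is suitable if device, OS and location all match the client's needs."""
    return (
        tester.device in request.devices
        and tester.operating_system in request.operating_systems
        and tester.location in request.locations
    )


def select_testers(applicants: list[Tester], request: ClientRequest) -> list[Tester]:
    """Pick the best-rated suitable applicants that still fit within the budget."""
    suitable = sorted(
        (t for t in applicants if is_suitable(t, request)),
        key=lambda t: t.rating,
        reverse=True,
    )
    selected, remaining = [], request.budget
    for tester in suitable:
        if len(selected) == request.testers_needed:
            break
        if tester.rate <= remaining:
            selected.append(tester)
            remaining -= tester.rate
    return selected
```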

Amazon Mechanical Turk:

It is a popular platform for micro-tasks.
Small teams can avoid developing complex software by employing crowd workers on the platform.
This platform has been employed to support program synthesis, graphical user interface (GUI) testing, oracle problem mitigation, and program verification in software engineering.

StackOverflow (a question and answer website providing crowdsourced programming knowledge):

Although it does not directly contribute to software development, it has an impact on open source software development.
It is used to improve integrated software development environments and software API documentation.

Bountify:

It is a platform similar to StackOverflow but has more self-contained micro-programming tasks.
A study on program synthesis used this platform to obtain initial seeds for their genetic programming algorithm.

The platforms developed have been categorized into 3 types:

Platforms that support human knowledge in an inexpensive way.
Platforms that support contest-based software development.
Platforms that support programming competitions with a unique 'competitive collaboration' feature.

An evaluation framework was proposed to assess software crowdsourcing processes with respect to multiple objectives, i.e. cost, quality, diversity of solutions and crowd competitions (evaluated using min-max game theory).
Case Studies:
In-depth case studies were mainly conducted via interviews with the client companies or the management teams.
The waterfall model followed by the TopCoder platform caused coordination issues with the client company, and quality issues were pushed to the later stages of development, which was not regarded as good practice.
According to the management team, two kinds of value-adding actions exist in the crowdsourcing platform: macro (market-level) actions and micro (transaction-level) actions.
In another study, conducted on TopCoder and AppStori, min-max competition behavior was found to contribute to the quality and creativity of crowdsourced software development.
An empirical study of developers' strategic behavior on TopCoder revealed a 'cheap talk' phenomenon: to soften competition, highly rated developers tend to register for a competition early, thereby seeking to deter their opponents from participating in the marketplace.
The cheap talk phenomenon and the reputation mechanism used by TopCoder contributed to the efficiency of simultaneous online contests. In addition, a regression analysis was performed to study the factors that affect the quality of the contest outputs. Payment and the number of requirements were identified as significant predictors of final submission quality.
One case study examined testCloud, a German start-up crowd testing platform, and highlighted three types of challenges:

Crowd Management
Process Management
Techniques Management

Section 3: View of Crowdsourced Software Engineering based on their application domains in software development lifecycle.

Crowdsourcing for Requirements Analysis:

Requirements analysis is a critical step towards the success of a software project.
Traditional stakeholder analysis tools require experts' manual effort to extract stakeholders' information.
The StakeSource tool was designed to identify crowdsourced stakeholders involved in a stakeholder analysis process, reducing the cost of relying on experts to approach the stakeholders.
An early version of this tool, known as StakeNet, was responsible for recommending stakeholders via social networking.
The new version of StakeSource integrated support for identifying stakeholders and prioritizing their requirements. It automates large-scale requirements elicitation based on social network analysis and collaborative filtering techniques.
Several surveys have been conducted to determine the relationship between crowdsourcing and requirements engineering.
According to one study on crowdsourcing and requirements engineering, crowd stakeholders are not only a source of requirements but can also help with requirements prioritization and release planning.
Nascimento investigated the use of crowdsourcing for requirements categorisation based on Kano’s model. The model uses a questionnaire to help classify requirements into five categories. The value of each requirement for a given user is identified in their approach. A framework was proposed for finding stakeholders involved in the process.
The Analytical Open Innovation (AOI) approach was introduced to help developers make release decisions. The crowdsourcing model enables the AOI approach to systematically gather information from customers and other stakeholders.
A conceptual crowdsourced platform known as 'CrowdREquire' was introduced. It employs a contest model that allows the crowd to compete with each other in submitting requirements specification solutions to client-defined tasks.
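
To make the Kano-based categorisation mentioned above more concrete, the sketch below classifies a requirement from crowd questionnaire answers using the commonly used Kano evaluation table. It is an illustration only: whether the surveyed approach uses exactly this table is an assumption, the answer labels and function names are hypothetical, and the extra 'Q' (questionable) marker flags contradictory answers rather than being one of the five categories.

```python
from collections import Counter

# Possible answers to both questionnaire questions (labels are illustrative).
ANSWERS = ("like", "expect", "neutral", "tolerate", "dislike")

# Rows: answer to the functional question ("how do you feel if the feature is present?").
# Columns: answer to the dysfunctional question ("... if it is absent?"), in ANSWERS order.
# A = attractive, O = one-dimensional, M = must-be, I = indifferent, R = reverse,
# Q = questionable (contradictory answer, not one of the five categories).
EVALUATION_TABLE = {
    "like":     ("Q", "A", "A", "A", "O"),
    "expect":   ("R", "I", "I", "I", "M"),
    "neutral":  ("R", "I", "I", "I", "M"),
    "tolerate": ("R", "I", "I", "I", "M"),
    "dislike":  ("R", "R", "R", "R", "Q"),
}


def classify_answer(functional: str, dysfunctional: str) -> str:
    """Map one respondent's pair of answers to a Kano category letter."""
    return EVALUATION_TABLE[functional][ANSWERS.index(dysfunctional)]


def classify_requirement(responses: list[tuple[str, str]]):
    """Return the most frequent category across the crowd, ignoring 'Q' answers."""
    votes = Counter(classify_answer(f, d) for f, d in responses)
    votes.pop("Q", None)
    if not votes:
        return None  # every answer was contradictory
    return votes.most_common(1)[0][0]
```

Taking the most frequent category is only one possible aggregation rule; it is used here because it is the simplest way to turn many individual answers into a single label per requirement.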

Crowdsourcing for Software Design:

A crowdsourcing system named 'Apparition' was introduced to help designers prototype interactive systems in real-time based on sketching and function description. Experimental results showed that Apparition was able to achieve an accuracy higher than 90% regarding user’s intent, and to respond in only a few seconds.
Platforms supporting software architecture design are few in number.
The platform provided by TopCoder was limited in its ability to evolve designs from multiple designers' solutions.
Another platform allowed designers to produce an initial design and evolve their solutions based on others' solutions.
In other research, a solution was proposed to evolve software designs based on data and functionality contributed by the crowd.

Crowdsourcing for Software Coding:

Using crowdsourcing for software coding has focused on three sub-areas:

IDE Enhancement:

Extensive research has been conducted on the use of crowd knowledge to support coding activities in Integrated Development Environments (IDEs).
Several methods and tools have been proposed to help developers with coding and debugging such as:

HelpMeOut: A social recommender system assisting in debugging with crowdsourced suggestions. HelpMeOut only supports static, compiled programming languages such as Java.
The idea that a crowd of developers may be able to provide recommendations of patches for software systems finds a strong resonance in recent work on genetic improvement, and in particular work on automated bug fixing.
Genetic improvement seeks to automatically improve software systems by suggesting modifications that improve functional and nonfunctional properties. Genetic improvement regards program code as genetic material to be manipulated in the automated search for improvements.
Recent results have demonstrated the potential of this technique to improve real-world programs' speed, energy and dynamic memory consumption, and functionality, both by fixing bugs and by adding new features.
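
The idea of treating program code as genetic material can be illustrated with a toy example. The sketch below is not the method used in the work summarised above: the tiny instruction set, the mutation operators and the elitist search loop are all assumptions chosen to keep the example short. A "program" is a list of instructions, and fitness is the number of test cases it passes.

```python
import random

# Hypothetical toy instruction set: each instruction transforms one integer.
OPS = {
    "inc": lambda v: v + 1,
    "dec": lambda v: v - 1,
    "double": lambda v: v * 2,
    "negate": lambda v: -v,
}


def run(program, x):
    """Execute the instructions in order on the input value."""
    for op in program:
        x = OPS[op](x)
    return x


def fitness(program, tests):
    """Number of (input, expected) test cases the program passes."""
    return sum(run(program, x) == expected for x, expected in tests)


def mutate(program, rng):
    """Return a copy with one random edit: replace, insert or delete an instruction."""
    child = list(program)
    kind = rng.choice(("replace", "insert", "delete"))
    if kind == "replace" and child:
        child[rng.randrange(len(child))] = rng.choice(list(OPS))
    elif kind == "insert":
        child.insert(rng.randrange(len(child) + 1), rng.choice(list(OPS)))
    elif kind == "delete" and len(child) > 1:
        del child[rng.randrange(len(child))]
    return child


def improve(program, tests, generations=200, population_size=20, seed=0):
    """Elitist evolutionary search: the best program found so far is never discarded."""
    rng = random.Random(seed)
    population = [list(program)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        ranked = sorted(population + children, key=lambda p: fitness(p, tests), reverse=True)
        population = ranked[:population_size]
        if fitness(population[0], tests) == len(tests):
            break  # every test passes, stop searching
    return population[0]
```

For example, calling `improve(["double", "dec"], [(0, 1), (1, 3), (2, 5)])` searches for a variant of a buggy "2x + 1" program that passes more of the tests. Because parents are kept alongside their children, the returned program is never worse than the original on the given tests.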

Using crowd knowledge to find common examples from the web shares similarities with work on automatically harvesting realistic test cases from web-based systems.
As with the potential for combining genetic improvement and social recommenders, this similarity also points to the possibility of hybridised versions that harvest such information from a combination of crowd and web for testing purposes.

Crowd Programming Environment: Instead of adopting crowd knowledge over the Internet to help conventional coding activities, some other studies have focused on providing systems to support crowd-based coding work.
In crowdsourcing, both humans and machines can be viewed as programmable units. There has been work on integrating the crowd as a part of the program code itself.
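
One way to picture the crowd as part of the program code is a function call that is answered by people rather than by a machine. The sketch below is a minimal illustration under stated assumptions: workers are simulated as plain Python callables, answers are aggregated by majority vote, and the names `crowd_call` and `classify` are hypothetical rather than taken from any existing system.

```python
from collections import Counter
from typing import Callable, Optional

# A worker is simulated as a function from a question to an answer.
Worker = Callable[[str], str]


def crowd_call(question: str, workers: list[Worker], min_agreement: float = 0.5) -> str:
    """Ask every worker the same question and return the majority answer."""
    if not workers:
        raise ValueError("no workers available")
    answers = Counter(worker(question) for worker in workers)
    answer, votes = answers.most_common(1)[0]
    if votes / len(workers) <= min_agreement:
        raise ValueError("crowd did not reach agreement")
    return answer


def classify(item: str, machine: Callable[[str], Optional[str]], workers: list[Worker]) -> str:
    """Hybrid unit: use the machine's label, or fall back to the crowd if it is unsure."""
    label = machine(item)
    if label is None:  # the machine signals uncertainty by returning None
        label = crowd_call(f"Label this item: {item}", workers)
    return label
```

From the caller's point of view, `classify` behaves like any other function, which is the sense in which human and machine can both be treated as programmable units.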

Program Optimization:

Crowdsourcing for Software Testing:
Crowdsourcing for Software Verification:
Crowdsourcing for Software Evolution:
Crowdsourcing for Software Documentation:
Crowdsourcing for Software Localisation:
Crowdsourcing for other Software Engineering Activities:

Section 4: Description of current issues, open problems, and opportunities.

*Crowdsourcing: The process of getting work or funding by enlisting the services of a number of people, either paid or unpaid, typically via the internet, i.e. outsourcing the development of the software.

