Refer It Game is a two-player game in which users alternate between generating expressions that refer to objects in images of natural scenes and clicking on the locations of the described objects. The purpose of the game is to crowdsource natural language referring expressions, which are important for research on natural language generation and on dialogue systems such as Apple’s Siri and Amazon Alexa. A two-player game allows referring expressions to be gathered and examined directly within the game, so that experimental evaluations can later be performed on the collected dataset.
Every day, people around the world communicate with each other in various ways, but what we discuss, talk, and debate about mostly concerns the visual world surrounding us. This makes understanding the connection between objects in the physical world and the language describing those objects an important but challenging problem for artificial intelligence (AI).
Artificial intelligence is technology designed to learn and self-improve. Automation, machine learning, and natural language processing are just a few examples of the many functions AI performs for us on a daily basis. This creates a large range of research fields that can profit from a better understanding of how people refer to physical objects in our world.
Recent progress in automatic computer vision techniques has made technologies for perceiving and distinguishing a large number of object categories very promising (Perronnin et al., 2012; Deng et al., 2012; Deng et al., 2010; Krizhevsky et al., 2012). As a result, there has been a surge of recent work that tries to estimate higher-level semantics, including exciting attempts to automatically generate natural language descriptions of images.
Such approaches, however, are often associated with problems: descriptions may be highly task-dependent, open-ended, and difficult to evaluate automatically. This is why we need a different but related approach to the problem of referring expression generation (REG). By creating a freely available online two-player game in which individuals refer to objects in composite images of scenes from the world around us, we enable researchers to retrieve not only referring expressions but also related information. The collected dataset can then be analysed in depth and later evaluated.
2 Literature & Technology Review
Crowdsourcing refers to a sourcing model in which organizations or individuals use contributions from internet users to achieve a set objective. The word was coined in 2005 as a combination of ‘crowd’ and ‘outsourcing’, reflecting the idea that crowdsourcing means outsourcing work to a crowd of people. Crowdsourcing differs from outsourcing in that the work can come from an undefined public rather than a predetermined group. Some of the main benefits of crowdsourcing include improved speed, adaptability, cost, quality, diversity, and versatility (Buettner, 2015).
Crowdsourcing has been highly beneficial for gathering the high-quality gold-standard data used to build automatic systems in natural language processing. Popularized by efforts such as the ESP game (von Ahn and Dabbish, 2004) and Peekaboom (von Ahn et al., 2006), games based on human computation can be a viable approach to engaging users and gathering vast quantities of data inexpensively. Two-player games can likewise automate the verification of human-provided annotations.
2.1.1 Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is an online crowdsourcing marketplace, owned and operated by Amazon, that makes it possible for businesses and individuals to organize the use of human intelligence to carry out tasks that computers currently cannot perform.
Employers can post jobs known as Human Intelligence Tasks (HITs), such as writing descriptions, picking the best among multiple images of a storefront, or identifying performers in music recordings. So-called workers can then search through a large collection of existing jobs and complete them in exchange for a monetary reward set by the employer. Requesting programs place jobs either through an API or through the more limited MTurk Requester website. To submit an order to be completed through the Mechanical Turk platform, a requester has to provide a billing address in one of about 30 approved countries.
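To illustrate, a HIT such as an image-description task could be posted programmatically. The sketch below shows the kind of parameters involved, using Python; the title, reward, question XML, and sandbox endpoint are placeholder values for illustration, and the actual API call (here via the boto3 AWS SDK) is shown only in comments:

```python
# Hypothetical example: assembling parameters for a simple
# image-description HIT. All concrete values are placeholders.

# Sandbox endpoint, so no real payments would be made.
MTURK_SANDBOX = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"

def build_hit_params():
    """Assemble the request parameters for one HIT."""
    # The question layout is defined in XML; the schema URL and
    # body are elided here.
    question_xml = "<QuestionForm>...</QuestionForm>"
    return {
        "Title": "Describe the highlighted object in the image",
        "Description": "Write a short referring expression for one object.",
        "Reward": "0.05",                  # USD per assignment, set by requester
        "MaxAssignments": 3,               # distinct workers per HIT
        "AssignmentDurationInSeconds": 600,
        "LifetimeInSeconds": 86400,        # how long the HIT stays listed
        "Question": question_xml,
    }

# With the boto3 SDK, the HIT would then be created roughly as:
# import boto3
# client = boto3.client("mturk", endpoint_url=MTURK_SANDBOX)
# response = client.create_hit(**build_hit_params())
```

Workers who accept the HIT submit their answers, which the requester retrieves and either approves (triggering payment) or rejects.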
CrowdFlower is a San Francisco based crowdsourcing and data mining company. It provides a software solution through which users obtain access to an online workforce to label, clean, and enrich data; CrowdFlower uses this online workforce to clean up messy and fragmentary data. The majority of CrowdFlower users are data scientists who use the solution to build training models as well as machine learning algorithms.
As soon as data is uploaded into the system, the work is automatically allocated to contributors and is tested against established answers that are hidden within the task (called a “job” in CrowdFlower). The system trusts individuals based on how they perform on these hidden tasks. Contributors may continue working on a particular job as long as they remain trusted; if they lose that trust, they lose the job and their work is disregarded. The judgments of many contributors are collated, and the result is given by the aggregate answer with an associated confidence score, i.e. the contributors’ agreement weighted by the trust of each contributor (http://en.turkcewiki.org/wiki/Crowdflower).
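The trust-weighted aggregation described above can be sketched as follows. This is a minimal illustration of the general idea, not CrowdFlower’s actual implementation; the function name and data format are assumptions:

```python
from collections import defaultdict

def aggregate_judgments(judgments):
    """Trust-weighted aggregation of contributor judgments.

    judgments: list of (answer, trust) pairs, with trust in [0, 1]
               reflecting each contributor's accuracy on hidden test items.
    Returns (winning_answer, confidence), where confidence is the winning
    answer's share of the total trust — agreement weighted by trust.
    """
    weights = defaultdict(float)
    for answer, trust in judgments:
        weights[answer] += trust      # each vote counts as much as its trust
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, weights[best] / total

# Three contributors judge the same item; the two trusted "cat" votes
# outweigh the single "dog" vote.
result = aggregate_judgments([("cat", 0.9), ("cat", 0.8), ("dog", 0.6)])
```

In this toy example the aggregate answer is "cat" with confidence 1.7 / 2.3 ≈ 0.74, because a low-trust dissenting vote contributes little weight.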