Amazon’s Mechanical Turk remains a handy way to get certain digital tasks done cheaply and quickly, and that includes generating massive amounts of spam, according to researchers.
The Mechanical Turk service is a labor marketplace where requesters can post specific tasks for small amounts of money. The Human Intelligence Tasks (HITs) are usually repetitive and can’t easily be done by a machine, such as tagging pictures based on a certain attribute.
A casual glance over the marketplace shows a “tremendous number” of spammy HITs that often try to game social media metrics, according to a blog post by the chief researcher, Panagiotis Ipeirotis, associate professor at New York University’s Stern School of Business and George A. Kellner faculty fellow for Information, Operations, and Management Sciences.
The Mechanical Turk study, conducted in conjunction with NYU academics Dahn Tamir and Priya Kanth, found that 41 percent of all the HITs posted by requesters who joined the marketplace between September and October were spam. The assumption was that long-term requesters were not spammers, but that needed to be verified, said Ipeirotis.
“We could not have been more surprised,” Ipeirotis told eWEEK. The team had begun the project to see if they could identify the “bad guys” and sift out their requests, making the site easier to work with. “We were hoping the exact opposite, that it would be some number like 5 percent so that we could apply filters,” he said.
Spammy HITs include testing the ads in a Website, creating a Twitter account and following the requester, writing a positive review on Yelp, and downloading an app, according to Ipeirotis.
The study looked at 5,842 HITs posted by 1,733 requesters over the two-month period. The “activity patterns” were similar to those of the general requester population, he said.
Since this was a study about Mechanical Turk, it is quite fitting that Ipeirotis and his team turned to Mechanical Turk to identify which of the selected HITs were spam. His HIT gave guidelines on what to identify as spam, including requests to create fake accounts, generate leads, post fake ads, click on ads, bump up SEO by creating fake ratings and reviews, or hand over personal information.
“Interestingly enough, we got a ridiculous amount of spam from the workers” in response to the HIT, he said. The team eventually settled on 11 trusted workers to complete the task.
The analysis also showed that 32 percent of the new requesters posted only spam, Ipeirotis said. Very few accounts posted both spam and legitimate HITs, he said.
Scammers tend to post HITs with higher rewards, and the best paying HITs tend to be spam-related, according to the results. “Perhaps because they do not pay?” wondered Ipeirotis. In terms of pricing, 80 percent of legitimate HITs are priced at less than $1, while only 60 percent of the spam HITs are.
The spam problem is mainly one of image, said Ipeirotis. There are plenty of people with problems that can be solved on Mechanical Turk, but they don’t want their names to appear on a page where they might be placed next to spam. The spam is not preventing people from finding legitimate requests, he said.
For workers, it’s also a matter of trust. There’s a “fundamental level of distrust” when a new requester joins the marketplace because the workers wonder who it is and whether the new member is actually going to pay. “It’s like going to shop online expecting the retailer to scam you,” Ipeirotis said.
Ipeirotis also ran a quick two-minute analysis on a sample of data and found that about 2 to 3 percent of the spam HITs were associated with some form of malware.
That level of distrust is not good for the future growth of the site, he said. He was very disheartened that, when he informed Amazon about his research results, the company said the market will figure it out. Amazon takes a “laissez faire” approach and won’t interfere with the market, he said.
Ipeirotis wondered if Amazon was being hands-off because the company is actually profiting from the spammers, given that the workers are being paid. To find out, he requested access to Amazon’s data, which was denied. Amazon did not respond to eWEEK’s requests for comment on this story.
Ipeirotis has seen the number of spammy HITs grow in the four years he’s been using Mechanical Turk, and that led him to wonder about the magnitude of the problem, he said.