Amazon's Mechanical Turk remains a handy way to get certain digital tasks
done cheaply and quickly, and that includes generating massive amounts of spam,
according to researchers.
The Mechanical
Turk service is a labor marketplace where requesters can post specific
tasks for small amounts of money. The Human Intelligence Tasks (HITs) are
usually repetitive and can't easily be done by a machine, such as tagging
pictures based on a certain attribute.
A casual glance over the marketplace shows a "tremendous number"
of spammy HITs that often try to game social media metrics, according to a blog
post by the chief researcher, Panagiotis Ipeirotis, associate professor at New
York University's Stern School of Business and George A. Kellner faculty fellow
for Information, Operations, and Management Sciences.
The Mechanical Turk study, conducted in conjunction with NYU academics Dahn
Tamir and Priya Kanth, found that 41 percent of all the HITs posted by
requesters who joined the marketplace between September and October were spam.
The assumption was that long-term requesters were not spammers, but that needed
to be verified, said Ipeirotis.
"We could not have been more surprised," Ipeirotis told eWEEK. The
team had begun the project to see whether it could identify the "bad
guys" and filter out their requests, making the site easier to work with.
"We were hoping the exact opposite, that it would be some number like 5 percent
so that we could apply filters," he said.
Spammy HITs include testing the ads in a Website, creating a Twitter account
and following the requester, writing a positive review on Yelp, and downloading
an app, according to Ipeirotis.
The study looked at 5,842 HITs posted by 1,733 requesters over the two-month
period. The "activity patterns" were similar to those of the general
requester population, he said.
Since this was a study about Mechanical Turk, it is quite fitting that
Ipeirotis and his team turned to Mechanical Turk to identify which of the
selected HITs were spam. His HIT gave guidelines on what to identify as spam:
requests to create fake accounts, generate leads, post fake ads, click on ads,
or bump up SEO by creating fake ratings and reviews, as well as requests for
personal information.
"Interestingly enough, we got a ridiculous amount of spam from the
workers" in response to the HIT, he
said, before settling on 11 trusted workers to complete the task.
The analysis also showed that 32 percent of the new requesters posted only
spam, Ipeirotis said. Very few accounts posted both spam and legitimate HITs,
he said.
Scammers tend to post HITs with higher rewards, and the best paying HITs
tend to be spam-related, according to the results. "Perhaps because they
do not pay?" wondered Ipeirotis. In terms of pricing, 80 percent of
legitimate HITs are priced at less than $1, while only 60 percent of the spam
HITs are.
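Those percentages can be combined to estimate how much of the higher-paying end of the sample is spam. The short Python sketch below applies Bayes' rule under one assumption the article does not state outright: that the 41 percent spam share and the 80 percent and 60 percent pricing figures all describe the same sample of HITs. The function name is illustrative only and does not come from the study.

```python
def spam_share_at_one_dollar_or_more(p_spam, p_cheap_given_spam, p_cheap_given_legit):
    """Estimate P(spam | price >= $1) with Bayes' rule.

    p_spam              -- share of all HITs that are spam (0.41 in the article)
    p_cheap_given_spam  -- share of spam HITs priced under $1 (0.60)
    p_cheap_given_legit -- share of legitimate HITs priced under $1 (0.80)
    Assumption: all three figures describe the same sample of HITs.
    """
    p_legit = 1.0 - p_spam
    # Fraction of all HITs that are spam AND priced at $1 or more.
    spam_expensive = p_spam * (1.0 - p_cheap_given_spam)
    # Fraction of all HITs that are legitimate AND priced at $1 or more.
    legit_expensive = p_legit * (1.0 - p_cheap_given_legit)
    return spam_expensive / (spam_expensive + legit_expensive)


share = spam_share_at_one_dollar_or_more(0.41, 0.60, 0.80)
print(f"Estimated spam share among HITs priced at $1 or more: {share:.0%}")
```

Under that assumption, a little under three-fifths of the HITs priced at $1 or more would be spam, compared with 41 percent of the sample overall, which is consistent with the finding that the best-paying HITs tend to be spam-related.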
The spam problem is mainly one of image, said Ipeirotis. There are plenty of
people with problems that can be solved on Mechanical Turk, but they don't want
their names on a page where they might appear next to spam. The spam is not
preventing people from finding legitimate requests, he said.
For workers, it's also a matter of trust. There's a "fundamental level
of distrust" when a new requester joins the marketplace because the
workers wonder who it is and whether the new member is actually going to pay.
"It's like going to shop online expecting the retailer to scam you,"
Ipeirotis said.
Some of that wariness may be warranted. Ipeirotis ran a quick two-minute
analysis on a sample of data and found that about 2 to 3 percent of the spam
HITs were associated with some form of malware.
That level of distrust is not good for the future
growth of the site, he said. He was very disheartened that, when he
informed Amazon about his research results, the company said the market will
figure it out. Amazon takes a "laissez faire" approach and won't
interfere with the market, he said.
Ipeirotis wondered whether Amazon was being hands-off because it actually
profits from the spammers as long as the workers are being paid. To find out, he
requested access to Amazon's data, which was denied. Amazon did not respond to
eWEEK's requests for comment on this story.
Ipeirotis said he has seen the number of spammy HITs grow in the four years he
has been using Mechanical Turk, which led him to wonder about the magnitude of
the problem.