While most of us dread being the recipient of a robocall, Aude Marzuoli actually looks to attract and collect fraudulent calls with her robocall honeypot, also known as the phoneypot. Marzuoli, a data scientist at Pindrop Security, first provided details about the phoneypot and a sample of 100,000 calls it collected in the first half of 2016 during a session at the Black Hat USA security conference last week.
In an interview with eWEEK, Marzuoli provided additional insight into her study and the results it found.
“We suspected that out of all the phone scams that hit consumers, there would be some infrastructure behind it,” Marzuoli told eWEEK.
What Marzuoli didn’t know before conducting the study was how much, or little, infrastructure it takes to place 100,000 calls. As it turns out, more than half (51 percent) of the calls the phoneypot recorded could be attributed to only 38 distinct telephony infrastructures. Marzuoli defines a telephony infrastructure as a grouping of phone numbers and back-end call centers operated by a phone fraud group. Pindrop’s technology platform provides a voice fingerprinting capability that was used to help analyze recorded calls from the phoneypot.
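To make the 51 percent figure concrete, here is a minimal Python sketch of how such a share could be computed once each recorded call has been assigned to an infrastructure. It is an illustration only, not Pindrop's method: the function name `coverage_by_top_groups`, the labels and the toy data are all hypothetical, and the voice fingerprinting step that would produce the labels is assumed to have already happened.

```python
from collections import Counter

def coverage_by_top_groups(call_labels, top_k):
    """Return the fraction of all calls placed by the top_k largest groups.

    call_labels holds one infrastructure label per call; None marks a call
    that could not be attributed to any group (hypothetical convention).
    """
    if not call_labels:
        return 0.0
    # Count calls per infrastructure, ignoring unattributed calls.
    counts = Counter(label for label in call_labels if label is not None)
    # Sum the call volume of the top_k largest infrastructures.
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(call_labels)

# Toy data: 10 calls, three hypothetical infrastructures, 4 unattributed.
labels = ["A", "A", "A", "B", "B", None, None, "C", None, None]
share = coverage_by_top_groups(labels, top_k=2)
print(f"Top 2 groups account for {share:.0%} of calls")
```

In the study's terms, the labels would be the telephony infrastructures and `top_k` would be 38.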
The people and organizations behind phone scams aren’t just an annoyance to consumers; they’re also part of the wider cyber-security challenge, Marzuoli said. If an individual divulges personal information over the phone to an attacker, the attacker will use that information to impersonate the individual in other places, including financial services transactions.
In conducting the phoneypot research, Marzuoli and Pindrop faced a number of challenges, including making sure that attackers didn’t know which phone numbers were owned by Pindrop. The study looked at a sample of 100,000 out of 1 million calls received by Pindrop between February and June 2016.
Among the surprising findings from the phoneypot study was that the numbers that called weren’t necessarily the same as those that consumers have complained about in various online forums.
“What I found is that among the phone numbers that are responsible for two-thirds of all online complaints, they only represented 2 percent of numbers calling our honeypot,” Marzuoli said. “Meaning that people online only complain about the very frequent callers, but they are really only a small sample of all the bad phone numbers out there that are spamming people.”
In addition, most phone numbers only show up once or twice, which makes many forms of traditional analytics and machine learning ineffective at fully understanding what is going on with robocalling, she said. That’s why the researchers took the additional step of actually recording the 100,000 calls: to analyze the voices and content of the robocalls and look for additional patterns.
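The observation that most numbers appear only once or twice can be checked with a simple frequency count. The Python sketch below is a hypothetical illustration, not code from the study: the function name `share_of_rare_callers`, the `max_calls` threshold and the sample numbers are all assumptions made for the example.

```python
from collections import Counter

def share_of_rare_callers(caller_numbers, max_calls=2):
    """Return the fraction of distinct numbers seen at most max_calls times."""
    per_number = Counter(caller_numbers)  # calls received per source number
    if not per_number:
        return 0.0
    rare = sum(1 for n in per_number.values() if n <= max_calls)
    return rare / len(per_number)

# Toy call log: one entry per incoming call (fictional 555 numbers).
calls = [
    "555-0100",
    "555-0101", "555-0101",
    "555-0102",
    "555-0103", "555-0103", "555-0103", "555-0103",
]
rare_share = share_of_rare_callers(calls)
print(f"{rare_share:.0%} of numbers called at most twice")
```

When that share is high, per-number statistics carry little signal, which is why grouping calls by voice and content becomes more useful than counting calls per number.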
Among the different phone fraud campaigns detected in the phoneypot are ones related to Google search engine optimization (SEO) as well as attackers claiming to be from the Internal Revenue Service.
At this point, Marzuoli is not providing full attribution on the worst phone fraud offenders. She added that a call from any given phone number could transit multiple carriers, making it challenging and expensive to fully trace a single number back to its origin.
“Instead of just looking for one number though, we’re looking for a group of say a hundred numbers that look unrelated but we know come from the same source,” Marzuoli said. “The problem then of finding the individual or people behind the calls become easier as there is a much bigger data set and a more reliable source of information.”
Sean Michael Kerner is a senior editor at eWEEK and InternetNews.com. Follow him on Twitter @TechJournalist.