Step No. 1: Understand the Data
Even though the operations team has been looking at the CDR data for years, this is their first foray into deeper analytics. They have to begin by taking a closer look at their data and determining which attributes may be useful. They suspect fraud is being committed using specific phone numbers distributed across regions, so they start there.
A data profiling tool is extremely useful at this point. A good profiling tool lets the operations team begin to discover trends in their data. Because the team has used only a limited set of CDR fields until now, they need to verify the quality of the values in the fields targeted for the fraud detection project.
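To make the idea of profiling concrete, here is a minimal sketch of the kind of per-field summary a profiling tool produces. It is not any particular product's output. The record layout, the field names (caller, region, duration_sec), and the sample values are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical CDR records; the field names and values are invented for
# illustration and do not come from a real CDR layout.
SAMPLE_CDRS = [
    {"caller": "555-0101", "region": "north", "duration_sec": "62"},
    {"caller": "555-0101", "region": "south", "duration_sec": "15"},
    {"caller": "", "region": "north", "duration_sec": "240"},
    {"caller": "555-0199", "region": None, "duration_sec": "abc"},
]


def profile_field(records, field):
    """Return basic quality statistics for one field across all records."""
    values = [record.get(field) for record in records]
    # Treat None and the empty string as missing values.
    present = [value for value in values if value not in (None, "")]
    counts = Counter(present)
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


def profile(records):
    """Profile every field that appears in at least one record."""
    fields = sorted({key for record in records for key in record})
    return {field: profile_field(records, field) for field in fields}


if __name__ == "__main__":
    for field, stats in profile(SAMPLE_CDRS).items():
        print(field, stats)
```

Even a summary this small surfaces the questions the team needs answered before relying on a field: how often it is empty, how many distinct values it takes, and whether one value dominates. A real profiling tool would add format and range checks, which would catch entries such as the non-numeric duration in the sample.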
Along with a good profiling tool, the operations team also needs a data analytics tool. Besides profiling the data for discovery, they will want to experiment with different aggregations of the data. For example, aggregating the data by region and phone number may show that certain phone numbers are used much more than others. This in itself does not mean their use is fraudulent, but it may help support the team's suspicions.
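The aggregation described above can be sketched in a few lines. This is an illustrative example only, assuming each CDR has already been reduced to a (region, phone number) pair. The sample data and the function name aggregate_calls are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (region, phone number) pairs extracted from CDRs.
SAMPLE_CALLS = [
    ("north", "555-0101"),
    ("north", "555-0101"),
    ("south", "555-0101"),
    ("east", "555-0101"),
    ("north", "555-0142"),
    ("south", "555-0177"),
]


def aggregate_calls(calls):
    """Count calls per (region, number) and collect the regions per number."""
    calls_per_pair = Counter()
    regions_per_number = defaultdict(set)
    for region, number in calls:
        calls_per_pair[(region, number)] += 1
        regions_per_number[number].add(region)
    return calls_per_pair, regions_per_number


if __name__ == "__main__":
    calls_per_pair, regions_per_number = aggregate_calls(SAMPLE_CALLS)
    # Most heavily used (region, number) combinations first.
    for (region, number), count in calls_per_pair.most_common():
        print(region, number, count)
    # Numbers seen in more than one region.
    for number, regions in sorted(regions_per_number.items()):
        if len(regions) > 1:
            print(number, "appears in", len(regions), "regions")
```

The second result, the set of regions per number, speaks directly to the team's suspicion that specific numbers are spread across regions. As with the raw counts, a high value is a lead for further investigation rather than evidence of fraud.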