Using R to fight fraud and cybercrime!

By andy

Online fraud and cybercrime are hot topics these days. Investigating them involves huge amounts of data derived from internal and external sources, multiple agencies, and multiple formats and tools. Investigators manage terabytes of data per case, and extra terabytes are added and curated monthly!

Optimising investigations requires modern tools that are supercharged by smart data. They need to deliver speedy success, even on a very large scale.

So why do investigators need speed? The most successful police and local government officers are generating unique “data experiences” and user/victim journeys in real time. They are constantly collecting digital evidence across scores of cases, based on ever-changing circumstances and available data. As in financial trading models, smart data analytics to combat criminals must be done fast, to seize opportunities that are here today and gone in just a few seconds.

So if speed is important, why over-complicate it by using a Smart Data solution? Simply put, the results generated by smart data analytics leapfrog traditional analytics for two primary reasons:

1. Firstly, analysts can increase the number of attributes or variables considered in machine learning algorithms (e.g. geodata, social and transactional data). New data types may contain unique, unexpected attributes that, when included in predictive modelling, can offer novel insights.

2. Secondly, looking at more data over a longer period (e.g. 3 months rather than 3 weeks) allows analysts and investigators to uncover trends or outliers that might not have been apparent in a shorter window.
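To make the second point concrete, here is a minimal sketch in R. Everything in it is an assumption for illustration: the daily figures are made up, the `robust_z` and `flag_outliers` helpers are hypothetical names, and the 3.5 cut-off is just one possible threshold. It compares what a 3-week window and a 3-month window reveal about the same account.

```r
# Hypothetical daily transaction totals for one account over 90 days.
# Days 1-69 hover around 100; days 70-90 sit around 160 (a sustained shift).
days   <- 1:90
wobble <- rep(c(-4, -2, 0, 2, 4, 1, -1), length.out = 90)  # deterministic noise
amount <- ifelse(days >= 70, 160, 100) + wobble

# Robust z-score: distance from the median, scaled by the MAD
# (median absolute deviation, as computed by stats::mad).
robust_z <- function(x) (x - median(x)) / mad(x)

# Return the positions, within the trailing window, of values whose
# robust z-score exceeds the (illustrative) threshold.
flag_outliers <- function(x, window, threshold = 3.5) {
  recent <- tail(x, window)
  which(abs(robust_z(recent)) > threshold)
}

short_flags <- flag_outliers(amount, window = 21)  # roughly 3 weeks
long_flags  <- flag_outliers(amount, window = 90)  # roughly 3 months

length(short_flags)  # days flagged using only the last 3 weeks
length(long_flags)   # days flagged using the full 3 months
```

In this made-up series the last three weeks look perfectly normal when viewed on their own, because every day in the window is equally elevated. Only when they are set against the previous two months does the shift stand out.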

So why use R? Smart Data strategies, when connected to advanced analytics tools using R, could be a step change for UK Police, DWP, HMRC, NHS Protect and local government investigators. Smart Data and R could become the primary tools for developers and users supporting investigations, and a platform for using advanced analytics and cognitive solutions to combat fraud and cybercrime.

Strong interest in R through 2015/16 has led to a busy open source community. UK law enforcement agencies can benefit from this when collaborating on fraud prevention and investigation. It will allow seamless sharing of models, algorithms and coding best practice. It will present opportunities for flexibility and adaptability in shared standards and platforms for evidential data.

Many analytics software vendors offer some sort of integration with R that makes it easier to use. Users can then develop preset algorithms and, where those presets have proved successful, circulate them to other agencies, which can simply connect them to their own bespoke smart data strategy boards.

This can only help to improve the chances of a successful outcome (after all, that’s what we all want). Imagine improved prediction, classification, segmentation, forecasting, and sequence/association/geo-data pattern discovery and detection, all applied to improve results. Imagine reducing cost at a time when UK Police Forces and Law Enforcement Agencies are seeing budgets slashed and have limited technical resources.

Now let’s summarise what we’re currently faced with, and what R and smart data can achieve:

  • Vast quantities of data!
  • More and more complex investigations!
  • UK Police Forces need a workable analytics solution to improve predictions and success rates!
  • R and Smart Data are the answer!

I’d be interested to hear your views on this and how your organisation is integrating smart data along with R and any other open source code. Drop me a line with any comments. I look forward to hearing from you!