Outliers

What is Anomaly Detection?

Anomaly detection identifies data points that do not fit the expected pattern of a given group. It has a wide range of applications, such as fraud detection, surveillance, diagnosis, data cleanup, and predictive maintenance. With tens of thousands of systems and data metrics that change from one minute to another, and with the many elements involved in detecting such outliers, the task becomes exponentially more difficult and tends to be "humanly impossible."

When do we have anomalies?

Anomalies are rare under most conditions. Even when the data is available, there are often only a few dozen anomalies among millions of regular data points. We define anomalies as at most 2% of each dimension of the provided data. For example, suppose we have customer credit card data of 1,000 records, and we track the transaction amount, transaction count per month, transaction date, and client age. That gives four factors, and each factor should contribute at most 2% anomalous points. We should therefore have at most 80 anomalous points (2% × 4 factors × 1,000 records). Within each dimension, unusual behavior should not exceed 2% of the data.
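The arithmetic in the example above can be checked with a short sketch. The function name `anomaly_budget` and its parameters are illustrative assumptions, not part of any DataCave product.

```python
def anomaly_budget(n_records: int, n_factors: int, rate_per_factor: float = 0.02) -> int:
    """Upper bound on flagged points under the 2%-per-factor rule of thumb.

    Hypothetical helper: it multiplies the record count by the number of
    tracked factors and by the per-factor anomaly rate.
    """
    return round(n_records * n_factors * rate_per_factor)


# The credit card example: 1,000 records and four tracked factors.
budget = anomaly_budget(n_records=1000, n_factors=4)
print(budget)  # 80
```

Note that the bound counts flagged values per factor, so a single record that is unusual in two factors is counted twice.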

Insurance

The insurance business is based on data. Historical and operational data shape underwriting, claim management, and/or reinsurance agreements. The presence of big data in the insurance industry gives rise to anomalies in many fields:

  • Claims

    Most insurance claim fields are correlated: the claim amount, outstanding amount, deductibles, discount, co-payment, and paid amount. These dimensions are correlated in amount and related to claim frequency. Our anomaly detection solution detects any amount that does not fit the company, policy, or insured patterns. Figure 1 below shows a case where the contractual discount and paid amount do not fit the regular paid amount and contractual discount found in the company's policy and history.
  • Providers

    Providers control the major share of claim cost in the insurance industry. Medical and motor providers vary in their pricing tables; however, the same procedure or spare part replacement should not differ in the amount paid by the insurance company, regardless of which provider delivers the service. Our anomaly detection solution provides automatic control over provider bills and procedures. Moreover, we provide an automatic mechanism that detects high-frequency procedure, treatment, or third-party provider correlations with the insurer and/or with the provider network, and marks them as possible fraud or anomalies.
  • Repair Parts and Claim Estimate

    Claim estimates and repair part data are correlated when they involve the same damaged parts, car model, estimated labor cost, and replaced parts. Anomaly detection flags under- or overestimated claim amounts.
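As an illustration of the provider check described in the list above, the sketch below compares what different providers were paid for the same procedure and flags bills that sit far from the typical amount. It uses a modified z-score based on the median and the median absolute deviation, which is one common robust technique and not necessarily the method used in the solution described here. The function name, the 3.5 threshold, and the sample bills are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def flag_price_outliers(bills, threshold=3.5):
    """Flag bills whose paid amount is far from the median for that procedure.

    bills: iterable of (provider, procedure, paid_amount) tuples.
    Score used: 0.6745 * |amount - median| / MAD (modified z-score).
    """
    bills = list(bills)
    by_procedure = defaultdict(list)
    for _, procedure, paid in bills:
        by_procedure[procedure].append(paid)

    flagged = []
    for provider, procedure, paid in bills:
        amounts = by_procedure[procedure]
        med = median(amounts)
        mad = median(abs(a - med) for a in amounts)
        if mad == 0:
            continue  # no spread for this procedure, nothing to compare against
        score = 0.6745 * abs(paid - med) / mad
        if score > threshold:
            flagged.append((provider, procedure, paid))
    return flagged


# Made-up bills: six providers charging for the same procedure.
sample_bills = [
    ("A", "x-ray", 100), ("B", "x-ray", 105), ("C", "x-ray", 98),
    ("D", "x-ray", 102), ("E", "x-ray", 110), ("F", "x-ray", 400),
]
flagged = flag_price_outliers(sample_bills)
print(flagged)
```

The median-based score is chosen here because a single inflated bill barely moves the median, whereas it would pull a mean-based baseline toward itself.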

Medical Anomaly
Fig. 1 - Anomaly: high reported and paid amounts, while the contractual discount is low

Banks

  • Credit Risk

    The Credit Approval Form, along with any supporting documentation (appraisal, financial statements, contracts, published information, independent credit reporting agencies, etc.), is used by a Loan Committee or Credit Committee to reach a decision on granting credit to the applicant. This information should be correlated with the credit limit. Anomaly detection should detect any weakness in the application data that does not comply with the bank policy.
  • Credit Card Fraud

    Credit card data consists of 'normal' and 'risky' transactions. Risky transactions are assumed to be anomalous and dissimilar from the normal data. We use the number of transactions over a period of time, the withdrawal amounts, and the weekend versus weekday transaction pattern to separate normal from risky transactions.
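The three signals named for credit card fraud (transaction count over a period, withdrawal amounts, and the weekend versus weekday pattern) could be derived from raw transactions roughly as follows. This is a minimal sketch: the function name `card_features`, the output keys, and the choice of Saturday and Sunday as the weekend are assumptions, and the weekend days should be adjusted per region.

```python
from datetime import date


def card_features(transactions, weekend_days=(5, 6)):
    """Summarize one card's transactions for a period.

    transactions: list of (date, amount) tuples.
    weekend_days: weekday numbers treated as weekend (Monday is 0),
                  assumed here to be Saturday and Sunday.
    """
    count = len(transactions)
    total = sum(amount for _, amount in transactions)
    weekend = sum(1 for day, _ in transactions if day.weekday() in weekend_days)
    return {
        "count": count,
        "total_amount": total,
        "weekend_share": weekend / count if count else 0.0,
    }


# Made-up transactions for a single card.
features = card_features([
    (date(2024, 1, 1), 50.0),   # Monday
    (date(2024, 1, 6), 200.0),  # Saturday
    (date(2024, 1, 7), 150.0),  # Sunday
    (date(2024, 1, 8), 100.0),  # Monday
])
print(features)
```

Features like these would then be compared against the card's or the portfolio's usual values to separate normal from risky behavior.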

How do we detect anomalies?

At DataCave, we collect your data, analyze and engineer it, then target the required anomaly class and find the correlations and causal relationships in the data. We prepare an initial design of each variable's weight and contribution to the target class. A first offline model is presented in order to test and tune the anomaly detection algorithm. After initial confirmation, we build the algorithm to detect anomalies automatically and deliver the results as an offline report. In the final stage, we connect the anomaly detection solution to the client database to provide online access (intranet or cloud) and automatically detect anomalies in newly entered data.
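The offline-then-online flow described above can be pictured with a small sketch: variable weights are set first, a model is fitted on historical data offline, and new entries are then scored as they arrive. The class name `WeightedAnomalyScorer`, the weighted z-score formula, and the sample values are assumptions chosen for illustration and do not represent DataCave's actual algorithm.

```python
from statistics import mean, pstdev


class WeightedAnomalyScorer:
    """Hypothetical scorer: weighted average of per-variable z-scores."""

    def __init__(self, weights):
        self.weights = weights  # variable name -> contribution weight
        self.stats = {}         # variable name -> (mean, standard deviation)

    def fit(self, records):
        """Offline step: learn each variable's mean and spread from history."""
        for name in self.weights:
            values = [r[name] for r in records]
            self.stats[name] = (mean(values), pstdev(values))
        return self

    def score(self, record):
        """Online step: score one new entry; higher means more unusual."""
        total = 0.0
        for name, weight in self.weights.items():
            mu, sigma = self.stats[name]
            z = abs(record[name] - mu) / sigma if sigma > 0 else 0.0
            total += weight * z
        return total / sum(self.weights.values())


# Made-up history of monthly card activity.
history = [
    {"amount": 100, "count": 4}, {"amount": 110, "count": 5},
    {"amount": 90, "count": 6}, {"amount": 105, "count": 5},
    {"amount": 95, "count": 5},
]
scorer = WeightedAnomalyScorer({"amount": 0.7, "count": 0.3}).fit(history)
typical_score = scorer.score({"amount": 100, "count": 5})
unusual_score = scorer.score({"amount": 500, "count": 20})
print(typical_score, unusual_score)
```

Tuning the weights and the cut-off score against a reviewed offline report corresponds to the test-and-tune stage before the scorer is connected to live data.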
