Methodology

This section describes the basic ideas and notation that Deacto uses to perform actionable analytics. For more information, refer to the notebook that provides a basic illustration of Deacto on a real-world dataset.

Data

The input is a dataset describing a collection of business objects whose data the company wants to use to its advantage. The dataset contains \(N\) instances. Each instance represents one business object and consists of a unique identifier, \(M\) input features, and a target feature (\(y\)).

Mathematically,

\[data = \left( \langle X_{1},y_{1} \rangle, \ldots, \langle X_{N},y_{N} \rangle \right)\]

where:

  • \(F = \left\{ F_{1},\ldots,F_{m},\ldots,F_{M} \right\}\) – the input feature set

  • \(D(F_{m})\) – the domain of input feature \(m\)

  • \(X_{n} \in D(F_{1}) \times \ldots \times D(F_{m}) \times \ldots \times D(F_{M})\) – instance \(n\), defined on the set of possible instances

  • \(x(m,n)\) – the value of feature \(m\) in instance \(n\)

  • \(y_{n} \in D(y)\) – the value of the target feature in instance \(n\); when two values are possible, \(|D(y)| = 2\)
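The notation above can be mirrored in a small data structure. The following is only a sketch: the `Instance` class, the helper `x`, the feature `tenure_years`, and all values are hypothetical and chosen for illustration; they are not part of Deacto's API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Instance:
    object_id: str              # unique identifier of the business object
    features: Dict[str, Any]    # X_n: feature name -> value x(m, n)
    target: str                 # y_n, an element of D(y)


# Hypothetical toy dataset: N = 3 instances, M = 2 input features.
data: List[Instance] = [
    Instance("c1", {"discount_A_flg": 0, "tenure_years": 2}, "Leave"),
    Instance("c2", {"discount_A_flg": 1, "tenure_years": 5}, "Stay"),
    Instance("c3", {"discount_A_flg": 0, "tenure_years": 1}, "Stay"),
]


def x(m: str, n: int) -> Any:
    """Return the value of feature m in instance n (1-based, as in the notation)."""
    return data[n - 1].features[m]


N = len(data)
M = len(data[0].features)
target_domain = {instance.target for instance in data}  # observed D(y)
```

In this toy dataset `target_domain` contains two values, which corresponds to the case \(|D(y)| = 2\).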

Business Object Value

The target feature value in the data represents the state of a business object. Each state is mapped to the potential value of the object for the company.

Mathematically,

\[BOS = \left\{ {BOS}_{1},\ldots,{BOS}_{i},\ldots,{BOS}_{n\_ state} \right\}\]
\[\text{where}\ {BOS}_{i} \in D(y),\ \forall i\]

\({BOV}_{i} = \left\{ {BOS}_{i} : value \right\}\)

\(\text{where}\ {BOV}_{i} \in \mathbb{R}\), \(\forall i\)

For example, in the customer retention domain, defining the value requires mapping each outcome (the customer stays or leaves) to the monetary value the company gains in that case.

One simple example of such a function is as follows:

\[BOV : \{ Stay : 100,\ Leave : 0 \}\]

Here the value for “Stay” is 100, calculated as the sum of potential profits from the customer if they stay with the company. If the customer leaves, the benefit is 0.
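As a minimal sketch, the value mapping from this example can be written as a plain dictionary. The function name `business_object_value` is hypothetical and is used only for illustration.

```python
# Business object values (BOV) for each state, taken from the example above.
BOV = {"Stay": 100.0, "Leave": 0.0}


def business_object_value(state: str) -> float:
    """Look up the value of a business object state; reject unknown states."""
    if state not in BOV:
        raise ValueError(f"Unknown state: {state!r}")
    return BOV[state]
```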

Actions

The aim is to identify a set of actions that can move business objects from a less valuable state to a more valuable one while accounting for potential costs, thereby optimizing expected utility. The subset of input features that are under the company's control and can be modified to formulate an action is called the actionable feature set (AFS); the subset of input features that are outside its control and cannot be changed is called the non-actionable feature set (NAFS).

An action is defined for a given feature \({AFS}_{i\_ afs}\) in \(AFS\) as changing that feature from a source value to a target value, where both values belong to the feature domain \(D({AFS}_{i\_ afs})\).

Mathematically,

\[AFS = \left\{ {AFS}_{1},\ldots,{AFS}_{i\_ afs},\ldots,{AFS}_{n\_ afs} \right\}\]
\[{AFS}_{i\_ afs} \in F,\ \forall i\_ afs\]
\[NAFS = \left\{ {NAFS}_{1},\ldots,{NAFS}_{i\_ nafs},\ldots,{NAFS}_{n\_ nafs} \right\}\]
\[{NAFS}_{i\_ nafs} \in F,\ \forall i\_ nafs\]
\[Actions = \left\{ A_{1},\ldots,A_{i\_ action},\ldots,A_{n\_ action} \right\}\]
\[A_{i\_ action} = \{ {AFS}_{i\_ afs} : src \rightarrow tgt \}\]
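The definitions above can be illustrated with a short sketch. The `Action` class, the `apply_action` helper, and the feature `tenure_years` are hypothetical. The sketch also assumes that NAFS is simply the complement of AFS within \(F\), and that an action leaves an instance unchanged when its current value differs from the source value.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical feature sets: F is split into actionable and non-actionable features.
F = ["discount_A_flg", "tenure_years"]
AFS = ["discount_A_flg"]
NAFS = [feature for feature in F if feature not in AFS]


@dataclass(frozen=True)
class Action:
    feature: str  # the actionable feature AFS_{i_afs}
    src: Any      # source value in D(AFS_{i_afs})
    tgt: Any      # target value in D(AFS_{i_afs})


def apply_action(features: Dict[str, Any], action: Action) -> Dict[str, Any]:
    """Return a copy of the instance features with the action applied.

    The change is made only when the current value equals the action's
    source value (an assumption of this sketch).
    """
    if action.feature not in AFS:
        raise ValueError(f"{action.feature!r} is not an actionable feature")
    updated = dict(features)
    if updated[action.feature] == action.src:
        updated[action.feature] = action.tgt
    return updated


grant_discount = Action(feature="discount_A_flg", src=0, tgt=1)
customer = {"discount_A_flg": 0, "tenure_years": 2}
changed = apply_action(customer, grant_discount)
```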

Costs

Costs are represented by a set of functions, one per actionable feature, that map each possible change of that feature to its monetary cost for a single business object.

Mathematically,

\[{Cost}_{i\_ afs} = \{ (src,tgt) : cost,\ \ src,tgt \in D({AFS}_{i\_ afs}) \}\]
\[Costs = \left\{ {Cost}_{1},\ldots,{Cost}_{i\_ afs},\ldots,{Cost}_{n\_ afs} \right\}\]

For example, in the customer retention domain, the data can contain information about different discounts and how they affect customer retention. Giving a discount has a cost for the business but can make the retention effort more effective, bringing a benefit from customers who stay with the company.

To consider granting a discount of type A (discount_A_flg), we would formulate the cost function as:

\[{Cost}_{discount\_ A\_ flg} = \{ (0,1) : 20,\ (1,0) : +\infty \}\]

This cost function assigns a cost of 20 when the algorithm considers granting the discount to a customer. Since a discount cannot be cancelled, the cost function maps the reverse change to an infinite cost.

The cost of an action for a given business object \(n\):

\[Cost\left( A_{i\_ action},n \right) = {Cost}_{i\_ afs}\left( x({AFS}_{i\_ afs},n),\ {AFS}_{i\_ afs}(tgt) \right)\]

The overall cost of an action:

\[Cost\left( A_{i\_ action} \right) = \sum_{n = 1}^{N} Cost\left( A_{i\_ action},n \right)\]
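The per-object and overall cost formulas can be sketched as follows, using the discount cost function from the example. The function names are hypothetical. The sketch also assumes that an object whose current value already equals the target incurs no cost, since the example cost function does not define such pairs.

```python
import math
from typing import Any, Dict, Iterable, Tuple

CostFunction = Dict[Tuple[Any, Any], float]

# Cost function for discount_A_flg from the example above:
# granting the discount costs 20, cancelling it is impossible.
cost_discount_A_flg: CostFunction = {(0, 1): 20.0, (1, 0): math.inf}


def action_cost(cost_fn: CostFunction, current_value: Any, tgt: Any) -> float:
    """Cost of moving one business object from its current value to tgt."""
    if current_value == tgt:
        return 0.0  # assumption: no change means no cost
    return cost_fn[(current_value, tgt)]


def overall_cost(cost_fn: CostFunction, current_values: Iterable[Any], tgt: Any) -> float:
    """Sum of per-object costs over all instances."""
    return sum(action_cost(cost_fn, value, tgt) for value in current_values)


# Hypothetical values of discount_A_flg for three customers.
total = overall_cost(cost_discount_A_flg, [0, 0, 1], tgt=1)
```

With these hypothetical values, two customers receive the discount and one already has it, so `total` is the cost of two grants.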

Utility

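Utility represents the monetary value of the expected added benefit of a set of actions, taking both benefits and costs into account.

For a given action and business object \(n\):

\[Utility\left( A_{i\_ action},n \right) = BOV(c^{*}) \cdot \mathrm{\Delta}Prob(c^{*},n) - Cost\left( A_{i\_ action},n \right)\]

where \(c^{*}\) is the desired target state and \(\mathrm{\Delta}Prob(c^{*},n)\) is the change in the probability that object \(n\) reaches \(c^{*}\) as a result of the action.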

For example, with the value and cost defined above, assume that granting a discount to customer \(n\) increases the probability of \(c^{*} = Stay\) from 0.4 to 0.8, and that this is the only action we consider. The expected utility is as follows:

\[Utility(discount\_ A\_ flg,n) = 100 \cdot (0.8 - 0.4) - 20 = 20\]
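The same calculation can be written as a small function. This is a sketch only: the function name and signature are hypothetical, and the probabilities before and after the action are taken as given inputs, because this section does not describe how they are estimated.

```python
import math


def utility(bov_target: float, prob_before: float, prob_after: float, cost: float) -> float:
    """Expected utility of one action for one business object.

    bov_target  -- BOV(c*), the value of the desired state
    prob_before -- probability of c* without the action
    prob_after  -- probability of c* with the action
    cost        -- Cost(A, n), the cost of the action for this object
    """
    delta_prob = prob_after - prob_before
    return bov_target * delta_prob - cost


# Values from the discount example above.
u = utility(bov_target=100.0, prob_before=0.4, prob_after=0.8, cost=20.0)
assert math.isclose(u, 20.0)
```

If the cost were infinite, as for cancelling a discount, the utility would be negative infinity and the action would never be worthwhile.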

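The overall utility of an action is the sum of its per-object utilities:

\[Utility\left( A_{i\_ action} \right) = \sum_{n = 1}^{N} Utility\left( A_{i\_ action},n \right) = \left( \sum_{n = 1}^{N} BOV(c^{*}) \cdot \mathrm{\Delta}Prob(c^{*},n) \right) - Cost\left( A_{i\_ action} \right)\]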

Ultimately, our goal is to discover the set of actions that yields the maximum overall utility for the business:

\[Utility\left( {Actions}^{*} \right) = \sum_{{i\_ action}^{*} = 1}^{{n\_ action}^{*}} Utility\left( A_{{i\_ action}^{*}} \right)\]

\({Actions}^{*} = \left\{ A_{1},\ldots,A_{{i\_ action}^{*}},\ldots,A_{{n\_ action}^{*}} \right\}\)

\(\text{where}\ A_{{i\_ action}^{*}} \in Actions\), \(\forall\, {i\_ action}^{*}\)

This optimal set of actions is presented to the user, with each action described in detail, evaluated, and ranked.
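To make the selection step concrete, here is a minimal sketch. It assumes that action utilities are additive and independent of each other, as in the sum above, in which case the utility-maximizing subset consists of every action with positive overall utility. The function names and numbers are hypothetical, and this is not a description of Deacto's actual search procedure.

```python
from typing import Dict, List


def overall_utility(bov_target: float, delta_probs: List[float], costs: List[float]) -> float:
    """Sum of per-object utilities BOV(c*) * dProb(c*, n) - Cost(A, n)."""
    return sum(bov_target * dp - c for dp, c in zip(delta_probs, costs))


def select_actions(action_utilities: Dict[str, float]) -> List[str]:
    """Keep actions with positive overall utility, ranked from best to worst.

    Assumption: utilities are additive across actions, so dropping any
    action with non-positive utility can only increase the total.
    """
    chosen = [name for name, u in action_utilities.items() if u > 0]
    return sorted(chosen, key=lambda name: action_utilities[name], reverse=True)


# Hypothetical overall utilities for three candidate actions.
utilities = {"action_a": 10.0, "action_b": -5.0, "action_c": 30.0}
best = select_actions(utilities)
best_total = sum(utilities[name] for name in best)
```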