This is in reference to an article published here -

A Guide to Sequence Prediction using Compact Prediction Tree (with codes in Python)

In the prediction phase, step 3 gives the logic to compute the score as follows:

If the item is not present in the dictionary, then,

**score = 1 + (1/number of similar sequences) + (1/(number of items currently in the countable dictionary + 1)) \* 0.001**

otherwise,

**score = (1 + (1/number of similar sequences) + (1/(number of items currently in the countable dictionary + 1)) \* 0.001) \* oldscore**
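To make the question concrete, here is a minimal sketch of how I understand this update rule from the article; the function and variable names are my own, not from the article's code:

```python
def update_score(item, count_table, num_similar_sequences):
    """Update an item's score in the countable dictionary using the
    formula quoted above. All names here are hypothetical."""
    weight = (1
              + 1 / num_similar_sequences
              + (1 / (len(count_table) + 1)) * 0.001)
    if item not in count_table:
        # Item seen for the first time: score is just the weight.
        count_table[item] = weight
    else:
        # Item seen again: multiply the old score by the weight.
        count_table[item] = weight * count_table[item]
    return count_table


# Example: two updates for the same item with 2 similar sequences.
scores = {}
scores = update_score("a", scores, 2)  # dictionary empty -> weight uses len 0
scores = update_score("a", scores, 2)  # dictionary has 1 item -> new weight * old score
```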

Can anyone explain how this score formulation was derived?

This is the original paper -

Compact Prediction Tree: A Lossless Model for Accurate Sequence Prediction

Where the score computing logic is -

"The primary scoring measure is the support. But in the case where the support of two items is equal, the confidence is used. We define the support of an item s_i as the number of times s_i appears in sequences similar to S, where S is the sequence to predict. The confidence of an item s_i is defined as the support of s_i divided by the total number of training sequences that contain s_i (the cardinality of the bitset of s_i in the II)."
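For contrast, the paper's support/confidence scoring can be sketched as follows; this is my own reading with hypothetical names, using plain lists of sequences instead of the paper's bitset-based inverted index:

```python
def support_and_confidence(item, similar_sequences, training_sequences):
    """Compute the paper's two scoring measures for a candidate item.

    support    -- number of sequences similar to S that contain the item
    confidence -- support / number of all training sequences containing
                  the item (the cardinality of its inverted-index bitset)
    """
    support = sum(item in seq for seq in similar_sequences)
    total = sum(item in seq for seq in training_sequences)
    confidence = support / total if total else 0.0
    return support, confidence


# Example: "a" appears in both similar sequences and in 2 of 3
# training sequences, so support = 2 and confidence = 2/2 = 1.0.
training = [["a", "b"], ["a", "c"], ["b", "c"]]
similar = [["a", "b"], ["a", "c"]]
sup, conf = support_and_confidence("a", similar, training)
```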