Is it possible to build a text classification model that depends on the previous result of another model?


Suppose I have training data in an xxx.arff file, as follows:

    @relation model1
    @attribute class-att {TAG1,TAG2}
    @attribute classification string
    @data
    TAG1,'bla1 bla1 here some text that I want to build with 1st classification'
    TAG2,'bla2 bla2 here some text that I want to build with 1st classification'

and a second model that takes the first model's output classification as its input:

    @relation test
    @attribute class-att {OUT1,OUT2,..}
    @attribute severity {RED, BLUE, GREEN}
    @attribute result string % potentially TAG1/TAG2 from the previous result...
    @data
    OUT1,RED,'here I want result of model 1: E.g TAG1/TAG2'
    OUT1,BLUE,'here I want result of model 1: E.g TAG1/TAG2'

So my goal is to build the “test” model so that it depends on the first model’s result.

BTW, I’m using Weka 3.8 with J48 / FilteredClassifier.

Is this possible? Any suggestions, please?



What you are essentially describing is stacked ensembling (stacking). Here is a short explanation of what stacking is:

In stacking, multiple layers of machine learning models are placed one on top of another. Each model passes its predictions to the models in the layer above it, and the top-layer model makes its decision based on the outputs of the models in the layers below it.

Let’s understand it with an example:

Here, we have two layers of machine learning models:

  • Bottom-layer models (d1, d2, d3), which receive the original input features (x) from the dataset.
  • The top-layer model, f(), which takes the outputs of the bottom-layer models (d1, d2, d3) as its input and predicts the final output.
  • One key thing to note is that out-of-fold predictions are used when generating the bottom-layer outputs for the training data, so the top-layer model never trains on a prediction that a bottom-layer model made for a row it had itself been trained on.
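
As a rough illustration of the out-of-fold idea, here is a minimal sketch in plain Java. It does not use the Weka API. The `Model` interface, the toy `NearestCentroid` learner, and the `StackingSketch` class are hypothetical names made up for this example, and assigning rows to folds by row index modulo the fold count is an assumption made only to keep the code short. It uses two bottom-layer models instead of three.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical minimal contract for a binary (0/1) classifier; not a Weka type.
interface Model {
    void fit(double[][] x, int[] y);
    int predict(double[] row);
}

// Toy learner: picks the class whose centroid (over the chosen columns) is nearest.
class NearestCentroid implements Model {
    private final int[] cols;
    private double[][] centroid; // [class][column]

    NearestCentroid(int... cols) { this.cols = cols; }

    public void fit(double[][] x, int[] y) {
        centroid = new double[2][cols.length];
        int[] count = new int[2];
        for (int i = 0; i < x.length; i++) {
            count[y[i]]++;
            for (int c = 0; c < cols.length; c++) centroid[y[i]][c] += x[i][cols[c]];
        }
        for (int k = 0; k < 2; k++)
            for (int c = 0; c < cols.length; c++) centroid[k][c] /= Math.max(1, count[k]);
    }

    public int predict(double[] row) {
        double d0 = 0, d1 = 0;
        for (int c = 0; c < cols.length; c++) {
            double v = row[cols[c]];
            d0 += (v - centroid[0][c]) * (v - centroid[0][c]);
            d1 += (v - centroid[1][c]) * (v - centroid[1][c]);
        }
        return d1 < d0 ? 1 : 0;
    }
}

class StackingSketch {
    private final List<Model> bases = new ArrayList<>();
    private Model meta;

    // For each row, the prediction of a model trained on the other folds only.
    static double[] outOfFold(Supplier<Model> make, double[][] x, int[] y, int folds) {
        double[] oof = new double[x.length];
        for (int f = 0; f < folds; f++) {
            List<double[]> trainX = new ArrayList<>();
            List<Integer> trainY = new ArrayList<>();
            for (int i = 0; i < x.length; i++) {
                if (i % folds != f) { trainX.add(x[i]); trainY.add(y[i]); }
            }
            Model m = make.get();
            m.fit(trainX.toArray(new double[0][]),
                  trainY.stream().mapToInt(Integer::intValue).toArray());
            for (int i = f; i < x.length; i += folds) oof[i] = m.predict(x[i]);
        }
        return oof;
    }

    void fit(List<Supplier<Model>> makers, Model metaModel, double[][] x, int[] y, int folds) {
        double[][] metaX = new double[x.length][makers.size()];
        for (int b = 0; b < makers.size(); b++) {
            double[] column = outOfFold(makers.get(b), x, y, folds);
            for (int i = 0; i < x.length; i++) metaX[i][b] = column[i];
            Model full = makers.get(b).get(); // refit on all rows for prediction time
            full.fit(x, y);
            bases.add(full);
        }
        meta = metaModel;
        meta.fit(metaX, y); // the top layer is trained on out-of-fold predictions only
    }

    int predict(double[] row) {
        double[] metaRow = new double[bases.size()];
        for (int b = 0; b < bases.size(); b++) metaRow[b] = bases.get(b).predict(row);
        return meta.predict(metaRow);
    }

    public static void main(String[] args) {
        double[][] x = {{0, 0}, {1, 0.5}, {0.5, 1}, {1, 1}, {5, 5}, {6, 5.5}, {5.5, 6}, {6, 6}};
        int[] y = {0, 0, 0, 0, 1, 1, 1, 1};
        List<Supplier<Model>> makers = new ArrayList<>();
        makers.add(() -> new NearestCentroid(0)); // d1: looks at feature 0 only
        makers.add(() -> new NearestCentroid(1)); // d2: looks at feature 1 only
        StackingSketch stack = new StackingSketch();
        stack.fit(makers, new NearestCentroid(0, 1), x, y, 4); // f(): sees d1 and d2 outputs
        System.out.println(stack.predict(new double[] {0.2, 0.3}));
        System.out.println(stack.predict(new double[] {5.8, 5.1}));
    }
}
```

On this toy data the two calls in `main` print 0 and then 1. The part that matters is `outOfFold`: every value fed to the top-layer model during training comes from a bottom-layer model that did not see that row, while the bottom-layer models used at prediction time are refit on the full training set.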

Here we have used only two layers, but there can be any number of layers and any number of models in each layer. Two key principles for selecting the models are:

  • Each individual model meets the accuracy criteria you have set for it.
  • The predictions of the individual models are not highly correlated with one another.
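
One possible way to check the second principle is to compute the Pearson correlation between the prediction vectors of two candidate models on the same rows. The sketch below is plain Java and only an illustration: the class name `PredictionCorrelation` and the example prediction arrays are made up, and any cut-off for "highly correlated" is left to you.

```java
class PredictionCorrelation {
    // Pearson correlation between two equally long prediction vectors.
    // Returns NaN when either vector is constant, since the correlation is undefined there.
    static double pearson(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        int n = a.length;
        double meanA = 0, meanB = 0;
        for (int i = 0; i < n; i++) { meanA += a[i]; meanB += b[i]; }
        meanA /= n;
        meanB /= n;
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0 || varB == 0) return Double.NaN;
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        // Hypothetical 0/1 predictions of three models on the same four rows.
        double[] d1 = {0, 1, 0, 1};
        double[] d2 = {0, 1, 0, 1}; // identical to d1, so it adds nothing new
        double[] d3 = {0, 0, 1, 1}; // uncorrelated with d1 on these rows
        System.out.println("d1 vs d2: " + pearson(d1, d2));
        System.out.println("d1 vs d3: " + pearson(d1, d3));
    }
}
```

A pair such as `d1` and `d2` here, with correlation close to 1, gives the top-layer model almost no extra information, whereas a pair such as `d1` and `d3` is a better candidate for combining, provided each model is accurate enough on its own.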

To read more about stacking and related ensemble techniques, you can refer to this article.

In Weka, you can do this as follows (reference article):

  1. Click the “Choose” button and select “Stacking” under the “meta” group.
  2. Click on the name of the algorithm to review the algorithm configuration.