Extremely low probability on seemingly accurate predictions. #122

@rohitgadia

I have a small set of tagged data (financial securities offerings) with a very straightforward structure and pattern.

I have been experimenting with the 'algorithm' parameter of the Trainer class.

  1. If I train a small model with 'ap' (averaged perceptron) for the use case above and predict on a new set of inputs, I get a model probability over 0.5 in most cases, even though a close look at the predictions shows a few clear mistakes.
  2. If I train with the 'arow' algorithm instead and predict on the same input set, I get a very low probability, generally on the order of 10^-6, even for a 100% accurate prediction.

Below are the actual labels for one input, followed by the labels predicted by the AROW and AP models.

Actual labels for an input (I can't share the input itself for security reasons):

['NN', 'coupon', 'maturity_date', 'maturity_date', 'par_amount', 'CD', 'rate', 'oas', 'NN', '.']

Labels predicted by the model trained with the AROW algorithm (extremely low probability):

[['NN', 'coupon', 'maturity_date', 'maturity_date', 'par_amount', 'CD', 'rate', 'oas', 'NN', '.']]
Model Probability: 2.89272422392235e-08

On the other hand, labels predicted by the model trained with the AP algorithm:

[['NN', 'coupon', 'maturity_date', 'par_amount', 'par_amount', 'par_amount', 'rate', 'oas', 'NN', '.']]
Model Probability: 0.8457047663567325
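
To make the comparison concrete, the mismatches can be counted per token. This is a minimal sketch that uses only the label lists quoted above; `token_accuracy` and `mismatches` are hypothetical helper names for illustration, not part of python-crfsuite.

```python
# Label sequences copied from the issue text above.
actual = ['NN', 'coupon', 'maturity_date', 'maturity_date', 'par_amount',
          'CD', 'rate', 'oas', 'NN', '.']
arow_pred = ['NN', 'coupon', 'maturity_date', 'maturity_date', 'par_amount',
             'CD', 'rate', 'oas', 'NN', '.']
ap_pred = ['NN', 'coupon', 'maturity_date', 'par_amount', 'par_amount',
           'par_amount', 'rate', 'oas', 'NN', '.']


def token_accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("sequences must have the same length")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


def mismatches(predicted, expected):
    """List of (index, expected_label, predicted_label) for every wrong token."""
    return [(i, e, p)
            for i, (p, e) in enumerate(zip(predicted, expected))
            if p != e]


print("AROW accuracy:", token_accuracy(arow_pred, actual))
print("AP accuracy:  ", token_accuracy(ap_pred, actual))
print("AP mismatches:", mismatches(ap_pred, actual))
```

For this one input the AROW prediction matches every token, while the AP prediction is wrong at two positions (0-based indices 3 and 5), despite reporting the much higher model probability.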

Both models were trained on the same data, with the same max_iterations and otherwise identical hyperparameters.

Am I missing something here? Why is the probability so low for the AROW model when its prediction is 100% accurate? I use the model probability to filter the processed records before downstream consumption.
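
One possible explanation, offered as an assumption and not a confirmed diagnosis: a sequence probability of the form exp(score) / Z depends on the magnitude of the learned weights, not only on which sequence scores highest. 'ap' and 'arow' do not train by maximizing likelihood, so the scale of their weights need not yield calibrated probabilities. The toy sketch below is not CRFsuite code. It assumes no transition scores, 8 labels (the distinct labels visible in the example above; the real model may have more), 10 tokens, and a best label that beats every other label by the same margin at each token. `best_sequence_probability` is a hypothetical helper.

```python
import math


def best_sequence_probability(margin, n_labels, n_tokens, scale=1.0):
    """Probability of the top-scoring sequence in a toy model.

    Assumptions (illustrative only): tokens are scored independently, and at
    every token the best label beats each other label by `margin`. `scale`
    multiplies all scores, mimicking weights of smaller or larger magnitude.
    """
    best = math.exp(scale * margin)
    # Per-token softmax: best label vs. (n_labels - 1) equally scored rivals.
    per_token = best / (best + (n_labels - 1))
    # Independent tokens: the sequence probability is the product over tokens.
    return per_token ** n_tokens


# The top-scoring sequence is the same at every positive scale; only the
# reported probability changes.
for scale in (0.0, 0.1, 1.0, 10.0):
    p = best_sequence_probability(margin=1.0, n_labels=8, n_tokens=10, scale=scale)
    print(f"scale={scale:>4}: p(best sequence) = {p:.3e}")
```

Under these assumptions, a scale near zero pushes the probability toward the uniform value 1 / 8^10 (about 9.3e-10), while a large scale pushes it toward 1, and the predicted labels are identical in both cases. If something like this is happening, the probability may be a poor filter across algorithms even when the labels are right.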

Thanks,
R
