
Description
This program is an implementation of the logistic regression model for classification, introduced by [Cox, 1958; Truett et al., 1967] and described in [Amini, 2015; pp. 73-75]. The developed algorithm is a supervised learning model that is employed to explain the effects of the feature characteristics of instances on their binary {0,1} categorical outputs. The main hypothesis is that for each observation x, the logarithm of the ratio of posteriors is a linear combination of its features:

  ln( P(y=1|x) / P(y=0|x) ) = <w, x> + w_0

where w and w_0 are model parameters that are usually learned by maximizing the complete log-likelihood of the data. By shifting the output space to {-1,+1}, the maximization of the log-likelihood is then equivalent to the minimization of the logistic surrogate of the 0/1 loss, which on a training set (x_1, y_1), ..., (x_m, y_m) writes:

  L(w) = (1/m) * sum_{i=1..m} ln( 1 + exp(-y_i * (<w, x_i> + w_0)) )
Efficient first- and second-order optimization techniques are generally applied to carry out this minimization [Hastie et al., 2009]. The proposed program is based on the conjugate gradient technique.
Download and Installation
The program is free for scientific use only. It is developed on Linux with gcc, and the source code is available from:
http://ama.liglab.fr/~amini/LR/LogisticRegression.tar.bz2
After downloading the file, unpack it:
> bzip2 -cd LogisticRegression.tar.bz2 | tar xvf -
then compile the program in the new directory LogisticRegression/:
> make
After compilation, two executables are created:
- LogisticRegression-learn (for training the model)
- LogisticRegression-test (for testing it)
Training and testing
Each example in these files is represented by its class label (+1 or -1) followed by its plain vector representation. In LogisticRegression/example/ there are four files (training_set and test_set) from the UCI repository.
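For instance, a training file in dimension 4 could look like the sketch below (an illustrative guess at the layout; consult the files in example/ for the exact delimiter and formatting the program expects):

```
+1 0.35 -0.12 1.00 0.47
-1 0.08 0.91 -0.33 0.26
+1 0.52 -0.78 0.64 -0.05
```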
Train the model:
> LogisticRegression-learn [options] input_file parameter_file
Options are:
-e (float) : Precision (default 1e-4)
-d ({0,1}) : Display (default 0)
-?         : Help
Test the model:
> LogisticRegression-test input_file parameter_file
Example
> LogisticRegression-learn -d 1 -e 0.01 example/IONOTrain ParamsLRIONO
The training set contains 210 examples in dimension 34
Iteration:0 Loss:0.707320
Iteration:5 Loss:0.450916
Iteration:10 Loss:0.331135
Iteration:15 Loss:0.274300
Iteration:20 Loss:0.239811
Iteration:25 Loss:0.221960
Iteration:30 Loss:0.207585
Iteration:35 Loss:0.196435
Precision:0.923077 Recall:0.970588 F1-measure:0.946237 Error=0.071429
> LogisticRegression-test example/IONOTest ParamsLRIONO
Prediction on the test set containing 141 examples in dimension 34
Precision:0.865979 Recall:0.943820 F1-measure:0.903226 Error=0.127660
Disclaimer
This program is publicly available for research use only. It may not be distributed for commercial use, and the author is not responsible for any (mis)use of this algorithm.
Bibliography
[Amini, 2015] Massih-Reza Amini. Apprentissage Machine: de la théorie à la pratique. Eyrolles, 2015.
[Cox, 1958] David Roxbee Cox. The regression analysis of binary sequences (with discussion). Journal of the Royal Statistical Society, Series B, 20: 215-242, 1958.
[Hastie et al., 2009] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.
[Mohri et al., 2012] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2012.
[Truett et al., 1967] Jeanne Truett, Jerome Cornfield, and William Kannel. A multivariate analysis of the risk of coronary heart disease in Framingham. Journal of Chronic Diseases, 20(7): 511-524, 1967.

