Naive-Bayes inspired effective pre-conditioner for speeding-up logistic regression

Nayyar Abbas Zaidi, Mark James Carman, Jesus Cerquides, Geoffrey Ian Bawtree Webb

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

9 Citations (Scopus)

Abstract

We propose an alternative parameterization of Logistic Regression (LR) for the multi-class setting with categorical data. LR optimizes the conditional log-likelihood over the training data, using an iterative optimization procedure. That procedure may be sensitive to scale, so an effective pre-conditioning method is recommended. Many problems in machine learning involve arbitrary scales or categorical data, where simple standardization of features is not applicable. The problem can be alleviated by using optimization routines that are invariant to scale, such as (second-order) Newton methods. However, computing and inverting the Hessian is costly and not feasible for big data. Thus one must often rely on first-order methods such as gradient descent (GD) and stochastic gradient descent (SGD), or on approximate second-order methods such as quasi-Newton (QN) routines, none of which are invariant to scale. This paper proposes a simple yet effective pre-conditioner for speeding up LR, based on naive Bayes conditional probability estimates. The idea is to scale each attribute by the log of the conditional probability of that attribute given the class. This formulation substantially speeds up LR's convergence. It also provides a weighted naive Bayes formulation, which yields an effective framework for hybrid generative-discriminative classification.
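The scaling idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`nb_log_estimates`, `posterior`), the Laplace smoothing constant `alpha`, and the choice of one weight per (class, attribute, value) triple are assumptions made for illustration only. Each LR weight multiplies a naive Bayes log-probability, so setting every weight to 1 gives the plain naive Bayes posterior, which would be a natural starting point for gradient-based training.

```python
import numpy as np

def nb_log_estimates(X, y, n_values, n_classes, alpha=1.0):
    """Laplace-smoothed naive Bayes estimates (alpha is an assumed smoothing constant).

    X: (n, d) integer-coded categorical attributes; y: (n,) integer class labels.
    Returns log P(y), shape (C,), and log P(x_i = v | y), shape (C, d, V).
    """
    n, d = X.shape
    class_counts = np.bincount(y, minlength=n_classes).astype(float)
    log_prior = np.log((class_counts + alpha) / (n + alpha * n_classes))
    counts = np.zeros((n_classes, d, n_values))
    for i in range(d):
        # Count co-occurrences of (class, value) for attribute i.
        np.add.at(counts[:, i, :], (y, X[:, i]), 1.0)
    log_cond = np.log((counts + alpha) /
                      (class_counts[:, None, None] + alpha * n_values))
    return log_prior, log_cond

def posterior(X, w_prior, w_cond, log_prior, log_cond):
    """P(y | x) proportional to exp(w_y log P(y) + sum_i w_{y,i,x_i} log P(x_i | y))."""
    n, d = X.shape
    cols = np.arange(d)
    # Select the weighted log-probability of each observed value: shape (C, n, d).
    scaled = (w_cond * log_cond)[:, cols[None, :], X]
    scores = w_prior * log_prior + scaled.sum(axis=2).T  # shape (n, C)
    scores -= scores.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

# Toy data (hypothetical): two binary attributes, two classes.
X = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
y = np.array([0, 0, 1, 1])
log_prior, log_cond = nb_log_estimates(X, y, n_values=2, n_classes=2)

# With all weights equal to 1 the model is exactly naive Bayes.
p = posterior(X, np.ones(2), np.ones((2, 2, 2)), log_prior, log_cond)
```

Under this sketch, training would adjust `w_prior` and `w_cond` to maximize the conditional log-likelihood while the naive Bayes log-probabilities stay fixed and act as the per-attribute scale.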

Original language: English
Title of host publication: Proceedings, 14th IEEE International Conference on Data Mining (ICDM 2014)
Editors: Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Place of Publication: Los Alamitos CA USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1097-1102
Number of pages: 6
ISBN (Electronic): 9781479943029
ISBN (Print): 9781479943029
Publication status: Published - 2014
Event: IEEE International Conference on Data Mining 2014 - Shenzhen, China
Duration: 14 Dec 2014 to 17 Dec 2014
Conference number: 14th
http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=7022262 (Conference Proceedings)

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Number: January
Volume: 2015-January
ISSN (Print): 1550-4786

Conference

Conference: IEEE International Conference on Data Mining 2014
Abbreviated title: ICDM 2014
Country: China
City: Shenzhen
Period: 14/12/14 to 17/12/14

Keywords

  • classification
  • discriminative-generative learning
  • logistic regression
  • pre-conditioning
  • stochastic gradient descent
  • weighted naive Bayes
