dc.contributor.author	Masegosa, Andres
dc.contributor.author	Martinez, Ana M.
dc.contributor.author	Langseth, Helge
dc.contributor.author	Nielsen, Thomas D.
dc.contributor.author	Salmeron, Antonio
dc.contributor.author	Ramos-López, Dario
dc.date.accessioned	2017-11-15T08:44:21Z
dc.date.available	2017-11-15T08:44:21Z
dc.date.created	2017-07-26T00:57:42Z
dc.date.issued	2017
dc.identifier.citation	International Journal of Approximate Reasoning. 2017, 88, 435-451.	nb_NO
dc.identifier.issn	0888-613X
dc.identifier.uri	http://hdl.handle.net/11250/2466330
dc.description.abstract	In this paper we present an approach for scaling up Bayesian learning using variational methods by exploiting distributed computing clusters managed by modern big data processing tools like Apache Spark or Apache Flink, which efficiently support iterative map-reduce operations. Our approach is defined as a distributed projected natural gradient ascent algorithm, has excellent convergence properties, and covers a wide range of conjugate exponential family models. We evaluate the proposed algorithm on three real-world datasets from different domains (the PubMed abstracts dataset, a GPS trajectory dataset, and a financial dataset) and using several models (LDA, factor analysis, mixture of Gaussians and linear regression models). Our approach compares favourably to stochastic variational inference and streaming variational Bayes, two of the main current proposals for scaling up variational methods. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes and approx. 75% latent variables using a computer cluster with 128 processing units (AWS). The proposed methods are released as part of an open-source toolbox for scalable probabilistic machine learning (http://www.amidsttoolbox.com), Masegosa et al. (2017).	nb_NO
dc.language.iso	eng	nb_NO
dc.publisher	Elsevier	nb_NO
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no
dc.title	Scaling up Bayesian variational inference using distributed computing clusters	nb_NO
dc.type	Journal article	nb_NO
dc.type	Peer reviewed	nb_NO
dc.description.version	acceptedVersion	nb_NO
dc.source.pagenumber	435-451	nb_NO
dc.source.volume	88	nb_NO
dc.source.journal	International Journal of Approximate Reasoning	nb_NO
dc.identifier.doi	10.1016/j.ijar.2017.06.010
dc.identifier.cristin	1483087
dc.description.localcode	© 2017 Elsevier Ltd. This is the authors' accepted and refereed manuscript of the article, embargoed until 2019-06-28 due to copyright restrictions. This manuscript version is made available under the CC BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/	nb_NO
cristin.unitcode	194,63,10,0
cristin.unitname	Institutt for datateknikk og informasjonsvitenskap
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	2
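
As a rough illustration of the algorithm described in the abstract, the sketch below shows distributed projected natural-gradient ascent for a conjugate exponential family model in Python. This is not the AMIDST implementation: the model is a toy one assumed for the example (x_i ~ N(theta, 1) with prior theta ~ N(0, 1)), Python's built-in map/reduce stands in for a Spark or Flink cluster, and the helper names (suff_stats, project, fit) and step size are illustrative choices.

# Minimal sketch of distributed projected natural-gradient ascent for
# variational inference in a conjugate model (illustrative, not AMIDST).
# Assumed toy model: x_i ~ N(theta, 1), prior theta ~ N(0, 1). The
# variational posterior q(theta) is Gaussian with natural parameters
# lam = (mu/var, -1/(2*var)).
import numpy as np
from functools import reduce

def suff_stats(shard):
    """Map step: sufficient statistics contributed by one data shard."""
    # Each point contributes (x_i, -1/2) to the posterior's natural
    # parameters in this model, so a shard contributes their sum.
    return np.array([shard.sum(), -0.5 * len(shard)])

def project(lam):
    """Projection step: keep the parameters in the feasible set (variance > 0)."""
    lam = lam.copy()
    lam[1] = min(lam[1], -1e-6)  # second natural parameter must stay negative
    return lam

def fit(shards, prior=np.array([0.0, -0.5]), step=0.5, iters=50):
    lam = prior.copy()  # initialise q(theta) at the prior
    # Reduce step: sum per-shard statistics (a Spark/Flink reduce in the paper).
    stats = reduce(np.add, map(suff_stats, shards))
    for _ in range(iters):
        # Natural gradient of the ELBO for conjugate exponential models:
        # prior + summed expected sufficient statistics - current parameters.
        nat_grad = prior + stats - lam
        lam = project(lam + step * nat_grad)
    return lam

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=10_000)
shards = np.array_split(data, 8)  # stand-in for a distributed partition
lam = fit(shards)
mu, var = -lam[0] / (2 * lam[1]), -1 / (2 * lam[1])
print(f"posterior mean ~= {mu:.3f}, variance ~= {var:.5f}")

In this fully conjugate toy case a step size of 1 recovers the exact coordinate-ascent update in one iteration; smaller steps mimic the damped updates needed when local latent variables make the expected sufficient statistics depend on the current posterior, as in the models evaluated in the paper.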

