Model-based clustering of large networks

dc.citation.firstpage: 1010 [en_US]
dc.citation.issueNumber: 2 [en_US]
dc.citation.journalTitle: The Annals of Applied Statistics [en_US]
dc.citation.lastpage: 1039 [en_US]
dc.citation.volumeNumber: 7 [en_US]
dc.contributor.author: Vu, Duy Q. [en_US]
dc.contributor.author: Hunter, David R. [en_US]
dc.contributor.author: Schweinberger, Michael [en_US]
dc.date.accessioned: 2017-05-03T18:24:05Z [en_US]
dc.date.available: 2017-05-03T18:24:05Z [en_US]
dc.date.issued: 2013 [en_US]
dc.description.abstract: We describe a network clustering framework, based on finite mixture models, that can be applied to discrete-valued networks with hundreds of thousands of nodes and billions of edge variables. Relative to other recent model-based clustering work for networks, we introduce a more flexible modeling framework, improve the variational-approximation estimation algorithm, discuss and implement standard error estimation via a parametric bootstrap approach, and apply these methods to much larger data sets than those seen elsewhere in the literature. The more flexible framework is achieved through introducing novel parameterizations of the model, giving varying degrees of parsimony, using exponential family models whose structure may be exploited in various theoretical and algorithmic ways. The algorithms are based on variational generalized EM algorithms, where the E-steps are augmented by a minorization-maximization (MM) idea. The bootstrapped standard error estimates are based on an efficient Monte Carlo network simulation idea. Last, we demonstrate the usefulness of the model-based clustering framework by applying it to a discrete-valued network with more than 131,000 nodes and 17 billion edge variables. [en_US]
dc.identifier.citation: Vu, Duy Q., Hunter, David R. and Schweinberger, Michael. "Model-based clustering of large networks." The Annals of Applied Statistics, 7, no. 2 (2013) Project Euclid: 1010-1039. https://doi.org/10.1214/12-AOAS617. [en_US]
dc.identifier.doi: https://doi.org/10.1214/12-AOAS617 [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/94136 [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Project Euclid [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.subject.keyword: social networks [en_US]
dc.subject.keyword: stochastic block models [en_US]
dc.subject.keyword: finite mixture models [en_US]
dc.subject.keyword: EM algorithms [en_US]
dc.subject.keyword: generalized EM algorithms [en_US]
dc.subject.keyword: variational EM algorithms [en_US]
dc.subject.keyword: MM algorithms [en_US]
dc.title: Model-based clustering of large networks [en_US]
dc.type: Journal article [en_US]
dc.type.dcmi: Text [en_US]
dc.type.publication: publisher version [en_US]
Files
Original bundle
Name: euclid.aoas.1372338477.pdf
Size: 2.61 MB
Format: Adobe Portable Document Format