Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data

dc.citation.firstpage: 1
dc.citation.journalTitle: Journal of Machine Learning Research
dc.citation.lastpage: 73
dc.citation.volumeNumber: 22
dc.contributor.author: Wang, Minjie
dc.contributor.author: Allen, Genevera I.
dc.date.accessioned: 2021-10-20T16:32:00Z
dc.date.available: 2021-10-20T16:32:00Z
dc.date.issued: 2021
dc.description.abstract: In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that enjoys strong empirical performance and inherits the mathematical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data view that are best for determining the groups, often leading to improved integrative clustering. To solve our problem, we develop a new type of generalized multi-block ADMM algorithm using sub-problem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data.
dc.identifier.citation: Wang, Minjie and Allen, Genevera I. "Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data." Journal of Machine Learning Research, 22, (2021) JMLR: 1-73. https://hdl.handle.net/1911/111581.
dc.identifier.uri: https://hdl.handle.net/1911/111581
dc.language.iso: eng
dc.publisher: JMLR
dc.relation.uri: https://jmlr.org/papers/v22/19-1012.html
dc.rights: License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.keyword: Integrative clustering
dc.subject.keyword: convex clustering
dc.subject.keyword: feature selection
dc.subject.keyword: convex optimization
dc.subject.keyword: sparse clustering
dc.subject.keyword: GLM deviance
dc.subject.keyword: Bregman divergences
dc.title: Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data
dc.type: Journal article
dc.type.dcmi: Text
dc.type.publication: publisher version
Files
Name: 19-1012.pdf
Size: 2.17 MB
Format: Adobe Portable Document Format