Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/60208
Full metadata record
DC Element | Value | Language
dc.contributor.author | Beyer, Christian | -
dc.contributor.author | Unnikrishnan, Vishnu | -
dc.contributor.author | Brüggemann, Robert | -
dc.contributor.author | Toulouse, Vincent | -
dc.contributor.author | Omar, Hafez Kader | -
dc.contributor.author | Ntoutsi, Eirini | -
dc.contributor.author | Spiliopoulou, Myra | -
dc.date.accessioned | 2022-01-26T10:31:40Z | -
dc.date.available | 2022-01-26T10:31:40Z | -
dc.date.issued | 2020 | -
dc.date.submitted | 2020 | -
dc.identifier.uri | https://opendata.uni-halle.de//handle/1981185920/62159 | -
dc.identifier.uri | http://dx.doi.org/10.25673/60208 | -
dc.description.abstract | Many current and future applications plan to provide entity-specific predictions, ranging from individualized healthcare applications to user-specific purchase recommendations. In our previous stream-based work on Amazon review data, we showed that error-weighted ensembles that combine entity-centric classifiers, which are trained only on reviews of one particular product (entity), and entity-ignorant classifiers, which are trained on all reviews irrespective of the product, can improve prediction quality. This came at the cost of storing multiple entity-centric models in primary memory, many of which would never be used again because their entities would receive no future instances in the stream. To overcome this drawback and make entity-centric learning viable in these scenarios, we investigated two methods of reducing the primary memory requirement of our entity-centric approach. Our first method uses the lossy counting algorithm for data streams to identify entities whose instances make up a certain percentage of the total data stream within an error margin. We store all models that do not fulfil this requirement in secondary memory, from which they can be retrieved if future instances belonging to them arrive later in the stream. The second method replaces entity-centric models with a much more naive model that only stores the past labels and predicts the majority label seen so far. We applied our methods to the previously used Amazon data sets, which contain up to 1.4M reviews, and added two subsets of the Yelp data set, which contain up to 4.2M reviews. Both methods reduced the primary memory requirements while still outperforming an entity-ignorant model. [A code sketch of both methods follows the metadata table below.] | eng
dc.description.sponsorship | Projekt DEAL 2020 | -
dc.language.iso | eng | -
dc.relation.ispartof | http://link.springer.com/journal/12243 | -
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | -
dc.subject | Entity-centric learning | eng
dc.subject | Stream classification | eng
dc.subject | Document prediction | eng
dc.subject | Memory reduction | eng
dc.subject | Text ignorant models | eng
dc.subject.ddc | 000 | -
dc.title | Resource management for model learning at entity level | eng
dc.type | Article | -
dc.identifier.urn | urn:nbn:de:gbv:ma9:1-1981185920-621591 | -
local.versionType | publishedVersion | -
local.bibliographicCitation.journaltitle | Annals of telecommunications | -
local.bibliographicCitation.volume | 75 | -
local.bibliographicCitation.pagestart | 549 | -
local.bibliographicCitation.pageend | 561 | -
local.bibliographicCitation.publishername | Lavoisier | -
local.bibliographicCitation.publisherplace | Paris | -
local.bibliographicCitation.doi | 10.1007/s12243-020-00800-4 | -
local.openaccess | true | -
dc.identifier.ppn | 1735990388 | -
local.bibliographicCitation.year | 2020 | -
cbs.sru.importDate | 2022-01-26T10:23:06Z | -
local.bibliographicCitation | Contained in: Annals of telecommunications - Paris : Lavoisier, 1946 | -
local.accessrights.dnb | free | -
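The abstract describes two concrete memory-reduction mechanisms: lossy counting over the entity stream to decide which entity-centric models stay in primary memory, and a text-ignorant majority-label fallback model. The following minimal Python sketch illustrates both, assuming the standard lossy counting algorithm of Manku and Motwani (2002); all class names, parameter values, and the toy stream are illustrative assumptions, not code from the paper.

```python
import math
from collections import defaultdict


class LossyCounter:
    """Approximate frequency counts over a stream (Manku & Motwani, 2002).

    Undercounts each item's true frequency by at most epsilon * n,
    where n is the number of stream items seen so far.
    """

    def __init__(self, epsilon=0.01):
        self.epsilon = epsilon
        self.width = math.ceil(1.0 / epsilon)  # bucket width
        self.n = 0                             # items seen so far
        self.entries = {}                      # entity -> (count, delta)

    def add(self, entity):
        self.n += 1
        bucket = math.ceil(self.n / self.width)
        if entity in self.entries:
            count, delta = self.entries[entity]
            self.entries[entity] = (count + 1, delta)
        else:
            # delta bounds how often the entity may have occurred
            # before this entry was (re)created.
            self.entries[entity] = (1, bucket - 1)
        if self.n % self.width == 0:  # bucket boundary: prune rare entities
            self.entries = {e: (c, d) for e, (c, d) in self.entries.items()
                            if c + d > bucket}

    def frequent(self, support):
        """Entities whose true frequency may exceed `support` (a fraction)."""
        threshold = (support - self.epsilon) * self.n
        return {e for e, (c, _) in self.entries.items() if c >= threshold}


class MajorityLabelModel:
    """Text-ignorant replacement model: predicts the majority label so far."""

    def __init__(self):
        self.label_counts = defaultdict(int)

    def update(self, label):
        self.label_counts[label] += 1

    def predict(self):
        if not self.label_counts:
            return None
        return max(self.label_counts, key=self.label_counts.get)


if __name__ == "__main__":
    counter = LossyCounter(epsilon=0.01)
    # Toy stream of product (entity) ids standing in for review arrivals.
    for entity in ["p1", "p2", "p1", "p3", "p1", "p2"] * 200:
        counter.add(entity)
    # Models of entities above 10% support would stay in primary memory;
    # the rest would be moved to secondary memory and reloaded on demand.
    print(counter.frequent(support=0.10))  # all three toy entities qualify
```

In this reading, the first method would serialize every model outside `counter.frequent(s)` to secondary memory and retrieve it only if its entity reappears in the stream, while the second method would swap such models for a `MajorityLabelModel`, which needs only a small label-count table instead of a trained classifier.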
Appears in collections: Fakultät für Informatik (OA)

Files in this item:
File | Description | Size | Format
Beyer et al._Resource_2020.pdf | Secondary publication | 1.84 MB | Adobe PDF