Invitation to a Thesis Defense of the Graduate Program in Computer Science

The Coordination of the Graduate Program in Computer Science is pleased to invite you to the Thesis Defense:

Methods for Vector Representation and Topic Modeling of Short Text

Marcelo Rodrigo de Souza Pita

 

Short texts are everywhere on the Web. They are characterized by few context words and a large collection vocabulary. This makes knowledge discovery in short text challenging and motivates the development of novel, effective methods. This work contributes to two lines of research. In the first, we propose a framework that creates larger pseudo-documents for short text topic modeling (STTM), from which we derive two implementations: (1) CoFE, based on word co-occurrence; and (2) DREx, which relies on word vectors. We also propose Vec2Graph, a graph-based representation for corpora induced by word vectors, and VGTM, a probabilistic short text topic model that works on top of Vec2Graph. In the second line of research, we report an investigation into proper ways of combining word vectors to produce document vectors. Experiments show competitive results in both NPMI (topic modeling) and F1 (document classification), often with significant improvements over state-of-the-art methods.

Examining Committee:

Prof. Gisele Lobo Pappa - Advisor (DCC - UFMG)

Prof. Marcos André Gonçalves (DCC - UFMG)

Prof. Marco Antônio Pinheiro de Cristo (IComp - UFAM)

Prof. Alexandre Plastino de Carvalho (IC - UFF)

Prof. Pedro Olmo Stancioli Vaz de Melo (DCC - UFMG)

 

December 2, 2019

13:00

 

Room 2077, ICEX