Digital Repository

Graph Compression for Extraction of Succinct Knowledge Representations from Evolving Narratives

dc.contributor.advisor Patil, Sangameshwar
dc.contributor.author SANKLECHA, HARSH
dc.date.accessioned 2026-05-22T10:16:59Z
dc.date.available 2026-05-22T10:16:59Z
dc.date.issued 2026-05
dc.identifier.citation 61 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11158
dc.description.abstract In the contemporary era of information silos, the ability to synthesise vast datasets into concise formats is a critical challenge. Timeline Summarisation (TLS) addresses this by processing extensive corpora of news articles to produce a chronological sequence of key events, where each date is paired with a brief, salient summary. For general audiences, navigating hundreds of disparate articles to reconstruct a narrative is labour-intensive; an automated timeline provides an immediate understanding of event progression and thematic essence. This thesis explores the algorithmic generation of such timelines, a task of significant interest within the Natural Language Processing (NLP) community. Evaluation is typically benchmarked against gold-standard datasets, specifically T17, CRISIS, and ENTITIES, using standardised metrics to compare machine-generated outputs against human-expert references. We investigate two distinct methodological frameworks to address the TLS problem. (1) Rhetorical Structure Theory (RST): This approach analyses the discourse structure of news articles to identify salient sentences. Our hypothesis posited that “nucleus” sentences, which are frequently elaborated upon by “satellite” text, represent the core information. While this method provided insights into document structure, the results did not achieve state-of-the-art performance. (2) Graph-based Entity and Event Extraction: The second approach involved clustering sentences based on shared entities and events. We initially explored Large Language Models (LLMs) for extraction; however, the computational latency proved prohibitive for large-scale datasets. Consequently, we transitioned to a more efficient spaCy-based pipeline. Our findings indicate that graph-theoretic properties, specifically node degree, serve as effective indicators for identifying the most significant events to include in a timeline. en_US
dc.language.iso en en_US
dc.subject News Timeline Summarisation en_US
dc.subject Graph Theory en_US
dc.subject RST en_US
dc.subject Event and Entity Graph en_US
dc.title Graph Compression for Extraction of Succinct Knowledge Representations from Evolving Narratives en_US
dc.title.alternative News Timeline Summarisation en_US
dc.type Thesis en_US
dc.description.embargo Two Years en_US
dc.type.degree BS-MS en_US
dc.contributor.department Dept. of Mathematics en_US
dc.contributor.registration 20211263 en_US
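
Illustrative note on the abstract's graph-based approach: the record does not describe the thesis's actual pipeline, so the following is only a minimal sketch of how node degree could be used to rank sentences, assuming entities have already been extracted per sentence (for example by a named-entity recogniser). The function names `build_entity_graph` and `rank_sentences_by_degree`, the co-occurrence linking rule, and the sum-of-degrees score are all assumptions made for illustration, not the author's method.

```python
from collections import defaultdict


def build_entity_graph(sentence_entities):
    """Build an undirected entity co-occurrence graph as adjacency sets.

    Assumption: two entities are linked when they occur in the same sentence.
    `sentence_entities` is a list with one collection of entity strings per sentence.
    """
    adjacency = defaultdict(set)
    for entities in sentence_entities:
        entities = set(entities)
        for entity in entities:
            # Touching the key guarantees a node even for an isolated entity.
            adjacency[entity].update(entities - {entity})
    return adjacency


def rank_sentences_by_degree(sentence_entities):
    """Return sentence indices ordered from highest to lowest score.

    Assumption: a sentence's score is the sum of the degrees of its entities.
    Ties keep their original order because Python's sort is stable.
    """
    adjacency = build_entity_graph(sentence_entities)
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    scores = [sum(degree[e] for e in set(ents)) for ents in sentence_entities]
    return sorted(range(len(sentence_entities)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Placeholder entity labels, not taken from any dataset named in the abstract.
    toy = [{"entity_e"}, {"entity_a", "entity_d"}, {"entity_a", "entity_b", "entity_c"}]
    print(rank_sentences_by_degree(toy))
```

In this sketch a sentence mentioning well-connected entities ranks above one mentioning isolated entities; a real system would also need date assignment and redundancy handling, which the abstract does not detail.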


This item appears in the following Collection(s)

  • MS THESES [2219]
    Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
