Graph Compression for Extraction of Succinct Knowledge Representations from Evolving Narratives

SANKLECHA, HARSH

URI: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/11158 Date: 2026-05

Abstract:

In the contemporary era of information silos, the ability to synthesise vast datasets into concise formats is a critical challenge. Timeline Summarisation (TLS) addresses this by processing extensive corpora of news articles to produce a chronological sequence of key events, where each date is paired with a brief, salient summary. For general audiences, navigating hundreds of disparate articles to reconstruct a narrative is labour-intensive; an automated timeline provides an immediate understanding of event progression and thematic essence. This thesis explores the algorithmic generation of such timelines, a task of significant interest within the Natural Language Processing (NLP) community. Evaluation is typically benchmarked against gold-standard datasets, specifically T17, CRISIS, and ENTITIES, using standardised metrics to compare machine-generated outputs against human-expert references. We investigate two distinct methodological frameworks to address the TLS problem:

Rhetorical Structure Theory (RST): This approach analyses the discourse structure of news articles to identify salient sentences. Our hypothesis posited that “nucleus” sentences, which are frequently elaborated upon by “satellite” text, represent the core information. While this method provided insights into document structure, the results did not achieve state-of-the-art performance.

Graph-based Entity and Event Extraction: The second approach involved clustering sentences based on shared entities and events. We initially explored Large Language Models (LLMs) for extraction; however, the computational latency proved prohibitive for large-scale datasets. Consequently, we transitioned to a more efficient spaCy-based pipeline. Our findings indicate that graph-theoretic properties, specifically node degree, serve as effective indicators for identifying the most significant events to include in a timeline.
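To make the node-degree idea concrete, the following is a minimal sketch and not the thesis's actual pipeline. It assumes that entities have already been extracted for each dated sentence (the abstract mentions spaCy for this step; here they are hard-coded), that two entities are linked when they share a sentence, and that a sentence is scored by the summed degree of its entities. The toy data and the names `SENTENCES`, `build_entity_graph`, `node_degrees`, and `build_timeline` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy input: (date, sentence, entities) triples.
# In a real pipeline the entity sets would come from an extraction step.
SENTENCES = [
    ("2024-01-01", "Storm Alma floods the harbour district.",
     {"Storm Alma", "harbour district"}),
    ("2024-01-01", "The mayor orders the coast guard to evacuate the harbour district.",
     {"mayor", "coast guard", "harbour district"}),
    ("2024-01-02", "The coast guard ends its search.",
     {"coast guard"}),
    ("2024-01-02", "Storm Alma weakens as the coast guard reopens the harbour district.",
     {"Storm Alma", "coast guard", "harbour district"}),
]


def build_entity_graph(sentences):
    """Link two entities whenever they occur in the same sentence."""
    graph = defaultdict(set)
    for _date, _text, entities in sentences:
        for entity in entities:
            graph[entity]  # register entities that never co-occur as isolated nodes
        for a, b in combinations(sorted(entities), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def node_degrees(graph):
    """Degree of a node = number of distinct neighbouring entities."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def build_timeline(sentences):
    """For each date, keep the sentence whose entities have the highest summed degree."""
    degrees = node_degrees(build_entity_graph(sentences))
    by_date = defaultdict(list)
    for date, text, entities in sentences:
        score = sum(degrees[e] for e in entities)  # assumed scoring rule
        by_date[date].append((score, text))
    # max() returns the first candidate on ties, so input order breaks ties.
    return [(date, max(cands, key=lambda c: c[0])[1])
            for date, cands in sorted(by_date.items())]


degrees = node_degrees(build_entity_graph(SENTENCES))
timeline = build_timeline(SENTENCES)
for date, summary in timeline:
    print(date, summary)
```

Summing entity degrees is only one plausible way to turn node degree into a sentence score; other aggregations (maximum degree, degree of event nodes only) would fit the same description in the abstract.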

Files in this item

Name: 20211263_HARSH_SA ...

Size: 4.151Mb

Format: PDF

This item appears in the following Collection(s)

MS THESES [2219]
Thesis submitted to IISER Pune in partial fulfilment of the requirements for the BS-MS Dual Degree Programme/MSc. Programme/MS-Exit Programme
