Text Summarization: Taking Advantage of Structural Syntax, Term Expansion and Process Refinement

Elhadi, Mohamed Taybe (2024) Text Summarization: Taking Advantage of Structural Syntax, Term Expansion and Process Refinement. In: Research and Applications Towards Mathematics and Computer Science Vol. 9. B P International, pp. 142-161. ISBN 978-81-970187-9-4

Full text not available from this repository.

Abstract

This chapter presents the development and application of an extractive summarization procedure, together with a detailed account of the experiments performed to evaluate it. The procedure combines structural properties of a text’s sentences with term expansion, using WordNet and a local thesaurus, to select the most appropriate sentences for an extractive summary of a given document. Sentences were first tagged and normalized. The well-known Longest Common Subsequence (LCS) algorithm was then applied to pairs of sentences in the document, and a normalized LCS-based similarity score was calculated and used to rank the sentences. The top-ranked subset of the most similar sentences was tokenized to produce a set of important keywords or terms. These terms were expanded into two further sets using 1) WordNet; and 2) a local electronic dictionary/thesaurus. The three sets (the original and the two expanded ones) were then recycled to refine and expand the list of sentences selected from the original document. The process was repeated a number of times in order to find the best representative set of sentences, and a final set of top-ranked sentences was selected as candidates for the summary. To verify the utility of the procedure, a number of experiments were conducted using an email corpus. The results were compared with those produced by human annotators and with those produced by a basic sentence similarity method. The results were very encouraging and compared well with those of the human annotators and of Jaccard sentence similarity.
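The pairwise LCS scoring and ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names are hypothetical, sentences are simply lowercased and split on whitespace instead of being tagged and normalized, and the LCS length is assumed to be normalized by the length of the longer sentence, since the abstract does not state which normalization was used. A Jaccard function is included as one possible form of the baseline.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def jaccard_similarity(a, b):
    """Jaccard overlap of the two token sets, as a possible baseline."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def rank_sentences(sentences, similarity=lcs_similarity):
    """Rank sentences by their mean similarity to all other sentences."""
    # Assumption: plain whitespace tokens stand in for tagged, normalized text.
    tokens = [s.lower().split() for s in sentences]
    scores = [0.0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        sim = similarity(tokens[i], tokens[j])
        scores[i] += sim
        scores[j] += sim
    others = max(len(sentences) - 1, 1)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i] / others) for i in order]
```

Passing `similarity=jaccard_similarity` to `rank_sentences` ranks the same sentences with the baseline measure, so the two orderings can be compared side by side. Sequences of part-of-speech tags could be supplied in place of word tokens if the structural variant is wanted.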

Item Type: Book Section
Subjects: GO for STM > Mathematical Science
Depositing User: Unnamed user with email support@goforstm.com
Date Deposited: 17 Feb 2024 07:18
Last Modified: 17 Feb 2024 07:18
URI: http://archive.article4submit.com/id/eprint/2688
