IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
A Web Page Segmentation Approach Using Visual Semantics
Jun ZENG, Brendan FLANAGAN, Sachio HIROKAWA, Eisuke ITO

2014 Volume E97.D Issue 2 Pages 223-230

Abstract

Web page segmentation offers a variety of benefits and has many potential web applications. Early web page segmentation techniques are mainly based on machine learning algorithms and rule-based heuristics, which cannot be used for large-scale page segmentation. In this paper, we propose a formulated page segmentation method using visual semantics. Instead of analyzing the visual cues of web pages, the method uses three measures to formulate the visual semantics: the layout tree is used to recognize visually similar blocks; the seam degree describes how neatly the blocks are arranged; and the content similarity describes the degree of content coherence between blocks. A comparison experiment was conducted using the VIPS algorithm as a baseline. The experimental results show that the proposed method can divide a web page into appropriate semantic segments.
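
The abstract names content similarity as a measure of coherence between blocks but does not give its formula. As a rough illustration only, the sketch below assumes a common stand-in: cosine similarity between word-count vectors of two blocks' text. The function names `term_counts` and `content_similarity` and the tokenization rule are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def content_similarity(block_a, block_b):
    """Cosine similarity between the term-count vectors of two text blocks.

    Assumption: this is an illustrative stand-in, not the paper's definition.
    Returns a value in [0, 1]; 0.0 if either block has no tokens.
    """
    a, b = term_counts(block_a), term_counts(block_b)
    # Dot product over the terms the two blocks share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    nav = "Home News Sports Weather"
    story = "Local sports news: the home team wins again"
    print(content_similarity(nav, story))
```

Under this assumption, adjacent blocks with a high score would be candidates for merging into one semantic segment, while a low score would suggest a segment boundary. Any threshold would have to be chosen empirically.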

© 2014 The Institute of Electronics, Information and Communication Engineers