ISBN-13: 9783659528002 / English / Paperback / 2017 / 56 pp.
Tabular data is a source of information already available on the web. We have begun working with a collection of HTML tables taken from the web. First, good-quality tables are identified; then schema matching is performed. Schema matching identifies the correspondences that determine which elements of two different schemas are similar. Columns and data values are compared one after the other to match the schemas. The main issue is that, when a user searches for tabular data, a web search engine may return URLs instead of the tabular data itself. To address this issue, we extracted the data from tabular web pages, extracted their schemas, and then matched the schemas by identifying correspondences between similar elements through a corpus-based technique. After schema matching, we populated the data of the HTML pages by joining related tables into one HTML table, which is more appropriate and helpful for users.
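The matching and joining steps described above can be illustrated with a minimal Python sketch. It is not the book's corpus-based technique: as a simpler stand-in, it scores each pair of columns by header string similarity and by the overlap of their cell values, then joins the two tables on a matched column. The function names, the 0.6 threshold, the greedy one-to-one matching, and the toy tables are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity of two column headers, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def value_overlap(values_a, values_b):
    """Jaccard overlap of two columns' distinct, normalized cell values."""
    set_a = {str(v).strip().lower() for v in values_a}
    set_b = {str(v).strip().lower() for v in values_b}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def match_schemas(table_a, table_b, threshold=0.6):
    """Return {column_in_a: column_in_b} for pairs scoring >= threshold.

    Each table is a dict mapping a column name to its list of cell values.
    A pair's score is the larger of header similarity and value overlap.
    Matching is greedy and one-to-one (an illustrative simplification).
    """
    matches, used = {}, set()
    for col_a, vals_a in table_a.items():
        best_col, best_score = None, 0.0
        for col_b, vals_b in table_b.items():
            if col_b in used:
                continue
            score = max(name_similarity(col_a, col_b),
                        value_overlap(vals_a, vals_b))
            if score > best_score:
                best_col, best_score = col_b, score
        if best_col is not None and best_score >= threshold:
            matches[col_a] = best_col
            used.add(best_col)
    return matches


def join_tables(table_a, table_b, matches, key):
    """Inner-join the two tables on `key` (a column of table_a) and its match."""
    key_b = matches[key]

    def norm(v):
        return str(v).strip().lower()

    index_b = {norm(v): i for i, v in enumerate(table_b[key_b])}
    # Columns of table_b that have no counterpart in table_a are appended.
    extra_cols = [c for c in table_b if c not in matches.values()]
    rows = []
    for i, v in enumerate(table_a[key]):
        j = index_b.get(norm(v))
        if j is None:
            continue  # no related row in table_b
        row = {c: table_a[c][i] for c in table_a}
        row.update({c: table_b[c][j] for c in extra_cols})
        rows.append(row)
    return rows


# Toy tables standing in for two extracted HTML tables.
table_a = {"Country": ["France", "Japan", "Brazil"],
           "Capital": ["Paris", "Tokyo", "Brasilia"]}
table_b = {"country name": ["Japan", "France", "Kenya"],
           "Continent": ["Asia", "Europe", "Africa"]}

matches = match_schemas(table_a, table_b)
joined = join_tables(table_a, table_b, matches, key="Country")
```

Here the two country columns are paired because their headers are similar, and the joined result keeps only the rows whose key value appears in both tables. A corpus-based matcher would replace the scoring function with evidence gathered from a large collection of tables; the surrounding match-then-join flow would stay the same.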