With the rapid development of the Web, one problem to be solved is how to obtain the required information from it, which makes information extraction necessary. Extraction from web pages is carried out by a wrapper, and the wrapper architecture should be as accurate, robust, and general as possible, so that differences in website architecture and page structure do not affect it, while human involvement is reduced as far as possible. This is a problem that research on information extraction must solve. This thesis proposes a Web information extraction platform based on XML technology, and applies an inductive learning algorithm to search for and recognize users' browsing tendencies.