GSAM: A deep neural network model for extracting computational representations of Chinese addresses fused with geospatial feature

Xu Liuchang; Du Zhenhong; Mao Ruichen; Zhang Feng; Liu Renyi

Abstract

Addresses are one of the most important geographical reference systems in natural language. In China, because address planning is relatively underdeveloped, there are a large number of non-standard addresses. This kind of unstructured text makes Chinese addresses much harder to manage and apply. However, by extracting computational representations of addresses, the addresses can be structured and their related applications extended more conveniently. This paper therefore uses a deep neural language model from natural language processing (NLP) to extract computational representations automatically through an address language model (ALM), which is trained in an unsupervised way and is suitable for a large-scale address corpus. We propose a solution for fusing address semantics with geospatial features and construct a geospatial-semantic address model (GSAM) that supports a variety of downstream tasks. The GSAM construction process consists of three phases. First, we build an ALM using bidirectional encoder representations from Transformers (BERT) to learn the semantic representations of addresses. Then, fusion clustering results for the semantic and geospatial information are obtained with a high-dimensional clustering algorithm. Finally, we construct the GSAM from the fused clustering results using novel fine-tuning techniques. Furthermore, we apply the computational representations extracted from the GSAM to the address location prediction task. The experimental results indicate that the target task accuracy of the ALM is 90.79%, and that the result of semantic-geospatial fusion clustering correlates strongly with fine-grained urban neighbourhood area division. The GSAM can accurately identify clustering labels, and the values of the evaluation metrics are all above 0.96. We also demonstrate that our model outperforms purely ALM-based and word2vec-based models on the address location prediction task.
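The second phase described in the abstract, clustering on fused semantic and geospatial features, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the fusion rule, so everything below is an assumption made for illustration: plain k-means stands in for the "high-dimensional clustering algorithm", fusion is a simple concatenation of standardized features, and the names `zscore_columns`, `fuse`, `kmeans`, and `geo_weight`, as well as the toy vectors and coordinates, are hypothetical and do not come from the paper.

```python
import math

def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance."""
    out_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        out_cols.append([(v - mean) / std for v in col])
    return [list(r) for r in zip(*out_cols)]

def fuse(semantic, coords, geo_weight=1.0):
    """Concatenate standardized semantic vectors with weighted, standardized
    coordinates. geo_weight is a hypothetical knob, not a parameter from the paper."""
    sem = zscore_columns(semantic)
    geo = zscore_columns(coords)
    return [s + [geo_weight * g for g in gv] for s, gv in zip(sem, geo)]

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-first initialization
    (a stand-in for the unspecified clustering algorithm)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (hypothetical): 2-D "semantic" vectors standing in for ALM embeddings,
# paired with (longitude, latitude) for six addresses in two neighbourhoods.
semantic = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
            [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
coords = [[120.10, 30.25], [120.11, 30.26], [120.12, 30.25],
          [120.30, 30.40], [120.31, 30.41], [120.32, 30.40]]

labels = kmeans(fuse(semantic, coords, geo_weight=1.0), k=2)
print(labels)
```

Standardizing both feature groups before concatenation keeps either the embedding dimensions or the coordinates from dominating the distance purely because of scale. In a full pipeline, cluster labels of this kind would serve as the targets for the fine-tuning phase, which is not sketched here.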


Bibliographic record

Source
Computers, Environment and Urban Systems | 2020, Issue 5 | 101473.1-101473.12 | 12 pages
Authors
Xu Liuchang; Du Zhenhong; Mao Ruichen; Zhang Feng; Liu Renyi;
Author affiliations

Zhejiang Univ Sch Earth Sci 38 Zheda Rd Xixi Campus Hangzhou 310027 Zhejiang Peoples R China;

Zhejiang Univ Sch Earth Sci 38 Zheda Rd Xixi Campus Hangzhou 310027 Zhejiang Peoples R China|Zhejiang Prov Key Lab Geog Informat Sci 148 Tianmushan Rd Hangzhou 310028 Peoples R China;

Zhejiang Univ Sch Earth Sci 38 Zheda Rd Xixi Campus Hangzhou 310027 Zhejiang Peoples R China;

Zhejiang Univ Sch Earth Sci 38 Zheda Rd Xixi Campus Hangzhou 310027 Zhejiang Peoples R China|Zhejiang Prov Key Lab Geog Informat Sci 148 Tianmushan Rd Hangzhou 310028 Peoples R China;

Zhejiang Univ Sch Earth Sci 38 Zheda Rd Xixi Campus Hangzhou 310027 Zhejiang Peoples R China|Zhejiang Prov Key Lab Geog Informat Sci 148 Tianmushan Rd Hangzhou 310028 Peoples R China;

Original format: PDF
Language: English
Keywords
Chinese addresses; Computational representations; Semantic/geospatial fusion; Bidirectional encoder representations from transformer; Feature fusion clustering; Address location prediction task;


Similar literature

1. A Deep Neural Network Model using Random Forest to Extract Feature Representation for Gene Expression Data Classification [J]. Yunchuan Kong, Tianwei Yu. Scientific Reports. 2018, Issue 1
2. A Deep Neural Network Model using Random Forest to Extract Feature Representation for Gene Expression Data Classification [J]. Yunchuan Kong, Tianwei Yu. Scientific Reports. 2018, Issue 1
3. F_0 Modeling for Isarn Speech Synthesis using Deep Neural Networks and Syllable-level Feature Representation [J]. Janyoi Pongsathon, Seresangtakul Pusadee. The International Arab Journal of Information Technology. 2020, Issue 6
4. Battery Degradation Modeling Based on FIB-SEM Image Features Extracted by Deep Neural Network [C]. Yoichi Takagishi, Tatsuya Yamaue. Advanced Automotive Battery Conference Asia; Conference on Lithium-Ion Battery Chemistries. 2019
5. Prediction of Electronic Component Prices: from Classical Statistical and Machine Learning Models to Deep Neural Networks with Feature Embedding [D]. Zhang, Yu. 2019
6. A Deep Neural Network Model using Random Forest to Extract Feature Representation for Gene Expression Data Classification [O]. Yunchuan Kong, Tianwei Yu
7. Entropy information-based heterogeneous deep selective fused features using deep convolutional neural network for sketch recognition [O]. Shaukat Hayat, She Kun, Sara Shahzad. 2021
