Keywords: Entity resolution, de-duplication, disambiguation, patent data
PatentsView, an initiative supported by the Office of the Chief Economist at the US Patent & Trademark Office (USPTO), is a tool for patent search, analysis, and visualization. It provides a curated and easy-to-use database of pre-grant and granted USPTO patents from 1976 to present. PatentsView not only processes and cleans the raw patent data from USPTO’s bulk XML files (https://bulkdata.uspto.gov/), but also performs entity resolution on ambiguous inventor names, assignee names, and the location names of inventors and assignees. This process determines which inventor, assignee, and location names in PatentsView refer to the same entity. In this work, we describe the entity resolution models and algorithms used in PatentsView, highlight their technical and empirical strengths, and provide examples of studies that have benefited from PatentsView’s disambiguation.