The USPTO recently announced an expansion of PatentsView, its visualization tool for US patents. First launched a few years ago, the tool was intended to make 40 years of patent filing data freely available to those interested in examining “the dynamics of inventor patenting activity over time”. Although it is limited to granted patents (not applications) and covers only the US, it offers some interesting visualizations around locations and citations.
Director Michelle said that the tool is based on “the highest-quality patent data available,” meaning the underlying USPTO data set. Through no fault of the USPTO, that data set is rife with spelling errors, doesn’t reflect patent reassignments, and doesn’t resolve company subsidiaries or acquisitions.
This issue is not unique to the USPTO. Other patent offices around the world face similar barriers to presenting “clean” data. The first issue, spelling errors, simply reflects the fact that assignee information (like other fields, such as inventor names) is manually entered, and hence prone to error and inconsistency. For example, International Business Machines has been spelled 1,200 different ways as a patent assignee over the last two decades in the USPTO data set.
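To make the spelling problem concrete, here is a minimal sketch of how assignee-name variants might be grouped under one key. It is an illustration only, not the method used by the USPTO, PatentsView, or Innography. The suffix list, the 0.9 similarity threshold, and the sample strings (including the deliberate typo) are all assumptions made up for this example.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical, deliberately short list of legal-form words to ignore.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize_assignee(name: str) -> str:
    """Reduce a raw assignee string to a crude canonical key."""
    # Lowercase, then turn punctuation into spaces.
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def group_variants(names, threshold=0.9):
    """Group raw names whose normalized keys are identical or nearly so."""
    groups = defaultdict(list)
    for name in names:
        key = normalize_assignee(name)
        # Reuse an existing key if it is similar enough (catches simple typos).
        for known in list(groups):
            if SequenceMatcher(None, key, known).ratio() >= threshold:
                key = known
                break
        groups[key].append(name)
    return dict(groups)


# Made-up variants for illustration; the last one contains a typo.
variants = [
    "International Business Machines Corporation",
    "INTERNATIONAL BUSINESS MACHINES CORP.",
    "International Business Machines, Inc",
    "Internatonal Business Machines Corp",
]
print(group_variants(variants))
```

A sketch like this handles case, punctuation, legal suffixes, and small typos, but it would miss abbreviations such as “IBM” and could wrongly merge distinct companies with similar names, which is part of why cleaning this data at scale is hard.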
In addition, the PTO data doesn’t get updated to reflect later corrections or patent reassignments. As just one example out of millions, patent US8176440 was originally, and incorrectly, assigned to Silicon Labs. Innography filed a certificate of correction to correct the assignment, yet the USPTO data and PatentsView don’t reflect this. In fact, Innography research shows that almost 20% of US patents are reassigned in their lifetimes, so a company’s portfolio built from PTO data is typically about 20% wrong on this factor alone.
Finally, the PTO data doesn’t reflect when one company acquires another, when there’s a spinoff, or when patents are filed by (and formally assigned to) a subsidiary rather than its parent. LinkedIn’s patents, for example, are now all owned by Microsoft, even if the reassignments haven’t been processed.
As a result, the PTO data falls far short of reflecting reality, where patents and companies are bought and sold every day, and where data-entry errors exist and are corrected. The accuracy of the data is very low when it comes to representing company patent portfolios in the real world.
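The reassignment and corporate-structure gaps described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of any vendor’s pipeline: the `Assignment` record, the `parent_of` mapping, and the placeholder patent identifiers and company names in the usage lines are hypothetical. Only the LinkedIn and Microsoft relationship comes from the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assignment:
    """One recorded transfer of a patent (hypothetical record shape)."""
    patent: str
    assignee: str
    recorded: date


def ultimate_parent(company, parent_of):
    """Walk a child -> parent mapping up to the top-level owner."""
    seen = set()
    while company in parent_of and company not in seen:
        seen.add(company)  # guard against accidental cycles in the mapping
        company = parent_of[company]
    return company


def current_owner(patent, original_assignee, assignments, parent_of):
    """Latest recorded assignee if any, else the original, rolled up to its parent."""
    events = sorted(
        (a for a in assignments if a.patent == patent),
        key=lambda a: a.recorded,
    )
    holder = events[-1].assignee if events else original_assignee
    return ultimate_parent(holder, parent_of)


# Corporate relationship taken from the text; everything else is a placeholder.
parent_of = {"LinkedIn": "Microsoft"}
assignments = [
    Assignment("EXAMPLE-1", "Example Startup", date(2012, 5, 1)),
    Assignment("EXAMPLE-1", "LinkedIn", date(2015, 3, 1)),
]
print(current_owner("EXAMPLE-1", "Example Startup", assignments, parent_of))
print(current_owner("EXAMPLE-2", "Solo Inventor Co", assignments, parent_of))
```

Reading only the original assignee field is equivalent to skipping both steps, which is how a portfolio count drifts away from real-world ownership.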
Tools like PatentsView Serve a Purpose
The USPTO is making the data it has available in keeping with its mission of transparency in the patenting and invention processes. But if IP practitioners cannot base decisions on the search results, then what is the data for? There are many good uses for the very rich information available through the patenting process, including economic research, prior-art searching, and discovering broader trends around filing patterns. It was never intended that patent professionals would use this data as-is to inform the strategic decisions they make on a regular basis: in- and out-licensing, merger and acquisition activities, portfolio pruning and maintenance decisions, and more.
Using the Underlying Data Exactly As It Is
It makes sense for the PTOs to find ways to offer the data they have freely, with the goal of engaging the community’s interest in the patenting process. It makes no sense at all that many lightweight patent analytics tools use this data verbatim and still dare to tout the “data quality” they offer to IP professionals trying to do their jobs. (Read a prior post of mine on the Third Age of Patent Search tools.)
Many patent analyses, such as competitive benchmarking, acquisition analysis, and negotiation preparation, start with a company’s patent portfolio. In addition, just about every board-level question about patents requires accurate patent ownership information: “Are we ahead of or behind this competitor?” “What companies should we be worried about in this technology area?”
Innography has always had a mission of creating the most accurate data set possible, using other sources of information to cross-check and improve patent data accuracy. Our customers want and need to be able to make strategic decisions using intellectual property data, and they require the best data possible in order to make those decisions with confidence. For example, over 2,000 company acquisitions are processed every year by our data scientists, and our user base submits over 5,000 suggested updates every year. As a result, over the last decade, we’ve created over 10 million data-correction rules, which are constantly updated by our data science team via machine learning and crowdsourcing.
We’ve also applied our data-accuracy algorithms to inventors, resolving over 90% of misspellings and inconsistencies in inventor names. As one final data-improvement example, many applications and some grants don’t have company assignees, so Innography looks at the inventors, dates, and counsel information to derive who the company owner is. We call these “hidden assignments.”
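As a rough illustration of the idea behind inferring an owner for an unassigned record, the sketch below votes among the assignees of other patents that share inventors and were filed within a time window. It is an assumption-laden toy, not Innography’s algorithm: the `Record` shape, the two-year window, the minimum vote count, and all names are hypothetical, and counsel information is ignored entirely.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Record:
    """A patent or application with an optional assignee (hypothetical shape)."""
    number: str
    inventors: frozenset
    filed: date
    assignee: Optional[str]


def infer_assignee(target, corpus, window_days=730, min_votes=2):
    """Guess an owner from nearby filings that share inventors with the target."""
    votes = Counter()
    for rec in corpus:
        if rec.assignee is None or rec.number == target.number:
            continue
        # Only consider filings close in time to the target.
        if abs((rec.filed - target.filed).days) > window_days:
            continue
        shared = len(rec.inventors & target.inventors)
        if shared:
            votes[rec.assignee] += shared  # one vote per shared inventor
    if not votes:
        return None
    assignee, count = votes.most_common(1)[0]
    # Refuse to guess on weak evidence.
    return assignee if count >= min_votes else None


# Made-up records for illustration.
target = Record("APP-1", frozenset({"A. Inventor", "B. Inventor"}), date(2016, 1, 1), None)
corpus = [
    Record("PAT-1", frozenset({"A. Inventor"}), date(2015, 6, 1), "Example Co"),
    Record("PAT-2", frozenset({"B. Inventor", "C. Inventor"}), date(2016, 9, 1), "Example Co"),
    Record("PAT-3", frozenset({"A. Inventor"}), date(2005, 1, 1), "Old Employer"),
]
print(infer_assignee(target, corpus))
```

The time window matters because inventors change employers; without it, a decade-old filing would count as much as one from the same year.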
The team at Innography wants you to focus on making sense of the reports produced, and to spend your time discussing the results with other strategic leaders in order to make game-changing decisions about your business. If you are looking for software to perform accurate patent analyses, look for the most accurate patent data, data that reflects changes in the real world, and don’t rely on a tool’s claim of “data quality” when it only uses the PTO’s original, never-updated assignee information.