By Thom Kobayashi ‑ November 17, 2017
Patent documents from different countries are plagued with dirty data. It’s the nature of the beast. And the more active a company is in patenting and inventor activity, the higher the likelihood of varied spellings and the dirtier the data inevitably gets. To data perfectionists like us, dirty data isn’t something we can accept. Every task, report, or insight is wrong before it even starts if it’s based on bad data. That’s why, over ten years ago, we made data cleansing (or, for company and inventor names, normalisation) one of Innography’s top priorities, and why we’ve worked so tirelessly on it ever since. We are excited to share the new resolution of inventors in our most recent software release.
Innography has built a reputation for delivering the closest thing to reality when reporting company portfolios. Using the proprietary algorithms behind that company normalisation as a basis and working with innovative data mining concepts, we were able to achieve the same high standard for inventor name normalisations.
Until recently, tracking inventors has been virtually impossible to get a handle on. You’ve got misspelled first names, misspelled last names, partial names, missing middle names, middle initials: the possible variations are endless. For instance, even our founder’s name, Tyron Stading, which is pretty unique, has 26 spelling variations on 108 patent documents. Imagine if the name were more common (like John Smith) or had more than twelve letters.
Add company name variations, questions about ownership of the records and how the hierarchies get rolled up, and then inventors who move cities, states, or countries and switch companies on top of all the name variations, and whew! You can see how it’s basically impossible to track an inventor (what they’ve done, what their areas of expertise are, where they are located, who they’ve worked for and with) and to truly know any inventor’s profile for certain.
By law, in some countries, like Germany and China, companies have to compensate inventors for the patents that they file. Other companies have inventor rewards programs to encourage submission of new ideas that are core to the company’s innovation strategy. When our founder, Tyron Stading, worked at IBM, the company had Plateaus: you got points for reaching different milestones, such as filing a patent application, having a patent granted, or receiving key award recognition. Inventors were recognised by the plateaus that they had reached. You can imagine how difficult it is to track all that: “Is the employee here? Did that patent grant? Am I double-counting? Is this even the same person?”
One company we work with told us that they spend one month out of every quarter going through patent data just to do their inventor compensation. They consider the program very important, worthwhile enough to spend 4-5 months out of every year just on their own inventor recognition. That company was able to pull the same report in a fraction of the time with our latest release.
Though we’ve offered name normalisation as a function for some time, we never gave it an official release because we knew we could do much better. Now, that time has come. One year ago, our inventor name normalisation was at a 55% match rate compared to a customer “golden” data set. Today, it’s at 92%. Further, in benchmarking against another IP data vendor, we found their normalisations at 27% consistency, or 70% less accurate than ours. With much higher-quality data, we’ve been able to cut the number of inventors in our system nearly in half, driving much better fidelity.
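For readers who want to check the arithmetic, here is a minimal sketch of one way to read the "70% less accurate" figure: as a relative difference between the two match rates rather than a gap in percentage points. That reading is an assumption; the post does not say how the comparison was computed, and the variable names are illustrative only.

```python
# Match rates quoted above, as fractions of the customer "golden" data set.
ours = 0.92          # current inventor normalisation
other_vendor = 0.27  # benchmarked IP data vendor

# Gap in percentage points (simple subtraction).
point_gap = (ours - other_vendor) * 100

# Relative gap: how much lower the other rate is, as a share of ours.
# Assumption: this is the reading behind "70% less accurate".
relative_gap = (ours - other_vendor) / ours

print(f"percentage-point gap: {point_gap:.0f}")
print(f"relative gap: {relative_gap:.1%}")
```

Under this reading, the relative gap comes out at a little over 70%, which is consistent with the figure quoted, while the gap in percentage points is 65.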
Now you can easily find which inventors have worked on specific technologies or for specific companies, or recognise inventors in your own organisation.
By conducting an extensive cross-comparison validation, we can identify a particular inventor by ensuring that there’s no overlap, that their subject expertise is the same, and that they worked at the same company during the same year. After all these standardisations, the records are clustered together based on expertise, years at the company, and company level. We’ve mined 18 million unique inventor profiles across 30 million inventor employment histories that can be used to get an accurate view of inventors, reducing redundant inventor records due to misspellings, duplicates/rollups, etc. by 70%.
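To make the idea concrete, here is a deliberately simplified sketch of grouping inventor records by a normalised name plus company and technology area. It is not Innography’s proprietary algorithm: the `InventorRecord` fields, the `name_key` rule (first initial plus surname), and the `cluster` function are all hypothetical, and a real system would also need fuzzy matching for misspellings and checks on employment years.

```python
import re
import unicodedata
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InventorRecord:
    """One inventor mention on a patent document (hypothetical fields)."""
    raw_name: str
    company: str
    tech_area: str


def name_key(raw_name: str) -> str:
    """Reduce a name to a crude key: first initial + surname, ASCII lowercase."""
    # Strip accents so that e.g. "José" and "Jose" produce the same key.
    text = unicodedata.normalize("NFKD", raw_name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Reorder "SURNAME, Given" into "Given SURNAME".
    if "," in text:
        surname, given = text.split(",", 1)
        text = f"{given} {surname}"
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return ""
    return f"{tokens[0][0]}_{tokens[-1]}"


def cluster(records):
    """Group records that share a name key, company, and technology area."""
    groups = defaultdict(list)
    for rec in records:
        key = (name_key(rec.raw_name), rec.company.lower(), rec.tech_area.lower())
        groups[key].append(rec)
    return dict(groups)


records = [
    InventorRecord("Tyron Stading", "IBM", "storage"),
    InventorRecord("STADING, Tyron", "IBM", "storage"),
    InventorRecord("T. Stading", "ibm", "Storage"),
    InventorRecord("Tyron Stading", "Innography", "analytics"),
]
clusters = cluster(records)
print(len(clusters), "clusters from", len(records), "records")
```

In this toy example, the three IBM records collapse into one cluster while the Innography record stays separate, which mirrors the point above that name alone is not enough: company and subject expertise help decide whether two records describe the same person.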
Inventor normalisation is available in Advanced Analysis as a pull-down search and through filtering and visualisations, so you’re able to do more with inventor tracking than ever before.
Plus, it makes a great resource for competitive intelligence. Use it to find out the most prolific inventor at a particular company or in your space. It can even help you recruit your “secret weapon” by identifying the top expert in a technology area, whether they are in China, Korea, or anywhere else in the world.
Using different mining techniques and approaches to how we map out this data, we are able to create highly accurate employment histories for inventors. Available as a special feature of IdeaScout, Inventor Resumes are like a self-generating LinkedIn for inventors. That makes them a useful tool, given that many inventors are typically not active, or even registered, on LinkedIn.
Other solutions require you to create and maintain your own profiles, but IdeaScout comes with preloaded profiles. Profiles are aggregated by pulling improved, normalised data from public inventor patent records (profile pictures can be manually uploaded).
Currently we have 18 million profiles, each of which highlights a number of different metrics, like the inventor’s previous employment, expert network, how innovative they are in a space, and more.
So, there you have it—Innography’s Inventor Normalisation, in a very large nutshell.