If the deep dark secret of neural networks is that “no one really knows how the most advanced algorithms do what they do,” where does that leave patents arising from the incorporation of this technology?

Patents require a “who” and a “how” associated with them.  There is someone (or more than one person) identified as the inventor – the “who.”  The patent also should contain an explanation of the methods to make and use the invention – the “how.”  Generally, the issues that arise in these areas are straightforward.  Inventorship turns on separating those who conceived of the invention from others who contributed in different ways, such as by carrying out experiments directed by the inventor.  The “how” in patent parlance is referred to as enablement.  It often comes down to an assessment of whether the patent sufficiently explains how to make and use the invention in view of what was known to those in the field at the time the patent application was filed.

Now that machine learning and deep neural networks have entered the picture, the “who” and the “how” may not be as easily identified.

Who is the inventor?

Let’s take a drug, for example.  Old style – a researcher identifies potential chemical structures to act on a specific target.  The compounds are tested and further structural modifications are made.  New structures are tested and eventually, a single structure or small group of more active structures shines through.  A patent on these potential drugs will generally identify the selected chemical structures, how to chemically synthesize them, how to test them for target activity and how these candidates are to be administered to achieve their intended clinical purpose.

New style – a computer is trained with data associating chemical structures with effects on targets generally (e.g., an amalgam of clinical data, adverse event data, and research data from publications) and then this in silico approach is further used to select compounds having a specific effect on a selected target.  The potential candidates can then be validated using in vitro and animal models.  What if a patent is filed before that validation?  Can it identify the required inventor?
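To make the “new style” workflow concrete, here is a minimal sketch of the train-then-screen loop.  It is an illustration under stated assumptions, not a description of any real drug-discovery system: the binary structural “fingerprints,” the activity labels and the candidate names are all hypothetical, and a single logistic unit stands in for the far larger networks used in practice.

```python
import math

# Hypothetical training data: each compound is a binary "fingerprint"
# (1 = structural feature present) paired with a label (1 = active on target).
TRAINING_SET = [
    ([1, 1, 0, 0], 1),
    ([1, 0, 1, 0], 1),
    ([1, 1, 1, 0], 1),
    ([0, 0, 1, 1], 0),
    ([0, 1, 0, 1], 0),
    ([0, 0, 0, 1], 0),
]

# Hypothetical untested candidates to be screened in silico.
CANDIDATES = {
    "candidate_A": [1, 1, 0, 0],
    "candidate_B": [0, 0, 1, 1],
    "candidate_C": [1, 0, 0, 1],
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, fingerprint):
    """Predicted probability that a compound is active on the target."""
    z = bias + sum(w * x for w, x in zip(weights, fingerprint))
    return sigmoid(z)


def train(data, epochs=500, learning_rate=0.5):
    """Fit a single logistic unit by stochastic gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for fingerprint, label in data:
            error = predict(weights, bias, fingerprint) - label
            for i, x in enumerate(fingerprint):
                weights[i] -= learning_rate * error * x
            bias -= learning_rate * error
    return weights, bias


def rank_candidates(weights, bias, candidates):
    """Sort candidates from most to least likely to be active."""
    scored = [(name, predict(weights, bias, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


weights, bias = train(TRAINING_SET)
ranking = rank_candidates(weights, bias, CANDIDATES)
for name, score in ranking:
    print(f"{name}: predicted activity {score:.2f}")
```

Note what the people involved actually supplied in this sketch: the data, the training procedure and the goal.  The ranking of candidates comes out of the fitted model, which is the situation the inventorship question below turns on.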

An inventor is one who contributes to the invention as it is claimed by the patent.  This turns on two parts – what is encompassed by the patent claims and what constitutes conception for each of these claims.  For a potential drug, claims could include the chemical structure, a pharmaceutical formulation of the compound, a method of using the compound to impact a biological target and a method of administering the compound to treat a disease.  Taking the chemical structure and the method of using it to impact the target, these are two areas where the in silico screening in the above example has arrived at the answers.  The neural network selected the candidate compounds and did so based on the predicted impact of the structures on the target.  So, is the network the inventor for these patent claims?

Conception is defined as the “formation in the mind of the inventor, of a definite and permanent idea of the complete and operative invention, as it is hereafter to be applied in practice.”[1]  For the chemical structure and target effect claims, the neural network could fit into this definition.  For the target effect, even if arguably some validation studies are necessary to verify the compound’s effect, these studies are generally straightforward and routine.  Case law says that conception is complete where only ordinary skill (in other words, routine experimentation) is required to reduce the invention to practice.  This analysis also applies to the inventorship for the chemical structure claims.  Where a method of making the compound is not known, simply having the structure may be insufficient for complete conception.[2]  However, for known compounds, whether synthesized in the past by others or made by routine methods, the identification of the compound by the neural network may constitute complete conception.

Is the neural network the sole inventor for these claims?  Neither the scientist who wrote the algorithms and provided the training data set, nor the individual who decided on the approach of using the neural network to discover the chemical compounds, would be an inventor.  Inventorship requires “a particular solution to the problem at hand, not just a general goal or research plan [one] hopes to pursue.”[3]  These individuals provided the plan and the overall goal, but never themselves conceived of the claimed chemical structures.

To play devil’s advocate, I might argue that the neural network is not necessarily the inventor either.  Case law says that a researcher who simply follows the instructions of another is not an inventor.[4]  If the software is only following the instructions provided by the coders, can it be an inventor?  But this is the crux of the matter:  Is a neural network just following instructions?  A neural network is generally viewed as learning from the data it receives and the patterns it recognizes.  This goes beyond simply following a set of defined steps.  Is that learning “conception” in this new age?

Enablement: How to make the claimed invention

The functionality of neural networks and how they learn brings us to the second issue – the “how” of the invention.  A patent is required to describe “the manner and process of making and using [the invention], in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same.”[5]

Let’s take a different example to illustrate a possible dilemma with this requirement: a method for identifying cancerous cells by reviewing images of tissue sections, identifying the cancerous lesions and providing a predictive score on the ability to treat the cancer.  The method in our example uses a neural network trained on data to identify the cancerous cells in the images and to associate particular features of the cellular images with treatment outcomes.  Can the patent sufficiently describe how the algorithms make these identifications and derive the predictive scoring?

Recent press has suggested that neural networks have become so sophisticated that it is not clear how a network makes its decisions.  For instance, if the system learns to recognize a dog, what features does it use to say it’s a dog and not a cow – the fur, the posture, the head shape?  As an April MIT Technology Review article put it in “The Dark Secret at the Heart of AI,” “No one really knows how the most advanced algorithms do what they do.”

In one analysis, researchers examined how a neural network processed images – specifically, whether a single neuron or groups of neurons in the network were responsible for detecting patterns.  They found that single neurons were responsible.  To reach this conclusion, the experiment required a set of images that were labeled down to the pixel level, and the researchers knew what part of the network was perceiving each specific part of the image.  That setup was built for the question at hand; it is not the usual case.

In the cancer-detecting method above, researchers might not know what aspects of the images the software homes in on for cancer identification, or the features or patterns of features that the network uses to arrive at the predictive treatment outcome.  The neural network is trained to detect patterns, and patterns of patterns, in the tissue section images.  The algorithms also identify and exploit correlations between these patterns and known treatment outcomes (such as from the medical literature).  But the neural network is not usually configured to reveal what patterns it has selected from the training data to arrive at its conclusions.
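One way to probe a trained model after the fact is occlusion: mask one region of the image at a time and watch how much the score drops.  The sketch below is only an illustration under stated assumptions.  The function `toy_classifier` is a hypothetical stand-in for a trained network (its rule is visible here only because we wrote it), and the 4x4 “image” and its intensity values are made up.

```python
def toy_classifier(image):
    """Hypothetical stand-in for a trained network; returns a score in [0, 1].

    Its hidden rule is the mean intensity of the top-left 2x2 corner.
    A real network exposes no such readable rule.
    """
    corner = [image[r][c] for r in range(2) for c in range(2)]
    return sum(corner) / len(corner)


def occlusion_map(model, image, patch=2, fill=0.0):
    """Return the score drop caused by masking each patch x patch tile."""
    baseline = model(image)
    rows, cols = len(image), len(image[0])
    drops = {}
    for top in range(0, rows, patch):
        for left in range(0, cols, patch):
            masked = [row[:] for row in image]  # copy so the input is untouched
            for r in range(top, min(top + patch, rows)):
                for c in range(left, min(left + patch, cols)):
                    masked[r][c] = fill
            drops[(top, left)] = baseline - model(masked)
    return drops


# Made-up 4x4 "tissue image" with two bright regions (top-left, bottom-right).
image = [
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.6, 0.3, 0.1],
    [0.2, 0.1, 0.9, 0.9],
    [0.1, 0.3, 0.8, 0.7],
]

drops = occlusion_map(toy_classifier, image)
most_influential = max(drops, key=drops.get)
print("Tile with the largest score drop:", most_influential)
```

In this toy case the probe shows that the bright bottom-right region, which a human reviewer might well consider important, has no effect on the score.  Even so, a probe of this kind shows only where the model is looking, not what pattern it has recognized there or why that pattern correlates with an outcome.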

Until we have an understanding of how a neural network arrives at its answers – in our patent example, what features of the images it identifies as cancerous and what it correlates with other data to arrive at a predictive outcome score – the “how” may not be possible to fully describe.  The patent in the example above would likely not be able to disclose to someone in the field how to arrive at the same result.  While the patent system does not require absolute predictability of success from the description set out in the specification, it does require a reasonable expectation of success.  The predictability problem would be particularly acute where the patent does not reveal the full details of the coding for the algorithms and the data set used for training.

Where does this leave patents that incorporate neural networks?

In many cases, this leaves us no different from where “old school” technologies sit.  On the inventorship issue, the US patent office doesn’t generally review inventorship before granting the patent.  Enablement typically comes up in what the patent office sees as “unpredictable arts.”  Historically, these have not included computer-related technologies; the issue tends to surface in patents for biological and chemical inventions.  Depending on how the claims are written and what art unit of the patent office reviews the application, enablement may never be mentioned before the patent is granted.

Then why do I raise these topics?  Because patents, once issued, can be and often are challenged on these points in litigation and, more recently, through post-grant review proceedings.  So, while patents may issue – heads up – these points may be coming down the pike.


[1] Hybritech Inc. v. Monoclonal Antibodies Inc., 802 F.2d 1367, 1376 (Fed. Cir. 1986).

[2] Falana v. Kent State Univ., 669 F.3d 1349 (Fed. Cir. 2012).

[3] Burroughs Wellcome Co. v. Barr Lab., Inc., 40 F.3d 1223, 1228 (Fed. Cir. 1994).

[4] Stern v. Tr. of Columbia Univ., 434 F.3d 1375 (Fed. Cir. 2006); Fritsch v. Lin, 21 USPQ2d 1737, 1739 (Bd. Pat. App. & Inter. 1991).

[5] 35 U.S.C. §112(a).