              Patent sequence search - Orbit BioSequence
              Blog Post / Published, July 28, 2022


              OBS: From simple patent sequence search to variant analysis

              DNA and RNA sequences, as well as proteins, have been disclosed in patents since the 60s, and a few even before that. Many laws have been created and modified over time to allow different types of biological material to be patented, such as naturally occurring sequences, modified sequences, sequences used in diagnostics, sequences from plants and many other types. We have recently seen that vaccines are a hot topic and some, such as RNA vaccines, do include sequences. Some of the industries that publish sequences may seem surprising: the food industry and detergent manufacturers, for instance, are among them. Unsurprisingly, the pharmaceutical industry, biotech, agrochemical and seed companies produce the bulk of sequence patents. So, why is patent sequence searching important, and why is it different from other types of patent searching?

               

              Patent Sequence Data

              Starting in the 90s with the Human Genome Project, genomic and mRNA sequences became more common in patents. In some cases, whole genomes (from bacteria or fungi), which can be made up of millions of base pairs, were published. Private companies disclosed and, in some cases, claimed millions of short sequences. All this happened when patents were still filed purely on paper. Electronic filing of patents and supplemental materials such as sequence listings finally became available, in part due to sequence patents. Since then, we have seen an increase in the number of patents with sequences, and despite the massive rise in Chinese patents, the worldwide number of newly published patents with sequences still follows a linear curve.

               

              Published sequence patent trends in Orbit BioSequence: historical trend of newly published sequence patents from 2000 to 2020.

              The historical big three authorities (USPTO, EPO and WIPO) publish their sequences. Some other authorities, such as JPO, KIPO and CIPO, are very compliant. Others are less systematic or stuck in the past, unfortunately. But even for highly compliant authorities, rules and laws on what sequences should be disclosed vary. It is thus highly recommended to have a family view of your patents, since the sequences published in an EPO document might differ from those in the USPTO or WIPO documents of the same family.

              Why is patent sequence search different?

              Traditional IP searching is done with keywords. Since searching with keywords is imperfect, they are often combined with patent classes, synonym lists, and many other features that, basically, attempt to alleviate the pain induced by the lack of accuracy of keywords.

              Biological sequence searches are different for several reasons. First, there is a common language to describe DNA/RNA and amino acid sequences, entirely independent of the native language the patent is written in. So there is no need for natural language translation.

              Second, since sequences can be very long, several publication standards have existed over time to treat them separately in a sequence listing. Thus, a large majority of published sequences are simple to process electronically. This can be contrasted with chemistry, where images are still an acceptable form of publication.

              Third, unless your sequence is very short, you will always want to find sequences similar to yours, not just identical ones. This is particularly important since a similarity search also tolerates small errors (OCR mistakes, publisher errors). By contrast, if you searched for the keywords “bread yeast”, you would not find “bead yeast” even if the latter could be a spelling mistake.
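To make the contrast between exact keyword matching and similarity matching concrete, here is a minimal sketch using the classic Levenshtein edit distance. It is only an illustration of tolerance to small errors, not the algorithm used by Orbit BioSequence; the function names `edit_distance` and `is_similar` and the `max_edits` threshold are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def is_similar(a: str, b: str, max_edits: int = 1) -> bool:
    """Hypothetical helper: accept strings within max_edits edits."""
    return edit_distance(a, b) <= max_edits


# Exact matching misses the probable typo; similarity matching does not.
print("bread yeast" == "bead yeast")            # False
print(is_similar("bread yeast", "bead yeast"))  # True
# The same idea applies to a nucleotide sequence with one wrong base.
print(edit_distance("ACGTACGT", "ACGAACGT"))    # 1
```

The same principle, with scoring schemes designed for nucleotides and amino acids, is what lets a sequence search still retrieve a patent sequence that contains a single erroneous residue.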

              Fourth, for the last 20 years or so, sequences published in a patent have been numbered and referred to by the keywords SEQ ID NO. It is easy in most cases to know if, say, hit sequence 5 is claimed, since it is referred to by its number in the claims section as SEQ ID NO. 5. This is a unique feature of sequences and one that is critically important, allowing us, for instance, to flag sequence instances as (claimed).
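As a rough illustration of how SEQ ID NO references make claimed sequences easy to spot, the sketch below scans a claims text with a regular expression. The pattern, the function name `claimed_seq_ids` and the sample claims are assumptions for this example only. Real claims also use lists and ranges (for example "SEQ ID NOs: 5 to 9"), which this sketch does not expand: only the first number after each reference is captured.

```python
import re

# Matches "SEQ ID NO. 5", "SEQ ID NO: 5", "SEQ ID NO 5", case-insensitively.
# Assumption: only the first number following each reference is captured.
SEQ_ID_PATTERN = re.compile(r"SEQ\s+ID\s+NOS?\s*[.:]?\s*(\d+)", re.IGNORECASE)


def claimed_seq_ids(claims_text: str) -> set[int]:
    """Return the set of sequence numbers referenced in the claims text."""
    return {int(number) for number in SEQ_ID_PATTERN.findall(claims_text)}


# Hypothetical claims text, written for this example.
claims = (
    "1. An isolated polypeptide comprising SEQ ID NO. 5.\n"
    "2. The polypeptide of claim 1, encoded by SEQ ID NO: 12."
)

claimed = claimed_seq_ids(claims)
for hit_number in (5, 7, 12):
    label = "(claimed)" if hit_number in claimed else "(not claimed)"
    print(f"SEQ ID NO. {hit_number} {label}")
```

Note that the claim number in "claim 1" is not picked up, because the pattern requires the SEQ ID NO keywords before the digits.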

              A sequence aligned to a sequence appearing in three USPTO documents, claimed or not in Orbit BioSequence.

              Alignments and algorithms 

              Patent sequence searching consists of aligning your query sequence to sequences in a database using specific algorithms and parameters. This is all pretty complicated; however, it can be reduced to a few use cases. At the risk of oversimplifying the problem, either you use a long gene sequence to find similar sequences, or you use a short sequence. For the former, everything works well; just make sure to compare your gene to both nucleotide and protein databases, since you don’t know beforehand what a patent could claim or disclose. In the latter case of a short sequence, things are a bit more difficult. You might want to find sequences that perfectly match your query, or permit a few mismatches. Do you want to allow gaps? If you use 3 or 6 antibody CDRs, do you want them all aligned to other CDRs, or to heavy or light chains? All those questions might lead to different algorithms and parameters. But don’t worry, we have extensive documentation and a great helpdesk. Complex problems don’t always lead to simple solutions!
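For the short-sequence case, the effect of the "perfect match or a few mismatches, gaps or no gaps" choice can be sketched with a simple ungapped scan. This is a naive sliding-window illustration under stated assumptions, not the algorithm behind Orbit BioSequence; the function name `find_ungapped_hits` and the `max_mismatches` parameter are hypothetical.

```python
def find_ungapped_hits(query: str, subject: str, max_mismatches: int = 0):
    """Slide the query along the subject without gaps and return
    (start, mismatches) for every window within the mismatch budget.
    Start positions are 0-based offsets into the subject."""
    q, s = query.upper(), subject.upper()
    if not q:
        return []  # an empty query has no meaningful hits
    hits = []
    for start in range(len(s) - len(q) + 1):
        window = s[start:start + len(q)]
        mismatches = sum(1 for a, b in zip(q, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


subject = "TTACGTTTACCTAA"  # made-up database sequence
print(find_ungapped_hits("ACGT", subject))                    # exact matches only
print(find_ungapped_hits("ACGT", subject, max_mismatches=1))  # one mismatch allowed
```

Allowing one mismatch returns an additional hit that the exact search misses. Allowing gaps would require a dynamic-programming alignment instead of a fixed-width window, which is one reason different questions lead to different algorithms and parameters.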

               

              Pairwise vs. Variant multiple sequence alignment 

              Eventually, you will see pairwise alignments, in other words, your query sequence aligned to a patent sequence. This will give you intricate details of the differences between the two sequences and, combined with the available patent information, will help you decide whether this alignment is relevant to your FTO, patentability, etc. Indeed, there can be a lot of sequences in the same patent family, and many families. You will need to browse through many of them, but we can help with filters that narrow the results down to only the most relevant alignments and families. However, you will miss a global view of the alignments. How many patent sequences have a lysine at position 34 of your query? This can only be answered with a variant analysis.

              A variant analysis will stack all patent sequences aligned to your query and will give you a global view for each query position. In other words, it will create a multiple alignment based on your query sequence. You can query, modify, and export the dataset and, most importantly, explore the variations to gain new insight into what your competitors are doing or which areas are never modified, for instance.
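The stacking idea can be sketched as follows: each pairwise alignment is projected onto the query coordinates, and the residues seen at each query position are counted. This is a simplified illustration, not the Orbit BioSequence implementation. The input format (1-based query start, gapped query string, gapped hit string, with "-" for gaps), the function name `stack_on_query` and the sample alignments are all assumptions, and insertions relative to the query are simply skipped.

```python
from collections import Counter


def stack_on_query(query_length: int, alignments):
    """Build one residue Counter per query position from pairwise alignments.

    Each alignment is (query_start, aligned_query, aligned_hit), where
    query_start is 1-based and both strings have equal length with '-'
    marking gaps. Assumes every alignment fits within the query.
    """
    columns = [Counter() for _ in range(query_length)]
    for query_start, aligned_query, aligned_hit in alignments:
        position = query_start - 1  # convert to 0-based index
        for q_char, h_char in zip(aligned_query, aligned_hit):
            if q_char == "-":
                continue  # insertion in the hit: no query column to fill
            columns[position][h_char] += 1  # '-' here means a deletion
            position += 1
    return columns


# Made-up query "MKTAYK" and three made-up pairwise alignments.
alignments = [
    (1, "MKTAYK", "MKTAYK"),
    (1, "MKTAYK", "MRTAYK"),
    (2, "KT-AYK", "KTGA-K"),
]
columns = stack_on_query(6, alignments)

# How many aligned sequences have a lysine (K) at query position 2?
print(columns[1]["K"])
# Full residue distribution at query position 2.
print(dict(columns[1]))
```

A question such as "how many patent sequences have a lysine at position 34 of your query?" then becomes a lookup in the column for that position, which a list of separate pairwise alignments cannot answer directly.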


              Variations at several positions using Orbit BioSequence Variant Analysis

               

              Orbit BioSequence (OBS) 

              With extensive access to patent sequences as well as non-patent sequences, Orbit BioSequence is the perfect tool for your FTO, patentability and business intelligence searches. By easily combining patent data and sequences, OBS will make your patent sequence searches a lot easier than other tools purely dedicated to sequences. Antibodies and CDRs, genes, and primers can all be used, combined, and explored.

               

              Interested in finding out more? Contact us for specific advice or support, or watch the recording of our recent webinar, Smart & visual sequence variations explorer in patent data by Orbit BioSequence.
