This case study walks through how we implemented an LLM-enhanced search engine for RightSubmission (acquired), a regulatory consulting software tool.
To market a new medical device in the U.S., obtaining authorization from the FDA is a critical step, often achieved through a 510(k) submission. This process requires demonstrating that the new device is substantially equivalent to a previously cleared device. However, identifying such devices can be time-consuming due to the limitations of the FDA's web interface.
Adding to the challenge, the device data needed for a 510(k) submission is only available in non-searchable PDF files within the FDA's system. Fortunately, because the dataset is public, it’s possible to develop a more efficient and user-friendly solution to streamline this process.
Search implementations are not a new concept, so why does RightSubmission need a large language model (LLM)? The answer lies in the unique challenges consultants face when searching for substantially equivalent devices.
On the FDA site, searches are limited to fixed attributes like submission dates or company names, which can be helpful for narrowing options but still require consultants to manually review dozens or even hundreds of devices to find substantially equivalent ones. What they truly need, however, is the ability to "find devices with similar indications for use to my new one." This type of search requires the computer to understand and compare the nuances of language—an area where LLMs excel.
By leveraging an LLM, RightSubmission can bridge this gap, offering consultants a powerful tool to streamline their work.
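The ranking step behind a "similar indications for use" search can be sketched as follows. Everything here is hypothetical: the 510(k) numbers and device descriptions are invented, and `embed` is a toy bag-of-character-trigrams stand-in for a real LLM embedding model; only the embed-then-rank-by-cosine-similarity shape reflects how such a search works.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag of character trigrams. A production system
    would call an LLM embedding model here; this stand-in only
    illustrates the vector-similarity ranking step."""
    padded = f"##{text.lower()}##"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical 510(k) numbers paired with "indications for use" text.
devices = {
    "K000001": "non-invasive monitor for fetal heart rate during labor",
    "K000002": "implantable cardiac pacemaker for treating bradycardia",
    "K000003": "maternal and fetal heart rate monitoring system",
}


def search(query: str, corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Rank every device description by similarity to the query."""
    q = embed(query)
    scored = [(k, cosine(q, embed(text))) for k, text in corpus.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)


results = search("non-invasive maternal fetal heart monitor", devices)
```

With real embeddings the distance captures meaning rather than surface overlap, so a fetal heart monitor ranks above a pacemaker even when the wording differs; the sorted-by-similarity structure stays the same.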
The old workflow involved running a keyword search like "heart monitor" and then manually reviewing the resulting 510(k)s for fetal-monitoring devices. The new search lets the consultant query "non-invasive maternal fetal heart monitor" and immediately retrieve relevant results.
Most search engines handle word variants with stemming and misspellings with fuzzy matching, but neither technique is perfect. An LLM-enhanced search improves on this because its tokenization and vectorization workflow places a misspelled query close to its intended meaning, so relevant results still surface.
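A rough sketch of why this tokenization is forgiving of typos: LLMs break text into subword pieces, so a typo corrupts only a few pieces and the rest of the vector survives. The snippet below imitates that with character trigrams as a stand-in for real subword tokens; the phrases are invented for illustration.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Overlapping character n-grams: a rough analogue of LLM subword
    tokenization. A typo changes only a few n-grams, so most of the
    resulting vector is unaffected."""
    padded = f"##{text.lower()}##"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


correct = "fetal heart monitor"
misspelled = "fetal hart monitr"   # two typos
unrelated = "surgical stapler"

sim_typo = cosine(char_ngrams(correct), char_ngrams(misspelled))
sim_other = cosine(char_ngrams(correct), char_ngrams(unrelated))
# The misspelled query still sits far closer to the correct phrase
# than an unrelated one, so the right documents can still rank first.
```

An exact keyword match on "monitr" would return nothing; a vector comparison degrades gracefully instead of failing outright.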
Effectively using the old search required insider knowledge, both about how to structure a query and what terms to search for. The new LLM-based search removes the need to know how to "search correctly," making it easier for new consultants to get started.