Press Release
QuSeed: artificial intelligence at the service of innovation
At Startup Bakery, artificial intelligence is much more than a simple ingredient in the business recipe. We do more than use AI: we build it and make it available to our startups to drive innovation.
So do you build Deep Tech startups? No! But we believe that AI is an enabler, a lever that amplifies the impact of our SaaS products (a topic we have already covered in the past).
And what do you do?
We do AI. To explain exactly and practically what we do, there is no better way than to read the recipe of QuSeed and see the ingredients used.
QuSeed is Startup Bakery's proprietary SaaS that supports us in researching innovation trends. It is constantly evolving, allowing us to explore cutting-edge technologies, test them quickly, and integrate them into the development cycle, making the new components available within the entire group and thus creating a ready-to-use AI.
Since our SaaS collects and analyzes a large amount of unstructured data (i.e., articles found online) every day, we decided to employ a series of NLP (Natural Language Processing) techniques to extract value from this information. This data is also paired with financial data, which is processed and analyzed to identify significant trends in the world of innovation.
Below are some of the main technologies and techniques we use to develop QuSeed.
Keyword extraction
Keyword extraction makes documents faster to consult and process by summarizing the key terms around which a text is built. A widely used natural language processing library for this task is KeyBERT, which relies on embeddings (vector representations of the document and of its candidate words and phrases) to identify the terms that are semantically closest to the document as a whole.
KeyBERT enables us to extract essential information from the texts collected in QuSeed, simplifying text analysis and reducing the complexity of NLP operations.
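To give an idea of the ranking step behind embedding-based keyword extraction, here is a minimal Python sketch. It is not KeyBERT's code or API: the `TOY_VECTORS` table and the `extract_keywords` helper are hypothetical, hand-written stand-ins for a real embedding model, and the document embedding is simply assumed to be the mean of its word vectors.

```python
import math

# Hypothetical 3-dimensional "embeddings" (energy, finance, generic).
# A real system would obtain these from a trained language model.
TOY_VECTORS = {
    "battery": [0.9, 0.0, 0.1],
    "charging": [0.8, 0.1, 0.1],
    "startup": [0.2, 0.7, 0.1],
    "announced": [0.1, 0.1, 0.8],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def embed_document(words):
    """Assumption: the document vector is the mean of its known word vectors."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def extract_keywords(words, top_n=2):
    """Rank candidate words by similarity to the whole document."""
    doc_vector = embed_document(words)
    candidates = sorted(set(words) & set(TOY_VECTORS))
    ranked = sorted(
        candidates,
        key=lambda w: cosine(TOY_VECTORS[w], doc_vector),
        reverse=True,
    )
    return ranked[:top_n]

article = ["startup", "announced", "battery", "charging", "battery"]
keywords = extract_keywords(article)
```

With these toy vectors, the words closest to the overall topic of the text rank first, which is the intuition behind embedding-based keyword extraction.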
Vector databases
Vector databases represent a type of database that implements a series of techniques to organize, store, and quickly retrieve information based on vectors and not just on identifiers or textual labels.
Thanks to the explosion of applications that use LLMs, generative AI, and semantic search, vector databases have become essential for efficiently processing the vast amount of data generated by models and for enabling advanced features such as semantic information retrieval, long-term memory, and clustering.
Within QuSeed, we use vector databases to store the embeddings of the articles to be analyzed and utilize them for content search (semantic search) and to provide context to our chatbot (Austin), making it capable of answering questions regarding these contents.
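The store-and-retrieve idea can be sketched in a few lines of Python. This is a toy, in-memory illustration and not the database QuSeed uses: the `InMemoryVectorStore` class, the article identifiers, and the three-dimensional vectors are all hypothetical, and a real vector database would add indexing, persistence, and approximate search.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class InMemoryVectorStore:
    """Hypothetical minimal store: keeps (id, vector) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, doc_id, vector):
        self._items.append((doc_id, list(vector)))

    def search(self, query_vector, top_k=3):
        """Return the top_k (id, score) pairs, most similar first."""
        scored = [
            (doc_id, cosine_similarity(query_vector, vector))
            for doc_id, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

# Made-up embeddings; in practice they come from an embedding model.
store = InMemoryVectorStore()
store.add("article-battery-tech", [0.9, 0.1, 0.0])
store.add("article-fintech-funding", [0.1, 0.9, 0.2])
store.add("article-ev-charging", [0.8, 0.2, 0.1])

results = store.search([1.0, 0.0, 0.0], top_k=2)
```

The query vector plays the role of an embedded search phrase: the two articles whose vectors point in a similar direction are returned, regardless of which exact words they contain.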
Semantic search
Semantic search is an information retrieval approach that aims to understand the meaning and context of a search query in order to return more relevant and meaningful results compared to a search based solely on keywords. This technology seeks to interpret the meaning of words in the query and provide results that are more consistent with the user's actual intent.
QuSeed's free-text searches leverage this approach, so users receive relevant answers whether they are looking up financial data or browsing the news we collect.
OpenAI API and LangChain
The OpenAI APIs provide access to powerful language models with natural language processing (NLP) capabilities. These models are capable of responding intelligently to inputs based on natural language.
LangChain is a framework designed to simplify the creation of NLP applications – using Large Language Models (LLMs) – that are context-aware (providing contextually appropriate responses) and wide-ranging (allowing for expanded use cases of LLMs).
The two, combined with QuSeed's robust internal database, power our proprietary chatbot Austin.
OpenAI's models enable Austin to understand and respond naturally to human input, creating a smooth and intuitive communication experience. Users can interact with the chatbot as they would with a human being, without having to adapt their language or terminology.
The LangChain framework allows us to manage and chain requests sent to OpenAI, ensuring that each conversation with Austin is coherent and well-contextualized.
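The idea of giving a chatbot retrieved context and conversation history can be sketched in plain Python. This is only an illustration of how a context-aware prompt might be assembled: `build_prompt` and its parameters are hypothetical names, the snippets are placeholders, no OpenAI or LangChain call is made, and the prompts actually used for Austin are not described here.

```python
def build_prompt(question, context_snippets, history=(), max_snippets=3):
    """Assemble one text prompt from retrieved snippets, history, and a question."""
    lines = [
        "Answer using only the context below. "
        "If the context is insufficient, say so.",
        "",
        "Context:",
    ]
    # Number the snippets so an answer could point back to its sources.
    for index, snippet in enumerate(context_snippets[:max_snippets], start=1):
        lines.append(f"[{index}] {snippet}")
    # Earlier turns keep the conversation coherent across requests.
    if history:
        lines.append("")
        lines.append("Conversation so far:")
        for role, text in history:
            lines.append(f"{role}: {text}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

# Placeholder snippets standing in for articles returned by a semantic search.
snippets = [
    "Placeholder snippet from article A.",
    "Placeholder snippet from article B.",
]
prompt = build_prompt(
    "What do these articles have in common?",
    snippets,
    history=[("user", "Hi"), ("assistant", "Hello, how can I help?")],
)
```

In a real pipeline, the resulting string would be sent to a language model, and the model's reply would be appended to the history before the next turn.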
Statistical analysis and time series analysis
Statistical analysis is the set of methods used to collect, analyze, and interpret data. One of its branches is time series analysis, which deals with data that change over time. These techniques identify patterns and correlations in the data in order to better understand the underlying phenomena. In the startup field, they can be used to identify trends, new investment opportunities, emerging sectors, and so on.
QuSeed uses statistical methods such as the Mann-Kendall test, a non-parametric test for monotonic trends, to analyze time series in depth, detect patterns and anomalies, and identify hidden trends or significant events over time. It also uses time series derived from proprietary metrics related to the analyzed articles, measuring aspects such as the Information Diffusion Rate or user interest.
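For readers unfamiliar with the Mann-Kendall test, the following Python sketch shows the standard two-sided version with tie correction and a normal approximation. It is a textbook illustration rather than QuSeed's implementation: the function name `mann_kendall`, the `alpha` threshold, and the `weekly_mentions` series are assumptions made for the example, and the normal approximation is generally considered reliable only when the series is not too short.

```python
import math
from collections import Counter

def mann_kendall(series, alpha=0.05):
    """Return (s, z, p_value, trend) for a two-sided Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S = number of increasing pairs minus number of decreasing pairs (i < j).
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis, corrected for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    # Continuity-corrected normal approximation.
    if s == 0 or var_s == 0:
        z = 0.0
    elif s > 0:
        z = (s - 1) / math.sqrt(var_s)
    else:
        z = (s + 1) / math.sqrt(var_s)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    if p_value < alpha:
        trend = "increasing" if z > 0 else "decreasing"
    else:
        trend = "no trend"
    return s, z, p_value, trend

# Made-up weekly counts of articles mentioning a topic.
weekly_mentions = [3, 4, 4, 6, 7, 9, 8, 11, 12, 15]
s, z, p_value, trend = mann_kendall(weekly_mentions)
```

Because the test only looks at the sign of each pairwise difference, it does not assume a linear trend or normally distributed data, which makes it a common choice for noisy real-world series.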
QuSeed also studies the correlation between different time series, which provides insight into the relationships between different data sets and gives a more complete view of the information landscape under analysis.
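As a simple illustration of correlating two time series, the Python sketch below computes the Pearson coefficient and a lagged variant. The choice of Pearson correlation is an assumption, since the measure QuSeed uses is not specified here, and the `pearson` and `lagged_correlation` helpers and both series are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag]; a positive lag means y follows x."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return pearson(x, y)

# Made-up series: the second repeats the first with a one-step delay.
mentions = [1, 3, 2, 5, 4, 6]
funding = [0, 1, 3, 2, 5, 4]

same_time = lagged_correlation(mentions, funding, 0)
one_step_later = lagged_correlation(mentions, funding, 1)
```

Comparing the coefficient at different lags can hint at whether one signal tends to move before another, although correlation alone does not establish a causal link.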
SetFit and contrastive learning
SetFit is a framework for few-shot fine-tuning of Sentence Transformer models for text classification, designed to work with small labeled datasets. Contrastive learning, meanwhile, is a machine learning approach that learns from pairs of examples, pulling together those that belong to the same category and pushing apart those that belong to different categories.
Both are used to train Bouncer, QuSeed's proprietary machine learning model for advanced filtering of all acquired documents. Training on pairs of examples yields enough training signal to achieve high precision despite a small starting dataset.
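The way contrastive learning multiplies a small labeled set can be illustrated with a short Python sketch. It shows only the general pair-building idea, not the SetFit library's API or its actual sampling strategy: `make_contrastive_pairs`, the labels, and the example headlines are hypothetical.

```python
from itertools import combinations

def make_contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) pairs.

    target is 1.0 when both texts share a label (pull together)
    and 0.0 when the labels differ (push apart).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Made-up headlines with made-up labels.
examples = [
    ("Startup raises funding for battery recycling", "relevant"),
    ("New venture builds AI tools for logistics", "relevant"),
    ("Celebrity wedding photos", "irrelevant"),
    ("Weekend football results", "irrelevant"),
]

pairs = make_contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

Four labeled examples already produce six training pairs, and the number of possible pairs grows quadratically with the number of examples, which is why this style of training suits small datasets.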
The main goal of Bouncer is to selectively eliminate data deemed unhelpful for our analyses, ensuring that only the most relevant information is included in our studies and offering a targeted and optimized approach to information management.
Clustering
Clustering techniques are methods used in data analysis to group sets of similar observations. We utilize a hybrid approach to clustering, which includes DBSCAN, a technique that identifies clusters based on the density of elements, combined with custom algorithms to dynamically manage the addition of documents to identified clusters.
Clustering allows us to identify the topics of QuSeed, i.e., the sets of articles from which trend analyses begin.
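To show how density-based clustering works, here is a compact pure-Python sketch of the classic DBSCAN algorithm. It covers only the DBSCAN part of the hybrid approach and is not QuSeed's code: the function names, the `eps` and `min_samples` values, and the two-dimensional points (standing in for article embeddings) are assumptions made for the example.

```python
import math

NOISE = -1

def region_query(points, index, eps):
    """Indices of all points within eps of points[index] (itself included)."""
    return [
        j for j, other in enumerate(points)
        if math.dist(points[index], other) <= eps
    ]

def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1  # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Made-up 2D points: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
```

Unlike methods that require the number of clusters in advance, DBSCAN discovers it from the data and leaves isolated points unassigned, which is useful when new topics appear over time.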
Research and development are the foundation of our work
All these technologies and techniques enable QuSeed to:
Extract essential information from the collected texts
Store and retrieve information efficiently
Search for information in a relevant and meaningful way
Understand and respond naturally to human inputs
Identify trends, new investment opportunities, and emerging sectors
Selectively eliminate data deemed unhelpful
QuSeed is a tool that can be used by startups, companies, and investors to identify innovation trends and make informed decisions.
However, the technologies listed in this article are just a few of those we adopt and that make up both QuSeed and the AI components available to each of our startups.
The choice of each technology, framework, and technique is the result of months of research and testing in order to identify the best path in terms of performance, integration, quality, and costs.
Sustainable technological innovation is the main ingredient at Startup Bakery. For this reason, we constantly invest in research and development, aiming to identify the most advanced technologies and techniques to generate the best business ideas and to support the startups in our studio on their growth journey.
Startup Bakery is the Italian startup studio specialized in creating B2B SaaS companies with Artificial Intelligence. We offer aspiring Co-Founders the opportunity to develop a business idea. We create investment opportunities for Professional Investors. We help companies in the innovation process.