ToOne: transform documents into answers
How can businesses rapidly extract, understand, and use insights buried within oceans of unstructured data? Think about contracts, specifications, instruction manuals, policies… To address this challenge, we've created 'ToOne', an AI-powered business analyst application designed to retrieve precise information from extensive documentation in seconds. Simply by typing a query into a chat interface, users get an accurate answer instantly. With ToOne, insights are just one query away.
Flexible document management with secure access
Upload and organize standard business formats (.txt, .pdf, .doc, .docx, .md, .html) into dedicated project spaces with granular access controls. Thanks to ToOne's modular design, the system can easily be extended to support additional file types and integrate with various data sources such as Jira tickets, wikis, and email systems. Team members see only what they need, while you maintain the flexibility to expand document coverage as your needs grow.
Trust every answer with source-backed responses
Every response comes with direct links to the source documents, letting you verify information instantly and dive deeper when needed. This transparent approach ensures you can always trace answers back to their original context.
Under the hood: ToOne's core capabilities
How does ToOne search for the right answer?
ToOne uses a Retrieval-Augmented Generation (RAG) architecture to process queries against the document base. Document chunks are embedded and semantically indexed during ingestion, enabling contextually relevant passage retrieval at query time. The retrieved passages are then used to augment Large Language Model prompts, ensuring responses are both accurate and anchored in the source documentation.
Documents are organized into projects, and each document carries metadata that helps resolve potential consistency issues. To ensure optimal processing, documents are split into smaller chunks before embedding.
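As an illustration only (the character-based splitting, chunk size, and overlap below are assumptions, not ToOne's actual settings), the splitting step could look roughly like this:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a document into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # overlap preserves context that straddles chunk boundaries
    return chunks
```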
The app converts individual document chunks into embeddings and stores them in a vector database. Embeddings can be generated either through the OpenAI API or with local open-source models. Because embeddings are numerical vectors, the system can compare texts by meaning, measure similarity, and retrieve the passages most relevant to a question.
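A hedged sketch of this ingestion step using the OpenAI embeddings API (the model name is an illustrative choice, and the in-memory list stands in for whatever vector database a real deployment would use):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_chunks(chunks: list[str]) -> list[list[float]]:
    """Convert text chunks into embedding vectors via the OpenAI API."""
    response = client.embeddings.create(
        model="text-embedding-3-small",  # illustrative model choice
        input=chunks,
    )
    return [item.embedding for item in response.data]

# A real deployment would write into a vector database; a plain list is
# enough here to show the shape of the stored records.
chunks = ["First document chunk...", "Second document chunk..."]
vector_store = [
    {"text": chunk, "embedding": vec}
    for chunk, vec in zip(chunks, embed_chunks(chunks))
]
```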
Users interact with ToOne by asking questions. The application embeds each question and searches the vector database for the most relevant passages. The top-ranked passages are then sent to OpenAI, where the LLM interprets them and composes the answer.
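Putting the query side together, a minimal retrieval-and-answer sketch might look like the following (cosine similarity over the in-memory store from the previous example; the chat model name is again an assumption):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def answer(question: str, vector_store: list[dict], top_k: int = 5) -> str:
    """Embed the question, rank stored chunks by cosine similarity, and ask the LLM."""
    q = np.array(
        client.embeddings.create(
            model="text-embedding-3-small", input=[question]
        ).data[0].embedding
    )

    def score(item: dict) -> float:
        v = np.array(item["embedding"])
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    top = sorted(vector_store, key=score, reverse=True)[:top_k]
    context = "\n\n".join(item["text"] for item in top)

    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```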
How is the Azure DevOps integration set up?
Authentication
The application implements the OAuth2 authentication protocol to connect with Azure DevOps organizations. Persistent access to the Azure DevOps REST API is maintained through authorization tokens, minimizing authentication overhead.
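As a sketch of what a call against the Azure DevOps REST API looks like once an OAuth2 access token has been obtained (the token-acquisition flow itself is omitted, and the organization name is a placeholder):

```python
import requests

def list_projects(organization: str, access_token: str) -> list[dict]:
    """List the projects in an Azure DevOps organization using a bearer token."""
    url = f"https://dev.azure.com/{organization}/_apis/projects?api-version=7.0"
    response = requests.get(url, headers={"Authorization": f"Bearer {access_token}"})
    response.raise_for_status()
    return response.json()["value"]
```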
Permissions management
The system inherits the Azure DevOps permission structure with two primary access tiers: administrative (Project Manager, Business Analyst) and general user. This model preserves existing organizational access controls while supporting ingestion of various Azure DevOps artifacts.
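A minimal sketch of how the two tiers could be resolved from a user's Azure DevOps roles (the helper below is hypothetical and only illustrates the mapping):

```python
# Roles that map to ToOne's administrative tier; everything else is a general user.
ADMIN_ROLES = {"Project Manager", "Business Analyst"}

def access_tier(user_roles: set[str]) -> str:
    """Resolve a ToOne access tier from a user's Azure DevOps roles."""
    return "administrative" if user_roles & ADMIN_ROLES else "general"
```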
Main challenges
Avoiding GPT hallucinations.
We aimed for precise, document-grounded answers without compromising the conversational, human-like output: in other words, minimal hallucinations from the GPT models. To achieve this, we relied on several prompting techniques, such as assigning the model a role and demonstrating example answers. This approach led to an improvement in output accuracy.
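As a rough sketch of those prompting techniques (the wording of the system role and the demonstration pair are illustrative, not ToOne's production prompt):

```python
def build_messages(context: str, question: str) -> list[dict]:
    """Assemble a prompt that assigns a role and demonstrates the expected behaviour."""
    return [
        # Role assignment: constrain the model to the supplied documents.
        {"role": "system", "content": (
            "You are a business analyst. Answer strictly from the provided documents. "
            "If they do not contain the answer, say so instead of guessing."
        )},
        # Demonstration: one example showing how to refuse when context is missing.
        {"role": "user", "content": "Documents: <none>\nQuestion: What is the 2025 budget?"},
        {"role": "assistant", "content": "The provided documents do not contain that information."},
        # The actual query, grounded in the retrieved passages.
        {"role": "user", "content": f"Documents:\n{context}\n\nQuestion: {question}"},
    ]
```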
OpenAI embeddings as a technical solution.
We opted for OpenAI embeddings over other embedding approaches because of their ability to process vast amounts of data, a capability that provides a strategic advantage in data management.
Document preparation for embedding.
Preparing business documents for effective embedding brought another hurdle our way. Our solution involved breaking large documents into smaller, digestible chunks and tidying up their formatting, for example by deleting empty lines.
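The formatting part of that preparation can be as simple as the following clean-up helper (an illustrative sketch, complementing the chunking example shown earlier):

```python
def clean_document(text: str) -> str:
    """Strip surrounding whitespace and drop empty lines before chunking."""
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```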
Data synchronization with AzDo.
Handling extensive datasets demands intelligent synchronization strategies because of REST API limitations. We are looking into ways to sync only the data that has changed, a task we find both challenging and crucial.
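One way to fetch only the delta is a WIQL query filtered by change date, sketched below (this uses the standard Azure DevOps REST API WIQL endpoint; the date handling and token plumbing are simplified assumptions):

```python
import requests

def changed_work_item_ids(organization: str, project: str, token: str, since: str) -> list[int]:
    """Ask Azure DevOps (via WIQL) for work items changed since a given date,
    so only the delta needs to be re-ingested and re-embedded."""
    url = f"https://dev.azure.com/{organization}/{project}/_apis/wit/wiql?api-version=7.0"
    wiql = {"query": (
        "SELECT [System.Id] FROM WorkItems "
        f"WHERE [System.ChangedDate] > '{since}'"
    )}
    response = requests.post(url, json=wiql, headers={"Authorization": f"Bearer {token}"})
    response.raise_for_status()
    return [item["id"] for item in response.json()["workItems"]]
```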
Development roadmap
Broader third-party integration.
As the application already successfully integrates with AzDo, connecting it with other popular platforms such as Google Workspace, Jira, and GitHub can extend its versatility and utility. This will enable users to retrieve data from various sources, simplifying the data analysis process.
Company database connection.
We plan to connect ToOne to companies' internal databases, extending the reach of its data search.
Use cases
Government organizations
Fintech companies
Healthtech companies
Software development firms
Law firms
Researchers
Consulting firms
Help desks
Customer support
Product recommendation companies
Book clubs