Arize AI debuted new capabilities for fine-tuning and monitoring large language models (LLMs). The offering brings greater control and insight to teams looking to build with LLMs.
Now available as part of the free product, Arize's LLM observability tool evaluates LLM responses, pinpoints where to improve with prompt engineering, and identifies fine-tuning opportunities using vector similarity search.
The new offering is built to work in tandem with Phoenix, an open source library for LLM evaluation.
Leveraging Arize, teams can:
- Detect Problematic Prompts and Responses: By monitoring the performance of a model's prompt/response embeddings using LLM evaluation scores and cluster analysis, teams can narrow in on areas where their LLM needs improvement.
- Analyze Clusters Using LLM Evaluation Metrics and GPT-4: Automatically generate clusters of semantically similar data points and sort by performance. Arize supports LLM-assisted evaluation metrics, task-specific metrics, and user feedback. An integration with ChatGPT also enables teams to analyze clusters for deeper insights.
- Improve LLM Responses with Prompt Engineering: Pinpoint prompt/response clusters with low evaluation scores. Workflows suggest ways to augment prompts to help your LLMs generate better responses and improve acceptance rates.
- Fine-Tune Your LLM Using Vector Similarity Search: Find problematic clusters, such as inaccurate or unhelpful responses, to fine-tune with better data. Vector similarity search clues you in to other examples of emerging issues, so you can begin data augmentation before those issues become systemic.
- Leverage Pre-Built Clusters for Prescriptive Analysis: Use pre-built global clusters identified by Arize algorithms, or define custom clusters of your own to simplify root cause analysis (RCA) and make prescriptive improvements to your generative models.
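To make the vector similarity search idea above concrete, here is a minimal sketch in plain Python. It is not Arize's or Phoenix's API: the function names, record layout, and tiny three-dimensional embeddings are all hypothetical, chosen only to show how one flagged response can be used to surface semantically similar examples by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, records, top_k=2):
    """Return the ids of the top_k records whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query, r["embedding"]), r["id"]) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record_id for _, record_id in scored[:top_k]]


# Hypothetical response embeddings; real ones would come from an embedding model
# and have hundreds or thousands of dimensions.
records = [
    {"id": "r1", "embedding": [0.9, 0.1, 0.0]},
    {"id": "r2", "embedding": [0.0, 1.0, 0.0]},
    {"id": "r3", "embedding": [0.8, 0.2, 0.1]},
    {"id": "r4", "embedding": [0.0, 0.1, 0.9]},
]

# Embedding of a response already flagged as inaccurate or unhelpful (assumed).
flagged = [1.0, 0.0, 0.0]

# Neighbors of the flagged response are candidates for review and data augmentation.
neighbors = most_similar(flagged, records, top_k=2)
print(neighbors)
```

At production scale, a brute-force scan like this would typically be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.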
"Despite the power of these models, the risk of deploying LLMs in high risk environments can be immense," notes Jason Lopatecki, CEO and Co-Founder of Arize. "As new applications get built, Arize LLM observability is here to provide the right guardrails to innovate with this new technology safely."